Jim,
First of all, thank you for the opportunity to talk during the summit!
As promised, here is a brief recap of those two pieces of feedback I shared, plus one additional note that we came up with later on:
* Multiline logs
* This is something that we frequently hear from our users. The lack of multiline support would be annoying but bearable if the ordering of the lines were guaranteed to be consistent; unfortunately it’s not. That makes any stack trace printed to the logs essentially useless.
* During the talk you mentioned that a workaround exists for Java-log4j (unfortunately the majority of our apps are in Ruby) and that you’re considering a “permanent workaround”. For this I’d just add a suggestion: most stack traces seem to follow the convention that lines after the first are indented… maybe this could be turned into a formal convention?
* Losing logs
* While we’d like to minimise/remove the possibility of losing logs (see below for a note about this), it is much more important for us to know whether or not we are losing logs and, if so, which ones.
* During our talk I mentioned that a potential solution would be to have all producers add a per-producer monotonically-increasing counter to each message. This would let us sort the messages unambiguously into the correct order, know how many messages we’re losing for each producer, and see which positions they occupied within the stream.
* (Not) losing logs
* You mentioned that reliable delivery is very often requested, and I heard from Gwenn that during the office hours there were many requests for this.
* Talking with my colleagues we came up with an observation that I think could be worth sharing. If you squint hard enough, doppler-etcd is the non-persistent, leaky equivalent of Kafka-ZooKeeper. The only CF-specific part of doppler is the dropsonde protocol. We’ll explore whether it is possible to have a Kafka producer disguised as doppler, but it would also be interesting (and arguably better) if metron (and tc?) could be fitted with the ability to talk to different kinds of brokers. That would leave metron as the single leaky component, while the rest of the pipeline could benefit from the delivery guarantees of Kafka.
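To make the first two suggestions a bit more concrete, here is a rough Python sketch of what a consumer could do. Nothing in it is existing loggregator behaviour: the function names and the `(producer, seq, text)` message shape are made up for illustration, and it assumes each producer stamps its messages with its own increasing counter.

```python
from collections import defaultdict


def group_multiline(lines):
    """Fold indented continuation lines into the preceding log entry.

    Assumes the proposed convention: a line starting with a space or a tab
    belongs to the entry that started on the last non-indented line.
    """
    entries = []
    for line in lines:
        if entries and line[:1] in (" ", "\t"):
            entries[-1] += "\n" + line
        else:
            entries.append(line)
    return entries


def find_gaps(messages):
    """Order messages per producer and list the counters that never arrived.

    `messages` is an iterable of hypothetical (producer, seq, text) tuples,
    received in any order. Returns (ordered, missing), both keyed by producer.
    """
    by_producer = defaultdict(dict)
    for producer, seq, text in messages:
        by_producer[producer][seq] = text

    ordered, missing = {}, {}
    for producer, seen in by_producer.items():
        seqs = sorted(seen)
        ordered[producer] = [seen[s] for s in seqs]
        # Any counter between the lowest and highest seen that is absent
        # corresponds to a lost message.
        missing[producer] = [
            s for s in range(seqs[0], seqs[-1] + 1) if s not in seen
        ]
    return ordered, missing
```

Two caveats: the indentation grouping only works once the lines are back in order, so it depends on the counter (or some other ordering guarantee), and the gap check can only see losses between the first and last counter received, not messages dropped at the very end of a stream.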
I hope what I wrote makes sense; if it doesn’t, I’ll try to clarify. Thanks for the awesome work on your part!
Carlo Alberto