App syslog drain performance improvements


Erik Jasiak
 

Hi CF community,

The Loggregator team shipped two medium-impact changes in CF v215.

* First, we fixed a bug that slowed down application syslog drains.
[1][2] Doppler should now stream application log messages much faster,
and with the default settings you should see fewer “We’ve dropped 100
messages” warnings and fewer related scalability issues.

** Special thanks to Daniel Jones / “EngineerBetter” in the CF community
for identifying the issue - it’s fantastic to see this level of
investigation and participation in the code base, which is what open
source is all about.

* Second, the message buffer size for loggregator dopplers is now
configurable in BOSH. The number of messages to buffer in doppler is
set with the property:

doppler.message_drain_buffer_size = 100

The default buffer size for dopplers is still 100 messages for now,
while we evaluate how effective the latency fix is. However, we would
appreciate feedback from those who do reconfigure their buffer sizes.

Known impacts of reconfiguring buffer size:

1) We analyzed the memory usage of an increased buffer; please see
story [3]. The overhead of raising the size appears to be minor, but if
your experience differs, please let us know.
2) With a larger buffer, falling behind a doppler risks dropping a much
larger number of messages.
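
To illustrate the second point, here is a minimal toy model in Python.
It is not Doppler's actual code. It assumes that a full buffer discards
its whole backlog in one go, which is what the "We've dropped 100
messages" wording suggests with the default size of 100. The names
ToyBuffer and simulate are hypothetical and exist only for this sketch.

```python
from collections import deque


class ToyBuffer:
    """Toy model (assumption): a bounded buffer that discards its
    entire backlog when a message arrives while it is full."""

    def __init__(self, size):
        self.size = size
        self.messages = deque()
        self.drop_events = []  # messages lost in each overflow

    def push(self, msg):
        if len(self.messages) >= self.size:
            # Buffer is full: record the loss and discard the backlog.
            self.drop_events.append(len(self.messages))
            self.messages.clear()
        self.messages.append(msg)

    def pop(self):
        # Return the oldest message, or None if the buffer is empty.
        return self.messages.popleft() if self.messages else None


def simulate(size, produced, consume_every):
    """Push `produced` messages; the consumer reads one message
    every `consume_every` pushes, so it falls behind when > 1."""
    buf = ToyBuffer(size)
    delivered = 0
    for i in range(produced):
        buf.push(i)
        if (i + 1) % consume_every == 0 and buf.pop() is not None:
            delivered += 1
    return buf.drop_events, delivered


if __name__ == "__main__":
    for size in (100, 1000):
        drops, delivered = simulate(size, produced=5000, consume_every=2)
        print(size, drops, delivered)
```

In this model every overflow loses exactly `size` messages, so a larger
buffer overflows less often but loses more each time it does.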

Thank you all, and happy loggregating!

Erik Jasiak
PM, Loggregator team

[1] https://www.pivotaltracker.com/n/projects/993188/stories/99494586
[2] https://github.com/cloudfoundry/loggregator/pull/71
[3] https://www.pivotaltracker.com/story/show/100163298