Proposed changes to Gorouter log message format


Shannon Coen
 

The Routing team is moving from cloudfoundry/lager to uber-go/zap logging
library for the Gorouter log, as we have observed improvements to
throughput performance using zap. This change has no impact on the
access.log, only to gorouter.log.

The Gorouter log message will change in the following ways. Please let us
know if any of these are a concern. Thank you!

1. "session" removed from "data" key. Does anyone use this?

lager
"data":{"host":"10.0.32.21:4222","session":"1"}

zap
"data":{"host":"10.0.32.21:4222"}


2. Values for "source" key more granular. Seems helpful for filtering
messages for like operations

lager
"source":"vcap.gorouter"

zap
"source":"vcap.gorouter"
"source":"vcap.gorouter.nats"
"source":"vcap.gorouter.subscriber"
"source":"vcap.gorouter.router"
"source":"vcap.gorouter.registry"


3. Source no longer prepends value of "message" key, as it is redundant to
the "source" key.

lager
"message":"vcap.gorouter.starting"

zap
"message":"starting"

4. Key reordering. My expectation is that no one should be using key order
for log analysis.

lager
{"timestamp","source","message","log_level","data"}

zap
{"log_level","timestamp","message","source","data"}


Full example log messages:
https://gist.github.com/shalako/79c8d7e8bf6d26b9934bb2679914c856

Thank you,

Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.


Etourneau Gwenn
 

Shannon,

Can you share about the improvements you are talking about ?

Thanks

2017-01-19 9:46 GMT+09:00 Shannon Coen <scoen(a)pivotal.io>:

The Routing team is moving from cloudfoundry/lager to uber-go/zap logging
library for the Gorouter log, as we have observed improvements to
throughput performance using zap. This change has no impact on the
access.log, only to gorouter.log.

The Gorouter log message will change in the following ways. Please let us
know if any of these are a concern. Thank you!

1. "session" removed from "data" key. Does anyone use this?

lager
"data":{"host":"10.0.32.21:4222","session":"1"}

zap
"data":{"host":"10.0.32.21:4222"}


2. Values for "source" key more granular. Seems helpful for filtering
messages for like operations

lager
"source":"vcap.gorouter"

zap
"source":"vcap.gorouter"
"source":"vcap.gorouter.nats"
"source":"vcap.gorouter.subscriber"
"source":"vcap.gorouter.router"
"source":"vcap.gorouter.registry"


3. Source no longer prepends value of "message" key, as it is redundant
to the "source" key.

lager
"message":"vcap.gorouter.starting"

zap
"message":"starting"

4. Key reordering. My expectation is that no one should be using key order
for log analysis.

lager
{"timestamp","source","message","log_level","data"}

zap
{"log_level","timestamp","message","source","data"}


Full example log messages: https://gist.github.com/shalako/
79c8d7e8bf6d26b9934bb2679914c856

Thank you,

Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.


Mike Youngstrom
 


4. Key reordering. My expectation is that no one should be using key order
for log analysis.

lager
{"timestamp","source","message","log_level","data"}

zap
{"log_level","timestamp","message","source","data"}


Our Splunk configuration requires the timestamp to be early in the log
message. Should be fine as the second item. But, might cause issues any
further into the message than that.

Mike


Shannon Coen
 

Mike, thank you for your response. The Routing team tells me we can count
on the library-provided timestamp appearing in the second position. To be
clear, a full log message has two timestamps, one provided by the logging
library (epoch), and one that our Gorouter code prepends every message with
(human readable).

[2017-01-13 22:05:58+0000]
{"log_level":1,"timestamp":1484345158.9532504,"message":"starting","source":"vcap.gorouter","data":{}}

If you're referring to the human readable one, that will continue to
prepend the JSON object.

Etourneau, we've seen 10-20% improvement in throughput by swapping lager
for zap in this use case. There are still valid reasons for other CF
components to continue using lager, including features it supports that zap
does not. However, we don't have requirements for those features in
Gorouter, and performance is particularly important in this component. We
will be publishing a blog post as a sequel to one we published recently, in
which we'll share the tooling we've developed for exploring performance
improvements, what we've tried that worked and didn't, and our results.


Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.

On Wed, Jan 18, 2017 at 9:50 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

4. Key reordering. My expectation is that no one should be using key order
for log analysis.

lager
{"timestamp","source","message","log_level","data"}

zap
{"log_level","timestamp","message","source","data"}


Our Splunk configuration requires the timestamp to be early in the log
message. Should be fine as the second item. But, might cause issues any
further into the message than that.

Mike


Mike Youngstrom
 

Thanks Shannon.

FWIW we strip the first timestamp from log messages sent to Splunk because
Splunk doesn't detect the message as JSON unless the entire event is
properly formatted json. Either way, I think always second position should
be fine.

Mike

On Thu, Jan 19, 2017 at 12:44 PM, Shannon Coen <scoen(a)pivotal.io> wrote:

Mike, thank you for your response. The Routing team tells me we can count
on the library-provided timestamp appearing in the second position. To be
clear, a full log message has two timestamps, one provided by the logging
library (epoch), and one that our Gorouter code prepends every message with
(human readable).

[2017-01-13 22:05:58+0000] {"log_level":1,"timestamp":
1484345158.9532504,"message":"starting","source":"vcap.
gorouter","data":{}}

If you're referring to the human readable one, that will continue to
prepend the JSON object.

Etourneau, we've seen 10-20% improvement in throughput by swapping lager
for zap in this use case. There are still valid reasons for other CF
components to continue using lager, including features it supports that zap
does not. However, we don't have requirements for those features in
Gorouter, and performance is particularly important in this component. We
will be publishing a blog post as a sequel to one we published recently, in
which we'll share the tooling we've developed for exploring performance
improvements, what we've tried that worked and didn't, and our results.


Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.

On Wed, Jan 18, 2017 at 9:50 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

4. Key reordering. My expectation is that no one should be using key
order for log analysis.

lager
{"timestamp","source","message","log_level","data"}

zap
{"log_level","timestamp","message","source","data"}


Our Splunk configuration requires the timestamp to be early in the log
message. Should be fine as the second item. But, might cause issues any
further into the message than that.

Mike


Gwenn Etourneau
 

Thanks Shannon ! that's a pretty good news hope we can increase perf more.
:)

On Fri, Jan 20, 2017 at 6:20 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Thanks Shannon.

FWIW we strip the first timestamp from log messages sent to Splunk because
Splunk doesn't detect the message as JSON unless the entire event is
properly formatted json. Either way, I think always second position should
be fine.

Mike

On Thu, Jan 19, 2017 at 12:44 PM, Shannon Coen <scoen(a)pivotal.io> wrote:

Mike, thank you for your response. The Routing team tells me we can count
on the library-provided timestamp appearing in the second position. To be
clear, a full log message has two timestamps, one provided by the logging
library (epoch), and one that our Gorouter code prepends every message with
(human readable).

[2017-01-13 22:05:58+0000] {"log_level":1,"timestamp":148
4345158.9532504,"message":"starting","source":"vcap.gorouter","data":{}}

If you're referring to the human readable one, that will continue to
prepend the JSON object.

Etourneau, we've seen 10-20% improvement in throughput by swapping lager
for zap in this use case. There are still valid reasons for other CF
components to continue using lager, including features it supports that zap
does not. However, we don't have requirements for those features in
Gorouter, and performance is particularly important in this component. We
will be publishing a blog post as a sequel to one we published recently, in
which we'll share the tooling we've developed for exploring performance
improvements, what we've tried that worked and didn't, and our results.


Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.

On Wed, Jan 18, 2017 at 9:50 PM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

4. Key reordering. My expectation is that no one should be using key
order for log analysis.

lager
{"timestamp","source","message","log_level","data"}

zap
{"log_level","timestamp","message","source","data"}


Our Splunk configuration requires the timestamp to be early in the log
message. Should be fine as the second item. But, might cause issues any
further into the message than that.

Mike


Carlo Alberto Ferraris
 

Shannon,
Are there plans to extend the use of zap to all CF components?

Thanks,
Carlo


Shannon Coen
 

Hello Carlo,

There are no plans to use Zap in other components.

Best,

Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.

On Sun, Jan 22, 2017 at 5:12 PM, Carlo Alberto Ferraris <
carlo.ferraris(a)rakuten.com> wrote:

Shannon,
Are there plans to extend the use of zap to all CF components?

Thanks,
Carlo