Multi-Line Loggregator events and the new Splunk "HTTP Event Collector" API


Mike Youngstrom <youngm@...>
 

Splunk recently released its new "HTTP Event Collector" that greatly
simplifies how data can be streamed directly into Splunk without going to
an intermediate log file. It would be great to utilize this to efficiently
stream Loggregator information into Splunk.

For the most part loggregator appears to be very compatible with this API
with the exception of multi-line log messages.

The problem is that using this API splunk takes every request as an
independent splunk event. This completely eliminates anything splunk did
in the past to attempt to detect multi-line log messages.

Wouldn't it be great if a single loggregator event could contain multiple
log lines then these events could be easily streamed directly into Splunk
using this new api multiple lines preserved and all?

The previous attempt to bring up this topic fizzled [0]. With a new LAMB
PM coming I thought I'd ask my previous questions again.

In the previous thread [0] Erik mentioned a lot of work that he thought
would lead to multi-line log messages. But, it seems to me that the main
issue is simply how can a client actually communicate an multi-line event
to an agent? I don't think this issue is about breaking apart and then
combining log event rather how can I just I as a client hint to loggregator
that it should include multiple lines included into a single event?

Could it be as simple as escaping new lines with a '\' to notify the agent
to not end that event?

This problem cannot be solved without some help from loggregator.

Mike

[0]
https://lists.cloudfoundry.org/archives/list/cf-dev%40lists.cloudfoundry.org/thread/O6NDVGV44IBMVKZQXWOFIYOIC6CDU27G/


Rohit Kumar
 

Hi Mike,

As Erik mentioned in the last thread, multi-line logging is something which
the loggregator team would like to solve. But there are a few questions to
answer before we can come up with a clean solution. We want a design which
solves the problem while not breaking existing apps which do not require
this functionality. Before implementing a solution we would also want to
answer if we want to do it for both runtimes or just Diego, since the way
log lines are sent to Metron differ based on the runtime.

If we were to implement the solution which you described, where newlines
are escaped with a '\', I guess the expectation is that loggregator would
internally remove the escape character. This has performance implications
because now some part of loggregator will need to inspect the log message
and coalesce the message with the succeeding ones. We will need to do this
in a way which respects multi-tenancy. That means now we are storing
additional state related to log lines per app. We will also need to decide
how long loggregator needs to wait for the next lines in a multi-line log,
before deciding to send the line which it received. To me that's not a
simple change.

I am happy to continue this discussion and hear your thoughts on the
existing proposal or any other design alternatives.

Thanks,
Rohit

On Wed, Oct 7, 2015 at 10:45 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Splunk recently released its new "HTTP Event Collector" that greatly
simplifies how data can be streamed directly into Splunk without going to
an intermediate log file. It would be great to utilize this to efficiently
stream Loggregator information into Splunk.

For the most part loggregator appears to be very compatible with this API
with the exception of multi-line log messages.

The problem is that using this API splunk takes every request as an
independent splunk event. This completely eliminates anything splunk did
in the past to attempt to detect multi-line log messages.

Wouldn't it be great if a single loggregator event could contain multiple
log lines then these events could be easily streamed directly into Splunk
using this new api multiple lines preserved and all?

The previous attempt to bring up this topic fizzled [0]. With a new LAMB
PM coming I thought I'd ask my previous questions again.

In the previous thread [0] Erik mentioned a lot of work that he thought
would lead to multi-line log messages. But, it seems to me that the main
issue is simply how can a client actually communicate an multi-line event
to an agent? I don't think this issue is about breaking apart and then
combining log event rather how can I just I as a client hint to loggregator
that it should include multiple lines included into a single event?

Could it be as simple as escaping new lines with a '\' to notify the agent
to not end that event?

This problem cannot be solved without some help from loggregator.

Mike

[0]
https://lists.cloudfoundry.org/archives/list/cf-dev%40lists.cloudfoundry.org/thread/O6NDVGV44IBMVKZQXWOFIYOIC6CDU27G/


Mike Youngstrom <youngm@...>
 

Thanks for the response Rohit. I hope this is the beginning of a good long
discussion on the topic. :)

Before going too deep with the '\' proposal are you aware if the
loggregator team considered any other possible ways an application could
hint to the agent that this line should wait for future lines before
sending the event? I'm not necessarily in love with the '\' approach just
throwing an idea out to start a discussion.

Mike

On Wed, Oct 7, 2015 at 7:58 PM, Rohit Kumar <rokumar(a)pivotal.io> wrote:

Hi Mike,

As Erik mentioned in the last thread, multi-line logging is something
which the loggregator team would like to solve. But there are a few
questions to answer before we can come up with a clean solution. We want a
design which solves the problem while not breaking existing apps which do
not require this functionality. Before implementing a solution we would
also want to answer if we want to do it for both runtimes or just Diego,
since the way log lines are sent to Metron differ based on the runtime.

If we were to implement the solution which you described, where newlines
are escaped with a '\', I guess the expectation is that loggregator would
internally remove the escape character. This has performance implications
because now some part of loggregator will need to inspect the log message
and coalesce the message with the succeeding ones. We will need to do this
in a way which respects multi-tenancy. That means now we are storing
additional state related to log lines per app. We will also need to decide
how long loggregator needs to wait for the next lines in a multi-line log,
before deciding to send the line which it received. To me that's not a
simple change.

I am happy to continue this discussion and hear your thoughts on the
existing proposal or any other design alternatives.

Thanks,
Rohit

On Wed, Oct 7, 2015 at 10:45 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Splunk recently released its new "HTTP Event Collector" that greatly
simplifies how data can be streamed directly into Splunk without going to
an intermediate log file. It would be great to utilize this to efficiently
stream Loggregator information into Splunk.

For the most part loggregator appears to be very compatible with this API
with the exception of multi-line log messages.

The problem is that using this API splunk takes every request as an
independent splunk event. This completely eliminates anything splunk did
in the past to attempt to detect multi-line log messages.

Wouldn't it be great if a single loggregator event could contain multiple
log lines then these events could be easily streamed directly into Splunk
using this new api multiple lines preserved and all?

The previous attempt to bring up this topic fizzled [0]. With a new LAMB
PM coming I thought I'd ask my previous questions again.

In the previous thread [0] Erik mentioned a lot of work that he thought
would lead to multi-line log messages. But, it seems to me that the main
issue is simply how can a client actually communicate an multi-line event
to an agent? I don't think this issue is about breaking apart and then
combining log event rather how can I just I as a client hint to loggregator
that it should include multiple lines included into a single event?

Could it be as simple as escaping new lines with a '\' to notify the
agent to not end that event?

This problem cannot be solved without some help from loggregator.

Mike

[0]
https://lists.cloudfoundry.org/archives/list/cf-dev%40lists.cloudfoundry.org/thread/O6NDVGV44IBMVKZQXWOFIYOIC6CDU27G/


Mike Youngstrom <youngm@...>
 

Another possible idea. Allow an application to send a single log line with
the new line characters escaped "Some Log Line1\\nSome Log Line2" Then
Loggregator could either remove the escape on the logging agent or if that
is too processor expensive then make it a standard responsibility of
clients to unescape these lines.

I can get fairly far myself with this approach by simply unescaping in our
Splunk processor. The problem is other aspects of CF don't expect this so
cf logs doesn't work correctly for example.

Mike

On Thu, Oct 8, 2015 at 11:31 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Thanks for the response Rohit. I hope this is the beginning of a good
long discussion on the topic. :)

Before going too deep with the '\' proposal are you aware if the
loggregator team considered any other possible ways an application could
hint to the agent that this line should wait for future lines before
sending the event? I'm not necessarily in love with the '\' approach just
throwing an idea out to start a discussion.

Mike

On Wed, Oct 7, 2015 at 7:58 PM, Rohit Kumar <rokumar(a)pivotal.io> wrote:

Hi Mike,

As Erik mentioned in the last thread, multi-line logging is something
which the loggregator team would like to solve. But there are a few
questions to answer before we can come up with a clean solution. We want a
design which solves the problem while not breaking existing apps which do
not require this functionality. Before implementing a solution we would
also want to answer if we want to do it for both runtimes or just Diego,
since the way log lines are sent to Metron differ based on the runtime.

If we were to implement the solution which you described, where newlines
are escaped with a '\', I guess the expectation is that loggregator would
internally remove the escape character. This has performance implications
because now some part of loggregator will need to inspect the log message
and coalesce the message with the succeeding ones. We will need to do this
in a way which respects multi-tenancy. That means now we are storing
additional state related to log lines per app. We will also need to decide
how long loggregator needs to wait for the next lines in a multi-line log,
before deciding to send the line which it received. To me that's not a
simple change.

I am happy to continue this discussion and hear your thoughts on the
existing proposal or any other design alternatives.

Thanks,
Rohit

On Wed, Oct 7, 2015 at 10:45 AM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

Splunk recently released its new "HTTP Event Collector" that greatly
simplifies how data can be streamed directly into Splunk without going to
an intermediate log file. It would be great to utilize this to efficiently
stream Loggregator information into Splunk.

For the most part loggregator appears to be very compatible with this
API with the exception of multi-line log messages.

The problem is that using this API splunk takes every request as an
independent splunk event. This completely eliminates anything splunk did
in the past to attempt to detect multi-line log messages.

Wouldn't it be great if a single loggregator event could contain
multiple log lines then these events could be easily streamed directly into
Splunk using this new api multiple lines preserved and all?

The previous attempt to bring up this topic fizzled [0]. With a new
LAMB PM coming I thought I'd ask my previous questions again.

In the previous thread [0] Erik mentioned a lot of work that he thought
would lead to multi-line log messages. But, it seems to me that the main
issue is simply how can a client actually communicate an multi-line event
to an agent? I don't think this issue is about breaking apart and then
combining log event rather how can I just I as a client hint to loggregator
that it should include multiple lines included into a single event?

Could it be as simple as escaping new lines with a '\' to notify the
agent to not end that event?

This problem cannot be solved without some help from loggregator.

Mike

[0]
https://lists.cloudfoundry.org/archives/list/cf-dev%40lists.cloudfoundry.org/thread/O6NDVGV44IBMVKZQXWOFIYOIC6CDU27G/


Rohit Kumar
 

We have thrown around one approach which solves the problem but would
require changes in the runtime. That solution would expose a socket to the
container where the application could emit logs. The application would now
have control over what delimits a message.

The implementation of this solution though would need coordination with the
runtime, as the socket would need to be plumbed from the container all the
way to metron. The messages would also need to be associated with the
application ID when they reach metron.

Rohit

On Fri, Oct 9, 2015 at 1:53 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Another possible idea. Allow an application to send a single log line
with the new line characters escaped "Some Log Line1\\nSome Log Line2" Then
Loggregator could either remove the escape on the logging agent or if that
is too processor expensive then make it a standard responsibility of
clients to unescape these lines.

I can get fairly far myself with this approach by simply unescaping in our
Splunk processor. The problem is other aspects of CF don't expect this so
cf logs doesn't work correctly for example.

Mike

On Thu, Oct 8, 2015 at 11:31 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Thanks for the response Rohit. I hope this is the beginning of a good
long discussion on the topic. :)

Before going too deep with the '\' proposal are you aware if the
loggregator team considered any other possible ways an application could
hint to the agent that this line should wait for future lines before
sending the event? I'm not necessarily in love with the '\' approach just
throwing an idea out to start a discussion.

Mike

On Wed, Oct 7, 2015 at 7:58 PM, Rohit Kumar <rokumar(a)pivotal.io> wrote:

Hi Mike,

As Erik mentioned in the last thread, multi-line logging is something
which the loggregator team would like to solve. But there are a few
questions to answer before we can come up with a clean solution. We want a
design which solves the problem while not breaking existing apps which do
not require this functionality. Before implementing a solution we would
also want to answer if we want to do it for both runtimes or just Diego,
since the way log lines are sent to Metron differ based on the runtime.

If we were to implement the solution which you described, where newlines
are escaped with a '\', I guess the expectation is that loggregator would
internally remove the escape character. This has performance implications
because now some part of loggregator will need to inspect the log message
and coalesce the message with the succeeding ones. We will need to do this
in a way which respects multi-tenancy. That means now we are storing
additional state related to log lines per app. We will also need to decide
how long loggregator needs to wait for the next lines in a multi-line log,
before deciding to send the line which it received. To me that's not a
simple change.

I am happy to continue this discussion and hear your thoughts on the
existing proposal or any other design alternatives.

Thanks,
Rohit

On Wed, Oct 7, 2015 at 10:45 AM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

Splunk recently released its new "HTTP Event Collector" that greatly
simplifies how data can be streamed directly into Splunk without going to
an intermediate log file. It would be great to utilize this to efficiently
stream Loggregator information into Splunk.

For the most part loggregator appears to be very compatible with this
API with the exception of multi-line log messages.

The problem is that using this API splunk takes every request as an
independent splunk event. This completely eliminates anything splunk did
in the past to attempt to detect multi-line log messages.

Wouldn't it be great if a single loggregator event could contain
multiple log lines then these events could be easily streamed directly into
Splunk using this new api multiple lines preserved and all?

The previous attempt to bring up this topic fizzled [0]. With a new
LAMB PM coming I thought I'd ask my previous questions again.

In the previous thread [0] Erik mentioned a lot of work that he thought
would lead to multi-line log messages. But, it seems to me that the main
issue is simply how can a client actually communicate an multi-line event
to an agent? I don't think this issue is about breaking apart and then
combining log event rather how can I just I as a client hint to loggregator
that it should include multiple lines included into a single event?

Could it be as simple as escaping new lines with a '\' to notify the
agent to not end that event?

This problem cannot be solved without some help from loggregator.

Mike

[0]
https://lists.cloudfoundry.org/archives/list/cf-dev%40lists.cloudfoundry.org/thread/O6NDVGV44IBMVKZQXWOFIYOIC6CDU27G/


Mike Youngstrom <youngm@...>
 


We have thrown around one approach which solves the problem but would
require changes in the runtime. That solution would expose a socket to the
container where the application could emit logs. The application would now
have control over what delimits a message.
Any thoughts on what protocol this socket would listen too? Raw
Dropsonde? Syslog?

I've been thinking about your questions regarding the '\' approach:

As Erik mentioned in the last thread, multi-line logging is something which
the loggregator team would like to solve. But there are a few questions to
answer before we can come up with a clean solution. We want a design which
solves the problem while not breaking existing apps which do not require
this functionality. Before implementing a solution we would also want to
answer if we want to do it for both runtimes or just Diego, since the way
log lines are sent to Metron differ based on the runtime.
I'd be perfectly happy if the solution was only for Diego. We are
surviving today but I think the feasibility of our current solution is
running out.


I guess the expectation is that loggregator would internally remove the
escape character.
I think this would be optimal.


This has performance implications because now some part of loggregator
will need to inspect the log message and coalesce the message with the
succeeding ones. We will need to do this in a way which respects
multi-tenancy. That means now we are storing additional state related to
log lines per app. We will also need to decide how long loggregator needs
to wait for the next lines in a multi-line log, before deciding to send the
line which it received. To me that's not a simple change.
Can you help me understand what you are referring to with "coalescing
messages" and "storing additional state related to log lines per app"? The
way I see it the current agents buffer until they see a '\n' then they
create an event. Adding an escaped line '\\n' the logic would be very much
the same as it is today, buffer until you find an unescaped new line. Then
unescape the remaining new lines.

Seems somewhat straight forward to me. Thoughts on considering a pull
request that does something like this?

Mike


Jim CF Campbell
 

New Loggregator PM chiming in.

This is definitely on the Loggregator roadmap. Only issues are selecting a
design and finalizing (as much as is ever possible) where it lives in the
priority order. We would certainly consider a pull request if it met a
consensus architecture model ala Rohit's concerns.

On Thu, Oct 15, 2015 at 11:19 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

We have thrown around one approach which solves the problem but would
require changes in the runtime. That solution would expose a socket to the
container where the application could emit logs. The application would now
have control over what delimits a message.
Any thoughts on what protocol this socket would listen too? Raw
Dropsonde? Syslog?

I've been thinking about your questions regarding the '\' approach:

As Erik mentioned in the last thread, multi-line logging is something
which the loggregator team would like to solve. But there are a few
questions to answer before we can come up with a clean solution. We want a
design which solves the problem while not breaking existing apps which do
not require this functionality. Before implementing a solution we would
also want to answer if we want to do it for both runtimes or just Diego,
since the way log lines are sent to Metron differ based on the runtime.
I'd be perfectly happy if the solution was only for Diego. We are
surviving today but I think the feasibility of our current solution is
running out.


I guess the expectation is that loggregator would internally remove the
escape character.
I think this would be optimal.


This has performance implications because now some part of loggregator
will need to inspect the log message and coalesce the message with the
succeeding ones. We will need to do this in a way which respects
multi-tenancy. That means now we are storing additional state related to
log lines per app. We will also need to decide how long loggregator needs
to wait for the next lines in a multi-line log, before deciding to send the
line which it received. To me that's not a simple change.
Can you help me understand what you are referring to with "coalescing
messages" and "storing additional state related to log lines per app"? The
way I see it the current agents buffer until they see a '\n' then they
create an event. Adding an escaped line '\\n' the logic would be very much
the same as it is today, buffer until you find an unescaped new line. Then
unescape the remaining new lines.

Seems somewhat straight forward to me. Thoughts on considering a pull
request that does something like this?

Mike

--
Jim Campbell
Product Manager
Pivotal Labs


Mike Youngstrom <youngm@...>
 

Great! Thanks Jim. Sounds completely reasonable. Hopefully we can keep
this thread moving and help derive some future designs out of it. Would
you prefer to keep this discussion on the mailling list or a github issue?

Mike

On Thu, Oct 15, 2015 at 12:26 PM, Jim Campbell <jcampbell(a)pivotal.io> wrote:

New Loggregator PM chiming in.

This is definitely on the Loggregator roadmap. Only issues are selecting a
design and finalizing (as much as is ever possible) where it lives in the
priority order. We would certainly consider a pull request if it met a
consensus architecture model ala Rohit's concerns.



On Thu, Oct 15, 2015 at 11:19 AM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

We have thrown around one approach which solves the problem but would
require changes in the runtime. That solution would expose a socket to the
container where the application could emit logs. The application would now
have control over what delimits a message.
Any thoughts on what protocol this socket would listen too? Raw
Dropsonde? Syslog?

I've been thinking about your questions regarding the '\' approach:

As Erik mentioned in the last thread, multi-line logging is something
which the loggregator team would like to solve. But there are a few
questions to answer before we can come up with a clean solution. We want a
design which solves the problem while not breaking existing apps which do
not require this functionality. Before implementing a solution we would
also want to answer if we want to do it for both runtimes or just Diego,
since the way log lines are sent to Metron differ based on the runtime.
I'd be perfectly happy if the solution was only for Diego. We are
surviving today but I think the feasibility of our current solution is
running out.


I guess the expectation is that loggregator would internally remove the
escape character.
I think this would be optimal.


This has performance implications because now some part of loggregator
will need to inspect the log message and coalesce the message with the
succeeding ones. We will need to do this in a way which respects
multi-tenancy. That means now we are storing additional state related to
log lines per app. We will also need to decide how long loggregator needs
to wait for the next lines in a multi-line log, before deciding to send the
line which it received. To me that's not a simple change.
Can you help me understand what you are referring to with "coalescing
messages" and "storing additional state related to log lines per app"? The
way I see it the current agents buffer until they see a '\n' then they
create an event. Adding an escaped line '\\n' the logic would be very much
the same as it is today, buffer until you find an unescaped new line. Then
unescape the remaining new lines.

Seems somewhat straight forward to me. Thoughts on considering a pull
request that does something like this?

Mike


--
Jim Campbell
Product Manager
Pivotal Labs