Re: Logstash and Multiline Log Entry


Mike Youngstrom <youngm@...>
 

I replied a while ago but it appears the mailing list dropped it. Here is
another attempt.

Thanks for the response Erik.

{quote}
* Loggregator's dropsonde protocol didn't allow for a clean way to
enforce/tag multi-line data consistency - something we are about to put
forward a proposal to remedy.
** Timestamps are not a clean mechanism for reliably re-assembling a
multi-line log - some combination of app-instance and order-of-output would
need to be tacked on, or a decent vector-time implementation.
{quote}

It seems these features above are focused on breaking apart a multi-line
event into multiple events then re-assembling them somewhere else. I think
I've been looking at the issue from a different perspective. I've been
looking for ways to allow an application to get multiple lines into a
single Loggregator event. That would be good enough to solve my problems
without all the complexity of breaking up and reassembling a series of
events. Even if Loggregator had ways to add metadata to join multiple
events into 1 we are still faced with the problem of how the
dea_logging_agent knows when an app logs multiple stdout/err lines when
should those be considered a multi-line message.

Do you have any thoughts on how an app might hint to the logging agent that
a message is a multi-line message? It seems to me that is the real problem
I need solved. Or am I missing something?

Mike

On Tue, Aug 11, 2015 at 1:26 PM, Erik Jasiak <mjasiak(a)pivotal.io> wrote:

Hi Steve and Simon; hello again Mike,

First, apologies for the delay in reply on this one- I've also been
trying to come up with a simple, short answer to this problem. I failed.

Here are the high-level, non-technical answers:
1) Yes, we'd love to enable multi-line logging. Regardless of any
other challenges, we know that there's interest.
2) The problem is multi-layered, and extends beyond loggregator.
2a) Most of the problems with multi-line logging that overlap
loggregator also overlap "general scalability" - problems we've been
handling as part of moving toward collector retirement.
3) We have a hack day project looking at anything "quick and dirty" to
help fix this.
4) Redirecting app logs have known workarounds (eg in Java: via log4j
or similar) while we tackled this - not preferred at all, but do-able.

#########

Technical answers: Loggregator's goals are "Fast, thorough, dumb."
Multi-line logging - as handled by loggregator - has no clean way of
working at the moment w/o violating "fast" or "dumb" principles today.
We're getting there though.

Here's how we've been working towards a fix:
* Syslog drains were not performant enough, or could not handle
large java traces - something we recently fixed[1][2][3] and are going to
email about separately.
* Horizontal scalability allows for overall better performance and
reliability, but pushes the cost on data consistency to the edges of
loggregator (hence nozzles, injectors.)
* Loggregator's dropsonde protocol didn't allow for a clean way to
enforce/tag multi-line data consistency - something we are about to put
forward a proposal to remedy.
** Timestamps are not a clean mechanism for reliably re-assembling a
multi-line log - some combination of app-instance and order-of-output would
need to be tacked on, or a decent vector-time implementation. We'd need a
way to add this metadata that would allow for re-assembly (see protocol
item above). We'd also have to add extra info at DEA or garden without
sacrificing performance - and we know that the DEA logging agent today
already has questions around "acceptable" performance.

So a multiline fix intersects our goals today. I will do my best to
highlight stories that help us with multi-line logging, and we need to do a
better job at communicating that we're working toward it, even if it's not
the obvious target goal.

,
Erik

[1] https://www.pivotaltracker.com/story/show/99494586
[2] https://www.pivotaltracker.com/story/show/97928938
[3] https://www.pivotaltracker.com/story/show/100163298


Steve Wall wrote:

Now I see what that means. Each line of a multiline log message could be
sent to a different logstash server. Definitely problematic. Especially
with the ephemeral nature of the CF logs there needs to be a viable
solution to persist the logs and syslog seems to be a natural solution. I'm
located in Denver and attend the local CF meetups held in the Pivotal
offices. I believe some LAMB devs attend. I'll be sure to bring it up with
them.
-Steve

On Wed, Jul 29, 2015 at 9:47 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Thanks Steve. Though I'm no logstash expert I assume this won't work if
you have multiple logstash machine's doing filtering like Simon mentioned
right? Same is true for us with splunk if you are forwarding logs to more
than one indexer via the REST api. I'd still like to have a discussion
with Erik about this problem see if he thinks there is anything that can be
done in loggregator to help.

Mike

On Wed, Jul 29, 2015 at 9:00 AM, Steve Wall <
steve.wall(a)primetimesoftware.com> wrote:

Here's a suggested pattern to handle stack traces.


http://stackoverflow.com/questions/31657863/logstash-and-multiline-log-entry-from-cloud-foundry?noredirect=1#comment51279061_31657863


On Mon, Jul 27, 2015 at 11:02 AM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

Yet another request for improved multi line log message handling. Is
there any update from the LAMB team on plans to improve this problem?
There have been several proposed solutions but I'm not aware of anything
actually making it into the LAMB tracker. It would be great if we could
hear from Erik on this issue. Does the LAMB team believe it is not an
issue? Are there plans to improve this situation? Whatever the
perspective lets discuss it as a community and see if there are any options
better than the current. I'd really like to see something turned into a
tracker issue if there are better options.

Mike

[0]
http://lists.cloudfoundry.org/pipermail/cf-dev/2015-June/000423.html
[1] http://lists.cloudfoundry.org/pipermail/cf-dev/2015-May/000083.html
[2]
https://groups.google.com/a/cloudfoundry.org/forum/?utm_medium=email&utm_source=footer#!msg/vcap-dev/B1W6_vO0oyo/84X1eAtFsKoJ

On Mon, Jul 27, 2015 at 9:47 AM, Simon Johansson <
simon(a)simonjohansson.com> wrote:

This is a tricky one. Especially if you have more than one logstash
machine doing filtering as they will do filtering independently of each
other as the events come in.

The reason why CF adds a timestamp to each line is because how syslog
works, where each line is its own even.

What we tend to do in my company is to log this kind of stuff via GELF
or with Sentry.

On Mon, Jul 27, 2015 at 5:41 PM, Steve Wall <stevewallone(a)gmail.com>
wrote:

Hello,
We are sending CF logs message to an ELK stack. Multiline logs
message are broken out into several log messages in Logstash. One end per
line of the multiline log message. This is problematic when stack traces
dumped to the log. Each line of the stack trace is translated into a log
message. Trying to view this through Kibana is nearly impossible. Logstash
provides a Grok feature allowing for the manipulation of the log messages.
One common solution is to create a Grok filter that using a timestamp to
indicate when a log entry starts and to combine all lines until the next
timestamp into one log message. The problem is that CF adds a timestamp to
every line. Has anyone come up with a good Grok expression to handle
multiline log message coming out of CF?
Thanks!
Steve



_______________________________________________
cf-dev mailing list
cf-dev(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev

_______________________________________________
cf-dev mailing list
cf-dev(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev

_______________________________________________
cf-dev mailing list
cf-dev(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev

_______________________________________________
cf-dev mailing list
cf-dev(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev

_______________________________________________
cf-dev mailing list
cf-dev(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev

_______________________________________________
cf-dev mailing listcf-dev(a)lists.cloudfoundry.orghttps://lists.cloudfoundry.org/mailman/listinfo/cf-dev


Join {cf-dev@lists.cloudfoundry.org to automatically receive all group messages.