Re: Generic data points for dropsonde


Johannes Tuchscherer
 

The current way of sending metrics as either Values or Counters through the
pipeline makes the development of a downstream consumer (=nozzle) pretty
easy. If you look at the datadog nozzle[0], it just takes all ValueMetrics
and Counters and sends them off to datadog. The nozzle does not have to
know anything about these metrics (e.g. their origin, name, or layout).
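
To make this concrete, here is a rough sketch of that loop (assuming the
sonde-go events package; forward() is just a stand-in for whatever backend
client a nozzle uses, not the Datadog nozzle's actual code):

    package nozzle

    import "github.com/cloudfoundry/sonde-go/events"

    // forward is a placeholder for the backend client (Datadog, StatsD, ...).
    func forward(name string, value float64) {}

    // consume forwards every ValueMetric and CounterEvent without knowing
    // anything about the individual metrics it sees.
    func consume(envelopes <-chan *events.Envelope) {
        for env := range envelopes {
            switch env.GetEventType() {
            case events.Envelope_ValueMetric:
                vm := env.GetValueMetric()
                forward(env.GetOrigin()+"."+vm.GetName(), vm.GetValue())
            case events.Envelope_CounterEvent:
                ce := env.GetCounterEvent()
                forward(env.GetOrigin()+"."+ce.GetName(), float64(ce.GetTotal()))
            }
        }
    }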

Adding a new way to send metrics as a nested object would certainly make the
downstream implementation more complicated. In that case, the nozzle
developer has to know which metrics are included inside the generic point
(basically the schema of the metric) and treat each point accordingly. For
example, if I were to write a nozzle that emits metrics to Graphite with a
StatsD client (as is done here[1]), I would need to know whether my int64
value is a Gauge or a Counter. My consumer would also be under constant risk
of breaking whenever the upstream schema changes.
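
Roughly, the extra logic looks like this (the genericValue layout and the
statsd client interface are made up for illustration; they are not part of
any existing API):

    package nozzle

    // statsdClient stands in for a real StatsD library; only the two calls
    // that matter here are shown.
    type statsdClient interface {
        Gauge(name string, value int64) error
        Incr(name string, delta int64) error
    }

    // genericValue is a made-up stand-in for one name/value pair inside a
    // generic point. The consumer has to know, per value, whether it is a
    // gauge or a counter -- that knowledge is the schema coupling.
    type genericValue struct {
        Name  string
        Value int64
        Kind  string // "gauge" or "counter"
    }

    func emit(c statsdClient, origin string, values []genericValue) error {
        for _, v := range values {
            switch v.Kind {
            case "gauge":
                if err := c.Gauge(origin+"."+v.Name, v.Value); err != nil {
                    return err
                }
            case "counter":
                if err := c.Incr(origin+"."+v.Name, v.Value); err != nil {
                    return err
                }
            }
        }
        return nil
    }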

We are already facing this problem with the container metrics, but at least
those are in a defined format that is well documented and unlikely to
change.
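
For reference, that special-casing looks roughly like this today (accessors
per the sonde-go ContainerMetric message; the output is just a placeholder):

    package nozzle

    import (
        "fmt"

        "github.com/cloudfoundry/sonde-go/events"
    )

    // handleContainerMetric shows the per-event knowledge a nozzle already
    // needs for ContainerMetric: it must know the event's fields instead of
    // treating it as opaque name/value pairs.
    func handleContainerMetric(env *events.Envelope) {
        cm := env.GetContainerMetric()
        fmt.Printf("%s/%d cpu=%.2f mem=%d disk=%d\n",
            cm.GetApplicationId(), cm.GetInstanceIndex(),
            cm.GetCpuPercentage(), cm.GetMemoryBytes(), cm.GetDiskBytes())
    }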

I agree with you, though, that the dropsonde protocol could use a mechanism
for easier extension. Having a GenericPoint and/or GenericEvent seems like
a good idea that I whole-heartedly support. I would just like to stay away
from nested metrics. I think the cost of adding more logic into the
downstream consumer (and making it harder to maintain) is not worth the
benefit of a more concise metric transport.


[0] https://github.com/cloudfoundry-incubator/datadog-firehose-nozzle
[1] https://github.com/CloudCredo/graphite-nozzle

On Tue, Sep 1, 2015 at 5:52 PM, Benjamin Black <bblack(a)pivotal.io> wrote:

great questions, dwayne.

1) the partition key is intended to be used in a similar manner to
partitioners in distributed systems like cassandra or kafka. the specific
behavior i would like to make part of the contract is two-fold: that all
data with the same key is routed to the same partition and that all data in
a partition is FIFO (meaning no ordering guarantees beyond arrival time).
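
in code, the contract is roughly this (the hash function and the fixed
partition count are arbitrary illustrations, not part of the proposal):

    package partition

    import "hash/fnv"

    // partitionFor maps a key to a partition. the contract is only that the
    // same key always lands on the same partition and that each partition
    // preserves arrival order (FIFO); the fnv hash is an arbitrary choice.
    func partitionFor(key string, numPartitions uint32) uint32 {
        h := fnv.New32a()
        h.Write([]byte(key))
        return h.Sum32() % numPartitions
    }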

this could help with the multi-line log problem by ensuring a single
consumer will receive all lines for a given log entry in order, allowing
simple reassembly. however, the lines might be interleaved with other lines
with the same key or even other keys that happen to map to the same
partition, so the consumer does require a bit of intelligence. this was
actually one of the driving scenarios for adding the key.
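
a rough sketch of the reassembly a consumer would still need (the line type
here is made up; it just assumes something marks the final line of an entry):

    package reassemble

    // line is a made-up stand-in for one emitted log line. everything for a
    // single logical entry carries the same partition key, so one consumer
    // sees all of its lines in arrival order, interleaved with other keys.
    type line struct {
        Key  string
        Text string
        Last bool // marks the final line of the entry
    }

    // reassemble buffers interleaved lines by key and emits each complete
    // multi-line entry once its final line arrives.
    func reassemble(in <-chan line, out chan<- []string) {
        pending := make(map[string][]string)
        for l := range in {
            pending[l.Key] = append(pending[l.Key], l.Text)
            if l.Last {
                out <- pending[l.Key]
                delete(pending, l.Key)
            }
        }
        close(out)
    }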

2) i expect typical points to be in the hundreds of bytes to a few KB. if
we find ourselves regularly needing much larger points, especially near
that 64KB limit, i'd look to the JSON representation as the hierarchical
structure is more efficiently managed there.


b




On Tue, Sep 1, 2015 at 4:42 PM, <dschultz(a)pivotal.io> wrote:

Hi Ben,

I was wondering if you could give a concrete use case for the partition
key functionality.

In particular I am interested in how we solve multi line log entries. I
think it would be better to solve it by keeping all the data (the multiple
lines) together throughout the logging/metrics pipeline, but could see how
something like a partition key might help keep the data together as well.

Second question: how large do you see these point messages getting
(average and max)? There are still several stages of the logging/metrics
pipeline that use UDP with a standard 64K size limit.

Thanks,
Dwayne

On Aug 28, 2015, at 4:54 PM, Benjamin Black <bblack(a)pivotal.io> wrote:

All,

The existing dropsonde protocol uses a different message type for each
event type. HttpStart, HttpStop, ContainerMetrics, and so on are all
distinct types in the protocol definition. This requires protocol changes
to introduce any new event type, making such changes very expensive. We've
been working for the past few weeks on an addition to the dropsonde
protocol to support easier future extension to new types of events and to
make it easier for users to define their own events.
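
Concretely, every consumer today dispatches on the envelope's event type, so
each new event means a protocol change plus a code change in every nozzle
(event names as in the current dropsonde envelope):

    package nozzle

    import "github.com/cloudfoundry/sonde-go/events"

    // dispatch illustrates the coupling: one message type per event means a
    // new enum value, regenerated code, and a new case in every consumer
    // whenever an event type is added.
    func dispatch(env *events.Envelope) {
        switch env.GetEventType() {
        case events.Envelope_HttpStart:
            // handle HttpStart
        case events.Envelope_HttpStop:
            // handle HttpStop
        case events.Envelope_ValueMetric:
            // handle ValueMetric
        case events.Envelope_CounterEvent:
            // handle CounterEvent
        case events.Envelope_ContainerMetric:
            // handle ContainerMetric
        }
    }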

The document linked below [1] describes a generic data point message
capable of carrying multi-dimensional, multi-metric points as sets of
name/value pairs. This new message is expected to be added as an additional
entry in the existing dropsonde protocol metric type enum. Things are now
at a point where we'd like to get feedback from the community before moving
forward with implementation.
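
As a purely illustrative sketch of the idea (the real field names and types
are the ones in the linked document, not these):

    package genericpoint

    // Metric is one name/value pair within a point.
    type Metric struct {
        Name  string
        Value float64
    }

    // Point is a multi-dimensional, multi-metric data point: a set of tag
    // dimensions plus one or more metrics, carried as a single message.
    type Point struct {
        Timestamp int64             // e.g. nanoseconds since the epoch
        Tags      map[string]string // dimensions such as deployment, job, index
        Metrics   []Metric
    }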

Please contribute your thoughts on the document in whichever way you are
most comfortable: comments on the document, email here, or email directly
to me. If you comment on the document, please make sure you are logged in
so we can keep track of who is asking for what. Your views are not just
appreciated, but critical to the continued health and success of the Cloud
Foundry community. Thank you!


b

[1]
https://docs.google.com/document/d/1SzvT1BjrBPqUw6zfSYYFfaW9vX_dTZZjn5sl2nxB6Bc/edit?usp=sharing



