Re: Generic data points for dropsonde

Benjamin Black

great questions, dwayne.

1) the partition key is intended to be used in a similar manner to
partitioners in distributed systems like cassandra or kafka. the specific
behavior i would like to make part of the contract is two-fold: that all
data with the same key is routed to the same partition and that all data in
a partition is FIFO (meaning no ordering guarantees beyond arrival time).

this could help with the multi-line log problem by ensuring a single
consumer will receive all lines for a given log entry in order, allowing
simple reassembly. however, the lines might be interleaved with other lines
with the same key or even other keys that happen to map to the same
partition, so the consumer does require a bit of intelligence. this was
actually one of the driving scenarios for adding the key.

2) i expect typical points to be in the hundreds of bytes to a few KB. if
we find ourselves regularly needing much larger points, especially near
that 64KB limit, i'd look to the JSON representation as the hierarchical
structure is more efficiently managed there.


On Tue, Sep 1, 2015 at 4:42 PM, <dschultz(a)> wrote:

Hi Ben,

I was wondering if you could give a concrete use case for the partition
key functionality.

In particular I am interested in how we solve multi line log entries. I
think it would be better to solve it by keeping all the data (the multiple
lines) together throughout the logging/metrics pipeline, but could see how
something like a partition key might help keep the data together as well.

Second question: how large do you see these point messages getting
(average and max)? There are still several stages of the logging/metrics
pipeline that use UDP with a standard 64K size limit.


On Aug 28, 2015, at 4:54 PM, Benjamin Black <bblack(a)> wrote:


The existing dropsonde protocol uses a different message type for each
event type. HttpStart, HttpStop, ContainerMetrics, and so on are all
distinct types in the protocol definition. This requires protocol changes
to introduce any new event type, making such changes very expensive. We've
been working for the past few weeks on an addition to the dropsonde
protocol to support easier future extension to new types of events and to
make it easier for users to define their own events.

The document linked below [1] describes a generic data point message
capable of carrying multi-dimensional, multi-metric points as sets of
name/value pairs. This new message is expected to be added as an additional
entry in the existing dropsonde protocol metric type enum. Things are now
at a point where we'd like to get feedback from the community before moving
forward with implementation.

Please contribute your thoughts on the document in whichever way you are
most comfortable: comments on the document, email here, or email directly
to me. If you comment on the document, please make sure you are logged in
so we can keep track of who is asking for what. Your views are not just
appreciated, but critical to the continued health and success of the Cloud
Foundry community. Thank you!



Join to automatically receive all group messages.