Re: Questions about removal of the heartbeat message type from dropsonde-protocol

Erik Jasiak <ejasiak@...>

(resend #2)
Hi again Mike,

There were quite a few pros and cons that went into it; the high (low?)
lights from my notes are below. I'll have the rest of the team check in
if they have more info.

1) A Ruby version of the dropsonde-protocol would require the consumer to
maintain some amount of state, which is more challenging in Ruby.
2) How to shoehorn a heartbeat mechanism into the statsd injector (by
its nature, statsd sends last known value; is a heartbeat binary yes/no, or
milliseconds uptime, and a component is dead when there's no increase?)
3) Whose job is it to maintain heartbeat state to begin with?
Metron's, as the aggregator of dropsonde counters? A Nozzle's?
4) Is the correct model to use heartbeats as the 'source of truth' about
a component being alive, regardless of the data being broadcast, or does a
component / developer prefer the non-statsd-model of wanting metric updates
to serve as a heartbeat? (We've leaned toward the statsd model of 'last
update is valid', but then that implies everyone agrees a heartbeat is
really a running uptime counter or similar.)
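
To make the "running uptime counter" idea from (2) and (4) concrete, here is a
rough sketch of the state a consumer would have to keep. It is illustrative
only and is not code from dropsonde, Metron, or any nozzle: the UptimeTracker
class, its method names, and the 30-second timeout are all hypothetical. The
assumption is that each component reports an uptime value, that statsd-style
repeats of the last known value do not count as a sign of life, and that a
component is treated as dead once the value has not changed for longer than
the timeout.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UptimeTracker:
    """Hypothetical consumer-side liveness state (not part of dropsonde)."""

    timeout_s: float  # assumed threshold; no value is given in the thread
    _last_value: Dict[str, float] = field(default_factory=dict, init=False)
    _last_change: Dict[str, float] = field(default_factory=dict, init=False)

    def observe(self, component: str, uptime_ms: float, now: float) -> None:
        """Record a reported uptime value seen at time `now` (seconds)."""
        prev = self._last_value.get(component)
        # A first report, an increase, or a reset after a restart all count
        # as a sign of life. A repeated last-known value does not.
        if prev is None or uptime_ms != prev:
            self._last_change[component] = now
        self._last_value[component] = uptime_ms

    def is_alive(self, component: str, now: float) -> bool:
        """True if the uptime value changed within the last `timeout_s`."""
        last = self._last_change.get(component)
        return last is not None and (now - last) <= self.timeout_s


tracker = UptimeTracker(timeout_s=30.0)
tracker.observe("metron", uptime_ms=1000, now=0.0)
tracker.observe("metron", uptime_ms=1000, now=20.0)  # stale repeat, ignored
print(tracker.is_alive("metron", now=25.0))  # still inside the timeout
print(tracker.is_alive("metron", now=45.0))  # no change for 45 seconds
```

Even this small version has to remember two values per component, which is
the kind of consumer-side state that (1) and (3) are asking about.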

We didn't have answers to all of these questions; what we did find was
that dropsonde-protocol heartbeats were rarely being used, and largely
being ignored. Because they were also in the way of figuring out a path
forward for things like dropsonde with Ruby, we went for their removal
until we had a clearer use case and strategy, or we could handle them in a
cleaner, generally agreed upon way.

Hope that helps,

On Sat, Aug 8, 2015 at 11:32 AM, Mike Youngstrom <youngm(a)> wrote:

I noticed that heartbeat messages are no longer a part of the

Can I get a quick summary of the thinking behind this change?

Is there an assumption that we should be using the bosh health manager and
not the firehose for this type of thing?

I'd just like some background and help understanding the LAMB team's
monitoring mindset regarding the removal of this message.

