Telemetry within BOSH


Mike Lloyd <mike@...>
 

Hey folks,

 

I have an interesting use case in front of me, and I’m trying to figure out how to approach it both sanely and sustainably: I want highly structured telemetry data from the BOSH Director for downstream analytics. The goal is a comprehensive, clear view of where BOSH is spending its time. I can see this kind of telemetry being immensely useful for operators and CFF developers, since it gives insight into where improvements can be made.

 

Currently I’m exploring 3 options:

  • Dump of the Director database.
    • Pros: Anything that’s logged to the database is available; the data can be used with external data visualisation solutions.
    • Cons: Schema relations and changes between versions make this difficult to sustain; the database schema isn’t well documented; potentially lots of SQL script maintenance; visibility is limited to what’s stored in the database.
  • Adding telemetry hooks to the Director and BOSH Agent.
    • Pro: Every action the Director takes can be tracked and measured, covering the majority of BOSH.
    • Cons: Immense amount of initial work; finding a telemetry framework; handling telemetry output.
  • Adding telemetry hooks to the BOSH CLI.
    • Pro: Every action the CLI takes can be tracked and measured.
    • Cons: Only CLI actions can be tracked; immense amount of initial work; finding a telemetry framework; handling telemetry output.
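
To make the hook-based options a little more concrete, here is a minimal sketch of a timing hook that emits one structured event per action. It is written in Python purely for illustration and is not BOSH code; the `timed` helper, the event fields (`action`, `status`, `duration_s`), and the in-memory list used as a sink are all hypothetical. Telemetry is off unless explicitly enabled.

```python
import json
import time
from contextlib import contextmanager


@contextmanager
def timed(action, sink, enabled=False, **fields):
    """Time the wrapped block and append one JSON event to `sink`.

    Hypothetical helper: `sink` is any object with an append() method.
    When `enabled` is False (the default), nothing is recorded.
    """
    if not enabled:
        yield
        return
    start = time.monotonic()
    status = "ok"
    try:
        yield
    except Exception:
        # Record the failure, then let the original error propagate.
        status = "error"
        raise
    finally:
        event = {
            "action": action,
            "status": status,
            "duration_s": round(time.monotonic() - start, 6),
            **fields,
        }
        sink.append(json.dumps(event, sort_keys=True))


# Usage: wrap any step whose duration is worth measuring.
events = []
with timed("compile_package", events, enabled=True, deployment="example"):
    time.sleep(0.01)  # stand-in for real work
```

Each entry in `events` is a single JSON line, which keeps the output easy to ship to whatever analytics backend sits downstream. The real work in either hook-based option would be deciding where such hooks go and how the output is handled.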

 

Looking across the industry, telemetry is very prevalent, and it’s almost always an opt-in model, so anything I explore would be opt-in as well. Since I haven’t seen anything like this discussed on the mailing lists before, I wanted to surface my explorations to get others’ thoughts and opinions on telemetry within BOSH.

 

Respectfully,

 

Mike Lloyd

t: @mxplusc

g: @mxplusb
Professional Member, ACM

 
