Re: Telemetry within BOSH


Marco Voelz
 

Dear Mike,


Thanks for bringing this up. More insight into what the Director is doing could definitely be helpful for a number of things.


Concerning your use-case, I hope you can help me understand a few points:

  • What is it that you're trying to achieve? My current understanding is that you're trying to run some analytics on the data you want to gather. Let's say you have all the data for your Director – how are you going to use it, specifically?
  • At which level of granularity would you imagine this data to be gathered? When I read 'where the Director spends its time', I could understand that as anything between 'each single statement in the code' and 'each interaction with the IaaS/CPI'.
  • To get a better understanding of where you're coming from: have you looked at particular libraries, tools, or software for this use case, perhaps ones you used when gathering telemetry data for other services you worked with, that you could share as examples?

Thanks and warm regards
Marco


From: cf-bosh@... <cf-bosh@...> on behalf of Mike Lloyd <mike@...>
Sent: Thursday, December 13, 2018 1:12:41 AM
To: cf-bosh@...
Subject: [cf-bosh] Telemetry within BOSH
 

Hey folks,

 

I have an interesting use case in front of me that I’m trying to approach both sanely and sustainably: I want highly structured telemetry data from the BOSH Director for downstream analytics. The goal is a comprehensive and clear view of where BOSH is spending its time. I could see this kind of telemetry data being immensely useful for operators and CFF developers, as it can show where improvements can be made.

 

Currently I’m exploring 3 options:

  • Dump of the Director database.
    • Pro: Anything that’s logged to the database is available and can be used with external data visualisation solutions.
    • Cons: Schema relations and changes between versions make this difficult to sustain; the database schema isn’t well documented; potentially lots of SQL script maintenance; visibility is limited to what’s stored in the database.
  • Adding telemetry hooks to the Director and BOSH Agent.
    • Pro: Every action the Director takes can be tracked and measured across most of BOSH.
    • Cons: Immense amount of initial work; finding a telemetry framework; handling telemetry output.
  • Adding telemetry hooks to the BOSH CLI.
    • Pro: Every action the CLI takes can be tracked and measured.
    • Cons: Only CLI actions can be tracked; immense amount of initial work; finding a telemetry framework; handling telemetry output.
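
To make the "telemetry hooks" options a little more concrete, here is a minimal sketch of what an opt-in hook emitting structured events could look like. It is written in Python purely for illustration and is not BOSH code: the `Telemetry` class, the `span` helper, the event fields, and the `cpi.create_vm` event name are all assumptions made up for this example.

```python
import io
import json
import time
from contextlib import contextmanager


class Telemetry:
    """Hypothetical opt-in recorder; illustrative only, not part of BOSH."""

    def __init__(self, sink, enabled=False, clock=time.monotonic):
        self.sink = sink        # any file-like object with write()
        self.enabled = enabled  # opt-in: nothing is emitted unless True
        self.clock = clock

    @contextmanager
    def span(self, name, **attrs):
        """Time the wrapped block and emit one JSON line describing it."""
        if not self.enabled:
            yield
            return
        start = self.clock()
        status = "ok"
        try:
            yield
        except Exception:
            status = "error"
            raise
        finally:
            event = {
                "name": name,
                "status": status,
                "duration_s": round(self.clock() - start, 6),
                "attrs": attrs,
            }
            self.sink.write(json.dumps(event, sort_keys=True) + "\n")


# Usage: wrap an action and collect the structured events it produced.
sink = io.StringIO()
telemetry = Telemetry(sink, enabled=True)
with telemetry.span("cpi.create_vm", deployment="example"):
    sum(range(1000))  # stand-in for the real work being measured
events = [json.loads(line) for line in sink.getvalue().splitlines()]
```

Emitting one JSON object per line keeps the output easy to ship to whatever downstream analytics is chosen, and leaving `enabled` off by default matches an opt-in model.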

 

Looking across the industry, telemetry is very prevalent, and it’s almost always an opt-in model, so anything I explore would be opt-in as well. Since I haven’t seen anything like this discussed on the mailing lists before, I wanted to surface my explorations to get others’ thoughts and opinions on telemetry within BOSH.

 

Respectfully,

 

Mike Lloyd

t: @mxplusc

g: @mxplusb
Professional Member, ACM

 
