HM9000 metrics
Pablo Alonso Rodriguez <palonsoro@...>
Good morning.
Recently, I have been reviewing the metrics emitted by CF components. To understand the HM9000 metrics, I have been reading the metrics documentation (at https://github.com/cloudfoundry/hm9000#metricsserver). I am posting this message because I have two questions.

First question: Not all the metrics retrieved via Ops Metrics are documented there. Is there any additional documentation? If not, could you please explain what the following metrics mean?

- StartEvacuating, StartCrashed, StartMissing
- StopDuplicate, StopEvacuationComplete, StopExtra

I have some guesses about some of them, but I am not completely sure.

Second question: I do not fully understand the difference between "instances" and "indices" in metrics like "NumberOfCrashedIndices" and "NumberOfCrashedInstances".

For example, I have one crashed app in my CF instance, and "NumberOfCrashedIndices" reports '1' while "NumberOfCrashedInstances" reports '3'. If I look at `cf app myapp`, I see one single crashed instance (as expected). If I look at the hm9000 dump, I see the following about my crashed app (UUIDs have been replaced by fake ones):

    Guid: 7ef08c44-102d-11e5-9c0d-0fb30c2610f7 | Version: 8e16b09a-102d-11e5-b6ce-27f9445313f8
      Desired: [1] instances, (STARTED, STAGED)
      Heartbeats:
        [0 CRASHED] a42a7236102d11e5813abfab583ad850 on 1-abc
        [0 CRASHED] b35b9f1e102d11e5ad29cfc4c2c4e3ea on 2-ac3
        [0 CRASHED] bbd37658102d11e5ba8e2b98d1fd1793 on 4-a67
      CrashCounts:
        [0]:7499
      Pending Starts:
        [0] priority:1.00 send:2m34.628437793s

So, what does all this mean? I do not understand why I get 3 heartbeats when I was only trying to start a single instance.

Thank you in advance.
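P.S. To make the second question concrete, here is a minimal Python sketch of one possible reading of the two counters. It assumes (this is a guess, not confirmed HM9000 behavior) that "instances" counts distinct crashed instance GUIDs seen in heartbeats, while "indices" counts distinct index numbers that have at least one crashed heartbeat. The function name `count_crashed` and the tuple layout are hypothetical; the data is copied from the dump above.

```python
# Heartbeats copied from the dump above: (index, state, instance_guid).
heartbeats = [
    (0, "CRASHED", "a42a7236102d11e5813abfab583ad850"),
    (0, "CRASHED", "b35b9f1e102d11e5ad29cfc4c2c4e3ea"),
    (0, "CRASHED", "bbd37658102d11e5ba8e2b98d1fd1793"),
]


def count_crashed(heartbeats):
    """Hypothetical helper: return (crashed_instances, crashed_indices).

    Assumption: an "instance" is identified by its GUID and an "index"
    by its slot number; this is a guess at the metric semantics.
    """
    crashed = [(index, guid) for index, state, guid in heartbeats
               if state == "CRASHED"]
    crashed_instances = len({guid for _, guid in crashed})
    crashed_indices = len({index for index, _ in crashed})
    return crashed_instances, crashed_indices


crashed_instances, crashed_indices = count_crashed(heartbeats)
print(crashed_instances, crashed_indices)  # prints "3 1"
```

Under this reading the reported values ('3' instances, '1' index) match the dump, but it does not explain why three heartbeats exist for index 0 in the first place.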