Re: DEA Monitoring Capabilities


Chawki, Amin <amin.chawki@...>
 

-What information were you trying to understand from mem_used_bytes?

We used mem_used_bytes and mem_free_bytes (currently taken from the BOSH metrics) to get an approximate overview of the real overall memory usage of all apps. This helps us better understand the current overcommit factor.
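
To illustrate the kind of approximation meant here, a minimal Python sketch follows. It assumes that one (mem_used_bytes, mem_free_bytes) sample is available per DEA VM and that the total memory reserved by app instances is known from elsewhere. The function names, the sample values and the reserved figure are all hypothetical and are not taken from any CF or BOSH API.

```python
GIB = 1024 ** 3

def real_usage_fraction(samples):
    """Fraction of memory actually in use across all DEA VMs.

    samples: list of (mem_used_bytes, mem_free_bytes) tuples, one per VM.
    """
    used = sum(u for u, _ in samples)
    total = sum(u + f for u, f in samples)
    if total <= 0:
        raise ValueError("no memory reported")
    return used / total

def reserved_to_used_ratio(reserved_bytes, samples):
    """How much memory the apps have reserved relative to what they really use.

    A value well above 1 suggests there is room to overcommit.
    """
    used = sum(u for u, _ in samples)
    if used <= 0:
        raise ValueError("no memory in use")
    return reserved_bytes / used

# Hypothetical samples for two DEA VMs: (used, free)
dea_samples = [(6 * GIB, 2 * GIB), (4 * GIB, 4 * GIB)]
# Hypothetical sum of the memory limits of all running app instances
reserved_by_apps = 20 * GIB

usage = real_usage_fraction(dea_samples)
reserve_ratio = reserved_to_used_ratio(reserved_by_apps, dea_samples)
print(f"real usage: {usage:.1%}, reserved/used: {reserve_ratio:.2f}")
```

The VM-level numbers also include memory used by the DEA itself and the operating system, so this can only be a rough estimate.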

-As for the healthy metric from HM9000, it was quite misleading. It reported healthy as long as the metrics server was running, which was no indication of actual health. What exactly do you want to know?

Ah ok, I was not aware of that. Is there any reliable way to verify whether HM9000 is healthy?

Best Regards and Thanks,
Amin


From: Michael Fraenkel <michael.fraenkel(a)gmail.com>
Reply-To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev(a)lists.cloudfoundry.org>
Date: Monday 23 May 2016 at 13:17
To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev(a)lists.cloudfoundry.org>
Subject: [cf-dev] Re: DEA Monitoring Capabilities

When 234 was released, we did not realize that Collector was creating additional metrics. Based on reports, we have added back any missing metrics that people felt were needed. Let me know if we still have missing metrics as you move beyond 234.

In 234, although we do not report available_memory_ratio, we do report remaining_memory. If all your DEAs have the same amount of memory, the ratio can be computed from it, or you can use the remaining_memory value directly.
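
As a rough illustration of that computation, here is a minimal Python sketch. It assumes that remaining_memory and the per-DEA total are expressed in the same unit and that every DEA has the same total. The function name, the DEA names and all of the numbers are hypothetical.

```python
def available_memory_ratio(remaining_memory, total_memory):
    """Share of a DEA's memory that is still available, as a value in [0, 1]."""
    if total_memory <= 0:
        raise ValueError("total_memory must be positive")
    return remaining_memory / total_memory

# Assumed total memory per DEA, identical for all DEAs (hypothetical value)
DEA_TOTAL_MEMORY = 16384

# Hypothetical remaining_memory readings, one per DEA
remaining_by_dea = {"dea-0": 8192, "dea-1": 4096, "dea-2": 12288}

ratios = {
    dea: available_memory_ratio(remaining, DEA_TOTAL_MEMORY)
    for dea, remaining in remaining_by_dea.items()
}
for dea, ratio in sorted(ratios.items()):
    print(f"{dea}: {ratio:.2f}")
```

If the DEAs differ in size, the total has to be looked up per DEA instead of using a single constant.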

What information were you trying to understand from mem_used_bytes?

As for the healthy metric from HM9000, it was quite misleading. It reported healthy as long as the metrics server was running, which was no indication of actual health. What exactly do you want to know?

- Michael

On 5/20/16 4:41 AM, Chawki, Amin wrote:
Hi,

by upgrading to CF v234 (including pre-release v232) we lost all our monitoring capabilities for the DEA and HM9000 (we were still using Collector). After migrating to the Firehose, only a fraction of the metrics were available. Metrics that are very important for our production systems, such as ‘available_memory_ratio’, were only added in CF v235. In the meantime, we were pretty much “flying blind”.

We replaced the missing metrics such as ‘DEA…mem_used_bytes’ and ‘HM9000…healthy’, which used to be available via Collector, with metrics from BOSH. Is this the way to go, or are there any plans to add them back?

Best Regards,
Amin
