Just one question: what about moving to Diego to get rid of HM9000 / DEA?
On Tue, Dec 15, 2015 at 9:44 PM, Masumi Ito <msmi10f(a)gmail.com> wrote:
I found that a hung etcd node delayed the detection of crashed
application instances, resulting in a slow recovery time. Although this
depended on which hm9000 processes were connected to which etcd VM, it
took up to approximately 15 min to recover, and I think that is
too long a delay.
Does anyone know how to calculate the time it takes hm9000 to detect a hung etcd
and switch to a healthy one? I have encountered two different scenarios, as follows:
1. The hm9000 analyzer was connected to the hung etcd, while the hm9000 listener
was connected to a healthy etcd. (About 8 min for the analyzer to
recover; the other hm9000 analyzer took over instead.)
The analyzer seemed to hang just after the
etcd hung, because "Analyzer completed succesfully" was not found in
After approximately 8 min, the other hm9000 analyzer acquired the
lock and started working instead. It then identified the crashed instance
and enqueued a start message. The crashed app was relaunched within ten min
of the detection.
2. The hm9000 analyzer was connected to a healthy etcd, while the hm9000 listener
was connected to the hung etcd. (About 15 min for the listener to
recover; the same hm9000 listener seemed to recover somehow.)
The listener started failing to sync heartbeats just after the etcd it was
connected to hung. After 15 min, "Save took too long. Not bumping freshness."
appeared in the listener's log, and then the analyzer also complained about the
stale actual state: "Analyzer failed with error - Error:Actual state is not
fresh" and stopped analyzing. After 10 sec the hm9000 listener had
somehow recovered and started bumping freshness periodically; the analyzer
then also resumed analyzing the actual state and desired state, and raised the
to start a crashed instance.
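
The freshness behaviour in scenario 2 can be sketched in a few lines of Python. This is only an illustration of the gating that the two log messages suggest, not hm9000's actual code: `FreshnessStore`, `FakeClock`, `listener_sync`, `analyzer_run`, `ttl_seconds` and `max_save_seconds` are hypothetical names and parameters, and the numbers in the demo are made up. Only the two message strings come from the logs quoted above.

```python
class FakeClock:
    """Manually advanced clock so the sketch runs without real waiting."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class FreshnessStore:
    """Toy stand-in for the shared store (hypothetical, not hm9000's model)."""

    def __init__(self, ttl_seconds, clock):
        self.ttl_seconds = ttl_seconds  # assumed: how long one bump stays valid
        self.clock = clock
        self.bumped_at = None           # None means freshness was never bumped

    def bump(self):
        self.bumped_at = self.clock()

    def is_fresh(self):
        if self.bumped_at is None:
            return False
        return self.clock() - self.bumped_at <= self.ttl_seconds


def listener_sync(store, save, max_save_seconds):
    """Save heartbeats; bump freshness only if the save finished in time."""
    started = store.clock()
    save()  # against a hung etcd this call would be slow
    if store.clock() - started > max_save_seconds:
        return "Save took too long. Not bumping freshness."
    store.bump()
    return "bumped"


def analyzer_run(store):
    """Refuse to analyze when the actual state is not fresh."""
    if not store.is_fresh():
        return "Analyzer failed with error - Error:Actual state is not fresh"
    return "analyzed"


if __name__ == "__main__":
    clock = FakeClock()
    store = FreshnessStore(ttl_seconds=30, clock=clock)
    # A fast save bumps freshness, so the analyzer can run.
    print(listener_sync(store, lambda: clock.advance(1), max_save_seconds=5))
    print(analyzer_run(store))
    # A slow save (simulating a hung etcd) skips the bump; freshness expires.
    print(listener_sync(store, lambda: clock.advance(60), max_save_seconds=5))
    print(analyzer_run(store))
```

Under these assumptions the analyzer stays blocked for as long as every listener save exceeds the limit, and it resumes only after a save completes in time and bumps freshness again, which matches the order of events in the logs.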
Sent from the CF Dev mailing list archive at Nabble.com.