Resurrector meltdown prevents VM recreation in a deployment with only one job



We enabled the BOSH Resurrector on our Micro BOSH director, but the recreate action never happens for a deployment that has only one job.

I found the following description in the BOSH documentation:

BOSH uses the BOSH Resurrector to help it recover from many issues. The Resurrector automatically instructs the BOSH Director to rebuild unresponsive VMs unless the system is in meltdown.

Meltdown occurs when the number of unresponsive VM alerts within a specified time period exceeds a specified threshold. This threshold is a percentage of the total number of VMs in the deployment. You specify the time_threshold and percent_threshold properties in your manifest.

For example, in a deployment with 40 VMs, percent_threshold set to 20%, and time_threshold set to 60 seconds, automatic recovery fails if the Resurrector receives eight or more unresponsive VM alerts within 60 seconds.

But even with percent_threshold set to 1, meltdown still occurred and the unresponsive VM was never recreated.
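
To make the arithmetic in the quoted passage concrete, here is a minimal Python sketch of how I read it. It is not the actual Resurrector code. The function names are hypothetical, percent_threshold is assumed to be given as a percentage (20 means 20%), and the comparison uses "at or above" to match the "eight or more" wording of the example. Under this reading, a deployment with a single VM reaches the threshold on its very first alert, whatever percentage is configured.

```python
def meltdown_alert_count(total_vms: int, percent_threshold: float) -> float:
    """Number of unresponsive-VM alerts that corresponds to the threshold.

    Assumption: percent_threshold is a percentage (20 means 20%),
    as in the documentation example quoted above.
    """
    return total_vms * percent_threshold / 100.0


def is_meltdown(alerts_in_window: int, total_vms: int, percent_threshold: float) -> bool:
    """True if the alerts seen within time_threshold reach the threshold.

    Assumption: "eight or more" in the docs means a >= comparison.
    """
    return alerts_in_window >= meltdown_alert_count(total_vms, percent_threshold)


# Documentation example: 40 VMs, 20% threshold.
print(meltdown_alert_count(40, 20))
print(is_meltdown(8, 40, 20))
print(is_meltdown(7, 40, 20))

# Single-VM deployment with percent_threshold = 1.
print(is_meltdown(1, 1, 1))
```

If the real implementation works along these lines, one unresponsive VM out of one is 100% of the deployment, which would explain why lowering percent_threshold does not help here.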

My Micro BOSH version is 1.2732.0.

Does anyone know how to solve this problem? Any help will be appreciated.

