Regarding monit on the BOSH VM in Cloud Foundry


Nithiyasri Gnanasekaran -X (ngnanase - TECH MAHINDRA LIM@Cisco) <ngnanase at cisco.com...>
 

Hi

I am using Cloud Foundry v231 and have created a BOSH release with a job that takes a long time (more than 5 minutes) to complete.

During bosh deploy, the job sometimes fails. According to the monit log, BOSH checks the pid every 30 seconds, reports an error, and retries after 10 seconds. After 6 or 7 retries, the job is marked as failed.
Just after the failure is reported, the job completes successfully and all of its processes are running on the BOSH VM.

So I increased the monit timeout from the default 30 seconds to 90 seconds. Now BOSH checks for the pid file every 90 seconds, but only three times (not 7 or 8 times as before), and it still reports a failure.

Please suggest how I can make monit on the BOSH VM wait until the job completes. Note that this issue does not happen every time.
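
As a rough check of the timings described above, the two configurations can be compared with simple arithmetic. This is only a sketch, not BOSH or monit code: it assumes each attempt costs one check interval plus the retry delay, takes 7 attempts for the default case (the post gives 6 to 8), and assumes no extra retry delay in the 90-second case because none is mentioned. The function and variable names are made up for illustration.

```python
def total_wait_seconds(attempts, check_interval_s, retry_delay_s=0):
    """Total time covered by `attempts` checks, each followed by a retry delay."""
    return attempts * (check_interval_s + retry_delay_s)


# The job is described as taking "more than 5 mins" to complete.
JOB_DURATION_S = 5 * 60

# Default settings: check every 30 s, retry after 10 s, about 7 attempts (assumed).
default_window = total_wait_seconds(attempts=7, check_interval_s=30, retry_delay_s=10)

# Raised timeout: check every 90 s, only 3 attempts, no retry delay assumed.
raised_window = total_wait_seconds(attempts=3, check_interval_s=90)

print("default window:", default_window, "s")
print("raised window:", raised_window, "s")
print("both shorter than the job:",
      default_window < JOB_DURATION_S and raised_window < JOB_DURATION_S)
```

Under these assumptions both windows come out at roughly 270 to 280 seconds, which is shorter than the job itself. That would be consistent with the overall wait being bounded somewhere other than the monit timeout, although the thread does not confirm where.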


Regards
Nithiyasri


Subhankar Chattopadhyay <subho.atg@...>
 

Hi,

You may consider moving your long-running script into a pre-start
script. Please see https://bosh.io/docs/pre-start.html
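
A minimal sketch of that idea, assuming the slow part of the job can be separated from the long-lived process that monit supervises: the slow step runs to completion first and records its success in a marker file, so the monit-managed start script has little left to do. Python is used here only for illustration; `run_pre_start`, `slow_step`, and the marker path are hypothetical names, not part of BOSH.

```python
import os
import sys


def run_pre_start(slow_step, marker_path):
    """Run slow_step to completion and record success in marker_path.

    Returns a process exit code: 0 on success, 1 if slow_step raised.
    """
    if os.path.exists(marker_path):
        # A previous run already finished; nothing left to do.
        return 0
    try:
        slow_step()
    except Exception as exc:
        # Report the failure and signal it through a non-zero exit code.
        print(f"pre-start failed: {exc}", file=sys.stderr)
        return 1
    # Write the marker only after the slow step succeeded.
    with open(marker_path, "w") as marker:
        marker.write("done\n")
    return 0
```

A script built on this would end with something like `sys.exit(run_pre_start(my_slow_step, marker_path))`, where both arguments are placeholders. Skipping the work when the marker already exists is a design choice of this sketch, not something the thread or BOSH requires.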


Subhankar Chattopadhyay


On Thu, Jul 14, 2016 at 11:42 PM, Nithiyasri Gnanasekaran -X (ngnanase
- TECH MAHINDRA LIM at Cisco) <ngnanase(a)cisco.com> wrote:


Subhankar Chattopadhyay
Bangalore, India


Vik R <vagcom.ben@...>
 

See whether the following helps:

https://mmonit.com/monit/documentation/monit.html#SERVICE-POLL-TIME


