Cornelia Davis <cdavis@...>
I've just been trying this today and the resurrector does not seem to be functioning.
Running a new bosh-lite instance on vagrant, just deployed fresh yesterday with stemcell 2776.
I wsh'd into one of the warden containers and stopped the agent - indeed bosh sees this as follows
+-----------------+--------------------+----------------------+------------+
| Job/index       | State              | Resource Pool        | IPs        |
+-----------------+--------------------+----------------------+------------+
| unknown/unknown | unresponsive agent |                      |            |
| mysql/0         | running            | common-resource-pool | 10.244.0.2 |
| wordpress/0     | running            | common-resource-pool | 10.244.0.6 |
+-----------------+--------------------+----------------------+------------+
But the resurrector never recovers it.
|
|
I've tried it just now with my deployment and saw that the Director ran a `scan and fix` task after the HM saw the missing agent.
```
Director task 16
  Started scanning 1 vms
  Started scanning 1 vms > Checking VM states. Done (00:00:10)
  Started scanning 1 vms > 0 OK, 1 unresponsive, 0 missing, 0 unbound, 0 out of sync. Done (00:00:00)
     Done scanning 1 vms (00:00:10)

  Started applying problem resolutions > unresponsive_agent 2: Recreate VM. Done (00:01:19)

Task 16 done

Started   2015-12-04 05:26:46 UTC
Finished  2015-12-04 05:28:15 UTC
Duration  00:01:29
```
|
|
Cornelia Davis <cdavis@...>
No such task from my director. Any suggestions on how I might go about figuring out why not?
|
|
Cornelia, did you change/create bosh users in lieu of the default admin/admin user?
If so, then perhaps the HM cannot connect to the director.
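One way to check this is to look at the credentials the HM is configured with. A minimal sketch, assuming the health_monitor job renders its config to /var/vcap/jobs/health_monitor/config/health_monitor.yml inside the bosh-lite VM and that the director settings use `endpoint`, `user` and `password` keys (both are assumptions, and `show_hm_director_settings` is a hypothetical helper name):

```shell
# Hypothetical helper: print the director connection settings from an HM config file.
# The default path is an assumption about where the health_monitor job keeps its config.
show_hm_director_settings() {
  config="${1:-/var/vcap/jobs/health_monitor/config/health_monitor.yml}"
  # Keep only endpoint/user/password lines; the key names are assumed and may differ.
  grep -E '^[[:space:]]*(endpoint|user|password):' "$config"
}
```

Run it inside the bosh-lite VM (for example after `vagrant ssh`) and compare the output with the credentials the director actually accepts.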
|
|
Cornelia Davis <cdavis@...>
I didn't change any passwords but Nic, that was the key. The password isn't set right in the health_monitor config file. Thanks! I'm up and running now.
|
|
Are you sure you are on the latest version of bosh-lite? The HM has been using admin/admin in bosh-lite for some time now.
|
|
I recently ran into an issue where despite having an up-to-date git repo for bosh-lite, my base vagrant box was a bit outdated and I had to update that manually using `vagrant box update`.
You can use `vagrant box outdated` to find out if this is a problem for you as well.
Run both commands from your bosh-lite checkout.
Best, — Casey
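The two commands above can be wrapped into a small helper. This is only a sketch: `refresh_bosh_lite_box` is a hypothetical name, and its argument is the path to your bosh-lite checkout (the directory containing the Vagrantfile).

```shell
# Hypothetical helper: check for and download a newer Vagrant base box.
# Usage: refresh_bosh_lite_box /path/to/bosh-lite
refresh_bosh_lite_box() {
  (
    # Work inside the checkout so Vagrant picks up its Vagrantfile.
    cd "${1:-.}" || exit 1
    # Report whether the box used by this Vagrantfile is behind the latest version.
    vagrant box outdated
    # Download the newer box version if one is available.
    vagrant box update
  )
}
```

Note that `vagrant box update` does not change a VM that already exists; the new box is only used once the VM is destroyed and brought back up (`vagrant destroy`, then `vagrant up`), which also discards whatever was deployed on it.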
|
|
Cornelia Davis <cdavis@...>
Bingo Casey. That was it! Thanks all.