Re: Failed to Backup & Restore BOSH DIRECTOR
Ronak Banka
On the router VM, after getting to the root user, try
"sv stop agent && sv start agent"
On Wed, Oct 7, 2015 at 2:42 PM, Guangcai Wang <guangcai.wang(a)gmail.com>
wrote:
"sv stop agent && sv start agent"
On Wed, Oct 7, 2015 at 2:42 PM, Guangcai Wang <guangcai.wang(a)gmail.com>
wrote:
Thanks Ronak. But that's not what I want. I don't want to recreate the VM.
I need to identify why only router_z1 cannot connect to the NEW BOSH
DIRECTOR.
I restarted that VM and checked the BOSH agent log on the router_z1 job VM,
but I cannot find anything wrong.
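One thing worth checking on that VM (only a guess, since the NEW BOSH DIRECTOR has a different IP): whether the agent's cached settings still point at the OLD director's NATS endpoint. The paths below are the usual locations for the BOSH agent of that era:

  # on the router_z1 VM, as root
  grep -i mbus /var/vcap/bosh/settings.json     # NATS URL the agent uses to reach the director
  tail -n 200 /var/vcap/bosh/log/current        # look for NATS connection or authentication errors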
On Wed, Oct 7, 2015 at 4:37 PM, ronak banka <ronakbanka.cse(a)gmail.com>
wrote:
You can do bosh cck and select an option from there when it checks the
status of that router machine: either restart the VM or recreate the VM
using the last known spec.
Refer: https://bosh.io/docs/sysadmin-commands.html#health
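For reference, roughly what that looks like with the v1 CLI (the exact option wording varies by BOSH version):

  bosh cck
  # cloudcheck scans the deployment and, for the VM with the unresponsive
  # agent, offers choices along the lines of:
  #   1. Skip for now
  #   2. Reboot VM
  #   3. Recreate VM using last known apply spec
  #   4. Delete VM reference (DANGEROUS!)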
On Wed, Oct 7, 2015 at 2:30 PM, Guangcai Wang <guangcai.wang(a)gmail.com>
wrote:
I am following the wiki page [1] to restore the BOSH DIRECTOR in my
OpenStack env. There is a slight difference (detailed below). I can get the
status of all job VMs except the router_z1 job VM. Is there anything
special about the router_z1 job VM? How do I debug it?
+------------------------------------+--------------------+---------------+-----------------+
| Job/index                          | State              | Resource Pool | IPs             |
+------------------------------------+--------------------+---------------+-----------------+
| unknown/unknown                    | unresponsive agent |               |                 |
| api_worker_z1/0                    | running            | small_z1      | 192.168.110.211 |
| api_z1/0                           | running            | large_z1      | 192.168.110.209 |
| clock_global/0                     | running            | small_z1      | 192.168.110.210 |
| doppler_z1/0                       | running            | small_z1      | 192.168.110.214 |
| etcd_z1/0                          | running            | small_z1      | 192.168.110.203 |
| ha_proxy_z1/0                      | running            | router_z1     | 192.168.110.201 |
| hm9000_z1/0                        | running            | small_z1      | 192.168.110.212 |
| loggregator_trafficcontroller_z1/0 | running            | small_z1      | 192.168.110.215 |
| nats_z1/0                          | running            | small_z1      | 192.168.110.202 |
| runner_z1/0                        | running            | runner_z1     | 192.168.110.213 |
| runner_z1/1                        | running            | runner_z1     | 192.168.110.217 |
| runner_z1/2                        | running            | runner_z1     | 192.168.110.218 |
| stats_z1/0                         | running            | small_z1      | 192.168.110.204 |
| uaa_z1/0                           | starting           | small_z1      | 192.168.110.207 |
+------------------------------------+--------------------+---------------+-----------------+
BOSH 1.3008.0
Ruby: 2.1.6
My process:
1. create a snapshot of the persistent volume attached to the OLD BOSH DIRECTOR
2. create a new volume (restored volume) from the snapshot
3. create a NEW BOSH DIRECTOR VM (with a different IP) using microbosh
4. detach the new persistent volume that is attached to the NEW BOSH DIRECTOR
5. attach the restored volume to the NEW BOSH DIRECTOR
6. replace the persistent disk id in /var/vcap/bosh/settings.json and
   bosh-deployments.yml
7. restart the agent and all services on the NEW BOSH DIRECTOR VM
8. get the status of all VMs except router_z1 (see above)
Note: steps 1, 2 and 3 differ from the wiki page; a rough command sketch of them follows below.
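A sketch of steps 1-5 with the OpenStack CLIs of that era (instance/volume IDs are placeholders, and flags may differ between cinder/nova client versions):

  # 1. snapshot the OLD director's persistent volume
  cinder snapshot-create --display-name bosh-pd-snap <OLD_PD_VOLUME_ID>
  # 2. create a restored volume from that snapshot (size in GB >= the original)
  cinder create --snapshot-id <SNAPSHOT_ID> --display-name bosh-pd-restored <SIZE_GB>
  # 3. deploy the NEW BOSH DIRECTOR with microbosh (different IP), then:
  # 4. detach the persistent volume microbosh attached to the new director
  nova volume-detach <NEW_DIRECTOR_INSTANCE_ID> <NEW_PD_VOLUME_ID>
  # 5. attach the restored volume in its place
  nova volume-attach <NEW_DIRECTOR_INSTANCE_ID> <RESTORED_VOLUME_ID>
  # 6./7. then update the persistent disk id in /var/vcap/bosh/settings.json and
  #       bosh-deployments.yml, and restart the agent and director services on the VM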
Reference:
[1]
https://github.com/cloudfoundry-community/cf-docs-contrib/wiki/Backup-and-disaster-recovery