Re: Failed to Backup & Restore BOSH DIRECTOR


Ronak Banka
 

You can run `bosh cck` (cloud check). When it scans the deployment it will flag the
unresponsive agent on that router machine and prompt you with options, e.g. to
reboot the VM or recreate it using the last known spec.

Refer: https://bosh.io/docs/sysadmin-commands.html#health
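For example (a sketch with the old Ruby CLI this thread uses; the director address and the deployment name `cf` are assumptions -- take yours from `bosh deployments`, and the exact resolution menu varies by problem type):

```shell
# Target the director and select the deployment (values are placeholders).
bosh target https://192.168.110.254:25555
bosh deployment cf

# Run cloud check; for each detected problem (such as an unresponsive
# agent) it prompts interactively for a resolution, along the lines of:
#   1. Ignore problem
#   2. Reboot VM
#   3. Recreate VM using last known apply spec
#   4. Delete VM reference
bosh cck
```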

On Wed, Oct 7, 2015 at 2:30 PM, Guangcai Wang <guangcai.wang(a)gmail.com>
wrote:

I am following the wiki page to restore the BOSH DIRECTOR in my OpenStack env.
There is a slight difference (detailed below). I can get the status of all
job VMs except the router_z1 job VM. Is there anything special about the
router_z1 job VM? How do I debug it?


+------------------------------------+--------------------+---------------+-----------------+
| Job/index                          | State              | Resource Pool | IPs             |
+------------------------------------+--------------------+---------------+-----------------+
| unknown/unknown                    | unresponsive agent |               |                 |
| api_worker_z1/0                    | running            | small_z1      | 192.168.110.211 |
| api_z1/0                           | running            | large_z1      | 192.168.110.209 |
| clock_global/0                     | running            | small_z1      | 192.168.110.210 |
| doppler_z1/0                       | running            | small_z1      | 192.168.110.214 |
| etcd_z1/0                          | running            | small_z1      | 192.168.110.203 |
| ha_proxy_z1/0                      | running            | router_z1     | 192.168.110.201 |
| hm9000_z1/0                        | running            | small_z1      | 192.168.110.212 |
| loggregator_trafficcontroller_z1/0 | running            | small_z1      | 192.168.110.215 |
| nats_z1/0                          | running            | small_z1      | 192.168.110.202 |
| runner_z1/0                        | running            | runner_z1     | 192.168.110.213 |
| runner_z1/1                        | running            | runner_z1     | 192.168.110.217 |
| runner_z1/2                        | running            | runner_z1     | 192.168.110.218 |
| stats_z1/0                         | running            | small_z1      | 192.168.110.204 |
| uaa_z1/0                           | starting           | small_z1      | 192.168.110.207 |
+------------------------------------+--------------------+---------------+-----------------+

BOSH 1.3008.0
Ruby: 2.1.6
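A first debugging step for the unresponsive agent might look like this (a sketch with the old Ruby CLI; the job/index comes from the table above, and the agent log path assumes a standard stemcell layout):

```shell
# Try to fetch the agent (not job) logs for the broken instance;
# this only works if the agent still responds.
bosh logs router_z1 0 --agent

# If the agent is truly unresponsive, SSH into the VM (or use the
# OpenStack console) and inspect the agent's own log directly:
bosh ssh router_z1 0
sudo tail -n 100 /var/vcap/bosh/log/current
```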

My process:
1. create a snapshot of the persistent volume attached to the OLD BOSH DIRECTOR
2. create a new volume (restored volume) from the snapshot
3. create a NEW BOSH DIRECTOR VM (with a different IP) using microbosh
4. detach the new persistent volume that is attached to the NEW BOSH DIRECTOR
5. attach the restored volume to the NEW BOSH DIRECTOR
6. replace the persistent disk id in /var/vcap/bosh/settings.json and
bosh-deployments.yml
7. restart the agent and all services on the NEW BOSH DIRECTOR VM
8. get the status of all VMs except router_z1 (above)

Note: steps 1, 2 and 3 differ from the wiki page.
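Step 6 above can be sketched with `sed` (the volume IDs below are hypothetical placeholders -- substitute the real Cinder volume IDs from `openstack volume list`; a demo file path is used so the sketch is runnable anywhere, whereas on the director VM you would point SETTINGS at /var/vcap/bosh/settings.json and repeat the substitution in bosh-deployments.yml):

```shell
# Hypothetical old/new persistent disk IDs (assumptions for illustration).
OLD_DISK_ID="vol-11111111"
NEW_DISK_ID="vol-22222222"

# Demo copy of the settings file; on the director this would be
# /var/vcap/bosh/settings.json.
SETTINGS="${SETTINGS:-/tmp/settings.json}"
printf '{"disk":{"persistent":{"%s":"/dev/vdb"}}}\n' "$OLD_DISK_ID" > "$SETTINGS"

# Keep a backup, then swap every occurrence of the old disk ID.
cp "$SETTINGS" "$SETTINGS.bak"
sed -i "s/${OLD_DISK_ID}/${NEW_DISK_ID}/g" "$SETTINGS"
```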


Reference:
[1]
https://github.com/cloudfoundry-community/cf-docs-contrib/wiki/Backup-and-disaster-recovery
