Re: Update Parallelization in Cloud Foundry

Marco Voelz

Thanks for clarifying this for me, Amit.

Warm regards

On 09/03/16 07:43, "Amit Gupta" <agupta(a)> wrote:

You can probably try to start everything in parallel and either set very long update timeouts or allow the deployment to fail, with the expectation that it will eventually correct itself. Alternatively, you can start things in a strict order, which puts stronger constraints on the possible failure scenarios and makes it easier to debug the root cause of a failure.

Certain things do depend on NATS, and thus won't work until NATS is up. The main thing I can currently think of is registering routes with gorouter, which is done both for apps and for system components (e.g. the route-registrar registers api.SYSTEM_DOMAIN on behalf of the CC).


On Tue, Mar 8, 2016 at 2:14 AM, Voelz, Marco <marco.voelz(a)> wrote:
Does NATS also need to come up before any of the other components?

On 07/03/16 21:16, "Amit Gupta" <agupta(a)> wrote:

Hey Omar,

You can set the "serial" property at the global level of a deployment (you can think of it as setting a default for all jobs), and then override it at the individual job level. You will want the consul server jobs to be deployed first, with serial: true and max_in_flight: 1. The important thing here is that, if you have more than one server in your consul cluster, they need to come up one at a time to ensure the cluster orchestration goes smoothly. The same is true if your etcd cluster has more than one server in it. If you're using the postgres job for CCDB and/or UAADB (instead of some external database), then you will want the postgres job to come up before CC and/or UAA. Similarly, if you're using the provided blobstore job instead of an external blobstore, you'll want it up before CC comes up.
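
To make the "global default, per-job override" idea concrete, here is a minimal sketch in Python. It only models the override behaviour described above and is not BOSH's own code: the manifest is written as a plain dict, the job names are hypothetical, the global max_in_flight of 4 is an arbitrary illustrative value, and effective_update is a helper invented for this example.

```python
# Illustrative manifest shape only; job names and the global
# max_in_flight value are assumptions made for this sketch.
manifest = {
    # Global update block: acts as the default for every job.
    "update": {"serial": False, "max_in_flight": 4},
    "jobs": [
        # Clustered jobs override the default so servers come up one at a time.
        {"name": "consul", "update": {"serial": True, "max_in_flight": 1}},
        {"name": "etcd", "update": {"serial": True, "max_in_flight": 1}},
        # No job-level update block: the global default applies.
        {"name": "router"},
    ],
}


def effective_update(manifest, job_name):
    """Return the update settings a job ends up with.

    Job-level keys win over the global defaults; missing keys fall
    back to the global "update" block.
    """
    defaults = manifest.get("update", {})
    for job in manifest.get("jobs", []):
        if job["name"] == job_name:
            return {**defaults, **job.get("update", {})}
    raise KeyError(job_name)
```

With this sketch, effective_update(manifest, "consul") yields the serial, one-at-a-time settings, while effective_update(manifest, "router") falls back to the global defaults.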

You might be able to get away with parallelizing some of the things above. E.g. if you bring the CC and blobstore up at the same time, CC might fail to start for a while until the blobstore comes up, and then start successfully. Monit also generally keeps retrying even after BOSH gives up. So your deploy might fail, but later on you might see everything up and running.
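
If you go the strict-ordering route, the dependencies above can be written down as a small graph and grouped into "waves" that are safe to start together. The following is a rough sketch under stated assumptions, not anything BOSH does for you: the job names are hypothetical, the edges encode only the ordering described in this message (consul first, postgres before CC and UAA, blobstore before CC), and deployment_waves is a helper made up for illustration. It uses graphlib from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical job names. Each job maps to the jobs that should be up
# before it, following the ordering described in this message only.
DEPENDS_ON = {
    "consul": set(),
    "etcd": {"consul"},
    "postgres": {"consul"},
    "blobstore": {"consul"},
    "uaa": {"postgres"},
    "cloud_controller": {"postgres", "blobstore"},
}


def deployment_waves(depends_on):
    """Group jobs into waves; every job's dependencies sit in earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        # Everything returned here has all of its dependencies marked done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Jobs within one wave are candidates for parallel updates, while the waves themselves run in order. Clustered jobs such as consul and etcd would still need their own instances brought up one at a time.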


On Mon, Mar 7, 2016 at 5:54 AM, Omar Elazhary <omazhary(a)> wrote:
Hello everyone,

I know it is possible to update and redeploy components in parallel in Cloud Foundry by setting the "serial" property in the deployment manifest to "false". However, is such a thing recommended? Are there particular job dependencies that I need to pay attention to?

