Re: Update Parallelization in Cloud Foundry


Amit Kumar Gupta
 

You can probably try to start everything in parallel, and either set very
long update timeouts, or allow the deployment to fail with the expectation
that it will eventually correct itself. Or you can start things in a
strict order, which narrows the possible failure scenarios and makes the
root cause of a failure easier to debug.
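
As a rough sketch of the first option, the global "update" block of a BOSH
deployment manifest could turn off serial deploys and widen the watch
times. The numbers below are illustrative assumptions, not recommended
values; the watch times are ranges in milliseconds.

```yaml
# Sketch only: all values here are assumed for illustration.
update:
  serial: false                      # deploy jobs in parallel by default
  canaries: 1
  max_in_flight: 4                   # assumed value
  canary_watch_time: 30000-1200000   # wait up to 20 minutes for a canary
  update_watch_time: 30000-1200000   # wait up to 20 minutes per instance
```

Long watch times give slow-starting jobs a chance to recover before BOSH
declares the deploy failed, at the cost of a slower failure signal when
something is really broken.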

Certain things do depend on NATS, and thus won't work until NATS is up.
The main thing I can currently think of is registering routes with
gorouter, which is done both for apps and for system components (e.g. the
route-registrar registers api.SYSTEM_DOMAIN on behalf of the CC).

Best,
Amit

On Tue, Mar 8, 2016 at 2:14 AM, Voelz, Marco <marco.voelz(a)sap.com> wrote:

Does NATS also need to come up before any of the other components?

On 07/03/16 21:16, "Amit Gupta" <agupta(a)pivotal.io> wrote:

Hey Omar,

You can set the "serial" property at the global level of a deployment (you
can think of it as setting a default for all jobs), and then override it at
the individual job levels. You will want the consul server jobs to be
deployed first, with serial: true, and max_in_flight: 1. The important
thing here is, if you have more than one server in your consul cluster,
they need to come up one at a time to ensure the cluster orchestration goes
smoothly. The same is true if your etcd cluster has more than one server
in it. If you're using the postgres job for CCDB and/or UAADB (instead of
some external database), then you will want the postgres job to come up
before CC and/or UAA. Similarly, if you're using the provided blobstore
job instead of an external blobstore, you'll want it up before CC comes up.
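
A minimal sketch of what that could look like in the manifest, assuming
jobs are processed in the order they are listed. The job names (consul_z1,
etcd_z1, postgres_z1, blobstore_z1, api_z1) are assumptions for
illustration and may not match your manifest; all other job keys are
omitted.

```yaml
# Sketch only: job names are assumed; other required job keys are omitted.
update:
  serial: false          # global default: jobs may deploy in parallel

jobs:
- name: consul_z1        # consul servers: first, one instance at a time
  update:
    serial: true
    max_in_flight: 1
- name: etcd_z1          # same treatment for a multi-server etcd cluster
  update:
    serial: true
    max_in_flight: 1
- name: postgres_z1      # only if used for CCDB/UAADB; before CC and UAA
  update:
    serial: true
- name: blobstore_z1     # only if using the provided blobstore; before CC
  update:
    serial: true
- name: api_z1           # CC; no override, so it inherits serial: false
```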

You might be able to get away with parallelizing some of the things
above. E.g. if you bring the CC and blobstore up at the same time, CC
might fail to start for a while until the blobstore comes up, and then
start successfully. Monit also generally keeps retrying even after BOSH
gives up. So your deploy might fail, but later on you might see everything
up and running.

Cheers,
Amit

On Mon, Mar 7, 2016 at 5:54 AM, Omar Elazhary <omazhary(a)gmail.com> wrote:

Hello everyone,

I know it is possible to update and redeploy components in parallel in
Cloud Foundry by setting the "serial" property in the deployment manifest
to "false". However, is such a thing recommended? Are there particular job
dependencies that I need to pay attention to?

Regards,
Omar

