Re: Update Parallelization in Cloud Foundry


Dieu Cao <dcao@...>
 

It should also be considered that the recommended serial deployment order
is usually the most thoroughly tested one when it comes to ensuring that
code changes stay backwards compatible during a deployment.

For example, a new endpoint might be added to the Cloud Controller for use
by DEAs/Cells. Because of the serial deployment order, it is assumed that
all Cloud Controllers will have finished updating, and thus that the new
endpoint will be available, before any DEAs/Cells update. The DEA/Cell code
can then simply switch over to the new endpoint as it updates, and there is
no need to keep the code on DEAs/Cells that used the older endpoint.

-Dieu
CF Runtime PMC Lead

On Wed, Mar 9, 2016 at 2:34 AM, Voelz, Marco <marco.voelz(a)sap.com> wrote:

Thanks for clarifying this for me, Amit.

Warm regards
Marco

On 09/03/16 07:43, "Amit Gupta" <agupta(a)pivotal.io> wrote:

You can probably try to start everything in parallel, and either set very
long update timeouts, or allow the deployment to fail with the expectation
that it will eventually correct itself. Or you can start things in a
strict order, and have stronger constraints on the possible failure
scenarios, and be able to debug the root cause of a failure better.
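
A minimal sketch of the first option (everything in parallel, with generous
timeouts), assuming a BOSH manifest with a global "update" block. The
numeric values are illustrative assumptions, not recommendations:

```yaml
# Global update settings (fragment only; the rest of the manifest is omitted).
update:
  serial: false                      # let jobs update in parallel
  canaries: 1
  max_in_flight: 4                   # illustrative value
  # Watch times are in milliseconds. With a range, BOSH waits for the lower
  # bound and then keeps checking until the upper bound before giving up.
  canary_watch_time: 30000-1200000   # illustrative: up to 20 minutes
  update_watch_time: 30000-1200000   # illustrative: up to 20 minutes
```

The long upper bounds give a component that is waiting on a dependency more
time to become healthy before BOSH marks the update as failed.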

Certain things do depend on NATS, and thus won't work until NATS is up.
The main thing I can currently think of is registering routes with
gorouter, which is done both for apps and for system components (e.g. the
route-registrar registers api.SYSTEM_DOMAIN on behalf of the CC).

Best,
Amit

On Tue, Mar 8, 2016 at 2:14 AM, Voelz, Marco <marco.voelz(a)sap.com> wrote:

Does NATS also need to come up before any of the other components?

On 07/03/16 21:16, "Amit Gupta" <agupta(a)pivotal.io> wrote:

Hey Omar,

You can set the "serial" property at the global level of a deployment
(you can think of it as setting a default for all jobs), and then override
it at the individual job levels. You will want the consul server jobs to
be deployed first, with serial: true, and max_in_flight: 1. The important
thing here is, if you have more than one server in your consul cluster,
they need to come up one at a time to ensure the cluster orchestration goes
smoothly. The same is true if your etcd cluster has more than one server
in it. If you're using the postgres job for CCDB and/or UAADB (instead of
some external database), then you will want the postgres job to come up
before CC and/or UAA. Similarly, if you're using the provided blobstore
job instead of an external blobstore, you'll want it up before CC comes up.
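
As a rough sketch of the setup described above, a manifest fragment might
look like the following. The job names (consul_z1, etcd_z1, postgres_z1,
blobstore_z1, api_z1), instance counts, and numeric values are illustrative
assumptions, not taken from any particular manifest, and everything else a
job needs (templates, networks, resource pools, properties) is omitted:

```yaml
# Deployment-wide default: jobs may update in parallel unless they override it.
update:
  serial: false
  canaries: 1
  max_in_flight: 4                  # illustrative value
  canary_watch_time: 30000-600000   # milliseconds, illustrative
  update_watch_time: 30000-600000   # milliseconds, illustrative

jobs:
# Consul servers: updated on their own, one instance at a time.
- name: consul_z1                   # hypothetical job name
  instances: 3
  update:
    serial: true
    max_in_flight: 1

# Same treatment for a multi-server etcd cluster.
- name: etcd_z1                     # hypothetical job name
  instances: 3
  update:
    serial: true
    max_in_flight: 1

# Only relevant when using the bundled postgres job for CCDB and/or UAADB.
- name: postgres_z1                 # hypothetical job name
  instances: 1
  update:
    serial: true

# Only relevant when using the bundled blobstore job.
- name: blobstore_z1                # hypothetical job name
  instances: 1
  update:
    serial: true

# No override here, so this job inherits serial: false from the global block.
- name: api_z1                      # hypothetical job name
  instances: 2
```

Since BOSH works through jobs in the order they are listed in the manifest,
the jobs that must come up first should also be listed first.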

You might be able to get away with parallelizing some of the things
above. E.g. if you bring the CC and blobstore up at the same time, CC
might fail to start for a while until the blobstore comes up, and then
start successfully. Monit also generally keeps retrying even after BOSH
gives up, so your deploy might fail, but later on you might see everything
up and running.

Cheers,
Amit

On Mon, Mar 7, 2016 at 5:54 AM, Omar Elazhary <omazhary(a)gmail.com> wrote:

Hello everyone,

I know it is possible to update and redeploy components in parallel in
Cloud Foundry by setting the "serial" property in the deployment manifest
to "false". However, is this recommended? Are there particular job
dependencies that I need to pay attention to?

Regards,
Omar


