Which components can be HA?


John Wong
 

Hi

Is http://docs.cloudfoundry.org/concepts/high-availability.html up to date?

1) Why is the collector listed with 1 instance but included in the scalable
processes table?

2) How do you run a second Health Manager in standby mode if only 1 can run
at any time?

3) Do we still need the clock job? Is it also limited to 1 instance?

4) I notice I have a job called api_workers, and I believe that's a
compilation machine. I run two of these 24x7; is that necessary? The doc
says compilation VMs are only active when we need to compile things (say,
when deploying a new release). Is that all they do? I don't think they
handle application code compilation.

5) What about syslog? Can it have 2 instances? I understand we have to
choose what to make HA and what not to. I am not convinced by "the BOSH
resurrector will recover the VM if it becomes non-responsive", because all
of these jobs are deployed with BOSH, and if BOSH itself is down I am
facing an outage. I know Dr. Nic has an article about HA BOSH.


Correct me if I am wrong.

Thanks.

John


Dieu Cao <dcao@...>
 

1) I'll ask our doc team to clarify the title of the section.
It's not recommended to run more than 1 collector. This component collects
metrics from system components. We use it in combination with Datadog to
monitor the many components of Cloud Foundry. This component is not
strictly required for an HA system.

2) HM9000 can have multiple active instances. No need for a standby mode.

3) The Cloud Controller clock periodically schedules Cloud Controller
cleanup tasks for app usage events, audit events, failed jobs, and more.
Only a single instance of this job is necessary.

4) The job called api_workers is most likely the Cloud Controller workers.
These are not compilation VMs.
Cloud Controller workers process background tasks submitted via clients of
the API.

5) I'm not sure what you mean by this. Do you mean loggregator or doppler?

-Dieu
CF Runtime PM

(Reply to John Wong's message of Tue, May 5, 2015 at 1:19 PM.)

_______________________________________________
cf-dev mailing list
cf-dev(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev


John Wong
 

Hi Dieu

Thank you for the answers. They are very helpful.

Regarding #4, you are right. I believe the compilation machines are the
short-lived VMs I get during a CF deployment, which compile the different
CF jobs.

Regarding #5, I think it is doppler in our latest deployment (v193; I know
that is still behind the most current version). I think very old CF
versions used to have

loggregator_traffic
loggregator
syslog_loggregator

(the documentation still mentions a syslog loggregator).

So we probably don't need to worry about syslog then.


It seems like these are the ones we can run with >= 2 instances:
NAT
DEA
UAA
HM9000
CC
Workers
Doppler
Log traffic controller
Gorouter
NFS (use s3 in our case)
Postgres (use RDS in our case)


These are the ones not to run with more than 1 instance:
collector
bosh
clock


Not sure:
stats server (metron agent?)
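
The grouping above can be sketched as a small check over per-job instance
counts. This is only an illustration, not an official tool: the job names
are hypothetical spellings of the entries in the lists above, the
`check_instances` helper is made up for this example, and BOSH, NFS, and
Postgres are left out because they are handled outside the CF job list in
our case.

```python
# Assumption: job names below are illustrative spellings of the lists in this
# thread, not the names used in any particular cf-release manifest.
SCALABLE = {
    "nat", "dea", "uaa", "hm9000", "cc", "workers",
    "doppler", "log_traffic_controller", "gorouter",
}
SINGLETON = {"collector", "clock"}


def check_instances(jobs):
    """Return warnings for a {job_name: instance_count} mapping."""
    warnings = []
    for name, count in sorted(jobs.items()):
        if name in SINGLETON:
            # These jobs should not run with more than one instance.
            if count > 1:
                warnings.append(f"{name}: {count} instances, expected at most 1")
        elif name in SCALABLE:
            # These jobs can run with two or more instances for HA.
            if count < 2:
                warnings.append(f"{name}: {count} instance(s), consider >= 2 for HA")
        else:
            # Anything else (e.g. the stats server) was left open in this thread.
            warnings.append(f"{name}: not classified in this thread")
    return warnings


if __name__ == "__main__":
    example = {"gorouter": 2, "clock": 2, "doppler": 1, "stats": 1}
    for line in check_instances(example):
        print(line)
```

Running it on the example mapping flags `clock` (too many instances),
`doppler` (only one instance), and `stats` (unclassified), and says nothing
about `gorouter`.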


Thanks.

John

(Reply to Dieu Cao's message of Wed, May 6, 2015 at 2:27 AM.)