Re: 3 etcd nodes don't work well in single zone

Amit Kumar Gupta

Hi Tony,

The logs you've retrieved only go back to Jul 21, so I can't correlate
them with the "?/2" issues you were seeing. Could you record another batch
of occurrences of an app flapping between "2/2" and "?/2" (along with
datetime stamps), and then immediately get logs from *all* the HM and etcd
nodes (`bosh logs` only gets logs from one node at a time)? With those I
can try to dig in more. It's important to get the logs from the HM and
etcd VMs soon after recording the "?/2" events; otherwise BOSH may rotate
or archive the logs, which makes them harder to obtain.
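
If it helps with the record-keeping, here is a minimal sketch of how the
flapping could be written down with timestamps. It assumes you already
have some way of sampling the instance count string for the app ("2/2" or
"?/2"); the function name `record_transitions` and the sample format are
hypothetical, not part of any CF or BOSH tool.

```python
from datetime import datetime, timezone


def record_transitions(samples):
    """Return (timestamp, old_status, new_status) for every status change.

    `samples` is an iterable of (datetime, status) pairs in time order,
    where status is a string such as "2/2" or "?/2". How the samples are
    collected is left to you; this only picks out the changes.
    """
    transitions = []
    previous = None
    for timestamp, status in samples:
        if previous is not None and status != previous:
            transitions.append((timestamp.isoformat(), previous, status))
        previous = status
    return transitions


if __name__ == "__main__":
    # Made-up sample data, purely to show the shape of the output.
    demo = [
        (datetime(2015, 7, 22, 10, 0, 0, tzinfo=timezone.utc), "2/2"),
        (datetime(2015, 7, 22, 10, 0, 30, tzinfo=timezone.utc), "?/2"),
        (datetime(2015, 7, 22, 10, 1, 0, tzinfo=timezone.utc), "2/2"),
    ]
    for row in record_transitions(demo):
        print(row)
```

A list like this, with UTC timestamps, is what I would line up against the
HM and etcd logs.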


On Tue, Jul 21, 2015 at 4:53 PM, Amit Gupta <agupta(a)> wrote:

You should definitely not run etcd with 2 instances. The etcd docs have
more on recommended cluster sizes.

I will look at the attached logs and get back to you, but wanted to make
sure to advise you to run either 1 or 3 nodes. With 2, you can wedge the
system, because quorum for a 2-node cluster is 2: both nodes must be up.
If you roll one of the two nodes, it will not be able to rejoin the
cluster, and the service will be stuck in an unavailable state.
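
The quorum arithmetic behind that advice can be sketched as follows. This
is only an illustration of the majority rule (quorum is more than half of
the members); the function names are made up for the example and are not
etcd APIs.

```python
def quorum(cluster_size):
    """Smallest number of members that forms a strict majority."""
    return cluster_size // 2 + 1


def failures_tolerated(cluster_size):
    """How many members can be down while a majority is still up."""
    return cluster_size - quorum(cluster_size)


if __name__ == "__main__":
    for size in (1, 2, 3):
        print(
            f"{size} node(s): quorum={quorum(size)}, "
            f"failures tolerated={failures_tolerated(size)}"
        )
```

A 2-node cluster needs both members for quorum and so tolerates no
failures, the same as a single node but with twice as many things that can
fail. Going to 3 nodes is the first size that survives losing one member.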

Amit, CF OSS Release Integration PM
Pivotal Software, Inc.