Re: Update: Locks & Service Discovery in CF Runtime


Evan Farrar <evanfarrar@...>
 

We decided to move off of Consul, but why? This is a fair question, and I'm
sorry for the slow response. I hope to answer extensively and transparently
as the project lead of the Infrastructure team maintaining consul-release.

There is not a single, definitive reason, so I think it is best to provide
as much context as possible to understand the motivations. I have written a
document about how the Consul and Cloud Foundry integration has gone, and
the thought process involved in our decision to stop that integration in
the future. It includes as much raw data as I could find.

https://docs.google.com/document/d/1qdLNIWQQzluXw5rnc39raAYOnnSdDUjhUOrovUE0NJI/edit?usp=sharing

Please comment on the doc, reply on this thread, or share your thoughts in
#infrastructure on Slack[1].

[1] http://slack.cloudfoundry.org/

On Sun, May 7, 2017 at 3:08 PM, Benjamin Gandon <benjamin(a)gandon.org> wrote:

The road off Consul looks long but necessary. Consul looks like a SPOF in
CF, given how much the platform depends on it and given reports that plain
upgrades sometimes break it badly.

Plus, the sheer amount of logic in the confab wrapper around Consul shows
how hard Consul is to manage and keep running properly.

Don't forget that PCF recently benefitted from a CRE (SRE-type) shared
review from Google.
Don't forget, too, that we have converging evidence suggesting that Google
stays away from etcd for their hosted K8s on GCP.

My guess is that internally, Google SREs might have evidence at scale
that systems like etcd or Consul should be avoided, and this understanding
is being carried over to CF through the CRE program.


Also, moving away from Consul is like choosing to build Diego instead of
building on top of K8s. Controlling the agenda is important. By that I mean
not being forced to chase a project that has its own agenda. Deciding which
value goes into the product, and ensuring that this value is consistent
with the rest of the platform, is also important.


These are just thoughts. I would love to read more precise info about the
why behind this "away-from-Consul" move. Guys?


On 26 Apr 2017, at 18:48, Voelz, Marco <marco.voelz(a)sap.com> wrote:

Dear Luan,

Maybe that's a stupid question which has already been answered, but
doesn't Consul release 0.8 address most of the criticism from the CF
community?

I now see big efforts on all sides (CF and BOSH teams) being invested in
building our own solution to a problem which seems to be pretty generic.

Do we think that's something we should spend engineering resources on, and
that others (e.g. HashiCorp in this case) cannot solve the problem to suit
our needs? At least to me it seems that they are trying to move in the
right direction.

Maybe my perspective on this is just too generic and I'm not deep enough
in the technical details.

What do you think?

Thanks and warm regards
Marco



On 24. Apr 2017, at 20:35, Luan Santos <lsantos(a)pivotal.io> wrote:

Hi all,

We have been working on the previously proposed milestones in order to
reduce and eventually remove our dependencies on Consul.

Please see the updated Locks & Service Discovery in CF Runtime
<https://docs.google.com/document/d/1zw2tQtpBqYol9usIuK_3VKmXHCMW6J9Dupzjr16J-TY/edit>
document for more details and discussion.

Thanks,

Luan
Software Engineer, Cloud Foundry @ Pivotal
