Re: Elixir for bosh director?


Eric Malm <emalm@...>
 

Leandro,

If you intend for the CFF eventually to consider adopting your project,
please license it under Apache 2.0. That license is used uniformly
across other Foundation projects. Please see
https://www.cloudfoundry.org/governance/cff_ip_policy/ for more details.

I understand the technical benefit of the hot-reloading feature that Erlang
brings, but I view it as incompatible with the realities of how BOSH itself
is deployed. It's typically bootstrapped from some other tool in the BOSH
ecosystem, whether that be another BOSH instance, or the new v2 BOSH CLI,
or even the ancient bosh micro CLI plugin. Those tools all follow the BOSH
update pattern of stopping services on a VM, replacing the software bits
and configuration (and, in the CLI cases, even the VM itself!), and
restarting the services. Unless you go out of your way with the BOSH
release itself to violate the expectations of the BOSH job lifecycle,
there's no opportunity to take advantage of that hot-reloading feature, and
it wouldn't work at all anyway if the VM is replaced.
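
To make the lifecycle argument concrete, here is a minimal Python sketch of
that stop/replace/restart pattern. It is a toy model, not BOSH code: the
`Job` class, its fields, and `stop_replace_start` are hypothetical names
invented for illustration. It only shows that the process is stopped before
the new bits arrive, so there is no running process left for hot code
loading to act on, and any in-process state is gone after the restart.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Toy stand-in for a service managed by a BOSH job (hypothetical)."""
    code_version: str
    running: bool = False
    memory: dict = field(default_factory=dict)  # in-process state only

    def start(self) -> None:
        self.running = True
        self.memory = {}  # a fresh process starts with empty memory

    def stop(self) -> None:
        self.running = False
        self.memory = {}  # in-process state dies with the process


def stop_replace_start(job: Job, new_version: str) -> list:
    """Apply an update in the order described above; return the event log."""
    events = []
    job.stop()
    events.append(f"stopped (running={job.running})")
    job.code_version = new_version  # bits are swapped while nothing runs
    events.append(f"replaced bits -> {new_version}")
    job.start()
    events.append(f"started (running={job.running})")
    return events


director = Job(code_version="v1")
director.start()
director.memory["open_connections"] = 42  # state held only in the process
log = stop_replace_start(director, "v2")
```

After the call, `director.memory` is empty again and `log` records that the
job was not running at the moment the code was replaced.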

I think a more effective solution regarding downtime would be to make BOSH
deployable in a fully HA mode, which would address both availability during
upgrades and tolerance to a wider variety of failure modes (component, VM,
availability zone). I've heard Dmitriy mention that as a potential
direction for BOSH in the past, but taking a quick look at the BOSH project
tracker I don't currently see work related to that effort. Even then, for
almost everyone, BOSH is a means to the end of deploying the software you
really care about in a way that allows you to evolve it over time. So it's
typically not a substantial issue in practice for BOSH to have only a few
9s of availability, so long as the state it retains about the deployments
it manages can always be restored successfully to a new BOSH director
within a suitable period of time.

Finally, having been deeply involved in a rewrite of another major CF
subsystem (DEAs to Diego), +1000000 on Jonathan's observation about
rewrites always being harder to execute and taking longer than you expect,
even when you try to account for those expected delays. (This can be viewed
as one manifestation of the more general Hofstadter's Law
<https://en.wikipedia.org/wiki/Hofstadter%27s_law>.) If you do perceive
some benefits to simplifying the BOSH architecture and think that can be
achieved through a rewrite in a different language, look for seams and
interfaces to keep that change as small as possible while still being
impactful.

Thanks,
Eric, CF Diego PM

On Thu, Apr 27, 2017 at 12:00 PM, Leandro David Cacciagioni <
leandro.21.2008(a)gmail.com> wrote:

Guys, I'm not saying that the Director is bad or wrong. What I actually
want is to improve it a little without touching the logic or the API. My
final goal is perhaps to create a drop-in replacement while keeping the
agent and the logic in place. I know it can be hard work, but OTP solves a
lot of the "edge cases" of the classic languages out of the box.

Geoff, by downtime I mean that, no matter what, in languages like
Ruby/Python/Go or any "classical" language you need to stop and restart
the server to load the new code, while in Erlang/Elixir there is no need
for this, since it has a feature called "hot code reloading". (You
can read about it here
<http://learnyousomeerlang.com/designing-a-concurrent-application#hot-code-loving>,
here <http://erlang.org/doc/man/code.html> and here
<http://www.unstablebuild.com/2016/03/18/hot-code-reload-in-elixir.html>.)
It is behind one of the mottos of Erlang, 99.9999999% (nine nines of
availability), and you can read more here <https://pragprog.com/articles/erlang>.
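
For a sense of scale, the arithmetic behind an availability figure like this
can be sketched in a few lines of Python. The helper name
`downtime_per_year` is made up for illustration, and a 365-day year is
assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def downtime_per_year(nines: int) -> float:
    """Seconds of allowed downtime per year at N nines of availability."""
    unavailability = 10.0 ** -nines  # e.g. 3 nines -> 0.001
    return SECONDS_PER_YEAR * unavailability


# Compare a "few nines" service with the nine-nines figure.
for n in (3, 9):
    print(f"{n} nines: {downtime_per_year(n):.6f} seconds per year")
```

Under that assumption, three nines allows about 8.8 hours of downtime a
year, while nine nines allows roughly 32 milliseconds.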

Marco, good catch, and thanks for the suggestion about the license. I'll
evaluate some others, such as Apache or LGPL.

Thanks,
Leandro.-

2017-04-27 20:26 GMT+02:00 Voelz, Marco <marco.voelz(a)sap.com>:

Dear Leandro,



I'd love to see your experiment grow – keep in mind that the Director has
been around for quite a while and has some pretty complicated corner cases.
As with any rewrite, it is pretty simple to get 80% right, but then you'll
spend a lot of time on getting the remaining 20%.



A word on the license: If your target audience is really companies like
IBM, don't go with GPL. I know that GPL, for example, is a no-go for us at
SAP. I would assume a similar policy is in place in pretty much every big
enterprise.



Warm regards

Marco



*From: *Leandro David Cacciagioni <leandro.21.2008(a)gmail.com>
*Reply-To: *"Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
*Date: *Thursday, 27. April 2017 at 19:57
*To: *"Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
*Subject: *[cf-bosh] Re: Re: Elixir for bosh director?



OK, what you quote is certainly amazing, but it only tackles scalability
in part (I know for sure that Elixir/Erlang can handle the same load a lot
better), and it doesn't solve the fault-tolerance part or true no-downtime
deployments (I know that companies like IBM would love to update BOSH with
true zero downtime). On top of that comes all the simplification in the
Director logic that can come from using the right tool for the job.
Anyway, I think I'll start a POC under the GPL license to make a compatible
"BOSH director" using Elixir. Anyone who would like to help is more than
welcome.



2017-04-27 17:59 GMT+02:00 Geoff Franks <geoff(a)starkandwayne.com>:

FWIW, we've managed BOSHes with many deployments, some of which consist
of ~1000 VMs, and have not seen any direct performance issues with the BOSH
director, just lengthy deploys due to having so many VMs to iterate
through.



I've also seen a significant uptick in responsiveness from the bosh CLI
when using the v2 CLI, since Ruby isn't parsing tons of gemfiles every
time I start the CLI up.





On Apr 27, 2017, at 9:01 AM, Leandro David Cacciagioni <
leandro.21.2008(a)gmail.com> wrote:



After more than 6 months working with Elixir in prod, it crossed my mind
that it may be worth spending some time experimenting with, and thinking
about, the possibility of a *TOTAL REWRITE OF BOSH DIRECTOR USING ELIXIR*.

Some of the pros that I can list off the top of my head (without digging
too much into the technical side) are:

· Ruby-like syntax (I know, I know... this means a lot to people
who don't like Erlang syntax; I'm used to both so far)

· Ease of development thanks to OTP & FP:

  o Scalability (e.g. http://www.phoenixframework.org/blog/the-road-to-2-million-websocket-connections)

  o Fault tolerance

  o True no-downtime updates

· Simplification:

  o NATS can be deprecated.

  o All the other jobs (Director, Registry, Blobstore, HM & CPI) can
  be OTP apps (Mix-powered) under the same umbrella project.

  o Clustering out of the box

· Performance wins: given the nature of Elixir/Erlang/OTP, it is
easy to guess that a single BOSH instance will be able to manage more
deployments and bigger deployments than it does now.

This is a suggestion, and I would like to know whether you agree or not,
and why.



Thanks,

Leandro.-



