Leandro,
To be honest, if I have to choose, I prefer Go over Elixir / Erlang. Most of the tooling around CF is written in Go, and the community (I think) has already spent time learning Go and is now pretty good with it. I'm not sure introducing another language just for the beauty of it (yak shaving) is a good idea. I like the path the bosh cli took by rewriting everything in Go.
Thanks.
On Fri, Apr 28, 2017 at 6:15 AM, Leandro David Cacciagioni < leandro.21.2008(a)gmail.com> wrote:
Eric, thanks a lot. I really appreciate your point of view, and yes, my idea in involving elixir / erlang is to have a proper multi-VM deployment to create a fully redundant, highly available bosh deployment. How you deploy and update the director and its components might change a little, and maybe the cli will change in the future ;). Anyway, I know that work like this can take a long time and will involve more people over time, if that day comes. Let's see if I can get a minimal POC over the next months, at least with the basic features.
Thanks, Leandro.-
2017-04-27 21:57 GMT+02:00 Eric Malm <emalm(a)pivotal.io>:
Leandro,
If you intend your project eventually to be considered for the CFF to adopt, please license it as Apache 2.0. That license is used uniformly across other Foundation projects. Please see https://www.cloudfoundry.org/governance/cff_ip_policy/ for more details.
I understand the technical benefit of the hot-reloading feature that Erlang brings, but I view it as incompatible with the realities of how BOSH itself is deployed. It's typically bootstrapped from some other tool in the BOSH ecosystem, whether that be another BOSH instance, or the new v2 BOSH CLI, or even the ancient bosh micro CLI plugin. Those tools all follow the BOSH update pattern of stopping services on a VM, replacing the software bits and configuration (and, in the CLI cases, even the VM itself!), and restarting the services. Unless you go out of your way with the BOSH release itself to violate the expectations of the BOSH job lifecycle, there's no opportunity to take advantage of that hot-reloading feature, and it wouldn't work at all anyway if the VM is replaced.
I think a more effective solution regarding downtime would be to make BOSH deployable in a fully HA mode, which would address both availability during upgrades and tolerance to a wider variety of failure modes (component, VM, availability zone). I've heard Dmitriy mention that as a potential direction for BOSH in the past, but taking a quick look at the BOSH project tracker I don't currently see work related to that effort. Even then, for almost everyone, BOSH is a means to the end of deploying the software you really care about in a way that allows you to evolve it over time. So it's typically not a substantial issue in practice for BOSH to have only a few 9s of availability, so long as the state it retains about the deployments it manages can always be restored successfully to a new BOSH director within a suitable period of time.
Finally, having been deeply involved in a rewrite of another major CF subsystem (DEAs to Diego), +1000000 on Jonathan's observation about rewrites always being harder to execute and taking longer than you expect, even when you try to account for those expected delays. (This can be viewed as one manifestation of the more general Hofstadter's Law <https://en.wikipedia.org/wiki/Hofstadter%27s_law>.) If you do perceive some benefits to simplifying the BOSH architecture and think that can be achieved through a rewrite in a different language, look for seams and interfaces to keep that change as small as possible while still being impactful.
Thanks, Eric, CF Diego PM
On Thu, Apr 27, 2017 at 12:00 PM, Leandro David Cacciagioni < leandro.21.2008(a)gmail.com> wrote:
Guys, I'm not saying that the director is bad or wrong. What I want is to improve it a little without touching the logic or the API; my final goal is maybe to create a drop-in replacement while keeping the agent and the logic in place. I know it can be hard work, but OTP solves a lot of the "edge cases" of the classic languages out of the box.
Geoff, by downtime I mean that, no matter what, in languages like ruby/python/go or any "classical" language you need to stop and restart the server to load the new code, while in erlang / elixir there is no need for this, since it has a feature called "hot code reloading" (you can read about it here <http://learnyousomeerlang.com/designing-a-concurrent-application#hot-code-loving>, here <http://erlang.org/doc/man/code.html> and here <http://www.unstablebuild.com/2016/03/18/hot-code-reload-in-elixir.html>). It is behind one of the mottos of erlang, 99.9999999% (nine nines of availability), and you can read more here <https://pragprog.com/articles/erlang>.
Marco, good catch, and thanks for the suggestion about the license; maybe I'll evaluate some others like Apache or LGPL.
Thanks, Leandro.-
2017-04-27 20:26 GMT+02:00 Voelz, Marco <marco.voelz(a)sap.com>:
Dear Leandro,
I'd love to see your experiment grow – keep in mind that the Director has been around for quite a while and has some pretty complicated corner cases. Just like any rewrite: it is pretty simple to get 80% right, but then you'll spend a lot of time getting the remaining 20%.
A word on the license: If your target audience is really companies like IBM, don't go with GPL. I know that, for example, GPL is a no-go for us at SAP. I would assume a similar policy is in place in pretty much every big enterprise.
Warm regards
Marco
From: Leandro David Cacciagioni <leandro.21.2008(a)gmail.com>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <cf-bosh(a)lists.cloudfoundry.org>
Date: Thursday, 27. April 2017 at 19:57
To: "Discussions about the Cloud Foundry BOSH project." <cf-bosh(a)lists.cloudfoundry.org>
Subject: [cf-bosh] Re: Re: Elixir for bosh director?
OK, what you quote is certainly amazing, but it only tackles the scalability part (and I know for sure that elixir/erlang can handle the same scale a lot better); it doesn't solve the fault tolerance part or true no-downtime deployments (I know that people like IBM would love to update BOSH with true zero downtime). Plus there is all the simplification in the Director logic that can come from using the right tool for the job. Anyway, I think I'll start a POC under the GPL license to make a compatible "BOSH director" using elixir. Anyone who would like to help is more than welcome.
2017-04-27 17:59 GMT+02:00 Geoff Franks <geoff(a)starkandwayne.com>:
FWIW, we've managed BOSHes with many deployments, some of which consist of ~1000 VMs, and not seen any direct performance issues of the BOSH director. Just lengthy deploys due to having so many VMs to iterate through.
I've also seen a significant uptick in responsiveness from the bosh cli when using the v2 cli, since ruby isn't parsing for tons of gemfiles every time I start the CLI up.
On Apr 27, 2017, at 9:01 AM, Leandro David Cacciagioni < leandro.21.2008(a)gmail.com> wrote:
After more than 6 months working with elixir in prod, it crossed my mind that it might be worth spending some time experimenting with and thinking about the possibility of a *TOTAL REWRITE OF BOSH DIRECTOR USING ELIXIR*.
Some of the pros that I can list out of the box (without digging too much into the technical side) are:
· Ruby-like syntax (I know, I know... This means a lot for people who don't like erlang syntax) (I'm used to both so far)
· Ease of development thanks to OTP & FP.
o Scalability (ex: http://www.phoenixframework.org/blog/the-road-to-2-million-websocket-connections)
o Fault-tolerance
o True no downtime updates.
· Simplification:
o nats can be deprecated.
o All the other jobs (Director, Registry, Blobstore, HM & CPI) can be OTP apps (Mix powered) under the same umbrella project.
o Clustering out of the box
· Performance wins: given the nature of elixir/erlang/OTP, it is easy to guess that a single bosh instance will be able to manage more and bigger deployments than it does now.
This is a suggestion, and I would like to know whether you agree or not, and why.
Thanks,
Leandro.-