Re: Elixir for bosh director?


Geoff Franks <geoff@...>
 

What kind of downtime are they seeing when upgrading the BOSH director? IIRC there were things like binary-bosh to keep BOSH alive and responsive all the time, but in practice the main cause of downtime for BOSH is the delay for compilation plus VM creation/deletion during `create-env` (e.g. for stemcell upgrades), which I don't think rewriting in another language can solve. Downtime from restarting the director is pretty minimal, and BOSH itself isn't on the critical path for the availability of the VMs it deploys.

We've mitigated the create-env issues by using create-env to create a BOSH that will manage upper-level infrastructure that isn't as important, and have that BOSH deploy other BOSHes to deploy prod (lower risk of not restarting a failed VM due to a BOSH upgrade).

On Apr 27, 2017, at 1:57 PM, Leandro David Cacciagioni <leandro.21.2008(a)gmail.com> wrote:

OK, what you quote is certainly amazing, but it only tackles the scalability part (I know for sure that Elixir/Erlang can handle the same scale a lot better). It doesn't solve the fault-tolerance part or true no-downtime deployments (I know that people like IBM would love to update BOSH with true zero downtime). On top of that, there is all the simplification in the Director logic that can come from using the right tool for the job. Anyway, I think I'll start a POC under the GPL license to make a compatible "BOSH director" using Elixir. Anyone who would like to help is more than welcome.

2017-04-27 17:59 GMT+02:00 Geoff Franks <geoff(a)starkandwayne.com>:
FWIW, we've managed BOSHes with many deployments, some of which consist of ~1000 VMs, and have not seen any direct performance issues with the BOSH director. Just lengthy deploys due to having so many VMs to iterate through.

I've also seen a significant uptick in responsiveness from the bosh CLI when using the v2 CLI, since Ruby isn't parsing tons of gemfiles every time I start the CLI up.


On Apr 27, 2017, at 9:01 AM, Leandro David Cacciagioni <leandro.21.2008(a)gmail.com> wrote:

After more than 6 months working with Elixir in prod, it crossed my mind that it might be worth spending some time experimenting with and thinking about the possibility of a TOTAL REWRITE OF BOSH DIRECTOR USING ELIXIR.
Some of the pros that I can list out of the box (without digging too much into the technical side) are:
- Ruby-like syntax (I know, I know... this means a lot for people who don't like Erlang syntax; I'm used to both so far).
- Ease of development thanks to OTP & FP.
- Scalability (ex: http://www.phoenixframework.org/blog/the-road-to-2-million-websocket-connections)
- Fault tolerance.
- True no-downtime updates.
- Simplification:
  - NATS can be deprecated.
  - All the other jobs (Director, Registry, Blobstore, HM & CPI) can be OTP apps (Mix powered) under the same umbrella project.
- Clustering out of the box.
- Performance wins: given the nature of Elixir/Erlang/OTP, it is easy to guess that a single BOSH instance will be able to manage more deployments, and bigger deployments, than it does now.
This is a suggestion, and I would like to know whether you agree or not, and why.

Thanks,
Leandro.-
