Elixir for bosh director?
Leandro David Cacciagioni
After more than 6 months working with Elixir in prod, it crossed my mind
that it may be worth spending some time experimenting with, and thinking
about, the possibility of a *TOTAL REWRITE OF BOSH DIRECTOR USING ELIXIR*.
Some of the pros that I can list off the top of my head (without digging too
much into the technical side) are:
- Ruby-like syntax (I know, I know... this means a lot to people who
  don't like Erlang syntax; I'm used to both so far)
- Ease of development thanks to OTP & FP:
  - Scalability (e.g.
    http://www.phoenixframework.org/blog/the-road-to-2-million-websocket-connections)
  - Fault tolerance
  - True no-downtime updates
- Simplification:
  - NATS can be deprecated.
  - All the other jobs (Director, Registry, Blobstore, HM & CPI) can be
    OTP apps (Mix-powered) under the same umbrella project.
  - Clustering out of the box
- Performance wins: given the nature of Elixir/Erlang/OTP, it is easy to
  guess that a single BOSH instance will be able to manage more
  deployments, and bigger deployments, than it does now.
This is a suggestion, and I would like to know whether you agree or not, and why.
Thanks,
Leandro.-
Geoff Franks <geoff@...>
FWIW, we've managed BOSHes with many deployments, some of which consist of ~1000 VMs, and not seen any direct performance issues of the BOSH director. Just lengthy deploys due to having so many VMs to iterate through.
I've also seen a significant uptick in responsiveness from the bosh cli when using the v2 cli, since ruby isn't parsing for tons of gemfiles every time I start the CLI up.
On Apr 27, 2017, at 9:01 AM, Leandro David Cacciagioni <leandro.21.2008(a)gmail.com> wrote:
Leandro David Cacciagioni
OK, what you describe is certainly impressive, but it only addresses
scalability in part (I know for sure that Elixir/Erlang can handle the same
load a lot better), and it doesn't solve the fault-tolerance part or true
no-downtime deployments (I know that companies like IBM would love to update
BOSH with true zero downtime). Plus there is all the simplification in the
Director logic that can come from using the right tool for the job. Anyway, I
think I'll start a POC under the GPL license to make a compatible "BOSH
director" using Elixir. Anyone who would like to help is more than welcome.
2017-04-27 17:59 GMT+02:00 Geoff Franks <geoff(a)starkandwayne.com>:
Geoff Franks <geoff@...>
What kind of downtime are they seeing when upgrading the BOSH director? IIRC there were things like binary-bosh to keep bosh alive + responsive all the time, but in practice, the main cause of downtime for BOSH is the delay for compilation + VM creation/deletion during `create-env`, which I don't think rewriting in another language can solve (stemcell upgrades). Downtime from restarting the director is pretty minimal, and BOSH itself isn't a critical path for the availability of the VMs it deploys.
We've mitigated the create-env issues by using create-env to create a BOSH that will manage upper-level infrastructure that isn't as important, and have that BOSH deploy other BOSHes to deploy prod (lower risk of not restarting a failed VM due to a BOSH upgrade).
On Apr 27, 2017, at 1:57 PM, Leandro David Cacciagioni <leandro.21.2008(a)gmail.com> wrote:
Marco Voelz
Dear Leandro,
I'd love to see your experiment grow – keep in mind that the Director has been around for quite a while and has some pretty complicated corner cases. Just like any rewrite: it is pretty simple to get 80% right, but then you'll spend a lot of time on getting the remaining 20%.
A word on the license: If your target audience is really companies like IBM, don't go with GPL. I know that, for example, GPL is a no-go for us at SAP. I would assume a similar policy is in place in pretty much every big enterprise.
Warm regards
Marco
From: Leandro David Cacciagioni <leandro.21.2008(a)gmail.com>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <cf-bosh(a)lists.cloudfoundry.org>
Date: Thursday, 27. April 2017 at 19:57
To: "Discussions about the Cloud Foundry BOSH project." <cf-bosh(a)lists.cloudfoundry.org>
Subject: [cf-bosh] Re: Re: Elixir for bosh director?
Leandro David Cacciagioni
Guys, I'm not saying that the director is bad or wrong. What I actually
want is to improve it a little bit without touching the logic or the
API; my final goal is maybe to create a drop-in replacement while keeping the
agent and the logic in place. I know it can be hard work, but OTP solves a
lot of the "edge cases" of the classic languages out of the box.
Geoff, by downtime I mean that, no matter what, in languages like
Ruby/Python/Go or any "classical" language you need to stop and restart
the server to load the new code, while in Erlang/Elixir there is no need
for this, since it has a feature called "hot code reloading" (you
can read about it here
<http://learnyousomeerlang.com/designing-a-concurrent-application#hot-code-loving>,
here <http://erlang.org/doc/man/code.html> and here
<http://www.unstablebuild.com/2016/03/18/hot-code-reload-in-elixir.html>).
It is one of the mottos of Erlang: 99.9999999% (nine nines of availability);
you can read more here <https://pragprog.com/articles/erlang>.
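To put the quoted availability figure in perspective, the following minimal Python sketch converts an availability percentage into the downtime it allows per year. The function name and the 365-day year are assumptions made for illustration; nothing here comes from Erlang or BOSH itself.

```python
# Downtime budget implied by an availability target.
# Assumption: a non-leap year of 365 days.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the seconds of downtime per year allowed by an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * SECONDS_PER_YEAR


if __name__ == "__main__":
    targets = [
        ("three nines", 99.9),
        ("five nines", 99.999),
        ("nine nines", 99.9999999),
    ]
    for label, percent in targets:
        seconds = downtime_seconds_per_year(percent)
        print(f"{label}: about {seconds:.4f} s of downtime per year")
```

For nine nines this works out to roughly 0.03 seconds (about 31.5 milliseconds) of downtime per year, which shows how demanding that target is.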
Marco, good catch, and thanks for the suggestion about the license; maybe I'll
evaluate some others like Apache or LGPL.
Thanks,
Leandro.-
2017-04-27 20:26 GMT+02:00 Voelz, Marco <marco.voelz(a)sap.com>:
Leandro David Cacciagioni
BTW, does anyone have proper docs (like a Swagger doc) for the API and for
the NATS RPC? Or know how to generate them from the code?
2017-04-27 21:00 GMT+02:00 Leandro David Cacciagioni <
leandro.21.2008(a)gmail.com>:
Dmitriy Kalinin
few small comments on the thread...
> FWIW, we've managed BOSHes with many deployments, some of which consist
> of ~1000 VMs, and not seen any direct performance issues of the BOSH
> director. Just lengthy deploys due to having so many VMs to iterate through.
exactly what i was thinking.
> Scalability, Fault-tolerance, True no downtime updates.
that's all easy when there is no shared state to manage. director carries a
lot of state (as it has to) and that state has to be migrated over time for
backwards compatibility etc.
you can technically deploy as many directors as you want today
(horizontally scalable) but they all in the end have to connect to some
shared state (database).
> I know that people like IBM will love to update BOSH with true / zero
> downtime
downtime-less deployments of the director will be achievable soon enough
when we expand agent's connectivity options to allow for connecting to
multiple directors. this will make rolling directors just a standard
procedure (like rolling cloud controllers in cf for example).
> Performance wins, giving the nature of elixir/erlang/OTP is easy to guess
> that a single bosh instance will gonna be able to manage more deployments
> and bigger deployments than it does now.
if you take a look at where the majority of the time is spent, it's not in the
director itself but in all the other components the director orchestrates (cpi,
installing jobs, startup, etc.). optimizing the director would be focusing on
5% and most likely language choice isn't going to yield any noticeable
change.
> Plus all the simplification in the Director logic that can come from
> using the proper tool for the right job
not sure which parts you think can be simplified. director is a pretty
vanilla application that uses a db, etc.
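The 5% estimate above can be turned into a rough upper bound with Amdahl's law. The sketch below is a minimal Python illustration; the function name is hypothetical, and the 5% share is taken from the message as an assumption, not a measurement.

```python
def overall_speedup(component_share: float, component_speedup: float) -> float:
    """Amdahl's law: overall speedup when only one component gets faster.

    component_share   -- fraction of total time spent in the component (0..1)
    component_speedup -- how many times faster that component becomes (> 0)
    """
    if not 0.0 <= component_share <= 1.0:
        raise ValueError("component_share must be between 0 and 1")
    if component_speedup <= 0:
        raise ValueError("component_speedup must be positive")
    untouched = 1.0 - component_share
    return 1.0 / (untouched + component_share / component_speedup)


# Assumption: the director accounts for about 5% of total deploy time.
DIRECTOR_SHARE = 0.05

if __name__ == "__main__":
    for speedup in (2.0, 10.0, float("inf")):
        overall = overall_speedup(DIRECTOR_SHARE, speedup)
        print(f"director {speedup}x faster -> deploy {overall:.3f}x faster")
```

Under that assumption, even an infinitely fast director makes a deploy at most about 1.05x faster overall, because 95% of the time is spent in the CPI, job installation, startup and so on.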
On Thu, Apr 27, 2017 at 11:26 AM, Voelz, Marco <marco.voelz(a)sap.com> wrote:
Dear Leandro,
I'd love to see your experiment grow – keep in mind that the Director is
around for quite a while and has some pretty complicated corner cases. Just
like any rewrite: It is pretty simple to get 80% right, but then you'll
spend much time on getting the remaining 20%.
A word on the license: If your target audience is really companies like
IBM, don't go with GPL. I now that for example GPL is a no-go for us at
SAP. I would assume a similar policy is in place in pretty much every big
enterprise.
Warm regards
Marco
*From: *Leandro David Cacciagioni <leandro.21.2008(a)gmail.com>
*Reply-To: *"Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
*Date: *Thursday, 27. April 2017 at 19:57
*To: *"Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
*Subject: *[cf-bosh] Re: Re: Elixir for bosh director?
OK what you quote is certainly amazing, anyway that only tackle the
Scalability in part (I know for sure that elixir/erlang can hold the same
a lot better) but it didn't solve the Fault Tolerance part or the true no
downtime deployments (I know that people like IBM will love to update BOSH
with true / zero downtime). Plus all the simplification in the Director
logic that can come from using the proper tool for the right job. Anyway I
think I'll start a POC as under GPL license to make a compatible "BOSH
director" using elixir. Anyone who will like to help more than welcomed.
2017-04-27 17:59 GMT+02:00 Geoff Franks <geoff(a)starkandwayne.com>:
FWIW, we've managed BOSHes with many deployments, some of which consist of
~1000 VMs, and not seen any direct performance issues of the BOSH director.
Just lengthy deploys due to having so many VMs to iterate through.
I've also seen a significant uptick in responsiveness from the bosh cli
when using the v2 cli, since ruby isn't parsing for tons of gemfiles every
time I start the CLI up.
On Apr 27, 2017, at 9:01 AM, Leandro David Cacciagioni <
leandro.21.2008(a)gmail.com> wrote:
After more than 6 months working with Elixir in prod, it crossed my mind
that maybe it deserves some time to experiment and think about the possibility
of a *TOTAL REWRITE OF BOSH DIRECTOR USING ELIXIR*.
Some of the pros that I can list out of the box (without digging too much
into the technical side) are:
- Ruby-like syntax (I know, I know... This means a lot for people
that don't like Erlang syntax) (I'm used to both so far)
- Ease of development thanks to OTP & FP.
  - Scalability (ex: http://www.phoenixframework.org/blog/the-road-to-2-million-websocket-connections)
  - Fault-tolerance
  - True no-downtime updates.
- Simplification:
  - NATS can be deprecated.
  - All the other jobs (Director, Registry, Blobstore, HM & CPI) can
be OTP apps (Mix powered) under the same umbrella project.
  - Clustering out of the box
- Performance wins: given the nature of Elixir/Erlang/OTP, it is easy
to guess that a single BOSH instance will be able to manage more
deployments and bigger deployments than it does now.
This is a suggestion and I would like to know if you agree or don't and
why.
Thanks,
Leandro.-
Eric Malm <emalm@...>
Leandro,
If you intend your project eventually to be considered for the CFF to
adopt, please license it as Apache 2.0. That license is used uniformly
across other Foundation projects. Please see
https://www.cloudfoundry.org/governance/cff_ip_policy/ for more details.
I understand the technical benefit of the hot-reloading feature that Erlang
brings, but I view it as incompatible with the realities of how BOSH itself
is deployed. It's typically bootstrapped from some other tool in the BOSH
ecosystem, whether that be another BOSH instance, or the new v2 BOSH CLI,
or even the ancient bosh micro CLI plugin. Those tools all follow the BOSH
update pattern of stopping services on a VM, replacing the software bits
and configuration (and, in the CLI cases, even the VM itself!), and
restarting the services. Unless you go out of your way with the BOSH
release itself to violate the expectations of the BOSH job lifecycle,
there's no opportunity to take advantage of that hot-reloading feature, and
it wouldn't work at all anyway if the VM is replaced.
I think a more effective solution regarding downtime would be to make BOSH
deployable in a fully HA mode, which would address both availability during
upgrades and tolerance to a wider variety of failure modes (component, VM,
availability zone). I've heard Dmitriy mention that as a potential
direction for BOSH in the past, but taking a quick look at the BOSH project
tracker I don't currently see work related to that effort. Even then, for
almost everyone, BOSH is a means to the end of deploying the software you
really care about in a way that allows you to evolve it over time. So it's
typically not a substantial issue in practice for BOSH to have only a few
9s of availability, so long as the state it retains about the deployments
it manages can always be restored successfully to a new BOSH director
within a suitable period of time.
Finally, having been deeply involved in a rewrite of another major CF
subsystem (DEAs to Diego), +1000000 on Jonathan's observation about
rewrites always being harder to execute and taking longer than you expect,
even when you try to account for those expected delays. (This can be viewed
as one manifestation of the more general Hofstadter's Law
<https://en.wikipedia.org/wiki/Hofstadter%27s_law>.) If you do perceive
some benefits to simplifying the BOSH architecture and think that can be
achieved through a rewrite in a different language, look for seams and
interfaces to keep that change as small as possible while still being
impactful.
Thanks,
Eric, CF Diego PM
On Thu, Apr 27, 2017 at 12:00 PM, Leandro David Cacciagioni <
leandro.21.2008(a)gmail.com> wrote:
Guys, I'm not saying that the director is bad or wrong. Actually, what I
want is maybe to improve it a little bit without touching the logic or the
API; my final goal is maybe to create a drop-in replacement while keeping the
agent and the logic in place. I know it can be hard work, but OTP solves a
lot of "edge cases" of the classic languages out of the box.
Geoff, by downtime I mean that, no matter what, in languages like
Ruby/Python/Go or any "classical" language you need to stop and start
the server again to load the new code, while in Erlang / Elixir there is no need
for this, since it has a feature called "hot code reloading". (You
can read about it here
<http://learnyousomeerlang.com/designing-a-concurrent-application#hot-code-loving>,
here <http://erlang.org/doc/man/code.html> and here
<http://www.unstablebuild.com/2016/03/18/hot-code-reload-in-elixir.html>.)
It is one of the mottos of Erlang: 99.9999999% (nine nines of availability),
and you can read more here <https://pragprog.com/articles/erlang>.
Marco, good catch and thanks for the suggestion about the license; maybe I'll
evaluate some others like Apache or LGPL.
Thanks,
Leandro.-
Leandro David Cacciagioni
Eric, thanks a lot. I really appreciate your point of view and would have to
say that yes, my idea of involving Elixir / Erlang is to have a proper
multi-VM deployment to create a fully redundant, highly available BOSH
deployment. How you deploy and update the director and its
components can change a little bit, and maybe the CLI can change in the future
;) , anyway I know that work like this can take a lot of time and that it will
involve more people over time, if the day comes.
Let's see if I can get some minimal POC over the next months, at least with
the basic features.
Thanks,
Leandro.-
Gwenn Etourneau
Leandro,
To be honest, if I had to choose, I would prefer Go over Elixir / Erlang.
Most of the tools around CF are written in Go, and people in the community (I
think) have already spent time learning Go and are now pretty good with it.
I'm not sure introducing another language just for the beauty of it (yak shaving) is
a good idea; I like the path that the bosh cli took by rewriting everything in
Go.
Thanks.
Leandro David Cacciagioni
To be honest I like Go... but for CLIs or clients. There, in that field, I
don't know any other language as easy to compile or as easy to get up and
running.
Anyway, in the server field Go has the same major problem as any imperative
language... shared mutable state... which makes it not the best fit for a
highly concurrent, distributed BOSH. Plus Elixir has OTP, which makes your
life extremely easy, and the Elixir syntax is similar to Ruby, which is the
current language of choice for the BOSH director; that's why I chose it
over pure Erlang.
Once again, my plan is just to replace the server side of the equation,
because for CLIs Elixir is a no-go when you compare it with Golang.
Anyway, this is my point of view, which I'll try to prove with some
coding... Then if it picks up in the community, great!!! If it doesn't, tough luck.
Thanks,
Leandro.-
On Thu, Apr 27, 2017 at 12:00 PM, Leandro David Cacciagioni <
leandro.21.2008(a)gmail.com> wrote:
Guys, I'm not saying that the director is bad or wrong; what I actually
want is to improve it a little bit without touching the logic or the
api. My final goal is maybe to create a drop-in replacement while keeping the
agent and the logic in place. I know it can be hard work, but OTP solves a
lot of the "edge cases" of the classic languages out of the box.
Geoff, by downtime I mean that, no matter what, in languages like
ruby/python/go or any "classical" language you need to stop and restart
the server to load the new code, while in erlang / elixir there is no need
for this, since it has a feature called "hot code reloading". (You
can read about it here
<http://learnyousomeerlang.com/designing-a-concurrent-application#hot-code-loving>,
here <http://erlang.org/doc/man/code.html> and here
<http://www.unstablebuild.com/2016/03/18/hot-code-reload-in-elixir.html>.)
It is behind one of the mottos of erlang, 99.9999999% (nine nines of
availability), and you can read more here <https://pragprog.com/articles/erlang>.
Marco, good catch, and thanks for the suggestion about the license; maybe
I'll evaluate some others like Apache or LGPL.
Thanks,
Leandro.-
2017-04-27 20:26 GMT+02:00 Voelz, Marco <marco.voelz(a)sap.com>:
Dear Leandro,
I'd love to see your experiment grow – keep in mind that the Director
has been around for quite a while and has some pretty complicated corner cases.
Just like any rewrite: It is pretty simple to get 80% right, but then
you'll spend a lot of time getting the remaining 20% right.
A word on the license: If your target audience is really companies
like IBM, don't go with GPL. I know that, for example, GPL is a no-go for us
at SAP. I would assume a similar policy is in place in pretty much every
big enterprise.
Warm regards
Marco
*From: *Leandro David Cacciagioni <leandro.21.2008(a)gmail.com>
*Reply-To: *"Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
*Date: *Thursday, 27. April 2017 at 19:57
*To: *"Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
*Subject: *[cf-bosh] Re: Re: Elixir for bosh director?
OK, what you quote is certainly amazing; anyway, that only tackles the
scalability in part (I know for sure that elixir/erlang can handle the
same a lot better), but it doesn't solve the fault tolerance part or the true
no-downtime deployments (I know that people like IBM would love to update
BOSH with true zero downtime). Plus there is all the simplification in the
Director logic that can come from using the right tool for the job.
Anyway, I think I'll start a POC under the GPL license to make a compatible
"BOSH director" using elixir. Anyone who would like to help is more than
welcome.
2017-04-27 17:59 GMT+02:00 Geoff Franks <geoff(a)starkandwayne.com>:
FWIW, we've managed BOSHes with many deployments, some of which
consist of ~1000 VMs, and not seen any direct performance issues of the
BOSH director. Just lengthy deploys due to having so many VMs to iterate
through.
I've also seen a significant uptick in responsiveness from the bosh
cli when using the v2 cli, since ruby isn't parsing for tons of gemfiles
every time I start the CLI up.
On Apr 27, 2017, at 9:01 AM, Leandro David Cacciagioni <
leandro.21.2008(a)gmail.com> wrote:
After more than 6 months working with elixir in prod, it crossed my
mind that maybe it deserves some time to experiment and think about the
possibility of a *TOTAL REWRITE OF BOSH DIRECTOR USING ELIXIR*.
Some of the pros that I can list out of the box (without digging too
much into the technical side) are:
- Ruby-like syntax (I know, I know... This means a lot for
people that don't like erlang syntax) (I'm used to both so far)
- Easiness of development thanks to OTP & FP.
  - Scalability (ex:
  http://www.phoenixframework.org/blog/the-road-to-2-million-websocket-connections)
  - Fault-tolerance
  - True no-downtime updates.
- Simplification:
  - nats can be deprecated.
  - All the other jobs (Director, Registry, Blobstore, HM & CPI) can
  be OTP/Apps (Mix powered) under the same umbrella project.
  - Clustering out of the box
- Performance wins: given the nature of elixir/erlang/OTP, it is
easy to guess that a single bosh instance will be able to manage more
deployments and bigger deployments than it does now.
This is a suggestion, and I would like to know whether you agree or not,
and why.
Thanks,
Leandro.-