Some results running CloudController under JRuby


Steffen Uhlig
 

Hi,
 
the Flintstone team recently spent some time researching the potential benefits of running the CloudController under JRuby. We were hoping to find evidence that JRuby (and the underlying JVM) would allow us to make better use of multiple cores, and maybe also lead to significant response time improvements when answering many parallel requests.
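The multi-core hope rests on the fact that MRI's GVL serializes Ruby-level execution across threads, while the JVM does not. A tiny illustrative workload (the function and counts are ours, not from the spike):

```ruby
require "digest"

# CPU-bound stand-in for request handling: repeated hashing.
def busy_work(iterations)
  value = "seed"
  iterations.times { value = Digest::SHA256.hexdigest(value) }
  value
end

# Under MRI, these four threads take roughly as long as doing the work
# sequentially; under JRuby the JVM can run them on four cores in parallel.
threads = 4.times.map { Thread.new { busy_work(20_000) } }
digests = threads.map(&:value)
puts "computed #{digests.size} digests"
```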
 
This exercise wasn't set up as a scientific benchmark; it is more of a spike to help us judge whether the next level of detail is worth investigating.
 
We would like to share some early results in the hope of getting feedback from the community.
 
In our measurements, we saw a 20-30% improvement in both average response time and throughput when 10 or more concurrent requests were made (using ApacheBench against the `/v2/orgs/*/spaces` endpoint).
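For readers who want the shape of those two numbers: average response time and throughput can be derived as in this simplified ab-style sketch (`measure` and the stubbed request are ours; in the spike, ApacheBench issued real HTTP requests):

```ruby
# Run `concurrency` workers, each issuing `requests_per_worker` calls to the
# given block, then report average response time and overall throughput.
def measure(concurrency:, requests_per_worker:, &request)
  latencies = Queue.new
  started = Time.now
  workers = concurrency.times.map do
    Thread.new do
      requests_per_worker.times do
        t0 = Time.now
        request.call
        latencies << (Time.now - t0)
      end
    end
  end
  workers.each(&:join)
  total = Time.now - started

  samples = []
  samples << latencies.pop until latencies.empty?
  {
    avg_response_time: samples.sum / samples.size, # seconds per request
    throughput: samples.size / total               # requests per second
  }
end

# Stub "request" that just sleeps; a real run would do an HTTP GET instead.
stats = measure(concurrency: 10, requests_per_worker: 5) { sleep 0.01 }
puts format("avg=%.3fs rps=%.1f", stats[:avg_response_time], stats[:throughput])
```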
 
Graphs:
    * Throughput: https://goo.gl/NuWkvf
    * Response Time: https://goo.gl/ItPBHN
 
We patched a CC VM to use JRuby 9000 under OpenJDK 8. WEBrick was used as we weren't able to quickly find a drop-in replacement for Thin (as used under MRI). All measurements were taken on a 2014 MacBook Pro running Cloud Foundry in a BOSH Lite environment. Simulating network latency by adding a 100 ms sleep to each request did not change the overall picture.
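The 100 ms artificial latency can be simulated with a small Rack-style middleware along these lines (the class, wiring, and fake downstream app are our illustration, not the spike's actual patch):

```ruby
# Rack-style middleware that delays every request by a fixed amount to
# simulate time spent on the wire.
class SimulatedLatency
  def initialize(app, delay_seconds: 0.1)
    @app = app
    @delay_seconds = delay_seconds
  end

  def call(env)
    sleep(@delay_seconds) # pretend the request crossed a slow network
    @app.call(env)
  end
end

# Tiny fake downstream app, so the sketch runs without Rack itself.
app = ->(env) { [200, { "Content-Type" => "text/plain" }, ["ok"]] }
stack = SimulatedLatency.new(app, delay_seconds: 0.1)

started = Time.now
status, _headers, body = stack.call({})
puts "status=#{status} body=#{body.join} took=#{(Time.now - started).round(2)}s"
```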
 
For more details see the spreadsheet* and our repository** with the test scripts.
 
Steffen
(on behalf of the Flintstone Team)
 
 


Dieu Cao <dcao@...>
 

Hi Steffen,

Cool stuff! It's good to see there's some improvement under JRuby.
I'd be interested to see how this performs on AWS or on SoftLayer.
Also, were you able to run CATS?
Or is more work needed to deal with NATS etc?
Other pros/cons that you've found?

-Dieu
CF CAPI PM




Amit Kumar Gupta
 

Awesome!

Got some questions:
- What are the differences in pre-packaging dependencies?
- Any differences in pre-packaging time?
- What are the differences in packaging/compilation dependencies?
- Any difference in packaging/compilation time?
- There will be some new job and package blobs, and maybe some old ones go away. What are the size differences?
- Any changes to start-up or update times during bosh deploy/update?

Amit



Steffen Uhlig
 

Hi,
 
Dieu wrote:
 
> Would be interested to see how this performs on aws or on soft layer.
 
Good point - is there an AWS environment we could use? We'll look into SoftLayer.
 
> Also, were you able to run CATS? 
 
We did not attempt to run CATS because we focused on a single endpoint; the others would probably not have worked.
 
> Or is more work needed to deal with NATS etc?
 
More work is certainly needed for that. We haven't yet updated the dependencies we 'patched' for the spike.
 
> Other pros/cons that you've found?
 
Pros not mentioned yet:
 
- There are many more tools for running and monitoring JVM-based applications.
- JRuby in interpreted mode did not noticeably slow down the development workflow.
 
Cons:
 
- JRuby is an additional layer (of complexity) on top of Ruby
- JRuby is still more of a niche platform
 
Amit wrote:
 
> what are differences to pre-packaging dependencies?
> any differences in pre-packaging time?
> what differences for packaging/compilation dependencies?
 
Our approach was minimal; we captured the code changes to the CC and its dependencies on a branch. Those changes would still need to be translated into packaging.
From our spike, we think that the (pre-)packaging would be similar to how UAA handles the OpenJDK. JRuby itself goes on top of that and is simply an additional unzip.
Some gems will need updates to work with JRuby (see the Gemfile in the branch), mostly around YAML support, Thin, and NATS.
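Bundler's platform blocks are the natural place for such swaps. An illustrative excerpt (ours, not the branch's actual Gemfile; gem names are examples of the pattern, not our concrete choices):

```ruby
source "https://rubygems.org"

platforms :jruby do
  # Gems that pin MRI C extensions need JRuby-compatible alternatives here,
  # e.g. a pure-Ruby or Java-backed server in place of thin.
  gem "puma"
end

platforms :mri do
  gem "thin"
end
```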
 
> any difference in packaging/compilation time?
 
We did not measure times. From a Ruby perspective, JRuby did not feel different. JRuby can run in two modes: interpreted and compiled. We only worked in interpreted mode; in compiled mode, the Ruby code is compiled to JVM bytecode ahead of time, which should lead to further improvements.
 
> there are going to be some new job and packages blobs, and maybe some old ones go away? what are the size differences?
 
We can re-use the OpenJDK from UAA. JRuby is a 40 MiB download, from which we can eliminate samples etc., so probably 30 MiB in addition.
 
> any changes to start-up or update times during bosh deploy/update?
 
Sorry, we did not measure this either ;-)
 
 
Regards
 
Marc & Steffen
 
 
