Re: [vcap-dev] Java OOM debugging
Lari Hotari <Lari@...>
I created a few tools to debug OOM problems since the application I was
responsible for running on CF was failing constantly because of OOM
problems. The problems I had turned out not to be actual memory leaks
in the Java application.
In the "cf events appname" log I would get entries like this:
2015-xx-xxTxx:xx:xx.00-0400 app.crash appname index:
1, reason: CRASHED, exit_description: out of memory, exit_status: 255
These kinds of entries are produced when the container goes over its
memory resource limit. It doesn't mean that there is a memory leak in
the Java application: the container gets killed by the Linux kernel's
oom killer based on the resource limits set on the warden container.
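As a rough illustration, you can read the enforced limit from inside the container. This is only a sketch: the path below assumes the cgroup v1 layout, which may differ per stack, so it falls back to a 2G example value when the file is absent.

```shell
# Sketch: read the memory limit the kernel enforces on this container.
# The cgroup v1 path is an assumption; it may not exist on every stack.
LIMIT_FILE=/sys/fs/cgroup/memory/memory.limit_in_bytes
if [ -r "$LIMIT_FILE" ]; then
  LIMIT_BYTES=$(cat "$LIMIT_FILE")
else
  LIMIT_BYTES=$((2 * 1024 * 1024 * 1024))   # fall back to a 2G example
fi
echo "memory limit: $LIMIT_BYTES bytes"
```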
The memory limit is specified in number of bytes and is enforced using
cgroups. In my case the process never got killed by the killjava.sh
script that gets called in the java-buildpack when an OOM happens inside
the JVM.
This is the tool I built to debug the problems:
I deployed that app as part of the forked buildpack I'm using.
Please read the readme about its limitations. It worked for me, but it
might not work for you. It's open source and you can fork it. :)
My toolkit also includes a solution for creating a heap dump and
uploading it to S3:
The readme explains how to setup Amazon S3 keys for this:
Once you get a dump, you can analyse it in a Java profiler such as
YourKit.
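For reference, here is a minimal sketch of grabbing a dump by hand, assuming `jmap` from the JDK is on the PATH; the pid and S3 bucket name are hypothetical, and the commands are only echoed rather than executed:

```shell
# Sketch: capture a binary heap dump that YourKit (or Eclipse MAT) can open.
PID=12345                        # hypothetical Java process id
DUMP="/tmp/heap-$PID.hprof"
echo "jmap -dump:format=b,file=$DUMP $PID"
# Then ship it off the container, e.g. with the AWS CLI:
echo "aws s3 cp $DUMP s3://my-bucket/dumps/"   # bucket name is hypothetical
```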
I also have a solution that forks the java-buildpack, modifies
killjava.sh, and adds a script that uploads the heap dump to S3 when an
OOM occurs:
In java-buildpack-diagnostics-app I also have other tools for getting
Linux operating-system-specific memory information, for example:
These tools are handy for looking at the details of the Java process's
RSS memory growth.
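One simple way to watch RSS is to sample /proc directly (Linux only). This sketch reads the shell's own pid; in practice you would substitute the Java process's pid:

```shell
# Sketch: sample the resident set size (VmRSS) of a process from /proc.
# $$ is this shell's own pid; use the Java process pid in practice.
RSS_LINE=$(grep VmRSS "/proc/$$/status")
echo "$RSS_LINE"
```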
There is also a solution for getting ssh shell access inside your
application with tmate.io:
(this version is only compatible with the new "cflinuxfs2" stack)
It looks like there are serious problems on CloudFoundry with the memory
sizing calculation. An application that doesn't have an OOM problem can
still get killed by the oom killer because the Java process goes over
the container's memory limit.
I filed this issue:
https://github.com/cloudfoundry/java-buildpack/issues/157 , but that
might not cover everything.
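The failure mode can be sketched with made-up numbers. A JVM's RSS is roughly heap + metaspace + code cache + thread stacks + native allocations, so the heap alone fitting the limit is not enough:

```shell
# Illustrative arithmetic only (all numbers are invented): the sum of the
# JVM's memory regions, not just the heap, is what the oom killer sees.
LIMIT_MB=2048
HEAP_MB=1536; METASPACE_MB=192; CODE_CACHE_MB=128; STACKS_MB=100; NATIVE_MB=330
TOTAL_MB=$((HEAP_MB + METASPACE_MB + CODE_CACHE_MB + STACKS_MB + NATIVE_MB))
echo "$TOTAL_MB MB used vs $LIMIT_MB MB limit"   # over the limit -> killed
```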
The workaround in my case was to add a native key under memory_sizes in
open_jdk_jre.yml and set the minimum to 330M (that was with a 2GB total
memory limit).
That was how I got the app I'm running on CF to stay within its memory
bounds. I'm sure there is now also a way to set the keys without forking
the buildpack. I could also have adjusted the percentage portions, but I
wanted to set a hard minimum for this case.
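As a sketch (the exact file layout may differ between buildpack versions), the workaround looked roughly like this fragment of config/open_jdk_jre.yml, using the buildpack's range syntax to give native memory a floor:

```yaml
# Sketch only: forked java-buildpack, config/open_jdk_jre.yml.
# "330m.." is an open-ended range meaning "at least 330M" for native
# memory (sized here against a 2G container).
memory_sizes:
  native: 330m..
```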
It was also required to do some other tuning.
I added this to JAVA_OPTS:
while trying to keep the Java process from growing in RSS memory size.
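The exact flags I used aren't reproduced above. As a hedged illustration, the standard HotSpot options typically used to cap the non-heap areas look like this; the values are made-up examples, not recommendations:

```shell
# Sketch: cap the JVM's non-heap memory areas so RSS stays predictable.
# The flag names are standard HotSpot options; the values are examples.
JAVA_OPTS="-Xss512k \
 -XX:MaxMetaspaceSize=192m \
 -XX:ReservedCodeCacheSize=128m \
 -XX:MaxDirectMemorySize=64m"
export JAVA_OPTS
echo "$JAVA_OPTS"
```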
The memory overhead of a 64 bit Java process on Linux can be reduced by
specifying these environment variables:
MALLOC_ARENA_MAX works only on cflinuxfs2 stack (the lucid64 stack has a
buggy version of glibc).
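A minimal sketch of setting it (the value 2 is what Heroku suggests; treat it as a starting point, not a prescription):

```shell
# Sketch: limit the number of glibc malloc arenas. On 64-bit Linux each
# arena can reserve 64MB of address space, so capping the arena count
# reduces the Java process's memory footprint.
export MALLOC_ARENA_MAX=2
echo "MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX"
```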
Here is an explanation of MALLOC_ARENA_MAX from Heroku:
and some measurement data on how it reduces memory consumption:
I have created a PR to add this to the CF java-buildpack:
I also created an issue:
I hope this information helps others struggling with OOM problems in CF.
I'm not saying that this is a ready-made solution just for you. YMMV. It
worked for me.
On 15-04-29 10:53 AM, Head-Rapson, David wrote: