Re: [vcap-dev] Java OOM debugging


Christopher Frost

Hi,

We've had a look through this and think it would be useful to give our
perspective for now.

To clarify the memory heuristics: they are just weightings, not
percentages, and they were never meant to add up to a hundred. The fact
that the default settings add up to 105 is pure chance. We'll see if the
docs can be improved around this.

Because the JRE isn't detecting an OOM error and triggering the
killjava.sh script, and because increasing the native memory prolongs the
time until the OOM occurs, we believe the leak is happening outside the
JRE: NIO, JNI, creation of lots of threads, etc. There are other things
running in the container that could be consuming the memory.

If you suspect Tomcat, then 8.0.22 is out and has been published for use in
the Java Buildpack; moving up to that might help things. We still haven't
seen a rash of memory problems from the wider Tomcat community for either
8.0.20 or 8.0.22 though, so this would be unexpected.

It is actually possible to set MALLOC_ARENA_MAX using an environment
variable.

cf set-env <APP> MALLOC_ARENA_MAX 2

This can also be specified in an application's manifest file.
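
For example, a minimal manifest entry could look like the following sketch
(the application name and memory limit are placeholders):

---
applications:
- name: my-app        # placeholder application name
  memory: 2G
  env:
    MALLOC_ARENA_MAX: '2'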

Finally, a change has gone into the master branch of the Java buildpack
that moves all of the memory heuristics code to an external program written
in Go. This means that scaling a Java application with cf scale no longer
requires a separate restage for the new memory settings to be applied, as
the Go code will recalculate them during every application start. During
staging you will also be able to see the calculated memory settings printed
on the console. The plan is to release this new feature in version 3.1 of
the buildpack in a week or so.
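
For example, once that release is out, a memory change like the one below
should pick up freshly calculated settings on the restart that cf scale
triggers, with no separate cf restage needed (the 1G value is just an
illustration):

cf scale <APP> -m 1G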

We do still have an issue on our backlog to look at this and it shouldn't
be too long before we get to it.
https://www.pivotaltracker.com/story/show/94381284

Chris.

On Mon, May 11, 2015 at 13:19, Lari Hotari <Lari at hotari.net> wrote:

fyi. Tomcat 8.0.20 might be consuming more memory than 8.0.18:
https://github.com/cloudfoundry/java-buildpack/issues/166#issuecomment-94517568

> Other things we’ve tried:
>
> - We set verbose garbage collection to verify there was no
> memory size issues within the JVM. There wasn’t.
> - We tried setting minimum memory for native, it had no
> effect. The container still gets killed
> - We tried adjusting the ‘memory heuristics’ so that they
> added up to 80 rather than 100. This had the effect of causing a delay
> in the container being killed. However it still was killed.

I think adjusting memory heuristics so that they add up to 80 doesn't
make a difference because the values aren't percentages.
The values are proportional weighting values used in the memory calculation:
https://github.com/grails-samples/java-buildpack/blob/b4abf89/docs/jre-oracle_jre.md#memory-calculation

I found out that the only way to reserve "unused" memory is to set a
high value for the native memory lower bound in the memory_sizes.native
setting of config/open_jdk_jre.yml.
Example:
https://github.com/grails-samples/java-buildpack/blob/22e0f6a/config/open_jdk_jre.yml#L25
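
The relevant snippet ends up looking roughly like this sketch (the 330m..
range syntax sets only a lower bound; see the linked file for the exact
syntax and values):

memory_sizes:
  native: 330m..    # lower bound only, upper bound left open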

> This seems like classic memory leak behaviour to me.

In my case it wasn't a classical Java memory leak, since the Java
application wasn't leaking memory. I was able to confirm this by getting
some heap dumps with the HeapDumpServlet
(https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/HeapDumpServlet.groovy)
and analyzing them.

In my case the JVM's RSS memory size is slowly growing. It probably is
some kind of memory leak since one process I've been monitoring now is
very close to the memory limit. The uptime is now almost 3 weeks.

Here is the latest diff of the meminfo report:
https://gist.github.com/lhotari/ee77decc2585f56cf3ad#file-meminfo_diff_example2-txt

From a Java perspective this isn't classical. The JVM heap isn't filling
up. The problem is that the RSS size is slowly growing and will eventually
cause the Java process to cross the memory boundary so that the process
gets killed by the Linux kernel's cgroups OOM killer.

The RSS size might be growing for many reasons. I have been able to
slow down the growth with various MALLOC_ and JVM parameter tuning
(-XX:MinMetaspaceExpansion=1M -XX:CodeCacheExpansionSize=1M). I'm
able to get a longer uptime, but the problem isn't solved.
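
As a rough sketch, that kind of tuning can be applied with environment
variables; this assumes the buildpack passes JAVA_OPTS through to the JVM,
which depends on how it is configured:

# MALLOC_ARENA_MAX is read by glibc; JAVA_OPTS delivery is buildpack-dependent
cf set-env <APP> MALLOC_ARENA_MAX 2
cf set-env <APP> JAVA_OPTS '-XX:MinMetaspaceExpansion=1M -XX:CodeCacheExpansionSize=1M'
cf restage <APP>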

Lari


On 15-05-11 06:41 AM, Head-Rapson, David wrote:
> Thanks for the continued advice.
>
> We’ve hit on a key discovery after yet another soak test this weekend.
>
> - When we deploy using Tomcat 8.0.18 we don’t see the issue
> - When we deploy using Tomcat 8.0.20 (same app version, same
> CF space, same services bound, same JBP code version, same JRE
> version, running at the same time), we see the crashes occurring after
> just a couple of hours.
>
> Ideally we’d go ahead with the memory calculations you mentioned,
> however we’re stuck on lucid64 because we’re using Pivotal CF 1.3.x &
> we’re having upgrade issues to 1.4.x.
> So we’re not able to adjust MALLOC_ARENA_MAX, nor are we able to view
> RSS in pmap as you describe.
>
> Other things we’ve tried:
>
> - We set verbose garbage collection to verify there was no
> memory size issues within the JVM. There wasn’t.
> - We tried setting minimum memory for native, it had no
> effect. The container still gets killed
> - We tried adjusting the ‘memory heuristics’ so that they
> added up to 80 rather than 100. This had the effect of causing a delay
> in the container being killed. However it still was killed.
>
> This seems like classic memory leak behaviour to me.
> From: Lari Hotari [mailto:lari.hotari at sagire.fi] On Behalf Of Lari Hotari
> Sent: 08 May 2015 16:25
> To: Daniel Jones; Head-Rapson, David
> Cc: cf-dev at lists.cloudfoundry.org
> Subject: Re: [Cf-dev] [vcap-dev] Java OOM debugging
> For my case, it turned out to be essential to reserve enough memory
> for "native" in the JBP. For the 2GB total memory, I set the minimum
> to 330M. With that setting I have been able to get over 2 weeks up
> time by now.
>
> I mentioned this in my previous email:
>
> The workaround for that in my case was to add a native key under
> memory_sizes in open_jdk_jre.yml and set the minimum to 330M (that is
> for a 2GB total memory).
> See example:
> https://github.com/grails-samples/java-buildpack/blob/22e0f6a/config/open_jdk_jre.yml#L25
> That was how I got the app I'm running on CF to stay within the memory
> bounds. I'm sure there is now also a way to get the keys without
> forking the buildpack. I could have also adjusted the percentage
> portions, but I wanted to set a hard minimum for this case.
>
> I've been trying to get some insight by diffing the reports gathered
> from the meminfo servlet:
> https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MemoryInfoServlet.groovy
>
> Here is such an example of a diff:
> https://gist.github.com/lhotari/ee77decc2585f56cf3ad#file-meminfo_diff_example-txt
>
> meminfo has pmap output included to get the report of the memory map
> of the process. I have just noticed that most of the memory has
> already been mmap:ed from the OS and it's just growing in RSS size.
> For example:
> < 00000000a7600000 1471488 1469556 1469556 rw--- [ anon ]
> > 00000000a7600000 1471744 1470444 1470444 rw--- [ anon ]
>
> The pmap output from lucid64 didn't include the RSS size, so you have
> to use cflinuxfs2 for this. It's also better because of other reasons:
> the glibc in lucid64 is old and has some bugs around MALLOC_ARENA_MAX.
>
> I was manually able to estimate the maximum RSS size that the Java
> process will consume by simply picking the large anon blocks from the
> pmap report and counting those blocks by their allocated virtual size
> (VSS).
> Based on this calculation, I picked the minimum of 330M for "native"
> in open_jdk_jre.yml as I mentioned before.
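
That calculation can be sketched from a saved pmap report, for example
(the report file name and the 10MB threshold below are just placeholders):

# sum the Kbytes (VSS) column of anon mappings larger than ~10MB
awk '/anon/ && $2 > 10240 { total += $2 } END { printf "%d MB\n", total/1024 }' pmap_report.txt
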
> It looks like these rows are for the Heap size:
> < 00000000a7600000 1471488 1469556 1469556 rw--- [ anon ]
> > 00000000a7600000 1471744 1470444 1470444 rw--- [ anon ]
>
> It looks like the JVM doesn't fully allocate that block in RSS
> initially and most of the growth of RSS size comes from that in my
> case. In your case, it might be something different.
>
> I also added a servlet for getting glibc malloc_info statistics in XML
> format (). I haven't really analysed that information because of time
> constraints and because I don't have a pressing problem any more. Btw,
> the malloc_info XML report is missing some key elements that have been
> added in later glibc versions
> (https://github.com/bminor/glibc/commit/4d653a59ffeae0f46f76a40230e2cfa9587b7e7e).
>
> If killjava.sh never fires and the app crashed with Warden out of
> memory errors, then I believe it's the kernel's cgroups OOM killer
> that has killed the container processes. I have found the location
> where the Warden oom notifier gets the OOM notification event:
> https://github.com/cloudfoundry/warden/blob/ad18bff/warden/lib/warden/container/features/mem_limit.rb#L70
> This is the oom.c source code:
> https://github.com/cloudfoundry/warden/blob/ad18bff7dc56acbc55ff10bcc6045ebdf0b20c97/warden/src/oom/oom.c
> It reads the cgroups control files and receives events from the
> kernel that way.
> I'd suggest that you use pmap for the Java process after it has
> started and calculate the maximum RSS size by adding up the VSS size
> of the large anon blocks (instead of their RSS) that the Java
> process has reserved for its different memory areas. You should
> discard adding VSS for the CompressedClassSpaceSize block.
> After this calculation, add enough memory to the "native" parameter in
> the JBP until the RSS size calculated this way stays under the limit.
> That's the only "method" I have come up with by now.
>
> It might be required to have some RSS space allocated for any zip/jar
> files read by the Java process. I think that Java uses mmap files for
> zip file reading by default and that might go on top of all other limits.
> To test this theory, I'd suggest adding the
> -Dsun.zip.disableMemoryMapping=true system property setting to
> JAVA_OPTS. That disables the native mmap for zip/jar file reading. I
> haven't had time to test this assumption.
> I guess the only way to understand how Java allocates memory is to
> look at the source code.
> From http://openjdk.java.net/projects/jdk8u/ , the instructions to get
> the source code of JDK 8:
> hg clone http://hg.openjdk.java.net/jdk8u/jdk8u; cd jdk8u; sh get_source.sh
> This tool is really good for grepping and searching the source code:
> http://geoff.greer.fm/ag/
> On Ubuntu it's in the silversearcher-ag package ("apt-get install
> silversearcher-ag") and on MacOSX it's "brew install the_silver_searcher".
> This alias is pretty useful:
> alias codegrep='ag --color --group --pager less -C 5'
> Then you just search for the correct location in code by starting with
> the tokens you know about:
> codegrep MaxMetaspaceSize
> This gives pretty good starting points for looking at how the JDK
> allocates memory.
>
> So the JDK source code is only a few commands away.
>
> It would be interesting to hear more about this if someone has the
> time to dig into this. This is about how far I got and I hope sharing
> this information helps someone continue. :)
>
> Lari
> github/twitter: lhotari

Truncated



--
Christopher Frost - GoPivotal UK
