Re: Is it possible to use git push to deploy applications on CF
Alexander Lomov <alexander.lomov@...>
Hey.
The simplest way to add this behaviour is to add a `cf push` command to the `.git/hooks/pre-push` executable file. You can find the details in the git docs [0]. This article covers some possible reasons not to tie `cf push` to `git push` [1].

[0] http://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks
[1] http://blog.pivotal.io/pivotal-labs/labs/deploying-jruby-rails-application-cloud-foundry

------------------------
Alex Lomov
*Altoros* — Cloud Foundry deployment, training and integration
*Twitter:* @code1n <https://twitter.com/code1n>
*GitHub:* @allomov <https://gist.github.com/allomov>

On Wed, May 13, 2015 at 10:04 AM, Alan Moran <bonzofenix(a)gmail.com> wrote:
Hi Kinjal,
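For anyone who wants to try the pre-push approach Alexander describes, a minimal hook might look like the sketch below (untested; it assumes the CF CLI is already logged in and targeted at the right org and space, and "my-app" is a placeholder for your application name):

  #!/bin/sh
  # .git/hooks/pre-push -- run `cf push` before every `git push`
  # (make the file executable: chmod +x .git/hooks/pre-push)
  set -e
  cf push my-app

Note that if `cf push` fails, the hook exits non-zero and the `git push` is aborted, which may or may not be what you want.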
Re: Is it possible to use git push to deploy applications on CF
Alan Morán <bonzofenix at gmail.com...>
Hi Kinjal,
`cf push` does not support git input, AFAIK. But it would be fairly simple to implement a cf CLI plugin that does that on the client side to offer a Heroku-like experience.

Regards,
— Alan

On May 12, 2015, at 10:44 PM, Kinjal Doshi <kindoshi(a)gmail.com> wrote:
Is it possible to use git push to deploy applications on CF
Kinjal Doshi
Hi,
I would like to know if it is possible to deploy applications on Cloud Foundry using git push, or whether the CF CLI is the only way to push applications.

Thanks,
Kinjal
Adding multiple users to user/auditor roles of an organization
Anil Ambati <aambati@...>
Hi,
Is there a CF API to add multiple users to multiple roles of an organization? I have looked at the CF docs, but did not find any indication that such an API exists.

Thank you.

Regards,
Anil
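As far as I know there is no bulk role-assignment call in the v2 API, so the usual workaround is to loop over `cf set-org-role` (one request per user and role). A sketch, with placeholder user names and org name; the org roles the CLI accepts are OrgManager, BillingManager and OrgAuditor:

  ORG=my-org
  for USER in alice@example.com bob@example.com carol@example.com; do
    # grant each existing user the auditor role in the org
    cf set-org-role "$USER" "$ORG" OrgAuditor
  done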
Re: cf-release v208 is now available
John Wong
Great release!!! Congrats.
Just a couple of questions (if this isn't the right thread to ask, please excuse me and let me know).

- Manifest templates no longer include resource pool sizes. Details <https://github.com/cloudfoundry/cf-release/commit/fc26ee26443d79d765df490910ea0b4c9706d6ba>
  In a way I was "spoiled" and never really asked why we needed resource pool sizes, but went along with it. What does the commit comment "bosh director can figure this out automatically" mean?

- Adjusted ephemeral disk sizes on new instance types for the AWS template to be more realistic. Details <https://www.pivotaltracker.com/story/show/91780134>
  I just want to make sure I understand: the underscore for each of the sizes is just some syntax thing for the template, not something I would actually write in my manifest. Also, c3.large by default has 2x16 GB SSD, so are we taking 4 GB (from the template) from the ephemeral/instance storage?

And congratulations on merging UAA and Login Server. So now all we need is 2 VMs minimally if we really want to have HA (aside from enabling the BOSH resurrector).

Thanks in advance.

John Wong

On Tue, May 12, 2015 at 8:22 PM, Dieu Cao <dcao(a)pivotal.io> wrote:
The cf-release v208 was released on May 12th, 2015
cf-release v208 is now available
Dieu Cao <dcao@...>
The cf-release v208 was released on May 12th, 2015
- Please see the note about the merge of the UAA/Login Server jobs below to maintain zero downtime for CC and UAA for existing deployments.

Runtime
- [Experimental] Work continues on support for Asynchronous Service Instance Operations. Details <https://www.pivotaltracker.com/epic/show/1561148>
- Completed Improvements to Recursive Deletion of Org and Space, in support of Asynchronous Service Operations. Details <https://www.pivotaltracker.com/epic/show/1751766>
- [Experimental] Work continues on /v3 and Application Process Types. Details <https://www.pivotaltracker.com/epic/show/1334418>
- [Experimental] Work continues on Route API. Details <https://www.pivotaltracker.com/epic/show/1590160>
- [Experimental] Work continues on Context Path Routes. Details <https://www.pivotaltracker.com/epic/show/1808212>
- Work continues on support for Service Keys. Details <https://www.pivotaltracker.com/epic/show/1743366>
- Work continues on support for Arbitrary Service Parameters. Details <https://www.pivotaltracker.com/epic/show/1725984>
- Adjusted ephemeral disk sizes on new instance types for the AWS template to be more realistic. Details <https://www.pivotaltracker.com/story/show/91780134>
- Including staticfile buildpack v1.0.0. Details <https://github.com/cloudfoundry/staticfile-buildpack/releases/tag/v1.0.0>
- Removed separate login job from minimal AWS deployment. Details <https://www.pivotaltracker.com/story/show/93505400>
- Allow acceptance test timeouts to be set via manifest. Details <https://github.com/cloudfoundry/cf-release/commit/b6c1f33771213ded1cf7c982f5f6fafb3d900197>
- Update default cipher list for haproxy and gorouter. Details <https://www.pivotaltracker.com/story/show/91129360>
- Addressed tcpdump CVE-2015-0261, CVE-2015-2153, CVE-2015-2154, CVE-2015-2155. Details <https://www.pivotaltracker.com/story/show/93371680>
- Upgrading php buildpack to v3.1.1. Details <https://github.com/cloudfoundry/php-buildpack/releases/tag/v3.1.1>
- Manifest templates no longer include resource pool sizes. Details <https://github.com/cloudfoundry/cf-release/commit/fc26ee26443d79d765df490910ea0b4c9706d6ba>
- Upgrading ruby buildpack to v1.3.1. Details <https://github.com/cloudfoundry/ruby-buildpack/releases/tag/v1.3.1>
- Bump CLI to 6.11.1 for CATS and remove darwin CLI. Details <https://www.pivotaltracker.com/story/show/92595438>
- Upgrade cf-release to use ruby 2.1.6 and remove ruby 2.1.4 for CC, Collector, Warden, DEA. Details <https://www.pivotaltracker.com/story/show/92547532>
  - Addresses ruby CVE-2015-1855
- cloudfoundry/cf-release #660 <https://github.com/cloudfoundry/cf-release/pull/660>: Add security group for cf-mysql subnets to bosh-lite. Details <https://www.pivotaltracker.com/story/show/92658768>
- Adjust VCAP_ID as endpoint/sticky cookie changes. Details <https://www.pivotaltracker.com/story/show/92796282>
- Disable compression when creating proxy connection. Details <https://www.pivotaltracker.com/story/show/93362206>
- Cleanup regex. Details <https://github.com/cloudfoundry/cloud_controller_ng/commit/5257a8af6990e71cd1e34ae8978dfe4773b32826>
- A space developer can create a wildcard route for private domains. Details <https://www.pivotaltracker.com/story/show/82612406>
- Allow commands to be reset to nothing. Details <https://www.pivotaltracker.com/story/show/93406896>

UAA Updates
- Merged UAA & Login Server. Details <https://github.com/cloudfoundry/uaa/releases/tag/2.0.0>
- Multi-tenant UAA. Details <https://github.com/cloudfoundry/uaa/releases/tag/2.1.0>
- Registering wildcard routes for *.login and *.uaa. Details <https://github.com/cloudfoundry/cf-release/commit/0260567d9761700dbccde3088165121d7933e058>

Zero Downtime Upgrade Procedure
- Perform the cf-release upgrade and keep the number of Login Server jobs the same as in your existing deploy.
- Change the number of Login Server job instances to 0 and re-deploy after the initial deploy completes.
Note: The combination of older Login Server jobs and the newly merged UAA/Login Server job is not supported. This should be done only for a short duration to achieve the zero downtime. The Login Server instances should be deleted via a bosh redeploy immediately after a successful upgrade.

Used Configuration
- BOSH Version: 152
- Stemcell Version: 2889
- CC API Version: 2.25.0

Commit summary <http://htmlpreview.github.io/?https://github.com/cloudfoundry-community/cf-docs-contrib/blob/master/release_notes/cf-208-whats-in-the-deploy.html>

Compatible Diego Version
- final release 1198 commit <https://github.com/cloudfoundry-incubator/diego-release/commit/f7b15f8da536eee7be696896890943dbc6202242>

https://github.com/cloudfoundry/cf-release/releases/tag/v208
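For readers following the Zero Downtime Upgrade Procedure above, the sequence with the BOSH v1 CLI would look roughly like the sketch below (manifest path and job name are illustrative; adjust for your deployment):

  # 1) upgrade to v208, keeping the existing Login Server job instance count
  bosh deployment cf-manifest.yml
  bosh deploy
  # 2) after the first deploy completes, edit cf-manifest.yml and set the
  #    Login Server job's "instances" to 0, then redeploy to delete the old
  #    Login Server instances
  bosh deploy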
Re: Recipe to install Diego?
Eric Malm <emalm@...>
Hi, Tom,
The Diego team does deploy Diego to AWS as part of our testing pipeline. We haven't fully published our tooling for doing so, but you can see some of our process in the deploy_diego CI script in diego-release <https://github.com/cloudfoundry-incubator/diego-release/blob/develop/scripts/ci/deploy_diego>, which uses diego-release's generate-deployment-manifest script. This script is set up differently from the generate_deployment_manifest script in cf-release, in that it takes a fixed sequence of stubs and a deployment directory as arguments instead of an infrastructure type and an arbitrary list of stubs to merge in. The full list of stubs is described in the usage message for the script, but here are the parts that should be most relevant for you to deploy Diego to AWS or OpenStack:

- IaaS settings (arg #5): This is a stub that should contain an "iaas_settings" hash with several expected subfields (compilation_cloud_properties, resource_pool_cloud_properties, stemcell, subnet_configs). The manifest generation script takes these values and uses them to populate certain fields in the Diego manifest's resource_pools, networks, and compilation sections. This will likely be the stub you need to customize the most for an AWS or OpenStack deployment, as it will contain all the information about the network and security group configuration for that environment.
- Deployments directory (arg #7): This is a directory that should contain your CF deployment manifest as the file 'cf.yml'. The manifest generation script will extract certain values from the CF manifest so the Diego deployment can integrate correctly with various services in CF (for example, NATS and consul).
- Director UUID (arg #1): This is a stub containing "director_uuid: <your-director-uuid>"; you may already have such a stub for generating your CF manifest.
- Instance count overrides (arg #3): This is a stub containing any instance-count changes for the Diego jobs. Depending on the size of your desired cluster, you'll want to change these values from the defaults that the manifest-generation/diego.yml template provides in the jobs section.

Depending on how you wish to configure the Diego deployment, there may be some additional properties you want to add to the property-overrides stub (arg #2). I doubt you'll need to change anything in the persistent-disk overrides or additional-jobs stubs (args #4 and #6), unless you're customizing your deployment extensively. In any case, the stubs under manifest-generation/bosh-lite-stubs should give you examples to customize for your own deployment, and the manifest-generation/diego.yml template will show you which values from those stubs are consumed in manifest generation.

Also, as Diego matures and becomes the principal backend for running application instances in CF, these manifest-generation patterns may change substantially.

Thanks,
Eric Malm, CF Runtime Diego PM

On Tue, May 12, 2015 at 8:48 AM, Ken Ojiri <ozzozz(a)gmail.com> wrote:
Hi,
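For reference, a hypothetical invocation of diego-release's generate-deployment-manifest script, following the argument order Eric describes, might look like the sketch below. The stub file names and paths are placeholders, and the redirect assumes the script writes the manifest to stdout; check the script's usage message in your version of diego-release for the exact arguments.

  cd diego-release
  # args in order: 1) director-uuid stub, 2) property overrides,
  # 3) instance-count overrides, 4) persistent-disk overrides,
  # 5) iaas-settings stub, 6) additional-jobs stub, 7) deployments directory
  ./scripts/generate-deployment-manifest \
      ~/stubs/director-uuid.yml \
      ~/stubs/property-overrides.yml \
      ~/stubs/instance-count-overrides.yml \
      ~/stubs/persistent-disk-overrides.yml \
      ~/stubs/iaas-settings.yml \
      ~/stubs/additional-jobs.yml \
      ~/deployments > ~/deployments/diego.yml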
Re: Purge files on NFS or S3?
Jon Price
Make sure you only delete the resource files, not everything...
Jon Price
Intel Corp.

On May 11, 2015 10:05 PM, Dieu Cao <dcao(a)pivotal.io> wrote:
An option could be to just delete all the resource files on the blobstore. The effect would be that binaries that would have been matched will be uploaded again on the first new push that includes those binaries.

On Monday, May 11, 2015, John Wong <gokoproject(a)gmail.com> wrote:

Hi all

Thanks. No, I was just curious if there was a way to identify what to remove in the blobstore, because I was surprised by the size of my blobstore at this point. I will check what's in there (maybe James is right and it is mostly resource files). I am currently using NFS. I can build a CF with S3 as my blobstore.

John

On Mon, May 11, 2015 at 11:36 AM, Chad Woolley <thewoolleyman(a)gmail.com> wrote:

Not sure if this is what you need, but you can manually sync + delete files from a local filesystem (including an NFS mount) to/from S3: http://s3tools.org/s3cmd-sync ... with the `--delete-removed` option.

--
Chad

On Sat, May 9, 2015 at 12:19 AM, James Bayer <jbayer(a)pivotal.io> wrote:
Re: Recipe to install Diego?
Ken Ojiri
Hi,
I use the spiff manifest templates included in cf-release and diego-release and generate manifests with spiff, but I usually treat the generated manifests as reference material. I then adjust my own manifests by referring to the spiff-generated manifests and the job definitions in cf-release and/or diego-release, with some trial and error... The configuration parameters of the Diego components are still changing with every version, so the job definitions in diego-release are an essential reference.

Regards,
Ken Ojiri
---
Ken Ojiri <ozzozz(a)gmail.com>
Mitaka, Tokyo Japan

On Tue, May 12, 2015 at 5:56 PM, 王天青 <wang.tianqing.cn(a)gmail.com> wrote:
Hi Ken,
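In case it helps anyone following along, the spiff-based generation Ken describes boils down to merging a release template with your own stub files; a minimal sketch (file names are placeholders, and cf-release wraps this in its generate_deployment_manifest script):

  # spiff merges the first file (the template) with the stubs that follow;
  # values in later files override values in earlier ones
  spiff merge release-template.yml my-infrastructure-stub.yml my-properties-stub.yml > deployment-manifest.yml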
Scaling Java Application
Christopher Frost
When deploying a Java application to Cloud Foundry, the Java memory settings for the application are decided during staging, based on the configured memory weightings. This means that, unlike other apps, if the application is scaled to give it more memory it needs to be *restaged* to get updated Java memory settings.

This has now been improved with a new memory calculator written by Steve Powell [2]. The Memory Calculator [1] now runs during every application start to ensure the application gets up-to-date memory settings; its output is shown during staging.

-----> Downloading Open JDK Like Memory Calculator 1.1.1_RELEASE from https://download.run.pivotal.io/memory-calculator/trusty/x86_64/memory-calculator-1.1.1_RELEASE (found in cache)
       Memory Settings: -XX:MaxMetaspaceSize=64M -XX:MetaspaceSize=64M -Xss995K -Xmx382293K -Xms382293K

Scaling the application to double the memory will then result in new memory settings without having to restage the application.

cf scale my-application -m 1G

-Xmx768M -Xms768M -XX:MaxMetaspaceSize=104857K -XX:MetaspaceSize=104857K -Xss1M

This new feature is currently available on the master branch of the buildpack [3] and will be released in due course.

Chris.

[1] https://github.com/cloudfoundry/java-buildpack-memory-calculator
[2] https://github.com/Zteve
[3] https://github.com/cloudfoundry/java-buildpack

--
Christopher Frost - GoPivotal UK
Follow up on multiple line log outputs in CF
George Li
Hi,
This is a follow-up on the archived posting https://groups.google.com/a/cloudfoundry.org/forum/?utm_medium=email&utm_source=footer#!msg/vcap-dev/B1W6_vO0oyo/84X1eAtFsKoJ. I cannot find any new postings on that thread.

I am using Cloud Foundry version "6.11.2-2a26d55-2015-04-27T21:11:44+00:00" and want to know what options I have for handling multi-line logs in a multi-tenant environment. Since multiple instances of multiple applications are all sending logs to a single Logstash server, is it best to avoid having multiple lines in my log? I can live with sticking to single-line logs except for outputting exception stack traces, not to mention that I only have control over my own code.

Thanks.
Code license question
peteb@...
Hello,
I am a software developer and was wondering what the code license is for your Cloud Foundry community code, such as the Go cf client (go-cfclient): https://github.com/cloudfoundry-community/go-cfclient ?

Thanks, kind regards,
Piotr
Re: Recipe to install Diego?
王天青 <wang.tianqing.cn at gmail.com...>
Hi Ken,
How do you generate the manifest file? Thanks

Best Regards~!
Grissom

On Mon, May 11, 2015 at 9:17 PM OzzOzz <ozzozz(a)gmail.com> wrote:
Hi,
Re: Purge files on NFS or S3?
Dieu Cao <dcao@...>
An option could be to just delete all the resource files on the blobstore.
The effect would be that binaries that would have been matched will be uploaded again on the first new push that includes those binaries.

On Monday, May 11, 2015, John Wong <gokoproject(a)gmail.com> wrote:
Hi all
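If you do go the delete-the-resource-files route on an NFS blobstore, the cleanup would look something like the sketch below; the path is an assumption, so verify your deployment's blobstore layout (and take a backup) before deleting anything.

  # on the NFS server VM -- the cc-resources directory name is an assumption,
  # check your deployment's blobstore layout before running this
  ls /var/vcap/store/shared/cc-resources | head   # sanity-check the contents first
  # remove resource-match blobs; matched files are simply re-uploaded on later pushes
  rm -rf /var/vcap/store/shared/cc-resources/*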
Re: Purge files on NFS or S3?
John Wong
Hi all
Thanks. No, I was just curious if there was a way to identify what to remove in the blobstore, because I was surprised by the size of my blobstore at this point. I will check what's in there (maybe James is right and it is mostly resource files). I am currently using NFS. I can build a CF with S3 as my blobstore.

John

On Mon, May 11, 2015 at 11:36 AM, Chad Woolley <thewoolleyman(a)gmail.com> wrote:

Not sure if this is what you need, but you can manually sync + delete
Re: Purge files on NFS or S3?
Chad Woolley <thewoolleyman@...>
Not sure if this is what you need, but you can manually sync + delete files
from a local filesystem (including an NFS mount) to/from S3: http://s3tools.org/s3cmd-sync ... with the `--delete-removed` option.

--
Chad
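To make that concrete, a hedged example of the kind of command Chad means (bucket name and local path are placeholders; start with --dry-run to see what it would change):

  # mirror a local blobstore directory to S3, removing S3 objects that no
  # longer exist locally; --dry-run prints the changes without applying them
  s3cmd sync --dry-run --delete-removed /var/vcap/store/shared/ s3://my-cf-blobstore/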
Re: [vcap-dev] Java OOM debugging
Lari Hotari <Lari@...>
FYI: Tomcat 8.0.20 might be consuming more memory than 8.0.18:
https://github.com/cloudfoundry/java-buildpack/issues/166#issuecomment-94517568

> Other things we've tried:

I think adjusting the memory heuristics so that they add up to 80 doesn't make a difference, because the values aren't percentages. The values are proportional weighting values used in the memory calculation:
https://github.com/grails-samples/java-buildpack/blob/b4abf89/docs/jre-oracle_jre.md#memory-calculation

I found out that the only way to reserve "unused" memory is to set a high value for the native memory lower bound in the memory_sizes.native setting of config/open_jdk_jre.yml. Example:
https://github.com/grails-samples/java-buildpack/blob/22e0f6a/config/open_jdk_jre.yml#L25

In my case it wasn't a classical Java memory leak, since the Java application wasn't leaking memory. I was able to confirm this by getting some heap dumps with the HeapDumpServlet (https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/HeapDumpServlet.groovy) and analyzing them.

In my case the JVM's RSS memory size is slowly growing. It probably is some kind of memory leak, since one process I've been monitoring is now very close to the memory limit. The uptime is now almost 3 weeks. Here is the latest diff of the meminfo report:
https://gist.github.com/lhotari/ee77decc2585f56cf3ad#file-meminfo_diff_example2-txt

From a Java perspective this isn't classical: the JVM heap isn't filling up. The problem is that the RSS size is slowly growing and will eventually cause the Java process to cross the memory boundary, so that the process gets killed by the Linux kernel cgroups OOM killer. RSS size might be growing for many reasons. I have been able to slow down the growth by doing various MALLOC_* and JVM parameter tuning (-XX:MinMetaspaceExpansion=1M -XX:CodeCacheExpansionSize=1M). I'm able to get a longer uptime, but the problem isn't solved.

Lari

On 15-05-11 06:41 AM, Head-Rapson, David wrote:
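For anyone trying to reproduce this kind of analysis, a crude way to watch the RSS growth Lari describes is to sample the process's resident set size over time (a sketch; it assumes you can run a shell next to the Java process, for example in a test environment, and that a single java process matches):

  # log the Java process's RSS (in kB) every 5 minutes
  PID=$(pgrep -f java | head -n 1)
  while true; do
    echo "$(date -u '+%Y-%m-%dT%H:%M:%SZ') $(ps -o rss= -p "$PID")"
    sleep 300
  done >> rss-growth.log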
Re: Recipe to install Diego?
Ken Ojiri
Hi,
I have posted a sample BOSH deployment manifest to Gist:
https://gist.github.com/ozzozz/4c08c37863b703a75afc

I could deploy cf-release v207 and diego-release 0.1099.0 to the AWS Tokyo region with MicroBOSH. I could also deploy cf-release and diego-release to OpenStack (Juno). The manifests differ only in 'networks', 'cloud_properties' and 'stemcell'.

Regards,
Ken
---
<ozzozz(a)gmail.com>
Mitaka, Tokyo Japan

On Sat, May 9, 2015 at 8:57 PM, Tom Sherrod <tom.sherrod(a)gmail.com> wrote:
Hi,
Re: [vcap-dev] Java OOM debugging
Dave Head-Rapson
Thanks for the continued advice.
We've hit on a key discovery after yet another soak test this weekend.
- When we deploy using Tomcat 8.0.18 we don't see the issue.
- When we deploy using Tomcat 8.0.20 (same app version, same CF space, same services bound, same JBP code version, same JRE version, running at the same time), we see the crashes occurring after just a couple of hours.

Ideally we'd go ahead with the memory calculations you mentioned, however we're stuck on lucid64 because we're using Pivotal CF 1.3.x and we're having upgrade issues to 1.4.x. So we're not able to adjust MALLOC_ARENA_MAX, nor are we able to view RSS in pmap as you describe.

Other things we've tried:
- We set verbose garbage collection to verify there were no memory size issues within the JVM. There weren't.
- We tried setting minimum memory for native; it had no effect. The container still gets killed.
- We tried adjusting the 'memory heuristics' so that they added up to 80 rather than 100. This had the effect of delaying the container being killed, however it still was killed.

This seems like classic memory leak behaviour to me.

From: Lari Hotari [mailto:lari.hotari(a)sagire.fi] On Behalf Of Lari Hotari
Sent: 08 May 2015 16:25
To: Daniel Jones; Head-Rapson, David
Cc: cf-dev(a)lists.cloudfoundry.org
Subject: Re: [Cf-dev] [vcap-dev] Java OOM debugging

For my case, it turned out to be essential to reserve enough memory for "native" in the JBP. For the 2GB total memory, I set the minimum to 330M. With that setting I have been able to get over 2 weeks of uptime by now. I mentioned this in my previous email:

The workaround for that in my case was to add a native key under memory_sizes in open_jdk_jre.yml and set the minimum to 330M (that is for a 2GB total memory). See example https://github.com/grails-samples/java-buildpack/blob/22e0f6a/config/open_jdk_jre.yml#L25 . That was how I got the app I'm running on CF to stay within the memory bounds. I'm sure there is now also a way to get the keys without forking the buildpack. I could have also adjusted the percentage portions, but I wanted to set a hard minimum for this case.

I've been trying to get some insight by diffing the reports gathered from the meminfo servlet:
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MemoryInfoServlet.groovy
Here is such an example of a diff:
https://gist.github.com/lhotari/ee77decc2585f56cf3ad#file-meminfo_diff_example-txt

meminfo has pmap output included to get the report of the memory map of the process. I have just noticed that most of the memory has already been mmap:ed from the OS and it's just growing in RSS size. For example:
< 00000000a7600000 1471488 1469556 1469556 rw--- [ anon ]
> 00000000a7600000 1471744 1470444 1470444 rw--- [ anon ]

The pmap output from lucid64 didn't include the RSS size, so you have to use cflinuxfs2 for this. It's also better for other reasons: the glibc in lucid64 is old and has some bugs around MALLOC_ARENA_MAX.

I was manually able to estimate the maximum RSS size the Java process will consume by simply picking the large anon blocks from the pmap report and summing their allocated virtual size (VSS). Based on this calculation, I picked the minimum of 330M for "native" in open_jdk_jre.yml, as I mentioned before.
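A rough sketch of that estimation step (not from Lari's message; it assumes the procps pmap -x column layout of Address/Kbytes/RSS/Dirty/Mode/Mapping, so double-check against your pmap version, and PID is the Java process id):

  # sum the virtual size (Kbytes column) of large anonymous mappings,
  # where "large" means bigger than roughly 100 MB
  pmap -x "$PID" | awk '/\[ anon \]/ && $2 > 102400 { total += $2 } END { print total " kB" }'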
It looks like these rows are for the heap size:
< 00000000a7600000 1471488 1469556 1469556 rw--- [ anon ]
> 00000000a7600000 1471744 1470444 1470444 rw--- [ anon ]
It looks like the JVM doesn't fully allocate that block in RSS initially, and most of the growth of RSS size comes from that in my case. In your case, it might be something different.

I also added a servlet for getting glibc malloc_info statistics in XML format (). I haven't really analysed that information because of time constraints and because I don't have a pressing problem any more. By the way, the malloc_info XML report is missing some key elements that were added in later glibc versions (https://github.com/bminor/glibc/commit/4d653a59ffeae0f46f76a40230e2cfa9587b7e7e).

If killjava.sh never fires and the app crashed with Warden out-of-memory errors, then I believe it's the kernel's cgroups OOM killer that has killed the container processes. I have found this location where the Warden OOM notifier gets the OOM notification event:
https://github.com/cloudfoundry/warden/blob/ad18bff/warden/lib/warden/container/features/mem_limit.rb#L70
This is the oom.c source code: https://github.com/cloudfoundry/warden/blob/ad18bff7dc56acbc55ff10bcc6045ebdf0b20c97/warden/src/oom/oom.c . It reads the cgroups control files and receives events from the kernel that way.

I'd suggest that you use pmap for the Java process after it has started and calculate the maximum RSS size from the VSS size of the large anon blocks (instead of their RSS) that the Java process has reserved for its different memory areas. You should discard adding VSS for the CompressedClassSpaceSize block. After this calculation, add enough memory to the "native" parameter in the JBP until the RSS size calculated this way stays under the limit. That's the only "method" I have come up with by now.

It might be required to have some RSS space allocated for any zip/jar files read by the Java process. I think that Java uses mmap files for zip file reading by default and that might go on top of all other limits. To test this theory, I'd suggest adding the -Dsun.zip.disableMemoryMapping=true system property setting to JAVA_OPTS. That disables the native mmap for zip/jar file reading. I haven't had time to test this assumption.

I guess the only way to understand how Java allocates memory is to look at the source code. From http://openjdk.java.net/projects/jdk8u/ , the instructions to get the source code of JDK 8:
hg clone http://hg.openjdk.java.net/jdk8u/jdk8u; cd jdk8u; sh get_source.sh
This tool is really good for grepping and searching the source code: http://geoff.greer.fm/ag/
On Ubuntu it's in the silversearcher-ag package, "apt-get install silversearcher-ag", and on MacOSX with brew it's "brew install the_silver_searcher". This alias is pretty useful:
alias codegrep='ag --color --group --pager less -C 5'
Then you just search for the correct location in code by starting with the tokens you know about:
codegrep MaxMetaspaceSize
This gives pretty good starting points for looking at how the JDK allocates memory. So the JDK source code is only a few commands away.

It would be interesting to hear more about this if someone has the time to dig into it. This is about how far I got, and I hope sharing this information helps someone continue. :)

Lari
github/twitter: lhotari

On 15-05-08 10:02 AM, Daniel Jones wrote:

Hi Lari et al,

Thanks for your help Lari. David and I are pairing on this issue, and we're yet to resolve it.
We're in the process of creating a repeatable test case (our most crashy app makes calls to external services that need mocking), but in the meantime, here's what we've seen.

Between Java Buildpack commit e89e546 and 17162df, we see apps crashing with Warden out-of-memory errors. killjava.sh never fires, and this has led us to believe that the kernel is shooting a cgroup process in the head after the cgroup oversteps its memory limit. We cannot find any evidence of the OOM killer firing in any logs, but we may not be looking in the right place.

The JBP is setting heap to be 70%, metaspace to be 15% (with max set to the same as initial), 5% for "stack", 5% for "normalised stack" and 10% for "native". We do not understand why this adds up to 105%, but haven't looked into the JBP algorithm yet. Any pointers on what "normalised stack" is would be much appreciated, as this doesn't appear in the list of heuristics supplied via the app env.

Other team members tried applying the same settings that you suggested - thanks for this. Apps still crash with these settings, albeit less frequently.

After reading the blog you linked to (http://java.dzone.com/articles/java-8-permgen-metaspace) we wondered whether the increased reserved metaspace claimed after metaspace GC might be causing a problem; however we reused the test code to create a metaspace leak in a CF app and saw metaspace GCs occur correctly, and memory usage never grew over MaxMetaspaceSize. This figures, as the committed metaspace is still less than MaxMetaspaceSize, and the reserved appears to be whatever RAM is free across the whole DEA.

We noted that an Oracle blog (https://blogs.oracle.com/poonam/entry/about_g1_garbage_collector_permanent) mentions that the metaspace size parameters are approximate. We're currently wondering if native allocations by Tomcat (APR, NIO) are taking up more container memory, and so when the metaspace fills, it's creeping slightly over the limit and triggering the kernel's OOM killer.

Any suggestions would be much appreciated. We've tried to resist tweaking heuristics blindly, but are running out of options as we're struggling to figure out how the Java process is using committed memory. pmap seems to show virtual memory, and so it's hard to see if things like the metaspace or NIO ByteBuffers are nabbing too much and triggering the kernel's OOM killer.

Thanks for all your help,
Daniel Jones & David Head-Rapson

On Wed, Apr 29, 2015 at 8:07 PM, Lari Hotari <Lari(a)hotari.net> wrote:

Hi,

I created a few tools to debug OOM problems, since the application I was responsible for running on CF was failing constantly because of OOM problems. The problems I had turned out not to be actual memory leaks in the Java application.

In the "cf events appname" log I would get entries like this:
2015-xx-xxTxx:xx:xx.00-0400 app.crash appname index: 1, reason: CRASHED, exit_description: out of memory, exit_status: 255

These types of entries are produced when the container goes over its memory resource limits. It doesn't mean that there is a memory leak in the Java application. The container gets killed by the Linux kernel OOM killer (https://github.com/cloudfoundry/warden/blob/master/warden/README.md#limit-handle-mem-value) based on the resource limits set on the warden container.

The memory limit is specified in number of bytes. It is enforced using the control group associated with the container. When a container exceeds this limit, one or more of its processes will be killed by the kernel.
Additionally, the Warden will be notified that an OOM happened and it subsequently tears down the container.

In my case it never got killed by the killjava.sh script that gets called in the java-buildpack when an OOM happens in Java.

This is the tool I built to debug the problems:
https://github.com/lhotari/java-buildpack-diagnostics-app
I deployed that app as part of the forked buildpack I'm using. Please read the readme about what its limitations are. It worked for me, but it might not work for you. It's open source and you can fork it. :)

There is a solution in my toolcase for creating a heapdump and uploading that to S3:
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/HeapDumpServlet.groovy
The readme explains how to set up Amazon S3 keys for this:
https://github.com/lhotari/java-buildpack-diagnostics-app#amazon-s3-setup
Once you get a dump, you can then analyse it in a Java profiler tool like YourKit.

I also have a solution that forks the java-buildpack, modifies killjava.sh and adds a script that uploads the heapdump to S3 in the case of OOM:
https://github.com/lhotari/java-buildpack/commit/2d654b80f3bf1a0e0f1bae4f29cb85f56f5f8c46

In java-buildpack-diagnostics-app I also have other tools for getting Linux operating system specific memory information, for example:
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MemoryInfoServlet.groovy
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MemorySmapServlet.groovy
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MallocInfoServlet.groovy
These tools are handy for looking at details of the Java process RSS memory usage growth.

There is also a solution for getting ssh shell access inside your application with tmate.io:
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/TmateSshServlet.groovy
(this version is only compatible with the new "cflinuxfs2" stack)

It looks like there are serious problems on Cloud Foundry with the memory sizing calculation. An application that doesn't have an OOM problem will get killed by the OOM killer because the Java process will go over the memory limits. I filed this issue: https://github.com/cloudfoundry/java-buildpack/issues/157 , but that might not cover everything.

The workaround for that in my case was to add a native key under memory_sizes in open_jdk_jre.yml and set the minimum to 330M (that is for a 2GB total memory). See example: https://github.com/grails-samples/java-buildpack/blob/22e0f6a/config/open_jdk_jre.yml#L25 . That was how I got the app I'm running on CF to stay within the memory bounds. I'm sure there is now also a way to get the keys without forking the buildpack. I could have also adjusted the percentage portions, but I wanted to set a hard minimum for this case.

It was also required to do some other tuning. I added this to JAVA_OPTS:
-XX:CompressedClassSpaceSize=256M -XX:InitialCodeCacheSize=64M -XX:CodeCacheExpansionSize=1M -XX:CodeCacheMinimumFreeSpace=1M -XX:ReservedCodeCacheSize=200M -XX:MinMetaspaceExpansion=1M -XX:MaxMetaspaceExpansion=8M -XX:MaxDirectMemorySize=96M
while trying to keep the Java process from growing in RSS memory size.
The memory overhead of a 64-bit Java process on Linux can be reduced by specifying these environment variables:

stack: cflinuxfs2
...
env:
  MALLOC_ARENA_MAX: 2
  MALLOC_MMAP_THRESHOLD_: 131072
  MALLOC_TRIM_THRESHOLD_: 131072
  MALLOC_TOP_PAD_: 131072
  MALLOC_MMAP_MAX_: 65536

MALLOC_ARENA_MAX works only on the cflinuxfs2 stack (the lucid64 stack has a buggy version of glibc).

Explanation about MALLOC_ARENA_MAX from Heroku:
https://devcenter.heroku.com/articles/tuning-glibc-memory-behavior
Some measurement data on how it reduces memory consumption:
https://devcenter.heroku.com/articles/testing-cedar-14-memory-use

I have created a PR to add this to the CF java-buildpack:
https://github.com/cloudfoundry/java-buildpack/pull/160
I also created https://github.com/cloudfoundry/java-buildpack/issues/163 and https://github.com/cloudfoundry/java-buildpack/pull/159 .

I hope this information helps others struggling with OOM problems in CF. I'm not saying that this is a ready-made solution just for you. YMMV. It worked for me.

-Lari

On 15-04-29 10:53 AM, Head-Rapson, David wrote:

Hi,

I'm after some guidance on how to profile Java apps in CF, in order to get to the bottom of memory issues. We have an app that's crashing every few hours with an OOM error; most likely it's a memory leak. I'd like to profile the JVM and work out what's eating memory, however tools like YourKit require connectivity INTO the JVM server (i.e. the warden container), either via host/port or via SSH. Since warden containers cannot be connected to on ports other than for HTTP and cannot be SSHed to, neither of these works for me.

I tried installing a standalone JDK onto the warden container, however as soon as I ran 'jmap' to invoke the dump, warden cleaned up the container - most likely for memory over-consumption.

I had previously found a hack in the Weblogic buildpack (https://github.com/pivotal-cf/weblogic-buildpack/blob/master/docs/container-wls-monitoring.md) for modifying the start script which, when used with -XX:HeapDumpOnOutOfMemoryError, should copy any heapdump files to a file share somewhere. I have my own custom buildpack so I could use something similar. Has anyone got a better solution than this?

We would love to use New Relic / AppDynamics for this, however we're not allowed. And I'm not 100% certain they could help with this either.

Dave
--
Regards,
Daniel Jones
EngineerBetter.com
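One footnote on the MALLOC_* settings quoted above: if you would rather not edit the application manifest, the same environment variables can be set per app with the CF CLI and picked up on the next restart or restage (a sketch; the app name is a placeholder):

  cf set-env my-app MALLOC_ARENA_MAX 2
  cf set-env my-app MALLOC_MMAP_THRESHOLD_ 131072
  cf set-env my-app MALLOC_TRIM_THRESHOLD_ 131072
  cf set-env my-app MALLOC_TOP_PAD_ 131072
  cf set-env my-app MALLOC_MMAP_MAX_ 65536
  cf restart my-app   # or `cf restage my-app` if the buildpack needs to re-run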