
Adding multiple users to user/auditor roles of an organization

Anil Ambati <aambati@...>
 

Hi,
Is there a CF API to add multiple users to multiple roles of an organization? I have looked at the CF docs but did not find any indication that such an API exists.

Thank you.

Regards,
Anil


Re: cf-release v208 is now available

John Wong
 

Great release! Congrats.

Just a couple of questions (if this is not the right thread to ask, please
excuse me and let me know).


- Manifest templates no longer include resource pool sizes details
<https://github.com/cloudfoundry/cf-release/commit/fc26ee26443d79d765df490910ea0b4c9706d6ba>

In a way I was "spoiled" and never really asked why we needed resource pool
sizes but went along with it. What does the commit comment "bosh director can
figure this out automatically" mean?


- Adjusted ephemeral disk sizes on new instance types for AWS template
to be more realistic details
<https://www.pivotaltracker.com/story/show/91780134>

I just want to make sure I understand: the underscore for each of the sizes
is just template syntax, not something I would actually
write in my manifest? Also, c3.large by default has 2 x 16 GB SSD, so are we
taking 4 GB (from the template) from the ephemeral/instance storage?

And congratulations on merging the UAA and Login server. So now all we need is
2 VMs minimally if we really want to have HA (aside from enabling
bosh resurrect).

Thanks in advance.

John Wong

_______________________________________________
cf-dev mailing list
cf-dev(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev


cf-release v208 is now available

Dieu Cao <dcao@...>
 

The cf-release v208 was released on May 12th, 2015

- Please see the note below about the merge of the UAA/Login Server jobs to
maintain zero downtime for CC and UAA in existing deployments.

Runtime

- [Experimental] Work continues on support for Asynchronous Service
Instance Operations details
<https://www.pivotaltracker.com/epic/show/1561148>
- Completed Improvements to Recursive Deletion of Org and Space, in
support of Asynchronous Service Operations details
<https://www.pivotaltracker.com/epic/show/1751766>
- [Experimental] Work continues on /v3 and Application Process Types
details <https://www.pivotaltracker.com/epic/show/1334418>
- [Experimental] Work continues on Route API details
<https://www.pivotaltracker.com/epic/show/1590160>
- [Experimental] Work continues on Context Path Routes details
<https://www.pivotaltracker.com/epic/show/1808212>
- Work continues on support for Service Keys details
<https://www.pivotaltracker.com/epic/show/1743366>
- Work continues on support for Arbitrary Service Parameters details
<https://www.pivotaltracker.com/epic/show/1725984>
- Adjusted ephemeral disk sizes on new instance types for AWS template
to be more realistic details
<https://www.pivotaltracker.com/story/show/91780134>
- Including staticfile buildpack v1.0.0 details
<https://github.com/cloudfoundry/staticfile-buildpack/releases/tag/v1.0.0>
- Removed separate login job from minimal aws deployment details
<https://www.pivotaltracker.com/story/show/93505400>
- Allow acceptance test timeouts to be set via manifest details
<https://github.com/cloudfoundry/cf-release/commit/b6c1f33771213ded1cf7c982f5f6fafb3d900197>
- Update default cipher list for haproxy and gorouter details
<https://www.pivotaltracker.com/story/show/91129360>
- Addressed tcpdump CVE-2015-0261, CVE-2015-2153, CVE-2015-2154,
CVE-2015-2155 details <https://www.pivotaltracker.com/story/show/93371680>
- Upgrading php buildpack to v3.1.1 details
<https://github.com/cloudfoundry/php-buildpack/releases/tag/v3.1.1>
- Manifest templates no longer include resource pool sizes details
<https://github.com/cloudfoundry/cf-release/commit/fc26ee26443d79d765df490910ea0b4c9706d6ba>
- Upgrading ruby buildpack to v1.3.1 details
<https://github.com/cloudfoundry/ruby-buildpack/releases/tag/v1.3.1>
- Bump CLI to 6.11.1 for CATS and remove darwin CLI details
<https://www.pivotaltracker.com/story/show/92595438>
- Upgrade cf-release to use ruby 2.1.6 and remove ruby 2.1.4 for CC,
Collector, Warden, DEA details
<https://www.pivotaltracker.com/story/show/92547532>
  - Addresses ruby CVE-2015-1855
- cloudfoundry/cf-release #660
<https://github.com/cloudfoundry/cf-release/pull/660>: Add security
group for cf-mysql subnets to bosh-lite details
<https://www.pivotaltracker.com/story/show/92658768>
- Adjust VCAP_ID as endpoint/sticky cookie changes details
<https://www.pivotaltracker.com/story/show/92796282>
- Disable compression when creating proxy connection details
<https://www.pivotaltracker.com/story/show/93362206>
- cleanup regex details
<https://github.com/cloudfoundry/cloud_controller_ng/commit/5257a8af6990e71cd1e34ae8978dfe4773b32826>
- A space developer can create a wildcard route for private domains
details <https://www.pivotaltracker.com/story/show/82612406>
- Allow commands to be reset to nothing details
<https://www.pivotaltracker.com/story/show/93406896>

UAA Updates

- Merged UAA & Login Server details
<https://github.com/cloudfoundry/uaa/releases/tag/2.0.0>
- Multi-tenant UAA details
<https://github.com/cloudfoundry/uaa/releases/tag/2.1.0>
- Registering wildcard routes for *.login and *.uaa details
<https://github.com/cloudfoundry/cf-release/commit/0260567d9761700dbccde3088165121d7933e058>
- Zero Downtime Upgrade Procedure
  - Perform the cf-release upgrade and keep the number of Login Server
    jobs the same as in your existing deployment.
  - Change the number of Login Server job instances to 0 and re-deploy
    after the initial deploy completes.

Note: The combination of older Login Server jobs and the newly merged
UAA/Login Server job is not supported. It should be run only for a short
duration to achieve zero downtime. The Login Server instances should be
deleted via a bosh redeploy immediately after a successful upgrade.

Used Configuration

- BOSH Version: 152
- Stemcell Version: 2889
- CC Api Version: 2.25.0

Commit summary
<http://htmlpreview.github.io/?https://github.com/cloudfoundry-community/cf-docs-contrib/blob/master/release_notes/cf-208-whats-in-the-deploy.html>

Compatible Diego Version

- final release 1198 commit
<https://github.com/cloudfoundry-incubator/diego-release/commit/f7b15f8da536eee7be696896890943dbc6202242>


https://github.com/cloudfoundry/cf-release/releases/tag/v208


Re: Recipe to install Diego?

Eric Malm <emalm@...>
 

Hi, Tom,

The Diego team does deploy Diego to AWS as part of our testing pipeline. We
haven't fully published our tooling for doing so, but you can see some of
our process in the deploy_diego CI script in diego-release
<https://github.com/cloudfoundry-incubator/diego-release/blob/develop/scripts/ci/deploy_diego>,
which uses diego-release's generate-deployment-manifest script. This script
is set up differently from the generate_deployment_manifest script in
cf-release, in that it takes a fixed sequence of stubs and a deployment
directory as arguments instead of an infrastructure type and an arbitrary
list of stubs to merge in. The full list of stubs is described in the usage
message for the script, but here are the parts that should be most relevant
for you to deploy Diego to AWS or OpenStack:

- IaaS settings (arg #5): This is a stub that should contain an
"iaas_settings" hash with several expected subfields
(compilation_cloud_properties, resource_pool_cloud_properties,
stemcell, subnet_configs). The manifest generation script takes these
values and uses them to populate certain fields in the diego manifest's
resource_pools, networks, and compilation sections. This will likely be the
stub you need to customize the most for an AWS or OpenStack deployment, as
this will contain all the information about the network and security group
configuration for that environment.
- Deployments directory (arg #7): This is a directory that should contain
your CF deployment manifest as the file 'cf.yml'. The manifest generation
script will extract certain values from the CF manifest so the Diego
deployment can integrate correctly with various services in CF (for
example, NATS and consul).
- Director UUID (arg #1): This is a stub containing "director_uuid:
<your-director-uuid>"; you may already have such a stub for generating your
CF manifest.
- Instance count overrides (arg #3): This is a stub containing any
instance-count changes for the diego jobs. Depending on the size of your
desired cluster, you'll want to change these values from the defaults that
the manifest-generation/diego.yml template provides in the jobs section.

Depending on how you wish to configure the Diego deployment, there may be
some additional properties you want to add to the property-overrides stub
(arg #2). I doubt you'll need to change anything in the persistent-disk
overrides or additional-jobs stubs (args #4 and #6), unless you're
customizing your deployment extensively. In any case, the stubs under
manifest-generation/bosh-lite-stubs should give you examples to customize
for your own deployment, and the manifest-generation/diego.yml template
will show you which values from those stubs are consumed in manifest
generation.
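
As a rough illustration of the fixed argument order described above, the following Python sketch assembles the command line from named stubs. The script path and every stub file name are hypothetical placeholders; only the ordering (args #1 through #7) comes from the description above, and the script's own usage message remains the authoritative reference.

```python
# Positional order of the arguments, as described in the message above.
ARGUMENT_ORDER = [
    "director_uuid",              # arg 1: stub containing director_uuid
    "property_overrides",         # arg 2
    "instance_count_overrides",   # arg 3
    "persistent_disk_overrides",  # arg 4
    "iaas_settings",              # arg 5: stub with the iaas_settings hash
    "additional_jobs",            # arg 6
    "deployments_dir",            # arg 7: directory containing cf.yml
]


def build_command(script, stubs):
    """Return the argv list for the manifest script, in the fixed order."""
    missing = [name for name in ARGUMENT_ORDER if name not in stubs]
    if missing:
        raise ValueError("missing stubs: " + ", ".join(missing))
    return [script] + [stubs[name] for name in ARGUMENT_ORDER]


# Hypothetical paths, for illustration only.
example_stubs = {
    "director_uuid": "stubs/director-uuid.yml",
    "property_overrides": "stubs/property-overrides.yml",
    "instance_count_overrides": "stubs/instance-count-overrides.yml",
    "persistent_disk_overrides": "stubs/persistent-disk-overrides.yml",
    "iaas_settings": "stubs/iaas-settings.yml",
    "additional_jobs": "stubs/additional-jobs.yml",
    "deployments_dir": "deployments/aws",
}

command = build_command("./scripts/generate-deployment-manifest", example_stubs)
print(" ".join(command))
```

Keeping the order in one list means a missing stub is reported by name before the script is ever run.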

Also, as Diego matures and becomes the principal backend for running
application instances in CF, these manifest-generation patterns may change
substantially.

Thanks,
Eric Malm, CF Runtime Diego PM



Re: Purge files on NFS or S3?

Jon Price
 

Make sure you only delete the resource files, not everything...

Jon Price
Intel Corp.



Re: Recipe to install Diego?

Ken Ojiri
 

Hi,

I use the spiff manifest templates included in cf-release and diego-release
and generate manifests with spiff, but I usually treat the generated
manifests as reference material.
I then adjust my own manifests by referring to the spiff-generated manifests
and the job definitions of cf-release and/or diego-release, and proceed by
trial and error...

The configuration parameters of the Diego components are changing with every
version, so the job definitions in diego-release are an essential reference.

Regards,
Ken Ojiri


---
Ken Ojiri <ozzozz(a)gmail.com>
Mitaka, Tokyo Japan



Scaling Java Application

Christopher Frost
 

When deploying a Java application to Cloud Foundry, the Java memory settings
for the application are decided during staging, based on the configured
memory weighting. This means that, unlike other apps, if the application is
scaled to give it more memory it needs to be *restage*d to get updated
Java memory settings. This has now been addressed with an improved memory
calculator written by Steve Powell[2]. The Memory Calculator[1] will be run
during every application start to ensure the application gets up-to-date
memory settings; its output is shown during staging.

-----> Downloading Open JDK Like Memory Calculator 1.1.1_RELEASE from
https://download.run.pivotal.io/memory-calculator/trusty/x86_64/memory-calculator-1.1.1_RELEASE
(found in cache)
Memory Settings: -XX:MaxMetaspaceSize=64M -XX:MetaspaceSize=64M
-Xss995K -Xmx382293K -Xms382293K

Then scaling the application to double the memory will result in new memory
settings without having to restage the application.

cf scale my-application -m 1G

-Xmx768M -Xms768M -XX:MaxMetaspaceSize=104857K -XX:MetaspaceSize=104857K
-Xss1M
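
The 1G figures above are consistent with a simple proportional split of the container memory. The sketch below checks that arithmetic in Python. The weightings (75 for heap, 10 for metaspace) are assumptions chosen because they reproduce the quoted output; they are not stated in this message, and this is not the calculator's actual implementation.

```python
# Assumed weightings: picked because they reproduce the 1G output above.
ASSUMED_WEIGHTS = {"heap": 75, "metaspace": 10}


def share_in_kib(total_mib, weight, weight_total=100):
    """Return one region's share of the container memory, in whole KiB."""
    total_kib = total_mib * 1024
    return total_kib * weight // weight_total


heap_kib = share_in_kib(1024, ASSUMED_WEIGHTS["heap"])
metaspace_kib = share_in_kib(1024, ASSUMED_WEIGHTS["metaspace"])

print(f"-Xmx{heap_kib // 1024}M -XX:MaxMetaspaceSize={metaspace_kib}K")
```

The staging output quoted earlier does not follow the same simple proportions, which suggests the real calculator also applies bounds that this sketch ignores.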

This new feature is currently available on the master branch of the
buildpack [3] and will be released in due course.


Chris.

[1] https://github.com/cloudfoundry/java-buildpack-memory-calculator
[2] https://github.com/Zteve
[3] https://github.com/cloudfoundry/java-buildpack

--
Christopher Frost - GoPivotal UK




Follow up on multiple line log outputs in CF

George Li
 

Hi,

This is a follow-up to the archived posting
https://groups.google.com/a/cloudfoundry.org/forum/?utm_medium=email&utm_source=footer#!msg/vcap-dev/B1W6_vO0oyo/84X1eAtFsKoJ.
I cannot find any new postings on that thread.
I am using Cloud Foundry version "6.11.2-2a26d55-2015-04-27T21:11:44+00:00"
and want to know what options I have for handling multi-line logs in a
multi-tenant environment. Since multiple instances of multiple applications
are all sending logs to a single Logstash server, is it best to avoid
multi-line entries in my log altogether? I can live with single-line
logs except when outputting an exception stack trace, not to mention that I
only have control over my own code.
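
One option within an application's own code is to collapse each stack trace onto a single physical line before it is written. The sketch below shows the idea in Python using only the standard `logging` module. The literal `\n` marker and the logger name are arbitrary choices for illustration, and any consumer would have to be configured separately to expand the marker again.

```python
import io
import logging


class SingleLineFormatter(logging.Formatter):
    """Format a record, including any traceback, as one physical line."""

    def format(self, record):
        text = super().format(record)
        # Replace real line breaks with a literal "\n" marker (arbitrary choice).
        return text.replace("\r", "").replace("\n", "\\n")


buffer = io.StringIO()  # stands in for stdout so the result can be inspected
handler = logging.StreamHandler(buffer)
handler.setFormatter(SingleLineFormatter("%(levelname)s %(message)s"))

logger = logging.getLogger("single_line_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output limited to this handler
logger.addHandler(handler)

try:
    1 / 0
except ZeroDivisionError:
    logger.exception("division failed")

print(buffer.getvalue(), end="")
```

Because the whole record stays on one line, entries from different instances cannot be interleaved in the middle of a stack trace.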

Thanks.


Code license question

peteb@...
 

Hello,

I am a software developer and was wondering what the code license is for your Cloud Foundry community code, such as the Go CF client (https://github.com/cloudfoundry-community/go-cfclient)?

Thanks,
kind regards,
Piotr


Re: Recipe to install Diego?

王天青 <wang.tianqing.cn at gmail.com...>
 

Hi Ken,

How do you generate the manifest file?

Thanks
Best Regards~!
Grissom

On Mon, May 11, 2015 at 9:17 PM OzzOzz <ozzozz(a)gmail.com> wrote:

Hi,

I have posted a sample BOSH deployment manifest to Gist.
https://gist.github.com/ozzozz/4c08c37863b703a75afc
I could deploy cf-release v207 and diego-release 0.1099.0 to AWS Tokyo
region by MicroBOSH.

I could also deploy cf-release and diego-release to OpenStack(Juno).
The manifests differ only in 'networks', 'cloud_properties' and
'stemcell'.

Regards,
Ken

---
<ozzozz(a)gmail.com>
Mitaka, Tokyo Japan


On Sat, May 9, 2015 at 8:57 PM, Tom Sherrod <tom.sherrod(a)gmail.com> wrote:
Hi,

Are there any examples or docs on installing Diego with bosh/microbosh?
Using the bosh-lite as a template, I'm tripping up on various parts. Is this
even a valid direction in installing?
Either AWS or OpenStack.

Thanks,
Tom



Re: Purge files on NFS or S3?

Dieu Cao <dcao@...>
 

An option could be to just delete all the resource files on the blobstore.
The effect would be that for binaries that would have been matched, they
would be uploaded again on the first new push including those binaries.



Re: Purge files on NFS or S3?

John Wong
 

Hi all

Thanks. No, I was just curious whether there was a way to identify what to
remove in the blobstore, because I was surprised at the size of my blobstore
at this point. I will check what's in there (maybe James is right that it is
mostly resource files). I am currently using NFS. I can build a CF with S3 as
my blobstore.
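
For checking what is taking up space on an NFS-backed blobstore, a per-directory size summary is one possible starting point. The Python sketch below totals file sizes under each top-level subdirectory of a given root. The function names are hypothetical, and the blobstore mount point and its directory layout depend on the deployment, so nothing here assumes particular directory names.

```python
import os


def size_by_top_level_dir(root):
    """Return {subdirectory name: total bytes of regular files beneath it}."""
    totals = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue  # files directly under root are ignored
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if not os.path.islink(path):
                        total += os.path.getsize(path)
            totals[entry.name] = total
    return totals


def report(root):
    """Print the subdirectories of root, largest first."""
    sizes = size_by_top_level_dir(root)
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{size / 1024 ** 3:8.2f} GiB  {name}")
```

Calling `report()` with the blobstore mount point would then show which top-level directory accounts for most of the usage.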

John




Re: Purge files on NFS or S3?

Chad Woolley <thewoolleyman@...>
 

Not sure if this is what you need, but you can manually sync + delete files
from a local filesystem (including NFS mount) to/from S3:

http://s3tools.org/s3cmd-sync

... with `--delete-removed` option

-- Chad

On Sat, May 9, 2015 at 12:19 AM, James Bayer <jbayer(a)pivotal.io> wrote:

john, i think the resource files may grow forever right now without
intervention.

i'm pretty confident that when apps are deleted that their droplets are
deleted with them and that proper garbage collection occurs with that.

i'm unaware of any NFS file system to s3 blob migration. you would need
to update the CC_DB references too i'm pretty sure. i'm interested if you
find out more.

On Tue, May 5, 2015 at 1:14 PM, John Wong <gokoproject(a)gmail.com> wrote:

Hi

I just looked at our disk usage on NFS server. We have used like 200G so
far, and I wonder if there's a systematic way to purge files we don't need
(or how do I know I don't need them)?

Similarly, if I were to replace NFS server with S3 instead, does the
existing process (if any) work with S3?

Thanks.


--
Thank you,

James Bayer



Re: [vcap-dev] Java OOM debugging

Lari Hotari <Lari@...>
 

FYI: Tomcat 8.0.20 might be consuming more memory than 8.0.18:
https://github.com/cloudfoundry/java-buildpack/issues/166#issuecomment-94517568

Other things we’ve tried:

- We set verbose garbage collection to verify there were no memory size
issues within the JVM. There weren't.

- We tried setting minimum memory for native, it had no effect. The
container still gets killed.

- We tried adjusting the ‘memory heuristics’ so that they added up to 80
rather than 100. This had the effect of causing a delay in the container
being killed. However it still was killed.

I think adjusting memory heuristics so that they add up to 80 doesn't
make a difference because the values aren't percentages.
The values are proportional weighting values used in the memory
calculation:
https://github.com/grails-samples/java-buildpack/blob/b4abf89/docs/jre-oracle_jre.md#memory-calculation
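
To illustrate why rescaling the weights changes nothing, here is a minimal
Python sketch of proportional weighting. It is a simplified model and not
the buildpack's actual algorithm (which also handles lower and upper bounds
and stack sizing). The function name `allocate` and the weight values are
hypothetical.

```python
def allocate(total_mb, weights):
    """Split total_mb across regions in proportion to their weights."""
    weight_sum = sum(weights.values())
    return {name: total_mb * w / weight_sum for name, w in weights.items()}


# Hypothetical weights that add up to 100 ...
weights_100 = {"heap": 75, "metaspace": 10, "stack": 5, "native": 10}
# ... and the same weights scaled so that they add up to 80.
weights_80 = {name: w * 0.8 for name, w in weights_100.items()}

alloc_100 = allocate(2048, weights_100)
alloc_80 = allocate(2048, weights_80)

for name in weights_100:
    print(name, round(alloc_100[name]), round(alloc_80[name]))
```

In this model both sets of weights give every region the same share of the
2048 MB. Only a change in the ratios between the weights moves memory from
one region to another.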

I found out that the only way to reserve "unused" memory is to set a
high value for the native memory lower bound in the memory_sizes.native
setting of config/open_jdk_jre.yml .
Example:
https://github.com/grails-samples/java-buildpack/blob/22e0f6a/config/open_jdk_jre.yml#L25



This seems like classic memory leak behaviour to me.

In my case it wasn't a classical Java memory leak, since the Java
application wasn't leaking memory. I was able to confirm this by getting
some heap dumps with the HeapDumpServlet
(https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/HeapDumpServlet.groovy)
and analyzing them.

In my case the JVM's RSS memory size is slowly growing. It probably is
some kind of memory leak since one process I've been monitoring now is
very close to the memory limit. The uptime is now almost 3 weeks.

Here is the latest diff of the meminfo report.
https://gist.github.com/lhotari/ee77decc2585f56cf3ad#file-meminfo_diff_example2-txt

From a Java perspective this isn't classical. The JVM heap isn't filling
up. The problem is that the RSS size is slowly growing and will eventually
cause the Java process to cross the memory boundary, so that the process
gets killed by the Linux kernel cgroups OOM killer.

The RSS size might be growing for many reasons. I have been able to
slow down the growth with various MALLOC_ and JVM parameter
tuning (-XX:MinMetaspaceExpansion=1M -XX:CodeCacheExpansionSize=1M). I'm
able to get a longer uptime, but the problem isn't solved.

Lari


On 15-05-11 06:41 AM, Head-Rapson, David wrote:

Thanks for the continued advice.



We’ve hit on a key discovery after yet another soak test this weekend.

- When we deploy using Tomcat 8.0.18 we don’t see the issue

- When we deploy using Tomcat 8.0.20 (same app version, same
CF space, same services bound, same JBP code version, same JRE
version, running at the same time), we see the crashes occurring after
just a couple of hours.



Ideally we’d go ahead with the memory calculations you mentioned
however we’re stuck on lucid64 because we’re using Pivotal CF 1.3.x &
we’re having upgrade issues to 1.4.x.

So we’re not able to adjust MALLOC_ARENA_MAX, nor are we able to view
RSS in pmap as you describe



Other things we’ve tried:

- We set verbose garbage collection to verify there were no
memory size issues within the JVM. There weren't.

- We tried setting minimum memory for native, it had no
effect. The container still gets killed

- We tried adjusting the ‘memory heuristics’ so that they
added up to 80 rather than 100. This had the effect of causing a delay
in the container being killed. However it still was killed.



This seems like classic memory leak behaviour to me.



From: Lari Hotari [mailto:lari.hotari(a)sagire.fi] On Behalf Of Lari Hotari
Sent: 08 May 2015 16:25
To: Daniel Jones; Head-Rapson, David
Cc: cf-dev(a)lists.cloudfoundry.org
Subject: Re: [Cf-dev] [vcap-dev] Java OOM debugging




In my case, it turned out to be essential to reserve enough memory
for "native" in the JBP. For the 2GB total memory, I set the minimum
to 330M. With that setting I have been able to get over 2 weeks of
uptime so far.

I mentioned this in my previous email:

The workaround for that in my case was to add a native key under
memory_sizes in open_jdk_jre.yml and set the minimum to 330M (that is
for a 2GB total memory).
see example
https://github.com/grails-samples/java-buildpack/blob/22e0f6a/config/open_jdk_jre.yml#L25
that was how I got the app I'm running on CF to stay within the memory
bounds. I'm sure there is now also a way to get the keys without
forking the buildpack. I could have also adjusted the percentage
portions, but I wanted to set a hard minimum for this case.


I've been trying to get some insight by diffing the reports gathered
from the meminfo servlet
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MemoryInfoServlet.groovy


Here is such an example of a diff:
https://gist.github.com/lhotari/ee77decc2585f56cf3ad#file-meminfo_diff_example-txt

meminfo includes pmap output to report the memory map of the process.
I have just noticed that most of the memory has already been mmapped
from the OS and it's just growing in RSS size.
For example:
< 00000000a7600000 1471488 1469556 1469556 rw--- [ anon ]
00000000a7600000 1471744 1470444 1470444 rw--- [ anon ]
The pmap output from lucid64 didn't include the RSS size, so you have
to use cflinuxfs2 for this. It's also better because of other reasons.
The glibc in lucid64 is old and has some bugs around the MALLOC_ARENA_MAX.

I was able to manually estimate the maximum RSS size that the Java
process will consume by simply picking the large anon blocks from the
pmap report and adding up their allocated virtual size (VSS).
Based on this calculation, I picked the minimum of 330M for "native"
in open_jdk_jre.yml as I mentioned before.
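
The manual estimate described above can be sketched in Python as follows.
This is only an illustration, not the tool used in the thread. It assumes
the `pmap -x` column order (address, Kbytes, RSS, Dirty, mode, mapping).
The function name `sum_large_anon_vss_kb`, the 64 MB threshold, and the
second and third sample rows are hypothetical. It does not try to
recognise the CompressedClassSpaceSize block, which would still have to be
excluded by hand.

```python
def sum_large_anon_vss_kb(pmap_output, min_kb=65536):
    """Sum the virtual size (KB) of anonymous mappings of at least min_kb."""
    total_kb = 0
    for line in pmap_output.splitlines():
        # Drop diff markers such as "< " or "> " in front of a row.
        fields = line.lstrip("<> ").split()
        # Assumed columns: address, Kbytes, RSS, Dirty, mode, mapping
        if len(fields) < 6 or "anon" not in fields[5:]:
            continue
        try:
            size_kb = int(fields[1])
        except ValueError:
            continue  # header or separator row
        if size_kb >= min_kb:
            total_kb += size_kb
    return total_kb


sample = """\
00000000a7600000 1471744 1470444 1470444 rw--- [ anon ]
00007f2a10000000     132      40      40 rw--- [ anon ]
00007f2a1c000000  131072       0       0 ----- [ anon ]
"""
print(sum_large_anon_vss_kb(sample) // 1024, "MB of large anon VSS")
```

Comparing this total with the container memory limit gives a rough idea of
how much has to be left for "native".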

It looks like these rows are for the Heap size:
< 00000000a7600000 1471488 1469556 1469556 rw--- [ anon ]
00000000a7600000 1471744 1470444 1470444 rw--- [ anon ]
It looks like the JVM doesn't fully allocate that block in RSS
initially and most of the growth of RSS size comes from that in my
case. In your case, it might be something different.

I also added a servlet for getting glibc malloc_info statistics in XML
format (). I haven't really analysed that information because of time
constraints and because I don't have a pressing problem any more. By the
way, the malloc_info XML report is missing some key elements that have
been added in later glibc versions
(https://github.com/bminor/glibc/commit/4d653a59ffeae0f46f76a40230e2cfa9587b7e7e).

If killjava.sh never fires and the app crashed with Warden out of
memory errors, then I believe it's the kernel's cgroups OOM killer
that has killed the container processes. I have found this location
where Warden oom notifier gets the OOM notification event:
https://github.com/cloudfoundry/warden/blob/ad18bff/warden/lib/warden/container/features/mem_limit.rb#L70
This is the oom.c source code:
https://github.com/cloudfoundry/warden/blob/ad18bff7dc56acbc55ff10bcc6045ebdf0b20c97/warden/src/oom/oom.c
. It reads the cgroups control files and receives events from the
kernel that way.

I'd suggest that you run pmap on the Java process after it has
started and estimate the maximum RSS size by summing the VSS size,
rather than the RSS, of the large anon blocks that the Java process
has reserved for its different memory areas. You should leave out the
VSS of the CompressedClassSpaceSize block.
After this calculation, add enough memory to the "native" parameter in
JBP until the RSS size calculated this way stays under the limit.
That's the only "method" I have come up with so far.

It might be required to have some RSS space allocated for any zip/jar
files read by the Java process. I think that Java uses mmap files for
zip file reading by default and that might go on top of all other limits.
To test this theory, I'd suggest testing by adding
-Dsun.zip.disableMemoryMapping=true system property setting to
JAVA_OPTS. That disables the native mmap for zip/jar file reading. I
haven't had time to test this assumption.

I guess the only way to understand how Java allocates memory is to
look at the source code.
from http://openjdk.java.net/projects/jdk8u/ , the instructions to get
the source code of JDK 8:
hg clone http://hg.openjdk.java.net/jdk8u/jdk8u;cd jdk8u;sh get_source.sh
This tool is really good for grepping and searching the source code:
http://geoff.greer.fm/ag/
On Ubuntu it's in silversearcher-ag package, "apt-get install
silversearcher-ag" and on MacOSX brew it's "brew install
the_silver_searcher".
This alias is pretty useful:
alias codegrep='ag --color --group --pager less -C 5'
Then you just search for the correct location in code by starting with
the tokens you know about:
codegrep MaxMetaspaceSize
this gives pretty good starting points in looking how the JDK
allocates memory.

So the JDK source code is only a few commands away.

It would be interesting to hear more about this if someone has the
time to dig in to this. This is about how far I got and I hope sharing
this information helps someone continue. :)


Lari
github/twitter: lhotari

On 15-05-08 10:02 AM, Daniel Jones wrote:

Hi Lari et al,



Thanks for your help Lari.



David and I are pairing on this issue, and we're yet to resolve
it. We're in the process of creating a repeatable test case (our
most crashy app makes calls to external services that need
mocking), but in the meantime, here's what we've seen.



Between Java Buildpack commit e89e546 and 17162df, we see apps
crashing with Warden out of memory errors. killjava.sh never
fires, and this has led us to believe that the kernel is shooting
a cgroup process in the head after the cgroup oversteps its memory
limit. We cannot find any evidence of the OOM killer firing in any
logs, but we may not be looking in the right place.



The JBP is setting heap to be 70%, metaspace to be 15% (with max
set to the same as initial), 5% for "stack", 5% for "normalised
stack" and 10% for "native". We do not understand why this adds up
to 105%, but haven't looked into the JBP algorithm yet. Any
pointers on what "normalised stack" is would be much appreciated,
as this doesn't appear in the list of heuristics supplied via app env.



Other team members tried applying the same settings that you
suggested - thanks for this. Apps still crash with these settings,
albeit less frequently.



After reading the blog you linked to
(http://java.dzone.com/articles/java-8-permgen-metaspace) we
wondered whether the increased reserved metaspace claimed after
metaspace GC might be causing a problem; however we reused the
test code to create a metaspace leak in a CF app and saw metaspace
GCs occur correctly, and memory usage never grew over
MaxMetaspaceSize. This figures, as the committed metaspace is
still less than MaxMetaspaceSize, and the reserved appears to be
whatever RAM is free across the whole DEA.



We noted that an Oracle blog
(https://blogs.oracle.com/poonam/entry/about_g1_garbage_collector_permanent)
mentions that the metaspace size parameters are approximate. We're
currently wondering if native allocations by Tomcat (APR, NIO) are
taking up more container memory, and so when the metaspace fills,
it's creeping slightly over the limit and triggering the kernel's
OOM killer.



Any suggestions would be much appreciated. We've tried to resist
tweaking heuristics blindly, but are running out of options as
we're struggling to figure out how the Java process is using
/committed/ memory. pmap seems to show virtual memory, and so it's
hard to see if things like the metaspace or NIO ByteBuffers are
nabbing too much and triggering the kernel's OOM killer.



Thanks for all your help,



Daniel Jones & David Head-Rapson



On Wed, Apr 29, 2015 at 8:07 PM, Lari Hotari <Lari(a)hotari.net> wrote:

Hi,

I created a few tools to debug OOM problems since the application
I was responsible for running on CF was failing constantly because
of OOM problems. The problems I had, turned out not to be actual
memory leaks in the Java application.

In the "cf events appname" log I would get entries like this:
2015-xx-xxTxx:xx:xx.00-0400 app.crash appname
index: 1, reason: CRASHED, exit_description: out of memory,
exit_status: 255
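
A small Python sketch for picking such entries out of the events log
follows. The field layout is assumed from the sample entry above and may
differ between cf CLI versions. The names `CRASH_RE` and `parse_crash` are
hypothetical.

```python
import re

CRASH_RE = re.compile(
    r"index: (?P<index>\d+), reason: (?P<reason>\w+), "
    r"exit_description: (?P<description>[^,]+), exit_status: (?P<status>-?\d+)"
)


def parse_crash(entry):
    """Extract crash fields from one `cf events` entry, or return None."""
    # Collapse wrapped lines so the pattern sees a single-spaced string.
    match = CRASH_RE.search(" ".join(entry.split()))
    if match is None:
        return None
    fields = match.groupdict()
    fields["index"] = int(fields["index"])
    fields["status"] = int(fields["status"])
    # "out of memory" here means the container hit its limit.
    fields["container_oom"] = fields["description"] == "out of memory"
    return fields


entry = """2015-xx-xxTxx:xx:xx.00-0400 app.crash appname
index: 1, reason: CRASHED, exit_description: out of memory,
exit_status: 255"""
print(parse_crash(entry))
```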

These types of entries are produced when the container goes over
its memory resource limits. It doesn't mean that there is a
memory leak in the Java application. The container gets killed by
the Linux kernel oom killer
(https://github.com/cloudfoundry/warden/blob/master/warden/README.md#limit-handle-mem-value)
based on the resource limits set to the warden container.

The memory limit is specified in number of bytes. It is enforced
using the control group associated with the container. When a
container exceeds this limit, one or more of its processes will be
killed by the kernel. Additionally, the Warden will be notified
that an OOM happened and it subsequently tears down the container.

In my case it never got killed by the killjava.sh script that gets
called in the java-buildpack when an OOM happens in Java.

This is the tool I built to debug the problems:
https://github.com/lhotari/java-buildpack-diagnostics-app
I deployed that app as part of the forked buildpack I'm using.
Please read the readme about what its limitations are. It worked
for me, but it might not work for you. It's open source and you can
fork it. :)

There is a solution in my toolcase for creating a heapdump and
uploading that to S3:
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/HeapDumpServlet.groovy
The readme explains how to setup Amazon S3 keys for this:
https://github.com/lhotari/java-buildpack-diagnostics-app#amazon-s3-setup
Once you get a dump, you can then analyse the dump in a java
profiler tool like YourKit.

I also have a solution that forks the java-buildpack, modifies
killjava.sh, and adds a script that uploads the heapdump to S3 in
the case of OOM:
https://github.com/lhotari/java-buildpack/commit/2d654b80f3bf1a0e0f1bae4f29cb85f56f5f8c46

In java-buildpack-diagnostics-app I also have other tools for
getting Linux operating-system-specific memory information, for
example:

https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MemoryInfoServlet.groovy
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MemorySmapServlet.groovy
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MallocInfoServlet.groovy

These tools are handy for looking at details of the Java process
RSS memory usage growth.

There is also a solution for getting ssh shell access inside your
application with tmate.io:
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/TmateSshServlet.groovy
(this version is only compatible with the new "cflinuxfs2" stack)

It looks like there are serious problems on CloudFoundry with the
memory sizing calculation. An application that doesn't have an OOM
problem will get killed by the oom killer because the Java process
will go over the memory limits.
I filed this issue:
https://github.com/cloudfoundry/java-buildpack/issues/157 , but
that might not cover everything.

The workaround for that in my case was to add a native key under
memory_sizes in open_jdk_jre.yml and set the minimum to 330M (that
is for a 2GB total memory).
see example
https://github.com/grails-samples/java-buildpack/blob/22e0f6a/config/open_jdk_jre.yml#L25
that was how I got the app I'm running on CF to stay within the
memory bounds. I'm sure there is now also a way to get the keys
without forking the buildpack. I could have also adjusted the
percentage portions, but I wanted to set a hard minimum for this case.

It was also required to do some other tuning.

I added this to JAVA_OPTS:
-XX:CompressedClassSpaceSize=256M -XX:InitialCodeCacheSize=64M
-XX:CodeCacheExpansionSize=1M -XX:CodeCacheMinimumFreeSpace=1M
-XX:ReservedCodeCacheSize=200M -XX:MinMetaspaceExpansion=1M
-XX:MaxMetaspaceExpansion=8M -XX:MaxDirectMemorySize=96M
while trying to keep the Java process from growing in RSS memory size.

The memory overhead of a 64 bit Java process on Linux can be
reduced by specifying these environment variables:

stack: cflinuxfs2
.
.
.
env:
  MALLOC_ARENA_MAX: 2
  MALLOC_MMAP_THRESHOLD_: 131072
  MALLOC_TRIM_THRESHOLD_: 131072
  MALLOC_TOP_PAD_: 131072
  MALLOC_MMAP_MAX_: 65536

MALLOC_ARENA_MAX works only on cflinuxfs2 stack (the lucid64 stack
has a buggy version of glibc).

explanation about MALLOC_ARENA_MAX from Heroku:
https://devcenter.heroku.com/articles/tuning-glibc-memory-behavior
some measurement data how it reduces memory consumption:
https://devcenter.heroku.com/articles/testing-cedar-14-memory-use

I have created a PR to add this to CF java-buildpack:
https://github.com/cloudfoundry/java-buildpack/pull/160

I also created an issue,
https://github.com/cloudfoundry/java-buildpack/issues/163 , and a pull request,
https://github.com/cloudfoundry/java-buildpack/pull/159 .

I hope this information helps others struggling with OOM problems
in CF.
I'm not saying that this is a ready made solution just for you.
YMMV. It worked for me.

-Lari




On 15-04-29 10:53 AM, Head-Rapson, David wrote:

Hi,

I’m after some guidance on how to profile Java apps in CF,
in order to get to the bottom of memory issues.

We have an app that’s crashing every few hours with OOM error,
most likely it’s a memory leak.

I’d like to profile the JVM and work out what’s eating memory,
however tools like yourkit require connectivity INTO the JVM
server (i.e. the warden container), either via host / port or
via SSH.

Since warden containers cannot be connected to on ports other
than for HTTP and cannot be SSHd to, neither of these works
for me.



I tried installing a standalone JDK onto the warden container,
however as soon as I ran ‘jmap’ to invoke the dump, warden
cleaned up the container – most likely for memory
over-consumption.



I had previously found a hack in the Weblogic buildpack
(https://github.com/pivotal-cf/weblogic-buildpack/blob/master/docs/container-wls-monitoring.md)
for modifying the start script which, when used with
-XX:+HeapDumpOnOutOfMemoryError, should copy any heapdump files
to a file share somewhere. I have my own custom buildpack so
I could use something similar.

Has anyone got a better solution than this?



We would love to use newrelic / app dynamics for this however
we’re not allowed. And I’m not 100% certain they could help
with this either.



Dave












--

Regards,



Daniel Jones

EngineerBetter.com



Re: Recipe to install Diego?

Ken Ojiri
 

Hi,

I have posted a sample BOSH deployment manifest to Gist.
https://gist.github.com/ozzozz/4c08c37863b703a75afc
I could deploy cf-release v207 and diego-release 0.1099.0 to AWS Tokyo
region by MicroBOSH.

I could also deploy cf-release and diego-release to OpenStack(Juno).
The manifests differ only in 'networks', 'cloud_properties' and 'stemcell'.

Regards,
Ken

---
<ozzozz(a)gmail.com>
Mitaka, Tokyo Japan

On Sat, May 9, 2015 at 8:57 PM, Tom Sherrod <tom.sherrod(a)gmail.com> wrote:
Hi,

Are there any examples or docs on installing Diego with bosh/microbosh?
Using the bosh-lite as a template, I'm tripping up on various parts. Is this
even a valid direction in installing?
Either AWS or Openstack..

Thanks,
Tom



Re: [vcap-dev] Java OOM debugging

Dave Head-Rapson
 

Thanks for the continued advice.

We’ve hit on a key discovery after yet another soak test this weekend.

- When we deploy using Tomcat 8.0.18 we don’t see the issue

- When we deploy using Tomcat 8.0.20 (same app version, same CF space, same services bound, same JBP code version, same JRE version, running at the same time), we see the crashes occurring after just a couple of hours.

Ideally we’d go ahead with the memory calculations you mentioned however we’re stuck on lucid64 because we’re using Pivotal CF 1.3.x & we’re having upgrade issues to 1.4.x.
So we’re not able to adjust MALLOC_ARENA_MAX, nor are we able to view RSS in pmap as you describe

Other things we’ve tried:

- We set verbose garbage collection to verify there were no memory size issues within the JVM. There weren't.

- We tried setting minimum memory for native, it had no effect. The container still gets killed

- We tried adjusting the ‘memory heuristics’ so that they added up to 80 rather than 100. This had the effect of causing a delay in the container being killed. However it still was killed.

This seems like classic memory leak behaviour to me.

From: Lari Hotari [mailto:lari.hotari(a)sagire.fi] On Behalf Of Lari Hotari
Sent: 08 May 2015 16:25
To: Daniel Jones; Head-Rapson, David
Cc: cf-dev(a)lists.cloudfoundry.org
Subject: Re: [Cf-dev] [vcap-dev] Java OOM debugging


For my case, it turned out to be essential to reserve enough memory for "native" in the JBP. For the 2GB total memory, I set the minimum to 330M. With that setting I have been able to get over 2 weeks up time by now.

I mentioned this in my previous email:

The workaround for that in my case was to add a native key under memory_sizes in open_jdk_jre.yml and set the minimum to 330M (that is for a 2GB total memory).
see example https://github.com/grails-samples/java-buildpack/blob/22e0f6a/config/open_jdk_jre.yml#L25
that was how I got the app I'm running on CF to stay within the memory bounds. I'm sure there is now also a way to get the keys without forking the buildpack. I could have also adjusted the percentage portions, but I wanted to set a hard minimum for this case.

I've been trying to get some insight by diffing the reports gathered from the meminfo servlet https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MemoryInfoServlet.groovy

Here is such an example of a diff:
https://gist.github.com/lhotari/ee77decc2585f56cf3ad#file-meminfo_diff_example-txt

meminfo has pmap output included to get the report of the memory map of the process. I have just noticed that most of the memory has already been mmap:ed from the OS and it's just growing in RSS size. For example:
< 00000000a7600000 1471488 1469556 1469556 rw--- [ anon ]
00000000a7600000 1471744 1470444 1470444 rw--- [ anon ]
The pmap output from lucid64 didn't include the RSS size, so you have to use cflinuxfs2 for this. It's also better because of other reasons. The glibc in lucid64 is old and has some bugs around the MALLOC_ARENA_MAX.

I was manually able to estimate the maximum size of the RSS size of what the Java process will consume by simply picking the large anon-blocks from the pmap report and calculating those blocks by the allocated virtual size (VSS).
Based on this calculation, I picked the minimum of 330M for "native" in open_jdk_jre.yml as I mentioned before.

It looks like these rows are for the Heap size:
< 00000000a7600000 1471488 1469556 1469556 rw--- [ anon ]
00000000a7600000 1471744 1470444 1470444 rw--- [ anon ]
It looks like the JVM doesn't fully allocate that block in RSS initially and most of the growth of RSS size comes from that in my case. In your case, it might be something different.

I also added a servlet for getting glibc malloc_info statistics in XML format (). I haven't really analysed that information because of time constraints and because I don't have a pressing problem any more. By the way, the malloc_info XML report is missing some key elements that have been added in later glibc versions (https://github.com/bminor/glibc/commit/4d653a59ffeae0f46f76a40230e2cfa9587b7e7e).

If killjava.sh never fires and the app crashed with Warden out of memory errors, then I believe it's the kernel's cgroups OOM killer that has killed the container processes. I have found this location where Warden oom notifier gets the OOM notification event:
https://github.com/cloudfoundry/warden/blob/ad18bff/warden/lib/warden/container/features/mem_limit.rb#L70
This is the oom.c source code: https://github.com/cloudfoundry/warden/blob/ad18bff7dc56acbc55ff10bcc6045ebdf0b20c97/warden/src/oom/oom.c . It reads the cgroups control files and receives events from the kernel that way.

I'd suggest that you run pmap on the Java process after it has started and estimate the maximum RSS size by summing the VSS size, rather than the RSS, of the large anon blocks that the Java process has reserved for its different memory areas. You should leave out the VSS of the CompressedClassSpaceSize block.
After this calculation, add enough memory to the "native" parameter in JBP until the RSS size calculated this way stays under the limit.
That's the only "method" I have come up with so far.

It might be required to have some RSS space allocated for any zip/jar files read by the Java process. I think that Java uses mmap files for zip file reading by default and that might go on top of all other limits.
To test this theory, I'd suggest testing by adding -Dsun.zip.disableMemoryMapping=true system property setting to JAVA_OPTS. That disables the native mmap for zip/jar file reading. I haven't had time to test this assumption.

I guess the only way to understand how Java allocates memory is to look at the source code.
from http://openjdk.java.net/projects/jdk8u/ , the instructions to get the source code of JDK 8:
hg clone http://hg.openjdk.java.net/jdk8u/jdk8u;cd jdk8u;sh get_source.sh
This tool is really good for grepping and searching the source code: http://geoff.greer.fm/ag/
On Ubuntu it's in silversearcher-ag package, "apt-get install silversearcher-ag" and on MacOSX brew it's "brew install the_silver_searcher".
This alias is pretty useful:
alias codegrep='ag --color --group --pager less -C 5'
Then you just search for the correct location in code by starting with the tokens you know about:
codegrep MaxMetaspaceSize
this gives pretty good starting points in looking how the JDK allocates memory.

So the JDK source code is only a few commands away.

It would be interesting to hear more about this if someone has the time to dig in to this. This is about how far I got and I hope sharing this information helps someone continue. :)


Lari
github/twitter: lhotari
On 15-05-08 10:02 AM, Daniel Jones wrote:
Hi Lari et al,

Thanks for your help Lari.

David and I are pairing on this issue, and we're yet to resolve it. We're in the process of creating a repeatable test case (our most crashy app makes calls to external services that need mocking), but in the meantime, here's what we've seen.

Between Java Buildpack commit e89e546 and 17162df, we see apps crashing with Warden out of memory errors. killjava.sh never fires, and this has led us to believe that the kernel is shooting a cgroup process in the head after the cgroup oversteps its memory limit. We cannot find any evidence of the OOM killer firing in any logs, but we may not be looking in the right place.

The JBP is setting heap to be 70%, metaspace to be 15% (with max set to the same as initial), 5% for "stack", 5% for "normalised stack" and 10% for "native". We do not understand why this adds up to 105%, but haven't looked into the JBP algorithm yet. Any pointers on what "normalised stack" is would be much appreciated, as this doesn't appear in the list of heuristics supplied via app env.

Other team members tried applying the same settings that you suggested - thanks for this. Apps still crash with these settings, albeit less frequently.

After reading the blog you linked to (http://java.dzone.com/articles/java-8-permgen-metaspace) we wondered whether the increased reserved metaspace claimed after metaspace GC might be causing a problem; however we reused the test code to create a metaspace leak in a CF app and saw metaspace GCs occur correctly, and memory usage never grew over MaxMetaspaceSize. This figures, as the committed metaspace is still less than MaxMetaspaceSize, and the reserved appears to be whatever RAM is free across the whole DEA.

We noted that an Oracle blog (https://blogs.oracle.com/poonam/entry/about_g1_garbage_collector_permanent) mentions that the metaspace size parameters are approximate. We're currently wondering if native allocations by Tomcat (APR, NIO) are taking up more container memory, and so when the metaspace fills, it's creeping slightly over the limit and triggering the kernel's OOM killer.

Any suggestions would be much appreciated. We've tried to resist tweaking heuristics blindly, but are running out of options as we're struggling to figure out how the Java process is using committed memory. pmap seems to show virtual memory, and so it's hard to see if things like the metaspace or NIO ByteBuffers are nabbing too much and triggering the kernel's OOM killer.

Thanks for all your help,

Daniel Jones & David Head-Rapson

On Wed, Apr 29, 2015 at 8:07 PM, Lari Hotari <Lari(a)hotari.net> wrote:
Hi,

I created a few tools to debug OOM problems since the application I was responsible for running on CF was failing constantly because of OOM problems. The problems I had, turned out not to be actual memory leaks in the Java application.

In the "cf events appname" log I would get entries like this:
2015-xx-xxTxx:xx:xx.00-0400 app.crash appname index: 1, reason: CRASHED, exit_description: out of memory, exit_status: 255

These types of entries are produced when the container goes over its memory resource limits. It doesn't mean that there is a memory leak in the Java application. The container gets killed by the Linux kernel oom killer (https://github.com/cloudfoundry/warden/blob/master/warden/README.md#limit-handle-mem-value) based on the resource limits set to the warden container.

The memory limit is specified in number of bytes. It is enforced using the control group associated with the container. When a container exceeds this limit, one or more of its processes will be killed by the kernel. Additionally, the Warden will be notified that an OOM happened and it subsequently tears down the container.
In my case it never got killed by the killjava.sh script that gets called in the java-buildpack when an OOM happens in Java.
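
Since the limit is enforced by a control group, the remaining headroom can be estimated from the cgroup's own counters. The following is a minimal sketch, not something from the buildpack or from warden: it assumes a cgroup v1 memory controller whose files are readable at /sys/fs/cgroup/memory inside the container (that path and the helper names are assumptions and may not apply to every stack).

```python
from pathlib import Path

# Assumed cgroup v1 mount point; may differ or be hidden inside a container.
CGROUP_DIR = Path("/sys/fs/cgroup/memory")


def read_bytes(path):
    """Read a single integer byte count from a cgroup control file."""
    return int(Path(path).read_text().strip())


def headroom(limit_bytes, usage_bytes):
    """Bytes left before the cgroup limit is reached (never negative)."""
    return max(limit_bytes - usage_bytes, 0)


def container_headroom(cgroup_dir=CGROUP_DIR):
    """Headroom computed from the cgroup v1 limit and usage files."""
    limit = read_bytes(Path(cgroup_dir) / "memory.limit_in_bytes")
    usage = read_bytes(Path(cgroup_dir) / "memory.usage_in_bytes")
    return headroom(limit, usage)
```

Logging container_headroom() periodically would show whether the process creeps towards the limit gradually or jumps over it in one allocation, which is hard to tell from the "out of memory" event alone.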

This is the tool I built to debug the problems:
https://github.com/lhotari/java-buildpack-diagnostics-app
I deployed that app as part of the forked buildpack I'm using.
Please read the readme about what its limitations are. It worked for me, but it might not work for you. It's open source and you can fork it. :)

There is a solution in my toolcase for creating a heapdump and uploading that to S3:
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/HeapDumpServlet.groovy
The readme explains how to setup Amazon S3 keys for this: https://github.com/lhotari/java-buildpack-diagnostics-app#amazon-s3-setup
Once you get a dump, you can then analyse it in a Java profiler tool like YourKit.

I also have a solution that forks the java-buildpack, modifies killjava.sh, and adds a script that uploads the heapdump to S3 in the case of OOM:
https://github.com/lhotari/java-buildpack/commit/2d654b80f3bf1a0e0f1bae4f29cb85f56f5f8c46

In java-buildpack-diagnostics-app I also have other tools for getting Linux operating system specific memory information, for example:

https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MemoryInfoServlet.groovy
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MemorySmapServlet.groovy
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/MallocInfoServlet.groovy

These tools are handy for looking at details of the Java process RSS memory usage growth.
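
As an illustration of the kind of breakdown these tools give, resident memory can be summed per mapping from /proc/<pid>/smaps, which reports Rss rather than virtual size. This is a minimal sketch and not the code of the servlets linked above; it assumes a Linux host where the smaps file is readable, and the function names are made up for the example.

```python
import re
from collections import defaultdict

# Header line of one mapping in /proc/<pid>/smaps, for example:
# "00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/java"
_HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\d+\s*(.*)$")


def rss_by_mapping(smaps_text):
    """Return {mapping name: resident kB}, summed over all mappings."""
    totals = defaultdict(int)
    name = "[anon]"
    for line in smaps_text.splitlines():
        header = _HEADER.match(line)
        if header:
            # Mappings without a path are anonymous (malloc arenas, etc.).
            name = header.group(1).strip() or "[anon]"
        elif line.startswith("Rss:"):
            totals[name] += int(line.split()[1])
    return dict(totals)


def read_smaps(pid="self"):
    """Read the raw smaps text for a process (Linux only)."""
    with open(f"/proc/{pid}/smaps") as f:
        return f.read()
```

Sorting the result of rss_by_mapping(read_smaps(pid)) by value shows whether the resident growth sits in anonymous mappings or in file-backed ones.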

There is also a solution for getting SSH shell access inside your application with tmate.io (http://tmate.io):
https://github.com/lhotari/java-buildpack-diagnostics-app/blob/master/src/main/groovy/io/github/lhotari/jbpdiagnostics/TmateSshServlet.groovy (this version is only compatible with the new "cflinuxfs2" stack)

It looks like there are serious problems on CloudFoundry with the memory sizing calculation. An application that doesn't have an OOM problem will get killed by the OOM killer because the Java process will go over the memory limits.
I filed this issue: https://github.com/cloudfoundry/java-buildpack/issues/157 , but that might not cover everything.

The workaround for that in my case was to add a native key under memory_sizes in open_jdk_jre.yml and set the minimum to 330M (that is for a 2GB total memory).
See this example: https://github.com/grails-samples/java-buildpack/blob/22e0f6a/config/open_jdk_jre.yml#L25
That was how I got the app I'm running on CF to stay within the memory bounds. I'm sure there is now also a way to get the keys without forking the buildpack. I could have also adjusted the percentage portions, but I wanted to set a hard minimum for this case.

It was also required to do some other tuning.

I added this to JAVA_OPTS:
-XX:CompressedClassSpaceSize=256M -XX:InitialCodeCacheSize=64M -XX:CodeCacheExpansionSize=1M -XX:CodeCacheMinimumFreeSpace=1M -XX:ReservedCodeCacheSize=200M -XX:MinMetaspaceExpansion=1M -XX:MaxMetaspaceExpansion=8M -XX:MaxDirectMemorySize=96M
while trying to keep the Java process from growing in RSS memory size.
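
To see how much native memory these flags allow in the worst case, the explicit upper bounds can be added up. The sketch below is only arithmetic on the option string quoted above; the choice of which flags count as caps is an assumption of this example, and thread stacks, GC data structures and malloc overhead are not covered by any of them.

```python
import re

JAVA_OPTS = (
    "-XX:CompressedClassSpaceSize=256M -XX:InitialCodeCacheSize=64M "
    "-XX:CodeCacheExpansionSize=1M -XX:CodeCacheMinimumFreeSpace=1M "
    "-XX:ReservedCodeCacheSize=200M -XX:MinMetaspaceExpansion=1M "
    "-XX:MaxMetaspaceExpansion=8M -XX:MaxDirectMemorySize=96M"
)

_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

# Flags treated as upper bounds on a native region (assumption of this sketch).
CAPS = ("CompressedClassSpaceSize", "ReservedCodeCacheSize", "MaxDirectMemorySize")


def parse_size(text):
    """Convert a JVM size such as '256M' or '1024' to bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.upper())
    if not match:
        raise ValueError(f"not a JVM size: {text!r}")
    return int(match.group(1)) * _UNITS.get(match.group(2), 1)


def xx_sizes(opts):
    """Map each -XX:Name=size option in the string to its size in bytes."""
    sizes = {}
    for token in opts.split():
        match = re.fullmatch(r"-XX:(\w+)=(\d+[kKmMgG]?)", token)
        if match:
            sizes[match.group(1)] = parse_size(match.group(2))
    return sizes


def capped_total(opts):
    """Sum of the regions that have an explicit upper bound."""
    sizes = xx_sizes(opts)
    return sum(sizes.get(name, 0) for name in CAPS)
```

For the options above, capped_total(JAVA_OPTS) comes to 552M (256M + 200M + 96M), which has to fit in the container next to the heap and the metaspace.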

The memory overhead of a 64 bit Java process on Linux can be reduced by specifying these environment variables:

stack: cflinuxfs2
.
.
.
env:
  MALLOC_ARENA_MAX: 2
  MALLOC_MMAP_THRESHOLD_: 131072
  MALLOC_TRIM_THRESHOLD_: 131072
  MALLOC_TOP_PAD_: 131072
  MALLOC_MMAP_MAX_: 65536

MALLOC_ARENA_MAX works only on the cflinuxfs2 stack (the lucid64 stack has a buggy version of glibc).

An explanation of MALLOC_ARENA_MAX from Heroku:
https://devcenter.heroku.com/articles/tuning-glibc-memory-behavior
Some measurement data on how it reduces memory consumption: https://devcenter.heroku.com/articles/testing-cedar-14-memory-use

I have created a PR to add this to CF java-buildpack:
https://github.com/cloudfoundry/java-buildpack/pull/160

I also created an issue, https://github.com/cloudfoundry/java-buildpack/issues/163 , and another PR, https://github.com/cloudfoundry/java-buildpack/pull/159 .

I hope this information helps others struggling with OOM problems in CF.
I'm not saying that this is a ready made solution just for you. YMMV. It worked for me.

-Lari



On 15-04-29 10:53 AM, Head-Rapson, David wrote:
Hi,
I’m after some guidance on how to profile Java apps in CF, in order to get to the bottom of memory issues.
We have an app that’s crashing every few hours with an OOM error; most likely it’s a memory leak.
I’d like to profile the JVM and work out what’s eating memory, however tools like YourKit require connectivity INTO the JVM server (i.e. the warden container), either via host / port or via SSH.
Since warden containers cannot be connected to on ports other than for HTTP and cannot be SSHd to, neither of these works for me.

I tried installing a standalone JDK onto the warden container, however as soon as I ran ‘jmap’ to invoke the dump, warden cleaned up the container – most likely for memory over-consumption.

I had previously found a hack in the Weblogic buildpack (https://github.com/pivotal-cf/weblogic-buildpack/blob/master/docs/container-wls-monitoring.md) for modifying the start script which, when used with -XX:+HeapDumpOnOutOfMemoryError, should copy any heapdump files to a file share somewhere. I have my own custom buildpack so I could use something similar.
Has anyone got a better solution than this?

We would love to use New Relic / AppDynamics for this, however we’re not allowed. And I’m not 100% certain they could help with this either.

Dave






--
Regards,

Daniel Jones
EngineerBetter.com


Recipe to install Diego?

Lev Berman <lev.berman@...>
 

Hi,

I can share my experience on installing Diego on AWS. I followed the
instructions for BOSH Lite deployment
<https://github.com/cloudfoundry-incubator/diego-release#deploying-diego-to-a-local-bosh-lite-instance>,
except that I replaced 3 templates with the ones you can find in the
attachment. Note that in my case instance-count-overrides.yml leads to a
one-AZ deployment. Prerequisites include creating a separate AWS subnet for
Diego. Also, you need to configure routes and security groups in the same
manner as you did for Cloud Foundry.

On Sat, May 9, 2015 at 2:57 PM, Tom Sherrod <tom.sherrod(a)gmail.com> wrote:

Hi,

Are there any examples or docs on installing Diego with bosh/microbosh?
Using the bosh-lite as a template, I'm tripping up on various parts. Is
this even a valid direction in installing?
Either AWS or OpenStack.

Thanks,
Tom
--
Lev Berman

Altoros - Cloud Foundry deployment, training and integration

Github: https://github.com/ldmberman


Re: Is there an auto-completion script?

Daniel Kaplan
 

Great, thanks a lot for the links.

-Dan

On Thu, May 7, 2015 at 10:50 PM, Takeshi Morikawa <moog0814(a)gmail.com>
wrote:

Hi Daniel

I found this

cf(cli) completion
https://github.com/cf-buildpacks/cf_completion

bosh cli completion
https://github.com/anfernee/bosh-completion

Is my answer what you're hoping for?

2015-05-08 14:28 GMT+09:00 Daniel Kaplan <dkaplan(a)pivotal.io>:

Hi DevList,

I think it would be extra convenient if there were a Cloud Foundry
auto-completion script that worked similarly to the way git's
git-completion
<https://github.com/git/git/blob/master/contrib/completion/git-completion.bash>
works.

Does one already exist? If not, I might write it in my free time. Let
me know your thoughts.

Thanks,
Dan



Recipe to install Diego?

Tom Sherrod <tom.sherrod@...>
 

Hi,

Are there any examples or docs on installing Diego with bosh/microbosh?
Using the bosh-lite as a template, I'm tripping up on various parts. Is
this even a valid direction in installing?
Either AWS or OpenStack.

Thanks,
Tom
