
Re: Concourse-Up: Deploy Concourse CI in a Single Command

Tim Lawrence <tim.lawrence1984@...>
 

on the beach or on the bench?

I find the glare on my screen difficult on the beach :-)

Good work!

On Thu, May 4, 2017 at 2:19 PM, Dan Young <dan.young(a)engineerbetter.com>
wrote:

Hi all,

I guess there are a lot of Concourse CI users on this list, so I thought
I'd mention a new tool that some of the EngineerBetter team built while on
the beach. Concourse-Up will let you deploy a production-ready Concourse on
AWS for your team, *using just a single command* and without needing to
know anything about BOSH. It supports custom domains, SSL termination,
scaling workers, and idempotent upgrades.

Here is a blog post:
http://www.engineerbetter.com/2017/05/03/introducing-concourse-up.html

And the GitHub repo: https://github.com/engineerbetter/concourse-up
Hopefully this will be useful for teams setting up Concourse for new
projects, or new Concourse users who are less than enthusiastic about needing
to learn BOSH.

Feedback, feature requests and bug reports are most welcome.

Regards,
Dan Young - CEO
EngineerBetter Ltd <http://www.engineerbetter.com> - The UK Cloud Foundry
Specialists
@dan0young <http://www.twitter.com/dan0young>
+44 (0)7783 397092 <+447783397092>


Re: Questions on credential rotation

Dan Jahner
 

Hey Bernd,

I am the product manager of CredHub. We view our project as being part of
the 'rotate' component of Justin's vision. Repave is focused on recreating
instances to a known-good state, which is something outside of our area of
concern.

The current roadmap for CredHub is focused on pulling credentials into our
system; specifically BOSH deployment and service credentials, later
application credentials. Once we have a solid footing for storing and
managing access to these credentials, we plan to explore what possibilities
exist for reducing the friction of credential rotation.

Although I haven't spent a long time investigating, I would agree with your
characterization of the 3 classes of credentials. I think there is
overwhelming agreement that all components should allow credential rotation
without downtime, where possible, so I would expect it is on many teams'
radar. If not, I am happy to start conversations once we get to that phase
of our project.

Thanks,
Dan

On Thu, May 4, 2017 at 6:32 AM Krannich, Bernd <bernd.krannich(a)sap.com>
wrote:

Hello all,



We love Justin Smith’s approach of “Rotate, Repair, Repave” [1] when it
comes to security. Looking at how the “Rotate” aspect is handled in Cloud
Foundry and other BOSH deployments today, we think there are currently three
classes of credentials:



1. Credentials that can be rotated by updating them and doing a
`bosh deploy` with zero downtime

2. Credentials that can be rotated by updating them and doing a
`bosh deploy` involving a downtime [2]

3. Credentials that cannot be rotated easily at all [3]



A couple of questions here:



· Is the above summary accurate?

· For updates involving a downtime, the only naïve solution I
could come up with is to support two sets of credentials during the
transition. Are there any more strategies?

· Are there any efforts to turn credentials falling under #2 and
#3 into ones that can be updated without downtime?

· CredHub [4] seems to be geared in the direction of “repave”. Is
this the case and does this maybe even support work on the previous bullet?



Thanks in advance,

Bernd



[1] https://www.youtube.com/watch?v=NUXpz0Dni50

[2]
https://github.com/cloudfoundry/cf-release/blob/master/templates/cf.yml#L634 might
be a good example

[3]
https://github.com/cloudfoundry/cf-release/blob/master/templates/cf.yml#L865 might
be a good example

[4] https://github.com/cloudfoundry-incubator/credhub





*Bernd Krannich*

SAP Cloud Platform

*SAP SE*

Dietmar-Hopp-Allee 16, 69190 Walldorf, Germany



E bernd.krannich(a)sap.com



Pflichtangaben/Mandatory Disclosure Statement: www.sap.com/impressum
<http://www.sap.com/company/legal/impressum.epx/>



Diese E-Mail kann Betriebs- oder Geschäftsgeheimnisse oder sonstige
vertrauliche Informationen enthalten. Sollten Sie diese E-Mail irrtümlich
erhalten haben, ist Ihnen eine Kenntnisnahme des Inhalts, eine
Vervielfältigung oder Weitergabe der E-Mail ausdrücklich untersagt. Bitte
benachrichtigen Sie uns und vernichten Sie die empfangene E-Mail. Vielen
Dank.



This e-mail may contain trade secrets or privileged, undisclosed, or
otherwise confidential information. If you have received this e-mail in
error, you are hereby notified that any review, copying, or distribution of
it is strictly prohibited. Please inform us immediately and destroy the
original transmittal. Thank you for your cooperation.


disabling ipv6 at the kernel level in stemcells

Dmitriy Kalinin <dkalinin@...>
 

hey all,

we've disabled ipv6 via ipv6.disable=1 in grub in 3363.19+ linux stemcells
some time ago. in previous stemcells it was disabled via sysctl at bootup.

we are now aware that a small number of releases may be affected (one release,
for example, was disabling a portion of ipv6 functionality itself but can no
longer succeed since the /proc/... entry is gone and that code was not
checking for its existence). we also had a report that some java processes
may be affected if they use particular libraries that for some
reason try to obtain a local ipv6 address (even though ipv6 was disabled
before their startup).
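
for release authors hitting the first case, a minimal sketch of a defensive check might look like the one below. it assumes the release writes to a sysctl entry somewhere under /proc/sys/net/ipv6; the exact path used here is only an illustration and is not taken from any particular release. the idea is simply to skip the write when the entry does not exist:

```python
import os

# Hypothetical sysctl entry; substitute whatever path your release touches.
IPV6_SYSCTL = "/proc/sys/net/ipv6/conf/all/disable_ipv6"


def disable_ipv6_if_present(path=IPV6_SYSCTL):
    """Write "1" to the sysctl entry only if the kernel exposes it.

    Returns True if the value was written, False if the entry is absent
    (for example when the kernel was booted with ipv6.disable=1).
    """
    if not os.path.exists(path):
        # Nothing to disable: treat a missing entry as success, not an error.
        return False
    with open(path, "w") as f:
        f.write("1\n")
    return True
```

a start script could call this and carry on regardless of the return value, instead of failing when the entry is missing.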

we try very hard to avoid making any breaking changes in minor stemcell
versions; however, this change turned out to be more disruptive than we
expected. given that it affects only a small number of releases, we have
decided to keep it in (hoping that it should be easy for release authors to
issue a patch if necessary).

(for folks thinking about the future ipv6 support, bosh-agent will
automatically turn it on at runtime if necessary.)

as usual feel free to reach out to us on #bosh slack if you need any help.

dmitriy


Questions on credential rotation

Krannich, Bernd <bernd.krannich@...>
 

Hello all,

We love Justin Smith’s approach of “Rotate, Repair, Repave” [1] when it comes to security. Looking at how the “Rotate” aspect is handled in Cloud Foundry and other BOSH deployments today, we think there are currently three classes of credentials:


1. Credentials that can be rotated by updating them and doing a `bosh deploy` with zero downtime

2. Credentials that can be rotated by updating them and doing a `bosh deploy` involving a downtime [2]

3. Credentials that cannot be rotated easily at all [3]

A couple of questions here:


· Is the above summary accurate?

· For updates involving a downtime, the only naïve solution I could come up with is to support two sets of credentials during the transition. Are there any more strategies?

· Are there any efforts to turn credentials falling under #2 and #3 into ones that can be updated without downtime?

· CredHub [4] seems to be geared in the direction of “repave”. Is this the case and does this maybe even support work on the previous bullet?
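
The "two sets of credentials" idea from the second bullet could be sketched roughly as follows. This is only an illustration of the pattern, not how any Cloud Foundry or BOSH component implements it; the `RotatingSecret` class and its method names are hypothetical.

```python
import hmac


class RotatingSecret:
    """Accepts the current secret and, during a transition, the previous one."""

    def __init__(self, current):
        self.current = current
        self.previous = None

    def rotate(self, new_secret):
        # Start the transition: the old secret stays valid for now, so
        # clients that have not been redeployed yet keep working.
        self.previous = self.current
        self.current = new_secret

    def finish_rotation(self):
        # End the transition once every client has picked up the new secret.
        self.previous = None

    def verify(self, presented):
        candidates = [self.current]
        if self.previous is not None:
            candidates.append(self.previous)
        # compare_digest does a constant-time comparison of the two values.
        return any(
            hmac.compare_digest(presented.encode("utf-8"), c.encode("utf-8"))
            for c in candidates
        )
```

With something like this on the server side, the rotation becomes three steps: call `rotate`, roll out the new value to all clients, then call `finish_rotation`.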

Thanks in advance,
Bernd

[1] https://www.youtube.com/watch?v=NUXpz0Dni50
[2] https://github.com/cloudfoundry/cf-release/blob/master/templates/cf.yml#L634 might be a good example
[3] https://github.com/cloudfoundry/cf-release/blob/master/templates/cf.yml#L865 might be a good example
[4] https://github.com/cloudfoundry-incubator/credhub


Bernd Krannich
SAP Cloud Platform
SAP SE
Dietmar-Hopp-Allee 16, 69190 Walldorf, Germany

E bernd.krannich(a)sap.com



Concourse-Up: Deploy Concourse CI in a Single Command

Dan Young <dan.young@...>
 

Hi all,

I guess there are a lot of Concourse CI users on this list, so I thought
I'd mention a new tool that some of the EngineerBetter team built while on
the beach. Concourse-Up will let you deploy a production-ready Concourse on
AWS for your team, *using just a single command* and without needing to
know anything about BOSH. It supports custom domains, SSL termination,
scaling workers, and idempotent upgrades.

Here is a blog post:
http://www.engineerbetter.com/2017/05/03/introducing-concourse-up.html

And the GitHub repo: https://github.com/engineerbetter/concourse-up
Hopefully this will be useful for teams setting up Concourse for new
projects, or new Concourse users who are less than enthusiastic about needing
to learn BOSH.

Feedback, feature requests and bug reports are most welcome.

Regards,
Dan Young - CEO
EngineerBetter Ltd <http://www.engineerbetter.com> - The UK Cloud Foundry
Specialists
@dan0young <http://www.twitter.com/dan0young>
+44 (0)7783 397092 <+447783397092>


Re: proposal: unik & cloud foundry

Idit Levine
 

Great, looking forward and thanks for your help.

On Apr 29, 2017, at 4:19 PM, Michael Maximilien <maxim(a)us.ibm.com> wrote:

Wonderful. I can give you up to 15 mins. We use Zoom so you can share your screen if you want to do demo / presentation.

Look forward to it. Let me know if you need anything from me.

Best,

dr.max

ibm cloud labs
silicon valley, ca
usa
maximilien.org <http://maximilien.org/>

Sent from my iPhone

On Apr 29, 2017, at 12:40 AM, Idit Levine <idit.levine(a)gmail.com> wrote:

It does. Looking forward!

Sent from my iPhone

On Apr 28, 2017, at 4:43 PM, Chip Childers <cchilders(a)cloudfoundry.org> wrote:

Idit,

I spoke with Dr. Max about this proposal, and he and I both think that walking through the project and proposal live with the community would be best done during the monthly CAB call. The next call is on 5/17 at 11 AM Eastern US Time.

Hopefully that works for you!

-chip

On Thu, Apr 13, 2017 at 11:39 AM Idit Levine <idit.levine(a)gmail.com> wrote:
Sounds good. I replied to the BOSH comment, and it should definitely be part of the discussion.
It makes a lot of sense to hold a f2f discussion at Cloud Foundry Summit. Chip, thoughts?

I believe I have replied to all the comments.

Cheers,
Idit


On Apr 13, 2017, at 4:29 AM, Michael Maximilien <mmaximilien(a)gmail.com> wrote:

Could we use time at the CF summit for f2f discussions?

There are also some comments in the docs. I added one more today w.r.t. the BOSH CPI and interface in Unik.

Anyhow, we could of course have more than one discussion.

Best,

Max

On Thu, Apr 13, 2017 at 6:11 AM, Idit Levine <idit.levine(a)gmail.com> wrote:
Hi Daniel,

Sorry for the late reply and thank you for the comments. Please see my answers inline.
Cloud Foundry aside, how does Unik go from receiving some app code to knowing exactly which kernel features are needed for the application at run time? This is the part that seems most magical to me.
First, UniK is not the one that makes that decision; it is made by the tool that builds the unikernel, and that tool is different for each unikernel type.

For example, in IncludeOS the mechanism used for extracting only what is needed from the operating system is the one provided by default by modern linkers. Each part of the OS is compiled into an object file, such as ip4.o, udp.o, pci_device.o etc., which are then combined using ar to form a static library os.a. When a program links with this library, only what’s necessary will automatically be extracted by the linker and end up in the final binary. To facilitate this build process a custom toolchain has been created.

Hope that makes sense.
Where is the Unik daemon running - presumably somewhere of the operator's choice, most likely as a BOSH-deployed job?
Yes, it is the operator's job, and it would be smart to create a BOSH release for UniK; right now the install process is done by running 'make'.
What happens if the Unik daemon dies - is it stateful? Will unikernels continue to run and be manageable?
The unikernels will continue running but will not be managed by UniK until the daemon is restarted, just like Docker.
Does the unikernel image get created at app staging time?
If the image is not pre-built, it will be built; otherwise the existing image will be used.
Has there been any discussion about how to strategically implement this alongside Diego, instead of the tactical solution of 'going through' Diego?
I am going to set up a time with Chip's help so we can all discuss it together. I think a clean integration should be done in Garden, but that should be discussed and decided by the community.

I hope that clarifies some of your concerns; please feel free to ask me any questions.

Thanks,
Idit


On Mar 23, 2017, at 4:32 PM, Daniel Jones <daniel.jones(a)engineerbetter.com> wrote:

Hi Idit,

Thanks for sending out the proposal, and thanks for giving your talk in Santa Clara last year, which I attended and was excited by.

I've read the proposal and heard the talk, but I'll be frank - I still don't get it. That may well be because I'm a simpleton who doesn't know much about kernels, let alone unikernels, but if I don't ask some silly questions I'll probably never know.

Cloud Foundry aside, how does Unik go from receiving some app code to knowing exactly which kernel features are needed for the application at run time? This is the part that seems most magical to me.
Where is the Unik daemon running - presumably somewhere of the operator's choice, most likely as a BOSH-deployed job?
What happens if the Unik daemon dies - is it stateful? Will unikernels continue to run and be manageable?
Does the unikernel image get created at app staging time?
Has there been any discussion about how to strategically implement this alongside Diego, instead of the tactical solution of 'going through' Diego?

I'm excited by the promise of unikernels - being able to cut out so much bloat and indirection would be a massive win for efficiency if they usurped containers as a unit of currency. I wonder how much CO2 emission we could avoid by stripping abstraction away, instead of piling it on!

Regards,
Daniel Jones - CTO
+44 (0)79 8000 9153 <tel:+44%207980%20009153>
@DanielJonesEB <https://twitter.com/DanielJonesEB>
EngineerBetter Ltd <http://www.engineerbetter.com/> - UK Cloud Foundry Specialists

On 16 March 2017 at 21:09, Michael Maximilien <mmaximilien(a)gmail.com> wrote:
Thanks for this submission Idit and Dell/EMC.

I look forward to comments from the community as we work this through the CF-extensions process.

Best.

On Thu, Mar 16, 2017 at 1:25 PM Idit Levine <idit.levine(a)gmail.com> wrote:
One clarification: we are proposing it to the CF incubation program.


On Mar 16, 2017, at 3:59 PM, Idit Levine <idit.levine(a)gmail.com> wrote:

Hi all,

We at Dell EMC would like to propose contributing project unik (https://github.com/emc-advanced-dev/unik) and its integration with Cloud Foundry (https://github.com/emc-advanced-dev/cf-unik-buildpack) to the Cloud Foundry community.

You can find the full official proposal at: https://docs.google.com/document/d/1Q9GakKpm6DMniJpWB-fqhE13SSPaj4-3sOWZ-I5nVyA/edit?usp=sharing
We of course welcome input and feedback on the proposal via inline commentary on the proposal document or directly to me.

Thanks,
Idit
--
dr.max Sent from my iPhone http://maximilien.org <http://maximilien.org/>



--
max
http://maximilien.org <http://maximilien.org/>
http://blog.maximilien.com <http://blog.maximilien.com/>
--
Chip Childers
CTO, Cloud Foundry Foundation
1.267.250.0815


Re: [cf-bosh] Re: BOSH CLI v2

Julz Friedman
 

I can't wait to try out the new log-in com-mand

:trollface:

(Srsly great work bosh folks! :))

On Tue, 2 May 2017 at 06:39 Sean Keery <skeery(a)pivotal.io> wrote:

Excellent. Thanks to the team for all the hard work.

On Mon, May 1, 2017 at 7:39 PM Dmitriy Kalinin <dkalinin(a)pivotal.io>
wrote:

Hey all,

I am happy to announce BOSH CLI v2 is now generally available. CLI v2
incorporates tons of feedback received over the past few years. Some
features have been redesigned, some removed, and some hopefully much
improved.

You will find docs available on https://bosh.io/docs. Let us know if you
find any missing material (I'm sure there is some). Here are some
documentation pages worth mentioning:

- https://bosh.io/docs#basic-deploy - cli v2 section on the index page
- https://bosh.io/docs/cli-v2 - all commands
- https://bosh.io/docs/cli-v2-diff - some notable cli v1 vs v2
differences

The CLI binary also links directly to the command-specific documentation
section from its command help output (-h), so more information is just a
command+click away.

CLI v1 will continue to work and be supported for some time; however, new
Director features will not be exposed in v1.

Feel free to drop by #bosh slack channel if you have any questions,

BOSH team

--
*Sean Keery | Minister of Chaos | Pivotal Cloud Foundry Solutions*
Mobile: 970.274.1285 | skeery(a)pivotal.io
LinkedIn: @zgrinch <http://www.linkedin.com/in/zgrinch> | Twitter:
@zgrinch <https://twitter.com/zgrinch> | Github: @skibum55
<https://github.com/skibum55>


Adopt the Silicon Valley state of mind


Re: [cf-bosh] BOSH CLI v2

Sean Keery <skeery@...>
 

Excellent. Thanks to the team for all the hard work.

On Mon, May 1, 2017 at 7:39 PM Dmitriy Kalinin <dkalinin(a)pivotal.io> wrote:

Hey all,

I am happy to announce BOSH CLI v2 is now generally available. CLI v2
incorporates tons of feedback received over the past few years. Some
features have been redesigned, some removed, and some hopefully much
improved.

You will find docs available on https://bosh.io/docs. Let us know if you
find any missing material (I'm sure there is some). Here are some
documentation pages worth mentioning:

- https://bosh.io/docs#basic-deploy - cli v2 section on the index page
- https://bosh.io/docs/cli-v2 - all commands
- https://bosh.io/docs/cli-v2-diff - some notable cli v1 vs v2 differences

The CLI binary also links directly to the command-specific documentation
section from its command help output (-h), so more information is just a
command+click away.

CLI v1 will continue to work and be supported for some time; however, new
Director features will not be exposed in v1.

Feel free to drop by #bosh slack channel if you have any questions,

BOSH team

--
*Sean Keery | Minister of Chaos | Pivotal Cloud Foundry Solutions*
Mobile: 970.274.1285 | skeery(a)pivotal.io
LinkedIn: @zgrinch <http://www.linkedin.com/in/zgrinch> | Twitter: @zgrinch
<https://twitter.com/zgrinch> | Github: @skibum55
<https://github.com/skibum55>


Adopt the Silicon Valley state of mind


BOSH CLI v2

Dmitriy Kalinin <dkalinin@...>
 

Hey all,

I am happy to announce BOSH CLI v2 is now generally available. CLI v2
incorporates tons of feedback received over the past few years. Some
features have been redesigned, some removed, and some hopefully much
improved.

You will find docs available on https://bosh.io/docs. Let us know if you
find any missing material (I'm sure there is some). Here are some
documentation pages worth mentioning:

- https://bosh.io/docs#basic-deploy - cli v2 section on the index page
- https://bosh.io/docs/cli-v2 - all commands
- https://bosh.io/docs/cli-v2-diff - some notable cli v1 vs v2 differences

The CLI binary also links directly to the command-specific documentation
section from its command help output (-h), so more information is just a
command+click away.

CLI v1 will continue to work and be supported for some time; however, new
Director features will not be exposed in v1.

Feel free to drop by #bosh slack channel if you have any questions,

BOSH team


CVE-2017-4974: Blind SQL Injection with privileged UAA endpoints

Molly Crowther
 

CF devs,

Please see the following public link for information about a high-severity
CVE in UAA. This is a continuation of work that was originally released as
part of CVE-2017-4972 <https://www.cloudfoundry.org/cve-2017-4972/>. It's
essentially the same attack, but on some endpoints that require more
privileges.

https://www.cloudfoundry.org/cve-2017-4974

Friendly reminder that you can subscribe to new Cloud Foundry security
issues at: https://www.cloudfoundry.org/category/security/feed/

Please let me know if you have any questions or concerns.

Thanks,
Molly Crowther
Cloud Foundry Foundation Security Team


CVE-2017-4961: BOSH Director Shell Injection Vulnerabilities

Molly Crowther
 

CF devs,

Please see the following public link for information about a high-severity
CVE in BOSH.
https://www.cloudfoundry.org/cve-2017-4961/

Friendly reminder that you can subscribe to new Cloud Foundry security
issues at: https://www.cloudfoundry.org/category/security/feed/

Please let me know if you have any questions or concerns.

Thanks,
Molly Crowther
Cloud Foundry Foundation Security Team


Re: How to capture CF runtime container and application memory usage

Stanislav German-Evtushenko
 

Hi,

CF runtime container and application memory usage
Do you know which one is accurate?
It is always hard to say what the actual memory usage is, because it depends on what we mean by it, and the meaning probably comes from the purpose.

Talking about CF, it reports memory usage for a container as RSS + CACHE - INACTIVE_FILE. It is hard to say whether we can consider this accurate, but it is more or less what we can call "active memory": if we set the memory limit lower than that, the application might become slower due to swapping out or doing file I/O more often than required.
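
As a rough illustration of that formula, here is a minimal sketch that computes it from the contents of a cgroup v1 memory.stat file. The function names are made up for this example, and whether the plain or the `total_`-prefixed counters are the right ones for a given container is an assumption to check against the DEA and Guardian sources linked below.

```python
def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup v1 memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def active_memory_bytes(stats, prefix=""):
    """Return rss + cache - inactive_file, in bytes.

    Pass prefix="total_" to use the hierarchical counters instead.
    """
    return (
        stats[prefix + "rss"]
        + stats[prefix + "cache"]
        - stats[prefix + "inactive_file"]
    )
```

Inside a container one would feed it the text of the memory.stat file for the container's cgroup; the location of that file depends on how the cgroup filesystem is mounted.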

Some reference links:
- Linux Memory https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
- DEA: https://github.com/cloudfoundry/dea_ng/blob/65b5415770a767dd38e18eeef7576b9c38aa3509/lib/dea/stat_collector.rb#L60
- Guardian: https://github.com/cloudfoundry/guardian/blob/cf6012a27dbf5646062a9a83f8c95d12a65c9856/rundmc/runrunc/stats.go#L70

Best regards,
Stanislav

________________________________
From: Zhiyong Li <Zhiyong.Li(a)sas.com>
Sent: Sunday, April 30, 2017 1:02
To: cf-dev(a)lists.cloudfoundry.org
Subject: [cf-dev] How to capture CF runtime container and application memory usage

We have used four different commands to capture the memory usage for the Spring Boot apps running in the CF runtime: jcmd, ps, cf app and pmap. However, each one seems to report a different value. Do you know which one is accurate? Here are example outputs:

Running “cf app”, the memory usage is reported as 930.4M:

C:\Program Files\Cloud Foundry>cf app mtp-identities
Showing health and status for app mtp-identities in org stocf / space dev as admin...

name: mtp-identities
requested state: started
instances: 1/1
usage: 2G x 1 instances
routes: vdmml.17w21.sas.com/identities
last uploaded: Thu 27 Apr 19:01:05 EDT 2017
stack: cflinuxfs2
buildpack: java_buildpack

     state     since                  cpu    memory         disk         details
#0   running   2017-04-27T23:02:13Z   0.2%   930.4M of 2G   186M of 2G


If I ssh into the container and run ‘ps’, the RSS size is reported as 939M:

vcap@dpgdg9915eg:~$ ps axo user,pid,cpu,rss,size,vsize,nlwp,cmd --sort -rss
USER PID CPU RSS SIZE VSZ NLWP CMD
vcap 13 - 962008 5909184 6023768 58 /home/vcap/app/.java-buildpack/open_jdk_jre/bin/java -Djava.io.tmpdir=/home/vcap/tmp -XX:OnOutOfMemoryError=/home/vcap/app/.java-buildpack/open_jdk_jre/bin/killjava.sh -Xmx384M -XX:MaxMetaspaceSize=128M -Xss256K -Xms384M -XX:MetaspaceSize=128M -javaagent:/home/vcap/app/lib/aspectjweaver-1.8.9.jar -Dserver.port=8080 -Djava.security.egd=file:/dev/./urandom -Dconsul.token=tobeusedfordemosonlyclnt -XX:NativeMemoryTracking=detail -verbose:class -XX:MaxDirectMemorySize=10M -XX:ReservedCodeCacheSize=240M -cp /home/vcap/app/. org.springframework.boot.loader.JarLauncher

While still in the container, running pmap for the process, the RSS is reported as 944M, and the “dirty” column reports 921M:
Address           Kbytes     RSS   Dirty Mode  Mapping

total kB         6030924  966196  942984

None of this looks close to what is reported via the NMT (jcmd) totals, though (644M committed):

2017-04-28T12:28:03.81-0400 [APP/0] OUT Total: reserved=1889246KB +84548KB, committed=659678KB +137612KB


Re: How to capture CF runtime container and application memory usage

Mike Dalessio
 

Here's an interesting read that Julz has linked to in the past when
questions come up about memory usage in a Linux container:

https://fabiokung.com/2014/03/13/memory-inside-linux-containers/

On Mon, May 1, 2017 at 8:04 AM, Daniel Mikusa <dmikusa(a)pivotal.io> wrote:

If you're trying to get the memory usage for everything running in the
app's container, then `cf app` is your best bet. If you want to look at
specific things inside the container (cause there will be more than one
process running in an app container), you could use the other commands that
you mentioned.

One note about Java NMT, it does not account for all memory in the JVM's
process (notably missing are third party native code memory allocations
and JDK class libraries), so what it reports will always be less than the
actual usage.

Dan


On Sat, Apr 29, 2017 at 12:02 PM, Zhiyong Li <Zhiyong.Li(a)sas.com> wrote:

We have used four different commands to capture the memory usage for the
Spring Boot apps running in the CF runtime: jcmd, ps, cf app and pmap.
However, each one seems to report a different value. Do you know which one
is accurate? Here are example outputs:

Running “cf app”, the memory usage is reported as 930.4M:

C:\Program Files\Cloud Foundry>cf app mtp-identities
Showing health and status for app mtp-identities in org stocf / space dev as admin...

name: mtp-identities
requested state: started
instances: 1/1
usage: 2G x 1 instances
routes: vdmml.17w21.sas.com/identities
last uploaded: Thu 27 Apr 19:01:05 EDT 2017
stack: cflinuxfs2
buildpack: java_buildpack

     state     since                  cpu    memory         disk         details
#0   running   2017-04-27T23:02:13Z   0.2%   930.4M of 2G   186M of 2G


If I ssh into the container and run ‘ps’, the RSS size is reported as
939M:

vcap@dpgdg9915eg:~$ ps axo user,pid,cpu,rss,size,vsize,nlwp,cmd --sort -rss
USER PID CPU RSS SIZE VSZ NLWP CMD
vcap 13 - 962008 5909184 6023768 58 /home/vcap/app/.java-buildpack/open_jdk_jre/bin/java -Djava.io.tmpdir=/home/vcap/tmp -XX:OnOutOfMemoryError=/home/vcap/app/.java-buildpack/open_jdk_jre/bin/killjava.sh -Xmx384M -XX:MaxMetaspaceSize=128M -Xss256K -Xms384M -XX:MetaspaceSize=128M -javaagent:/home/vcap/app/lib/aspectjweaver-1.8.9.jar -Dserver.port=8080 -Djava.security.egd=file:/dev/./urandom -Dconsul.token=tobeusedfordemosonlyclnt -XX:NativeMemoryTracking=detail -verbose:class -XX:MaxDirectMemorySize=10M -XX:ReservedCodeCacheSize=240M -cp /home/vcap/app/. org.springframework.boot.loader.JarLauncher

While still in the container, running pmap for the process, the RSS is
reported as 944M, and the “dirty” column reports 921M:

Address           Kbytes     RSS   Dirty Mode  Mapping

total kB         6030924  966196  942984

None of this looks close to what is reported via the NMT (jcmd) totals,
though (644M committed):

2017-04-28T12:28:03.81-0400 [APP/0] OUT Total: reserved=1889246KB +84548KB, committed=659678KB +137612KB


Re: How to capture CF runtime container and application memory usage

Daniel Mikusa
 

If you're trying to get the memory usage for everything running in the
app's container, then `cf app` is your best bet. If you want to look at
specific things inside the container (cause there will be more than one
process running in an app container), you could use the other commands that
you mentioned.

One note about Java NMT, it does not account for all memory in the JVM's
process (notably missing are third party native code memory allocations and
JDK class libraries), so what it reports will always be less than the
actual usage.
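
To put a number on that gap for the figures in this thread, here is a small sketch that pulls the committed total out of the NMT summary line and subtracts it from the RSS reported by ps. The helper names are made up for this example, and committed memory is not necessarily resident, so the result is only a rough indication of what NMT does not see.

```python
import re


def nmt_committed_kb(line):
    """Extract the committed size (in KB) from an NMT 'Total:' line."""
    match = re.search(r"committed=(\d+)KB", line)
    if match is None:
        raise ValueError("no committed= field found")
    return int(match.group(1))


def untracked_kb(rss_kb, nmt_line):
    """RSS minus NMT committed: a rough estimate of memory outside NMT."""
    return rss_kb - nmt_committed_kb(nmt_line)


# Values copied from the outputs quoted in this thread.
nmt_total = "Total: reserved=1889246KB +84548KB, committed=659678KB +137612KB"
rss_from_ps = 962008  # KB, the RSS column of the ps output

gap = untracked_kb(rss_from_ps, nmt_total)
print(f"{gap} KB (~{gap / 1024:.0f} MB) not accounted for by NMT")
```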

Dan

On Sat, Apr 29, 2017 at 12:02 PM, Zhiyong Li <Zhiyong.Li(a)sas.com> wrote:

We have used four different commands to capture the memory usage for the
Spring Boot apps running in the CF runtime: jcmd, ps, cf app and pmap.
However, each one seems to report a different value. Do you know which one
is accurate? Here are example outputs:

Running “cf app”, the memory usage is reported as 930.4M:

C:\Program Files\Cloud Foundry>cf app mtp-identities
Showing health and status for app mtp-identities in org stocf / space dev as admin...

name: mtp-identities
requested state: started
instances: 1/1
usage: 2G x 1 instances
routes: vdmml.17w21.sas.com/identities
last uploaded: Thu 27 Apr 19:01:05 EDT 2017
stack: cflinuxfs2
buildpack: java_buildpack

     state     since                  cpu    memory         disk         details
#0   running   2017-04-27T23:02:13Z   0.2%   930.4M of 2G   186M of 2G


If I ssh into the container and run ‘ps’, the RSS size is reported as 939M:

vcap@dpgdg9915eg:~$ ps axo user,pid,cpu,rss,size,vsize,nlwp,cmd --sort -rss
USER PID CPU RSS SIZE VSZ NLWP CMD
vcap 13 - 962008 5909184 6023768 58 /home/vcap/app/.java-buildpack/open_jdk_jre/bin/java -Djava.io.tmpdir=/home/vcap/tmp -XX:OnOutOfMemoryError=/home/vcap/app/.java-buildpack/open_jdk_jre/bin/killjava.sh -Xmx384M -XX:MaxMetaspaceSize=128M -Xss256K -Xms384M -XX:MetaspaceSize=128M -javaagent:/home/vcap/app/lib/aspectjweaver-1.8.9.jar -Dserver.port=8080 -Djava.security.egd=file:/dev/./urandom -Dconsul.token=tobeusedfordemosonlyclnt -XX:NativeMemoryTracking=detail -verbose:class -XX:MaxDirectMemorySize=10M -XX:ReservedCodeCacheSize=240M -cp /home/vcap/app/. org.springframework.boot.loader.JarLauncher

While still in the container, running pmap for the process, the RSS is
reported as 944M, and the “dirty” column reports 921M:

Address           Kbytes     RSS   Dirty Mode  Mapping

total kB         6030924  966196  942984

None of this looks close to what is reported via the NMT (jcmd) totals,
though (644M committed):

2017-04-28T12:28:03.81-0400 [APP/0] OUT Total: reserved=1889246KB +84548KB, committed=659678KB +137612KB


Re: proposal: unik & cloud foundry

Michael Maximilien
 

Wonderful. I can give you up to 15 mins. We use Zoom so you can share your screen if you want to do demo / presentation.

Look forward to it. Let me know if you need anything from me.

Best,

dr.max

ibm cloud labs
silicon valley, ca
usa
maximilien.org

Sent from my iPhone

On Apr 29, 2017, at 12:40 AM, Idit Levine <idit.levine(a)gmail.com> wrote:

It does. Looking forward!

Sent from my iPhone

On Apr 28, 2017, at 4:43 PM, Chip Childers <cchilders(a)cloudfoundry.org> wrote:

Idit,

I spoke with Dr. Max about this proposal, and he and I both think that walking through the project and proposal live with the community would be best done during the monthly CAB call. The next call is on 5/17 at 11 AM Eastern US Time.

Hopefully that works for you!

-chip

On Thu, Apr 13, 2017 at 11:39 AM Idit Levine <idit.levine(a)gmail.com> wrote:
Sounds good. I replied to the BOSH comment, and it should definitely be part of the discussion.
It makes a lot of sense to hold a f2f discussion at Cloud Foundry Summit. Chip, thoughts?

I believe I have replied to all the comments.

Cheers,
Idit


On Apr 13, 2017, at 4:29 AM, Michael Maximilien <mmaximilien(a)gmail.com> wrote:

Could we use time at the CF summit for f2f discussions?

There are also some comments in the docs. I added one more today w.r.t. the BOSH CPI and interface in Unik.

Anyhow, we could of course have more than one discussion.

Best,

Max

On Thu, Apr 13, 2017 at 6:11 AM, Idit Levine <idit.levine(a)gmail.com> wrote:
Hi Daniel,

Sorry for the late reply and thank you for the comments. Please see my answers inline.
Cloud Foundry aside, how does Unik go from receiving some app code to knowing exactly which kernel features are needed for the application at run time? This is the part that seems most magical to me.
First, UniK is not the one that makes that decision; it is made by the tool that builds the unikernel, and that is a different tool for each unikernel type.

For example, in IncludeOS the mechanism used for extracting only what is needed from the operating system is the one provided by default by modern linkers. Each part of the OS is compiled into an object file, such as ip4.o, udp.o, pci_device.o etc., which are then combined using ar to form a static library os.a. When a program links with this library, only what’s necessary will automatically be extracted by the linker and end up in the final binary. To facilitate this build process a custom toolchain has been created.
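
As a rough illustration of the archive behaviour described above, here is a toy model in Python. It is only a sketch: the object file names are the ones mentioned in the message, while the symbol names (ip4_send, udp_send, pci_probe) and the link() helper are invented for the example and are not taken from IncludeOS or any real linker.

```python
# Toy model of how a linker pulls members out of a static archive such as os.a.
# Object names come from the message above; all symbol names are hypothetical.
archive = {
    "ip4.o": {"defines": {"ip4_send"}, "needs": set()},
    "udp.o": {"defines": {"udp_send"}, "needs": {"ip4_send"}},
    "pci_device.o": {"defines": {"pci_probe"}, "needs": set()},
}


def link(undefined, archive):
    """Return the archive members needed to resolve the given undefined symbols."""
    pending = set(undefined)
    resolved = set()
    selected = []
    while pending:
        symbol = pending.pop()
        if symbol in resolved:
            continue
        for member, info in archive.items():
            if symbol in info["defines"]:
                if member not in selected:
                    selected.append(member)
                    resolved |= info["defines"]
                    # A pulled-in member may itself need further symbols.
                    pending |= info["needs"] - resolved
                break
        else:
            raise LookupError(f"undefined symbol: {symbol}")
    return sorted(selected)


# An application that only sends UDP never references the PCI code,
# so pci_device.o is left out of the final binary.
print(link({"udp_send"}, archive))
```

A real linker works on symbol tables inside the object files rather than on a dictionary, but the selection rule is the same idea: a member is included only if something already selected refers to a symbol it defines.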

Hope that makes sense.
Where is the Unik daemon running - presumably somewhere of the operator's choice, most likely as a BOSH-deployed job?
Yes, that is the operator's job, and it would be smart to create a BOSH release for UniK - right now the install process is done by running 'make'.
What happens if the Unik daemon dies - is it stateful? Will unikernels continue to run and be manageable?
The unikernels will continue running but will not be managed by UniK until the daemon is restarted - just like Docker.
Does the unikernel image get created at app staging time?
If the image is not pre-built it will be built; otherwise the existing image will be used.
Has there been any discussion about how to strategically implement this alongside Diego, instead of the tactical solution of 'going through' Diego?
I am going to set up a time with Chip's help so we can all discuss it together. I think that a clean integration should be done in Garden, but that should be discussed and decided by the community.

I hope that clarifies some of your concerns; please feel free to ask me any questions.

Thanks,
Idit


On Mar 23, 2017, at 4:32 PM, Daniel Jones <daniel.jones(a)engineerbetter.com> wrote:

Hi Idit,

Thanks for sending out the proposal, and thanks for giving your talk in Santa Clara last year, which I attended and was excited by.

I've read the proposal and heard the talk, but I'll be frank - I still don't get it. That may well be because I'm a simpleton who doesn't know much about kernels, let alone unikernels, but if I don't ask some silly questions I'll probably never know.

Cloud Foundry aside, how does Unik go from receiving some app code to knowing exactly which kernel features are needed for the application at run time? This is the part that seems most magical to me.
Where is the Unik daemon running - presumably somewhere of the operator's choice, most likely as a BOSH-deployed job?
What happens if the Unik daemon dies - is it stateful? Will unikernels continue to run and be manageable?
Does the unikernel image get created at app staging time?
Has there been any discussion about how to strategically implement this alongside Diego, instead of the tactical solution of 'going through' Diego?

I'm excited by the promise of unikernels - being able to cut out so much bloat and indirection would be a massive win for efficiency if they usurped containers as a unit of currency. I wonder how much CO2 emission we could avoid by stripping abstraction away, instead of piling it on!

Regards,
Daniel Jones - CTO
+44 (0)79 8000 9153
@DanielJonesEB
EngineerBetter Ltd - UK Cloud Foundry Specialists

On 16 March 2017 at 21:09, Michael Maximilien <mmaximilien(a)gmail.com> wrote:
Thanks for this submission Idit and Dell/EMC.

I look forward to comments from the community as we work this through the CF-extensions process.

Best.

On Thu, Mar 16, 2017 at 1:25 PM Idit Levine <idit.levine(a)gmail.com> wrote:
One clarification: we are proposing it to the CF incubation program.


On Mar 16, 2017, at 3:59 PM, Idit Levine <idit.levine(a)gmail.com> wrote:

Hi all,

We at Dell EMC would like to propose contributing the UniK project (https://github.com/emc-advanced-dev/unik) and its integration with Cloud Foundry (https://github.com/emc-advanced-dev/cf-unik-buildpack) to the Cloud Foundry community.

You can find the full official proposal at: https://docs.google.com/document/d/1Q9GakKpm6DMniJpWB-fqhE13SSPaj4-3sOWZ-I5nVyA/edit?usp=sharing
We of course welcome input and feedback on the proposal via inline commentary on the proposal document or directly to me.

Thanks,
Idit
--
dr.max Sent from my iPhone http://maximilien.org


--
max
http://maximilien.org
http://blog.maximilien.com
--
Chip Childers
CTO, Cloud Foundry Foundation
1.267.250.0815


How to capture CF runtime container and application memory usage

Zhiyong Li
 

We have used four different commands to capture the memory usage of Spring Boot apps running in the CF runtime: jcmd, ps, cf app and pmap. However, each one seems to report a different value. Do you know which one is accurate? Here are example outputs:

Running “cf app”, the memory usage is reported as 930.4M:

C:\Program Files\Cloud Foundry>cf app mtp-identities
Showing health and status for app mtp-identities in org stocf / space dev as admin...

name: mtp-identities
requested state: started
instances: 1/1
usage: 2G x 1 instances
routes: vdmml.17w21.sas.com/identities
last uploaded: Thu 27 Apr 19:01:05 EDT 2017
stack: cflinuxfs2
buildpack: java_buildpack

     state     since                  cpu    memory         disk         details
#0   running   2017-04-27T23:02:13Z   0.2%   930.4M of 2G   186M of 2G


If I ssh into the container and run ‘ps’, the RSS size is reported as 939M:

vcap(a)dpgdg9915eg:~$ ps axo user,pid,cpu,rss,size,vsize,nlwp,cmd --sort -rss
USER PID CPU RSS SIZE VSZ NLWP CMD
vcap 13 - 962008 5909184 6023768 58 /home/vcap/app/.java-buildpack/open_jdk_jre/bin/java -Djava.io.tmpdir=/home/vcap/tmp -XX:OnOutOfMemoryError=/home/vcap/app/.java-buildpack/open_jdk_jre/bin/killjava.sh -Xmx384M -XX:MaxMetaspaceSize=128M -Xss256K -Xms384M -XX:MetaspaceSize=128M -javaagent:/home/vcap/app/lib/aspectjweaver-1.8.9.jar -Dserver.port=8080 -Djava.security.egd=file:/dev/./urandom -Dconsul.token=tobeusedfordemosonlyclnt -XX:NativeMemoryTracking=detail -verbose:class -XX:MaxDirectMemorySize=10M -XX:ReservedCodeCacheSize=240M -cp /home/vcap/app/. org.springframework.boot.loader.JarLauncher

While still in the container, running pmap for the process, the RSS is reported as 944M, and the “dirty” column reports 921M:
Address           Kbytes     RSS   Dirty Mode  Mapping

total kB         6030924  966196  942984

None of this looks close to what is reported via the NMT (jcmd) totals though (644M committed):

2017-04-28T12:28:03.81-0400 [APP/0] OUT Total: reserved=1889246KB +84548KB, committed=659678KB +137612KB
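
To put the four readings on one scale, the figures quoted above can be converted to the same unit. The following is only arithmetic on the numbers in this message, assuming that ps, pmap and NMT report KB as multiples of 1024 bytes; it does not by itself explain where the difference comes from.

```python
# Readings copied from the outputs quoted above; all values are in KB.
readings_kb = {
    "ps RSS": 962008,
    "pmap RSS": 966196,
    "pmap Dirty": 942984,
    "NMT committed": 659678,
}

# "memory" column reported by `cf app`, already in M.
cf_app_mb = 930.4


def kb_to_mb(kb: int) -> float:
    """Convert KB to MB, assuming 1 MB = 1024 KB."""
    return kb / 1024


print(f"{'cf app':14s} {cf_app_mb:7.1f}M")
for name, kb in readings_kb.items():
    print(f"{name:14s} {kb_to_mb(kb):7.1f}M")

# Gap between the resident set seen by the OS and what NMT accounts for.
untracked_mb = kb_to_mb(readings_kb["pmap RSS"] - readings_kb["NMT committed"])
print(f"pmap RSS minus NMT committed: {untracked_mb:.1f}M")
```

On these numbers the three OS-level readings agree to within about 15M of each other, while the NMT committed total is roughly 300M lower.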


Re: Brokered route services only receiving traffic for routes mapped to started apps

Krannich, Bernd <bernd.krannich@...>
 

Hi Shannon, all,

Any updates on the topic?

Thanks,
Bernd

From: "Price, Jon" <jon.price(a)intel.com>
Reply-To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev(a)lists.cloudfoundry.org>
Date: Monday, 27. March 2017 at 20:50
To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev(a)lists.cloudfoundry.org>
Subject: [cf-dev] Re: Re: Re: Re: Re: Re: Re: Re: Re: Brokered route services only receiving traffic for routes mapped to started apps

+1 for the 503 response

Jon Price
Intel Corporation

From: Krannich, Bernd [mailto:bernd.krannich(a)sap.com]
Sent: Sunday, March 26, 2017 11:49 PM
To: Discussions about Cloud Foundry projects and the system overall. <cf-dev(a)lists.cloudfoundry.org>
Subject: [cf-dev] Re: Re: Re: Re: Re: Re: Re: Re: Brokered route services only receiving traffic for routes mapped to started apps

Hi Shannon, all,

We have customers that prefer that unhealthy apps return a 503 response code instead of the 404 that currently is returned. We found the mail thread below which sounds like there’s an upcoming change that might enable our customers to get the response code they want.

Enabling this feature using a wildcard route as suggested earlier is unfortunately not an option because we would like to keep the wildcard route for ourselves as operators of the platform.

Therefore, I was wondering:
· Would the described change below allow people to change the response code to a 503?
· If so, are there any updates in terms of timelines? I’m also asking because for #3 below ("removed the need for NATS in CF") it sounds like this is something that’s currently in the works.

Thanks in advance,
Bernd

Bernd Krannich
SAP Cloud Platform
SAP SE
Dietmar-Hopp-Allee 16, 69190 Walldorf, Germany

Pflichtangaben/Mandatory Disclosure Statement: www.sap.com/impressum<http://www.sap.com/company/legal/impressum.epx/>

This e-mail may contain trade secrets or privileged, undisclosed, or otherwise confidential information. If you have received this e-mail in error, you are hereby notified that any review, copying, or distribution of it is strictly prohibited. Please inform us immediately and destroy the original transmittal. Thank you for your cooperation.


From: Shannon Coen <scoen(a)pivotal.io>
Reply-To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev(a)lists.cloudfoundry.org>
Date: Tuesday, 17 May 2016 at 01:27
To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev(a)lists.cloudfoundry.org>
Subject: [cf-dev] Re: Re: Re: Re: Re: Re: Re: Brokered route services only receiving traffic for routes mapped to started apps

Inline

Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.

On Mon, May 16, 2016 at 3:29 PM, Guillaume Berche <bercheg(a)gmail.com> wrote:
Thanks a lot Shannon for your detailed response and for sharing the routing architecture plans. I realize the prioritization of such an effort remains a challenge.
Out of curiosity, in step #6, how would CC be notified of a change in the LRP's current state, so as to perform route unregistration/updates? Would CC register with the BBS external event client [1] through server-sent events, or would Diego rather notify CC through HTTP callbacks?

CC would not be involved with route registration. CC would send the route and the app process ID (received from Diego) to the Routing API. The route-emitter would add/remove backends for this process ID based on Diego server-sent events and periodic batch fetches (as it does now).

I wonder whether this architecture could also enable fully-brokered route services implementations to fetch the routing table from the routing-api, and perform direct routing to apps (an alternative discussed in [2]), enabling more advanced features (such as custom load balancing). I understand this currently would require granting routing.routes.read scope to route services. Granting them a routing.route.<bound_route_guid>.read oauth scope at SB route binding time would remove such a need for "admin creds".

Yes, we imagine the Routing API providing the route/backend mapping as a service, which will enable bring-your-own router, as well as direct routing from Route Services. This assumes your route service has access to the same private network as the interfaces for the Cell VMs. We may need to consider how to partition this data so that your route service or router only receives routing data it should know about.

Thanks again,

[1] https://godoc.org/github.com/cloudfoundry-incubator/bbs#ExternalEventClient
[2] https://docs.google.com/document/d/1bGOQxiKkmaw6uaRWGd-sXpxL0Y28d3QihcluI15FiIA/edit# section "Other proposals that we considered"
Guillaume.

On Fri, May 13, 2016 at 9:10 PM, Shannon Coen <scoen(a)pivotal.io> wrote:
Hi Guillaume,

It would be great to have a chance to talk more about this at summit.

In summary, I believe supporting your use case is a large effort, and yours is the only evidence I've heard in support of it. This makes prioritization a challenge. However, I believe our current plan for architectural changes to routing will eventually satisfy your requirements as well.

Currently, CC sends route registration to Diego when an app is started. Routes do not land in the router's routing table until the app is started, as Diego doesn't know anything about stopped apps (LRPs are deleted when a user requests stop). Since Diego will have no information about the LRP, the route-emitter has no way of discovering that a route should be registered.

Our plan is to move routing info out of Diego. I believe it will fulfill your use case, and the Diego team very much wants this also.

The plan looks like this:
1. Update Routing API endpoints for HTTP route registration to be consistent with the TCP endpoints we've been focused on
2. Update the Route-Registrar job used by system components, service brokers, etc. to register HTTP routes, so that it points at the Routing API instead of NATS.
3. Update the route-emitter to register HTTP routes for apps on Diego with the Routing API. At this point, we believe we will have removed the need for NATS in CF.
4. Update the Routing API to support route reservation, independent of whether there are backends or not. At this point, an independent client could conceivably register a route with a route_service_url, and without backends.
5. We may need to update the Route-Registrar job to support reservation of a route without backends, and association of backends with the route.
6. Update CC to register app routes with the Routing API, instead of sending this data to Diego with createLRP, and update the route-emitter to significantly change its behavior: instead of calculating the routing table and sending it to the Routing API, it will ask Diego for backends associated with routes in the Routing API (linked by the process ID, most likely). At this point, a developer could conceivably use the CLI to create a route, bind it to a Route Service, and without mapping the route to an app, the router would forward requests for the route to the Route Service.
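
The end state of the plan above can be pictured with a small lookup sketch. This is not gorouter code: the Route class, the lookup() function and the example.com hostnames are hypothetical, and the 503 for a reserved route with no backends is the behaviour requested elsewhere in this thread, not a confirmed design. The sketch also simplifies by treating a route service as the only hop.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Route:
    """A reserved route; it may exist without any backends (step 4)."""
    backends: List[str] = field(default_factory=list)
    route_service_url: Optional[str] = None


def lookup(table: Dict[str, Route], host: str) -> str:
    """Decide what a router could do with a request for the given host."""
    route = table.get(host)
    if route is None:
        # Nobody reserved this route at all.
        return "404 Not Found"
    if route.route_service_url:
        # Bound route service receives traffic even without a started app.
        return f"forward to {route.route_service_url}"
    if route.backends:
        return f"forward to {route.backends[0]}"
    # Route is known, but there is nothing to serve it right now.
    return "503 Service Unavailable"


table = {
    "app.example.com": Route(backends=["10.0.0.5:61001"]),
    "stopped.example.com": Route(),
    "brokered.example.com": Route(route_service_url="https://rs.example.com"),
}

for host in ("app.example.com", "stopped.example.com",
             "brokered.example.com", "unknown.example.com"):
    print(host, "->", lookup(table, host))
```

The key difference from the current behaviour described above is that the table can hold an entry with no backends, so the router can distinguish "unknown route" from "known route, nothing running".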

Best,

Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.

On Fri, May 13, 2016 at 1:31 AM, Guillaume Berche <bercheg(a)gmail.com> wrote:
Shannon,
What are your current thoughts on "maintaining routes with no backends in the routing table"? I quickly scanned the routing backlog a few days ago without finding any trace of it.

I wish we could have used the opportunity of the cf summit "project office hours" routing session [1] to have interactive exchanges around these use cases. Unfortunately, my autosleep session [2] is scheduled at the exact same timeslot.
If the cf foundation organizers were able to swap sessions that would be great. I'll send a separate email to events(a)cloudfoundry.org, in case there are other community members suffering from the same conflict.
Thanks in advance,
Guillaume.

[1] http://sched.co/71aq
[2] http://sched.co/6aNp


Guillaume.

On Sun, May 1, 2016 at 12:03 AM, Stefan Mayr <stefan(a)mayr-stefan.de> wrote:
Hi

On 28.04.2016 at 23:08, Mike Youngstrom wrote:
Here is another minor use case. My users are often confused that a
stopped app returns a 404 instead of a 503. So, we implement that
functionality for the user using an app mapped to wildcard routes that
constantly asks the CC for valid routes. This works for wildcard
domains but not one-off domains.

It might be better if the router returned a 503. At least for routes
bound to apps. Not sure if this should extend to routes not bound to apps.

+1 for that proposal. A 404 also causes issues when crawlers remove pages from their index. A 503 has fewer side effects. I would also prefer a 503 Service Unavailable when a route is not bound, because there is no service for this route. IMHO the meaning is much closer to what has happened.

Stefan
Mike

On Thu, Apr 28, 2016 at 1:32 PM, Shannon Coen <scoen(a)pivotal.io> wrote:

Hello Guillaume,

Thank you for sharing your thoughts on these use cases. I can see how having a route service field requests for an app, whether the app is up or not, could be useful.

However, enabling this would significantly change how routes are registered for apps on Cloud Foundry, and how the router handles the route lookup. Routes are not currently enabled in the routing tier unless they are mapped to an app, and only when the app is determined healthy.

You are proposing the router maintains routes which have no backends, and instead of a failed lookup determining whether a 404 is returned, the router should figure out whether a route has any backends or a route service.

I'll chew on your use case and keep my ear out for additional use cases for maintaining routes with no backends in the routing table.

Best,
Shannon





Re: proposal: unik & cloud foundry

Idit Levine
 

It does. Looking forward!

Sent from my iPhone



[Known Issue] Intermittent 502 errors from Cloud Controller when using router.max_idle_connections

Zach Robinson
 

This is a followup to the concerns raised in an earlier thread titled "Issue with Routing Release v149-151". That thread notes that there is no issue with those routing releases.

The underlying 502 errors appear to be an issue with how the nginx server on Cloud Controller handles keep-alive connections. The root cause is unknown and investigation is underway.

What does this mean for you?

If you set the value of router.max_idle_connections to zero then you will see no issues. However, setting that property provides performance enhancements to traffic going to hosted apps. If the performance gains are important enough to outweigh intermittent 502 errors from CC API requests, then you can consider enabling router.max_idle_connections.

-Zach
CAPI Project Lead


Re: How to tune ETCD performance for bbs job

Jason Huang
 

Do you know what was used as the Diego store in the test reported below? Was
it etcd or a relational database? If the latter, which one was it?
https://content.pivotal.io/blog/250k-containers-in-production-a-real-test-for-the-real-world
It is reported that 250k containers were running in one environment and
performed well.

Thanks,

Jason

On Fri, Apr 28, 2017 at 7:38 AM, Eric Malm <emalm(a)pivotal.io> wrote:

Hi, Maggie,

You may be hitting some performance issues with regard to the load on the
Diego etcd versus the capabilities of the disks it's writing to. Adding
more etcd nodes doesn't improve performance because etcd is a consistent
system, so all of the nodes are active and writing changes to disk. Instead
of tuning the etcd resource configuration, I would recommend you migrate
your Diego deployment to a relational store (MySQL or Postgres): you're
already on Diego v1.0.0, which supports it, and Diego v1.2.0 and later
require you to migrate.
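
The point about extra nodes can be illustrated with a small model. It assumes Raft-style replication, in which a write is acknowledged once a majority of members have persisted it; the function names and the latency figures are invented for illustration and are not measurements from this thread.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def commit_latency_ms(fsync_ms: list) -> float:
    """Latency of one write: when the quorum-th fastest member has persisted it."""
    ordered = sorted(fsync_ms)
    return ordered[quorum(len(ordered)) - 1]


# Hypothetical per-member disk sync times in milliseconds, same disks everywhere.
one_node = [40.0]
three_nodes = [40.0, 40.0, 40.0]
five_nodes = [40.0, 40.0, 40.0, 40.0, 40.0]

for cluster in (one_node, three_nodes, five_nodes):
    print(len(cluster), "members ->", commit_latency_ms(cluster), "ms per write")
```

Under these assumptions every write still waits for a disk sync on a majority of members, so adding members with the same disks adds fault tolerance but does not make a write any faster.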

Best,
Eric, CF Diego PM

On Thu, Apr 27, 2017 at 7:39 PM, Meng, Xiangyi <Xiangyi.Meng(a)dell.com>
wrote:

Hi, Juan



Yes, I mean from DEA to Diego. Sorry for the typo.



Thanks,

Maggie



*From:* Juan Pablo Genovese [mailto:juanpgenovese(a)gmail.com]
*Sent:* Thursday, April 27, 2017 11:59 PM
*To:* Discussions about Cloud Foundry projects and the system overall. <
cf-dev(a)lists.cloudfoundry.org>
*Subject:* [cf-dev] Re: How to tune ETCD performance for bbs job



Maggie,



I'm not sure, but, do you mean migrating *from* DEA to Diego?



Thank you!!



2017-04-27 4:03 GMT-03:00 Meng, Xiangyi <Xiangyi.Meng(a)dell.com>:

Hi,



We are migrating our application from Diego backend to Dea backend. But
recently we experienced some intermittent failures when pushing application
or fetching application status. We found below errors from bbs.stdout.log



{"timestamp":"1493197381.010582685","source":"bbs","message":"bbs.request.tasks.tasks.etcd-error.unknown-error","log_level":2,"data":{"error":"501: All the given peers are not reachable (failed to propose on members [https://etcd.service.cf.internal:4001] twice [last error: Unexpected HTTP status code]) [0]","method":"POST","request":"/v1/tasks/list.r1","revision":0,"session":"1280311.1.1.1"}}



And quite a lot of errors from etcd.stderr.log.



*etcdhttp: got unexpected response error (etcdserver: request timed out)*



So I added two more database jobs to form an etcd cluster. But I can still
find the same error messages in the bbs log and the etcd log on the first
database job.

I suppose all etcd nodes should accept read/write actions, but only one
bbs node accepts access. Am I right? Why are the errors only found on the
first database job?

And my question is: what is the suggested configuration for the etcd cluster
and bbs node? Do we have to use a relational database such as MySQL instead
of etcd?



Our env is CF 249 + Diego 1.0.0 + Etcd 86.



Any help would be appreciated.



Thanks,

Maggie





--

Mis mejores deseos,
Best wishes,
Meilleurs vœux,

Juan Pablo
------------------------------------------------------
http://www.jpgenovese.com