
Re: V3: Multiple droplets for one app?

Nicholas Calugar
 

Hi,

Droplets are the result of [1] staging a package; a droplet is assigned to
the application that owns the package it was staged from. You can mark a
single droplet from the app's list of droplets as the current droplet. The
current droplet is used for all processes and is the default for tasks,
although a task can specify a different droplet from the app's list instead
of using the current one.
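
For example, to run a task against a droplet other than the current one,
something along these lines should work (a rough sketch; see [1] for the
exact endpoints and request bodies, and note that APP_GUID and DROPLET_GUID
below are placeholders):

# list the droplets that belong to the app
cf curl /v3/apps/$APP_GUID/droplets

# run a one-off task against a specific droplet from that list
cf curl /v3/apps/$APP_GUID/tasks -X POST \
  -d '{"command": "rake db:migrate", "droplet_guid": "'$DROPLET_GUID'"}'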


Hope that helps, let us know if we can clarify further,

Nick


[1]
http://v3-apidocs.cloudfoundry.org/version/3.3.0/index.html#create-a-droplet

--
Nicholas Calugar
Product Manager - Cloud Foundry API
Pivotal Software, Inc.

On December 23, 2016 at 7:38:18 AM, Kopecz, Klaus (klaus.kopecz(a)sap.com)
wrote:

Hi

I'm looking at the V3 controller API specification. It seems that an app
can have multiple droplets assigned. In the app creation sample response,
there is a "droplets" entry in the "links" section. While I see that it is
possible to set a "current" droplet, how do I assign multiple droplets to an
app, and what are the constraints and use cases for this (blue/green)?

The same multiplicity shines through when it comes to creating tasks.
There, a 'droplet_guid' can be assigned (defaults to the "current"
droplet). Can I assign any droplet here, or is the choice somehow limited
to droplets assigned to the app for which I'm creating the task?

Thanks

*Dr. Klaus Kopecz*
*SAP SE*


Re: Cleaning up Cloud Controller fog dependency

Mike Youngstrom
 

Thanks Nick, we also use the "Local" provider with our enterprise NFS.

Mike

On Fri, Dec 23, 2016 at 8:47 AM, Nicholas Calugar <ncalugar(a)pivotal.io>
wrote:

Hi CF,

The Cloud Controller requires the [1] fog gem, which pulls in the entire
universe of fog providers. One of these providers has an [2] odd version
constraint because fog declares support for a lower version of ruby than
fog-google. Since Cloud Foundry ships with a specific version of ruby, we
know we can pull in a later version of fog-google, leading to an awful
situation where we have to manually add that dependency.

Our plan is to follow fog’s recommendation to require specific fog
providers instead of fog and the entire universe of providers. We’ll still
only officially support the [3] documented fog providers, but we are happy
to add any providers being used by the Cloud Foundry community.

When you have a moment, please respond with the fog provider you are using
in your deployment. We plan on making this change sometime in January once
we have a reasonable list of providers.


Thanks,

Nick


[1] https://rubygems.org/gems/fog
[2] https://github.com/fog/fog/blob/master/fog.gemspec#L63
[3] https://docs.cloudfoundry.org/deploying/common/cc-blobstore-config.html



--
Nicholas Calugar
Product Manager - Cloud Foundry API
Pivotal Software, Inc.


Cleaning up Cloud Controller fog dependency

Nicholas Calugar
 

Hi CF,

The Cloud Controller requires the [1] fog gem, which pulls in the entire
universe of fog providers. One of these providers has an [2] odd version
constraint because fog declares support for a lower version of ruby than
fog-google. Since Cloud Foundry ships with a specific version of ruby, we
know we can pull in a later version of fog-google, leading to an awful
situation where we have to manually add that dependency.

Our plan is to follow fog’s recommendation to require specific fog
providers instead of fog and the entire universe of providers. We’ll still
only officially support the [3] documented fog providers, but we are happy
to add any providers being used by the Cloud Foundry community.

When you have a moment, please respond with the fog provider you are using
in your deployment. We plan on making this change sometime in January once
we have a reasonable list of providers.
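
If you're not sure which provider your deployment uses, one way to check is
to grep your deployment manifest for the blobstore fog configuration, e.g.
(a sketch assuming cf-release-style cc.*.fog_connection properties and the
v1 BOSH CLI):

# download the manifest and look at the blobstore provider settings
bosh download manifest my-cf-deployment /tmp/cf.yml
grep -A 3 'fog_connection' /tmp/cf.yml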


Thanks,

Nick


[1] https://rubygems.org/gems/fog
[2] https://github.com/fog/fog/blob/master/fog.gemspec#L63
[3] https://docs.cloudfoundry.org/deploying/common/cc-blobstore-config.html



--
Nicholas Calugar
Product Manager - Cloud Foundry API
Pivotal Software, Inc.


Re: container restart on logout

Graham Bleach
 

Hi Stefan,

On 23 December 2016 at 13:52, Stefan Mayr <stefan(a)mayr-stefan.de> wrote:
Am 23.12.2016 um 10:36 schrieb Graham Bleach:
On 23 December 2016 at 09:21, Daniel Jones
<daniel.jones(a)engineerbetter.com> wrote:
Hmm, here's an idea that I haven't thought through and so is probably rubbish...

How about an immutability enforcer? Recursively checksum the expanded
contents of a droplet, and kill-with-fire anything that doesn't match it.
It'd need to be optional for folks storing ephemeral data on their ephemeral
disk, and a non-invasive (ie no changes to CF components) implementation
would depend on `cf ssh` or a chained buildpack, but maybe that's a nice
compromise that could be quicker to develop than waiting for mainline code
changes to CF?
An idea we've been kicking around is to ensure that app instance
containers never live longer than a certain time (eg. 3, 6, 12 or 24
hours).

This would ensure that we'd catch cases where apps weren't able to
cope with being rescheduled to different cells. It'd also strongly
discourage manual tweaks via ssh. It'd probably be useful for people
deploying apps to be able to initiate an aggressive version of this
behaviour to run in their testing pipelines, prior to production
deployment, to catch regressions in keeping state in app instances.

There's a naive implementation in my head that would work fine on
smaller installations by looping through app instances returned by the
API and restarting them.

Cheers,
Graham
How do we cope with the following issues?

Temporary data: some software still uses sessions, file uploads or
caches which are buffered or written to disk (Java/Tomcat, PHP, ...).
While it is okay to lose this data when a container is restarted (after
you had some time to work with this data), it becomes a problem when
every write can cause the recreation of this container. How should an
upload form work if every upload can kill the container? I'm only
referring to the processing of the upload - not permanently storing it.
I think this was in response to Dan's immutability enforcement
proposal, so I'll let him respond :)

Single instances: recreating app containers when there are more than two
should not cause too many issues. But if there is only one instance you
have two choices:
- kill the running container and start a new one -> short downtime
- start a second instance and kill the first one afterwards -> problem
if the application is only allowed to run with one instance (singleton).
App instances go away when the cells get replaced (eg. stemcell
update) or fail, so apps need to be able to cope with it. If you're
not comfortable with downtime then the app probably shouldn't be
single instance.

For my naive "loop through all the app instances" script I'd be
inclined to check that the restarted instance was healthy again before
moving onto the next one.
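
Something along these lines, for a single app (very much a sketch: it
assumes jq is available, the CLI is already targeted at the app's org and
space, and `cf restart-app-instance` is acceptable as the restart
mechanism):

APP=my-app
GUID=$(cf app "$APP" --guid)
COUNT=$(cf curl "/v2/apps/$GUID/summary" | jq '.instances')
for i in $(seq 0 $((COUNT - 1))); do
  cf restart-app-instance "$APP" "$i"
  # wait for the replacement instance to report RUNNING before moving on
  until cf curl "/v2/apps/$GUID/instances" | jq -e ".\"$i\".state == \"RUNNING\"" > /dev/null; do
    sleep 5
  done
done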

One-shot tasks: a slight variation of the single instance problem and
the question of whether you are allowed to restart a one-shot task
Tasks feel less safe to interrupt than app instances. I'm unclear what
happens to a running task when the cell gets destroyed and therefore
whether there's some reasonable upper bound on how long a task should take
to complete.

--
Technical Architect
Government Digital Service


Re: container restart on logout

Stefan Mayr
 

Am 23.12.2016 um 10:36 schrieb Graham Bleach:
On 23 December 2016 at 09:21, Daniel Jones
<daniel.jones(a)engineerbetter.com> wrote:
Hmm, here's an idea that I haven't thought through and so is probably rubbish...

How about an immutability enforcer? Recursively checksum the expanded
contents of a droplet, and kill-with-fire anything that doesn't match it.
It'd need to be optional for folks storing ephemeral data on their ephemeral
disk, and a non-invasive (ie no changes to CF components) implementation
would depend on `cf ssh` or a chained buildpack, but maybe that's a nice
compromise that could be quicker to develop than waiting for mainline code
changes to CF?
An idea we've been kicking around is to ensure that app instance
containers never live longer than a certain time (eg. 3, 6, 12 or 24
hours).

This would ensure that we'd catch cases where apps weren't able to
cope with being rescheduled to different cells. It'd also strongly
discourage manual tweaks via ssh. It'd probably be useful for people
deploying apps to be able to initiate an aggressive version of this
behaviour to run in their testing pipelines, prior to production
deployment, to catch regressions in keeping state in app instances.

There's a naive implementation in my head that would work fine on
smaller installations by looping through app instances returned by the
API and restarting them.

Cheers,
Graham
How do we cope with the following issues?

Temporary data: some software still uses sessions, file uploads or
caches which are buffered or written to disk (Java/Tomcat, PHP, ...).
While it is okay to lose this data when a container is restarted (after
you had some time to work with this data), it becomes a problem when
every write can cause the recreation of this container. How should an
upload form work if every upload can kill the container? I'm only
referring to the processing of the upload - not permanently storing it.

Single instances: recreating app containers when there are more than two
should not cause too many issues. But if there is only one instance you
have two choices:
- kill the running container and start a new one -> short downtime
- start a second instance and kill the first one afterwards -> problem
if the application is only allowed to run with one instance (singleton).

One-shot tasks: a slight variation of the single instance problem and
the question of whether you are allowed to restart a one-shot task

Happy holidays,

Stefan


V3: Multiple droplets for one app?

Kopecz, Klaus <klaus.kopecz@...>
 

Hi
I'm looking at the V3 controller API specification. It seems that an app can have multiple droplets assigned. In the app creation sample response, there is a "droplets" entry in the "links" section. While I see that it is possible to set a "current" droplet, how do I assign multiple droplets to an app, and what are the constraints and use cases for this (blue/green)?
The same multiplicity shines through when it comes to creating tasks. There, a 'droplet_guid' can be assigned (defaults to the "current" droplet). Can I assign any droplet here, or is the choice somehow limited to droplets assigned to the app for which I'm creating the task?
Thanks
Dr. Klaus Kopecz
SAP SE


Re: container restart on logout

Graham Bleach
 

On 23 December 2016 at 09:21, Daniel Jones
<daniel.jones(a)engineerbetter.com> wrote:
Hmm, here's an idea that I haven't thought through and so is probably rubbish...

How about an immutability enforcer? Recursively checksum the expanded
contents of a droplet, and kill-with-fire anything that doesn't match it.
It'd need to be optional for folks storing ephemeral data on their ephemeral
disk, and a non-invasive (ie no changes to CF components) implementation
would depend on `cf ssh` or a chained buildpack, but maybe that's a nice
compromise that could be quicker to develop than waiting for mainline code
changes to CF?
An idea we've been kicking around is to ensure that app instance
containers never live longer than a certain time (eg. 3, 6, 12 or 24
hours).

This would ensure that we'd catch cases where apps weren't able to
cope with being rescheduled to different cells. It'd also strongly
discourage manual tweaks via ssh. It'd probably be useful for people
deploying apps to be able to initiate an aggressive version of this
behaviour to run in their testing pipelines, prior to production
deployment, to catch regressions in keeping state in app instances.

There's a naive implementation in my head that would work fine on
smaller installations by looping through app instances returned by the
API and restarting them.

Cheers,
Graham


Re: container restart on logout

Daniel Jones
 

Hmm, here's an idea that I haven't thought through and so is probably rubbish...

How about an immutability enforcer? Recursively checksum the expanded
contents of a droplet, and kill-with-fire anything that doesn't match it.
It'd need to be optional for folks storing ephemeral data on their
ephemeral disk, and a non-invasive (ie no changes to CF components)
implementation would *depend* on `cf ssh` or a chained buildpack, but maybe
that's a nice compromise that could be quicker to develop than waiting for
mainline code changes to CF?
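
As a very rough sketch of the `cf ssh` flavour (assuming the droplet is
unpacked under /home/vcap/app in the container and that a baseline checksum
list was captured at staging time):

cf ssh my-app -i 0 -c 'cd /home/vcap/app && find . -type f -exec sha256sum {} \; | sort -k 2' > observed.sha256
diff baseline.sha256 observed.sha256 || cf restart-app-instance my-app 0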

Regards,
Daniel Jones - CTO
+44 (0)79 8000 9153
@DanielJonesEB <https://twitter.com/DanielJonesEB>
*EngineerBetter* Ltd <http://www.engineerbetter.com> - UK Cloud Foundry
Specialists

On Thu, Dec 22, 2016 at 10:01 AM, David Illsley <davidillsley(a)gmail.com>
wrote:

I have no idea why the idea hasn't been implemented, but pondering it, it
seems like it's hard to do because of the cases you mention. Some people
need a policy that 'app teams won’t abuse it by creating app snowflakes',
and in some (most?) cases you need the flexibility to do debugging as you
mentioned.

I think it's possible to combine the SSH authorized events and the
instance uptime details from the API to build audit capability - identify
instances which have been SSH'd to and not recycled within some time period
(eg 1 hour). You could have either some escalation process to get a human
to do something about it (in case there's a reason an hour wasn't enough),
or more brutally, give the audit code the ability to restart the instance.



On Tue, Dec 20, 2016 at 12:48 PM, Daniel Jones <
daniel.jones(a)engineerbetter.com> wrote:

Plus one!

An implementation whereby the recycling behaviour can be feature-flagged
by space or globally would be nice, so you could turn it off whilst
debugging in a space, and then re-enable it when you've finished debugging
via a series of short-lived SSH sessions.

Regards,
Daniel Jones - CTO
+44 (0)79 8000 9153 <07980%20009153>
@DanielJonesEB <https://twitter.com/DanielJonesEB>
*EngineerBetter* Ltd <http://www.engineerbetter.com> - UK Cloud Foundry
Specialists

On Tue, Dec 20, 2016 at 8:06 AM, DHR <lists(a)dhrapson.com> wrote:

Thanks Jon. The financial services clients I have worked with would also
like the ability to turn on ‘cf ssh’ support in production, safe in the
knowledge that app teams won’t abuse it by creating app snowflakes.

I see that the audit trail mentioned in the thread you posted has been
implemented in ‘cf events’. Like this:

time                          event                      actor   description
2016-12-19T16:20:36.00+0000   audit.app.ssh-authorized   user    index: 0
2016-12-19T15:30:33.00+0000   audit.app.ssh-authorized   user    index: 0
2016-12-19T12:00:53.00+0000   audit.app.ssh-authorized   user    index: 0


That said: I still think the container recycle functionality, available
as say a feature flag, would be really appreciated by the large enterprise
community.

On 19 Dec 2016, at 18:25, Jon Price <jon.price(a)intel.com> wrote:

This is something that has been on our wishlist as well but I haven't
seen any discussion about it in quite some time. Here is one of the
original discussions about it:
https://lists.cloudfoundry.org/archives/list/cf-dev(a)lists.cloudfoundry.org/thread/GCFOOYRUT5ARBMUHDGINID46KFNORNYM/

It would go a long way with our security team if we could have some
sort of recycling policy for containers in some of our more secure
environments.

Jon Price
Intel Corporation


NOTICE: Removal of Node.js v0.12 and v5 version lines from Node.js buildpack

Stephen Levine
 

Hi All,

The first release of the Node.js buildpack after January 22, 2017 will not
include any versions of Node.js with 0.12.x or 5.x version numbers. These
Node.js versions are no longer supported upstream[1]. Please migrate your
Node.js apps to supported versions of Node.js before that time.

[1] https://github.com/nodejs/LTS

Thanks,
Stephen Levine
CF Buildpacks PM


NOTICE: Breaking change in the .NET Core buildpack msbuild alpha tooling

Stephen Levine
 

Hi All,

The v1.0.7 release of the .NET Core buildpack added support for msbuild
tooling as an alternative to project.json tooling. This tooling is not GA,
and we do not yet recommend using it in production. The first release of
the buildpack after January 22, 2017 will build all apps that use the
msbuild tooling using `dotnet publish` during staging. This will
dramatically reduce the droplet size for msbuild apps, but it may require
making configuration changes to these apps to prevent them from breaking.
The behavior for apps that used the project.json tooling will remain
unchanged for the foreseeable future to avoid breaking production apps.

To migrate your msbuild apps to the new staging process, you may need to
add configuration to your csproj files in order to copy files that are
required for your project. For example:

<ItemGroup>
  ...
  <Content Include="appsettings.json">
    <CopyToPublishDirectory>PreserveNewest</CopyToPublishDirectory>
  </Content>
</ItemGroup>
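
If you want to see what will end up in the droplet before the buildpack
change lands, you can run the publish step locally and inspect its output,
for example (the output directory name below is just a placeholder):

dotnet publish -o ./publish_output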

Thanks,
Stephen Levine
CF Buildpacks PM


Re: container restart on logout

David Illsley <davidillsley@...>
 

I have no idea why the idea hasn't been implemented, but pondering it, it
seems like it's hard to do because of the cases you mention. Some people
need a policy that 'app teams won’t abuse it by creating app snowflakes',
and in some (most?) cases you need the flexibility to do debugging as you
mentioned.

I think it's possible to combine the SSH authorized events and the
instance uptime details from the API to build audit capability - identify
instances which have been SSH'd to and not recycled within some time period
(eg 1 hour). You could have either some escalation process to get a human
to do something about it (in case there's a reason an hour wasn't enough),
or more brutally, give the audit code the ability to restart the instance.
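
A crude version of that check might look something like this (a sketch; it
assumes jq is available and that the `cf events` output matches the sample
quoted further down this thread):

# 1. when was each instance last SSH'd to?
cf events my-app | awk '$2 == "audit.app.ssh-authorized" { print $1, $NF }'

# 2. how long has each instance been up? (uptime in seconds, per the stats endpoint)
GUID=$(cf app my-app --guid)
cf curl "/v2/apps/$GUID/stats" | jq 'to_entries[] | {index: .key, uptime: .value.stats.uptime}'

# an instance whose uptime reaches back past an ssh-authorized event has not
# been recycled since that SSH session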



On Tue, Dec 20, 2016 at 12:48 PM, Daniel Jones <
daniel.jones(a)engineerbetter.com> wrote:

Plus one!

An implementation whereby the recycling behaviour can be feature-flagged
by space or globally would be nice, so you could turn it off whilst
debugging in a space, and then re-enable it when you've finished debugging
via a series of short-lived SSH sessions.

Regards,
Daniel Jones - CTO
+44 (0)79 8000 9153 <07980%20009153>
@DanielJonesEB <https://twitter.com/DanielJonesEB>
*EngineerBetter* Ltd <http://www.engineerbetter.com> - UK Cloud Foundry
Specialists

On Tue, Dec 20, 2016 at 8:06 AM, DHR <lists(a)dhrapson.com> wrote:

Thanks Jon. The financial services clients I have worked with would also
like the ability to turn on ‘cf ssh’ support in production, safe in the
knowledge that app teams won’t abuse it by creating app snowflakes.

I see that the audit trail mentioned in the thread you posted has been
implemented in ‘cf events’. Like this:

time                          event                      actor   description
2016-12-19T16:20:36.00+0000   audit.app.ssh-authorized   user    index: 0
2016-12-19T15:30:33.00+0000   audit.app.ssh-authorized   user    index: 0
2016-12-19T12:00:53.00+0000   audit.app.ssh-authorized   user    index: 0


That said: I still think the container recycle functionality, available
as say a feature flag, would be really appreciated by the large enterprise
community.

On 19 Dec 2016, at 18:25, Jon Price <jon.price(a)intel.com> wrote:

This is something that has been on our wishlist as well but I haven't
seen any discussion about it in quite some time. Here is one of the
original discussions about it:
https://lists.cloudfoundry.org/archives/list/cf-dev(a)lists.cloudfoundry.org/thread/GCFOOYRUT5ARBMUHDGINID46KFNORNYM/

It would go a long way with our security team if we could have some
sort of recycling policy for containers in some of our more secure
environments.

Jon Price
Intel Corporation


CF CLI v6.23.0 Released Today

Koper, Dies <diesk@...>
 

The CF CLI team just cut 6.23.0. Binaries and link to release notes are available at:

https://github.com/cloudfoundry/cli#downloads


This is the last release to bundle the deprecated loggregator consumer library, which is used to talk to the loggregator endpoint on CF releases before v212.

In the next cf CLI release this library is scheduled to be removed; regardless of the CF release version targeted, the noaa library will be used to talk to the doppler endpoint.
This endpoint was deemed stable around CF v203: if targeting an earlier CF release and experiencing issues with commands that interact with the loggregator (e.g. logs, push), please stay on this release of the cf CLI until your target CF is upgraded.

One-off tasks

This release introduces commands to run, terminate and list tasks, available from CF release v247 (CC API v3.0.0) onwards.

A task is an application or script whose code is included as part of a deployed application, but runs independently in its own container.
It can be used to perform one-off jobs, such as:

* Migrating a database
* Sending an email
* Running a batch job
* Running a data processing script

Refer to the Running Tasks documentation<https://docs.cloudfoundry.org/devguide/using-tasks.html> for details.
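
For example (the app name, command and task name below are just placeholders):

$ cf run-task my-app "bin/rails db:migrate" --name migrate
$ cf tasks my-app
$ cf terminate-task my-app 1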

Creating users with external identity providers

The create-user command has been enhanced to allow the creation of users mapped to users in an external identity provider, such as LDAP. (#822<https://github.com/cloudfoundry/cli/pull/822>)



$ cf create-user j.smith(a)example.com --origin ldap # LDAP user

CLI client id and secret no longer hard-coded

The client id and secret used by the cf CLI for certain UAA requests are now stored in the local config.json file, making it possible to configure custom ids and secrets, for example to use long-lived tokens for scripts in CI environments.
We're working on proper documentation. For now, refer to #919 (comment)<https://github.com/cloudfoundry/cli/issues/919#issuecomment-268699211>.

Built with Golang 1.7.4

Golang 1.7.4 was released this month, addressing a vulnerability that could affect cf CLI users on Darwin with trust preferences for root certificates.
See this announcement<https://groups.google.com/forum/#!topic/golang-nuts/9eqIHqaWvck> for details.

Refactored commands

We are in the process of creating a more consistent user experience; our goal is to standardize UI output. For example, warnings and errors will consistently be written to stderr instead of stdout. As we iterate through the list of commands, we are also focusing on improving performance and stability.

List of improved commands in this release:

* api
* create-user
* delete-org
* delete-orphaned-routes
* unbind-service
* version

Fixed regressions

* 32 bit binaries of cf CLI 6.22.2 panic on 32 bit systems for commands that interact with loggregator (such as push) due to 64 bit-only code in the doppler library. (#991<https://github.com/cloudfoundry/cli/issues/991>)
* Commands that interact with loggregator (such as push) could panic if the connection with the loggregator was interrupted (e.g. in the case of a loggregator restart) due to an issue in retry logic. (#1019<https://github.com/cloudfoundry/cli/issues/1019>)

Updated commands

* create-security-group and update-security-group now include the new "description" field in the JSON example in their help pages. This field is accepted from CF release v238 (CC API v2.57.0) onwards.
* push now treats values of environment variables specified in the app manifest as strings so big integers do not unintentionally get converted into (harder to read) scientific notation. (#996<https://github.com/cloudfoundry/cli/issues/996>)
* push no longer panics when loggregator restarts while collecting app logs. (#1019<https://github.com/cloudfoundry/cli/issues/1019>)
* delete-space now takes an optional org parameter to allow deletion of a space without targeting it; see the example below. (#957<https://github.com/cloudfoundry/cli/pull/957>)
* unbind-service no longer fails with an error saying the app is not bound when unbinding a service that is bound to more than 50 apps. (#948<https://github.com/cloudfoundry/cli/issues/948>)
* delete-orphaned-routes now deletes all orphaned routes instead of stopping after deleting 50, and no longer exits with return code 0 when an error occurs. (https://www.pivotaltracker.com/story/show/131127157, #978<https://github.com/cloudfoundry/cli/issues/978>)
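
For example, the new delete-space org parameter can be used like this (names
are placeholders):

$ cf delete-space old-space -o other-org -f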

New & Updated Community Plugins

* top v0.7.5: https://github.com/ECSTeam/cloudfoundry-top-plugin
* Usage Report v1.4.1: https://github.com/krujos/usagereport-plugin
* Blue-green-deploy v1.1.0: https://github.com/bluemixgaragelondon/cf-blue-green-deploy
* docker-usage v1.0.3: https://github.com/ECSTeam/docker-usage
* cf-download v1.2.0: https://github.com/ibmjstart/cf-download
* buildpack-usage v1.0.0: https://github.com/ECSTeam/buildpack-usage

Enjoy!

Regards,
Dies Koper
Cloud Foundry Product Manager - CLI


Re: Incubation Proposal: CredHub (credential manager)

Dan Jahner
 

Hi Wayne -

When you say that your customers would prefer to use Vault as their "at-rest" store, do you know the motivation driving that preference? I assume the concern is primarily focused on the encryption, not the data storage itself (which Vault delegates to standard data stores, e.g. Consul, MySQL, etc.).

CredHub is being developed to provide a pluggable encryption provider interface, which allows users to select the appropriate provider based on their needs. For example, most customers in high-security environments would select a hardware security module to perform these cryptographic operations.

An integration with Vault doesn’t make sense in this context, because interfacing with the entire Vault codebase to leverage only its encryption features would create an inferior experience to implementing these algorithms natively with well-respected Java libraries. If your customers’ concern is specifically the algorithm, you’ll be happy to know that both the internal software and HSM client providers in CredHub will support AES256-GCM, which is the same algorithm used by Vault.

Please let me know if you have any additional thoughts or concerns.


Re: Incubation Proposal: CredHub (credential manager)

Dan Jahner
 

Hi David -

You are correct - our initial use case is focused on operator concerns. An additional proposal that focuses on application secrets is forthcoming and will involve the larger group of teams needed to make that successful.


Proposing move of Gibson to CF Attic

Shannon Coen
 

Gibson [1] is a deprecated library for registering HTTP routes with
Gorouter via NATS. The currently supported way to do this is with the BOSH
job route-registrar [2]. Many months ago we removed Gibson as a dependency
of route-registrar in favor of nats.io. The CF Routing team would like to
move Gibson to the CF attic and promote route-registrar to an active CFF
project. We'll formally propose these changes on the Runtime PMC call on
Jan 11, 2017.

If you are still using Gibson, consider switching to route-registrar soon.
As route-registrar is a standalone job meant to be colocated with the
process that requires a route, it should require no code changes to your
project except to remove support for Gibson.

[1] https://github.com/cloudfoundry/gibson
[2] https://github.com/cloudfoundry-incubator/route-registrar


Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.


New Loggregator Certificates

Adam Hevenor
 

Hi All -

In order to secure the transport of log messages going forward, Loggregator
will require a Metron cert and key as well as the Loggregator CA cert. You
won't be able to deploy the latest versions of Loggregator (> v70) if you
don't have these configured. See our README
<https://github.com/cloudfoundry/loggregator#generating-tls-certificates>
for specifics on generating and setting up your certs.

Feel free to reach out with any questions you have in our slack channel
<https://cloudfoundry.slack.com/archives/loggregator>.

Thanks

--
*Adam Hevenor*

*----------------------------*
PM: Loggregator


Re: Incubation Proposal: CredHub (credential manager)

Travis McPeak
 

I see how getting UAA to interface correctly to something like Vault may be
difficult. Aside from that and the language stack, what are the primary
differences with other mature key management systems?

Crypto (and surrounding implementations) are notoriously difficult to get
right. Are we planning to have any kind of third party audit?


On Tue, Dec 20, 2016, 6:05 AM Wayne E. Seguin <wayneeseguin(a)gmail.com>
wrote:

Most of our customers are using Vault already for secrets management and
would prefer to keep doing so as their final "at-rest" store.

For our customers it would therefore work best if CredHub can use Vault.

With this idea, can we design CredHub using interchangeable backends
(Perhaps a plugin API for backends) so that our customers can continue to
use Vault as the final "at-rest" backend key/credentials store?

On Mon, Dec 19, 2016 at 8:45 PM, Justin Smith <jusmith(a)pivotal.io> wrote:

It makes sense to build CredHub for many reasons. A few that come to mind
quickly are below.

1) The service must start and restart without human intervention. This
immediately means the key encrypting key resides in a Hardware Security
Module (HSM) of some kind. We think this is a first-class feature and
should be part of CredHub open source.
2) CloudFoundry runs in production in some of the most restricted
environments in the world. These environments tend towards favoring crypto
stacks they have seen and reviewed before. The Java ecosystem contains
several. The Go ecosystem will eventually get there, but it isn't right
now. We're fully aware there are tons of great Go apps (including large
chunks of CloudFoundry) that perform crypto today. We received credible
feedback that showed a Java preference for this type of application.
3) Authorization is an important part of credential management in CF.
Authorization can be an incredibly difficult problem, and an authorization
implementation can be very hard to change after it is built. We didn't see
anything available that had the right feature set. We could have bolted
something in front of another product, but we decided it made more sense to
build something new.

MVP certainly requires hitting a threshold of capabilities, but it also
requires leaving runway for what's ahead.

App secrets are a mainline scenario. Let's take it as a given that apps
must have access to plaintext secrets. From there, the discussion centers
on two questions: 1) what's the chain of custody of the secret and what has
access to it, and 2) how is the secret made available to the app. (1) is
all about CredHub and (2) is mostly about Diego. We'll engage in a
discussion on these in the weeks ahead.




--
~Wayne

Wayne E. Seguin
wayneeseguin(a)gmail.com
wayneeseguin on irc.freenode.net
http://twitter.com/wayneeseguin/
https://github.com/wayneeseguin/


Re: Incubation Proposal: CredHub (credential manager)

Wayne E. Seguin
 

Most of our customers are using Vault already for secrets management and
would prefer to keep doing so as their final "at-rest" store.

For our customers it would therefore work best if CredHub can use Vault.

With this idea, can we design CredHub using interchangeable backends
(Perhaps a plugin API for backends) so that our customers can continue to
use Vault as the final "at-rest" backend key/credentials store?

On Mon, Dec 19, 2016 at 8:45 PM, Justin Smith <jusmith(a)pivotal.io> wrote:

It makes sense to build CredHub for many reasons. A few that come to mind
quickly are below.

1) The service must start and restart without human intervention. This
immediately means the key encrypting key resides in a Hardware Security
Module (HSM) of some kind. We think this is a first-class feature and
should be part of CredHub open source.
2) CloudFoundry runs in production in some of the most restricted
environments in the world. These environments tend towards favoring crypto
stacks they have seen and reviewed before. The Java ecosystem contains
several. The Go ecosystem will eventually get there, but it isn't right
now. We're fully aware there are tons of great Go apps (including large
chunks of CloudFoundry) that perform crypto today. We received credible
feedback that showed a Java preference for this type of application.
3) Authorization is an important part of credential management in CF.
Authorization can be an incredibly difficult problem, and an authorization
implementation can be very hard to change after it is built. We didn't see
anything available that had the right feature set. We could have bolted
something in front of another product, but we decided it made more sense to
build something new.

MVP certainly requires hitting a threshold of capabilities, but it also
requires leaving runway for what's ahead.

App secrets are a mainline scenario. Let's take it as a given that apps
must have access to plaintext secrets. From there, the discussion centers
on two questions: 1) what's the chain of custody of the secret and what has
access to it, and 2) how is the secret made available to the app. (1) is
all about CredHub and (2) is mostly about Diego. We'll engage in a
discussion on these in the weeks ahead.
--
~Wayne

Wayne E. Seguin
wayneeseguin(a)gmail.com
wayneeseguin on irc.freenode.net
http://twitter.com/wayneeseguin/
https://github.com/wayneeseguin/


Re: consul_z1/0 is failing after update

Sylvain Gibier
 

Hi,

OK - after going through the recovery scenario, the cluster was back, and I
finally pinpointed the root cause.

The reason: on 2 Diego cells, the consul_agent (client) was running out of
disk space to write its keys. My understanding is that the cluster server
node will swap if any errors occur from nodes during registration.

How can I have /var/vcap/store mapped to the ephemeral disk and not the root
partition when not using a persistent disk in the Diego deployment?

Sylvain

On Tue, Dec 20, 2016 at 8:39 AM, Etourneau Gwenn <gwenn.etourneau(a)gmail.com>
wrote:

Hi,

You can check the recovery scenario here:
https://github.com/cloudfoundry-incubator/consul-release#failure-recovery

Thanks.
Gwenn

2016-12-20 16:12 GMT+09:00 Sylvain Gibier <sylvain(a)munichconsulting.de>:

Hi,

Any hint on how to fix it? From a network topology perspective, nothing
changed, and I can't find anything useful in the consul documentation for
re-forming my cluster. Currently the second consul server node is
experiencing the issue, so I'm running on one consul node (server and
leader)...

From a CF perspective, how can I reinitialize the consul cluster, and what
is the impact on the other components? I'm starting to see failing routing
requests at this stage.

Sylvain



On Tue, Dec 20, 2016 at 2:40 AM, Yitao Jiang <jiangyt.cn(a)gmail.com>
wrote:

We once had the same issue, caused by a network problem: the consul
server follower couldn't connect to the leader. The difference is that
we are running on OpenStack.

On Tue, Dec 20, 2016 at 12:32 AM, Sylvain Gibier <
sylvain(a)munichconsulting.de> wrote:

Hi,

Diego has been the default in my CF installation (H/A over 3 AZs), and
today, while trying a simple BOSH stemcell update of CF, consul_z1/0
keeps on "failing after update".

If I look in the log file, I can see the following:

"
++ logger -p user.info -t vcap.consul-agent
++ tee -a /var/vcap/sys/log/consul_agent/consul_agent.stdout.log error
during start: 2/30 nodes reported failure
2016/12/19 14:49:50 [ERR] agent.client: Failed to decode response
header: EOF
2016/12/19 14:49:50 [ERR] agent.client: Failed to decode response
header: EOF
"
Also it seems that I have a bunch of errors:

"
2016/12/19 13:54:32 [INFO] consul: adding server consul-z3-0 (Addr:
10.10.30.37:8300) (DC: dc1)
2016/12/19 13:54:32 [INFO] consul: adding server consul-z2-0 (Addr:
10.10.20.37:8300) (DC: dc1)
2016/12/19 13:54:32 [ERR] agent: failed to sync remote state: No
cluster leader
2016/12/19 13:54:32 [INFO] agent: Joining cluster...
2016/12/19 13:54:32 [INFO] agent: (LAN) joining: [10.10.10.37
10.10.20.37 10.10.30.37]
2016/12/19 13:54:32 [INFO] agent: (LAN) joined: 3 Err: <nil>
2016/12/19 13:54:32 [INFO] agent: Join completed. Synced with 3
initial agents
2016/12/19 13:54:32 [WARN] raft: Failed to get previous log: 503710
log not found (last: 503708)
2016/12/19 13:54:32 [INFO] raft: Removed ourself, transitioning to
follower
"
I can definitely confirm that in my case consul_z3 is the leader (via
consul info) in my current setup.

Any help/pointers on how to fix this?


Releases: CF: v234, Diego: 0.1467.0
IaaS: AWS



--

Regards,

Yitao


Re: container restart on logout

Daniel Jones
 

Plus one!

An implementation whereby the recycling behaviour can be feature-flagged by
space or globally would be nice, so you could turn it off whilst debugging
in a space, and then re-enable it when you've finished debugging via a
series of short-lived SSH sessions.

Regards,
Daniel Jones - CTO
+44 (0)79 8000 9153
@DanielJonesEB <https://twitter.com/DanielJonesEB>
*EngineerBetter* Ltd <http://www.engineerbetter.com> - UK Cloud Foundry
Specialists

On Tue, Dec 20, 2016 at 8:06 AM, DHR <lists(a)dhrapson.com> wrote:

Thanks Jon. The financial services clients I have worked with would also
like the ability to turn on ‘cf ssh’ support in production, safe in the
knowledge that app teams won’t abuse it by creating app snowflakes.

I see that the audit trail mentioned in the thread you posted has been
implemented in ‘cf events’. Like this:

time                          event                      actor   description
2016-12-19T16:20:36.00+0000   audit.app.ssh-authorized   user    index: 0
2016-12-19T15:30:33.00+0000   audit.app.ssh-authorized   user    index: 0
2016-12-19T12:00:53.00+0000   audit.app.ssh-authorized   user    index: 0


That said: I still think the container recycle functionality, available as
say a feature flag, would be really appreciated by the large enterprise
community.

On 19 Dec 2016, at 18:25, Jon Price <jon.price(a)intel.com> wrote:

This is something that has been on our wishlist as well but I haven't
seen any discussion about it in quite some time. Here is one of the
original discussions about it:
https://lists.cloudfoundry.org/archives/list/cf-dev(a)lists.cloudfoundry.org/thread/GCFOOYRUT5ARBMUHDGINID46KFNORNYM/

It would go a long way with our security team if we could have some sort
of recycling policy for containers in some of our more secure environments.

Jon Price
Intel Corporation
