
Re: When will dea be replaced by diego?

Guillaume Berche
 

Thanks a lot Matthew and Amit, the document can now be publicly
accessed/commented.

Guillaume.

On Tue, Sep 8, 2015 at 4:09 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Done, anyone with the link should be able to comment now.

Best,
Amit


On Tuesday, September 8, 2015, Matthew Sykes <matthew.sykes(a)gmail.com>
wrote:

Hi Guillaume. The proposal document was created by Amit and I had
assumed it was public. I'll try to make sure he sees this chain today so he
can address it. Sorry to send an unusable link.

On Tue, Sep 8, 2015 at 3:02 AM, Guillaume Berche <bercheg(a)gmail.com>
wrote:

Thanks Matthew for the additional details and pointers. It seems that
the deployment strategy proposal mentioned in [2] is lacking read/comment
permissions. Any chance to fix that?

Guillaume.

On Tue, Sep 8, 2015 at 2:07 AM, Matthew Sykes <matthew.sykes(a)gmail.com>
wrote:

The notes you're pointing to were a straw man proposal; many of the
dates no longer seem relevant.

With that said, I'm not in product management, but in my opinion the
definitions of "done" and "ready" are relative.

The current bar that the development team is focusing on is data and
API versioning. We feel it's necessary to maintain continuous operation
across deployments. In particular, we want to be sure that operators can
perform forward migration with minimal down time before it becomes the
default backend in production. We're currently referring to that target as
v 0.9.

That said, the current path towards that goal has us going to a single
API server for Diego[1]. With this change in architecture, the scaling and
performance characteristics will probably change. While it's likely these
changes won't have a measurable impact on smaller environments, it remains to
be seen what will happen with the larger deployments operated by public
providers. This is where the whole notion of "replacement" starts to get a
bit murky.

As for "merging into cf-release," again, I'm not product management
(James and Amit are in a better position to comment) but the current
direction appears to be to break down Cloud Foundry into a number of
smaller releases. We already have a cf-release, garden-release, and
diego-release as part of a diego deployment but there are others like an
etcd-release that the MEGA team is managing and a uaa-release that the
identity team have done. These are all pieces of a new deployment strategy
that was proposed[2] a few months ago.

Given that path, I don't know that diego-release will ever be merged
into cf-release; it's more likely that it will be stitched into the
"cf-deployment" described in that proposal.

So, to your question, the 0.9 release may be cut in September. That's
the first release that operators will be able to roll forward from without
downtime. If you want Diego to be the default backend, so users don't have
to mess with the CLI plugin or per-app flags, you can already do that today
via configuration[3].

[1]: https://github.com/onsi/migration-proposal
[2]:
https://docs.google.com/document/d/1Viga_TzUB2nLxN_ILqksmUiILM1hGhq7MBXxgLaUOkY/edit#heading=h.qam414rpl0xe
[3]:
https://github.com/cloudfoundry/cloud_controller_ng/blob/aea2a53b123dc5104c11eb53b81a09a4c4eaba55/bosh-templates/cloud_controller_api.yml.erb#L287

On Mon, Sep 7, 2015 at 2:08 PM, Layne Peng <layne.peng(a)emc.com> wrote:

I think what he is asking is when diego-release will be merged into
cf-release, so that there is no need to install the cf CLI Diego plugin or
run enable-diego on your app before starting it. The timeline at
https://github.com/cloudfoundry-incubator/diego-design-notes/blob/master/migrating-to-diego.md#a-detailed-transition-timeline
says mid-September; is that right?


--
Matthew Sykes
matthew.sykes(a)gmail.com

--
Matthew Sykes
matthew.sykes(a)gmail.com


Re: CAB September Call on 9/9/2015 @ 8a PDT

Amit Kumar Gupta
 

Hi all,

I will not be able to attend the CAB meeting tomorrow, but I have added my
notes to the agenda doc. MEGA has been/will be working on a bunch of
exciting things, and I welcome questions/comments via email, either through
the cf-dev mailing list or directly.

Best,
Amit, CF Release Integration team (MEGA) PM

On Tue, Sep 8, 2015 at 8:19 AM, Michael Maximilien <maxim(a)us.ibm.com> wrote:

Final reminder for the CAB call tomorrow. See you at Pivotal SF and talk
to you all then.

Best,

dr.max
ibm cloud labs
silicon valley, ca

Sent from my iPhone

On Sep 2, 2015, at 6:04 PM, Michael Maximilien <maxim(a)us.ibm.com> wrote:

Hi, all,

Quick reminder that the CAB call for September is next week Wednesday
September 9th @ 8a PDT.

Please add any project updates to Agenda here:
https://docs.google.com/document/d/1SCOlAquyUmNM-AQnekCOXiwhLs6gveTxAcduvDcW_xI/edit#heading=h.o44xhgvum2we

If you have something else to share, please also add an entry at the end.

Best,

Chip, James, and Max

PS: Dr.Nic this is one week in advance, so no excuses ;) phone info listed
on agenda.
PPS: Have a great labor day weekend---if you are in the US.


Re: So many hard-coded dropsonde destinations to metrons

Noburou TANIGUCHI
 

We're happy to see it.
Thanks a lot, Warren.


Warren Fernandes wrote
The LAMB team added a chore to discuss how we can better manage a
dropsonde_incoming_port on the metron_agent over here
https://www.pivotaltracker.com/story/show/102935222

We'll update this thread once we decide how to proceed.




-----
I'm not a ...
noburou taniguchi
--
View this message in context: http://cf-dev.70369.x6.nabble.com/So-many-hard-coded-dropsonde-destinations-to-metrons-tp1474p1565.html
Sent from the CF Dev mailing list archive at Nabble.com.


Re: Application only starts when a bogus service is attached

Amit Gupta
 

Hi all,

Ramon, Fabien, are you both working on the same problem or is Fabien's response to Daniel separate from Ramon's original question?

I couldn't understand Fabien's response. Did Daniel's help provide a satisfactory solution to the problem, or are you still using a workaround (whether it's binding a bogus service, or some other workaround)?

Thanks,
Amit, OSS Release Integration PM


Re: feedback request: extracting a common route-registrar job

Dieu Cao <dcao@...>
 

Agree with Zach that, for the CAPI team, we would prefer to just wait for
the routing-api to be ready before prioritizing work to change the way CC
registers its routes.

Dieu

On Tuesday, September 8, 2015, Amit Gupta <agupta(a)pivotal.io> wrote:

Hey all,

The router already has some health check behaviour -- you can naively
register routes with it and it's smart about knowing whether to actually
route traffic or not. I'd like to understand what the specific use cases
are that would require a route-registration job to support additional
custom health check logic provided by the various colocated jobs.

Here's the set of things that are being done/need to be done for this
track of work:

1. extract route-registration process from UAA job in cf-release into its
own job in cf-release (note that the new uaa-release doesn't have this
process) [story done, needs acceptance]
2. use this route-registration job colocated with UAA in the cf manifests.
[story done, needs acceptance]
3. use this route-registration job colocated with CC, HM9k, and doppler in
the cf manifests. [stories in flight]
4. discover additional requirements for a generic route-registration job
to be useful [this email thread]
5. implement features in route-registration job to satisfy requirements
6. wait for consul to be declared stable, and enable the routing-api as a
default part of cf deployments [next couple final versions of cf-release
will validate consul]
7. all things doing route-registration should use routing-api
8. extract route-registration job out of cf-release (and put it where?) in
order to make it usable by other things like Gemfire, MySQL, Riak, etc.
9. discover more additional requirements for a generic route-registration
job to be useful
10. implement more features in route-registration job to satisfy
requirements
11. implement more features in routing-api to satisfy requirements

This isn't necessarily an ordered list, however we want to extract and
start using the route-registration stuff early to discover requirements,
and whether it's even a feasible idea, rather than waiting on consul and
routing-api to validate it. A nice additional consequence of doing this
extraction early is that once we want to switch over to routing-api, it
only needs to be done in one place.

Jens,

Here is a story in the Routing backlog to enable the routing-api to
support sticky-session route registrations:
https://www.pivotaltracker.com/story/show/100327146

Best,
Amit

On Tue, Sep 8, 2015 at 3:43 PM, Lyle Franklin <lfranklin(a)pivotal.io> wrote:

The mysql and riak services register routes with this guy:
https://github.com/cloudfoundry-incubator/route-registrar. At a glance,
the option to run a healthcheck script to determine whether the service was
healthy would be our only real requirement in addition to the routing API
interaction. We'd love to be guinea pigs for any extracted library.

- Lyle

On Tue, Sep 8, 2015 at 1:38 PM, Zach Robinson <zrobinson(a)pivotal.io> wrote:

i think it's a great idea, but I would wait until the routing api goes
live (waiting on consul stability) and use that tooling to create something
re-usable. I think the job for that could have some sort of health checky
thing where it would run a script with an agreed upon name to decide when
to register/unregister.

-Zach

On Tue, Sep 8, 2015 at 12:53 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hi all,

Several components within cf-release, as well as many jobs in different
releases, register a route with the gorouter:

- *doppler* registers the "doppler" and "loggregator" routes
- the *hm9000* API server registers the "hm9000" route
- *UAA* registers "uaa", "*.uaa", "login", and "*.login" routes
- *CC* registers the "api" route
- Many *service brokers* also register a route.

All these components register their routes in different ways. They
also all use the existing NATS flow, and will all need to switch their
implementations to use the routing API once that goes live and we start to
phase out NATS.

We have been working on extracting a route-registrar job which can be
colocated with other jobs and register routes on their behalf. Currently
it naively just always advertises the configured routes, and relies on the
gorouter's behaviour around knowing not to route requests to addresses that
aren't currently up.

One might require more sophisticated logic than this, however. For
example, a server may be "up" and theoretically capable of handling
requests, but not actually ready yet. Perhaps the route-registrar should
have some contract with its colocated jobs where those jobs can define a
health check script, and the route-registrar will only update the route
registration if the check succeeds.

Another requirement may exist around shutdown behaviour. Jobs may only
want to stop having their routes registered at a certain point in their drain
lifecycle.
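
(For illustration, a minimal sketch in Go of what such a health-check contract could look like inside the registrar loop. The Route fields, the script path, and the register/unregister helpers are hypothetical; this is not the actual route-registrar code or config.)

package main

import (
    "log"
    "os/exec"
    "time"
)

// Route describes a route a colocated job wants advertised. HealthCheckScript
// is the hypothetical piece of the contract: a script the job provides and
// the registrar runs before each registration refresh.
type Route struct {
    URIs              []string
    Host              string
    Port              int
    HealthCheckScript string
}

// register and unregister stand in for the NATS (or, later, routing-api)
// calls the real registrar would make.
func register(r Route)   { log.Printf("registering %v", r.URIs) }
func unregister(r Route) { log.Printf("unregistering %v", r.URIs) }

func main() {
    route := Route{
        URIs:              []string{"uaa.example.com"},
        Host:              "10.0.0.5",
        Port:              8080,
        HealthCheckScript: "/var/vcap/jobs/uaa/bin/health_check", // hypothetical path
    }

    // Refresh on an interval; only advertise the route while the colocated
    // job's health check exits 0, otherwise withdraw it.
    for range time.Tick(20 * time.Second) {
        if err := exec.Command(route.HealthCheckScript).Run(); err != nil {
            unregister(route)
            continue
        }
        register(route)
    }
}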

*We would like feedback* from anyone maintaining a job or release that
does some sort of route registration to gather requirements that would be
desired of a generic route-registration component.

Thanks,
Amit, CF OSS Release Integration team


Relation between Network property in Resource pool block and Network property in Actual Job block

Ronak Banka
 

Hello All,

I have a resource pool, let's say small_z1:

- name: small_z1
  network: cf1
  stemcell: stemcell-xyz
  cloud_properties:
    instance_type: m1.small
    availability_zone: zone1

and a job, router, having two networks assigned to it:

- name: router
  instances: 1
  networks:
  - name: router_internal
    default: [dns, gateway]
    static_ips:
    - xy.xy.xy.xy
  - name: router_external
    static_ips:
    - yz.yz.yz.yz
    gateway: yy.yy.yy.yy
  networks:
    apps: router_internal
    management: router_internal
  resource_pool: small_z1

With these properties there are no issues anywhere.

What is the network property in the resource pool responsible for, if the
job's networks are defined on the job itself and not linked to the one in the pool?

Regards,
Ronak



--
View this message in context: http://cf-dev.70369.x6.nabble.com/Relation-between-Network-property-in-Resource-pool-block-and-Network-property-in-Actual-Job-block-tp1562.html
Sent from the CF Dev mailing list archive at Nabble.com.


Re: Any word on a large install version of CF on OpenStack?

Amit Kumar Gupta
 

Hi Michael,

What do you consider "large"? In principle, nothing is stopping you from
deploying a large cf deployment to your own OpenStack environment. When
you say "seeing a large install version of CF on OpenStack," whom would you
like to see produce this large install? There are several large
CF-on-OpenStack installs out there in the wild, but are you specifically
waiting on one from Stark & Wayne?

Best,
Amit

On Tue, Sep 8, 2015 at 2:55 PM, Michael Minges <mminges(a)ecsteam.com> wrote:

Curious whether there is any word on a possible time frame of seeing a
large install version of CF on OpenStack. I saw in the comments section of
a blog post (
https://blog.starkandwayne.com/2015/05/06/deploying-cloud-foundry-on-openstack-using-terraform/)
by Chris Weibel of Stark&Wayne that a cf-openstack-large.yml was in the
works but there was no set date for when that may be released to the
public. Currently we have a tiny deploy of CF on OpenStack for Dev but we
are looking to migrate our environment to a large install if this is in the
works and due out soon. Is this mailing list even the right place to start
this thread?

Thanks ahead for your time,

Michael Minges


Re: feedback request: extracting a common route-registrar job

Amit Kumar Gupta
 

Hey all,

The router already has some health check behaviour -- you can naively
register routes with it and it's smart about knowing whether to actually
route traffic or not. I'd like to understand what the specific use cases
are that would require a route-registration job to support additional
custom health check logic provided by the various colocated jobs.

Here's the set of things that are being done/need to be done for this track
of work:

1. extract route-registration process from UAA job in cf-release into its
own job in cf-release (note that the new uaa-release doesn't have this
process) [story done, needs acceptance]
2. use this route-registration job colocated with UAA in the cf manifests.
[story done, needs acceptance]
3. use this route-registration job colocated with CC, HM9k, and doppler in
the cf manifests. [stories in flight]
4. discover additional requirements for a generic route-registration job to
be useful [this email thread]
5. implement features in route-registration job to satisfy requirements
6. wait for consul to be declared stable, and enable the routing-api as a
default part of cf deployments [next couple final versions of cf-release
will validate consul]
7. all things doing route-registration should use routing-api
8. extract route-registration job out of cf-release (and put it where?) in
order to make it usable by other things like Gemfire, MySQL, Riak, etc.
9. discover more additional requirements for a generic route-registration
job to be useful
10. implement more features in route-registration job to satisfy
requirements
11. implement more features in routing-api to satisfy requirements

This isn't necessarily an ordered list, however we want to extract and
start using the route-registration stuff early to discover requirements,
and whether it's even a feasible idea, rather than waiting on consul and
routing-api to validate it. A nice additional consequence of doing this
extraction early is that once we want to switch over to routing-api, it
only needs to be done in one place.

Jens,

Here is a story in the Routing backlog to enable the routing-api to support
sticky-session route registrations:
https://www.pivotaltracker.com/story/show/100327146

Best,
Amit

On Tue, Sep 8, 2015 at 3:43 PM, Lyle Franklin <lfranklin(a)pivotal.io> wrote:

The mysql and riak services register routes with this guy:
https://github.com/cloudfoundry-incubator/route-registrar. At a glance,
the option to run a healthcheck script to determine whether the service was
healthy would be our only real requirement in addition to the routing API
interaction. We'd love to be guinea pigs for any extracted library.

- Lyle

On Tue, Sep 8, 2015 at 1:38 PM, Zach Robinson <zrobinson(a)pivotal.io>
wrote:

i think it's a great idea, but I would wait until the routing api goes
live (waiting on consul stability) and use that tooling to create something
re-usable. I think the job for that could have some sort of health checky
thing where it would run a script with an agreed upon name to decide when
to register/unregister.

-Zach

On Tue, Sep 8, 2015 at 12:53 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hi all,

Several components within cf-release, as well as many jobs in different
releases, register a route with the gorouter:

- *doppler* registers the "doppler" and "loggregator" routes
- the *hm9000* API server registers the "hm9000" route
- *UAA* registers "uaa", "*.uaa", "login", and "*.login" routes
- *CC* registers the "api" route
- Many *service brokers* also register a route.

All these components register their routes in different ways. They also
all use the existing NATS flow, and will all need to switch their
implementations to use the routing API once that goes live and we start to
phase out NATS.

We have been working on extracting a route-registrar job which can be
colocated with other jobs and register routes on their behalf. Currently
it naively just always advertises the configured routes, and relies on the
gorouter's behaviour around knowing not to route requests to addresses that
aren't currently up.

One might require more sophisticated logic than this, however. For
example, a server may be "up" and theoretically capable of handling
requests, but not actually ready yet. Perhaps the route-registrar should
have some contract with its colocated jobs where those jobs can define a
health check script, and the route-registrar will only update the route
registration if the check succeeds.

Another requirement may exist around shutdown behaviour. Jobs may only
want to stop having their routes registered at a certain point in their drain
lifecycle.

*We would like feedback* from anyone maintaining a job or release that
does some sort of route registration to gather requirements that would be
desired of a generic route-registration component.

Thanks,
Amit, CF OSS Release Integration team


Re: feedback request: extracting a common route-registrar job

Lyle Franklin
 

The mysql and riak services register routes with this guy:
https://github.com/cloudfoundry-incubator/route-registrar. At a glance, the
option to run a healthcheck script to determine whether the service was
healthy would be our only real requirement in addition to the routing API
interaction. We'd love to be guinea pigs for any extracted library.

- Lyle

On Tue, Sep 8, 2015 at 1:38 PM, Zach Robinson <zrobinson(a)pivotal.io> wrote:

i think it's a great idea, but I would wait until the routing api goes
live (waiting on consul stability) and use that tooling to create something
re-usable. I think the job for that could have some sort of health checky
thing where it would run a script with an agreed upon name to decide when
to register/unregister.

-Zach

On Tue, Sep 8, 2015 at 12:53 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hi all,

Several components within cf-release, as well as many jobs in different
releases, register a route with the gorouter:

- *doppler* registers the "doppler" and "loggregator" routes
- the *hm9000* API server registers the "hm9000" route
- *UAA* registers "uaa", "*.uaa", "login", and "*.login" routes
- *CC* registers the "api" route
- Many *service brokers* also register a route.

All these components register their routes in different ways. They also
all use the existing NATS flow, and will all need to switch their
implementations to use the routing API once that goes live and we start to
phase out NATS.

We have been working on extracting a route-registrar job which can be
colocated with other jobs and register routes on their behalf. Currently
it naively just always advertises the configured routes, and relies on the
gorouter's behaviour around knowing not to route requests to addresses that
aren't currently up.

One might require more sophisticated logic than this, however. For
example, a server may be "up" and theoretically capable of handling
requests, but not actually ready yet. Perhaps the route-registrar should
have some contract with its colocated jobs where those jobs can define a
health check script, and the route-registrar will only update the route
registration if the check succeeds.

Another requirement may exist around shutdown behaviour. Jobs may only
want to stop having their routes registered at a certain point in their drain
lifecycle.

*We would like feedback* from anyone maintaining a job or release that
does some sort of route registration to gather requirements that would be
desired of a generic route-registration component.

Thanks,
Amit, CF OSS Release Integration team


Any word on a large install version of CF on OpenStack?

Michael Minges
 

Curious whether there is any word on a possible time frame of seeing a large install version of CF on OpenStack. I saw in the comments section of a blog post (https://blog.starkandwayne.com/2015/05/06/deploying-cloud-foundry-on-openstack-using-terraform/) by Chris Weibel of Stark&Wayne that a cf-openstack-large.yml was in the works but there was no set date for when that may be released to the public. Currently we have a tiny deploy of CF on OpenStack for Dev but we are looking to migrate our environment to a large install if this is in the works and due out soon. Is this mailing list even the right place to start this thread?

Thanks ahead for your time,

Michael Minges


PHP and HHVM support questions

Mike Dalessio
 

Hello cf-dev,

*TL;DR*: The Cloud Foundry Buildpacks team is discussing whether, and how,
to continue support for HHVM in the php-buildpack.

*Actions*: If you're a PHP developer, please fill out a short four-question
survey to help us determine what level of support the community needs for
HHVM.

*Please click through to the anonymous survey here:*

https://docs.google.com/forms/d/1WBupympWFRMQnoGZAgQLKmUZugreVldj3xDhyn9kpWM/viewform?usp=send_form

-----

*Context*

The PHP buildpack, in v4.0.0 and later, supports PHP 5.4, 5.5, and 5.6 as
well as HHVM 3.5 and 3.6.

HHVM currently presents a challenge in that it depends on many packages
that are not present in the rootfs. The tooling we're using now downloads a
handful of .deb packages as part of the HHVM compilation process and
packages them in the buildpack with the compiled binary.

This, of course, opens HHVM users up to potentially needing to update a
buildpack to address security vulnerabilities and bugs that could normally
be easily addressed with a rootfs update. And maybe that's OK, but it's a
notable deviation from how we generally handle the binaries we vendor into the CF
buildpacks.

One possible solution is to add all the packages necessary to run HHVM to
the rootfs, which would include libboost as well as the other libraries
enumerated here:

https://www.pivotaltracker.com/story/show/99169476

In order to really understand the tradeoffs, it's necessary to understand
whether, and how, HHVM is being used by the CF community.


This is related to a broader conversation around customization and
modification of rootfses, but for now I'd like to focus on the specific
question of whether HHVM support is valuable enough to continue.

Thanks for reading, and *once again, the survey link is here*:

https://docs.google.com/forms/d/1WBupympWFRMQnoGZAgQLKmUZugreVldj3xDhyn9kpWM/viewform?usp=send_form

Cheers,

-mike


Re: feedback request: extracting a common route-registrar job

Zach Robinson
 

i think it's a great idea, but I would wait until the routing api goes live
(waiting on consul stability) and use that tooling to create something
re-usable. I think the job for that could have some sort of health checky
thing where it would run a script with an agreed upon name to decide when
to register/unregister.

-Zach

On Tue, Sep 8, 2015 at 12:53 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hi all,

Several components within cf-release, as well as many jobs in different
releases, register a route with the gorouter:

- *doppler* registers the "doppler" and "loggregator" routes
- the *hm9000* API server registers the "hm9000" route
- *UAA* registers "uaa", "*.uaa", "login", and "*.login" routes
- *CC* registers the "api" route
- Many *service brokers* also register a route.

All these components register their routes in different ways. They also
all use the existing NATS flow, and will all need to switch their
implementations to use the routing API once that goes live and we start to
phase out NATS.

We have been working on extracting a route-registrar job which can be
colocated with other jobs and register routes on their behalf. Currently
it naively just always advertises the configured routes, and relies on the
gorouter's behaviour around knowing not to route requests to addresses that
aren't currently up.

One might require more sophisticated logic than this, however. For
example, a server may be "up" and theoretically capable of handling
requests, but not actually ready yet. Perhaps the route-registrar should
have some contract with its colocated jobs where those jobs can define a
health check script, and the route-registrar will only update the route
registration if the check succeeds.

Another requirement may exist around shutdown behaviour. Jobs may only
want to stop having their routes registered at a certain point in their drain
lifecycle.

*We would like feedback* from anyone maintaining a job or release that
does some sort of route registration to gather requirements that would be
desired of a generic route-registration component.

Thanks,
Amit, CF OSS Release Integration team


Re: [p1-data-services] feedback request: extracting a common route-registrar job

Jens Deppe <jdeppe@...>
 

The GemFire service registers HA routes to our dashboard(s). For this to
work correctly and have the gorouter honor session stickiness I submitted
this pull request to natbeat:
https://github.com/cloudfoundry-incubator/natbeat/pull/5. The essence of
the fix is:

For an HA backend service (such as a dashboard) I need requests to be
sticky. To enable this I need to set the private_instance_id in the
RegistryMessage so that the gorouter does the right thing by setting a
__VCAP_ID__ cookie. This is enabled by the private_instance_id in the
registration message.
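
(For illustration, a sketch of that registration message; the field names follow the gorouter's NATS register message as I understand it, but they are assumptions rather than verified natbeat code.)

package main

import (
    "encoding/json"
    "fmt"
)

// RegistryMessage sketches the JSON body published on the NATS
// "router.register" subject. private_instance_id is what lets the gorouter
// set a __VCAP_ID__ cookie and pin a session to one backend instance.
type RegistryMessage struct {
    Host              string   `json:"host"`
    Port              int      `json:"port"`
    URIs              []string `json:"uris"`
    PrivateInstanceID string   `json:"private_instance_id"`
}

func main() {
    msg := RegistryMessage{
        Host:              "10.0.16.21",
        Port:              8443,
        URIs:              []string{"gemfire-dashboard.example.com"},
        PrivateInstanceID: "dashboard-instance-0", // unique per backend instance
    }
    body, _ := json.Marshal(msg)
    fmt.Println(string(body)) // this is what would be published to the router
}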


Thanks
--Jens

On Tue, Sep 8, 2015 at 12:53 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hi all,

Several components within cf-release, as well as many jobs in different
releases, register a route with the gorouter:

- *doppler* registers the "doppler" and "loggregator" routes
- the *hm9000* API server registers the "hm9000" route
- *UAA* registers "uaa", "*.uaa", "login", and "*.login" routes
- *CC* registers the "api" route
- Many *service brokers* also register a route.

All these components register their routes in different ways. They also
all use the existing NATS flow, and will all need to switch their
implementations to use the routing API once that goes live and we start to
phase out NATS.

We have been working on extracting a route-registrar job which can be
colocated with other jobs and register routes on their behalf. Currently
it naively just always advertises the configured routes, and relies on the
gorouter's behaviour around knowing not to route requests to addresses that
aren't currently up.

One might require more sophisticated logic than this, however. For
example, a server may be "up" and theoretically capable of handling
requests, but not actually ready yet. Perhaps the route-registrar should
have some contract with its colocated jobs where those jobs can define a
health check script, and the route-registrar will only update the route
registration if the check succeeds.

Another requirement may exist around shutdown behaviour. Jobs may only
want to stop having their routes registered at a certain point in their drain
lifecycle.

*We would like feedback* from anyone maintaining a job or release that
does some sort of route registration to gather requirements that would be
desired of a generic route-registration component.

Thanks,
Amit, CF OSS Release Integration team


feedback request: extracting a common route-registrar job

Amit Kumar Gupta
 

Hi all,

Several components within cf-release, as well as many jobs in different
releases, register a route with the gorouter:

- *doppler* registers the "doppler" and "loggregator" routes
- the *hm9000* API server registers the "hm9000" route
- *UAA* registers "uaa", "*.uaa", "login", and "*.login" routes
- *CC* registers the "api" route
- Many *service brokers* also register a route.

All these components register their routes in different ways. They also
all use the existing NATS flow, and will all need to switch their
implementations to use the routing API once that goes live and we start to
phase out NATS.

We have been working on extracting a route-registrar job which can be
colocated with other jobs and register routes on their behalf. Currently
it naively just always advertises the configured routes, and relies on the
gorouter's behaviour around knowing not to route requests to addresses that
aren't currently up.

One might require more sophisticated logic than this, however. For
example, a server may be "up" and theoretically capable of handling
requests, but not actually ready yet. Perhaps the route-registrar should
have some contract with its colocated jobs where those jobs can define a
health check script, and the route-registrar will only update the route
registration if the check succeeds.

Another requirement may exist around shutdown behaviour. Jobs may only
want to stop having their routes registered at a certain point in their drain
lifecycle.

*We would like feedback* from anyone maintaining a job or release that does
some sort of route registration to gather requirements that would be
desired of a generic route-registration component.

Thanks,
Amit, CF OSS Release Integration team


Proposal: Decomposing cf-release and Extracting Deployment Strategies

Amit Kumar Gupta
 

Hi all,

The CF OSS Release Integration team (casually referred to as the "MEGA
team") is trying to solve a lot of tightly interrelated problems, and make
many of said problems less interrelated. It is difficult to address just
one issue without touching the others, so the following proposal addresses
several issues, but the most important ones are:

* decompose cf-release into many independently manageable, independently
testable, independently usable releases
* separate manifest generation strategies from the release source, paving
the way for Diego to be part of the standard deployment

This proposal will outline a picture of how manifest generation will work
in a unified manner in development, test, and integration environments. It
will also outline a picture of what each release’s test pipelines will look
like, how they will feed into a common integration environment, and how
feedback from the integration environment will feed back into the test
environments. Finally, it will propose a picture for what the integration
environment will look like, and how we get from the current integration
environment to where we want to be.

For further details, please feel free to view and comment here:

https://docs.google.com/document/d/1Viga_TzUB2nLxN_ILqksmUiILM1hGhq7MBXxgLaUOkY

Thanks,
Amit, CF OSS Release Integration team


Re: How to deploy a Web application using HTTPs

James Bayer
 

juan i don't understand what you are trying to do.

your node app should listen to the $PORT environment variable with a plain
http connection.

the load balancer you use for cloud foundry (HAProxy or a LB you provide
like F5 or ELB) should terminate SSL and add the appropriate
x-forwarded-proto header to indicate whether the originating request was
SSL.

gorouter also supports receiving https traffic from the load balancer, but
does not re-encrypt the traffic to the backend container.

app client ---HTTPS---> LB ---HTTPS---> GoRouter ---HTTP---> DEA/DiegoCell

what are you trying to do?
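
(As a rough sketch of that pattern, in Go rather than Node, though the same idea applies to a Node app listening on process.env.PORT: serve plain HTTP on $PORT and inspect x-forwarded-proto to learn the original scheme.)

package main

import (
    "fmt"
    "net/http"
    "os"
)

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        // The LB/gorouter terminates TLS; the app only sees the forwarded
        // header indicating the original scheme.
        if r.Header.Get("X-Forwarded-Proto") == "https" {
            fmt.Fprintln(w, "original request arrived over https")
            return
        }
        fmt.Fprintln(w, "original request was plain http")
    })

    // Cloud Foundry tells the app which port to bind via $PORT.
    port := os.Getenv("PORT")
    if port == "" {
        port = "8080"
    }
    http.ListenAndServe(":"+port, nil)
}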

On Tue, Sep 8, 2015 at 11:34 AM, Juan Antonio Breña Moral <
bren(a)juanantonio.info> wrote:

Hi James,

I have just tested and I received this message:

"502 Bad Gateway: Registered endpoint failed to handle the request."

Source:
https://github.com/jabrena/CloudFoundryLab/tree/master/Node_HelloWorld_ssl

I think that it is a very important feature. In the example, I use a local
certificate to offer an https connection for an API, but CF doesn't seem to
support that.

My question is: how can I deploy a secure application on Pivotal if the
platform doesn't support that?

Juan Antonio


--
Thank you,

James Bayer


Re: How to deploy a Web application using HTTPs

Juan Antonio Breña Moral <bren at juanantonio.info...>
 

Hi James,

I have just tested and I received this message:

"502 Bad Gateway: Registered endpoint failed to handle the request."

Source:
https://github.com/jabrena/CloudFoundryLab/tree/master/Node_HelloWorld_ssl

I think that it is a very important feature. In the example, I use a local certificate to offer an https connection for an API, but CF doesn't seem to support that.

My question is: how can I deploy a secure application on Pivotal if the platform doesn't support that?

Juan Antonio


Re: So many hard-coded dropsonde destinations to metrons

Warren Fernandes
 

The LAMB team added a chore to discuss how we can better manage a dropsonde_incoming_port on the metron_agent over here https://www.pivotaltracker.com/story/show/102935222

We'll update this thread once we decide how to proceed.


Re: CAB September Call on 9/9/2015 @ 8a PDT

Michael Maximilien
 

Final reminder for the CAB call tomorrow. See you at Pivotal SF and talk to you all then.

Best,

dr.max
ibm cloud labs
silicon valley, ca

Sent from my iPhone

On Sep 2, 2015, at 6:04 PM, Michael Maximilien <maxim(a)us.ibm.com> wrote:

Hi, all,

Quick reminder that the CAB call for September is next week Wednesday September 9th @ 8a PDT.

Please add any project updates to Agenda here: https://docs.google.com/document/d/1SCOlAquyUmNM-AQnekCOXiwhLs6gveTxAcduvDcW_xI/edit#heading=h.o44xhgvum2we

If you have something else to share, please also add an entry at the end.

Best,

Chip, James, and Max

PS: Dr.Nic this is one week in advance, so no excuses ;) phone info listed on agenda.
PPS: Have a great labor day weekend---if you are in the US.


Re: Generic data points for dropsonde

Johannes Tuchscherer
 

Ben,

I guess I am working under the assumption that the current upstream schema
is not going to see a terrible amount of change. The StatsD protocol has
been very stable for over four years, so I don't understand why we would
add more and more metric types. (I already struggle with the decision to
have container metrics as their own data type. I am not quite sure why that
was done vs just expressing them as ValueMetrics).

I am also not following your argument about multiple implementations of
a redis export. Why would you have multiple implementations of a redis info
export? Also, why does the downstream consumer have to know about the
schema? Neither the datadog nozzle nor the graphite nozzle cares about any
type of schema right now.

But to answer your question, I think as a downstream developer I am not as
interested in whether you are sending me a uint32 or uint64, but the
meaning (e.g. counter vs value) is much more important to me. So, if you
were to do nested metrics, I think I would rather like to see having nested
counters or values in there plus maybe one type that we are missing which
is a generic event with just a string.

Generally, I would try to avoid falling into the trap of creating an overly
generic system at the cost of making consumers unnecessarily complicated.
Maybe it would help if you outlined a few use cases that might benefit from
a system like this and how specifically you would implement a downstream
consumer (e.g. is there a common place where I can fetch the schema for the
generic data point?).

On Sat, Sep 5, 2015 at 6:57 AM, James Bayer <jbayer(a)pivotal.io> wrote:

after understanding ben's proposal of what i would call an extensible
generic point versus the status quo of metrics that are actually hard-coded
in software by both the metric producer and the metric consumer, i
immediately gravitated toward the approach by ben.

cloud foundry has really benefited from extensibility in these examples:

* diego lifecycles
* app buildpacks
* app docker images
* app as windows build artifact
* service brokers
* cf cli plugins
* collector plugins
* firehose nozzles
* diego route emitters
* garden backends
* bosh cli plugins
* bosh releases
* external bosh CPIs
* bosh health monitor plugins

let me know if there are other points of extension i'm missing.

in most cases, the initial implementations required cloud foundry system
components to change software to support additional extensibility, and some
of the examples above still require that, and it's a source of frustration
when someone with an idea to explore needs to persuade the cf maintaining
team to process a pull request or complete work on an area. i see ben's
proposal as making an extremely valuable additional point of extension for
creating application and system metrics that benefits the entire cloud
foundry ecosystem.

i am sympathetic to the question raised by dwayne around how large the
messages will be. it would seem that we could consider an upper bound on
the number of attributes supported by looking at the types of metrics that
would be expressed. the redis info point is already 84 attributes for
example.

all of the following seem related to scaling considerations off the top of
my head:
* how large an individual metric may be
* at what rate the platform should support producers sending metrics
* what platform quality of service to provide (lossiness or not, back
pressure, rate limiting, etc)
* what types of clients of the metrics are supported and any limitations
related to that.
* whether there is tenant variability in some of the dimensions above. for
example a system metric might have a higher SLA than an app metric

should we consider putting a boundary on the "how large an individual
metric may be" by limiting the initial implementation to a number of
attributes (that we could change later or make configurable)?

i'm personally really excited about this new set of extensibility being
proposed and the creative things people will do with it. having loggregator
as a built-in system component versus a bolt-on is already such a great
capability compared with other platforms and i see investments to make it
more extensible and apply to more scenarios as making cloud foundry more
valuable and more fun to use.

On Fri, Sep 4, 2015 at 10:52 AM, Benjamin Black <bblack(a)pivotal.io> wrote:

johannes,

the problem of upstream schema changes causing downstream change or
breakage is the current situation: every addition of a metric type implies
a change to the dropsonde-protocol requiring everything downstream to be
updated.

the schema concerns are similar. currently there is no schema whatsoever
beyond the very fine grained "this is a name and this is a value". this
means every implementation of redis info export, for example, can, and
almost certainly will, be different. this results in every downstream
consumer having to know every possible variant or to only support specific
variants, both exactly the problem you are looking to avoid.

i share the core concern regarding ensuring points are "sufficiently"
self describing. however, there is no clear line delineating what is
sufficient. the current proposal pushes almost everything out to schema. we
could imagine a change to the attributes that includes what an attribute is
(gauge, counter, etc), what the units are for the attribute, and so on.

it is critical that we balance the complexity of the points against
complexity of the consumers as there is no free lunch here. which specific
functionality would you want to see in the generic points to achieve the
balance you prefer?


b



On Wed, Sep 2, 2015 at 2:07 PM, Johannes Tuchscherer <
jtuchscherer(a)pivotal.io> wrote:

The current way of sending metrics as either Values or Counters through
the pipeline makes the development of a downstream consumer (=nozzle)
pretty easy. If you look at the datadog nozzle[0], it just takes all
ValueMetrics and Counters and sends them off to datadog. The nozzle does
not have to know anything about these metrics (e.g. their origin, name, or
layout).
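
(A sketch of why such a consumer stays simple; the types below are simplified stand-ins for the dropsonde envelope payloads, not the real firehose client.)

package main

import "fmt"

// Simplified stand-ins for dropsonde envelope payloads.
type ValueMetric struct {
    Name  string
    Value float64
    Unit  string
}

type CounterEvent struct {
    Name  string
    Delta uint64
}

// forward pretends to ship a metric to datadog/graphite; the consumer only
// needs the metric's kind and value, not its meaning or origin.
func forward(kind, name string, value float64) {
    fmt.Printf("%s %s=%v\n", kind, name, value)
}

func handle(event interface{}) {
    switch e := event.(type) {
    case ValueMetric:
        forward("gauge", e.Name, e.Value)
    case CounterEvent:
        forward("counter", e.Name, float64(e.Delta))
    default:
        // other event types (e.g. container metrics) need their own handling,
        // which is the coupling being discussed in this thread
    }
}

func main() {
    handle(ValueMetric{Name: "numCPUS", Value: 4, Unit: "count"})
    handle(CounterEvent{Name: "sentEnvelopes", Delta: 10})
}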

Adding a new way to send metrics as a nested object would make the
downstream implementation certainly more complicated. In that case, the
nozzle developer has to know what metrics are included inside the generic
point (basically the schema of the metric) and treat each point
accordingly. For example, if I were to write a nozzle that emits metrics to
Graphite with a StatsD client (like it is done here[1]), I need to know if
my int64 value is a Gauge or a Counter. Also, my consumer is under constant
risk of breaking when the upstream schema changes.

We are already facing this problem with the container metrics. But at
least the container metrics are in a defined format that is well documented
and not likely to change.

I agree with you, though, that the dropsonde protocol could use a
mechanism for easier extension. Having a GenericPoint and/or GenericEvent
seems like a good idea that I whole-heartedly support. I would just like to
stay away from nested metrics. I think the cost of adding more logic into
the downstream consumer (and making it harder to maintain) is not worth the
benefit of a more concise metric transport.


[0] https://github.com/cloudfoundry-incubator/datadog-firehose-nozzle
[1] https://github.com/CloudCredo/graphite-nozzle

On Tue, Sep 1, 2015 at 5:52 PM, Benjamin Black <bblack(a)pivotal.io>
wrote:

great questions, dwayne.

1) the partition key is intended to be used in a similar manner to
partitioners in distributed systems like cassandra or kafka. the specific
behavior i would like to make part of the contract is two-fold: that all
data with the same key is routed to the same partition and that all data in
a partition is FIFO (meaning no ordering guarantees beyond arrival time).

this could help with the multi-line log problem by ensuring a single
consumer will receive all lines for a given log entry in order, allowing
simple reassembly. however, the lines might be interleaved with other lines
with the same key or even other keys that happen to map to the same
partition, so the consumer does require a bit of intelligence. this was
actually one of the driving scenarios for adding the key.
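
(A small sketch of the partitioning contract being described: a hash of the key picks the partition, and each partition preserves arrival order. The names and the partition count are invented for illustration.)

package main

import (
    "fmt"
    "hash/fnv"
)

const numPartitions = 4

// partitionFor maps a partition key to a partition index. All points with the
// same key land in the same partition; within a partition, consumers see
// points in arrival (FIFO) order, with no guarantees beyond that.
func partitionFor(key string) int {
    h := fnv.New32a()
    h.Write([]byte(key))
    return int(h.Sum32() % numPartitions)
}

func main() {
    // e.g. all lines of one multi-line log entry share a key, so a single
    // consumer receives them in order and can reassemble them, even if they
    // are interleaved with other keys that map to the same partition.
    for _, key := range []string{"app-1/instance-0", "app-1/instance-0", "app-2/instance-3"} {
        fmt.Printf("key %q -> partition %d\n", key, partitionFor(key))
    }
}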

2) i expect typical points to be in the hundreds of bytes to a few KB.
if we find ourselves regularly needing much larger points, especially near
that 64KB limit, i'd look to the JSON representation as the hierarchical
structure is more efficiently managed there.


b




On Tue, Sep 1, 2015 at 4:42 PM, <dschultz(a)pivotal.io> wrote:

Hi Ben,

I was wondering if you could give a concrete use case for the
partition key functionality.

In particular I am interested in how we solve multi line log entries.
I think it would be better to solve it by keeping all the data (the
multiple lines) together throughout the logging/metrics pipeline, but could
see how something like a partition key might help keep the data together as
well.

Second question: how large do you see these point messages getting
(average and max)? There are still several stages of the logging/metrics
pipeline that use UDP with a standard 64K size limit.

Thanks,
Dwayne

On Aug 28, 2015, at 4:54 PM, Benjamin Black <bblack(a)pivotal.io> wrote:

All,

The existing dropsonde protocol uses a different message type for each
event type. HttpStart, HttpStop, ContainerMetrics, and so on are all
distinct types in the protocol definition. This requires protocol changes
to introduce any new event type, making such changes very expensive. We've
been working for the past few weeks on an addition to the dropsonde
protocol to support easier future extension to new types of events and to
make it easier for users to define their own events.

The document linked below [1] describes a generic data point message
capable of carrying multi-dimensional, multi-metric points as sets of
name/value pairs. This new message is expected to be added as an additional
entry in the existing dropsonde protocol metric type enum. Things are now
at a point where we'd like to get feedback from the community before moving
forward with implementation.
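
(As a rough sketch of the shape being proposed, expressed here as Go types rather than the actual protobuf definition; the field names are illustrative only, and the authoritative definition is in the linked document.)

package main

import "fmt"

// GenericPoint sketches the proposed envelope entry: one timestamped point
// carrying arbitrary name/value attributes (dimensions and metrics alike),
// plus a partition key used for routing.
type GenericPoint struct {
    Timestamp    int64
    PartitionKey string
    Attributes   map[string]interface{}
}

func main() {
    point := GenericPoint{
        Timestamp:    1441324800000000000,
        PartitionKey: "redis/instance-7",
        Attributes: map[string]interface{}{
            "origin":            "redis-info-exporter",
            "connected_clients": 42,
            "used_memory_bytes": 1048576,
            "role":              "master",
        },
    }
    fmt.Printf("%+v\n", point)
}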

Please contribute your thoughts on the document in whichever way you
are most comfortable: comments on the document, email here, or email
directly to me. If you comment on the document, please make sure you are
logged in so we can keep track of who is asking for what. Your views are
not just appreciated, but critical to the continued health and success of
the Cloud Foundry community. Thank you!


b

[1]
https://docs.google.com/document/d/1SzvT1BjrBPqUw6zfSYYFfaW9vX_dTZZjn5sl2nxB6Bc/edit?usp=sharing





--
Thank you,

James Bayer