Re: cf-deployment 3.0

Krannich, Bernd
 

I was about to mention that I did enjoy the existing CF release model, which roughly translated to “you better run fast” for consumers.

The one thing I found needed tweaking in the existing model was the approach to including fixes for very-high-priority CVEs. Oftentimes, in our quest to run fast and secure our systems as quickly as possible, we ended up pulling in a bunch of features that required additional validation and essentially slowed down our rollout to production.

I felt that the better approach to support people who can keep up the speed would have been to always provide fixes for very-high-priority CVEs as cherry-picks based on the latest released version (and then, of course, also include those fixes in the next “regular” release).

Based on the comments so far, it sounds like “you better run fast” will actually be harder for consumers with the newly proposed approach. But maybe I’m not fully understanding the concepts, so it would be great to get some more details on the plans.

 

Regards,

Bernd

 



Re: cf-deployment 3.0

Jesse T. Alford
 

I don't agree with the claim that we didn't introduce major breaking changes in the past - we did. Routinely.

`cf-release` was sem-ver only insofar as every version was a major version. Changes just as dramatic as this were made on some but not all arbitrary major releases.

The major thing cf-d brings here is real semver, so it's _clear_ that some versions are major changes.

The credo remains the same - forward, always.

Chip's point about long-term support/backported fixes is exactly on point. It's a major support burden, and is one of the principal pieces of work done by commercial distributors.

Jesse Alford
_Formerly of_ CF Release Integration




Re: cf-deployment 3.0

Chip Childers <cchilders@...>
 

Food for thought: One of the challenges here is that maintaining patches for past coordinated releases is expensive (both in time and CI costs). In the CF ecosystem, this has traditionally been the responsibility of the downstream commercial distributions.

This isn't to say that there isn't a solution that can help all downstream users (including non-commercial users AND the distros), yet not burden the Rel Int team too much. I'm not sure what that solution is though...


--
Chip Childers
CTO, Cloud Foundry Foundation
1.267.250.0815


CF/K8S SIG Calls

Chip Childers <cchilders@...>
 

All,

We held our last CF/K8S SIG call today, which is great news. They served the purpose of getting a bunch of the interesting work that's happening out into the open, and now most of the efforts are either inside a CFF PMC or on their way there. The attendees agreed that the time had come to discontinue the calls (although Julz says that I should use a bat signal if / when needed in the future).

So for those interested, dive into the various projects directly within the Runtime, Extensions and BOSH PMCs. :)

-chip
--
Chip Childers
CTO, Cloud Foundry Foundation
1.267.250.0815


FINAL REMINDER: CAB call for July is Wednesday (tomorrow) 07/18 @ 8a PST or 11a EST

Michael Maximilien
 

FYI...

Zoom soon. Best,

dr.max
ibm ☁ 
silicon valley, ca





On Jul 12, 2018, at 11:16 AM, Michael Maximilien <maxim@...> wrote:

FYI...


Please remember to join the Zoom call [0] Wednesday July 18th at 8a Pacific for QAs, highlights, and two presentations:


1. Project Shield v8 Updates by James Hunt of Stark & Wayne [1] 


2. CF-Extensions Project Service Fabrik Updates by Ashish Jain of SAP  [2] and [3]


Zoom soon. Best,




Re: CF Application Runtime PMC - CF Bits-Service Project Lead Call for Nominations

Simon D Moser
 

Hello all,

IBM is nominating Peter Goetz for the CF Bits Service Project Lead in the Application Runtime PMC.

Peter is a Software Engineer at IBM working both as a core contributor to the Cloud Foundry Bits-Service and on IBM's Cloud Foundry production system.

Prior to joining IBM, Peter worked at Amazon as a technical lead, developing systems to expand Amazon's international business; he holds a Diploma degree in Physics from the University of Stuttgart.

Kind regards

Simon Moser

Senior Technical Staff Member / IBM Master Inventor
Bluemix Application Platform Lead Architect
Dept. C727, IBM Research & Development Boeblingen
 



From:        "Dieu Cao" <dcao@...>
To:        cf-dev <cf-dev@...>
Date:        16/07/2018 22:19
Subject:        [cf-dev] CF Application Runtime PMC - CF Bits-Service Project Lead Call for Nominations
Sent by:        cf-dev@...




Hello All,

Simon Moser, the Project Lead for the Bits-Service team within the Application Runtime PMC, is rotating into a different role within IBM. We thank him for his time serving as the Bits-Service Project Lead. 

The Bits-Service team, located in Germany, now has an opening for its project lead. Project leads must be nominated by a Cloud Foundry Foundation member.

Please send nominations to me/in reply to this posting by end of day July 23rd, 2018.

If you have any questions about the role/process, please let me know.
These are described in the CFF governance documents. [1]

-Dieu Cao
CF Application Runtime PMC Lead

[1] https://www.cloudfoundry.org/wp-content/uploads/2015/09/CFF_Development_Operations_Policy.pdf




CF Application Runtime PMC - CF Bits-Service Project Lead Call for Nominations

Dieu Cao <dcao@...>
 

Hello All,

Simon Moser, the Project Lead for the Bits-Service team within the Application Runtime PMC, is rotating into a different role within IBM. We thank him for his time serving as the Bits-Service Project Lead. 

The Bits-Service team, located in Germany, now has an opening for its project lead. Project leads must be nominated by a Cloud Foundry Foundation member.

Please send nominations to me/in reply to this posting by end of day July 23rd, 2018.

If you have any questions about the role/process, please let me know.
These are described in the CFF governance documents. [1]

-Dieu Cao
CF Application Runtime PMC Lead

[1] https://www.cloudfoundry.org/wp-content/uploads/2015/09/CFF_Development_Operations_Policy.pdf


Re: Proposal for weighted routing user experience in Cloud Foundry

Filip Hanik
 

Use case: I want v1-stable to receive five times as much traffic as each individual upgrade version I deploy.


Phase 1: Deploying alpha 1

Proposed (weights MUST sum to 100):
 v1-stable: 83
 v2-alpha1: 17

Suggested (simpler base1-lb):
 v1-stable: 5
 v2-alpha1: 1

Phase 2: Deploying alpha 1 and alpha 2

Proposed (weights MUST sum to 100):
 v1-stable: 72
 v2-alpha1: 14
 v2-alpha2: 14

Suggested (simpler base1-lb):
 v1-stable: 5
 v2-alpha1: 1
 v2-alpha2: 1

Why the simpler approach is better:
When adding v2-alpha2, I don't need to touch the load-balancing settings of my existing versions. The relationship between v1-stable and v2-alpha1 remains exactly the same.
I also don't need to do any math to understand the relationship between the two.

The proposed base1-lb simply removes the need for percentages and calculations.
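
To make the comparison concrete, here is a minimal Python sketch that converts base1-lb style relative weights into whole percentages that sum to 100. It is not part of the routing proposal: the function name `to_percentages` and the largest-remainder rounding are assumptions chosen for illustration only.

```python
from typing import Dict


def to_percentages(weights: Dict[str, int]) -> Dict[str, int]:
    """Convert relative integer weights into whole percentages summing to 100."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Floor each exact share first, then hand the leftover points to the
    # entries with the largest remainders (largest-remainder rounding).
    shares = {name: (w * 100) // total for name, w in weights.items()}
    remainders = {name: (w * 100) % total for name, w in weights.items()}
    leftover = 100 - sum(shares.values())
    by_remainder = sorted(weights, key=lambda n: remainders[n], reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


phase1 = {"v1-stable": 5, "v2-alpha1": 1}
# Adding a version leaves the existing relative weights untouched.
phase2 = {**phase1, "v2-alpha2": 1}

print(to_percentages(phase1))
print(to_percentages(phase2))
```

With this rounding, the two calls reproduce the 83/17 and 72/14/14 splits listed above. The only change between the phases on the base1-lb side is the one added entry.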














On Fri, Jul 13, 2018 at 3:09 PM Shubha Anjur Tupil <sanjurtupil@...> wrote:

The CF Routing team has received feedback from many users that support for weighted routing would make it easier to accomplish their goals. We have a proposal on the preferred user experience for weighted routing and the considerations we have taken into account.


If you have thoughts on this or have experience working with traffic splitting on other platforms, please share your feedback with us. Feel free to comment on the doc or reply here.


Regards,

CF Routing Team




Re: cf-deployment 3.0

Franks, Geoff
 

I’m going to agree with Marco’s concerns here. Making life harder and less stable for the end users of CF has a real potential to alienate and push away the CF userbase altogether, even if it’s just in appearance (seeing monthly major releases of a product may cause new organizations to hesitate to migrate until the release process appears more stable).

 

 



Re: Deprecate route-sync from CFCR to CFAR

Oleksandr Slynko
 

Hi Arghya,

You mentioned on GitHub that you were able to overcome this issue.

For everyone else, here is the context and a bit more information.

History
In the very early CFCR days, we did not support a cloud provider and basically could not give access to the applications and API from outside the cluster. We had HAProxy instances to give access to workloads and the API. At that point, several early adopters told us that they would like to try exposing routes in a more dynamic way, à la CFAR, and possibly reuse the existing routing layer. The main point was that we wanted to provision multiple clusters with ease and without changes to Cloud Config.
As a result, we created route-sync.

What it does
It solves two problems:
- having a stable and known URL for the API, which we can use to sign the certificate
- having a way to expose applications

How we solve it now
For the API, we suggest that people wire their load balancers directly and then add the URL to the manifest. For an example, see how BBL does it: https://github.com/cloudfoundry/bosh-bootloader/tree/master/plan-patches/cfcr-gcp

Are we diverging further from CFAR?
Yes, the CFCR team is moving further toward "vanilla" Kubernetes. But we expect other teams to provide solutions for both worlds. We don't have deep enough knowledge of CFAR components, and acquiring it would slow us down in improving the Kubernetes experience.

We are ready to help anyone understand Kubernetes better and provide a better experience with both runtimes.

Sincerely,
Oleksandr


Re: cf-deployment 3.0

Marco Voelz
 

Dear Josh,


Thanks for the context; I wasn't aware of what happened before the release of networking 2.0. To stick with your example, though: from what you are saying, I understand that you would rather have done it this way (please correct me if I'm wrong):

  • integrate networking release 2.0 into cf-deployment
  • integrate other PRs with breaking changes
  • bump cf-deployment to a new major version, given the above changes
  • merge the CVE fixes only into the new major version of cf-deployment

With this process, you would have achieved the following:
  • the development teams are happy, because they shipped as soon as they were ready to
  • operators are grumpy, because they have to bump networking to a new major version and adapt to other breaking changes in order to fix CVEs

I'm not saying you have to turn this tradeoff the other way around, but in my opinion this doesn't seem very consumer-friendly.

In your team's mission, you have clearly stated that your goal is to enable development teams to maintain a high velocity. I'd like to stress that we shouldn't leave the operators and users out of the picture here. In the end, you're developing for them, not for yourself.

I'm not sure if the consumer/operator persona is a thing for RelInt, but if it is, here's something I'd like to hold true for whatever change RelInt makes to its process:
"As an operator of CF, I'd like to consume CVE fixes with as few changes to my existing installation as possible, such that I can close known vulnerabilities as soon as possible."

Does that sound reasonable?

Warm regards
Marco


From: cf-dev@... <cf-dev@...> on behalf of Josh Collins <jcollins@...>
Sent: Friday, July 13, 2018 11:39:30 PM
To: cf-dev@...
Subject: Re: [cf-dev] cf-deployment 3.0
 
Hi Marco,

I'm happy to provide more context on the container networking 2.0 reference.
The container networking team submitted a PR to cf-deployment with changes required for them to ship v2.0.
RelInt deferred the container networking team's PR for a few weeks due to competing priorities, including multiple CVE fixes.
During the deferral, a few other PRs were submitted which included breaking changes.
These additional changes took much more time to integrate and validate than anticipated and, in the end, the container networking team's 2.0 release was published in cf-d about 5 weeks after it was ready to go.
The introduction of a regular cadence aims to mitigate this type of delay in the future. Had we had one at the time, the networking team would have timed its PR to align and we would have been poised to accept and publish it quickly.
We believe this will help teams confidently plan for, communicate about, and negotiate integrating their releases into cf-deployment.
And hopefully it will enable the RelInt team to integrate and ship major releases more seamlessly.

This is an evolving process, so we'll see how things roll in the coming months and make adjustments where it makes sense to do so.
I appreciate and welcome any and all feedback along the way.

Thanks very much,

Josh


Re: Proposal for weighted routing user experience in Cloud Foundry

Filip Hanik
 

aaarrgh, there is a bug in my psuedo code

clusterWeight = [v1,v1,v1,v1,v1,v1,v2,v2,v2,v3,v3] should be
clusterWeight = [v1,v1,v1,v1,v1,v1,v2,v2,v2,v3,v4]

Full solution:
Implementation: "Randomized Round Robin" is also super simple [pseudo code follows]

clusterWeight = [v1,v1,v1,v1,v1,v1,v2,v2,v2,v3,v4] // very easy to create based on the base1 solution
randomCluster = random(clusterWeight) // shuffle the order once, up front
int atomicPointer = 0;
for each request:
  next = atomicPointer.getAndIncrease();
  application = randomCluster[next % randomCluster.length]; // use the fetched value and wrap around at the end
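
For anyone who wants to try the idea, here is a minimal runnable sketch of the randomized round robin in Python. It is only an illustration of the pseudo code above, not router code: the class name, the `seed` parameter, and the use of a lock in place of an atomic counter are all assumptions. The index is taken modulo the list length so that the shuffled list is reused after every 11 requests.

```python
import itertools
import random
import threading


class RandomizedRoundRobin:
    """Cycle through a shuffled, weight-expanded list of backends (hypothetical helper)."""

    def __init__(self, weights, seed=None):
        # Expand {"v1": 6, "v2": 3, ...} into one slot per unit of weight.
        slots = [name for name, weight in weights.items() for _ in range(weight)]
        if not slots:
            raise ValueError("at least one backend needs a positive weight")
        # Shuffle once, up front, to break up the runs of identical backends.
        random.Random(seed).shuffle(slots)
        self._slots = slots
        self._counter = itertools.count()
        self._lock = threading.Lock()

    def pick(self):
        # The lock stands in for the atomic get-and-increment of the pseudo code.
        with self._lock:
            index = next(self._counter)
        # Wrap around so the same shuffled list serves every cycle of requests.
        return self._slots[index % len(self._slots)]


weights = {"reviews-v1": 6, "reviews-v2": 3, "reviews-v3": 1, "reviews-v4": 1}
router = RandomizedRoundRobin(weights, seed=42)
picks = [router.pick() for _ in range(22)]
```

Because 22 requests are exactly two passes over the 11 slots, `picks` contains reviews-v1 12 times, reviews-v2 6 times, and reviews-v3 and reviews-v4 twice each, whatever the shuffle order turns out to be.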


On Fri, Jul 13, 2018 at 8:09 PM Filip Hanik <fhanik@...> wrote:
I put a long comment in the doc, but maybe comments are better for short notes. Here is the spiel:

"The sum of weights must add to 100"
I would say this is where being user friendly ends. If I add reviews-v4 I have to go in and rebalance the whole thing just to figure out how to get to 100.

An alternative solution can be much simpler:

What if you just used a single integer that is relative to the whole cluster? Let's call it "base1-lb".

reviews-v1: 6
reviews-v2: 3
reviews-v3: 1

There are two ways to think of this.

Relative to each other:
In this scenario, v1 gets twice as many requests as v2, and six times as many requests as v3.

Or in terms of X requests (this is most likely how the code would implement it, so that it doesn't have to do a lot of math):
This is saying that for every 10 requests in total, this is how they are distributed.

To add v4:
reviews-v1: 6
reviews-v2: 3
reviews-v3: 1
reviews-v4: 1

This is still super simple to look at. v1 gets 6x as many requests as v3 or v4, and still gets 2x as many as v2. I don't have to figure out how to "add up to 100".

And it's not complicated to calculate either. For every 11 requests:
v1 gets 6
v2 gets 3
v3 gets 1
v4 gets 1

Implementation: "Randomized Round Robin" is also super simple [pseudo code follows]

clusterWeight = [v1,v1,v1,v1,v1,v1,v2,v2,v2,v3,v3] //very easy to create based on base1 solution
randomCluster = random(clusterWeight) 
int atomicPointer = 0;
for each request:
  next = atomicPointer.getAndIncrease();
  application = randomCluster[next % randomCluster.length];

And that's it. The router doesn't have to figure out where the next request goes. This is a simple, elegant, and easy-to-understand solution.

Filip

On Fri, Jul 13, 2018 at 3:09 PM Shubha Anjur Tupil <sanjurtupil@...> wrote:

The CF Routing team has received feedback from many users that support for weighted routing would make it easier to accomplish their goals. We have a proposal on the preferred user experience for weighted routing and the considerations we have taken into account.


If you have thoughts on this or have experience working with traffic splitting on other platforms, please share your feedback with us. Feel free to comment on the doc or reply here.


Regards,

CF Routing Team




Re: Proposal for weighted routing user experience in Cloud Foundry

Filip Hanik
 

I put a long comment in the doc, but maybe comments are better for short notes. Here is the spiel:

"The sum of weights must add to 100"
I would say this is where being user friendly ends. If I add reviews-v4 I have to go in and rebalance the whole thing just to figure out how to get to 100.

An alternative solution can be much simpler:

What if you just used a single integer that is relative to the whole cluster? Let's call it "base1-lb".

reviews-v1: 6
reviews-v2: 3
reviews-v3: 1

There are two ways to think of this.

Relative to each other:
In this scenario, v1 gets twice as many requests as v2, and six times as many requests as v3.

Or in terms of X requests (this is most likely how the code would implement it, so that it doesn't have to do a lot of math):
This is saying that for every 10 requests in total, this is how they are distributed.

To add v4:
reviews-v1: 6
reviews-v2: 3
reviews-v3: 1
reviews-v4: 1

This is still super simple to look at. v1 gets 6x as many requests as v3 or v4, and still gets 2x as many as v2. I don't have to figure out how to "add up to 100".

And it's not complicated to calculate either. For every 11 requests:
v1 gets 6
v2 gets 3
v3 gets 1
v4 gets 1
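
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the relative-weight idea: the `shares` helper is a hypothetical name, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction


def shares(weights):
    """Return each backend's fraction of traffic for relative integer weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {name: Fraction(weight, total) for name, weight in weights.items()}


before = shares({"reviews-v1": 6, "reviews-v2": 3, "reviews-v3": 1})
after = shares({"reviews-v1": 6, "reviews-v2": 3, "reviews-v3": 1, "reviews-v4": 1})

# Adding v4 changes every absolute share (6/10 becomes 6/11 for v1), but the
# ratios between the existing versions stay the same, so nothing has to be
# rebalanced by hand.
ratio_before = before["reviews-v1"] / before["reviews-v2"]
ratio_after = after["reviews-v1"] / after["reviews-v2"]
```

Both ratios come out as 2, which is the point of the proposal: a new version is added with its own weight and the existing weights are left alone.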

Implementation: "Randomized Round Robin" is also super simple [pseudo code follows]

clusterWeight = [v1,v1,v1,v1,v1,v1,v2,v2,v2,v3,v3] //very easy to create based on base1 solution
randomCluster = random(clusterWeight) 
int atomicPointer = 0;
for each request:
  next = atomicPointer.getAndIncrease();
  application = randomCluster[next % randomCluster.length];

And that's it. The router doesn't have to figure out where the next request goes. This is a simple, elegant, and easy-to-understand solution.

Filip

On Fri, Jul 13, 2018 at 3:09 PM Shubha Anjur Tupil <sanjurtupil@...> wrote:

The CF Routing team has received feedback from many users that support for weighted routing would make it easier to accomplish their goals. We have a proposal on the preferred user experience for weighted routing and the considerations we have taken into account.


If you have thoughts on this or have experience working with traffic splitting on other platforms, please share your feedback with us. Feel free to comment on the doc or reply here.


Regards,

CF Routing Team




Proposal for weighted routing user experience in Cloud Foundry

Shubha Anjur Tupil
 

The CF Routing team has received feedback from many users that support for weighted routing would make it easier to accomplish their goals. We have a proposal on the preferred user experience for weighted routing and the considerations we have taken into account.


If you have thoughts on this or have experience working with traffic splitting on other platforms, please share your feedback with us. Feel free to comment on the doc or reply here.


Regards,

CF Routing Team




Re: cf-deployment 3.0

Josh Collins
 

Hi Marco,

I'm happy to provide more context on the container networking 2.0 reference.
The container networking team submitted a PR to cf-deployment with changes required for them to ship v2.0. 
RelInt deferred the container networking team's PR for a few weeks due to competing priorities, including multiple CVE fixes.
During the deferral time, a few other PRs were submitted which included breaking changes.
These additional changes took much more time to integrate and validate than anticipated and in the end, the container networking team's 2.0 release was published in cf-d about 5 weeks after it was ready to go.
The introduction of a regular cadence aims to mitigate this type of delay in the future. Had we had one at the time, the networking team would have timed its PR to align, and we would have been poised to accept and publish it quickly.
We believe this will help teams confidently plan for, communicate about, and negotiate integrating their releases into cf-deployment.
We also hope it will enable the RelInt team to integrate and ship major releases more seamlessly.

This is an evolving process so we'll see how things roll in the coming months and make adjustments where it makes sense to do so. 
I appreciate and welcome any and all feedback along the way.

Thanks very much,

Josh


Re: Deprecate route-sync from CFCR to CFAR

Shannon Coen
 

That issue could be addressed by having CFCR use a different router group, which is part of the solution we have proposed here: https://docs.google.com/document/d/1RXu-o44zxwrU5gKqsghT86hXKwgPrPpSk6-TWSTlrBs/edit

Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.


On Fri, Jul 13, 2018 at 11:47 AM Gabriel Rosenhouse <grosenhouse@...> wrote:
Also: I suspect that the CFCR route-sync feature has a dangerous interaction with CFAR Cloud Controller, if both CFCR and CFAR are sharing a TCP Routing API.  CFAR Cloud Controller creates and uses a TCP Router Group for itself, and expects to completely own that router group.  My reading of the CFCR code is that route-sync will happily discover and use that Router Group as-is.  The CFAR Routing API has no mechanism to prevent this collision, or to prevent the two clients from reserving the same TCP port for different backends.  I think that the result will be that ingress to that TCP Router Port will get load balanced to both the CFAR App and the CFCR Service.  This is likely not what the user intends.

On Fri, Jul 13, 2018 at 4:14 AM, arghya sadhu <arghya88@...> wrote:
Hi Oleksandr,

What alternatives do we have if we want to use kubectl with TLS?

Thanks,
Arghya

On Fri, Jul 13, 2018, 3:01 PM <oslynko@...> wrote:
Hi, cf-dev

Almost one year ago, CFCR added the ability to expose applications using the CFAR gorouter. This was an experiment.
We haven't made any changes to this feature for a year and plan to remove it in the next release. Removing it will greatly reduce the burden on the team.

If someone uses it, please contact us via email or Slack (#cfcr).

Thanks,
Oleksandr



Re: Deprecate route-sync from CFCR to CFAR

Gabriel Rosenhouse <grosenhouse@...>
 

Also: I suspect that the CFCR route-sync feature has a dangerous interaction with CFAR Cloud Controller, if both CFCR and CFAR are sharing a TCP Routing API.  CFAR Cloud Controller creates and uses a TCP Router Group for itself, and expects to completely own that router group.  My reading of the CFCR code is that route-sync will happily discover and use that Router Group as-is.  The CFAR Routing API has no mechanism to prevent this collision, or to prevent the two clients from reserving the same TCP port for different backends.  I think that the result will be that ingress to that TCP Router Port will get load balanced to both the CFAR App and the CFCR Service.  This is likely not what the user intends.
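
To make the suspected collision concrete, here is a toy in-memory model in Python. It is not the Routing API and does not call it. The `RouterGroup` class, the group name, the client names, the port, and the backend addresses are all made up for illustration. It only shows what happens when two clients register backends on the same port of a shared group and nothing checks ownership.

```python
class RouterGroup:
    """Toy stand-in for a shared TCP router group; not the real Routing API."""

    def __init__(self, name):
        self.name = name
        self.backends_by_port = {}

    def register(self, client, port, backend):
        # Nothing checks which client owns the port, so registrations pile up.
        self.backends_by_port.setdefault(port, []).append((client, backend))

    def contested_ports(self):
        # A port is contested when more than one client has a backend on it.
        return {
            port: entries
            for port, entries in self.backends_by_port.items()
            if len({client for client, _ in entries}) > 1
        }


shared = RouterGroup("default-tcp")  # hypothetical group name
shared.register("cfar-cloud-controller", 1024, "10.0.1.5:61001")  # a CFAR app
shared.register("cfcr-route-sync", 1024, "10.0.2.9:30080")  # a CFCR service
contested = shared.contested_ports()
```

In this model port 1024 ends up with two unrelated backends behind it, which corresponds to ingress on that port being load balanced across both the CFAR app and the CFCR service.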

On Fri, Jul 13, 2018 at 4:14 AM, arghya sadhu <arghya88@...> wrote:
Hi Oleksandr,

What alternatives do we have if we want to use kubectl with TLS?

Thanks,
Arghya

On Fri, Jul 13, 2018, 3:01 PM <oslynko@...> wrote:
Hi, cf-dev

Almost one year ago, CFCR added the ability to expose applications using the CFAR gorouter. This was an experiment.
We haven't made any changes to this feature for a year and plan to remove it in the next release. Removing it will greatly reduce the burden on the team.

If someone uses it, please contact us via email or Slack (#cfcr).

Thanks,
Oleksandr



Re: Deprecate route-sync from CFCR to CFAR

arghya88@...
 

Hi Oleksandr,

What alternatives do we have if we want to use kubectl with TLS?

Thanks,
Arghya

On Fri, Jul 13, 2018, 3:01 PM <oslynko@...> wrote:
Hi, cf-dev

Almost one year ago, CFCR added the ability to expose applications using the CFAR gorouter. This was an experiment.
We haven't made any changes to this feature for a year and plan to remove it in the next release. Removing it will greatly reduce the burden on the team.

If someone uses it, please contact us via email or Slack (#cfcr).

Thanks,
Oleksandr


Deprecate route-sync from CFCR to CFAR

Oleksandr Slynko
 

Hi, cf-dev

Almost one year ago, CFCR added the ability to expose applications using the CFAR gorouter. This was an experiment.
We haven't made any changes to this feature for a year and plan to remove it in the next release. Removing it will greatly reduce the burden on the team.

If someone uses it, please contact us via email or Slack (#cfcr).

Thanks,
Oleksandr


Feature Narrative / Proposal: Let's fix* CPU Sharing and Metrics in CF!

Julz Friedman
 

Hi cf-dev-

Here is a feature narrative. The feature narrative is called "Let's Fix CPU Sharing and Metrics in CF" (but actually it's just a proposal to make them quite a lot better). More information about the feature narrative is contained in the feature narrative. Please enjoy the feature narrative.

Comments, feedback, suggestions, and questions very welcome and appreciated!


Thanks,
Julz