Changes to docs have been committed, and will show up with the next push of the docs app.
Shannon Coen Product Manager, Cloud Foundry Pivotal, Inc.
On Tue, Feb 7, 2017 at 11:30 AM, Shannon Coen <scoen(a)pivotal.io> wrote: Hello Thomas,
Comments inline.
On Thu, Feb 2, 2017 at 10:40 PM, Anderer, Thomas <thomas.anderer(a)sap.com> wrote:
Hello everyone,
we already had an internal discussion on the topic and since we could not quite come to a solution, I'd like to ask in this list.
We're using a self-made blue-green-deployment-script which pretty much does what's explained on cloudfoundry.org (https://docs.cloudfoundry.org/devguide/deploy-apps/blue-green.html), and additionally checks (HTTP-GET) the application after push, map-route and unmap-route. This script has been working fine on an instance A of CF (DEA-based and relatively small), but has issues on instance B (Diego-based, larger). The aforementioned HTTP-GET sometimes fails with "404 Not Found: Requested route does not exist" directly after push or map-route. We never experienced this issue on instance A. This could of course mean that the instance is small enough that all router operations are handled faster than our script is able to react. On instance B it sometimes takes up to 1 second or even longer until the route mapping is finally completed.
I am reviewing the logic for how routes are registered for apps on DEAs, but I believe it is also asynchronous. It may be that registration of routes takes slightly longer with Diego, or you're seeing another difference between the environments.
In our internal discussion with our CF operators, we found a couple of parts of the documentation which at least hint at different, maybe even inconsistent, inner workings of the route mapping:
1) https://docs.cloudfoundry.org/devguide/deploy-apps/blue-green.html: Step 3: Map-route - The CF Router immediately begins to load balance traffic for demo-time.example.com between Blue and Green. - In my opinion this implies or at least strongly hints that the map-route call is supposed to be synchronous.
The routing table is indeed updated asynchronously, and we can update the docs to clarify this; "immediately" may be a bit misleading.
2) PUT to "/v2/apps/#{app_guid}/routes/#{route_guid}" returns 201 CREATED and not 202 ACCEPTED, which also implies that it is a synchronous operation. It also does not return an event or operation which could be polled to wait for completion of the route mapping.
3) Operations which show the state of the route, for example 'cf routes', already show the route as being successfully mapped, although 404 is still returned.
Currently Cloud Controller has no way of knowing whether a route is ever registered with a router, whether the app is on DEAs or Diego. We could consider how to provide such a guarantee; e.g. CC could poll routes until an app returns a 200 (several issues with this). Also, if we were to add this feature we couldn't change the response in v2 from 201 to 202 as that would not be backwards compatible; we could consider this for the v3 CC API.
4) See https://docs.cloudfoundry.org/devguide/deploy-apps/routes-domains.html#map-route: Applications running on the DEA architecture must be restarted after routes for an app are mapped or unmapped. Applications running on Diego do not need to be restarted. - This is contrary to what I experienced, since route mapping on our instance A always worked synchronously and without restart of the application. Furthermore, it contradicts what's explained in 1) at least for DEA.
This sentence in the docs is incorrect. We will update them. Restarting an app on DEA is not required after mapping or unmapping routes.
5) https://github.com/cloudfoundry/diego-design-notes#routing-translation-components: Routing Translation Components: Route-Emitter a) monitors DesiredLRP state and ActualLRP state via the BBS. When a change is detected, the Route-Emitter emits route registration and unregistration messages to the gorouter via the NATS message bus, b) periodically emits the entire routing table to the router, c) maintains a lock in consul to ensure only one route-emitter handles route registration at a time. - This hints that route mapping is an asynchronous task.
Some of the issues could be fixed by repeatedly pinging the application in order to wait until the route has been mapped. But what about blue-green deployment, where I map the application to a route on which there is already a running application? Here, I cannot find out in a general way whether the route mapping is completed and when I can start unmapping the route from the old application. If route mapping is indeed an asynchronous task, in my opinion the description on https://docs.cloudfoundry.org/devguide/deploy-apps/blue-green.html is misleading at best. And all blue-green-deployment-scripts which I've seen so far have this issue.
So, is the route mapping really supposed to be asynchronous? If so, is there any general way to find out when the route-mapping process has finished?
Route registration is an asynchronous operation. Other than polling the application, CF does not expose a guaranteed way to discover that the route has been registered with the routers themselves.
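As a rough illustration of "polling the application", here is a minimal Python sketch. The function names, the retry count and the interval are assumptions made for this example; nothing here is part of CF or the cf CLI. It treats a 404 (or no response at all) as "route not registered yet" and any other status as "registered". Note that this only helps for a route that was not previously served: in the blue-green case a non-404 answer may come from the old application, so it does not prove that the new mapping is active.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return err.code
    except OSError:
        # Covers URLError, connection errors and socket timeouts.
        return None


def wait_for_route(url, attempts=20, interval=0.5,
                   fetch=http_status, sleep=time.sleep):
    """Poll url until it answers with something other than 404.

    Returns True once a non-404 status is seen, False if every attempt
    returned 404 or got no response. The defaults for attempts and
    interval are illustrative assumptions, not recommended values.
    """
    for attempt in range(attempts):
        status = fetch(url)
        if status is not None and status != 404:
            return True
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

A script could call `wait_for_route("https://demo-time.example.com/")` after `cf push` or `cf map-route` and fail the deployment if it returns False.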
We recommend adding a short wait to your blue-green deploy script. The blue-green-deploy CLI plugin effectively does this by running a few other commands (renaming apps) between mapping the route to one app and unmapping it from another.
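A short wait between the two route operations could look roughly like the Python sketch below. The function name `cutover`, its parameters and the default of 5 seconds are assumptions for illustration only; the right delay depends on the environment, and a fixed wait is a heuristic, not a guarantee that the routers have picked up the new mapping. The sketch shells out to the `cf map-route` and `cf unmap-route` commands with the `--hostname` option.

```python
import subprocess
import time


def cutover(new_app, old_app, domain, hostname, wait_seconds=5.0,
            run=subprocess.check_call, sleep=time.sleep):
    """Map the route to new_app, wait, then unmap it from old_app.

    wait_seconds is an assumed value; tune it for your environment.
    run and sleep are injectable so the function can be exercised
    without a cf CLI installed.
    """
    # Both apps serve the route after this call has propagated.
    run(["cf", "map-route", new_app, domain, "--hostname", hostname])
    # Give the asynchronous route registration time to reach the routers.
    sleep(wait_seconds)
    # Only now take the old app out of rotation.
    run(["cf", "unmap-route", old_app, domain, "--hostname", hostname])
```

For the example from the docs this would be called as `cutover("Green", "Blue", "example.com", "demo-time")`.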
Thank you for your feedback! I'll submit a PR to update the docs now. Shannon
Thank you for your help and clarification,
Best regards, -- Thomas Anderer Agile Software Engineer andrena objects ag currently working at SAP