[p1-data-services] feedback request: extracting a common route-registrar job

Jens Deppe <jdeppe@...>

The GemFire service registers HA routes to our dashboard(s). For this to
work correctly, and for the gorouter to honor session stickiness, I
submitted this pull request to natbeat:
https://github.com/cloudfoundry-incubator/natbeat/pull/5
The essence of the fix is:

For an HA backend service (such as a dashboard), requests need to be
sticky. To enable this, I need to set private_instance_id in the
RegistryMessage (the registration message) so that the gorouter does the
right thing and sets a __VCAP_ID__ cookie.
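
To make the shape of that message concrete, here is a minimal sketch in Go.
The struct below is my own illustration, not natbeat's actual type: it
assumes the registration is a JSON body with host, port, uris and
private_instance_id fields, and the host, port, URI and instance ID values
are made up. Publishing to NATS is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RegistryMessage is a hypothetical, trimmed-down registration body.
// Only private_instance_id comes from the discussion above; the other
// fields are assumed for illustration.
type RegistryMessage struct {
	Host              string   `json:"host"`
	Port              int      `json:"port"`
	URIs              []string `json:"uris"`
	PrivateInstanceID string   `json:"private_instance_id,omitempty"`
}

// encode returns the JSON payload that would be published for a registration.
func encode(m RegistryMessage) (string, error) {
	b, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	payload, err := encode(RegistryMessage{
		Host:              "10.0.0.5",                        // made-up backend address
		Port:              8080,                              // made-up backend port
		URIs:              []string{"dashboard.example.com"}, // made-up route
		PrivateInstanceID: "dashboard-0",                     // one stable ID per backend instance
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(payload)
}
```

Without PrivateInstanceID the field is simply omitted from the payload,
which corresponds to the non-sticky behaviour before the fix.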


On Tue, Sep 8, 2015 at 12:53 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hi all,

Several components within cf-release, as well as many jobs in different
releases, register a route with the gorouter:

- *doppler* registers the "doppler" and "loggregator" routes
- the *hm9000* API server registers the "hm9000" route
- *UAA* registers "uaa", "*.uaa", "login", and "*.login" routes
- *CC* registers the "api" route
- Many *service brokers* also register a route.

All these components register their routes in different ways. They also
all use the existing NATS flow, and will all need to switch their
implementations to use the routing API once that goes live and we start to
phase out NATS.

We have been working on extracting a route-registrar job which can be
colocated with other jobs and register routes on their behalf. Currently
it naively advertises the configured routes unconditionally, and relies on
the gorouter knowing not to route requests to addresses that aren't
currently up.

One might require more sophisticated logic than this, however. For
example, a server may be "up" and theoretically capable of handling
requests, but not actually ready yet. Perhaps the route-registrar should
have some contract with its colocated jobs where those jobs can define a
health check script, and the route-registrar will only update the route
registration if the check succeeds.
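
As a rough sketch of what such a contract could look like (not how the
route-registrar works today), the Go snippet below runs a job-supplied
check command and only calls the registration step when the command exits
0 within a timeout. The function names, the 2-second timeout and the
"sh -c 'exit 0'" stand-in check are all assumptions for illustration.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// healthy runs the job-supplied check command and reports whether it
// exited with status 0 before the timeout elapsed.
func healthy(timeout time.Duration, name string, args ...string) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return exec.CommandContext(ctx, name, args...).Run() == nil
}

// registerIfHealthy calls register only when check passes, and reports
// whether the registration step ran.
func registerIfHealthy(check func() bool, register func()) bool {
	if !check() {
		return false
	}
	register()
	return true
}

func main() {
	// Stand-in for a health check script shipped by the colocated job.
	check := func() bool { return healthy(2*time.Second, "sh", "-c", "exit 0") }

	// Stand-in for publishing the route registration.
	register := func() { fmt.Println("publishing route registration") }

	fmt.Println("registered:", registerIfHealthy(check, register))
}
```

In a real job this would run on every registration interval, so a failing
check simply lets the existing registration go stale.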

Another requirement may exist around shutdown behaviour. A job may only
want its routes to stop being registered at a certain point in its drain

*We would like feedback* from anyone maintaining a job or release that
does some sort of route registration, so that we can gather requirements
for a generic route-registration component.

Amit, CF OSS Release Integration team