On the CAB call Dr. Nic asked about support in the routing tier for
connection draining. I asked him out-of-band to elaborate, then realized
this was a topic the community might be interested in. Nic explained that
he's looking for a TCP router to route requests from apps on CF to a
clustered service, and wants to allow graceful draining of requests before
a backend is moved.

When a backend for a route is removed from the routing table, the TCP
Router will prevent new requests for the route from being routed to that
backend, and will reject requests for the route when all associated
backends are removed. The routing table is updated via the Routing API; the
TCP router fetches its configuration by subscribing to the API via SSE, as
well as through a periodic bulk fetch. When backends are removed for a route,
existing connections remain up until closed by either the client or
backend. We don't currently sever open connections after a timeout.
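
To make the behavior above concrete, here is a toy model in Python. It is only a sketch of the semantics described in this paragraph, not the TCP router's code; the class and method names (RoutingTable, remove_backend, connect) and the addresses are made up for illustration.

```python
class RoutingTable:
    """Toy model of the routing table semantics; not the actual router code."""

    def __init__(self):
        self.backends = {}     # route port -> set of backend addresses
        self.open_conns = []   # (route port, backend) pairs for live connections

    def add_backend(self, port, backend):
        self.backends.setdefault(port, set()).add(backend)

    def remove_backend(self, port, backend):
        # Only the table entry goes away; open connections are left alone
        # until the client or the backend closes them.
        self.backends.get(port, set()).discard(backend)

    def connect(self, port):
        """Route a new connection, or reject it if the route has no backends."""
        candidates = sorted(self.backends.get(port, set()))
        if not candidates:
            raise ConnectionRefusedError(f"no backends for route port {port}")
        backend = candidates[0]  # placeholder for a real load-balancing choice
        self.open_conns.append((port, backend))
        return backend
```

Removing a backend here affects only later connect() calls: entries already in open_conns are untouched, and connect() raises once the last backend for the route is gone.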

In CF, when Diego removes an app instance it sends a TERM signal to the
process in the container, which then has 10s to drain active connections
before the container is torn down and all of its processes are killed. In
parallel, the backend is removed from the route, preventing new connections.
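
From the app's side, using that 10s window could look roughly like the sketch below. This is an assumption about how an app process might handle TERM, not something Diego or the router provides; Drainer and install_sigterm_handler are hypothetical names.

```python
import signal
import threading

DRAIN_SECONDS = 10  # grace period between TERM and teardown, per the text above


class Drainer:
    """Tracks active connections and waits for them to finish on shutdown."""

    def __init__(self):
        self.draining = threading.Event()
        self._active = 0
        self._cond = threading.Condition()

    def start_connection(self):
        """Return False once draining has begun, so no new work is accepted."""
        with self._cond:
            if self.draining.is_set():
                return False
            self._active += 1
            return True

    def finish_connection(self):
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def drain(self, timeout=DRAIN_SECONDS):
        """Stop accepting, then wait; True if all connections ended in time."""
        with self._cond:
            self.draining.set()
            return self._cond.wait_for(lambda: self._active == 0, timeout=timeout)


def install_sigterm_handler(drainer):
    # The handler only sets the flag; the main loop is expected to notice it
    # and call drainer.drain() before exiting.
    signal.signal(signal.SIGTERM, lambda signum, frame: drainer.draining.set())
```

An accept loop would call start_connection() for each new client and finish_connection() when it closes; if drain() returns False, some connections were still open when the grace period ran out.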

Nic:

Does the existing behavior described above meet your needs, or would you
require a timeout and proactive connection severing by the router? I recall
we found this difficult using HAProxy last year, leading us to build the
Switchboard proxy for cf-mysql-release. Have you considered Switchboard?

In your use case could the IPs of your cluster nodes change at any time, or
only on a deploy? In either case, you could use the Routing API to
configure the router with the node addresses (similar to the way clients
must currently register routes via NATS).

Would you expect other clients to register routes with the same deployment
of the API, or would you isolate it to the deployment of your service? The
Routing API, like NATS, doesn't support multi-tenant isolation yet, so
multiple clients could potentially add unrelated backends for the same
route.

Finally, are you only interested in TCP routing? If so, I imagine you would
deploy the routing-release with only the API and TCP router jobs.

Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.