Re: cloud_controller_ng performance degrades slowly over time


Matt Cholick
 

Zach & Swetha,
Thanks for the suggestion. I'll gather netstat info there next time.

Amit,
The 1:20 delay is due to paging. The total call length for each page is
closer to 10s. I included those two calls, with the paging done by the cf
command line counted in the numbers, to demonstrate the dramatic
difference after a restart. The delays disappear after a restart. We're
not running consul yet, so it wouldn't be that.

-Matt

On Thu, Oct 8, 2015 at 10:03 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

We've seen issues on some environments where requests to cc that involve
cc making a request to uaa or hm9k have a 5s delay while the local consul
agent fails to resolve the DNS for uaa/hm9k, before moving on to a
different resolver.

The expected behavior, observed in almost all environments, is that the DNS
request to the consul agent fails fast and moves on to the next resolver. We
haven't figured out why a couple of envs exhibit different behavior. The
impact is a 5 or 10s delay (5 or 10, not 5 to 10). It doesn't explain your
1:20 delay though. Are you always seeing delays that long?
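
One way to check whether a host shows the slow-resolver behavior is to time
a name lookup from that host. The following is a minimal sketch, not
something taken from cloud_controller_ng: the function name time_lookup and
the hostname uaa.example.com are placeholders, and it assumes the lookup
goes through the system resolver via Python's socket.getaddrinfo.

```python
import socket
import time


def time_lookup(hostname, port=443, resolve=socket.getaddrinfo):
    """Return (succeeded, elapsed_seconds) for one name lookup.

    `resolve` defaults to the system resolver; it is a parameter only so
    the timing logic can be exercised without touching the network.
    """
    start = time.monotonic()
    try:
        resolve(hostname, port)
        succeeded = True
    except socket.gaierror:
        # Resolution failed; the elapsed time is still worth reporting.
        succeeded = False
    return succeeded, time.monotonic() - start


if __name__ == "__main__":
    # "uaa.example.com" is a placeholder; substitute the name cc actually uses.
    ok, elapsed = time_lookup("uaa.example.com")
    print(f"resolved={ok} elapsed={elapsed:.2f}s")
```

A lookup that consistently takes about 5 or 10 seconds would match the
pattern described above, while a fast failure or fast success would point
elsewhere.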

Amit


On Thursday, October 8, 2015, Zach Robinson <zrobinson(a)pivotal.io> wrote:

Hey Matt,

I'm trying to think of other things that would affect only the endpoints
that interact with UAA and would be fixed after a CC restart. I'm
wondering if it's possible there are a large number of connections being
kept-alive, or stuck in a wait state or something. Could you take a look
at the netstat information on the CC and UAA next time this happens?
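
To make the netstat output easier to compare before and after a restart,
the connections can be tallied by TCP state. This is a rough sketch that
assumes the column layout of Linux `netstat -ant` (state in the sixth
column); the function name count_tcp_states and the sample lines are
made up for illustration and are not real output from a CC or UAA host.

```python
from collections import Counter

# Made-up sample in the layout of Linux `netstat -ant`; not real CC/UAA data.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:9022            0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:51234        192.0.2.20:8080         ESTABLISHED
tcp        0      0 192.0.2.10:51240        192.0.2.20:8080         ESTABLISHED
tcp        1      0 192.0.2.10:51101        192.0.2.20:8080         CLOSE_WAIT
tcp        0      0 192.0.2.10:51002        192.0.2.20:8080         TIME_WAIT
"""


def count_tcp_states(netstat_output):
    """Count TCP connections per state from `netstat -ant` style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with tcp/tcp6 and carry the state in column six;
        # header and banner lines do not start with "tcp" and are skipped.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(f"{state:12} {n}")
```

A large and growing count in one state (for example CLOSE_WAIT or
ESTABLISHED toward the UAA address) that drops after a CC restart would
support the idea of connections being held open or stuck.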

-Zach and Swetha
