Re: cloud_controller_ng performance degrades slowly over time

Mike Youngstrom <youngm@...>

We don't have the same performance numbers since we don't use New Relic.
However, for the last 6 months or so after 3-4 weeks our smoke tests begin
failing randomly with 500 errors doing basic things like creating routes.
If we restart the CC(s) then everything works great for another 3-4 weeks.


On Tue, Sep 29, 2015 at 1:08 PM, Matt Cholick <cholick(a)> wrote:

This is a pretty tricky one, as it takes a long time to manifest. After a
while without a restart, cloud_controller_ng take a long time listing org
users. For example, in an org with 350 users, before restart `cf org-users
ORG` took 1:20. After restart, the call took 0:07. We've only notice this
twice, both times after the cloud controller had been running for a couple
weeks without a restart.

Looking in New Relic, the breakdown of an individual call to
organization/guid/managers shows the majority of the call externally:

[image: Inline image 1]

Though uaa itself isn't the issue, as things are fine immediately after
restarting cloud_controller_ng (and restarting uaa has no affect).

Have other Cloud Foundry operators seen this degraded performance? Is
there other information we could provide to turn this into a workable bug

-Matt Cholick

Join { to automatically receive all group messages.