
Re: Doppler zoning query

James Bayer
 

ok thanks for the extra detail.

to confirm, during the load test, the http traffic is being routed through
zones 4 and 5 app instances on DEAs in a balanced way. however the dopplers
associated with zone 4 / 5 are getting a very small amount of load sent
their way. is that right?

On Sun, May 24, 2015 at 3:45 PM, john mcteague <john.mcteague(a)gmail.com>
wrote:

I am seeing logs from zones 4 and 5 when tailing the logs (*cf logs
hello-world | grep App | awk '{ print $2 }'*), and I see a relatively even
balance between all app instances. Yet the dopplers in zones 1-3 consume far
more CPU (15x in some cases) than those in zones 4 and 5. Generally the
dopplers in zones 4 and 5 barely get above 1% utilization.

Running *cf curl /v2/apps/guid/stats | grep host | sort* shows 30
instances, 6 in each zone, a perfect balance.
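
The per-zone count described above can also be scripted. The following is a rough Python sketch, assuming the stats response is a JSON object keyed by instance index, with each entry carrying the DEA address under "stats" and "host". The sample data, the IP addresses and the host-to-zone map are all made up for illustration; a real map would come from your own deployment manifest.

```python
import json
from collections import Counter


def instances_per_zone(stats, host_to_zone):
    """Count app instances per zone from a parsed stats response.

    stats: dict keyed by instance index (assumed response shape).
    host_to_zone: hypothetical map from DEA IP to zone name.
    """
    counts = Counter()
    for entry in stats.values():
        host = entry.get("stats", {}).get("host")
        # Hosts missing from the map (or entries without stats) are grouped as "unknown".
        counts[host_to_zone.get(host, "unknown")] += 1
    return dict(counts)


# Made-up sample standing in for the output of "cf curl /v2/apps/<guid>/stats".
sample = json.loads("""
{
  "0": {"state": "RUNNING", "stats": {"host": "10.0.1.5"}},
  "1": {"state": "RUNNING", "stats": {"host": "10.0.2.5"}},
  "2": {"state": "RUNNING", "stats": {"host": "10.0.1.5"}}
}
""")
zones = {"10.0.1.5": "z1", "10.0.2.5": "z2"}

print(instances_per_zone(sample, zones))
```

Piping the real "cf curl" output into json.load would replace the inline sample.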

Each loggregator is running with 8GB RAM and 4 vCPUs.


John

On Sat, May 23, 2015 at 11:56 PM, James Bayer <jbayer(a)pivotal.io> wrote:

john,

can you say more about "receiving no load at all"? for example, if you
restart one of the app instances in zone 4 or zone 5 do you see logs with
"cf logs"? you can target a single app instance index for a restart by
using a "cf curl" command that terminates an app index [1]. you can find
the details in the json output from "cf stats", which should show you the
private IPs for the DEAs hosting your app and help you figure out
which zone each app index is in.

[1] http://apidocs.cloudfoundry.org/209/apps/terminate_the_running_app_instance_at_the_given_index.html
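
For reference, the terminate call in [1] is a DELETE on /v2/apps/:guid/instances/:index, which "cf curl" can issue with -X DELETE. The Python sketch below only builds that command line; the helper names and the GUID are placeholders for illustration, and nothing is executed unless terminate_instance is called on a machine with a logged-in cf CLI.

```python
import subprocess


def terminate_instance_cmd(app_guid, index):
    """Build the cf CLI argument list that terminates one app instance."""
    return ["cf", "curl", f"/v2/apps/{app_guid}/instances/{index}", "-X", "DELETE"]


def terminate_instance(app_guid, index):
    """Run the command; requires the cf CLI to be installed and targeted."""
    subprocess.run(terminate_instance_cmd(app_guid, index), check=True)


# Placeholder GUID; the real one would come from "cf app hello-world --guid".
print(" ".join(terminate_instance_cmd("APP-GUID", 3)))
```

After the instance comes back, "cf logs" should show its startup output if logs from that zone are flowing.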

if you are seeing logs from zone 4 and zone 5, then what might be
happening is that for some reason DEAs in zone 4 or zone 5 are not routable
somewhere along the path. reasons for that could be:
* DEAs in Zone 4 / Zone 5 not getting apps that are hosted there listed
in the routing table
* The routing table may be correct, but for some reason the routers
cannot reach DEAs in zone 4 or zone 5 with outbound traffic, so they
fail over to instances on DEAs in zones 1-3 that they can reach
* some other mystery

On Fri, May 22, 2015 at 2:06 PM, john mcteague <john.mcteague(a)gmail.com>
wrote:

We map our DEAs, dopplers and traffic controllers into 5 logical zones
using the various zone properties of doppler, metron_agent and
traffic_controller. This aligns with our physical failure domains in
OpenStack.

During a recent load test we discovered that zones 4 and 5 were
receiving no load at all; all traffic went to zones 1-3.

What would cause this unbalanced distribution? I have a single app
running 30 instances and have verified it is evenly balanced across all 5
zones (6 instances in each). I have additionally verified that each logical
zone in the bosh yml contains 1 dea, doppler server and traffic controller.

Thanks,
John

_______________________________________________
cf-dev mailing list
cf-dev(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev


--
Thank you,

James Bayer


scheduler

Corentin Dupont <corentin.dupont@...>
 

Hi guys,
just to know, is there a project to add a job scheduler in Cloud Foundry?
I'm thinking of something like the Heroku scheduler (
https://devcenter.heroku.com/articles/scheduler).
That would be very neat to have regular tasks triggered...
Thanks,
Corentin


--

Corentin Dupont
Researcher @ Create-Net
www.corentindupont.info


Re: scheduler

Corentin Dupont <cdupont@...>
 

To complete my request, I'm thinking of something like this in the
manifest.yml:

applications:
- name: virusscan
  memory: 512M
  instances: 1
  schedule:
  - startFrom : a date
    endBefore : a date
    walltime : a duration
    precedence : other application name
    moldable : true/false

What do you think?


Re: scheduler

James Bayer
 

there is ongoing work to support process types using buildpacks, so that
the same application codebase could be used for multiple different types of
processes (web, worker, etc).

once process types and diego tasks are fully available, we expect to
implement a user-facing api for running batch jobs as application processes.

what people do today is run a long-running process application which uses
something like quartz scheduler [1] or ruby clock with a worker system like
resque [2]

[1] http://quartz-scheduler.org/
[2] https://github.com/resque/resque-scheduler
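
The "long-running process that triggers work on a timer" pattern mentioned above can be sketched in a few lines. This is not Quartz or resque-scheduler; it is a plain Python stand-in using only the standard library, and the function names (next_run, run_daily) are invented for illustration.

```python
import time
from datetime import datetime, timedelta


def next_run(after, hour, minute=0):
    """Return the first datetime strictly after `after` that falls on hour:minute."""
    candidate = after.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= after:
        candidate += timedelta(days=1)
    return candidate


def run_daily(job, hour, minute=0):
    """Block forever, calling job() once a day at hour:minute (local time)."""
    while True:
        target = next_run(datetime.now(), hour, minute)
        # Guard against a negative delay if the clock passes the target meanwhile.
        time.sleep(max(0.0, (target - datetime.now()).total_seconds()))
        job()


# Example: the next 12:00 after the timestamp of the original post.
print(next_run(datetime(2015, 5, 25, 11, 21), 12))
```

An app pushed with one instance would call run_daily(job, 12) from its start command. With more than one instance the job would fire once per instance, which is one reason people pair the clock with a separate worker queue.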


--
Thank you,

James Bayer


Re: Doppler zoning query

john mcteague <john.mcteague@...>
 

Correct, thanks.



Re: Doppler zoning query

Erik Jasiak <ejasiak@...>
 

Hi John,

I'll be working on this with engineering in the morning; thanks for the
details thus far.

This is puzzling: Metrons do not route traffic to dopplers outside
their zone today. If all your app instances are spread evenly, and all are
serving an equal number of requests, then I would expect no
major variability in Doppler load either.

For completeness, what version of CF are you running? I assume your
configurations for all dopplers are roughly the same? All app instances per
AZ are serving an equal number of requests?

Thanks,
Erik Jasiak



Re: CVE-2015-1834 CC Path Traversal vulnerability

Noburou TANIGUCHI
 

I understand the CFF strongly recommends upgrading to v208 or later, but for
those (including us) who cannot upgrade immediately, I want to know if there
is a workaround for this vulnerability.

I've found a commit that seems related to this vulnerability:
https://github.com/cloudfoundry/cloud_controller_ng/commit/5257a8af6990e71cd1e34ae8978dfe4773b32826

Would cherry-picking this commit work as a workaround, or do we need to
cherry-pick other commits as well?

Thanks in advance.





--
View this message in context: http://cf-dev.70369.x6.nabble.com/cf-dev-CVE-2015-1834-CC-Path-Traversal-vulnerability-tp163p173.html
Sent from the CF Dev mailing list archive at Nabble.com.


Re: CVE-2015-1834 CC Path Traversal vulnerability

Dieu Cao <dcao@...>
 

Yes, that's the correct commit to cherry pick for the cc path traversal
vulnerability.

-Dieu
CF Runtime PM



Re: scheduler

Corentin Dupont <cdupont@...>
 

Hi James, thanks for the answer!
We are interested in implementing a job scheduler for CF. Do you think this
would be interesting to have?

We are working on a project called DC4Cities (http://www.dc4cities.eu), where
the objective is to make data centres use more renewable energy.
We want to use PaaS frameworks such as Cloud Foundry to achieve this goal.
The idea is to schedule some PaaS tasks for the times when more
renewable energy is available (when the sun is shining).

That's why I had the idea of implementing a job scheduler for batch jobs in
CF. For example, one could state "I need this task to run for 2
hours per day" and the scheduler could choose when to run it.

Another possibility is to have application-oriented SLAs implemented at the
CF level. For example, if some KPIs of the application are getting too low, CF
would spin up a new container. If the SLA is defined with some flexibility,
it could also be used to schedule work around renewable energy. For example,
in our trial scenarios we have an application that converts images. Its SLA
says that it needs to convert 1000 images per day, but you are free to produce
them when you want, i.e. when renewable energy is available...
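
To make the "2 hours per day, scheduler chooses when" example concrete, here is a small Python sketch. It assumes an hourly forecast of the renewable share is available and that the task may be split across non-adjacent hours. Both are assumptions for illustration, as are the function name pick_hours and the forecast values; none of this is an existing CF or DC4Cities API.

```python
def pick_hours(forecast, hours_needed):
    """Choose the hours of the day with the highest forecast renewable share.

    forecast: list of 24 floats, one per hour (hypothetical input).
    Returns the chosen hours in chronological order.
    """
    ranked = sorted(range(len(forecast)), key=lambda h: forecast[h], reverse=True)
    return sorted(ranked[:hours_needed])


# Made-up forecast: renewable share peaks around midday.
forecast = [0.0] * 24
forecast[11] = 0.5
forecast[12] = 0.9
forecast[13] = 0.8

print(pick_hours(forecast, 2))
```

A task that cannot be paused and resumed would instead need the best contiguous window, which is a different selection rule.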


Re: CVE-2015-1834 CC Path Traversal vulnerability

Noburou TANIGUCHI
 

Thank you for the quick response, Dieu!



--
View this message in context: http://cf-dev.70369.x6.nabble.com/cf-dev-CVE-2015-1834-CC-Path-Traversal-vulnerability-tp163p176.html
Sent from the CF Dev mailing list archive at Nabble.com.


Re: List Reply-To behavior

Chip Childers <cchilders@...>
 

I've asked the admin team to make this adjustment. Thanks for pointing this
out!

Chip Childers | Technology Chief of Staff | Cloud Foundry Foundation

On Fri, May 22, 2015 at 10:06 AM, James Bayer <jbayer(a)pivotal.io> wrote:

yes, this has affected me

On Fri, May 22, 2015 at 4:33 AM, Daniel Mikusa <dmikusa(a)pivotal.io> wrote:



On Fri, May 22, 2015 at 6:22 AM, Matthew Sykes <matthew.sykes(a)gmail.com>
wrote:

The vcap-dev list used to use a Reply-To header pointing back to the
list such that replying to a post would automatically go back to the list.
The current mailman configuration for cf-dev does not set a Reply-To header
and the default behavior is to reply to the author.

While I understand the pros and cons of setting the Reply-To header,
this new behavior has bitten me several times and I've found myself
re-posting a response to the list instead of just the author.

I'm interested in knowing if anyone else has been bitten by this
behavior and would like a Reply-To header added back...

+1 and +1

Dan



Thanks.

--
Matthew Sykes
matthew.sykes(a)gmail.com


--
Thank you,

James Bayer


Re: scheduler

James Bayer
 

those are exciting use cases, thank you for sharing the background!

--
Thank you,

James Bayer


Re: scheduler

Corentin Dupont <corentin.dupont@...>
 

Another question: what is the status of Diego? Is there an expected date
for its release?
Is it usable already?
If I understand correctly, Diego doesn't support cron-like jobs, but will
facilitate them?

--
Corentin Dupont
Researcher @ Create-Net
www.corentindupont.info


Re: scheduler

Onsi Fakhouri <ofakhouri@...>
 

Diego is very much usable at this point and we're encouraging beta testers to start putting workloads on it. Check out github.com/cloudfoundry-incubator/diego for all things Diego.

Diego supports one off tasks. It's up to the consumer (e.g. Cloud Controller) to submit the tasks when they want them run. We'd like to bubble this functionality up to the CC but it's not a very high priority at the moment.

Onsi

Sent from my iPad



Re: Doppler zoning query

john mcteague <john.mcteague@...>
 

We are using cf v204 and all loggregators are the same size and config
(other than zone).

The distribution of requests across app instances is fairly even as far as
I can see.

John.

listed in the routing table
* The routing table may be correct, but for some reason the routers
cannot reach DEAs in zone 4 or zone 5 with outbound traffic and routers
fails over to instances in DEAs 1-3 that it can reach
* some other mystery

On Fri, May 22, 2015 at 2:06 PM, john mcteague <
john.mcteague(a)gmail.com> wrote:

We map our DEAs, dopplers and traffic controllers into 5 logical
zones using the various zone properties of doppler, metron_agent and
traffic_controller. This aligns to our physical failure domains in
openstack.

During a recent load test we discovered that zones 4 and 5 were
receiving no load at all; all traffic went to zones 1-3.

What would cause this unbalanced distribution? I have a single app
running 30 instances and have verified it is evenly balanced across all 5
zones (6 instances in each). I have additionally verified that each logical
zone in the bosh yml contains 1 dea, doppler server and traffic controller.

Thanks,
John



Re: Doppler zoning query

John Tuley <jtuley@...>
 

John,

Can you verify (on, say, one runner in each of your zones) that Metron's
local configuration has the correct zone? (Look in
/var/vcap/jobs/metron_agent/config/metron.json.)

Can you also verify the same for the Doppler servers
(/var/vcap/jobs/doppler/config/doppler.json)?

And then can you please verify that etcd is being updated correctly? (curl
*$ETCD_URL*/api/v2/keys/healthstatus/doppler/?recursive=true with the
correct ETCD_URL - the output should contain entries with the correct IP
address of each of your dopplers, under the correct zone.)

If all of those check out, then please send me the logs from the affected
Doppler servers and I'll take a look.
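The etcd check above can be sketched in Python. This is only an illustration, assuming the standard etcd v2 recursive listing shape (nested "node"/"nodes" entries with "key" and "value") and assuming keys are laid out as /healthstatus/doppler/<zone>/<job>/<index>; the key layout, the function name and the sample IPs are hypothetical, so check them against your own curl output.

```python
from collections import defaultdict

def dopplers_by_zone(etcd_response, prefix="/healthstatus/doppler"):
    """Group leaf values from an etcd v2 recursive listing by zone.

    Assumes (hypothetically) keys of the form <prefix>/<zone>/<job>/<index>.
    """
    zones = defaultdict(list)

    def walk(node):
        if node.get("dir"):
            for child in node.get("nodes", []):
                walk(child)
            return
        key = node.get("key", "")
        if not key.startswith(prefix):
            return
        # First path segment after the prefix is taken to be the zone name.
        zone = key[len(prefix):].strip("/").split("/")[0]
        zones[zone].append(node.get("value"))

    walk(etcd_response["node"])
    return dict(zones)

# Made-up sample in the etcd v2 response shape, not real cluster data.
sample = {
    "action": "get",
    "node": {"key": "/healthstatus/doppler", "dir": True, "nodes": [
        {"key": "/healthstatus/doppler/z1", "dir": True, "nodes": [
            {"key": "/healthstatus/doppler/z1/doppler_z1", "dir": True, "nodes": [
                {"key": "/healthstatus/doppler/z1/doppler_z1/0",
                 "value": "10.0.1.10"}]}]},
        {"key": "/healthstatus/doppler/z4", "dir": True, "nodes": [
            {"key": "/healthstatus/doppler/z4/doppler_z4", "dir": True, "nodes": [
                {"key": "/healthstatus/doppler/z4/doppler_z4/0",
                 "value": "10.0.4.10"}]}]},
    ]},
}

by_zone = dopplers_by_zone(sample)
for zone, ips in sorted(by_zone.items()):
    print(zone, ips)
```

With five zones you would expect five entries, each listing exactly the doppler IP that belongs to that zone.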

– John Tuley



Re: Doppler zoning query

john mcteague <john.mcteague@...>
 

- From etcd I see 5 unique entries, all 5 doppler hosts are listed with
the correct zone
- All metron_agent.json files list the correct zone name
- All doppler.json files also contain the correct zone name


All 5 doppler servers contain the following two errors, in varying amounts.

{"timestamp":1432671780.232883453,"process_id":1422,"source":"doppler","log_level":"error","message":"AppStoreWatcher: Got error while waiting for ETCD events: store request timed out","data":null,"file":"/var/vcap/data/compile/doppler/loggregator/src/github.com/cloudfoundry/loggregatorlib/store/app_service_store_watcher.go","line":78,"method":"github.com/cloudfoundry/loggregatorlib/store.(*AppServiceStoreWatcher).Run"}

{"timestamp":1432649819.481923819,"process_id":1441,"source":"doppler","log_level":"warn","message":"TB: Output channel too full. Dropped 100 messages for app f744c900-d82d-4efc-bbe4-004e94ffdfec.","data":null,"file":"/var/vcap/data/compile/doppler/loggregator/src/doppler/truncatingbuffer/truncating_buffer.go","line":65,"method":"doppler/truncatingbuffer.(*TruncatingBuffer).Run"}

For the latter, given the high log rate of the test app, it suggests I need
to tune the buffer of doppler, but I don't expect this to be the cause of my
CPU imbalance.
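To see how the "Dropped N messages" warnings are spread across apps and dopplers, something like the following could tally them. It is a rough sketch, assuming each log entry sits on a single line as JSON in the doppler log file (rather than wrapped as in this email) and that the message text always has the form shown above; the function name is made up for illustration.

```python
import json
import re
from collections import Counter

# Matches the truncating-buffer warning text quoted above.
DROPPED = re.compile(r"Dropped (\d+) messages for app ([0-9a-f-]+)")

def dropped_per_app(lines):
    """Sum the dropped-message counts per app GUID from doppler log lines."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # not a single-line JSON entry, skip it
        if not isinstance(entry, dict):
            continue
        match = DROPPED.search(entry.get("message", ""))
        if match:
            totals[match.group(2)] += int(match.group(1))
    return totals

# Two sample entries modelled on the warning above, trimmed to the fields used.
sample_lines = [
    '{"source":"doppler","log_level":"warn","message":"TB: Output channel too '
    'full. Dropped 100 messages for app f744c900-d82d-4efc-bbe4-004e94ffdfec."}',
    '{"source":"doppler","log_level":"warn","message":"TB: Output channel too '
    'full. Dropped 100 messages for app f744c900-d82d-4efc-bbe4-004e94ffdfec."}',
    "not json at all",
]

totals = dropped_per_app(sample_lines)
for app_guid, count in totals.items():
    print(app_guid, count)
```

Running this once per doppler and comparing the totals would show whether the drops are concentrated in zones 1-3 or spread evenly.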



Re: Doppler zoning query

John Tuley <jtuley@...>
 

I also don't expect that to be the source of your CPU imbalance.

Well, the bad news is that the easy stuff checks out, so I have no idea
what's actually wrong. I'll keep suggesting diagnostics, but I don't have a
silver bullet for you.

Do you have a collector wired up in your deployment? If so, I'd take a look
at the metrics `MetronAgent.dropsondeAgentListener.receivedMessageCount`
across each of the runners, and
`DopplerServer.dropsondeListener.receivedMessageCount` across each of the
dopplers. That should give you a better idea of the number of log messages
that *should* be sent to each Doppler (the first metric) and that *are*
received and processed (the second metric).

If the metron numbers are high (as you expect), but the doppler numbers are
low, then there's probably something wrong with those doppler instances. If
the metron numbers are low, then there might be something wrong with metron
on the runners, or with the DEA logging agent. Or, maybe the app instances
in those zones just aren't logging much (which seems the least likely
explanation so far).
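The comparison described above can be written down as a small decision rule. This is a sketch only: it assumes you have already summed the metron counts per zone, treats "low" as less than half the median across zones (an arbitrary threshold chosen for illustration), and uses made-up numbers rather than anything measured in this thread.

```python
from statistics import median

def diagnose(metron_counts, doppler_counts, low_ratio=0.5):
    """Classify each zone from per-zone receivedMessageCount totals.

    "Low" means below low_ratio times the median across zones; the
    threshold is an assumption, not something Loggregator defines.
    """
    metron_mid = median(metron_counts.values())
    doppler_mid = median(doppler_counts.values())
    findings = {}
    for zone in sorted(metron_counts):
        metron_low = metron_counts[zone] < low_ratio * metron_mid
        doppler_low = doppler_counts.get(zone, 0) < low_ratio * doppler_mid
        if metron_low:
            findings[zone] = ("metron low: check metron, the DEA logging "
                              "agent, or app log volume")
        elif doppler_low:
            findings[zone] = "metron ok, doppler low: check the doppler instance"
        else:
            findings[zone] = "ok"
    return findings

# Hypothetical per-zone totals for illustration only.
metron = {"z1": 1000, "z2": 1050, "z3": 990, "z4": 1010, "z5": 995}
doppler = {"z1": 1000, "z2": 1040, "z3": 980, "z4": 100, "z5": 90}

findings = diagnose(metron, doppler)
for zone, verdict in findings.items():
    print(zone, verdict)
```

In this made-up data the metron totals are even while the z4 and z5 dopplers lag, which would point at the doppler instances rather than at the runners.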

– John Tuley



Re: Doppler zoning query

john mcteague <john.mcteague@...>
 

I've only had a brief look. My graphite server does not seem to have the
same set of stats for each DEA and doppler node, but where I can draw a
comparison, the dopplers in zones 2 and 3 are receiving 10x more logs
than zone 4.

The DEA stat for z4 has a receivedMessageCount that is lower than the z4
doppler's receivedMessageCount. I'm not convinced my stats are being sent
correctly; I will investigate and provide further info tomorrow.

Thanks for your help.

John



[IMPORTANT] lucid64 stack removal planned with next final cf-release.

Dieu Cao <dcao@...>
 

Hello all,

As a follow on to the original notice [1], I wanted to clarify that we plan
to remove the lucid64 stack from cf-release with the next final cf-release
that we hope to make available in the next week or so.

Thanks,
Dieu
CF Runtime PM

[1]
https://groups.google.com/a/cloudfoundry.org/d/msg/vcap-dev/gU7rpD8MSC4/9VfzwhUx_CsJ
