
Re: Can resources of an IDLE application be shared by others?

Benjamin Gandon
 

I suggest you manually run “cf scale -m 2G” after your app has booted.
Type “cf scale --help” for more info.
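For reference, the limit being discussed is the one set at push time, e.g. in the application manifest. A minimal sketch (the app name is hypothetical):

    applications:
    - name: my-app
      memory: 5G      # reserved for the startup spike
      disk_quota: 1G

Running “cf scale -m 2G” afterwards lowers the reservation once the startup spike has passed.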

On 9 Mar 2016, at 04:09, Stanley Shen <meteorping(a)gmail.com> wrote:

Hello, all

When pushing an application to CF, we need to define its disk/memory limits.
The memory limit is just the maximum value the application could possibly need, but most of the time we don't need so much memory.
For example, I have one application which needs at most 5G of memory at startup and for some specific operations, but most of the time it just needs 2G.
So right now I need to specify 5G in the deployment manifest, and 5G of memory is allocated.

Take an m3.large VM, for example: it has 7.5G.
Right now we can only push one application onto it, but ideally we should be able to push more, say 3, since each application only needs 2G.

Can the resources of an IDLE application be shared by other applications?
It seems that right now all the resources are pre-allocated when pushing an application, and they are not released even after I stop the application.


Re: CF deployment with Diego support only?

Benjamin Gandon
 

That's right Amit, but that was just a typo on my part. I meant setting instance counts to zero for “runner_z*” and “hm9000_z*”.

I saw in a-detailed-transition-timeline that these two properties also help:
- cc.default_to_diego_backend=true
- cc.users_can_select_backend=false
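In manifest terms, I assume that means something like this (a sketch of the properties stub; the property paths are the ones named above):

    properties:
      cc:
        default_to_diego_backend: true
        users_can_select_backend: false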

So all in all, is that really all that needs to be done?

/Benjamin

On 10 Mar 2016, at 09:07, Amit Gupta <agupta(a)pivotal.io> wrote:

You need the api jobs, those are the cloud controllers! Set the runner and hm9000 jobs to 0 instances, or even remove them from your deployment manifest altogether.

On Wed, Mar 9, 2016 at 11:39 PM, Benjamin Gandon <benjamin(a)gandon.org> wrote:
Hi cf-dev,

For a fresh new deployment of cf-release, I wonder how the default manifest stubs and templates should be modified to remove unnecessary support for the DEAs in favor of Diego?

Indeed, I’m starting with a working deployment of cf+diego. And now I want to wipe out those ancient DEAs and HM9000 that I don’t need.

I tried to draw inspiration from the MicroPCF main deployment manifest. (Are there any other sources for Diego-only CF deployments BTW?)
At the moment, all I see in this example is that I need to set “instances:” counts to zero for both “api_z*” and “hm9000_z*” jobs.

Is this sufficient? Should I perform some more adaptations?
Thanks for your guidance.

/Benjamin


Re: Update Parallelization in Cloud Foundry

Omar Elazhary <omazhary@...>
 

Thanks everyone. What I understood from Amit's response is that I can parallelize certain components. What I also understood from both Amit's and Dieu's responses is that some components have hard dependencies, while others only have soft ones, and some have no dependencies at all. My question is: how can I figure out these dependencies? Are they listed somewhere? The Cloud Foundry docs do a great job of describing each component separately, but they do not explain which should be up before which. That is what I need in order to work out an execution plan that minimizes update time while keeping CF 100% available.

Thanks.

Regards,
Omar


Re: cf ssh APP_NAME doesn't work in AWS environment

Balamurugan.J@...
 

Hi,

In which file do I have to add the properties below?

After adding the properties below, it works now:

    app_ssh:
      host_key_fingerprint: a6:d1:08:0b:b0:cb:9b:5f:c4:ba:44:2a:97:26:19:8a
      oauth_client_id: ssh-proxy
    cc:
      allow_app_ssh_access: true
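I assume they belong under the global properties block of the CF deployment manifest, something like this (a sketch, assuming a standard cf-release manifest):

    properties:
      app_ssh:
        host_key_fingerprint: <fingerprint>
        oauth_client_id: ssh-proxy
      cc:
        allow_app_ssh_access: true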


Thanks,
Bala




Re: CF deployment with Diego support only?

Amit Kumar Gupta
 

You need the api jobs, those are the cloud controllers! Set the runner and
hm9000 jobs to 0 instances, or even remove them from your deployment
manifest altogether.
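As a sketch in manifest terms (job names follow the usual zone convention and may differ in your stubs):

    jobs:
    - name: runner_z1
      instances: 0
    - name: hm9000_z1
      instances: 0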

On Wed, Mar 9, 2016 at 11:39 PM, Benjamin Gandon <benjamin(a)gandon.org>
wrote:

Hi cf-dev,

For a fresh new deployment of cf-release
<https://github.com/cloudfoundry/cf-release>, I wonder how the default
manifest stubs and templates should be modified to remove unnecessary
support for the DEAs in favor of Diego?

Indeed, I’m starting with a working deployment of cf+diego. And now I want
to wipe out those ancient DEAs and HM9000 that I don’t need.

I tried to draw inspiration from the MicroPCF main deployment manifest
<https://github.com/pivotal-cf/micropcf/blob/master/images/manifest.yml>.
(Are there any other sources for Diego-only CF deployments BTW?)
At the moment, all I see in this example is that I need to set
“instances:” counts to zero for both “api_z*” and “hm9000_z*” jobs.

Is this sufficient? Should I perform some more adaptations?
Thanks for your guidance.

/Benjamin


Re: Adding previous_instances and previous_memory fields to cf_event

Hristo Iliev
 

Hi Dieu,

We are polling app-usage-events with Abacus, but because of purging, the
events may be out of order right after the billing epoch starts. But that's
only part of the problem.

To consume app-usage-events, every integrator needs to build additional
infrastructure, like:
- a simple filter, load balancer, or API management product to disable
purging once the billing epoch has started
- DB replication software that pulls data and deals with incorrectly
ordered events after a purge (we use abacus-cf-bridge)
- the data warehouse described in the doc you sent

Introducing the previous values in the usage events will help us get rid
of most of the infrastructure we need to deal with usage events before
they even reach a billing system. We won't need to care about purge calls
or an additional DB, but can instead simply pull events. The previous
values help us to:
- use formulas that do not care about the order of events (this solves the
purge problem; see the sketch below)
- get the info about a billing-relevant change (we don't have to cache,
access a DB, or scan a stream to know what changed)
- simplify the processing logic in Abacus (or other metering/aggregation
solutions)
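As a sketch of what such an order-independent formula can look like (an
illustration, not necessarily what Abacus implements), each event
contributes a delta against the end of the accounting period:

    gb_hours = sum over events e of
      (memory_e * instances_e - previous_memory_e * previous_instances_e)
      * (period_end - since_e)

With the exaggerated example from the original post (1GB started at 3/9
00:00, stopped at 3/10 00:00), the two events contribute
+1GB * (period_end - 3/9) and -1GB * (period_end - 3/10), which sum to
24 GB-hours in whatever order they arrive.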

We now pull the usage events, but we would like to be notified instead, to
offload the CC from the constant /v2/app_usage_events calls. This, however,
will not solve any of the problems we have now, and in fact may mess up the
ordering of the events.

Regards,
Hristo Iliev

On 2016-03-10 at 06:32 GMT+02:00, Dieu Cao <dcao(a)pivotal.io> wrote:

We don't advise using /v2/events for metering/billing for precisely the
reason you mention, that order of events is not guaranteed.

You can find more information about app usage events and service usage
events which are guaranteed to be in order here:
http://docs.cloudfoundry.org/running/managing-cf/usage-events.html

-Dieu
CF Runtime PMC Lead

On Wed, Mar 9, 2016 at 10:27 AM, KRuelY <kevinyudhiswara(a)gmail.com> wrote:

Hi,

I am currently working on metering runtime usage, and one issue I'm facing
is that usage submissions may come in out of order (due to network errors
or other causes). Before this issue, the way metering runtime usage works
is quite simple. There is an app that will look at cf_events and submit
usage to [cf-abacus](https://github.com/cloudfoundry-incubator/cf-abacus).


    {
      "metadata": {
        "guid": "40afe01a-b15a-4b8d-8bd1-e36a0ba2f6f5",
        "url": "/v2/app_usage_events/40afe01a-b15a-4b8d-8bd1-e36a0ba2f6f5",
        "created_at": "2016-03-02T09:48:09Z"
      },
      "entity": {
        "state": "STARTED",
        "memory_in_mb_per_instance": 512,
        "instance_count": 1,
        "app_guid": "a2ab1b5a-94c0-4344-9a71-a1d2b11f483a",
        "app_name": "abacus-usage-collector",
        "space_guid": "d34d770d-4cd0-4bdc-8c83-8fdfa5f0b3cb",
        "space_name": "dev",
        "org_guid": "238a3e78-3fc8-4542-928a-88ee99643732",
        "buildpack_guid": "b77d0ef8-da1f-4c0a-99cc-193449324706",
        "buildpack_name": "nodejs_buildpack",
        "package_state": "STAGED",
        "parent_app_guid": null,
        "parent_app_name": null,
        "process_type": "web"
      }
    }


The way this app works is by looking at the state.
If the state is STARTED, it will submit usage to Abacus with
instance_memory = memory_in_mb_per_instance, running_instances =
instance_count, and since = created_at.
If the state is STOPPED, it will submit usage to Abacus with
instance_memory = 0, running_instances = 0, and since = created_at.

In an ideal situation, where there is no out-of-order submission, this is
fine. A simple but exaggerated example:
Usage instance_memory = 1GB, running_instances = 1, since = 3/9 00:00
comes in. (STARTED)
Usage instance_memory = 0GB, running_instances = 0, since = 3/10 00:00
comes in. (STOPPED)
Then Abacus knows that the app consumed 1GB * 24 hours (3/9 00:00 to 3/10
00:00) = 24 GB-hours.

But when the usage comes in out of order:
Usage instance_memory = 0GB, running_instances = 0, since = 3/10 00:00
comes in. (STOPPED)
Usage instance_memory = 1GB, running_instances = 1, since = 3/9 00:00
comes in. (STARTED)
The formula that Abacus currently has would not work.

Abacus has another formula that would take care of this out-of-order
submission, but it only works if we have previous_instance_memory and
previous_running_instances.

When looking for a way to have these fields, we concluded that the
cleanest way would be to add previous_memory_in_mb_per_instance and
previous_instance_count to the cf_event. It would also make app
reconfiguration and cf scale make more sense, because currently a cf
scale is recorded as a STOP and a START.

To sum up, the cf_event submitted would include information like:

    // Starting
    {
      "state": "STARTED",
      "memory_in_mb_per_instance": 512,
      "instance_count": 1,
      "previous_memory_in_mb_per_instance": 0,
      "previous_instance_count": 0
    }

    // Scaling up
    {
      "state": "SCALE"?,
      "memory_in_mb_per_instance": 512,
      "instance_count": 2,
      "previous_memory_in_mb_per_instance": 512,
      "previous_instance_count": 1
    }

    // Scaling down
    {
      "state": "SCALE"?,
      "memory_in_mb_per_instance": 512,
      "instance_count": 1,
      "previous_memory_in_mb_per_instance": 512,
      "previous_instance_count": 2
    }

    // Stopping
    {
      "state": "STOPPED",
      "memory_in_mb_per_instance": 0,
      "instance_count": 0,
      "previous_memory_in_mb_per_instance": 512,
      "previous_instance_count": 1
    }


Any thoughts/feedback/guidance?



CF deployment with Diego support only?

Benjamin Gandon
 

Hi cf-dev,

For a fresh new deployment of cf-release <https://github.com/cloudfoundry/cf-release>, I wonder how the default manifest stubs and templates should be modified to remove unnecessary support for the DEAs in favor of Diego?

Indeed, I’m starting with a working deployment of cf+diego. And now I want to wipe out those ancient DEAs and HM9000 that I don’t need.

I tried to draw inspiration from the MicroPCF main deployment manifest <https://github.com/pivotal-cf/micropcf/blob/master/images/manifest.yml>. (Are there any other sources for Diego-only CF deployments BTW?)
At the moment, all I see in this example is that I need to set “instances:” counts to zero for both “api_z*” and “hm9000_z*” jobs.

Is this sufficient? Should I perform some more adaptations?
Thanks for your guidance.

/Benjamin


Re: Update Parallelization in Cloud Foundry

Dieu Cao <dcao@...>
 

It should also be considered that in some scenarios the recommended serial
deployment order will be the most tested in terms of ensuring backwards
compatibility of code changes during deployment.

For example, a new endpoint might be added to the Cloud Controller for use
by the DEAs/cells. Because of the serial deployment order, it is assumed
that all Cloud Controllers will have completed updating, and thus the new
endpoint will be available, before the DEAs/cells update. Code changes to
the DEAs/cells can then simply switch over to the new endpoint as they
update, and there is no need to keep the code on the DEAs/cells that used
the older endpoint.

-Dieu
CF Runtime PMC Lead

On Wed, Mar 9, 2016 at 2:34 AM, Voelz, Marco <marco.voelz(a)sap.com> wrote:

Thanks for clarifying this for me, Amit.

Warm regards
Marco

On 09/03/16 07:43, "Amit Gupta" <agupta(a)pivotal.io> wrote:

You can probably try to start everything in parallel, and either set very
long update timeouts, or allow the deployment to fail with the expectation
that it will eventually correct itself. Or you can start things in a
strict order, and have stronger constraints on the possible failure
scenarios, and be able to debug the root cause of a failure better.

Certain things do depend on NATS, and thus won't work until NATS is up.
The main thing I can currently think of is registering routes with
gorouter, which is done both for apps and for system components (e.g. the
route-registrar registers api.SYSTEM_DOMAIN on behalf of the CC).

Best,
Amit

On Tue, Mar 8, 2016 at 2:14 AM, Voelz, Marco <marco.voelz(a)sap.com> wrote:

Does NATS also need to come up before any of the other components?

On 07/03/16 21:16, "Amit Gupta" <agupta(a)pivotal.io> wrote:

Hey Omar,

You can set the "serial" property at the global level of a deployment
(you can think of it as setting a default for all jobs), and then override
it at the individual job levels. You will want the consul server jobs to
be deployed first, with serial: true, and max_in_flight: 1. The important
thing here is, if you have more than one server in your consul cluster,
they need to come up one at a time to ensure the cluster orchestration goes
smoothly. The same is true if your etcd cluster has more than one server
in it. If you're using the postgres job for CCDB and/or UAADB (instead of
some external database), then you will want the postgres job to come up
before CC and/or UAA. Similarly, if you're using the provided blobstore
job instead of an external blobstore, you'll want it up before CC comes up.
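As a concrete sketch of the knobs mentioned above (job names are
illustrative, and a real manifest also needs canaries and watch times):

    update:
      serial: true        # global default: jobs update one at a time
      max_in_flight: 1
    jobs:
    - name: consul_z1     # bring the cluster up first, one node at a time
      update:
        serial: true
        max_in_flight: 1
    - name: api_z1        # can override the default and update in parallel
      update:
        serial: false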

You might be able to get away with parallelizing some of the things
above. E.g. if you bring the CC and blobstore up at the same time, CC
might fail to start for a while until Blobstore comes up, and then CC might
successfully start up. Monit also generally keeps retrying even after BOSH
gives up. So your deploy might fail but later on, you might see everything
up and running.

Cheers,
Amit

On Mon, Mar 7, 2016 at 5:54 AM, Omar Elazhary <omazhary(a)gmail.com> wrote:

Hello everyone,

I know it is possible to update and redeploy components in parallel in
Cloud Foundry by setting the "serial" property in the deployment manifest
to "false". However, is such a thing recommended? Are there particular job
dependencies that I need to pay attention to?

Regards,
Omar




Re: Adding previous_instances and previous_memory fields to cf_event

Dieu Cao <dcao@...>
 

We don't advise using /v2/events for metering/billing for precisely the
reason you mention, that order of events is not guaranteed.

You can find more information about app usage events and service usage
events which are guaranteed to be in order here:
http://docs.cloudfoundry.org/running/managing-cf/usage-events.html

-Dieu
CF Runtime PMC Lead

On Wed, Mar 9, 2016 at 10:27 AM, KRuelY <kevinyudhiswara(a)gmail.com> wrote:

Hi,

I am currently working on metering runtime usage, and one issue I'm facing
is that usage submissions may come in out of order (due to network errors
or other causes). Before this issue, the way metering runtime usage works
is quite simple. There is an app that will look at cf_events and submit
usage to [cf-abacus](https://github.com/cloudfoundry-incubator/cf-abacus).


    {
      "metadata": {
        "guid": "40afe01a-b15a-4b8d-8bd1-e36a0ba2f6f5",
        "url": "/v2/app_usage_events/40afe01a-b15a-4b8d-8bd1-e36a0ba2f6f5",
        "created_at": "2016-03-02T09:48:09Z"
      },
      "entity": {
        "state": "STARTED",
        "memory_in_mb_per_instance": 512,
        "instance_count": 1,
        "app_guid": "a2ab1b5a-94c0-4344-9a71-a1d2b11f483a",
        "app_name": "abacus-usage-collector",
        "space_guid": "d34d770d-4cd0-4bdc-8c83-8fdfa5f0b3cb",
        "space_name": "dev",
        "org_guid": "238a3e78-3fc8-4542-928a-88ee99643732",
        "buildpack_guid": "b77d0ef8-da1f-4c0a-99cc-193449324706",
        "buildpack_name": "nodejs_buildpack",
        "package_state": "STAGED",
        "parent_app_guid": null,
        "parent_app_name": null,
        "process_type": "web"
      }
    }


The way this app works is by looking at the state.
If the state is STARTED, it will submit usage to Abacus with
instance_memory = memory_in_mb_per_instance, running_instances =
instance_count, and since = created_at.
If the state is STOPPED, it will submit usage to Abacus with
instance_memory = 0, running_instances = 0, and since = created_at.

In an ideal situation, where there is no out-of-order submission, this is
fine. A simple but exaggerated example:
Usage instance_memory = 1GB, running_instances = 1, since = 3/9 00:00
comes in. (STARTED)
Usage instance_memory = 0GB, running_instances = 0, since = 3/10 00:00
comes in. (STOPPED)
Then Abacus knows that the app consumed 1GB * 24 hours (3/9 00:00 to 3/10
00:00) = 24 GB-hours.

But when the usage comes in out of order:
Usage instance_memory = 0GB, running_instances = 0, since = 3/10 00:00
comes in. (STOPPED)
Usage instance_memory = 1GB, running_instances = 1, since = 3/9 00:00
comes in. (STARTED)
The formula that Abacus currently has would not work.

Abacus has another formula that would take care of this out-of-order
submission, but it only works if we have previous_instance_memory and
previous_running_instances.

When looking for a way to have these fields, we concluded that the
cleanest way would be to add previous_memory_in_mb_per_instance and
previous_instance_count to the cf_event. It would also make app
reconfiguration and cf scale make more sense, because currently a cf
scale is recorded as a STOP and a START.

To sum up, the cf_event submitted would include information like:

    // Starting
    {
      "state": "STARTED",
      "memory_in_mb_per_instance": 512,
      "instance_count": 1,
      "previous_memory_in_mb_per_instance": 0,
      "previous_instance_count": 0
    }

    // Scaling up
    {
      "state": "SCALE"?,
      "memory_in_mb_per_instance": 512,
      "instance_count": 2,
      "previous_memory_in_mb_per_instance": 512,
      "previous_instance_count": 1
    }

    // Scaling down
    {
      "state": "SCALE"?,
      "memory_in_mb_per_instance": 512,
      "instance_count": 1,
      "previous_memory_in_mb_per_instance": 512,
      "previous_instance_count": 2
    }

    // Stopping
    {
      "state": "STOPPED",
      "memory_in_mb_per_instance": 0,
      "instance_count": 0,
      "previous_memory_in_mb_per_instance": 512,
      "previous_instance_count": 1
    }


Any thoughts/feedback/guidance?



Re: UAA pre-start script failure on Bosh-Lite

Filip Hanik
 

There is a temporary workaround available if you are running into this
issue today.

Just delete the file

cf-release/src/uaa-release/jobs/uaa/templates/pre-start

Filip

On Wed, Mar 9, 2016 at 3:20 PM, Madhura Bhave <mbhave(a)pivotal.io> wrote:

Hi All,

There have been some reports about the UAA pre-start script failing on
bosh-lite. We will add a temporary workaround for this until we have a fix.

Thanks,
Priyata & Madhura


UAA pre-start script failure on Bosh-Lite

Madhura Bhave
 

Hi All,

There have been some reports about the UAA pre-start script failing on
bosh-lite. We will add a temporary workaround for this until we have a fix.

Thanks,
Priyata & Madhura


Spring Cloud Netflix and Cloud Foundry capability overlaps???

Thomas Wieger
 

Hello!

I am currently evaluating Spring Boot/Spring Cloud for the development of microservices, and we are considering Cloud Foundry for deploying those services. Something I have been wondering about for a while is that I see some overlap between Spring Cloud Netflix capabilities and built-in functionality in Cloud Foundry, especially regarding the following capabilities: (Go)router and Zuul, Consul and Eureka, the load balancer and Ribbon, ...
From my perspective, I would like to let Cloud Foundry take care of routing, load balancing, and service registry/discovery, and would try either not to use the similar Spring Cloud features (Zuul, Ribbon) or to integrate the respective Cloud Foundry functionality into the Spring Cloud framework (Consul for service discovery).

I would like to hear your thoughts and recommendations on that.

best regards,

Thomas


Re: how to debug "BuildpackCompileFailed" issue?

Mike Dalessio
 

Hi Ning,

If you generate an application manifest based on the current state of the
app, you'll have more information with which to try to diagnose the issue.

The command is `cf create-app-manifest APP_NAME [-p
/path/to/<app-name>-manifest.yml ]`

Can you share the contents of that manifest file?

-m

On Tue, Mar 8, 2016 at 10:29 PM, Ning Fu <nfu(a)pivotal.io> wrote:

Yes, Gemfile and Gemfile.lock files are present at the top level.
I can get "cf logs" output for other apps I pushed. So I think the reason
that I don't see anything from "cf logs --recent" is that there are no logs
produced at all.
Anyway, whatever the root cause may be, I think the key point is not the
root cause itself, but the utilities we provide developers for
investigating when such an error occurs. It is really a bad user
experience when such an error occurs and developers can only guess at the
root cause and try possible solutions, rather than find clues in the logs.
The PHP buildpack has a BP_DEBUG environment variable that helps with
debugging, but it seems the Ruby buildpack doesn't support it.

Thanks,
Ning


On Wed, Mar 9, 2016 at 12:45 AM, Jesse Alford <jalford(a)pivotal.io> wrote:

Are your Gemfile and Gemfile.lock files literally named Gemfile and
Gemfile.lock, and are both present at the top level of the directory
you're pushing from, on the machine you're pushing from?

The Ruby Buildpack looks for (and requires) the presence of these to
identify a Ruby app. If you've got some other setup, you can put empty
files with the appropriate names in the top level dir.

This might not be your problem, but I've seen it cause the error you
report many times.

On Tue, Mar 8, 2016, 7:12 AM JT Archie <jarchie(a)pivotal.io> wrote:

Ning,

What is the output of `cf buildpacks`?

JT

On Tue, Mar 8, 2016 at 1:27 AM, Ning Fu <nfu(a)pivotal.io> wrote:

Hi,

Does anyone know how to debug "BuildpackCompileFailed" issue?
When I push a ruby app:
========================
...
Done uploading    OK
Starting app happy in org funorg / space development as funcloud...
FAILED
BuildpackCompileFailed
TIP: use 'cf logs happy --recent' for more information

Pivotals-iMac:happy-root-route-app pivotal$ cf logs happy --recent
Connected, dumping recent logs for app happy in org funorg / space
development as funcloud...
========================
But I got nothing from "cf logs happy --recent".
It is "ruby '2.2.2'" in Gemfile, and my ruby build back is
cached-1.6.11. I've also tried "bundle package --all" before I push.

Any suggestions? Doesn't the Ruby buildpack provide any logs?

Thanks,
Ning


Re: Dial tcp: i/o timeout while pushing a sample app to Cloud Foundry BOSH-Lite deployment

Rob Dimsdale
 

We have noticed similar things recently. Our issue was a misconfigured DNS entry in /etc/resolv.conf, so we suggest you check that your nameservers point to valid DNS servers, and that those servers have the required DNS entries for the hostnames.

It might also be related to the CLI switching from golang 1.5 to 1.6 in cf v6.16, though we haven't confirmed this. You could try downloading the previous version of the CLI and seeing if that resolves your issue.

It's also possible that DNS is taking longer than 5 seconds to return correctly - the CLI will not handle this due to a hard-coded timeout. See https://github.com/cloudfoundry/cli/issues/763 for more information.

Thanks,
Rob && Al
CF Release Integration
Pivotal


Re: Can resources of an IDLE application be shared by others?

Montenegro, Allan Mauricio (EESO CC PREMIER) <amontenegro@...>
 

Hello,



I'm sorry, but I'm constantly receiving emails from this list, in which I have no part, as I am a new Hewlett Packard Enterprise employee.



Please remove my email.



Thanks,

Montenegro Murillo, Allan
Support Specialist
ITIO Customer Care


+1-877-785-2155 Office
Heredia, Costa Rica
allan.montenegro(a)hpe.com<mailto:allan.montenegro(a)hpe.com>



-----Original Message-----
From: Deepak Vij (A) [mailto:deepak.vij(a)huawei.com]
Sent: Wednesday, March 09, 2016 1:09 PM
To: Discussions about Cloud Foundry projects and the system overall.
Subject: [cf-dev] Re: Can resources of an IDLE application be shared by others?



Over-allocation of computing resources is a very serious problem in the contemporary cloud computing environment, and across enterprise data centers in general. The famous Quasar research paper published by Stanford researchers discusses this issue in great detail. There are startling numbers, such as industry-wide utilization between 6% and 12%. A recent study estimated server utilization on Amazon EC2 in the 3% to 17% range.



One technique which Google ("Borg") employs internally is called "resource reclamation", whereby over-allocated resources are exploited to execute best-effort workloads such as background analytics and other low-priority jobs. Such best-effort workloads can be pre-empted if they interfere with the original workload, or if the resources are needed back at a later time. Utilizing unused resources in this manner requires underlying support for capabilities such as preemption and resizing, isolation mechanisms, interference detection, etc.



Another important aspect of all this is the fact that users never know how to size resource reservations precisely, which leads to underutilization of computing resources. Instead, what is being proposed is a classification-based predictive analytics technique whereby the resource management system itself determines the right amount of resources to meet the user's performance constraints. In this case, the user only specifies the performance constraints (SLOs), not the actual low-level resource reservations. A combination of a proactive, predictive-analytics-based approach with a reactive resource-reclamation approach is the optimal strategy, as it also accounts for any mispredictions.



Mesos resource management also supports resource reclamation, although neither Google ("Borg") nor Mesos employs a proactive, predictive-analytics-based approach in their environments.



I hope this makes sense and helps.



- Deepak Vij (Huawei, Software Lab., Santa Clara)



-----Original Message-----

From: Stanley Shen [mailto:meteorping(a)gmail.com]

Sent: Tuesday, March 08, 2016 7:10 PM

To: cf-dev(a)lists.cloudfoundry.org<mailto:cf-dev(a)lists.cloudfoundry.org>

Subject: [cf-dev] Can resources of an IDLE application be shared by others?



Hello, all



When pushing an application to CF, we need to define its disk/memory limits.

The memory limit is just the maximum value the application could possibly need, but most of the time we don't need so much memory.

For example, I have one application which needs at most 5G of memory at startup and for some specific operations, but most of the time it just needs 2G.

So right now I need to specify 5G in the deployment manifest, and 5G of memory is allocated.



Take an m3.large VM, for example: it has 7.5G.

Right now we can only push one application onto it, but ideally we should be able to push more, say 3, since each application only needs 2G.

Can the resources of an IDLE application be shared by other applications?

It seems that right now all the resources are pre-allocated when pushing an application, and they are not released even after I stop the application.


Re: Can resources of an IDLE application be shared by others?

Deepak Vij
 

Over-allocation of computing resources is a very serious problem in the contemporary cloud computing environment, and across enterprise data centers in general. The famous Quasar research paper published by Stanford researchers discusses this issue in great detail. There are startling numbers, such as industry-wide utilization between 6% and 12%. A recent study estimated server utilization on Amazon EC2 in the 3% to 17% range.

One technique which Google ("Borg") employs internally is called "resource reclamation", whereby over-allocated resources are exploited to execute best-effort workloads such as background analytics and other low-priority jobs. Such best-effort workloads can be pre-empted if they interfere with the original workload, or if the resources are needed back at a later time. Utilizing unused resources in this manner requires underlying support for capabilities such as preemption and resizing, isolation mechanisms, interference detection, etc.

Another important aspect of all this is the fact that users never know how to size resource reservations precisely, which leads to underutilization of computing resources. Instead, what is being proposed is a classification-based predictive analytics technique whereby the resource management system itself determines the right amount of resources to meet the user's performance constraints. In this case, the user only specifies the performance constraints (SLOs), not the actual low-level resource reservations. A combination of a proactive, predictive-analytics-based approach with a reactive resource-reclamation approach is the optimal strategy, as it also accounts for any mispredictions.

Mesos resource management also supports resource reclamation, although neither Google ("Borg") nor Mesos employs a proactive, predictive-analytics-based approach in their environments.

I hope this makes sense and helps.

- Deepak Vij (Huawei, Software Lab., Santa Clara)

-----Original Message-----
From: Stanley Shen [mailto:meteorping(a)gmail.com]
Sent: Tuesday, March 08, 2016 7:10 PM
To: cf-dev(a)lists.cloudfoundry.org
Subject: [cf-dev] Can resources of an IDLE application be shared by others?

Hello, all

When pushing an application to CF, we need to define its disk/memory limits.
The memory limit is just the maximum value the application could possibly need, but most of the time we don't need so much memory.
For example, I have one application which needs at most 5G of memory at startup and for some specific operations, but most of the time it just needs 2G.
So right now I need to specify 5G in the deployment manifest, and 5G of memory is allocated.

Take an m3.large VM, for example: it has 7.5G.
Right now we can only push one application onto it, but ideally we should be able to push more, say 3, since each application only needs 2G.

Can the resources of an IDLE application be shared by other applications?
It seems that right now all the resources are pre-allocated when pushing an application, and they are not released even after I stop the application.


Adding previous_instances and previous_memory fields to cf_event

KRuelY <kevinyudhiswara@...>
 

Hi,

I am currently working on metering runtime usage, and one issue I'm facing
is that usage submissions may come in out of order (due to network errors
or other causes). Before this issue, the way metering runtime usage works
is quite simple. There is an app that will look at cf_events and submit
usage to [cf-abacus](https://github.com/cloudfoundry-incubator/cf-abacus).


    {
      "metadata": {
        "guid": "40afe01a-b15a-4b8d-8bd1-e36a0ba2f6f5",
        "url": "/v2/app_usage_events/40afe01a-b15a-4b8d-8bd1-e36a0ba2f6f5",
        "created_at": "2016-03-02T09:48:09Z"
      },
      "entity": {
        "state": "STARTED",
        "memory_in_mb_per_instance": 512,
        "instance_count": 1,
        "app_guid": "a2ab1b5a-94c0-4344-9a71-a1d2b11f483a",
        "app_name": "abacus-usage-collector",
        "space_guid": "d34d770d-4cd0-4bdc-8c83-8fdfa5f0b3cb",
        "space_name": "dev",
        "org_guid": "238a3e78-3fc8-4542-928a-88ee99643732",
        "buildpack_guid": "b77d0ef8-da1f-4c0a-99cc-193449324706",
        "buildpack_name": "nodejs_buildpack",
        "package_state": "STAGED",
        "parent_app_guid": null,
        "parent_app_name": null,
        "process_type": "web"
      }
    }


The way this app works is by looking at the state.
If the state is STARTED, it will submit usage to Abacus with
instance_memory = memory_in_mb_per_instance, running_instances =
instance_count, and since = created_at.
If the state is STOPPED, it will submit usage to Abacus with
instance_memory = 0, running_instances = 0, and since = created_at.
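As a sketch, the submitted usage could look like this for a STARTED event
(field names follow the description above, not necessarily Abacus's actual
schema):

    {
      "instance_memory": 512,          // from memory_in_mb_per_instance
      "running_instances": 1,          // from instance_count
      "since": "2016-03-02T09:48:09Z"  // from created_at
    }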

In an ideal situation, where there is no out-of-order submission, this is
fine. A simple but exaggerated example:
Usage instance_memory = 1GB, running_instances = 1, since = 3/9 00:00
comes in. (STARTED)
Usage instance_memory = 0GB, running_instances = 0, since = 3/10 00:00
comes in. (STOPPED)
Then Abacus knows that the app consumed 1GB * 24 hours (3/9 00:00 to 3/10
00:00) = 24 GB-hours.

But when the usage comes in out of order:
Usage instance_memory = 0GB, running_instances = 0, since = 3/10 00:00
comes in. (STOPPED)
Usage instance_memory = 1GB, running_instances = 1, since = 3/9 00:00
comes in. (STARTED)
The formula that Abacus currently has would not work.

Abacus has another formula that would take care of this out-of-order
submission, but it only works if we have previous_instance_memory and
previous_running_instances.

When looking for a way to have these fields, we concluded that the
cleanest way would be to add previous_memory_in_mb_per_instance and
previous_instance_count to the cf_event. It would also make app
reconfiguration and cf scale make more sense, because currently a cf
scale is recorded as a STOP and a START.

To sum up, the cf_event submitted would include information like:

    // Starting
    {
      "state": "STARTED",
      "memory_in_mb_per_instance": 512,
      "instance_count": 1,
      "previous_memory_in_mb_per_instance": 0,
      "previous_instance_count": 0
    }

    // Scaling up
    {
      "state": "SCALE"?,
      "memory_in_mb_per_instance": 512,
      "instance_count": 2,
      "previous_memory_in_mb_per_instance": 512,
      "previous_instance_count": 1
    }

    // Scaling down
    {
      "state": "SCALE"?,
      "memory_in_mb_per_instance": 512,
      "instance_count": 1,
      "previous_memory_in_mb_per_instance": 512,
      "previous_instance_count": 2
    }

    // Stopping
    {
      "state": "STOPPED",
      "memory_in_mb_per_instance": 0,
      "instance_count": 0,
      "previous_memory_in_mb_per_instance": 512,
      "previous_instance_count": 1
    }


Any thoughts/feedback/guidance?



Re: Can resources of an IDLE application be shared by others?

Daniel Mikusa
 

On Tue, Mar 8, 2016 at 10:09 PM, Stanley Shen <meteorping(a)gmail.com> wrote:

Hello, all

When pushing an application to CF, we need to define its disk/memory
limits.
The memory limit is just the maximum value the application could possibly
need, but most of the time we don't need so much memory.
For example, I have one application which needs at most 5G of memory at
startup and for some specific operations, but most of the time it just
needs 2G.
So right now I need to specify 5G in the deployment manifest, and 5G of
memory is allocated.

Take an m3.large VM, for example: it has 7.5G.
Right now we can only push one application onto it, but ideally we should
be able to push more, say 3, since each application only needs 2G.

Can the resources of an IDLE application be shared by other applications?
It seems that right now all the resources are pre-allocated when pushing
an application, and they are not released even after I stop the
application.


Re: Can resources of an IDLE application be shared by others?

Stanley Shen <meteorping@...>
 

"It seems that right now all the resources are pre-allocated when pushing an application, and they are not released even after I stop the application."

This statement was wrong; the resources are released after the application is stopped.


Re: Error while using uaac with CF r230

Sylvain Goulmy <sygoulmy@...>
 

It seems that I can give the solution for this one: I had an F5 in front of
my CF platform with compression set to deflate.

UAAC sends the HTTP header Accept-Encoding:
gzip;q=1.0,deflate;q=0.6,identity;q=0.3 on its requests, but the UAA sends
an uncompressed response. As the F5 knows that the UAAC client accepts
compression, it does the job itself and compresses the UAA response.

Unfortunately, there is a bug in Ruby with deflate compression:
https://bugs.ruby-lang.org/issues/11268

Everything was finally solved by switching the F5 compression mode from
deflate to gzip.

End of the story.

On Thu, Feb 25, 2016 at 6:31 PM, Sylvain Goulmy <sygoulmy(a)gmail.com> wrote:

Hi all,

I'm currently experiencing issues with uaac:

uaac token client get admin -s mysecretftw
Zlib::DataError: incorrect header check
attempt to get token failed

I'm working with CF r230 and UAA client 3.1.7.

Thanks in advance for your support.

Sylvain