
Re: Proposal: Decomposing cf-release and Extracting Deployment Strategies

Amit Kumar Gupta
 

Thanks for the feedback Mike!

Can you tell us more specifically what sort of extensions you need? It
would be great if cf-deployment provided an interface that could serve the
needs of essentially all operators of CF.

Thanks,
Amit

On Tue, Sep 15, 2015 at 4:02 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

This is great stuff! My organization currently maintains our own custom
ways to generate manifests, include secure properties, and manage release
versions.

We would love to base the next generation of our solution on
cf-deployment. Have you put any thought into how others might customize or
extend cf-deployment? Our needs are very similar to yours just sometimes a
little different.

Perhaps a private fork periodically merged with a known good release
combination (tag) might be appropriate? Or some way to include the same
tools into a wholly private repo?

Mike


On Tue, Sep 8, 2015 at 1:22 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hi all,

The CF OSS Release Integration team (casually referred to as the "MEGA
team") is trying to solve a lot of tightly interrelated problems, and make
many of said problems less interrelated. It is difficult to address just
one issue without touching the others, so the following proposal addresses
several issues, but the most important ones are:

* decompose cf-release into many independently manageable, independently
testable, independently usable releases
* separate manifest generation strategies from the release source, paving
the way for Diego to be part of the standard deployment

This proposal will outline a picture of how manifest generation will work
in a unified manner in development, test, and integration environments. It
will also outline a picture of what each release’s test pipelines will look
like, how they will feed into a common integration environment, and how
feedback from the integration environment will feed back into the test
environments. Finally, it will propose a picture for what the integration
environment will look like, and how we get from the current integration
environment to where we want to be.

For further details, please feel free to view and comment here:


https://docs.google.com/document/d/1Viga_TzUB2nLxN_ILqksmUiILM1hGhq7MBXxgLaUOkY

Thanks,
Amit, CF OSS Release Integration team


Re: Packaging CF app as bosh-release

Ronak Banka
 

You can package almost anything, but that way the application will run inside a VM
rather than on the platform.

On Wed, Sep 16, 2015 at 12:02 AM, Kayode Odeyemi <dreyemi(a)gmail.com> wrote:

Hi,

Does it make any sense to package a CF app as a BOSH release?


Re: Packaging CF app as bosh-release

Amit Gupta
 

Can you say a bit more about what you're trying to do?


Re: User cannot do CF login when UAA is being updated

Yunata, Ricky <rickyy@...>
 

Hi Joseph,

Yes, that is the case. I have sent my test results, but it seems that my e-mail did not get through. How can I send an attachment to this mailing list?

Regards,
Ricky


From: CF Runtime [mailto:cfruntime(a)gmail.com]
Sent: Tuesday, 15 September 2015 8:10 PM
To: Discussions about Cloud Foundry projects and the system overall.
Subject: [cf-dev] Re: Re: Re: Re: User cannot do CF login when UAA is being updated

Couple of updates here for clarity. No databases are stored on NFS in any default installation. NFS is only used to store blobstore data. If you are using the postgres job from cf-release, since it is single node there will be downtime during a stemcell deploy.

I talked with Dies from Fujitsu earlier and confirmed they are NOT using the postgres job but an external non-cf deployed postgres instance. So during a deploy, the UAA db should be up and available the entire time.

The issue they are seeing is that even though the database is up, and I'm guessing there is at least a single node of UAA up during the deploy, there are still login failures.

Joseph
OSS Release Integration Team

On Mon, Sep 14, 2015 at 6:39 PM, Filip Hanik <fhanik(a)pivotal.io<mailto:fhanik(a)pivotal.io>> wrote:
Amit, see previous comment.

The PostgreSQL database is stored on NFS, which is restarted during the nfs job update.
UAA, while up, is non-functional while the NFS job is updated because it can't reach the DB.



On Mon, Sep 14, 2015 at 5:09 PM, Amit Gupta <agupta(a)pivotal.io<mailto:agupta(a)pivotal.io>> wrote:
Hi Ricky,

My understanding is that you still need help, and the issues Jiang and Alexander raised are different. To avoid confusion, let's keep this thread focused on your issue.

Can you confirm that you have two UAA VMs in separate bosh jobs, separate AZs, etc.? Can you confirm that when you roll the UAAs, only one goes down at a time? The simplest way to effect a roll is to change some trivial property in the manifest for your UAA jobs. If you're using v215, any of the properties referenced here will do:

https://github.com/cloudfoundry/cf-release/blob/v215/jobs/uaa/spec#L321-L335

You should confirm that only one UAA is down at a time, and comes back up before bosh moves on to updating the other UAA.

While this roll is happening, can you just do `CF_TRACE=true cf auth USERNAME PASSWORD` in a loop, and if you see one that fails, post the output, along with noting the state of the bosh deploy when the error happens.
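
For reference, a minimal sketch of such a loop (assuming a bash shell, the cf
CLI already targeted at your API endpoint, and placeholder credentials in
USERNAME/PASSWORD):

while true; do
  # Authenticate against UAA; keep the full trace of any failing attempt.
  if ! CF_TRACE=true cf auth "$USERNAME" "$PASSWORD" > /tmp/cf-auth-trace.log 2>&1; then
    echo "$(date -u '+%Y-%m-%dT%H:%M:%SZ') cf auth FAILED - trace in /tmp/cf-auth-trace.log"
  fi
  sleep 1
done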

Thanks,
Amit

On Mon, Sep 14, 2015 at 10:51 AM, Amit Gupta <agupta(a)pivotal.io<mailto:agupta(a)pivotal.io>> wrote:
Ricky, Jiang, Alexander, are the three of you working together? It's hard to tell since you've got Fujitsu, Gmail, and Altoros email addresses. Are you folks talking about the same issue with the same deployment, or three separate issues?

Ricky, if you still need assistance with your issue, please let us know.

On Mon, Sep 14, 2015 at 10:16 AM, Lomov Alexander <alexander.lomov(a)altoros.com<mailto:alexander.lomov(a)altoros.com>> wrote:
Yes, the problem is that the postgresql database is stored on NFS, which is restarted during the nfs job update. I’m sure that you’ll be able to run updates without an outage with several customizations.

It is hard to tell without knowing your environment, but in the common case the steps will be as follows:


1. Add additional instances to the nfs job and customize it to enable replication (for instance, use these docs for release customization [1])
2. Make your NFS job update sequentially, without other jobs updating in parallel (as is done for postgresql [2])
3. Check your options in the update section [3].

[1] https://help.ubuntu.com/community/HighlyAvailableNFS
[2] https://github.com/cloudfoundry/cf-release/blob/master/example_manifests/minimal-aws.yml#L115-L116
[3] https://github.com/cloudfoundry/cf-release/blob/master/example_manifests/minimal-aws.yml#L57-L62

On Sep 14, 2015, at 9:47 AM, Yitao Jiang <jiangyt.cn(a)gmail.com<mailto:jiangyt.cn(a)gmail.com>> wrote:

On upgrading the deployment, UAA stopped working due to the uaadb filesystem hanging. In my environment, the nfs-wal-server's IP changed, which caused uaadb and ccdb to hang. Hard-rebooting the uaadb and restarting the uaa service solved the issue.

Hope this helps.

On Mon, Sep 14, 2015 at 2:13 PM, Yunata, Ricky <rickyy(a)fast.au.fujitsu.com<mailto:rickyy(a)fast.au.fujitsu.com>> wrote:
Hello,

I have a question regarding UAA in Cloud Foundry. I’m currently running Cloud Foundry on Openstack.
I have 2 availability zones and redundancy of the important VMs including UAA.
Whenever I do an upgrade of either a stemcell or a CF release, users are not able to do a CF login while CF is updating the UAA VM.
My question is, is this normal behaviour? If I have a redundant UAA VM, shouldn’t users still be able to log in to the apps even though it’s being updated?
I’ve done this test a few times, with different CF versions and stemcells, and all of them give me the same result. The latest test that I’ve done was to upgrade the CF version from 212 to 215.
Has anyone experienced the same issue?

Regards,
Ricky
Disclaimer

The information in this e-mail is confidential and may contain content that is subject to copyright and/or is commercial-in-confidence and is intended only for the use of the above named addressee. If you are not the intended recipient, you are hereby notified that dissemination, copying or use of the information is strictly prohibited. If you have received this e-mail in error, please telephone Fujitsu Australia Software Technology Pty Ltd on + 61 2 9452 9000<tel:%2B%2061%202%209452%209000> or by reply e-mail to the sender and delete the document and all copies thereof.


Whereas Fujitsu Australia Software Technology Pty Ltd would not knowingly transmit a virus within an email communication, it is the receiver’s responsibility to scan all communication and any files attached for computer viruses and other defects. Fujitsu Australia Software Technology Pty Ltd does not accept liability for any loss or damage (whether direct, indirect, consequential or economic) however caused, and whether by negligence or otherwise, which may result directly or indirectly from this communication or any files attached.


If you do not wish to receive commercial and/or marketing email messages from Fujitsu Australia Software Technology Pty Ltd, please email unsubscribe(a)fast.au.fujitsu.com<mailto:unsubscribe(a)fast.au.fujitsu.com>




--

Regards,

Yitao
jiangyt.github.io<http://jiangyt.github.io/>







Re: DEA/Warden staging error

CF Runtime
 

It's not something we've ever seen before.

In theory, the warden container needs the git binary, which I think it gets
from the cflinuxfs2 stack; and internet access to wherever the git repo
lives.

If the warden container has both of those things, I can't think of any
reason why it wouldn't work.
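
A quick partial check of the second requirement is to clone a public buildpack
from the DEA VM itself (staging actually runs inside the container, so this
only verifies the git binary and network reachability on the host; the
buildpack URL is just an example):

git clone --depth 1 https://github.com/cloudfoundry/staticfile-buildpack.git /tmp/buildpack-check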

Joseph
OSS Release Integration Team

On Tue, Sep 15, 2015 at 2:06 PM, kyle havlovitz <kylehav(a)gmail.com> wrote:

I tried deploying via uploading a buildpack to the CC (had to set up nginx
first, I didn't have it running/configured before) and that worked! So
that's awesome, but I'm not sure what the problem with using a remote
buildpack is. Even with nginx, I still get the exact same error as before
when pushing using a remote buildpack from git.

On Tue, Sep 15, 2015 at 6:57 AM, CF Runtime <cfruntime(a)gmail.com> wrote:

Looking at the logs, we can see it finishing downloading the app package.
The next step should be to download and run the buildpack. Since you
mention there is no output after this, I'm guessing it doesn't get that far.

It might be having trouble downloading the buildpack from the remote git
url. Could you try uploading the buildpack to Cloud Controller and then
having it use that buildpack to see if that makes a difference?


http://apidocs.cloudfoundry.org/217/buildpacks/creates_an_admin_buildpack.html

http://apidocs.cloudfoundry.org/217/buildpacks/upload_the_bits_for_an_admin_buildpack.html
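
As an aside, a minimal sketch of that workaround using the cf CLI, which wraps
the two API calls above (buildpack name, zip path, and app name are
placeholders):

# Create an admin buildpack from a locally built zip, then push the app with
# it instead of the remote git URL.
cf create-buildpack my-buildpack ./my-buildpack.zip 1
cf push my-app -b my-buildpack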

Joseph
OSS Release Integration Team

On Mon, Sep 14, 2015 at 5:37 PM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

Here's the full dea_ng and warden debug logs:
https://gist.github.com/MrEnzyme/6dcc74174482ac62c1cf

Are there any other places I should look for logs?

On Mon, Sep 14, 2015 at 8:14 PM, CF Runtime <cfruntime(a)gmail.com> wrote:

That's not an error we normally get. It's not clear if the
staging_info.yml error is the source of the problem or an artifact of it.
Having more logs would allow us to speculate more.

Joseph & Dan
OSS Release Integration Team

On Mon, Sep 14, 2015 at 2:24 PM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

I have the cloudfoundry components built, configured and running on
one VM (not in BOSH), and when I push an app I'm getting a generic 'FAILED
StagingError' message after '-----> Downloaded app package (460K)'.

There's nothing in the logs for the dea/warden that seems suspect
other than these 2 things:


{
"timestamp": 1441985105.8883495,

"message": "Exited with status 1 (35.120s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"],
\"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\",
\"/var/warden/containers/18vf956il5v/jobs/8/cursors\",
\"/var/warden/containers/18vf956il5v/jobs/8\"]",
"log_level": "warn",

"source": "Warden::Container::Linux",

"data": {

"handle": "18vf956il5v",

"stdout": "",

"stderr": ""

},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}



{
"timestamp": 1441985105.94083,

"message": "Exited with status 23 (0.023s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\",
\"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket
/var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\",
\"--links\", \"vcap(a)container:/tmp/staged/staging_info.yml\",
\"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
"log_level": "warn",

"source": "Warden::Container::Linux",

"data": {

"handle": "18vf956il5v",

"stdout": "",

"stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\"
failed: No such file or directory (2)\nrsync error: some files/attrs were
not transferred (see previous errors) (code 23) at main.c(1655)
[Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}


And I think the second error is just during cleanup, only failing
because the staging process didn't get far enough in to create the
'staging_info.yml'. The one about iomux-link exiting with status 1 is
pretty mysterious though and I have no idea what caused it. Does anyone
know why this might be happening?


Re: Proposal: Decomposing cf-release and Extracting Deployment Strategies

Mike Youngstrom
 

This is great stuff! My organization currently maintains our own custom
ways to generate manifests, include secure properties, and manage release
versions.

We would love to base the next generation of our solution on
cf-deployment. Have you put any thought into how others might customize or
extend cf-deployment? Our needs are very similar to yours just sometimes a
little different.

Perhaps a private fork periodically merged with a known good release
combination (tag) might be appropriate? Or some way to include the same
tools into a wholly private repo?

Mike

On Tue, Sep 8, 2015 at 1:22 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hi all,

The CF OSS Release Integration team (casually referred to as the "MEGA
team") is trying to solve a lot of tightly interrelated problems, and make
many of said problems less interrelated. It is difficult to address just
one issue without touching the others, so the following proposal addresses
several issues, but the most important ones are:

* decompose cf-release into many independently manageable, independently
testable, independently usable releases
* separate manifest generation strategies from the release source, paving
the way for Diego to be part of the standard deployment

This proposal will outline a picture of how manifest generation will work
in a unified manner in development, test, and integration environments. It
will also outline a picture of what each release’s test pipelines will look
like, how they will feed into a common integration environment, and how
feedback from the integration environment will feed back into the test
environments. Finally, it will propose a picture for what the integration
environment will look like, and how we get from the current integration
environment to where we want to be.

For further details, please feel free to view and comment here:


https://docs.google.com/document/d/1Viga_TzUB2nLxN_ILqksmUiILM1hGhq7MBXxgLaUOkY

Thanks,
Amit, CF OSS Release Integration team


Re: valid org, space and service instance name?

Zach Robinson
 

The regexes you found are correct. On the database side, the max length for both org and space names is 255.

Thanks,
Zach


Re: anomaly in dealing with SharedDomains

Zach Robinson
 

Hey Nima,

The expected behavior is that neither a Shared nor Private domain can take over an existing domain.

We tried using your example test and found that it passed for both shared and private domains. So we are unable to reproduce a problem here.

Are you actually seeing this happen on a live Cloud Foundry deployment?

Thanks,
Zach & Jonathan


[ANN] python-buildpack v1.5.1 released

Mike Dalessio
 

python-buildpack v1.5.1 has been released!

----

python-buildpack v1.5.1 -
https://github.com/cloudfoundry/python-buildpack/releases/tag/v1.5.1

* Adding support for Python 3.5.0
(https://www.pivotaltracker.com/story/show/103268420)

* Output buildpack information in detect script
(https://www.pivotaltracker.com/story/show/100757820)

Packaged binaries:

| name | version | cf_stacks |
|-------------|---------|------------|
| python | 2.7.10 | cflinuxfs2 |
| python | 2.7.9 | cflinuxfs2 |
| python | 3.3.5 | cflinuxfs2 |
| python | 3.3.6 | cflinuxfs2 |
| python | 3.4.2 | cflinuxfs2 |
| python | 3.4.3 | cflinuxfs2 |
| python | 3.5.0 | cflinuxfs2 |
| libffi | 3.1 | cflinuxfs2 |
| libmemcache | 1.0.18 | cflinuxfs2 |


Re: DEA/Warden staging error

kyle havlovitz <kylehav@...>
 

I tried deploying via uploading a buildpack to the CC (had to set up nginx
first, I didn't have it running/configured before) and that worked! So
that's awesome, but I'm not sure what the problem with using a remote
buildpack is. Even with nginx, I still get the exact same error as before
when pushing using a remote buildpack from git.

On Tue, Sep 15, 2015 at 6:57 AM, CF Runtime <cfruntime(a)gmail.com> wrote:

Looking at the logs, we can see it finishing downloading the app package.
The next step should be to download and run the buildpack. Since you
mention there is no output after this, I'm guessing it doesn't get that far.

It might be having trouble downloading the buildpack from the remote git
url. Could you try uploading the buildpack to Cloud Controller and then
having it use that buildpack to see if that makes a difference?


http://apidocs.cloudfoundry.org/217/buildpacks/creates_an_admin_buildpack.html

http://apidocs.cloudfoundry.org/217/buildpacks/upload_the_bits_for_an_admin_buildpack.html

Joseph
OSS Release Integration Team

On Mon, Sep 14, 2015 at 5:37 PM, kyle havlovitz <kylehav(a)gmail.com> wrote:

Here's the full dea_ng and warden debug logs:
https://gist.github.com/MrEnzyme/6dcc74174482ac62c1cf

Are there any other places I should look for logs?

On Mon, Sep 14, 2015 at 8:14 PM, CF Runtime <cfruntime(a)gmail.com> wrote:

That's not an error we normally get. It's not clear if the
staging_info.yml error is the source of the problem or an artifact of it.
Having more logs would allow us to speculate more.

Joseph & Dan
OSS Release Integration Team

On Mon, Sep 14, 2015 at 2:24 PM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

I have the cloudfoundry components built, configured and running on one
VM (not in BOSH), and when I push an app I'm getting a generic 'FAILED
StagingError' message after '-----> Downloaded app package (460K)'.

There's nothing in the logs for the dea/warden that seems suspect other
than these 2 things:


{
"timestamp": 1441985105.8883495,

"message": "Exited with status 1 (35.120s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"],
\"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\",
\"/var/warden/containers/18vf956il5v/jobs/8/cursors\",
\"/var/warden/containers/18vf956il5v/jobs/8\"]",
"log_level": "warn",

"source": "Warden::Container::Linux",

"data": {

"handle": "18vf956il5v",

"stdout": "",

"stderr": ""

},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}



{
"timestamp": 1441985105.94083,

"message": "Exited with status 23 (0.023s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\",
\"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket
/var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\",
\"--links\", \"vcap(a)container:/tmp/staged/staging_info.yml\",
\"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
"log_level": "warn",

"source": "Warden::Container::Linux",

"data": {

"handle": "18vf956il5v",

"stdout": "",

"stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\"
failed: No such file or directory (2)\nrsync error: some files/attrs were
not transferred (see previous errors) (code 23) at main.c(1655)
[Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}


And I think the second error is just during cleanup, only failing
because the staging process didn't get far enough in to create the
'staging_info.yml'. The one about iomux-link exiting with status 1 is
pretty mysterious though and I have no idea what caused it. Does anyone
know why this might be happening?


Re: tcp-routing in Lattice

Jack Cai
 

One thing I'm wondering is how to provide enough public "ports" for users
to map to. It seems the cloud provider needs to provide multiple public IPs
to map the ports, otherwise they will soon run out of ports on the same IP.
Any thoughts here?

Jack


On Thu, Sep 10, 2015 at 4:56 PM, Atul Kshirsagar <atul.kshirsagar(a)ge.com>
wrote:

Great! Give us your feedback after you have played around with tcp routing.


Re: app auto-scaling in OSS CF contribution

john mcteague <john.mcteague@...>
 

One of the areas of autoscaling we are looking at is application-led
autoscaling. By this I do not mean the apps themselves calling an API to
scale up or down, but having an agreed contract/API between app and
autoscaler by which the application can flag the need to scale up or down
based on app-specific metrics, not just the metrics that are visible via the
standard firehose/CC API. The app would not call an API; it might just expose
an API endpoint that the autoscaler would poll.

For example, if the depth of a message queue exceeds a certain limit, we need
more instances of that app to consume all messages within a given SLA. We
should be able to automate this in an autoscaler, as sketched below.
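
As a sketch only (not an existing component): assuming a hypothetical
/scaling-metrics endpoint on the app that returns JSON such as
{"queue_depth": 1200}, and with jq and the cf CLI available, the
poll-and-scale step could look roughly like this:

APP="my-worker-app"                                               # placeholder app name
METRICS_URL="https://my-worker-app.example.com/scaling-metrics"   # placeholder endpoint
THRESHOLD=1000

# Poll the app's own metric and read the currently configured instance count.
depth=$(curl -s "$METRICS_URL" | jq -r '.queue_depth')
current=$(cf curl "/v2/apps?q=name:$APP" | jq -r '.resources[0].entity.instances')

# Add one instance whenever the app reports a queue depth above the threshold.
if [ "$depth" -gt "$THRESHOLD" ]; then
  cf scale "$APP" -i $((current + 1))
fi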

John

On Tue, Sep 15, 2015 at 4:42 PM, Klevenz, Stephan <stephan.klevenz(a)sap.com>
wrote:

+1

A contribution to the incubator about that feature is very interesting and
will get my attention.

Regards,
Stephan

Von: Siva Balan
Antworten an: "Discussions about Cloud Foundry projects and the system
overall."
Datum: Dienstag, 15. September 2015 17:20
An: "Discussions about Cloud Foundry projects and the system overall."
Betreff: [cf-dev] Re: Re: Re: app auto-scaling in OSS CF contribution

+1 on this feature. We at GE would be very interested in this feature as
well. We would very much like to collaborate on this feature.

Thanks
Siva

On Tue, Sep 15, 2015 at 7:59 AM, Guillaume Berche <bercheg(a)gmail.com>
wrote:

Hi Dies,

Thanks in advance for sharing your work on an open-source autoscaler
component!

At Orange we would be interested in using the autoscaler, and potentially in
plugging the autosleep service [1] we're working on into it, so that the
minimum instance count the autoscaler could set could be zero.

To me, one important missing aspect of using an autoscaling service to handle
changes in workload is effectively handling newly instantiated cold instances
(i.e. instances whose lazily initialized caches are not yet warm, and which
would degrade the user-perceived experience if immediately given a fixed
% of the traffic):
- either have the autoscaling service support sending warmup HTTP requests
to newly created instances (similar to GAE warmup support)
- or have the gorouter support a traffic-ramping setting, so that a cold
instance receives traffic gradually when entering rotation.

Regards,

Guillaume.

[1]
https://docs.google.com/document/d/1tMhIBX3tw7kPEOMCzKhUgmtmr26GVxyXwUTwMO71THI/edit#

On Tue, Sep 15, 2015 at 2:30 AM, ronak banka <ronakbanka.cse(a)gmail.com>
wrote:

Hi Dies,

App auto-scaling is a much-needed feature for CF OSS; a lot of users want
to use this functionality.

Once it's in the incubator, the roadmap can be discussed. Hope to see it
there soon.

Regards,
Ronak Banka
Rakuten, Inc.

On Tue, Sep 15, 2015 at 9:00 AM, Koper, Dies <diesk(a)fast.au.fujitsu.com>
wrote:

Hi,



At Fujitsu we’re developing app auto-scaling and are considering proposing
to move it to the cf incubator.

Before we start open-sourcing it, I wanted to ask if there is any interest
in this in the community, and whether others working on (or considering
working on) one would be interested in collaborating/aligning with us?



We’re looking at providing basic support for scaling up/down based on
metrics like CPU, request count, and a service broker to enable it for your
app.

We can share a detailed functional description for review in a few
weeks.

Depending on priorities, interest and resources available we may add
functionality like sending an email notification in addition to/instead of
scaling, or scale based on other metrics (including app generated custom
metrics).

Either way, we want to make these things pluggable to allow people to
integrate it with their own (closed-source) monitoring agents or custom
actions.




I feel every PaaS comes with free app auto-scaling functionality (PCF,
Bluemix, OpenShift, AWS, …) so OSS CF deserves one too.



I have discussed this plan with Pivotal and they have encouraged me to
send this email to the list.



Please let me know if you have any questions.



Regards,

Dies Koper

diesk(a)fast.au.fujitsu.com



--
http://www.twitter.com/sivabalans


Re: expected? doppler log "lost election for cluster leader"

Rohit Kumar
 

Hi Amit,

The default timeout for the election is 15 seconds, so I would expect those
log lines to show up at that interval.

The syslog_drain_binder election code was written before my time on
Loggregator, so I don't know exactly what the original reason behind doing
the leader election this way was. From my perspective it's easy to
understand and we haven't had any problems with it.

Rohit

On Tue, Sep 15, 2015 at 7:39 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hi Rohit,

To add to Guangcai's question, is it expected for those "lost election"
log lines to be so frequent? Does the component run for election every 15
seconds?
In other leader election protocols that I'm familiar with, the followers
heartbeat to the leader and only hold an election and run for it if they
determine that
there is no leader.

Amit

On Mon, Sep 14, 2015 at 8:22 PM, Rohit Kumar <rokumar(a)pivotal.io> wrote:

Hi Guangcai,

The log messages are coming from the syslog_drain_binder process which is
colocated with dopplers. The syslog_drain_binder is used to poll the
CloudController for active syslog drain URLs for apps. At any point we only
want one syslog_drain_binder to be active, so that the CloudController
doesn't get overloaded with requests. The election process is done to
ensure that.

To answer your question, yes these messages are expected. Secondly, the
syslog_drain_binders will run for election after a specified timeout has
expired. All of them try to create a key in etcd but only one succeeds and
becomes the leader. The exact logic can be found here
<https://github.com/cloudfoundry/loggregator/blob/develop/src/syslog_drain_binder/elector/elector.go#L38-L58>
.
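
For illustration, the same pattern (atomically create a key with a TTL;
whoever succeeds becomes leader) can be exercised against the etcd v2 HTTP
API directly. The etcd address and key path below are made up; the real logic
is in the linked elector.go:

# Try to become leader: the PUT only succeeds if the key does not exist yet,
# and the key expires after 15 seconds unless refreshed.
curl -s -o /dev/null -w '%{http_code}\n' -X PUT \
  'http://10.0.0.5:4001/v2/keys/syslog-drain-binder/leader?prevExist=false' \
  -d value='doppler_z1.0' -d ttl=15
# 201 => this instance won the election; 412 => another instance already holds the key.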

Rohit

On Mon, Sep 14, 2015 at 1:10 AM, Guangcai Wang <guangcai.wang(a)gmail.com>
wrote:

Hi all,

I have 2 doppler instances. I found that if one of the dopplers wins the
election for cluster leader, the other will frequently log "lost election for
cluster leader" as follows. Is this expected?


{"timestamp":1442214238.411536455,"process_id":7212,"source":"syslog_drain_binder","log_level":"info","message":"Elector:
'doppler_z1.0' lost election for cluster
leader.","data":null,"file":"/var/vcap/data/compile/syslog_drain_binder/loggregator/src/syslog_drain_binder/elector/elector.go","line":57,"method":"syslog_drain_binder/elector.(*Elector).RunForElection"}
{"timestamp":1442214253.724292278,"process_id":7212,"source":"syslog_drain_binder","log_level":"info","message":"Elector:
'doppler_z1.0' lost election for cluster
leader.","data":null,"file":"/var/vcap/data/compile/syslog_drain_binder/loggregator/src/syslog_drain_binder/elector/elector.go","line":57,"method":"syslog_drain_binder/elector.(*Elector).RunForElection"}
{"timestamp":1442214269.286961317,"process_id":7212,"source":"syslog_drain_binder","log_level":"info","message":"Elector:
'doppler_z1.0' lost election for cluster
leader.","data":null,"file":"/var/vcap/data/compile/syslog_drain_binder/loggregator/src/syslog_drain_binder/elector/elector.go","line":57,"method":"syslog_drain_binder/elector.(*Elector).RunForElection"}
{"timestamp":1442214284.720170259,"process_id":7212,"source":"syslog_drain_binder","log_level":"info","message":"Elector:
'doppler_z1.0' lost election for cluster
leader.","data":null,"file":"/var/vcap/data/compile/syslog_drain_binder/loggregator/src/syslog_drain_binder/elector/elector.go","line":57,"method":"syslog_drain_binder/elector.(*Elector).RunForElection"}
{"timestamp":1442214300.056922436,"process_id":7212,"source":"syslog_drain_binder","log_level":"info","message":"Elector:
'doppler_z1.0' lost election for cluster
leader.","data":null,"file":"/var/vcap/data/compile/syslog_drain_binder/loggregator/src/syslog_drain_binder/elector/elector.go","line":57,"method":"syslog_drain_binder/elector.(*Elector).RunForElection"}


I also want to know under which conditions/situations they will run the
election for cluster leader again.


Re: FYI: Survey: Cloud Foundry Service Broker Compliance

Mohamed Mohamed <mmohamed@...>
 

Hi, all,

We would like to thank everybody for taking the survey.
However, we have received only a few responses so far.
We would appreciate it if you could spend a few minutes to take the survey and
give us your opinion on CF service broker compliance.

Best regards,

Max, Heiko and Mohamed

Mohamed Mohamed
IBM Almaden Research Center
http://researcher.ibm.com/person/us-mmohamed



From: Michael Maximilien/Almaden/IBM
To: cf-dev(a)lists.cloudfoundry.org
Cc: Mohamed Mohamed/Almaden/IBM(a)IBMUS, Heiko Ludwig/Watson/IBM(a)IBMUS
Date: 09/10/2015 04:42 PM
Subject: FYI: Survey: Cloud Foundry Service Broker Compliance


Hi, all,

I've been working on some side projects with IBM Research to improve
various aspects of CF. Some are pretty early research work and some are
ready to graduate and be presented to you.

One of these relates to compliance of CF Service Brokers. We want to share
this work and make it open. We are planning a series of meetings next week
to socialize, open, and propose incubation. If you're interested, please ping
me directly.

--------
In the meantime, Mohamed and Heiko, my colleagues from IBM Research, and
I have put together a short (literally two-minute) survey to gauge the value
of having Cloud Foundry (CF) service broker compliance.

https://www.surveymonkey.com/r/N37SD85

We'd be grateful if you could find some time to take this short survey
before we start socializing the solution we have been working on.

--------
Feel free to forward the survey link to others who may not be on this
mailing list and who you think should also take the survey.

After we gather results, we will share a summary with everyone by next
Thursday.

All the best,

Mohamed, Heiko, and Max

------
dr.max
ibm cloud labs
silicon valley, ca
maximilien.org


Re: cc metric: total_users

John Liptak
 

Cloud Controller reports the number of users registered in the CF console. UAA
reports additional users who may have access to other applications. So
they are both correct, depending on what you need.

For example, if you take a UAA user that isn't in the CF console and call the
Cloud Controller REST API for that user, you will get a 404.
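
One rough way to compare the two numbers yourself (a sketch, assuming an admin
login with the cf CLI, uaac, and jq installed; the "username" label in uaac's
output is an assumption and may differ by uaac version):

# Users the Cloud Controller knows about (the population behind total_users):
cf curl /v2/users | jq '.total_results'

# Users known to UAA, which can be a superset (e.g. users created directly in UAA):
uaac users | grep -c 'username:'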

On Tue, Sep 15, 2015 at 10:10 AM, Klevenz, Stephan <stephan.klevenz(a)sap.com>
wrote:

Hi all,

I have a question, hopefully a small one :-)

The CloudController.total_users metric
(/CF\.CloudController\.0\..*\.total_users/) differs from number of users
reported by uaac (uaac users command / admin ui). Can someone explain why
this differs or which number is the correct one?

Regards,
Stephan




cc metric: total_users

Klevenz, Stephan <stephan.klevenz@...>
 

Hi all,

I have a question, hopefully a small one :-)

The CloudController.total_users metric (/CF\.CloudController\.0\..*\.total_users/) differs from number of users reported by uaac (uaac users command / admin ui). Can someone explain why this differs or which number is the correct one?

Regards,
Stephan


Re: app auto-scaling in OSS CF contribution

Klevenz, Stephan <stephan.klevenz@...>
 

+1

A contribution to the incubator about that feature is very interesting and will get my attention.

Regards,
Stephan

Von: Siva Balan
Antworten an: "Discussions about Cloud Foundry projects and the system overall."
Datum: Dienstag, 15. September 2015 17:20
An: "Discussions about Cloud Foundry projects and the system overall."
Betreff: [cf-dev] Re: Re: Re: app auto-scaling in OSS CF contribution

+1 on this feature. We at GE would be very interested in this feature as well. We would very much like to collaborate on this feature.

Thanks
Siva

On Tue, Sep 15, 2015 at 7:59 AM, Guillaume Berche <bercheg(a)gmail.com<mailto:bercheg(a)gmail.com>> wrote:
Hi Dies,

Thanks in advance for sharing your work on an open-source autoscaler component!

At Orange we would be interested in using the autoscaler, and potentially in plugging the autosleep service [1] we're working on into it, so that the minimum instance count the autoscaler could set could be zero.

To me, one important missing aspect of using an autoscaling service to handle changes in workload is effectively handling newly instantiated cold instances (i.e. instances whose lazily initialized caches are not yet warm, and which would degrade the user-perceived experience if immediately given a fixed % of the traffic):
- either have the autoscaling service support sending warmup HTTP requests to newly created instances (similar to GAE warmup support)
- or have the gorouter support a traffic-ramping setting, so that a cold instance receives traffic gradually when entering rotation.

Regards,

Guillaume.

[1] https://docs.google.com/document/d/1tMhIBX3tw7kPEOMCzKhUgmtmr26GVxyXwUTwMO71THI/edit#

On Tue, Sep 15, 2015 at 2:30 AM, ronak banka <ronakbanka.cse(a)gmail.com<mailto:ronakbanka.cse(a)gmail.com>> wrote:
Hi Dies,

App auto-scaling is a much-needed feature for CF OSS; a lot of users want to use this functionality.

Once it's in the incubator, the roadmap can be discussed. Hope to see it there soon.

Regards,
Ronak Banka
Rakuten, Inc.

On Tue, Sep 15, 2015 at 9:00 AM, Koper, Dies <diesk(a)fast.au.fujitsu.com<mailto:diesk(a)fast.au.fujitsu.com>> wrote:
Hi,

At Fujitsu we’re developing app auto-scaling and are considering proposing to move it to the cf incubator.
Before we start open-sourcing it, I wanted to ask if there is any interest in this in the community, and whether others working on (or considering working on) one would be interested in collaborating/aligning with us?

We’re looking at providing basic support for scaling up/down based on metrics like CPU, request count, and a service broker to enable it for your app.
We can share a detailed functional description for review in a few weeks.
Depending on priorities, interest and resources available we may add functionality like sending an email notification in addition to/instead of scaling, or scale based on other metrics (including app generated custom metrics).
Either way, we want to make these things pluggable to allow people to integrate it with their own (closed-source) monitoring agents or custom actions.

I feel every PaaS comes with free app auto-scaling functionality (PCF, Bluemix, OpenShift, AWS, …) so OSS CF deserves one too.

I have discussed this plan with Pivotal and they have encouraged me to send this email to the list.

Please let me know if you have any questions.

Regards,
Dies Koper
diesk(a)fast.au.fujitsu.com<mailto:diesk(a)fast.au.fujitsu.com>






--
http://www.twitter.com/sivabalans


Re: app auto-scaling in OSS CF contribution

Siva Balan <mailsiva@...>
 

+1 on this feature. We at GE would be very interested in this feature as
well. We would very much like to collaborate on this feature.

Thanks
Siva

On Tue, Sep 15, 2015 at 7:59 AM, Guillaume Berche <bercheg(a)gmail.com> wrote:

Hi Dies,

Thanks in advance for sharing your work on an open-source autoscaler
component!

At Orange we would be interested in using the autoscaler, and potentially in
plugging the autosleep service [1] we're working on into it, so that the
minimum instance count the autoscaler could set could be zero.

To me, one important missing aspect of using an autoscaling service to handle
changes in workload is effectively handling newly instantiated cold instances
(i.e. instances whose lazily initialized caches are not yet warm, and which
would degrade the user-perceived experience if immediately given a fixed
% of the traffic):
- either have the autoscaling service support sending warmup HTTP requests
to newly created instances (similar to GAE warmup support)
- or have the gorouter support a traffic-ramping setting, so that a cold
instance receives traffic gradually when entering rotation.

Regards,

Guillaume.

[1]
https://docs.google.com/document/d/1tMhIBX3tw7kPEOMCzKhUgmtmr26GVxyXwUTwMO71THI/edit#

On Tue, Sep 15, 2015 at 2:30 AM, ronak banka <ronakbanka.cse(a)gmail.com>
wrote:

Hi Dies,

App auto-scaling is a much-needed feature for CF OSS; a lot of users want to
use this functionality.

Once it's in the incubator, the roadmap can be discussed. Hope to see it
there soon.

Regards,
Ronak Banka
Rakuten, Inc.

On Tue, Sep 15, 2015 at 9:00 AM, Koper, Dies <diesk(a)fast.au.fujitsu.com>
wrote:

Hi,



At Fujitsu we’re developing app auto-scaling and are considering proposing
to move it to the cf incubator.

Before we start open-sourcing it, I wanted to ask if there is any interest
in this in the community, and whether others working on (or considering
working on) one would be interested in collaborating/aligning with us?



We’re looking at providing basic support for scaling up/down based on
metrics like CPU, request count, and a service broker to enable it for your
app.

We can share a detailed functional description for review in a few weeks.

Depending on priorities, interest and resources available we may add
functionality like sending an email notification in addition to/instead of
scaling, or scale based on other metrics (including app generated custom
metrics).

Either way, we want to make these things pluggable to allow people to
integrate it with their own (closed-source) monitoring agents or custom
actions.




I feel every PaaS comes with free app auto-scaling functionality (PCF,
Bluemix, OpenShift, AWS, …) so OSS CF deserves one too.



I have discussed this plan with Pivotal and they have encouraged me to
send this email to the list.



Please let me know if you have any questions.



Regards,

Dies Koper

diesk(a)fast.au.fujitsu.com



Re: User cannot do CF login when UAA is being updated

Filip Hanik
 

Amit, you are right. If there is no DB stored on NFS, then both DB and one
UAA instance should be available to handle requests.


On Tue, Sep 15, 2015 at 8:05 AM, Lomov Alexander <
alexander.lomov(a)altoros.com> wrote:

Thank you for clarification. This makes sense.

On Sep 15, 2015, at 1:09 PM, CF Runtime <cfruntime(a)gmail.com> wrote:

Couple of updates here for clarity. No databases are stored on NFS in any
default installation. NFS is only used to store blobstore data. If you are
using the postgres job from cf-release, since it is single node there will
be downtime during a stemcell deploy.

I talked with Dies from Fujitsu earlier and confirmed they are NOT using
the postgres job but an external non-cf deployed postgres instance. So
during a deploy, the UAA db should be up and available the entire time.

The issue they are seeing is that even though the database is up, and I'm
guessing there is at least a single node of UAA up during the deploy, there
are still login failures.

Joseph
OSS Release Integration Team

On Mon, Sep 14, 2015 at 6:39 PM, Filip Hanik <fhanik(a)pivotal.io> wrote:

Amit, see previous comment.

The PostgreSQL database is stored on NFS, which is restarted during the nfs
job update.

UAA, while up, is non-functional while the NFS job is updated because it
can't reach the DB.



On Mon, Sep 14, 2015 at 5:09 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hi Ricky,

My understanding is that you still need help, and the issues Jiang and
Alexander raised are different. To avoid confusion, let's keep this thread
focused on your issue.

Can you confirm that you have two UAA VMs in separate bosh jobs,
separate AZs, etc.? Can you confirm that when you roll the UAAs, only one
goes down at a time? The simplest way to effect a roll is to change some
trivial property in the manifest for your UAA jobs. If you're using v215,
any of the properties referenced here will do:


https://github.com/cloudfoundry/cf-release/blob/v215/jobs/uaa/spec#L321-L335

You should confirm that only one UAA is down at a time, and comes back
up before bosh moves on to updating the other UAA.

While this roll is happening, can you just do `CF_TRACE=true cf auth
USERNAME PASSWORD` in a loop, and if you see one that fails, post the
output, along with noting the state of the bosh deploy when the error
happens.

Thanks,
Amit

On Mon, Sep 14, 2015 at 10:51 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

Ricky, Jiang, Alexander, are the three of you working together? It's
hard to tell since you've got Fujitsu, Gmail, and Altoros email addresses.
Are you folks talking about the same issue with the same deployment, or
three separate issues?

Ricky, if you still need assistance with your issue, please let us
know.

On Mon, Sep 14, 2015 at 10:16 AM, Lomov Alexander <
alexander.lomov(a)altoros.com> wrote:

Yes, the problem is that the postgresql database is stored on NFS, which is
restarted during the nfs job update. I’m sure that you’ll be able to run
updates without an outage with several customizations.

It is hard to tell without knowing your environment, but in the common
case the steps will be as follows:


1. Add additional instances to the nfs job and customize it to enable
replication (for instance, use these docs for release customization [1])
2. Make your NFS job update sequentially, without other jobs updating
in parallel (as is done for postgresql [2])
3. Check your options in the update section [3].


[1] https://help.ubuntu.com/community/HighlyAvailableNFS
[2]
https://github.com/cloudfoundry/cf-release/blob/master/example_manifests/minimal-aws.yml#L115-L116
[3]
https://github.com/cloudfoundry/cf-release/blob/master/example_manifests/minimal-aws.yml#L57-L62

On Sep 14, 2015, at 9:47 AM, Yitao Jiang <jiangyt.cn(a)gmail.com> wrote:

On upgrading the deployment, UAA stopped working due to the uaadb
filesystem hanging. In my environment, the nfs-wal-server's IP changed,
which caused uaadb and ccdb to hang. Hard-rebooting the uaadb and restarting
the uaa service solved the issue.

Hope this helps.

On Mon, Sep 14, 2015 at 2:13 PM, Yunata, Ricky <
rickyy(a)fast.au.fujitsu.com> wrote:

Hello,



I have a question regarding UAA in Cloud Foundry. I’m currently
running Cloud Foundry on Openstack.

I have 2 availability zones and redundancy of the important VMs
including UAA.

Whenever I do an upgrade of either a stemcell or a CF release, users are
not able to do a CF login while CF is updating the UAA VM.

My question is, is this normal behaviour? If I have a redundant UAA
VM, shouldn’t users still be able to log in to the apps even though
it’s being updated?

I’ve done this test a few times, with different CF versions and
stemcells, and all of them give me the same result. The latest test
that I’ve done was to upgrade the CF version from 212 to 215.

Has anyone experienced the same issue?



Regards,

Ricky


--

Regards,

Yitao
jiangyt.github.io



Re: Starting Spring Boot App after deploying it to CF

Qing Gong
 

Perfect! Thanks.
