A couple of updates here for clarity. No databases are stored on NFS in any default installation. NFS is only used to store blobstore data. If you are using the postgres job from cf-release, it is a single node, so there will be downtime during a stemcell deploy.
I talked with Dies from Fujitsu earlier and confirmed they are NOT using the postgres job but an external non-cf deployed postgres instance. So during a deploy, the UAA db should be up and available the entire time.
The issue they are seeing is that even though the database is up, and I'm guessing there is at least a single node of UAA up during the deploy, there are still login failures.
Joseph
OSS Release Integration Team
On Mon, Sep 14, 2015 at 6:39 PM, Filip Hanik <fhanik(a)pivotal.io> wrote:
Amit, see previous comment.
The PostgreSQL database is stored on NFS, which is restarted during the nfs job update.
UAA, while up, is non-functional while the NFS job is updated because it can't reach the DB.
On Mon, Sep 14, 2015 at 5:09 PM, Amit Gupta <agupta(a)pivotal.io> wrote:
Hi Ricky,
My understanding is that you still need help, and the issues Jiang and Alexander raised are different. To avoid confusion, let's keep this thread focused on your issue.
Can you confirm that you have two UAA VMs in separate bosh jobs, separate AZs, etc.? Can you confirm that when you roll the UAAs, only one goes down at a time? The simplest way to effect a roll is to change some trivial property in the manifest for your UAA jobs. If you're using v215, any of the properties referenced here will do:
https://github.com/cloudfoundry/cf-release/blob/v215/jobs/uaa/spec#L321-L335
You should confirm that only one UAA is down at a time, and comes back up before bosh moves on to updating the other UAA.
While this roll is happening, can you run `CF_TRACE=true cf auth USERNAME PASSWORD` in a loop? If you see one that fails, post the output, and note the state of the bosh deploy when the error happens.
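The loop described above could be sketched roughly as follows. This is only an illustration, not an official tool: the function name `auth_loop`, the log file name, and the `CF_USER`/`CF_PASS` variables are all placeholders. It runs whatever command it is given a fixed number of times, appends the output of each failed attempt to a log file with a UTC timestamp, and prints the number of failures at the end. The state of the bosh deploy still has to be noted by hand and matched against the timestamps.

```shell
# Sketch only: auth_loop, the log file name, and CF_USER/CF_PASS are
# hypothetical names chosen for this example.
#
# Usage: auth_loop ATTEMPTS DELAY_SECONDS LOGFILE COMMAND [ARGS...]
# Runs COMMAND repeatedly, logs the output of every failed attempt with a
# UTC timestamp, and prints the total number of failures.
auth_loop() (
  attempts="$1"; delay="$2"; logfile="$3"
  shift 3
  i=1
  failures=0
  while [ "$i" -le "$attempts" ]; do
    # Capture stdout and stderr; the branch runs only on a non-zero exit.
    if ! output="$("$@" 2>&1)"; then
      failures=$((failures + 1))
      {
        printf '=== attempt %d failed at %s ===\n' "$i" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
        printf '%s\n' "$output"
      } >> "$logfile"
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  printf '%s\n' "$failures"
)

# Example invocation during the deploy (about ten minutes at one try per second):
#   auth_loop 600 1 cf-auth-failures.log env CF_TRACE=true cf auth "$CF_USER" "$CF_PASS"
```

The trace output can contain request details, so it is worth reviewing the log before posting it to the list.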
Thanks, Amit
On Mon, Sep 14, 2015 at 10:51 AM, Amit Gupta <agupta(a)pivotal.io> wrote:
Ricky, Jiang, Alexander, are the three of you working together? It's hard to tell since you've got Fujitsu, Gmail, and Altoros email addresses. Are you folks talking about the same issue with the same deployment, or three separate issues?
Ricky, if you still need assistance with your issue, please let us know.
On Mon, Sep 14, 2015 at 10:16 AM, Lomov Alexander < alexander.lomov(a)altoros.com> wrote:
Yes, the problem is that the PostgreSQL database is stored on NFS, which is restarted during the nfs job update. I’m sure that you’ll be able to run updates without an outage with a few customizations.
It is hard to tell without knowing your environment, but in the common case the steps are as follows:
1. Add additional instances to the nfs job and customize it to replicate data (for instance, use these docs for release customization [1]).
2. Make your NFS job update sequentially, without other jobs updating in parallel (as is done for postgresql [2]).
3. Check your options in the update section [3].
[1] https://help.ubuntu.com/community/HighlyAvailableNFS
[2] https://github.com/cloudfoundry/cf-release/blob/master/example_manifests/minimal-aws.yml#L115-L116
[3] https://github.com/cloudfoundry/cf-release/blob/master/example_manifests/minimal-aws.yml#L57-L62
On Sep 14, 2015, at 9:47 AM, Yitao Jiang <jiangyt.cn(a)gmail.com> wrote:
When upgrading the deployment, UAA stopped working because the uaadb filesystem hung. In my environment, the nfs-wal-server's IP changed, which caused uaadb and ccdb to hang. Hard rebooting uaadb and restarting the uaa service solved the issue.
Hope this helps.
On Mon, Sep 14, 2015 at 2:13 PM, Yunata, Ricky < rickyy(a)fast.au.fujitsu.com> wrote:
Hello,
I have a question regarding UAA in Cloud Foundry. I’m currently running Cloud Foundry on Openstack.
I have 2 availability zones and redundancy of the important VMs including UAA.
Whenever I upgrade either the stemcell or the CF release, users are not able to do a CF login while CF is updating the UAA VM.
My question is: is this normal behaviour? If I have redundant UAA VMs, shouldn’t users still be able to log in to the apps even though UAA is being updated?
I’ve done this test a few times, with different CF versions and stemcells, and all of them gave me the same result. The latest test I did was to upgrade CF from version 212 to 215.
Has anyone experienced the same issue?
Regards,
Ricky
--
Regards,
Yitao
jiangyt.github.io