How to prevent compilation VMs from being deleted?
R M
Hi there,

Is there any way to prevent compilation VMs from being destroyed when components fail to compile? I would like to be able to look inside the compilation VMs to troubleshoot the failures. I tried following https://starkandwayne.com/blog/how-to-lock-vcap-password-for-bosh-vms/ but wasn't sure where the suggested section goes in relation to my `cf` deploy command:

bosh -e bosh-1 -d cf deploy cf-deployment.yml -v system_domain=abc.com --vars-store=/tmp/cloud-foundry/bosh/director-creds.yml

Thanks for any pointers.
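One knob worth checking (an assumption to verify against your bosh release's job spec, and it may or may not cover the compile-failure case depending on the failure mode) is the Director's `director.debug.keep_unreachable_vms` property, which tells the Director not to delete VMs it can no longer reach. A minimal ops-file sketch, assuming a standard bosh-deployment layout:

```yaml
# Sketch: ops file for the *director* deployment (not cf-deployment itself).
# Property name assumed from the bosh release job spec; verify for your version.
- type: replace
  path: /instance_groups/name=bosh/properties/director/debug?/keep_unreachable_vms
  value: true
```

This would be applied when recreating the Director (e.g. `bosh create-env bosh.yml -o keep-unreachable-vms.yml ...`), not on the `bosh deploy` command shown above.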
|||||||||||||||||||||
cluster auto-scale with bosh
dhensel@...
Hello,
Does BOSH have an auto-scale feature that can be used with VMware in a K8s deployment? I know K8s has auto-scaling, but from my understanding it is not available on VMware.
Thanks,
-Doug
|||||||||||||||||||||
Removal of PowerDNS from BOSH
Morgan Fine
Hi BOSH Friends,

PowerDNS has existed in BOSH for a long time to support service discovery of BOSH deployments. More recently, BOSH DNS has become available as the recommended way of doing service discovery. The BOSH Director team would like to remove PowerDNS from the BOSH Director in an effort to formally make BOSH DNS the supported and recommended way of doing service discovery. If you have any feedback or concerns about this approach, please let me know.

Thanks,
Morgan Fine
PM of BOSH Director Team
|||||||||||||||||||||
BOSH PMC meeting for June cancelled
Marco Voelz
Dear friends of BOSH,
I just cancelled the BOSH PMC meeting for June, which was originally scheduled for June 20th, as I will be on vacation. I have updated the CFF calendar accordingly and deleted the event. See you at the next BOSH PMC meeting in July!

Warm regards,
Marco
|||||||||||||||||||||
Re: BOSH deployment times out pinging agent after 600 seconds (s390x platform)
R M
I see some of these messages in NATS; not sure if this is a problem. Interestingly, the Director's connection is getting auth errors, while the agent VM is able to post to NATS:

/==================================/
[1] 2019/06/11 12:32:14.863072 [TRC] 192.168.20.10:35232 - cid:3 - ->> [CONNECT {"verbose":false,"pedantic":false,"lang":"ruby","version":"0.9.2","protocol":1,"ssl_required":true,"tls_required":true}]
[1] 2019/06/11 12:32:14.863095 [ERR] 192.168.20.10:35232 - cid:3 - Authorization Error <--- Director VM to NATS fails
[1] 2019/06/11 12:32:14.863101 [TRC] 192.168.20.10:35232 - cid:3 - <<- [-ERR Authorization Violation]
[1] 2019/06/11 12:32:14.863138 [DBG] 192.168.20.10:35232 - cid:3 - Client connection closed
[1] 2019/06/11 12:32:14.865154 [DBG] 192.168.20.10:35234 - cid:4 - Client connection created
.... [1] 2019/06/11 12:32:48.757875 [TRC] 192.168.20.3:42022 - cid:5 - ->> [CONNECT {"user":"nats","pass":"739rv2lksjyyfdhmqo49","verbose":true,"pedantic":true}]
[1] 2019/06/11 12:32:48.757925 [TRC] 192.168.20.3:42022 - cid:5 - <<- [OK] <--- Agent VM seems ok and is able to publish using PWD authentication (?)
[1] 2019/06/11 12:32:48.759034 [TRC] 192.168.20.3:42022 - cid:5 - ->> [PUB hm.agent.heartbeat.e01ca0da-d340-4f08-ad1a-8a29ebb8abc9 356]
[1] 2019/06/11 12:32:48.759056 [TRC] 192.168.20.3:42022 - cid:5 - ->> MSG_PAYLOAD: [{"deployment":"","job":null,"index":null,"job_state":"running","vitals":{"cpu":{"sys":"0.4","user":"0.3","wait":"0.0"},"disk":{"ephemeral":{"inode_percent":"0","percent":"0"},"system":{"inode_percent":"28","percent":"42"}},"load":["0.00","0.00","0.00"],"mem":{"kb":"152908","percent":"2"},"swap":{"kb":"0","percent":"0"},"uptime":{"secs":24}},"node_id":""}]
[1] 2019/06/11 12:32:48.759069 [TRC] 192.168.20.3:42022 - cid:5 - <<- [OK]
[1] 2019/06/11 12:32:48.759637 [TRC] 192.168.20.3:42022 - cid:5 - ->> [SUB agent.e01ca0da-d340-4f08-ad1a-8a29ebb8abc9 1]
/==================================/
Do I need to specify NATS certs as part of my deployment request, e.g. `BOSH_LOG_LEVEL=info bosh -e bosh-1 -d redis-deployment deploy manifest.yml --certs`?
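When sifting a much larger NATS trace log than the excerpt above, it can help to tally which peer IPs are failing authorization versus connecting cleanly. This is a throwaway sketch (the line format is assumed from the gnatsd trace output shown in this message):

```python
import re
from collections import defaultdict

# Matches gnatsd trace/error lines such as:
# [1] 2019/06/11 12:32:14.863095 [ERR] 192.168.20.10:35232 - cid:3 - Authorization Error
LINE = re.compile(r"\[(?P<lvl>[A-Z]{3})\] (?P<peer>[\d.]+):\d+ - cid:\d+ - (?P<msg>.*)")

def auth_failures_by_peer(log_lines):
    """Return {peer_ip: count} of lines mentioning an authorization problem."""
    failures = defaultdict(int)
    for line in log_lines:
        m = LINE.search(line)
        if m and "Authorization" in m.group("msg"):
            failures[m.group("peer")] += 1
    return dict(failures)

sample = [
    '[1] 2019/06/11 12:32:14.863095 [ERR] 192.168.20.10:35232 - cid:3 - Authorization Error',
    '[1] 2019/06/11 12:32:48.757925 [TRC] 192.168.20.3:42022 - cid:5 - <<- [OK]',
]
print(auth_failures_by_peer(sample))  # {'192.168.20.10': 1}
```

In a log like the one above, this would show the Director's IP accumulating failures while the agent's IP stays clean.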
|||||||||||||||||||||
BOSH deployment times out pinging agent after 600 seconds (s390x platform)
R M
Hi there,
Using OpenStack Rocky on s390x, I built a (Xenial) stemcell and ported BOSH over to the s390x platform. For the most part it seems to work. However, the deployment times out during the "Compiling packages" stage, and I am unable to figure out why. The Director VM and compilation VM seem to be able to ping each other, and NATS messages are being posted by the compilation VM. Please let me know where else I could look for clues. Here are my steps:

/====================================/
BOSH_LOG_LEVEL=info bosh -e bosh-1 -d redis-deployment deploy manifest.yml
....
Task 61 | 19:19:07 | Preparing deployment: Preparing deployment (00:00:00)
Task 61 | 19:19:07 | Preparing package compilation: Finding packages to compile (00:00:00)
Task 61 | 19:19:07 | Compiling packages: redis/b8455f0a7551849b841b759fc44d2c1eff79331b (00:10:27)
L Error: Timed out pinging to c3080c5f-d79b-48f8-a117-8629cf4b6c3c after 600 seconds
Task 61 | 19:29:34 | Error: Timed out pinging to c3080c5f-d79b-48f8-a117-8629cf4b6c3c after 600 seconds
Task 61 Started Fri Jun 7 19:19:07 UTC 2019
Task 61 Finished Fri Jun 7 19:29:34 UTC 2019
Task 61 Duration 00:10:27
Task 61 error
[CLI] 2019/06/07 15:29:34 ERROR - Updating deployment: Expected task '61' to succeed but state is 'error'
/====================================/

My compilation VM's agent log at /var/vcap/bosh/log/current doesn't seem to indicate any issues:

/====================================/
...
2019-06-07_18:02:48.15948 [File System] 2019/06/07 18:02:48 DEBUG - Checking if file exists /var/vcap/bosh/spec.json
2019-06-07_18:02:48.15948 [File System] 2019/06/07 18:02:48 DEBUG - Stat '/var/vcap/bosh/spec.json'
2019-06-07_18:02:48.15949 [File System] 2019/06/07 18:02:48 DEBUG - Writing /var/vcap/instance/health.json
2019-06-07_18:02:48.15949 [File System] 2019/06/07 18:02:48 DEBUG - Making dir /var/vcap/instance with perm 0777
2019-06-07_18:02:48.15949 [File System] 2019/06/07 18:02:48 DEBUG - Write content
2019-06-07_18:02:48.15949 ********************
2019-06-07_18:02:48.15950 {"state":"running"}
2019-06-07_18:02:48.15950 ********************
2019-06-07_18:02:48.15950 [NATS Handler] 2019/06/07 18:02:48 INFO - Sending hm message 'heartbeat'
2019-06-07_18:02:48.15950 [NATS Handler] 2019/06/07 18:02:48 DEBUG - Message Payload
2019-06-07_18:02:48.15951 ********************
2019-06-07_18:02:48.15951 {"deployment":"","job":null,"index":null,"job_state":"running","vitals":{"cpu":{"sys":"0.0","user":"0.0","wait":"0.0"},"disk":{"ephemeral":{"inode_percent":"0","percent":"0"},"system":{"inode_percent":"28","percent":"42"}},"load":["0.00","0.00","0.00"],"mem":{"kb":"156596","percent":"2"},"swap":{"kb":"0","percent":"0"},"uptime":{"secs":289}},"node_id":""}
2019-06-07_18:02:48.15952 ********************
2019-06-07_18:02:48.15952 [Cmd Runner] 2019/06/07 18:02:48 DEBUG - Running command 'route -n'
2019-06-07_18:02:48.16047 [Cmd Runner] 2019/06/07 18:02:48 DEBUG - Successful: true (0)
/====================================/

I have also removed the "ephemeral" option from my OpenStack flavor as per https://github.com/cloudfoundry/bosh/issues/2044. Any tips on debugging this further would be greatly appreciated. Thanks.
|||||||||||||||||||||
Bosh VM deployment priority
Thor
Dear cf-bosh group,
I am using BOSH to deploy VMs in a vSphere environment. The cloud manifest contains drs_rules specifying separate_vms. When I request 5 VMs in a 5-host environment, those 5 VMs are deployed onto the 5 different hosts. I can see in vCenter that a DRS rule has been created which contains the 5 VMs. Furthermore, if I migrate one of the VMs to a host which already has a VM, the VMs are shortly afterwards rebalanced to meet the anti-affinity rules. I also see in vCenter that all 5 VMs have custom attributes with drs_rule set to anti-affinity. Good.

However, when I attempt to deploy 6 VMs in a 5-host environment, BOSH deploys the first 5 VMs onto the 5 available hosts, then the deployment fails for the 6th VM with the following error:

********
Task 15 | 22:02:15 | Updating instance worker: worker/bdcceaf4-cbc0-4238-87e9-9f6234273b80 (3) (00:01:41)
L Error: Unknown CPI error 'Unknown' with message 'Could not power on VM '<[Vim.VirtualMachine] vm-27888>': DRS cannot find a host to power on or migrate the virtual machine.' in 'create_vm' CPI method (CPI request ID: 'cpi-867860')
********

That error makes sense. The deployment fails with:

*******
Updating deployment: Expected task '15' to succeed but state is 'error'
Exit code 1
*******

At this point I would have expected (perhaps incorrectly) that the 6th VM would remain powered off, but this is not the case. After a few minutes the VM is powered on and scheduled onto a host which already has another VM running on it, violating the anti-affinity rule specified in the cloud manifest. When I look at the 6th VM in vCenter, I see that the VM does NOT have the custom attribute with drs_rule set to anti-affinity. I believe this is what allows vCenter to schedule the VM onto a running host, because that VM is not in the anti-affinity group.

Questions:
1) Does BOSH (by design) prioritize starting the number of requested VMs (in my case 6) over the requested anti-affinity rules (which in my mind would prevent the 6th VM from being powered up)?
2) If "yes" to question 1), is there an option to prevent the 6th VM from being started?
3) If "no" to question 1), is this a bug?

Sincerely,
Thor
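As a reference point for readers (not from Thor's message): with the vSphere CPI, drs_rules are declared per cluster inside the VM type's cloud properties. A sketch of the kind of configuration described above; the datacenter and cluster names are placeholders, and the exact schema should be verified against your vSphere CPI version:

```yaml
# Cloud-config sketch (vSphere CPI); names below are placeholders
vm_types:
- name: worker
  cloud_properties:
    cpu: 2
    ram: 4096
    disk: 10240
    datacenters:
    - name: my-dc
      clusters:
      - my-cluster:
          drs_rules:
          - name: worker-anti-affinity
            type: separate_vms
```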
|||||||||||||||||||||
Cloud Foundry Summit CFP Deadline is Friday
Just wanted to remind you all about the CFP deadline that is creeping up. Please go ahead and get your submissions in before this Friday.

--
Swarna Podila (she/her)
Senior Director, Community | Cloud Foundry Foundation
You can read more about pronouns here, or please ask if you'd like to find out more.

---------- Forwarded message ---------
From: Deborah Giles <dgiles@...>
Date: Wed, May 29, 2019 at 8:34 AM
Subject: Cloud Foundry Summit CFP Deadline is Friday
To: <spodila@...>
|||||||||||||||||||||
[Proposal] CF-RFC 020: BOSH: setting global kernel parameters by jobs
Stanislav German-Evtushenko
Hi, everyone,
Anyone who has tried to ensure that kernel parameters are in a certain state for a particular BOSH job will have noticed that there is no simple and reliable way to do this (see https://github.com/cloudfoundry/routing-release/issues/138 for an example). I would like to share a proposal aimed at solving this.
Link to the proposal: https://docs.google.com/document/d/1BEi2A5T47K8f26B-QSotuYr-16tUCNRbX6BumNKXb7c Pull request: https://github.com/cloudfoundry/os-conf-release/pull/47 Slack channel:
https://cloudfoundry.slack.com/messages/CJZLX6NDT
Please let us know your opinions here, on the document, or in the Slack channel.
Thanks, Stanislav
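For context on the current state of the art (my addition, not part of the proposal): the referenced pull request builds on the os-conf release's sysctl job, which is the closest existing mechanism. A hedged sketch of how it is typically added to an instance group's jobs list; verify the job and property names against your os-conf release version:

```yaml
# Sketch: applying kernel parameters via os-conf's sysctl job
# (property format assumed from the os-conf release docs)
- name: sysctl
  release: os-conf
  properties:
    sysctl:
    - net.ipv4.tcp_fin_timeout=10
    - net.ipv4.tcp_tw_reuse=1
```

The proposal addresses what this approach cannot: making such settings reliable when multiple jobs on one VM need them.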
|||||||||||||||||||||
Re: Removing Support for v1 Style Manifests
Morgan Fine
Hey Jean,

Apologies for the delay, and thank you for raising your concerns around the removal of v1 manifest support. Looking at the GitHub issue you raised, it seems that, while not ideal from your perspective, the workflow Maya proposed would work. It's also worth mentioning that we're thinking about what it might look like for BOSH to manage more and more of the IaaS resources your issue references, so that could be an alternative way to support your use case.

That being said, we are still planning to proceed with removing v1 manifests. It would be great if you could give some of the proposals in that GitHub issue a try and leave us feedback there; that way we can work to ensure your use case is still covered in a v2 world.

Thank you again for the feedback.

Best,
Morgan
On Mon, Apr 1, 2019 at 8:38 AM Jeanyhwh Desulme via Lists.Cloudfoundry.Org <jeanyhwh.desulme=libertymutual.com@...> wrote: Hey Morgan -
|||||||||||||||||||||
[cf-dev] [CFEU2019 Registration] Contributor Code
I meant to share this with the BOSH folks as well; apologies for the miss on my part.

Please make note of the Contributor Summit at Cloud Foundry Summit EU (Sep 10) while making travel plans to The Hague. Please also note the Contributor code for Cloud Foundry Summit EU registration if you're a current or past Cloud Foundry Contributor.

--
Swarna Podila (she/her)
Senior Director, Community | Cloud Foundry Foundation
You can read more about pronouns here, or please ask if you'd like to find out more.

---------- Forwarded message ---------
From: Swarna Podila via Lists.Cloudfoundry.Org <spodila=cloudfoundry.org@...>
Date: Tue, May 14, 2019 at 3:45 PM
Subject: Re: [cf-dev] [CFEU2019 Registration] Contributor Code
To: CF Developers Mailing List <cf-dev@...>

Dear Cloud Foundry Contributors,

I just wanted to surface the Contributor Code email in case you need it to register for Cloud Foundry Summit EU. As you make travel plans, please also note that the Contributor Summit will again be hosted as a Day Zero event, on Sep 10th. We are still finalizing the specifics (start/end times, schedule, etc.), but please do plan to get to The Hague in time for the Contributor Summit. Thank you.

--
Swarna Podila (she/her)
Senior Director, Community | Cloud Foundry Foundation
You can read more about pronouns here, or please ask if you'd like to find out more.
On Wed, Apr 17, 2019 at 11:18 AM Swarna Podila <spodila@...> wrote:
|||||||||||||||||||||
[Operators SIG Meeting] Call for Content
Dear Cloud Foundry Community,

Recently, the Cloud Foundry platform operators at SAP initiated an "operators SIG meeting" to bring operators together and share experiences, good practices, etc. The recording of the first meeting is here. The recurring call is scheduled for every fourth Wednesday of the month at 8AM US Pacific, so the next call is on Wednesday, May 22. If you'd like to discuss specific topics, you can either unicast me or add them to the meeting notes. Join the discussion on Cloud Foundry Slack at #cf-operators.

--
Swarna Podila (she/her)
Senior Director, Community | Cloud Foundry Foundation
You can read more about pronouns here, or please ask if you'd like to find out more.
|||||||||||||||||||||
Vote Today! CF Summit Track Co-Chairs
Please vote for the track co-chairs who will help shape the content at the next Cloud Foundry Summit Europe. Deadline: May 13, 11:59PM US Pacific.

--
Swarna Podila (she/her)
Senior Director, Community | Cloud Foundry Foundation
You can read more about pronouns here, or please ask if you'd like to find out more.
|||||||||||||||||||||
[Important Notice] End of support for trusty stemcells
Mukesh Gadiya
Hi BOSH OSS Community,

We want to let y'all know that Canonical stopped providing security updates for Trusty (Ubuntu 14.04) on April 30th, 2019. In the BOSH OSS world, this means we cannot continue to provide security updates for the trusty stemcell line 3586, but we will continue to support and ship security patches for the Xenial stemcell lines.

Thanks,
BOSH Systems team
|||||||||||||||||||||
Re: Migrating Bosh-Acceptance-Tests to v2 manifests
Riegger, Felix
Hi all,
In the meantime, this has been tested against OpenStack. Only minor changes were necessary, and these have been integrated into the v2_manifest branch.

Kind regards,
Felix
|||||||||||||||||||||
Re: Migrating Bosh-Acceptance-Tests to v2 manifests
Belinda Liu <bliu@...>
Hey Jason and Mike,

Thanks for calling that out. We don't have any environments on hand to test Azure and OpenStack ourselves, so we have synced up with the OpenStack CPI and Azure CPI teams to test these changes. We don't intend to merge before we get feedback from them.

Thanks,
Belinda
On Thu, Apr 25, 2019 at 8:06 AM Jason Stevens <Jason.Stevens@...> wrote:
|||||||||||||||||||||
Re: Migrating Bosh-Acceptance-Tests to v2 manifests
Jason Stevens
Yes, I’m curious about that, too. I would think we’d want to test on Azure before a merge.
-jps
Create clarity. Generate energy. Find solutions.
From: cf-bosh@... <cf-bosh@...> on behalf of Mike Lloyd via Lists.Cloudfoundry.Org <mike=reboot3times.org@...>
Sent: Thursday, April 25, 2019 7:25 AM
To: cf-bosh@...
Cc: Belinda Liu; Morgan Fine
Subject: Re: [cf-bosh] Migrating Bosh-Acceptance-Tests to v2 manifests

Charles and Belinda,
Is there a reason Azure and Openstack are still untested?
Thanks,
Mike.
From: cf-bosh@... <cf-bosh@...> On Behalf Of via Lists.Cloudfoundry.Org
Hi Bosh community,
tldr; BATs is now on v2 manifests. If you run BATs in your pipeline, please try running against the v2_manifest branch before it is merged into master and let us know if you have any feedback. Hopefully it should just work.
We are migrating BATs (https://github.com/cloudfoundry/bosh-acceptance-tests) to use v2 manifests + cloud config. Using v2 manifest to deploy on bosh has been the standard for a while, but BATs has never been updated. We hope to deprecate v1 manifests very soon and need to upgrade BATs. Much of the v1 manifest that relates to vms, networks, disks etc is now in cloud config in the v2 world.
Specifically, the BATs previously had deployment manifest templates for every IaaS. We have migrated the IaaS details into IaaS specific cloud configs instead, and all BATs will deploy a generic deployment manifest. These can all be found in https://github.com/cloudfoundry/bosh-acceptance-tests/tree/v2_manifest/templates with the `cloud-config-` prefix.
We tested vSphere, AWS, GCP, and Warden in our own pipelines, but Azure and Openstack remain untested. BATs also has an Oracle configuration which we did not touch. Additionally, BATs can be configured with a variety of different network topologies, and we only tested a few. Ideally these changes should not affect anyone, but there may be some unforeseen issues. If you are a CPI author and have feedback on these changes, please reach out to the BOSH Director team. Otherwise, these changes will be merged into master next week.
Thanks in advance for your feedback,
Charles and Belinda, BOSH Director team (SF)
|||||||||||||||||||||
Re: Migrating Bosh-Acceptance-Tests to v2 manifests
Mike Lloyd
Charles and Belinda,
Is there a reason Azure and Openstack are still untested?
Thanks,
Mike.
From: cf-bosh@... <cf-bosh@...> On Behalf Of via Lists.Cloudfoundry.Org
Sent: Wednesday, April 24, 2019 5:46 PM
To: cf-bosh@...
Cc: Belinda Liu <bliu@...>; Morgan Fine <mfine@...>
Subject: [cf-bosh] Migrating Bosh-Acceptance-Tests to v2 manifests
Hi Bosh community,
tldr; BATs is now on v2 manifests. If you run BATs in your pipeline, please try running against the v2_manifest branch before it is merged into master and let us know if you have any feedback. Hopefully it should just work.
We are migrating BATs (https://github.com/cloudfoundry/bosh-acceptance-tests) to use v2 manifests + cloud config. Using v2 manifest to deploy on bosh has been the standard for a while, but BATs has never been updated. We hope to deprecate v1 manifests very soon and need to upgrade BATs. Much of the v1 manifest that relates to vms, networks, disks etc is now in cloud config in the v2 world.
Specifically, the BATs previously had deployment manifest templates for every IaaS. We have migrated the IaaS details into IaaS specific cloud configs instead, and all BATs will deploy a generic deployment manifest. These can all be found in https://github.com/cloudfoundry/bosh-acceptance-tests/tree/v2_manifest/templates with the `cloud-config-` prefix.
We tested vSphere, AWS, GCP, and Warden in our own pipelines, but Azure and Openstack remain untested. BATs also has an Oracle configuration which we did not touch. Additionally, BATs can be configured with a variety of different network topologies, and we only tested a few. Ideally these changes should not affect anyone, but there may be some unforeseen issues. If you are a CPI author and have feedback on these changes, please reach out to the BOSH Director team. Otherwise, these changes will be merged into master next week.
Thanks in advance for your feedback,
Charles and Belinda, BOSH Director team (SF)
|||||||||||||||||||||
Migrating Bosh-Acceptance-Tests to v2 manifests
Charles Hansen <chansen@...>
Hi Bosh community,

tldr; BATs is now on v2 manifests. If you run BATs in your pipeline, please try running against the v2_manifest branch before it is merged into master and let us know if you have any feedback. Hopefully it should just work.

We are migrating BATs (https://github.com/cloudfoundry/bosh-acceptance-tests) to use v2 manifests + cloud config. Using v2 manifests to deploy on bosh has been the standard for a while, but BATs has never been updated. We hope to deprecate v1 manifests very soon and need to upgrade BATs. Much of the v1 manifest that relates to vms, networks, disks etc. is now in cloud config in the v2 world.

Specifically, BATs previously had deployment manifest templates for every IaaS. We have migrated the IaaS details into IaaS-specific cloud configs instead, and all BATs will deploy a generic deployment manifest. These can all be found in https://github.com/cloudfoundry/bosh-acceptance-tests/tree/v2_manifest/templates with the `cloud-config-` prefix.

We tested vSphere, AWS, GCP, and Warden in our own pipelines, but Azure and Openstack remain untested. BATs also has an Oracle configuration which we did not touch. Additionally, BATs can be configured with a variety of different network topologies, and we only tested a few. Ideally these changes should not affect anyone, but there may be some unforeseen issues. If you are a CPI author and have feedback on these changes, please reach out to the BOSH Director team. Otherwise, these changes will be merged into master next week.

Thanks in advance for your feedback,
Charles and Belinda, BOSH Director team (SF)
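For readers less familiar with the v1-to-v2 split the announcement describes (this sketch is mine, not from the BATs templates): the vm/network/disk sections that used to live inside each deployment manifest now sit in a separate cloud config that the deployment manifest references by name. A minimal illustration; all names are placeholders:

```yaml
# Minimal v2-style cloud config sketch; everything IaaS-specific lives here,
# so the deployment manifest itself stays generic.
azs:
- name: z1
vm_types:
- name: default
networks:
- name: default
  type: manual
  subnets:
  - range: 10.0.0.0/24
    gateway: 10.0.0.1
    az: z1
disk_types:
- name: default
  disk_size: 1024
compilation:
  workers: 2
  az: z1
  vm_type: default
  network: default
```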
|||||||||||||||||||||
Re: Cloud Foundry Operators SIG
Kicking off in a few hours is the first cf-operators SIG meeting. Don't forget to join at 5AM US Eastern | 11AM CET.

Topic: Cloud Foundry Operators SIG Meeting
Time: Apr 24, 2019 2:00 AM Pacific Time (US and Canada)

Join Zoom Meeting
One tap mobile:
+16699006833,,4084184280# US (San Jose)
+19292056099,,4084184280# US (New York)
Dial by your location:
+1 669 900 6833 US (San Jose)
+1 929 205 6099 US (New York)
Meeting ID: 408 418 4280
Find your local number: https://zoom.us/u/ali1NKUd8
On Thu, Apr 18, 2019 at 19:08 Swarna Podila <spodila@...> wrote:
--
|||||||||||||||||||||