
Bosh Deployment on Alibaba Cloud has not been merged yet

"何贵民(箫竹)
 

Dear Friends of BOSH,
 
As with message #2640: to help developers deploy the BOSH Director, we provided the bosh-deployment CPI manifest in PR https://github.com/cloudfoundry/bosh-deployment/pull/175. That PR has been open for more than a year and still has not been merged.

If there are any issues, please let us know via GitHub or Slack, and we will fix them as soon as possible.
 
We appreciate your comments, thoughts, feedback and suggestions! You can reply on the cf-bosh mailing list or reach out in the ways described in the 'contact' section in the document.
 
Thanks and warm regards
Guimin, He
 
 


Bosh AlibabaCloud CPI Certification has not been merged yet

"何贵民(箫竹)
 

Dear Friends of BOSH,
 
As with message #2640, merging the BOSH Alibaba Cloud CPI certification still needs your help.

To validate the BOSH CPI according to the community requirements, I implemented the bosh-cpi-certification suite for AlibabaCloud; the code has been submitted in PRs https://github.com/cloudfoundry-incubator/bosh-cpi-certification/pull/15 and https://github.com/cloudfoundry/bosh-acceptance-tests/pull/40. So far, neither has been merged.


The following is the bosh-cpi-certification result:



The failed job is caused by two issues: https://github.com/cloudfoundry/bosh-linux-stemcell-builder/issues/97 and https://github.com/cloudfoundry/bosh-linux-stemcell-builder/issues/98.
I currently have no idea how to resolve these issues; could you give me some help?

Thanks and warm regards

 

Guimin, He


Bosh AlibabaCloud light stemcell has not been imported to provide a community version yet

"何贵民(箫竹)
 

Dear Friends of BOSH,

As with message #2640, importing the bosh-alicloud-light-stemcell still needs your help.

To provide an AlibabaCloud light stemcell community version, in line with the community requirements, I provided the bosh-alicloud-light-stemcell-builder (https://github.com/cloudfoundry-incubator/bosh-alicloud-light-stemcell-builder) to build the latest light stemcell whenever a new full stemcell is published. After each build, the light stemcell metadata is uploaded to https://github.com/cloudfoundry-incubator/stemcells-alicloud-index. The builder has already published three light stemcells, but they have not yet been published on bosh.io.


The following is the bosh-alicloud-light-stemcell-builder result:




If there are any issues, please let us know via GitHub or Slack, and we will fix them as soon as possible.

 

We appreciate your comments, thoughts, feedback and suggestions! You can reply on the cf-bosh mailing list or reach out in the ways described in the 'contact' section in the document.



Thanks and warm regards

Guimin, He

 


Bosh AlibabaCloud CPI release does not have a community version yet

"何贵民(箫竹)
 

Dear Friends of BOSH,
 
As a major cloud provider, we at Alibaba Cloud are aligning more closely with the BOSH and CF community. In the past two years we have achieved great results, such as providing the Bosh AlibabaCloud CPI, adding AlibabaCloud Xenial stemcell support to bosh-linux-stemcell-builder, adding AlibabaCloud cloud-config support to cf-deployment, and more. There are still more goals to achieve, such as providing a light stemcell, official CPI releases, CPI certification CI, and others.
 
To keep this integration moving, we also need the CF community's efforts to help us accelerate and finish integrating Alibaba Cloud with Cloud Foundry.
 
Currently, we provide the latest CPI code in https://github.com/cloudfoundry-incubator/bosh-alicloud-cpi-release, and we have published several custom versions to help our customers deploy BOSH and CF. But for our customers and for us, what we really need is a community version, just as there is an AlibabaCloud full stemcell community version.
 
To validate the CPI, I built my own Concourse pipeline to run the CPI CI; the results are as follows:

 
If there are any issues, please let us know via GitHub or Slack, and we will fix them as soon as possible.
 
We appreciate your comments, thoughts, feedback and suggestions! You can reply on the cf-bosh mailing list or reach out in the ways described in the 'contact' section in the document.
 
Thanks and warm regards
Guimin, He


CFF SIG Lifecycle

Marco Voelz
 

[cross-post for visibility on cf-bosh and cf-dev. Sorry for the spam if you're reading both lists. Future posts about this will be sent to cf-bosh only.]

 

Dear Friends of BOSH,

 

As the kubernetes community keeps growing and getting more traction and attention, we, the CF community, align more closely with the tools and community in the k8s ecosystem. Examples in the Runtime PMC are project Eirini and the ongoing efforts to leverage istio features for routing in CF, etc. Effectively, it seems that we're rebasing CF on top of a new abstraction, which is cool, because it allows us to do new cool things!

 

Within the BOSH PMC, there are also efforts towards installing CF on top of k8s, entirely without BOSH, called project Quarks. Especially when installing a complex and large distributed system like CF, issues with the current state of lifecycle management in the k8s ecosystem become apparent. Even more so if you're used to BOSH, which solves many issues for its users better than most other tools out there. Therefore, we're trying to bring some of the lessons we've learned by working on and with BOSH to the k8s ecosystem.

 

We have established a Lifecycle SIG and written down our ideas and opinions over the past few weeks [1]. While this document isn't done, we are at a point where we would like to open the discussion to everyone. If you're interested in participating in the conversation, please read the document and let us know what you think!

 

We appreciate your comments, thoughts, feedback and suggestions! You can reply on the cf-bosh mailing list or reach out in the ways described in the 'contact' section in the document.

 

Thanks and warm regards

Marco for the CFF SIG Lifecycle

 

[1] https://docs.google.com/document/d/1T1ZrwSV9aXWmF1tmUoMyo9M9EvWd07f-Uv5IO7nSmtY/edit#


Re: cluster auto-scale with bosh

Maya Rosecrance
 

Bosh does not have an autoscale feature at this time. We have heard of people using metrics to trigger bosh events to scale and we have an open issue that I'd encourage you to chime in on: https://github.com/cloudfoundry/bosh/issues/1652
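
For illustration, one way to wire that pattern up is to keep the instance count behind an ops-file variable and let whatever watches your metrics trigger a redeploy with a new value. A minimal sketch, not a built-in feature; the instance group name `worker` and the variable name are hypothetical:

# scale-workers.yml -- bumps the `worker` instance group's instance count
- type: replace
  path: /instance_groups/name=worker/instances
  value: ((worker_instances))

You would then redeploy with something like `bosh -d my-deployment deploy manifest.yml -o scale-workers.yml -v worker_instances=5`.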


Re: How to prevent compilation VMs from being deleted?

R M
 

Thanks Conor - I already have `reuse_compilation_vms` set, but I believe that only makes the VM re-used across compilation tasks. BOSH seems to delete it anyway once the compilation job finishes/fails.


Re: How to prevent compilation VMs from being deleted?

Conor Nosal
 

`keep_unreachable_vms` is unrelated to compilation. Unreachable means that the bosh agent on the VM never connected to the director.

To keep compilation VMs, you want to set `reuse_compilation_vms` to true in the cloud config. https://bosh.io/docs/cloud-config/#compilation
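
For reference, the relevant cloud-config section looks roughly like this (the az, vm_type, and network names are placeholders for values from your own cloud config):

compilation:
  workers: 4
  reuse_compilation_vms: true  # reuse compilation VMs across packages within a deploy instead of recreating them
  az: z1
  vm_type: default
  network: default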

On Tue, Jul 30, 2019 at 11:04 AM R M <rishi.investigate@...> wrote:
Hi there,

Is there any way of preventing compilation VMs from being destroyed if components fail to compile? I would like to be able to look into compilation VMs for troubleshooting the failures.

I tried following https://starkandwayne.com/blog/how-to-lock-vcap-password-for-bosh-vms/ but wasn't sure where to put the following section with respect to `cf-deploy`:

bosh -e bosh-1 -d cf deploy cf-deployment.yml -v system_domain=abc.com  --vars-store=/tmp/cloud-foundry/bosh/director-creds.yml

Thanks for any pointers ...

instance_groups:
- name: bosh
  properties:
    director:
      debug: 
        keep_unreachable_vms: true


How to prevent compilation VMs from being deleted?

R M
 

Hi there,

Is there any way of preventing compilation VMs from being destroyed if components fail to compile? I would like to be able to look into compilation VMs for troubleshooting the failures.

I tried following https://starkandwayne.com/blog/how-to-lock-vcap-password-for-bosh-vms/ but wasn't sure where to put the following section with respect to `cf-deploy`:

bosh -e bosh-1 -d cf deploy cf-deployment.yml -v system_domain=abc.com  --vars-store=/tmp/cloud-foundry/bosh/director-creds.yml

Thanks for any pointers ...

instance_groups:
- name: bosh
  properties:
    director:
      debug: 
        keep_unreachable_vms: true


cluster auto-scale with bosh

dhensel@...
 

Hello,

 

Does BOSH have an auto-scale feature that can be used with VMware in a K8s deployment? I know K8s has auto-scaling, but from my understanding it is not available for VMware.

 

Thanks,

 

-Doug

 


Removal of Power DNS from BOSH

Morgan Fine
 

Hi BOSH Friends,

PowerDNS has existed in BOSH for a long time to support service discovery of BOSH deployments. More recently, BOSH DNS has been available as the recommended way of doing service discovery. 
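
For operators who have not switched yet, BOSH DNS is typically rolled out as an addon via a runtime config. A rough sketch of the shape is below; bosh-deployment ships a maintained version of this as runtime-configs/dns.yml, which should be preferred over this sketch (the release version here is a placeholder):

releases:
- name: bosh-dns
  version: ((bosh_dns_version))  # placeholder -- pin to the bosh-dns release version you upload
addons:
- name: bosh-dns
  include:
    stemcell:
    - os: ubuntu-xenial
  jobs:
  - name: bosh-dns
    release: bosh-dns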

The BOSH Director team would like to remove Power DNS from the BOSH Director in an effort to formally make BOSH DNS the supported and recommended way of doing service discovery. 

If you have any feedback or concerns with this approach, please let me know.

Thanks,
Morgan Fine
PM of BOSH Director Team


BOSH PMC meeting for June cancelled

Marco Voelz
 

Dear friends of BOSH,

 

I just cancelled the BOSH PMC meeting for June, which was originally scheduled on June 20th, as I am on vacation. I have updated the CFF calendar accordingly and deleted the event.

See you for the next BOSH PMC meeting in July!

 

Warm regards

Marco

 


Re: BOSH deployment times out pinging agent after 600 seconds (s390x platform)

R M
 

I see some of these messages in NATS .. not sure if this is a problem:

Interestingly, Director is throwing Auth errors but VM is able to post to NATS:
/==================================/
[1] 2019/06/11 12:32:14.863072 [TRC] 192.168.20.10:35232 - cid:3 - ->> [CONNECT {"verbose":false,"pedantic":false,"lang":"ruby","version":"0.9.2","protocol":1,"ssl_required":true,"tls_required":true}]
[1] 2019/06/11 12:32:14.863095 [ERR] 192.168.20.10:35232 - cid:3 - Authorization Error <--- Director VM to NATS fails
[1] 2019/06/11 12:32:14.863101 [TRC] 192.168.20.10:35232 - cid:3 - <<- [-ERR Authorization Violation]
[1] 2019/06/11 12:32:14.863138 [DBG] 192.168.20.10:35232 - cid:3 - Client connection closed
[1] 2019/06/11 12:32:14.865154 [DBG] 192.168.20.10:35234 - cid:4 - Client connection created
....
[1] 2019/06/11 12:32:48.757875 [TRC] 192.168.20.3:42022 - cid:5 - ->> [CONNECT {"user":"nats","pass":"739rv2lksjyyfdhmqo49","verbose":true,"pedantic":true}]
[1] 2019/06/11 12:32:48.757925 [TRC] 192.168.20.3:42022 - cid:5 - <<- [OK] <--- Agent VM seems ok and is able to publish using PWD authentication (?)
[1] 2019/06/11 12:32:48.759034 [TRC] 192.168.20.3:42022 - cid:5 - ->> [PUB hm.agent.heartbeat.e01ca0da-d340-4f08-ad1a-8a29ebb8abc9 356]
[1] 2019/06/11 12:32:48.759056 [TRC] 192.168.20.3:42022 - cid:5 - ->> MSG_PAYLOAD: [{"deployment":"","job":null,"index":null,"job_state":"running","vitals":{"cpu":{"sys":"0.4","user":"0.3","wait":"0.0"},"disk":{"ephemeral":{"inode_percent":"0","percent":"0"},"system":{"inode_percent":"28","percent":"42"}},"load":["0.00","0.00","0.00"],"mem":{"kb":"152908","percent":"2"},"swap":{"kb":"0","percent":"0"},"uptime":{"secs":24}},"node_id":""}]
[1] 2019/06/11 12:32:48.759069 [TRC] 192.168.20.3:42022 - cid:5 - <<- [OK]
[1] 2019/06/11 12:32:48.759637 [TRC] 192.168.20.3:42022 - cid:5 - ->> [SUB agent.e01ca0da-d340-4f08-ad1a-8a29ebb8abc9 1]
/==================================/

Do I need to specify NATS certs as part of my deployment request? -  BOSH_LOG_LEVEL=info bosh -e bosh-1 -d redis-deployment deploy manifest.yml --certs?


BOSH deployment times out pinging agent after 600 seconds (s390x platform)

R M
 

Hi there,

- Using OpenStack Rocky on s390x

I built a (Xenial) stemcell and ported BOSH over to the s390x platform.  For the most part it seems to work.  However, the deployment times out during the "Compiling packages" stage.  I am unable to figure out why this could be a problem.  The Director VM and the compilation VM seem to be able to ping each other.  NATS messages are also being posted by the compilation VM.  Please let me know where else I could look for clues:

Here are my steps:

/====================================/
BOSH_LOG_LEVEL=info bosh -e bosh-1 -d redis-deployment deploy manifest.yml
....

Task 61 | 19:19:07 | Preparing deployment: Preparing deployment (00:00:00)
Task 61 | 19:19:07 | Preparing package compilation: Finding packages to compile (00:00:00)
Task 61 | 19:19:07 | Compiling packages: redis/b8455f0a7551849b841b759fc44d2c1eff79331b (00:10:27)
                   L Error: Timed out pinging to c3080c5f-d79b-48f8-a117-8629cf4b6c3c after 600 seconds
Task 61 | 19:29:34 | Error: Timed out pinging to c3080c5f-d79b-48f8-a117-8629cf4b6c3c after 600 seconds
 
Task 61 Started  Fri Jun  7 19:19:07 UTC 2019
Task 61 Finished Fri Jun  7 19:29:34 UTC 2019
Task 61 Duration 00:10:27
Task 61 error
[CLI] 2019/06/07 15:29:34 ERROR - Updating deployment: Expected task '61' to succeed but state is 'error'
/====================================/

My compilation VM Agent logs from /var/vcap/bosh/log/current doesn't seem to indicate any issues:

/====================================/
...
2019-06-07_18:02:48.15948 [File System] 2019/06/07 18:02:48 DEBUG - Checking if file exists /var/vcap/bosh/spec.json
2019-06-07_18:02:48.15948 [File System] 2019/06/07 18:02:48 DEBUG - Stat '/var/vcap/bosh/spec.json'
2019-06-07_18:02:48.15949 [File System] 2019/06/07 18:02:48 DEBUG - Writing /var/vcap/instance/health.json
2019-06-07_18:02:48.15949 [File System] 2019/06/07 18:02:48 DEBUG - Making dir /var/vcap/instance with perm 0777
2019-06-07_18:02:48.15949 [File System] 2019/06/07 18:02:48 DEBUG - Write content
2019-06-07_18:02:48.15949 ********************
2019-06-07_18:02:48.15950 {"state":"running"}
2019-06-07_18:02:48.15950 ********************
2019-06-07_18:02:48.15950 [NATS Handler] 2019/06/07 18:02:48 INFO - Sending hm message 'heartbeat'
2019-06-07_18:02:48.15950 [NATS Handler] 2019/06/07 18:02:48 DEBUG - Message Payload
2019-06-07_18:02:48.15951 ********************
2019-06-07_18:02:48.15951 {"deployment":"","job":null,"index":null,"job_state":"running","vitals":{"cpu":{"sys":"0.0","user":"0.0","wait":"0.0"},"disk":{"ephemeral":{"inode_percent":"0","percent":"0"},"system":{"inode_percent":"28","percent":"42"}},"load":["0.00","0.00","0.00"],"mem":{"kb":"156596","percent":"2"},"swap":{"kb":"0","percent":"0"},"uptime":{"secs":289}},"node_id":""}
2019-06-07_18:02:48.15952 ********************
2019-06-07_18:02:48.15952 [Cmd Runner] 2019/06/07 18:02:48 DEBUG - Running command 'route -n'
2019-06-07_18:02:48.16047 [Cmd Runner] 2019/06/07 18:02:48 DEBUG - Successful: true (0)
 
/====================================/

I have also removed the "ephemeral" option from my OpenStack flavor, as per
https://github.com/cloudfoundry/bosh/issues/2044

Any tips to debug this further greatly appreciated.

Thanks.


Bosh VM deployment priority

Thor
 

Dear cf-bosh group,

I am using Bosh to deploy VMs in a VSphere environment.  The cloud manifest contains drs_rules specifying separate_vms.  
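
For readers unfamiliar with that setting, such a rule is declared in the cloud config roughly as follows (datacenter, cluster, and rule names are placeholders; see the vSphere CPI docs for the authoritative format):

vm_extensions:
- name: anti-affinity
  cloud_properties:
    datacenters:
    - name: my-dc
      clusters:
      - my-cluster:
          drs_rules:
          - name: separate-worker-vms
            type: separate_vms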

When I request 5 VMs in a 5 host environment, those 5 VMs are deployed onto the 5 different hosts.  I can see - in VCenter - that a DRS rule has been created which contains the 5 VMs.  Furthermore, if I migrate one of the VMs to a host which already has a VM, the VMs are shortly after rebalanced to meet the anti-affinity rules.  I also see in VCenter that all 5 VMs have custom attributes with drs_rule set to anti-affinity.  Good.

However, when I attempt to deploy 6 VMs in a 5 host environment, Bosh deploys the first 5 VMs onto the 5 available hosts.  Then the deployment fails for the 6th VM with the following error:
********
Task 15 | 22:02:15 | Updating instance worker: worker/bdcceaf4-cbc0-4238-87e9-9f6234273b80 (3) (00:01:41)
                  L Error: Unknown CPI error 'Unknown' with message 'Could not power on VM '<[Vim.VirtualMachine] vm-27888>': DRS cannot find a host to power on or migrate the virtual machine.' in 'create_vm' CPI method (CPI request ID: 'cpi-867860')
********
That error makes sense.  The deployment fails with:
*******
Updating deployment:
 Expected task '15' to succeed but state is 'error'
 
Exit code 1
*******
At this point I would have expected (perhaps incorrectly) that the 6th VM would remain powered off, but this is not the case.  After a few minutes the VM is powered on and scheduled onto a host which already has another VM running on it - violating the anti-affinity rule specified in the cloud manifest.  When I look at the 6th VM in VCenter, I see that the VM does NOT have the custom attribute with drs_rule set to anti-affinity.  I believe this is what allows VCenter to schedule the VM onto a running host, because that VM is not in the anti-affinity group.

Questions:
1)  Does Bosh (by design) prioritize starting the number of requested VMs (in my case 6) over the requested anti-affinity rules (which in my mind would prevent the 6th VM from being powered up)?
2)  If "yes" to question 1), is there an option to prevent the 6th VM from being started?
3) If "no" to question 1), is this a bug?

Sincerely,
   Thor


Cloud Foundry Summit CFP Deadline is Friday

Swarna Podila <spodila@...>
 

Just wanted to remind you all about the CFP deadline that is creeping up.  Please go ahead and get your submissions in before this Friday.

-- Swarna Podila (she/her)
Senior Director, Community | Cloud Foundry Foundation

You can read more about pronouns here, or please ask if you'd like to find out more.


---------- Forwarded message ---------
From: Deborah Giles <dgiles@...>
Date: Wed, May 29, 2019 at 8:34 AM
Subject: Cloud Foundry Summit CFP Deadline is Friday
To: <spodila@...>


 
 
The deadline to submit your CFP for Cloud Foundry Summit in The Hague, The Netherlands is quickly approaching. Join us September 11-12, 2019 to discuss how you’re building the cloud native future using Cloud Foundry Technologies. The CFP deadline is Friday, May 31, 2019. 

Whether you’re a contributor to the open source project, a platform end user, an operator or a business decision maker, we want to hear from you.

Summit topics include Cloud Foundry 101: Getting Started; Cloud Foundry for Business; Cloud Foundry for Developers; Cloud Foundry for Operators; Experiments & Extensions; Project Updates and User Stories. 

We also invite you to submit a talk for the Cloud Foundry Summit Diversity Luncheon. You can find the Diversity Luncheon application embedded in the standard CFP for the event. The Foundation is committed to five value-driven actions: Be open, be inclusive, be kind, be transparent and be curious. We aim to bring these sensibilities into the Diversity Luncheon.

The Cloud Foundry community has elected the co-chairs that will review speaking submissions. Learn more about the co-chairs.

If you want your abstract reviewed by your peers or just want a second set of eyes before you submit it, you can always tag @cfp-help in #summit on Cloud Foundry slack.
Submit Your CFP »


 


[Proposal] CF-RFC 020: BOSH: setting global kernel parameters by jobs

Stanislav German-Evtushenko
 

Hi, everyone,


Anyone who has tried to ensure that kernel parameters are in a certain state for a particular BOSH job has probably noticed that there is no simple and reliable way to do this (see https://github.com/cloudfoundry/routing-release/issues/138 as an example). I would like to share a proposal that aims to solve this.


Link to the proposal: https://docs.google.com/document/d/1BEi2A5T47K8f26B-QSotuYr-16tUCNRbX6BumNKXb7c

Pull request: https://github.com/cloudfoundry/os-conf-release/pull/47

Slack channel: https://cloudfoundry.slack.com/messages/CJZLX6NDT
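
For context, a common workaround today is a global add-on such as the sysctl job from os-conf-release. Below is a minimal runtime-config-style sketch, assuming that job's `sysctl` list property (check the job spec of the release version you use; the version below is a placeholder):

releases:
- name: os-conf
  version: ((os_conf_version))  # placeholder -- pin to the os-conf release version you upload
addons:
- name: kernel-params
  jobs:
  - name: sysctl
    release: os-conf
    properties:
      sysctl:
      - net.ipv4.tcp_fin_timeout=10
      - net.ipv4.tcp_keepalive_time=120

As the title suggests, the proposal is about letting jobs themselves declare the kernel parameters they need, rather than operators maintaining a global add-on like the one above.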


Please let us know your opinions here, on the document or on the slack channel.


Thanks,

Stanislav


Re: Removing Support for v1 Style Manifests

Morgan Fine
 

Hey Jean,

Apologies for the delay. Thank you for raising your concerns around the removal of v1 manifest support. Looking at the GitHub issue you raised, it seems that, while not ideal from your perspective, the workflow Maya proposed would work. It's also worth mentioning that we're thinking about what it might look like for BOSH to manage more and more of the IaaS resources that your issue references, so that could be an alternative way to support your use case.

That being said, we are still planning to proceed with removing v1 manifests. It would be great if you could give some of the proposals in that GitHub issue a try and give us feedback there; that way we can work to ensure your use case is still covered in a v2 world.

Thank you again for the feedback.

Best,
Morgan

On Mon, Apr 1, 2019 at 8:38 AM Jeanyhwh Desulme via Lists.Cloudfoundry.Org <jeanyhwh.desulme=libertymutual.com@...> wrote:

Hey Morgan -

This is unfortunately a breaking change for our implementation of Bosh. We looked at fully implementing V2 some time ago but found that there was an issue with how we wanted to use it. For example, our pipelines combine both a CFT and a deployment manifest, so that when we run a deployment of a given application we pass the physical IDs of native AWS resources directly into the manifest itself. The following open GitHub issue has some more specifics on what we're looking to do and what we currently have in place today.

Ideally we would love to move but only if we can override some of the deployment configuration at deploy time. 

https://github.com/cloudfoundry/bosh/issues/2094

Regards,
Jean 
Liberty Mutual


[cf-dev] [CFEU2019 Registration] Contributor Code

Swarna Podila <spodila@...>
 

I meant to share this with the bosh folks as well; apologies for the miss on my part.

Please make note of the Contributor Summit at Cloud Foundry Summit EU (Sep 10) while making travel plans to The Hague.

Please also note the Contributor code for Cloud Foundry Summit EU registration if you're a current or a past Cloud Foundry Contributor.

-- Swarna Podila (she/her)
Senior Director, Community | Cloud Foundry Foundation

You can read more about pronouns here, or please ask if you'd like to find out more.


---------- Forwarded message ---------
From: Swarna Podila via Lists.Cloudfoundry.Org <spodila=cloudfoundry.org@...>
Date: Tue, May 14, 2019 at 3:45 PM
Subject: Re: [cf-dev] [CFEU2019 Registration] Contributor Code
To: CF Developers Mailing List <cf-dev@...>


Dear Cloud Foundry Contributors,
I just wanted to surface up the Contributor Code email in case you need it to register for Cloud Foundry Summit EU.  

As you are making travel plans, please also note that the Contributor Summit will be hosted again as a Day Zero event on Sep 10th.  We are still finalizing the specifics (start/end times, schedule, etc.) but please do plan to get to The Hague in time for the Contributor Summit.

Thank you.

-- Swarna Podila (she/her)
Senior Director, Community | Cloud Foundry Foundation

You can read more about pronouns here, or please ask if you'd like to find out more.


On Wed, Apr 17, 2019 at 11:18 AM Swarna Podila <spodila@...> wrote:
I love some of y'all's excitement when you asked for the Contributors' Code to register for Cloud Foundry Summit EU in The Hague (Sep 11-12).

Please use CFEU19CONT to register for the EU Summit, if you are (or were) a Cloud Foundry Contributor (code, docs, bugs -- all contributions count).  See you all there!

-- Swarna Podila (she/her)
Senior Director, Community | Cloud Foundry Foundation

You can read more about pronouns here, or please ask if you'd like to find out more.


[Operators SIG Meeting] Call for Content

Swarna Podila <spodila@...>
 

Dear Cloud Foundry Community,
Recently, the Cloud Foundry platform operators at SAP initiated an "operators SIG meeting" to bring the operators together and share experiences, good practices, etc.

The recording of the first meeting is here.

The recurring call is scheduled for every fourth Wednesday of the month at 8AM US Pacific; so the next call is on Wednesday, May 22.  If you'd like to discuss specific topics, you can either unicast me or add them to the meeting notes.

Join the discussion on Cloud Foundry slack at #cf-operators. 

-- Swarna Podila (she/her)
Senior Director, Community | Cloud Foundry Foundation

You can read more about pronouns here, or please ask if you'd like to find out more.