Using dynamic networks on AWS

Jatin Naik
 

Hi,

I am trying to write a manifest that uses a dynamic network on AWS. My
manifest is similar to the one described at
https://bosh.io/docs/networks#dynamic.

I am getting this error:

Started compiling packages > nothing/1f163bd552d500098d236e67284465da39bfdbef. Failed: Missing properties: resource_pool.availability_zone (00:00:12)

Error 100: Missing properties: resource_pool.availability_zone

Has anyone faced a similar issue before with the AWS CPI?

The cloud config:


vm_types:
- name: small
  cloud_properties:
    instance_type: t2.micro
    ephemeral_disk: {size: 3000, type: gp2}

disk_types:
- name: disks
  disk_size: 200
  cloud_properties: {type: gp2}

networks:
- name: dyn
  type: dynamic
  dns: [10.0.16.2]
  cloud_properties:
    subnet: subnet-bb1884df
    availability_zone: eu-west-1a

compilation:
  workers: 5
  reuse_compilation_vms: true
  vm_type: small
  network: dyn

The deployment manifest:

---
name: red
director_uuid: f7671e76-95eb-481e-9dfa-edd4096ecd78

stemcells:
- alias: trusty
  name: bosh-aws-xen-hvm-ubuntu-trusty-go_agent
  version: latest

releases:
- name: empty-box-release
  version: latest

jobs:
- name: nothing
  instances: 1
  templates:
  - {name: do_nothing, release: empty-box-release}
  vm_type: small
  stemcell: trusty
  networks:
  - name: dyn

update:
  canaries: 1
  canary_watch_time: 30000-300000
  update_watch_time: 30000-300000
  max_in_flight: 1
  max_errors: 2
  serial: false
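
One workaround I am considering, based purely on the error text (untested): the AWS CPI also accepts availability_zone in a vm_type's cloud_properties, so mirroring the network's AZ there might satisfy the missing-property check:

vm_types:
- name: small
  cloud_properties:
    instance_type: t2.micro
    availability_zone: eu-west-1a   # untested guess: mirrors the dynamic network's AZ
    ephemeral_disk: {size: 3000, type: gp2}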

Re: Can't find dependency

Dmitriy Kalinin
 

Line "autoreconf -fvi" seems to fail to find autoreconf on path. You may
have to add "export PATH=$PATH:/var/vcap/packages/autoconf/bin" to your
dynomite package.
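
A minimal sketch of the change, assuming the compiled autoconf dependency is mounted at /var/vcap/packages/autoconf at compile time:

# abort script on any command that exits with a non zero value
set -e
set -x

PREFIX=${BOSH_INSTALL_TARGET}

# make autoreconf from the autoconf dependency visible on the PATH
export PATH=$PATH:/var/vcap/packages/autoconf/bin

tar xvf dynomite1/dynomite.tar.gz
(
set -e
cd dynomite

autoreconf -fvi
CFLAGS="-ggdb3 -O0" ./configure --enable-debug=full --prefix=$PREFIX
make
make install prefix=$PREFIX
)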

On Mon, Aug 29, 2016 at 3:51 PM, Alex Njaastad <anjaastad(a)gmail.com> wrote:

I have a packaging file (autoconf) that looks like this:

# abort script on any command that exits with a non zero value
set -e
set -x

PREFIX=${BOSH_INSTALL_TARGET}

export PATH=$PATH:${BOSH_INSTALL_TARGET}/bin:/var/vcap/packages/dynomite1/bin

tar xvf autoconf/autoconf.tar.gz
(
set -e
cd autoconf-2.69
autoconf_dir=$PREFIX/share/autoconf ./configure --prefix=$PREFIX
make
make install prefix=$PREFIX
)

------------------------------------------------------------------------------------------------
Then I have another packaging file (Dynomite) that looks like this:

# abort script on any command that exits with a non zero value
set -e
set -x

PREFIX=${BOSH_INSTALL_TARGET}

tar xvf dynomite1/dynomite.tar.gz
(
set -e
cd dynomite

autoreconf -fvi
CFLAGS="-ggdb3 -O0" ./configure --enable-debug=full --prefix=$PREFIX
make
make install prefix=$PREFIX
)
--------------------------------------------------------------------------------------------------------------
Dynomite has a spec file that looks like this:

---
name: dynomite1
dependencies:
- autoconf
- libtool
files:
- dynomite1/dynomite*
-------------------------------------------------------------------------------------------------------------------------
Dynomite depends on autoconf, as it uses the autoreconf command. When
building the packages, the autoreconf command fails with the following
error:

"packaging: line 12: autoreconf: command not found"
---------------------------------------------------------------------------------------------------------------------------

What might I be doing wrong?

Can't find dependency

Alex Njaastad
 

I have a packaging file (autoconf) that looks like this:

# abort script on any command that exits with a non zero value
set -e
set -x

PREFIX=${BOSH_INSTALL_TARGET}

export PATH=$PATH:${BOSH_INSTALL_TARGET}/bin:/var/vcap/packages/dynomite1/bin

tar xvf autoconf/autoconf.tar.gz
(
set -e
cd autoconf-2.69
autoconf_dir=$PREFIX/share/autoconf ./configure --prefix=$PREFIX
make
make install prefix=$PREFIX
)

------------------------------------------------------------------------------------------------
Then I have another packaging file (Dynomite) that looks like this:

# abort script on any command that exits with a non zero value
set -e
set -x

PREFIX=${BOSH_INSTALL_TARGET}

tar xvf dynomite1/dynomite.tar.gz
(
set -e
cd dynomite

autoreconf -fvi
CFLAGS="-ggdb3 -O0" ./configure --enable-debug=full --prefix=$PREFIX
make
make install prefix=$PREFIX
)
--------------------------------------------------------------------------------------------------------------
Dynomite has a spec file that looks like this:

---
name: dynomite1
dependencies:
- autoconf
- libtool
files:
- dynomite1/dynomite*
-------------------------------------------------------------------------------------------------------------------------
Dynomite depends on autoconf, as it uses the autoreconf command. When building the packages, the autoreconf command fails with the following error:

"packaging: line 12: autoreconf: command not found"
---------------------------------------------------------------------------------------------------------------------------

What might I be doing wrong?

Bosh stemcell Ubuntu Ruby, in conflict with /var/vcap/bosh/lib/ruby (LoadError)

Lukas Lehner <weblehner@...>
 

Re: bosh-init deploy fails with 'No valid placement found for disks'

Neil Watson
 

Turns out the problem is this:

datastore_pattern:

The right-hand-side syntax is not documented. I guessed '/regex/' and /regex/, but it turns out the regex is neither quoted nor //-delimited, e.g.

datastore_pattern: datastore[1-8]
persistent_datastore_pattern: datastore[1-8]

Re: How to debug the VM recreation

Stanley Shen
 

Is there any information on the BOSH or BOSH Director side?

I tried to reproduce it by running the test again, but it has not reproduced so far.
And yes, we are running some performance testing on this VM, so CPU/RAM usage will be quite high.

How does BOSH decide that a VM is "unresponsive", and is there any configuration we can apply to it?

Re: How to debug the VM recreation

Dmitriy Kalinin
 

You can try to disable resurrection (bosh vm resurrection off) and wait
until the agent becomes unresponsive. As long as you have IaaS-level SSH
capability, you can get on the box and debug from there. We also recommend
configuring the HM (and your IaaS) to record metrics like CPU, RAM, etc. to
see if your VM becomes very busy for some reason.
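
A rough sequence for that (CLI v1 syntax; log paths are the stemcell defaults):

bosh vm resurrection off        # stop the director from recreating VMs
bosh vms --details              # watch for the agent state going unresponsive
# then ssh in via your IaaS-level access and inspect the logs:
ssh vcap@<vm-ip>
sudo tail -n 200 /var/vcap/bosh/log/current   # agent log
ls /var/vcap/sys/log                          # per-job logs

On the detection side, the health monitor flags an agent as unresponsive after missed heartbeats; if memory serves, the relevant knob is hm.intervals.agent_timeout in the director manifest, but verify that against your bosh release version.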

On Wed, Aug 24, 2016 at 3:56 PM, Stanley Shen <meteorping(a)gmail.com> wrote:

Hello all

I have a VM that is recreated often during one of our tests.
I have been trying to find out what causes the recreation, but I have no clue yet.

Are there any hints on how to check this?

How to debug the VM recreation

Stanley Shen
 

Hello all

I have a VM that is recreated often during one of our tests.
I have been trying to find out what causes the recreation, but I have no clue yet.

Are there any hints on how to check this?

Bosh Lite IP routing

Janke, Thomas <thomas.janke@...>
 

Hi,

I am running Bosh Lite on AWS and I can't get IP routing working so that I can access container IPs from outside the Bosh Lite VM. My setup looks like this:

- My Bosh Lite VM with IP 172.31.27.144 is running a container with IP 10.244.10.184
- I can access the container via the internal IP from within the Bosh Lite VM
- I have another VM I want to access the container from
- From that VM I can reach the Bosh Lite IP without any problem
- I added a route there (route add -host 10.244.10.184 gw 172.31.27.144). Still I cannot reach the container from that VM

I usually run a similar setup locally on VirtualBox without any problems. AWS security groups should not be a problem. I even tried with all traffic enabled for inbound and outbound connections.
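
A couple of things worth checking (a sketch; the EC2-specific step is an assumption based on how AWS handles forwarded traffic, not something from the Bosh Lite docs):

# on the Bosh Lite VM: confirm the kernel will forward routed packets
sysctl net.ipv4.ip_forward        # should print 1

# on the client VM: route container traffic via the Bosh Lite VM
sudo route add -host 10.244.10.184 gw 172.31.27.144

# on AWS, the Bosh Lite instance's EC2 "Source/Dest Check" usually has
# to be disabled, since it accepts packets whose destination IP
# (10.244.x.x) is not its own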

Any ideas? I am desperately looking for a solution or at least a hint.

Thanks,
Thomas

Re: bosh-init deploy fails with 'No valid placement found for disks'

Neil Watson
 

I'm certain that the ESX host has access to those datastores. The director was using one of them before, when the pattern was hard-coded to just the one. Then I changed the pattern so it would match all datastores, and now I get this error.

Re: bosh-init deploy fails with 'No valid placement found for disks'

Neil Watson
 

I'm certain that the vSphere cluster has access to all of those datastores.

Re: bosh-init deploy fails with 'No valid placement found for disks'

Geoff Franks <geoff@...>
 

I believe it means that of the datastores you've specified for bosh disks, none are accessible by the ESX host that you tried to build the bosh director on. You might need to specify the datacenter/cluster for the bosh director as one that can access those datastores.
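
A sketch of the relevant bosh-init manifest section (datacenter name hypothetical; the cluster and patterns are taken from the error output):

cloud_provider:
  properties:
    vcenter:
      datacenters:
      - name: MY_DC                  # hypothetical; use your vSphere datacenter
        clusters: [BOSH_CL]          # pick a cluster that can reach the datastores
        datastore_pattern: datastore[1-8]
        persistent_datastore_pattern: datastore[1-8]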

On Aug 24, 2016, at 3:39 PM, Neil Watson <neil(a)watson-wilson.ca> wrote:

What does this error mean and how can I fix it?

Starting registry... Finished (00:00:00)
Uploading stemcell 'bosh-vsphere-esxi-ubuntu-trusty-go_agent/3262.5'... Failed (00:00:08)
Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)

Command 'deploy' failed:
creating stemcell (bosh-vsphere-esxi-ubuntu-trusty-go_agent 3262.5):
CPI 'create_stemcell' method responded with error: CmdError{"type":"Bosh::Clouds::CloudError","message":"No valid placement found for disks:\n- Size: 528, Target DS Pattern: /datastore[1-8]/, Current Location: N/A\n\nPossible placement options:\n- Cluster name: BOSH_CL\n Datastores:\n - Name: datastore1, free space: 1774121\n - Name: datastore8, free space: 1605822\n - Name: datastore7, free space: 1802097\n - Name: datastore5, free space: 1787646\n - Name: datastore4, free space: 635180\n - Name: datastore3, free space: 196208\n - Name: datastore2, free space: 611371\n - Name: datastore6, free space: 1170125\n\n","ok_to_retry":false}

bosh-init deploy fails with 'No valid placement found for disks'

Neil Watson
 

What does this error mean and how can I fix it?

Starting registry... Finished (00:00:00)
Uploading stemcell 'bosh-vsphere-esxi-ubuntu-trusty-go_agent/3262.5'... Failed (00:00:08)
Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)

Command 'deploy' failed:
creating stemcell (bosh-vsphere-esxi-ubuntu-trusty-go_agent 3262.5):
CPI 'create_stemcell' method responded with error: CmdError{"type":"Bosh::Clouds::CloudError","message":"No valid placement found for disks:\n- Size: 528, Target DS Pattern: /datastore[1-8]/, Current Location: N/A\n\nPossible placement options:\n- Cluster name: BOSH_CL\n Datastores:\n - Name: datastore1, free space: 1774121\n - Name: datastore8, free space: 1605822\n - Name: datastore7, free space: 1802097\n - Name: datastore5, free space: 1787646\n - Name: datastore4, free space: 635180\n - Name: datastore3, free space: 196208\n - Name: datastore2, free space: 611371\n - Name: datastore6, free space: 1170125\n\n","ok_to_retry":false}

Re: Bosh scale up process

Sundarajan Srinivasan
 

Thanks Ronak.

I didn't mean cf releases. I am deploying databases with BOSH. When I increase the instance count in the deployment YAML file and redeploy, BOSH restarts the existing instances and does a canary update.

I am trying to understand whether there is a set procedure that would let me avoid these restarts during the canary update.

Thanks
Sundar

Re: Bosh scale up process

Ronak Banka
 

Hi Sundar,

If you are referring to cf-release jobs, they already have drain
scripts which take care of clean shutdown and restart. So if you have
a sufficient number of instances running to keep the components in HA,
there shouldn't be any issue.
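
For custom releases, a drain script is the usual mechanism. A hypothetical minimal sketch (jobs/<job>/templates/drain; the endpoint and timings are made up):

#!/bin/bash
# tell the service to stop accepting new work (hypothetical admin endpoint)
curl -s -X POST http://127.0.0.1:8080/admin/quiesce
# give in-flight work a moment to finish
sleep 10
# a drain script prints the number of seconds BOSH should wait; 0 means done
echo 0

BOSH runs this before stopping the job, so existing instances hand off work cleanly instead of being killed mid-request.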

Thanks
Ronak

On Thu, Aug 18, 2016 at 4:57 PM, Sundarajan Srinivasan <
sundarajan.s(a)gmail.com> wrote:

I find that BOSH restarts all existing processes while increasing the number
of instances in a job. Is there any approach to achieve zero downtime
while scaling a job up or down?

Thanks
Sundar

Bosh scale up process

Sundarajan Srinivasan
 

I find that BOSH restarts all existing processes while increasing the number of instances in a job. Is there any approach to achieve zero downtime while scaling a job up or down?

Thanks
Sundar

Re: error generating deployment manifest from stub file

Amit Kumar Gupta
 

Hi Sankeerth,

Can you open an issue on
https://github.com/cloudfoundry-incubator/consul-release/issues? It's
helpful if you can mention what version of cf-release you're using, and if
you can include a link to your manifest file, with any sensitive data
redacted. Chances are the certificates you've put in for Consul are
malformed, or the manifest YAML where you've actually declared the certs is
syntactically incorrect.
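
For reference, a well-formed cert property is a PEM block embedded as a YAML block scalar with consistent indentation, roughly like this (property path hypothetical; check the consul-release spec for the real names):

properties:
  consul:
    ca_cert: |
      -----BEGIN CERTIFICATE-----
      ...full PEM body, indented consistently...
      -----END CERTIFICATE-----

Also note that an encrypted private key will generally fail to parse; the agent expects an unencrypted PEM key.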

Best,
Amit

On Wed, Aug 17, 2016 at 6:58 PM, Sankeerth Sai <sankyrth4u(a)gmail.com> wrote:

Thanks Ronak, that worked. I declared those properties and the deployment
manifest got generated.
However, when I tried to deploy, most of the VMs were deployed but consul
failed, resulting in a deployment error:
Director task 302
Deprecation: Ignoring cloud config. Manifest contains 'networks' section.

Started preparing deployment > Preparing deployment. Done (00:00:01)

Started preparing package compilation > Finding packages to compile.
Done (00:00:00)

Started updating job consul_z1 > consul_z1/0 (xxxxxxxx) (canary).
Failed: 'consul_z1/0 (xxxxxx)' is not running after update. Review logs for
failed jobs: consul_agent (00:10:36)

Error 400007: 'consul_z1/0 (xxx)' is not running after update. Review logs
for failed jobs: consul_agent

The logs under /var/vcap/sys/log indicate "error parsing any
certificates". I checked the consul certificates many times but couldn't
figure out the issue. I used certstrap to generate self-signed certs
with an encrypted key. Do I need to declare any additional consul properties
in my deployment manifest before doing bosh deploy, or is it purely a
syntax-related issue?

Re: error generating deployment manifest from stub file

Sankeerth Sai
 

Thanks Ronak, that worked. I declared those properties and the deployment manifest got generated.
However, when I tried to deploy, most of the VMs were deployed but consul failed, resulting in a deployment error:
Director task 302
Deprecation: Ignoring cloud config. Manifest contains 'networks' section.

Started preparing deployment > Preparing deployment. Done (00:00:01)

Started preparing package compilation > Finding packages to compile. Done (00:00:00)

Started updating job consul_z1 > consul_z1/0 (xxxxxxxx) (canary). Failed: 'consul_z1/0 (xxxxxx)' is not running after update. Review logs for failed jobs: consul_agent (00:10:36)

Error 400007: 'consul_z1/0 (xxx)' is not running after update. Review logs for failed jobs: consul_agent

The logs under /var/vcap/sys/log indicate "error parsing any certificates". I checked the consul certificates many times but couldn't figure out the issue. I used certstrap to generate self-signed certs with an encrypted key. Do I need to declare any additional consul properties in my deployment manifest before doing bosh deploy, or is it purely a syntax-related issue?

Re: error generating deployment manifest from stub file

Ronak Banka
 

Hi Sankeerth,

Check for the missing properties in the global properties block; see whether the
properties below exist in your intermediate manifest or not.

properties:
  cc:
    external_port: ??   # you can add this port as 9022 if it is not there

  nats:
    machines: ??        # this property will be populated from the nats job IPs

Thanks
Ronak

On Wed, Aug 17, 2016 at 5:08 AM, Sankeerth Sai <sankyrth4u(a)gmail.com> wrote:

I was trying to deploy Cloud Foundry on vSphere. After creating the stub file
and running the script to generate the deployment manifest, I ran into this error:

error generating manifest: unresolved nodes:
(( .properties.cc.external_port )) in /home/ssanke001c/cf-release/templates/cf.yml
jobs.[9].properties.route_registrar.routes.[0].port
(( .properties.nats.machines )) in /home/ssanke001c/cf-release/templates/cf.yml
properties.etcd_metrics_server.nats.machines
(( .properties.cc.external_port )) in
/home/ssanke001c/cf-release/templates/cf.yml meta.api_routes.[0].port

However, I forced the deployment ignoring the above errors, and as a result
the deployment failed.
Also, I had seen a few more errors related to unresolved nodes (static
IPs), because cf.yml assumes two networks, cf1 and cf2, and I have only cf1. I
configured cf-infrastructure-vsphere.yml and cf.yml from the templates by
removing the jobs associated with cf2, and was able to fix that issue. Can
someone help me understand how to deal with the external-port and
route-registrar issues and how to configure them in my deployment
manifest? Thanks in advance.

error generating deployment manifest from stub file

Sankeerth Sai
 

I was trying to deploy Cloud Foundry on vSphere. After creating the stub file and running the script to generate the deployment manifest, I ran into this error:

error generating manifest: unresolved nodes:
(( .properties.cc.external_port )) in /home/ssanke001c/cf-release/templates/cf.yml jobs.[9].properties.route_registrar.routes.[0].port
(( .properties.nats.machines )) in /home/ssanke001c/cf-release/templates/cf.yml properties.etcd_metrics_server.nats.machines
(( .properties.cc.external_port )) in /home/ssanke001c/cf-release/templates/cf.yml meta.api_routes.[0].port

However, I forced the deployment ignoring the above errors, and as a result the deployment failed.
Also, I had seen a few more errors related to unresolved nodes (static IPs), because cf.yml assumes two networks, cf1 and cf2, and I have only cf1. I configured cf-infrastructure-vsphere.yml and cf.yml from the templates by removing the jobs associated with cf2, and was able to fix that issue. Can someone help me understand how to deal with the external-port and route-registrar issues and how to configure them in my deployment manifest? Thanks in advance.
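
For reference, an unresolved (( ... )) node means the stub never supplied that value. A stub fragment that would satisfy the first two (the 9022 port follows Ronak's suggestion above; the IP is a placeholder) might look like:

properties:
  cc:
    external_port: 9022
  nats:
    machines: [10.0.0.10]   # placeholder; use your actual nats job IPs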