Re: cf-stub.yml example with minimum or required info


CF Runtime
 

Hi Ahmed,

Sorry for the slow reply. The errors you are getting appear to stem from
the deployment not being able to reach its own endpoints.

To do a `cf apps`, the CLI performs a request to the spaces summary
endpoint, and the summary endpoint contacts hm9000.10.195.166.108.xip.io to
get instance status.

When you try to view logs, the logging service contacts
api.10.195.166.108.xip.io to determine if you have access to the logs you
are trying to view.

Most likely your network configuration does not allow the Cloud Foundry
components to reach each other via DNS hostnames. This may be a routing
problem.
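
A rough way to test this from a machine on the same network as the CF VMs
is sketched below. It is only an illustration and not part of cf-release:
the helper names `xip_ip`, `can_connect` and `report`, and the choice of
port 443, are assumptions based on the endpoints shown in this thread.

```python
import ipaddress
import socket


def xip_ip(hostname):
    """Return the IPv4 address embedded in an xip.io hostname, or None."""
    suffix = ".xip.io"
    if not hostname.endswith(suffix):
        return None
    labels = hostname[: -len(suffix)].split(".")
    if len(labels) < 4:
        return None
    candidate = ".".join(labels[-4:])
    try:
        return str(ipaddress.IPv4Address(candidate))
    except ValueError:
        return None


def can_connect(host, port, timeout=3.0):
    """Resolve host via DNS and attempt a TCP connection; True on success."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False


def report(hosts, port=443):
    """Print, per host, the embedded IP and whether a TCP connect works."""
    for host in hosts:
        print(host, "expected IP:", xip_ip(host),
              "reachable:", can_connect(host, port))
```

Calling `report(["api.10.195.166.108.xip.io",
"hm9000.10.195.166.108.xip.io"])` from one of the VMs would show whether
the hostnames resolve and accept connections there. A failure only says
that name resolution or the TCP path is broken, not which of the two.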

Joseph & Dan
CF OSS Release Integration Team


On Thu, Jul 2, 2015 at 3:05 PM, Ahmed Ali (ahmeali) <ahmeali(a)cisco.com>
wrote:

Here is the output of `cf curl /v2/info`

M-600H:dora ahmeali$ cf curl /v2/info

{
  "name": "vcap",
  "build": "2222",
  "support": "http://support.cloudfoundry.com",
  "version": 2,
  "description": "Cloud Foundry sponsored by Pivotal",
  "authorization_endpoint": "https://login.10.195.166.108.xip.io",
  "token_endpoint": "https://uaa.10.195.166.108.xip.io",
  "min_cli_version": null,
  "min_recommended_cli_version": null,
  "api_version": "2.28.0",
  "app_ssh_endpoint": "ssh.10.195.166.108.xip.io:2222",
  "app_ssh_host_key_fingerprint": null,
  "logging_endpoint": "wss://loggregator.10.195.166.108.xip.io:443",
  "user": "464dc83e-2993-4e14-b777-5291867140df"
}



M-600H:dora ahmeali$ cf apps

Getting apps in org *pivotal* / space *development* as *admin*...

*FAILED*

Server error, status code: 500, error code: 10001, message: An unknown
error occurred.

M-600H:dora ahmeali$ cf push dora

Updating app *dora* in org *pivotal* / space *development* as *admin*...

*OK*


Uploading *dora*...

Uploading app files from:
/Users/ahmeali/deployment/apps/cf-acceptance-tests/assets/dora

Uploading 182.7K, 38 files

Done uploading

*OK*


Stopping app *dora* in org *pivotal* / space *development* as *admin*...

*OK*


*Warning: error tailing logs*

Unauthorized error: You are not authorized. Error: Invalid authorization

Starting app *dora* in org *pivotal* / space *development* as *admin*...


*FAILED*

StagingError


TIP: use '*cf logs dora --recent*' for more information




M-600H:dora ahmeali$ cf logs dora --recent

Connected, dumping recent logs for app *dora* in org *pivotal* / space
*development* as *admin*...


*FAILED*

Unauthorized error: You are not authorized. Error: Invalid authorization




Thanks

Ali



From: CF Runtime <cfruntime(a)gmail.com>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Thursday, July 2, 2015 at 2:46 PM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required info

Hi Ahmed,

Can you post the output of `cf curl /v2/info` against your
environment?

It looks like you may be missing some configuration in the section
below. The output of the above command should tell us what is missing.

properties:
  logger_endpoint:
    port:
    use_ssl:

Thanks,
Zach

On Wed, Jul 1, 2015 at 12:11 PM, Ahmed Ali (ahmeali) <ahmeali(a)cisco.com>
wrote:

Any idea where to go from here? Thanks!

From: AHMED ALI <ahmeali(a)cisco.com>
Date: Friday, June 26, 2015 at 2:24 PM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required info


I attached both cf-stub.yml and cf-deployment.yml in case you need to
see more. Below is the requested part from cf-deployment.yml and cf-stub.yml.


cf-deployment.yml
-----------------

properties:
  doppler:
    blacklisted_syslog_ranges: null
    debug: false
    maxRetainedLogMessages: 100
    unmarshaller_count: 5
  doppler_endpoint:
    shared_secret: loggregator_endpoint_secret
  dropsonde:
    enabled: true
  logger_endpoint: null
  loggregator:
    blacklisted_syslog_ranges: null
    debug: false
    maxRetainedLogMessages: 100
  loggregator_endpoint:
    shared_secret: loggregator_endpoint_secret

cf-stub.yml
-----------

properties:
  loggregator_endpoint:
    shared_secret: loggregator_endpoint_secret


Thank you for looking into this!

Ahmed

From: CF Runtime <cfruntime(a)gmail.com>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Friday, June 26, 2015 at 9:50 AM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required info

Hi Ahmed,

What are the properties for your loggregator/doppler endpoints? This
error can happen when SSL is configured incorrectly (the CLI trying to
connect to an unencrypted port, for example)
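
As a rough illustration of that kind of mismatch, the sketch below parses
a `logging_endpoint` value as returned by `cf curl /v2/info` and reports
its scheme, host and port. The names `describe_endpoint` and
`looks_mismatched` and the default-port table are assumptions made for
this example; they do not describe what the CLI does internally.

```python
from urllib.parse import urlsplit

# Conventional default ports per scheme (an assumption for this sketch).
DEFAULT_PORTS = {"ws": 80, "wss": 443, "http": 80, "https": 443}
ENCRYPTED_SCHEMES = {"wss", "https"}


def describe_endpoint(url):
    """Split an endpoint URL into scheme, host, port and an 'encrypted' flag."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "encrypted": parts.scheme in ENCRYPTED_SCHEMES,
    }


def looks_mismatched(url):
    """Flag an encrypted scheme pointed at port 80, or a plain one at 443."""
    info = describe_endpoint(url)
    if info["encrypted"] and info["port"] == 80:
        return True
    if not info["encrypted"] and info["port"] == 443:
        return True
    return False
```

For example, `describe_endpoint("wss://loggregator.example.com:443")`
reports an encrypted scheme on port 443, which is consistent, while
`looks_mismatched("wss://loggregator.example.com:80")` returns True. A
consistent URL does not prove the SSL setup behind it is correct; it only
rules out the most obvious scheme/port disagreement.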

Best,
Zak + Dwayne CF Runtime + LAMB teams.

On Fri, Jun 19, 2015 at 11:11 AM, Ahmed Ali (ahmeali) <ahmeali(a)cisco.com>
wrote:

Thanks Zak and Joseph!

After the changes the "no available stager" error no longer shows up, but
the "Unauthorized error" is still there. Did you get a chance to look at how
my SSL cert/key are placed in cf-stub.yml? An example from your side would
be a great help.


M-20JW:dora ali$ cf push dora

Creating app *dora* in org *me* / space *development* as *admin*...

*OK*


Using route *dora.10.195.166.18.xip.io*

Binding *dora.10.195.166.18.xip.io* to *dora*...

*OK*


Uploading *dora*...

Uploading app files from:
/Users/ahali/deployments/apps/cf-acceptance-tests/assets/dora

Uploading 182.7K, 38 files

Done uploading

*OK*


*Warning: error tailing logs*

Unauthorized error: You are not authorized. Error: Invalid authorization

Starting app *dora* in org *me* / space *development* as *admin*...

panic: runtime error: close of closed channel


goroutine 409 [running]:
runtime.panic(0x560540, 0xe4b4b5)
    /usr/local/go/src/pkg/runtime/panic.c:266 +0xb6
github.com/cloudfoundry/noaa.(*Consumer).retryAction(0xc21099d5a0, 0xc2108a95d0, 0xc21099d7e0, 0xc21099d840)
    /Users/pivotal/go-agent/pipelines/Mac-OSX-Unit-Tests/src/github.com/cloudfoundry/cli/Godeps/_workspace/src/github.com/cloudfoundry/noaa/consumer.go:473 +0x194
github.com/cloudfoundry/noaa.(*Consumer).TailingLogs(0xc21099d5a0, 0xc2100a21e0, 0x24, 0xc210889800, 0x3b6, ...)
    /Users/pivotal/go-agent/pipelines/Mac-OSX-Unit-Tests/src/github.com/cloudfoundry/cli/Godeps/_workspace/src/github.com/cloudfoundry/noaa/consumer.go:59 +0x108
github.com/cloudfoundry/cli/cf/api.(*noaaConsumer).TailingLogs(0xc210000560, 0xc2100a21e0, 0x24, 0xc210889800, 0x3b6, ...)
    /Users/pivotal/go-agent/pipelines/Mac-OSX-Unit-Tests/src/github.com/cloudfoundry/cli/tmp/cli_gopath/src/github.com/cloudfoundry/cli/cf/api/noaaConsumer.go:35 +0x78
created by github.com/cloudfoundry/cli/cf/api.(*logNoaaRepository).TailNoaaLogsFor
    /Users/pivotal/go-agent/pipelines/Mac-OSX-Unit-Tests/src/github.com/cloudfoundry/cli/tmp/cli_gopath/src/github.com/cloudfoundry/cli/cf/api/logs_noaa.go:131 +0x4cd



Thanks
Ahmed


From: CF Runtime <cfruntime(a)gmail.com>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Friday, June 19, 2015 at 10:39 AM

To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required info

"no available stagers" happens when the DEAs do not think they have
enough disk or memory to perform application staging.

In the properties section of your stub, make sure dea_next.disk_mb and
dea_next.memory_mb are set to match the resources available on the
instances.

Zak & Joseph
CF Runtime Team

On Thu, Jun 18, 2015 at 4:00 PM, Ahmed Ali (ahmeali) <ahmeali(a)cisco.com>
wrote:

After changing the domain to use xip.io and regenerating the SSL cert to
match it, I am still getting the same error, "*no available stagers*". With
CF_TRACE=true I see some authorization and HTTP 400 errors.

Could it be the placement of the SSL cert and ssh keys in the manifest? *Do
the SSL cert and ssh keypair look correct here?*

In cf-stub.yml I have the SSL cert placed in two sections.

SSL certificate:
"jobs > ha_proxy_z1 > properties > ha_proxy > ssl_pem:"
and here:
"properties > router > ssl_cert:"

The ssh keypair is placed under "properties > jwt".


Corresponding parts in the manifest:

jobs:
- name: ha_proxy_z1
  instances: 1
  properties:
    ha_proxy:
      ssl_pem: |
        -----BEGIN CERTIFICATE-----
        MIICszCCAhwCCQD9lGyWUwS67jANBgkqhkiG9w0BAQUFADCBnTELMAkGA1UEBhMC
        ..<commented to save space>..
        MQjIEwrUWMMQ6pdul2PqI9rC+Xl44mU=
        -----END CERTIFICATE-----
        -----BEGIN RSA PRIVATE KEY-----
        MIICXAIBAAKBgQC28AM9naDijbqu5lYvQTxYzUHL788v6e78PuTfqhCOOlxh0+iq
        ..<commented out>..
        In5G2A4WdwiYHWDWtBcySLyMfSGovZ8Tsax/6c0hqXE=
        -----END RSA PRIVATE KEY-----

properties:
  router:
    enable_ssl: true
    ssl_cert: |
      -----BEGIN CERTIFICATE-----
      MIICszCCAhwCCQD9lGyWUwS67jANBgkqhkiG9w0BAQUFADCBnTELMAkGA1UEBhMC
      ..<commented out>..
      MQjIEwrUWMMQ6pdul2PqI9rC+Xl44mU=
      -----END CERTIFICATE-----
    ssl_key: |
      -----BEGIN RSA PRIVATE KEY-----
      MIICXAIBAAKBgQC28AM9naDijbqu5lYvQTxYzUHL788v6e78PuTfqhCOOlxh0+iq
      ..<commented out>..
      In5G2A4WdwiYHWDWtBcySLyMfSGovZ8Tsax/6c0hqXE=
      -----END RSA PRIVATE KEY-----

  jwt:
    signing_key: |
      -----BEGIN RSA PRIVATE KEY-----
      MIICXAIBAAKBgQDHFr+KICms+tuT1OXJwhCUmR2dKVy7psa8xzElSyzqx7oJyfJ1
      ..<commented out>..
      4SlotYRHgPCEubokb2S1zfZDWIXW3HmggnGgM949TlY=
      -----END RSA PRIVATE KEY-----
    verification_key: |
      -----BEGIN PUBLIC KEY-----
      MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDHFr+KICms+tuT1OXJwhCUmR2d
      ..<commented out>..
      spULZVNRxq7veq/fzwIDAQAB
      -----END PUBLIC KEY-----




Another cause could be the proxy: it may be causing connectivity problems
between CF nodes if FQDNs are used for internal communication, since
internal reverse DNS is different from xip.io. If that is the case, how do
I go about configuring all nodes to use the proxy? And how can I verify
connectivity between nodes?

Thanks
Ahmed

From: Gwenn Etourneau <getourneau(a)pivotal.io>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Wednesday, June 17, 2015 at 6:43 PM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required
info

It should be there, but this can be a DNS problem. Take a look at xip.io
and remove all your DNS entries.

On Thu, Jun 18, 2015 at 10:38 AM, Ahmed Ali (ahmeali) <
ahmeali(a)cisco.com> wrote:

I'm not familiar with xip.io but will look into it or something similar in
this case. I was trying to avoid a complicated setup for the end user.

Regarding the DEA, I thought it was installed by default like in bosh-lite.
How do I go about deploying it?

Thanks Gwenn
On Jun 17, 2015 6:11 PM, Gwenn Etourneau <getourneau(a)pivotal.io>
wrote:

What about xip.io, or dnsmasq, to avoid such changes to your DNS
config?

Do you have a DEA?
Staging is done in the DEA (runner) VM.

On Thu, Jun 18, 2015 at 5:09 AM, Ahmed Ali (ahmeali) <
ahmeali(a)cisco.com> wrote:

Update:

The ha_proxy_z1 update job completes successfully after I changed the
stemcell to bosh-vsphere-esxi-ubuntu-trusty-go_agent and switched to
version cf-210. The deployment completes and I was able to create an org
and space.


I'm testing with this demo app,
https://github.com/cloudfoundry-samples/fib-cpu. When I run "cf push
fib-cpu" I get this error:

*FAILED*

Server error, status code: 400, error code: 170001, message: Staging
error: *no available stagers*


I suspect this is DNS related, since I'm temporarily using a local hosts
file to map all CF components. *How can I find the list of CF component
DNS names/hostnames*?


/etc/hosts

10.195.166.18 api.foundry-appx.company.com
10.195.166.18 login.foundry-appx.company.com
10.195.166.18 loggregator.foundry-appx.company.com
10.195.166.18 uaa.foundry-appx.company.com
10.195.166.18 hm9000.foundry-appx.company.com
10.195.166.18 console.foundry-appx.company.com
10.195.166.18 doppler.foundry-appx.company.com
10.195.166.18 fib-cpu.foundry-appx.company.com

Thank you
A





From: AHMED ALI <ahmeali(a)cisco.com>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Tuesday, June 16, 2015 at 10:48 PM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required
info

The stemcells are ubuntu, and /etc/resolvconf/resolv.conf.d/head
does not exist.

$bosh stemcells

+------------------------------------------+---------+-----------------------------------------+
| Name                                     | Version | CID                                     |
+------------------------------------------+---------+-----------------------------------------+
| bosh-vsphere-esxi-ubuntu                 | 2427*   | sc-7efaaf8d-9028-45a0-93e2-91a5045b85f0 |
| bosh-vsphere-esxi-ubuntu-trusty-go_agent | 2977    | sc-58e1de73-66a4-464a-a940-d8311fc405bf |
+------------------------------------------+---------+-----------------------------------------+

(*) Currently in-use

root@60c3204d-ad98-483f-85c6-a673717f108a:~# uname -a
Linux 60c3204d-ad98-483f-85c6-a673717f108a 3.0.0-32-virtual #51~lucid1 SMP Thu Mar 6 17:43:24 UTC 2014 x86_64 GNU/Linux


Thank you

From: Gwenn Etourneau <getourneau(a)pivotal.io>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Tuesday, June 16, 2015 at 10:28 PM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required
info

Are you using CentOS?

/etc/resolvconf/resolv.conf.d/head

On Wed, Jun 17, 2015 at 2:21 PM, Ahmed Ali (ahmeali) <
ahmeali(a)cisco.com> wrote:

These are the errors I found under /var/vcap/sys/log:


/var/vcap/sys/log/consul_template_ctl.err.log:

[2015-06-17 05:10:03+0000] ------------ STARTING consul_template_ctl at Wed Jun 17 05:10:03 UTC 2015 --------------
[2015-06-17 05:10:03+0000] 2015/06/17 05:10:03 [ERR] (runner) watcher reported error: health services: error fetching: Get http://127.0.0.1:8500/v1/health/service/ssh-proxy?wait=60000ms: dial tcp 127.0.0.1:8500: connection refused
[2015-06-17 05:10:08+0000] 2015/06/17 05:10:08 [ERR] (runner) watcher reported error: health services: error fetching: Get http://127.0.0.1:8500/v1/health/service/ssh-proxy?wait=60000ms: dial tcp 127.0.0.1:8500: connection refused
[2015-06-17 05:10:13+0000] 2015/06/17 05:10:13 [ERR] (runner) watcher reported error: health services: error fetching: Get http://127.0.0.1:8500/v1/health/service/ssh-proxy?wait=60000ms: dial tcp 127.0.0.1:8500: connection refused
[2015-06-17 05:10:14+0000] Received interrupt, cleaning up...
[2015-06-17 05:10:14+0000] /var/vcap/jobs/haproxy/bin/consul_template_ctl: line 46: exit: : numeric argument required


/var/vcap/sys/log/monit/consul_agent.err.log:

/var/vcap/jobs/consul_agent/bin/agent_ctl: line 39: /etc/resolvconf/resolv.conf.d/head: No such file or directory


/var/vcap/sys/log/metron_agent/metron_agent.stdout.log:

{"timestamp":1434518129.113139153,"process_id":2263,"source":"metron","log_level":"warn","message":"Failed to create client: Could not connect to NATS: nats: No servers available for connection","data":null,"file":"/var/vcap/data/compile/metron_agent/loggregator/src/github.com/cloudfoundry/loggregatorlib/cfcomponent/registrars/collectorregistrar/collector_registrar.go","line":51,"method":"github.com/cloudfoundry/loggregatorlib/cfcomponent/registrars/collectorregistrar.(*CollectorRegistrar).Run"}


Note: There is a proxy server between all CF VMs and the internet.
Does CF make any connections to the outside? If so, where in cf-stub.yml
can the proxy be configured?

Thank you!


From: Gwenn Etourneau <getourneau(a)pivotal.io>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Tuesday, June 16, 2015 at 6:56 PM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required
info

bosh ssh to the haproxy and check the logs /var/vcap/sys/log.

On Wed, Jun 17, 2015 at 5:14 AM, Ahmed Ali (ahmeali) <
ahmeali(a)cisco.com> wrote:

Thank you!


After adding environment to the meta section, the deployment moves
forward but times out on "Started updating job ha_proxy_z1 > ha_proxy_z1/0
(canary)".


What changed:

meta:
  environment: cf

properties:
  domain: foundry-appx.company.com #(domain used in the ssl cert)



I think this could be related to the SSL cert and keys which should be
included in cf-stub.yml. I created the SSL certificate for "ssl_pem" by
following this link
<https://github.com/cloudfoundry/cf-release/tree/master/example_manifests>,
and also added the jwt signing_key, which was created using "ssh-keygen -t
rsa". I see another place (here
<http://docs.cloudfoundry.org/deploying/cf-stub-vsphere.html>) where an
SSL cert/key is needed in cf-stub.yml, under "router > ssl_cert:", but I am
not sure if it is the same as ssl_pem. Any idea what I'm missing here?


"bosh cck" is coming out clean and no problems, "bosh vms" show all
VMs in running state except “ha_proxy_z1” in failing state.


# bosh deploy


Processing deployment manifest

------------------------------

Getting deployment properties from director...

Compiling deployment manifest...

Please review all changes carefully


Deploying

---------

Deployment name: `cf-deployment.yml'

Director name: `bosh2'

Are you sure you want to deploy? (type 'yes' to continue): yes


Director task 227

Started unknown

Started unknown > Binding deployment. Done (00:00:00)


Started preparing deployment

Started preparing deployment > Binding releases. Done (00:00:01)

Started preparing deployment > Binding existing deployment. Done
(00:00:00)

Started preparing deployment > Binding resource pools. Done
(00:00:00)

Started preparing deployment > Binding stemcells. Done (00:00:00)

Started preparing deployment > Binding templates. Done (00:00:00)

Started preparing deployment > Binding properties. Done (00:00:00)

Started preparing deployment > Binding unallocated VMs. Done
(00:00:00)

Started preparing deployment > Binding instance networks. Done
(00:00:00)


Started preparing package compilation > Finding packages to compile.
Done (00:00:00)


Started preparing dns > Binding DNS. Done (00:00:00)


Started preparing configuration > Binding configuration. Done
(00:00:02)


Started updating job ha_proxy_z1 > ha_proxy_z1/0 (canary). Failed: `ha_proxy_z1/0'
is not running after update (00:10:18)


Error 400007: `ha_proxy_z1/0' is not running after update




bosh task 227 --debug


E, [2015-06-16 19:41:51 #13416] [canary_update(ha_proxy_z1/0)]
*ERROR* -- DirectorJobRunner: *Error* updating canary instance:
#<Bosh::Director::AgentJobNotRunning: `ha_proxy_z1/0' is not running after
update>

I, [2015-06-16 19:41:51 #13416] [task:227] INFO -- DirectorJobRunner:
sending update deployment *error* event

D, [2015-06-16 19:41:51 #13416] [task:227] DEBUG -- DirectorJobRunner:
SENT: hm.director.alert
{"id":"52682ea9-0fa2-4edb-9611-128490279ba5","severity":3,"title":"director
- *error* during update deployment","summary":"*Error* during update
deployment for 'cf' against Director
'b9a1bf7b-952f-48e1-a496-f6543d7a782c':
#<Bosh::Director::AgentJobNotRunning: `ha_proxy_z1/0' is not running after
update>","created_at":1434483711}

E, [2015-06-16 19:41:51 #13416] [task:227] *ERROR* --
DirectorJobRunner: `ha_proxy_z1/0' is not running after update

D, [2015-06-16 19:41:51 #13416] [task:227] DEBUG -- DirectorJobRunner:
(0.000495s) UPDATE "tasks" SET "state" = '*error*', "timestamp" =
'2015-06-16 19:41:51.761481+0000', "description" = 'create deployment',
"result" = '`ha_proxy_z1/0'' is not running after update', "output" =
'/var/vcap/store/director/tasks/227', "checkpoint_time" = '2015-06-16
19:41:29.218401+0000', "type" = 'update_deployment', "username" = 'admin'
WHERE ("id" = 227)

Task 227 *error*

Thanks!


From: CF Runtime <cfruntime(a)gmail.com>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Monday, June 15, 2015 at 2:10 PM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required
info

Hi Ahmed,

This property comes from "templates/cf-lamb.yml" within cf-release.
You can overwrite it in your stub like so:

meta:
  environment: [name-of-environment]

Hope this helps,
Dan && James, CF Runtime Team

On Sun, Jun 14, 2015 at 7:13 PM, Ahmed Ali (ahmeali) <
ahmeali(a)cisco.com> wrote:

What should the value be then? Is it generated by spiff?

Should I place this in cf-stub.yml under properties to overwrite
what spiff is doing, as follows:

properties:
  metron_agent:
    deployment: <???>

I found this link discussing the same issue:
https://github.com/cloudfoundry/bosh-lite/issues/265 but could not
find an answer.

Thanks Gwenn



From: Gwenn Etourneau <getourneau(a)pivotal.io>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Sunday, June 14, 2015 at 6:58 PM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required
info

It seems this cannot be null in deployment.yml:

metron_agent:
  deployment: null
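
One way to spot properties that were left unresolved is to walk the
generated manifest and list every key whose value is null. The sketch
below does this on an already-loaded dictionary. `find_nulls` is a
hypothetical helper, the `manifest` dictionary is a hand-written stand-in
for a fragment of cf-deployment.yml, and loading the real file (for
example with a YAML library) is assumed to happen elsewhere.

```python
def find_nulls(node, path=""):
    """Return dotted paths of all keys whose value is None in nested dicts/lists."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if value is None:
                found.append(child)
            else:
                found.extend(find_nulls(value, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            child = f"{path}[{index}]"
            if value is None:
                found.append(child)
            else:
                found.extend(find_nulls(value, child))
    return found


# Hand-written stand-in for a fragment of the generated manifest.
manifest = {
    "properties": {
        "metron_agent": {"deployment": None},
        "dropsonde": {"enabled": True},
    }
}

print(find_nulls(manifest))  # prints ['properties.metron_agent.deployment']
```

Not every null is a problem, since some properties are legitimately null
in a generated manifest, so the list is only a starting point for
comparing against the properties the job templates require.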

On Mon, Jun 15, 2015 at 10:50 AM, Ahmed Ali (ahmeali) <
ahmeali(a)cisco.com> wrote:

Both cf-stub.yml and the spiff generated cf-deployment.yml are
attached.

I noticed a section called "- default_networks:" inserted between jobs
in cf-deployment.yml. Does this look normal?

Note: I did not edit cf-deployment, it is what I get from spiff.

Environment info:
Ubuntu 14

BOSH 1.2977.0

cf version 6.11.3-cebadc9-2015-05-20T19:00:58+00:00

spiff version 1.0.6

ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]
vSphere 5.5


#bosh status

Config

/root/.bosh_config


Director

Name bosh2

URL https://10.195.166.12:25555

Version 1.2976.0 (00000000)

User admin

UUID b9a1bf7b-952f-48e1-a496-f6543d7a782c

CPI vsphere

dns enabled (domain_name: bosh)

compiled_package_cache disabled

snapshots enabled


Deployment

Manifest /root/deployment/cf-deployment.yml

#bosh releases

+------+------------+-------------+
| Name | Versions   | Commit Hash |
+------+------------+-------------+
| cf   | 211+dev.1* | 2121dc64+   |
+------+------------+-------------+

(*) Currently deployed
(+) Uncommitted changes




Thank you!

From: Gwenn Etourneau <getourneau(a)pivotal.io>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Sunday, June 14, 2015 at 5:35 PM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required
info

Please show us your manifest; it seems something is missing.



On Mon, Jun 15, 2015 at 4:08 AM, Ahmed Ali (ahmeali) <
ahmeali(a)cisco.com> wrote:

Hi Joseph,

Thank you! I changed to using two different networks and now "bosh
deploy" works and all VMs are deployed successfully, but it looks like
there is a binding configuration error:

Binding configuration. Failed: Error filling in template
`metron_agent.json.erb' for `ha_proxy_z1/0' (line 5: Can't find property
`["metron_agent.deployment"]') (00:00:00)


Error 100: Error filling in template `metron_agent.json.erb' for
`ha_proxy_z1/0' (line 5: Can't find property `["metron_agent.deployment"]')


I also tried to connect using the cf CLI and could not:

root@cloudfoundry:~/deployment# cf api --skip-ssl-validation foundry-appx.domain.com

Setting api endpoint to *foundry-app.domain.com*...

*FAILED*
Error performing request: Get http://foundry-appx.domain.com/v2/info: dial tcp x.x.166.18:80: connection refused


Full run console log:

root@cloudfoundry:~/deployment# bosh deploy


Processing deployment manifest

------------------------------

Getting deployment properties from director...

Unable to get properties list from director, trying without it...

Compiling deployment manifest...

Cannot get current deployment information from director, possibly a
new deployment

Please review all changes carefully


Deploying

---------

Deployment name: `cf-deployment.yml'

Director name: `bosh2'

Are you sure you want to deploy? (type 'yes' to continue): yes


Director task 172

Started unknown

Started unknown > Binding deployment. Done (00:00:00)


Started preparing deployment

Started preparing deployment > Binding releases. Done (00:00:00)

Started preparing deployment > Binding existing deployment. Done
(00:00:00)

Started preparing deployment > Binding resource pools. Done
(00:00:00)

Started preparing deployment > Binding stemcells. Done (00:00:00)

Started preparing deployment > Binding templates. Done (00:00:00)

Started preparing deployment > Binding properties. Done (00:00:00)

Started preparing deployment > Binding unallocated VMs. Done
(00:00:00)

Started preparing deployment > Binding instance networks. Done
(00:00:00)


Started preparing package compilation > Finding packages to compile.
Done (00:00:00)


Started preparing dns > Binding DNS. Done (00:00:00)


Started creating bound missing vms

Started creating bound missing vms > small_z1/0

Started creating bound missing vms > small_z1/1

Started creating bound missing vms > small_z1/2

Started creating bound missing vms > small_z2/0

Started creating bound missing vms > small_z2/1

Started creating bound missing vms > medium_z1/0

Started creating bound missing vms > medium_z1/1

Started creating bound missing vms > medium_z1/2

Started creating bound missing vms > medium_z1/3

Started creating bound missing vms > medium_z1/4

Started creating bound missing vms > medium_z1/5

Started creating bound missing vms > medium_z1/6

Started creating bound missing vms > medium_z1/7

Started creating bound missing vms > medium_z1/8

Started creating bound missing vms > medium_z2/0

Started creating bound missing vms > medium_z2/1

Started creating bound missing vms > medium_z2/2

Started creating bound missing vms > medium_z2/3

Started creating bound missing vms > medium_z2/4

Started creating bound missing vms > large_z1/0

Started creating bound missing vms > large_z2/0

Started creating bound missing vms > runner_z1/0

Started creating bound missing vms > runner_z2/0

Started creating bound missing vms > router_z1/0

Started creating bound missing vms > router_z1/1

Started creating bound missing vms > router_z2/0

Done creating bound missing vms > medium_z1/0 (00:00:30)

Done creating bound missing vms > medium_z1/5 (00:00:32)

Done creating bound missing vms > small_z2/1 (00:00:34)

Done creating bound missing vms > medium_z1/2 (00:00:34)

Done creating bound missing vms > medium_z2/4 (00:00:35)

Done creating bound missing vms > medium_z1/1 (00:00:45)

Done creating bound missing vms > medium_z1/7 (00:00:45)

Done creating bound missing vms > small_z1/0 (00:00:46)

Done creating bound missing vms > router_z1/1 (00:00:47)

Done creating bound missing vms > medium_z2/2 (00:00:49)

Done creating bound missing vms > medium_z2/3 (00:00:49)

Done creating bound missing vms > large_z2/0 (00:00:51)

Done creating bound missing vms > medium_z1/4 (00:00:52)

Done creating bound missing vms > router_z1/0 (00:00:52)

Done creating bound missing vms > small_z1/1 (00:00:55)

Done creating bound missing vms > router_z2/0 (00:00:55)

Done creating bound missing vms > small_z2/0 (00:00:59)

Done creating bound missing vms > large_z1/0 (00:00:59)

Done creating bound missing vms > medium_z2/1 (00:01:00)

Done creating bound missing vms > medium_z1/6 (00:01:00)

Done creating bound missing vms > medium_z1/3 (00:01:01)

Done creating bound missing vms > medium_z1/8 (00:01:01)

Done creating bound missing vms > medium_z2/0 (00:01:01)

Done creating bound missing vms > runner_z2/0 (00:01:02)

Done creating bound missing vms > runner_z1/0 (00:01:02)

Done creating bound missing vms > small_z1/2 (00:01:03)

Done creating bound missing vms (00:01:03)


Started binding instance vms

Started binding instance vms > ha_proxy_z1/0

Started binding instance vms > nats_z1/0

Started binding instance vms > nats_z2/0

Started binding instance vms > etcd_z1/0

Started binding instance vms > etcd_z1/1

Started binding instance vms > etcd_z2/0

Started binding instance vms > stats_z1/0

Started binding instance vms > nfs_z1/0

Started binding instance vms > postgres_z1/0

Started binding instance vms > uaa_z1/0

Started binding instance vms > uaa_z2/0

Started binding instance vms > api_z1/0

Started binding instance vms > api_z2/0

Started binding instance vms > clock_global/0

Started binding instance vms > api_worker_z1/0

Started binding instance vms > api_worker_z2/0

Started binding instance vms > hm9000_z1/0

Started binding instance vms > hm9000_z2/0

Started binding instance vms > runner_z1/0

Started binding instance vms > runner_z2/0

Started binding instance vms > loggregator_z1/0

Started binding instance vms > loggregator_z2/0

Started binding instance vms > loggregator_trafficcontroller_z1/0

Started binding instance vms > router_z1/0

Started binding instance vms > loggregator_trafficcontroller_z2/0

Started binding instance vms > router_z2/0

Done binding instance vms > etcd_z2/0 (00:00:00)

Done binding instance vms > ha_proxy_z1/0 (00:00:00)

Done binding instance vms > nats_z1/0 (00:00:00)

Done binding instance vms > nats_z2/0 (00:00:00)

Done binding instance vms > clock_global/0 (00:00:00)

Done binding instance vms > etcd_z1/0 (00:00:00)

Done binding instance vms > uaa_z1/0 (00:00:00)

Done binding instance vms > nfs_z1/0 (00:00:00)

Done binding instance vms > postgres_z1/0 (00:00:00)

Done binding instance vms > api_z2/0 (00:00:00)

Done binding instance vms > api_z1/0 (00:00:00)

Done binding instance vms > uaa_z2/0 (00:00:00)

Done binding instance vms > etcd_z1/1 (00:00:00)

Done binding instance vms > stats_z1/0 (00:00:00)

Done binding instance vms > hm9000_z2/0 (00:00:00)

Done binding instance vms > hm9000_z1/0 (00:00:00)

Done binding instance vms > runner_z1/0 (00:00:00)

Done binding instance vms > loggregator_z1/0 (00:00:00)

Done binding instance vms > loggregator_z2/0 (00:00:00)

Done binding instance vms > loggregator_trafficcontroller_z1/0
(00:00:00)

Done binding instance vms > loggregator_trafficcontroller_z2/0
(00:00:00)

Done binding instance vms > runner_z2/0 (00:00:00)

Done binding instance vms > router_z2/0 (00:00:00)

Done binding instance vms > router_z1/0 (00:00:00)

Done binding instance vms > api_worker_z1/0 (00:00:01)

Done binding instance vms > api_worker_z2/0 (00:00:01)

Done binding instance vms (00:00:01)


Started preparing configuration > Binding configuration. Failed: Error
filling in template `metron_agent.json.erb' for `ha_proxy_z1/0' (line 5:
Can't find property `["metron_agent.deployment"]') (00:00:00)


Error 100: Error filling in template `metron_agent.json.erb' for
`ha_proxy_z1/0' (line 5: Can't find property `["metron_agent.deployment"]')


Task 172 error


For a more detailed error report, run: bosh task 172 --debug

root@cloudfoundry:~/deployment# bosh cck

Performing cloud check...


Processing deployment manifest

------------------------------


Director task 173

Started scanning 26 vms

Started scanning 26 vms > Checking VM states. Done (00:00:00)

Started scanning 26 vms > 26 OK, 0 unresponsive, 0 missing, 0
unbound, 0 out of sync. Done (00:00:00)

Done scanning 26 vms (00:00:00)


Started scanning 0 persistent disks

Started scanning 0 persistent disks > Looking for inactive disks.
Done (00:00:00)

Started scanning 0 persistent disks > 0 OK, 0 missing, 0 inactive,
0 mount-info mismatch. Done (00:00:00)

Done scanning 0 persistent disks (00:00:00)


Task 173 done


Started 2015-06-14 18:34:55 UTC

Finished 2015-06-14 18:34:55 UTC

Duration 00:00:00


Scan is complete, checking if any problems found...

No problems found

root@cloudfoundry:~/deployment# cf api --skip-ssl-validation foundry-appx.domain.com

Setting api endpoint to *foundry-app.doamin.com
<http://foundry-app.doamin.com>*...

*FAILED*

Error performing request: Get http://foundry-appx.domain.com/v2/info:
dial tcp x.x.166.18:80: connection refused




Note: the cluster is deployed inside a lab environment behind a proxy. I
did not configure CF for the proxy and am not sure whether I need to do
something specific.



Thanks!




From: CF Runtime <cfruntime(a)gmail.com>
Reply-To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Date: Friday, June 12, 2015 at 5:39 PM
To: "Discussions about the Cloud Foundry BOSH project." <
cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] cf-stub.yml example with minimum or required
info

BOSH assumes that any address in the network section that is not marked
static or reserved is available for any other instance in that zone.
Because your two subnets overlap, and you have not partitioned them using
the reserved sections, BOSH is also using that IP for an instance in the
other zone.
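
As a rough illustration of the rule above, here is a small Python sketch
(not BOSH's actual allocator) that computes the addresses each subnet
leaves free for dynamic assignment and checks whether two subnets share
any of them. The subnet ranges and reserved ranges are hypothetical; only
10.195.166.110 comes from this thread. The helper names `expand` and
`dynamic_pool` are made up for the example, and details such as the
gateway address are ignored.

```python
import ipaddress


def expand(spec):
    """Expand "a - b" (or a single address) into a set of addresses."""
    parts = [p.strip() for p in spec.split("-")]
    first = int(ipaddress.ip_address(parts[0]))
    last = int(ipaddress.ip_address(parts[-1]))
    return {ipaddress.ip_address(i) for i in range(first, last + 1)}


def dynamic_pool(subnet):
    """Addresses in the subnet that are neither reserved nor static."""
    pool = set(ipaddress.ip_network(subnet["range"]).hosts())
    for spec in subnet.get("reserved", []) + subnet.get("static", []):
        pool -= expand(spec)
    return pool


# Hypothetical layout: both zones cover the same range and reserve
# only the low addresses, so their dynamic pools are identical.
cf1 = {"range": "10.195.166.0/24",
       "reserved": ["10.195.166.1 - 10.195.166.100"]}
cf2 = {"range": "10.195.166.0/24",
       "reserved": ["10.195.166.1 - 10.195.166.100"]}
shared = dynamic_pool(cf1) & dynamic_pool(cf2)

# Hypothetical fix: each zone reserves the part the other zone uses.
cf1_fixed = {"range": "10.195.166.0/24",
             "reserved": ["10.195.166.1 - 10.195.166.100",
                          "10.195.166.181 - 10.195.166.254"]}
cf2_fixed = {"range": "10.195.166.0/24",
             "reserved": ["10.195.166.1 - 10.195.166.180"]}
shared_fixed = dynamic_pool(cf1_fixed) & dynamic_pool(cf2_fixed)

print(ipaddress.ip_address("10.195.166.110") in shared)  # True
print(len(shared_fixed))                                  # 0
```

With the first layout, an address such as 10.195.166.110 is free in both
zones, so it can be handed out twice. With the partitioned layout the two
pools no longer intersect.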

Normally, if you only have a single network, it is easier to just
set the instance count for jobs in the second zone to zero, and scale up
the jobs in the first zone to multiple instances if you want redundancy.

Joseph Palermo
CF Runtime Team

On Fri, Jun 12, 2015 at 1:29 PM, Ahmed Ali (ahmeali) <
ahmeali(a)cisco.com> wrote:

Sorry I did not see this reply from Gwenn Etourneau.

“bosh cck” found 6 problems. I tried option 2 (reboot) and also option 3
(recreate VM), but neither fixed the issue. I noticed that the problematic
VMs are using duplicate IPs. The network section in my manifest has 2
networks (cf1 and cf2) and there is no overlap; see the network section
below.

For example, the VM router_z1/0
(vm-c36efd49-7ac7-4b90-9779-b5192408e4a6) got the IP 10.195.166.110,
and another VM from the same deployment
(vm-2b532db0-36df-433d-8132-89c76a9c81c3) got the same IP,
10.195.166.110.

I also tried removing the “jobs” section and going with the defaults
generated by spiff, and ran into the same issue. Do I have to statically
assign an IP address to each job?


M-20JW:cf-release ahmeali$ bosh cck
Performing cloud check...

Processing deployment manifest
------------------------------

Director task 141
Started scanning 26 vms
Started scanning 26 vms > Checking VM states. Done (00:00:10)
Started scanning 26 vms > 20 OK, 6 unresponsive, 0 missing, 0
unbound, 0 out of sync. Done (00:00:00)
Done scanning 26 vms (00:00:10)

Started scanning 0 persistent disks
Started scanning 0 persistent disks > Looking for inactive disks.
Done (00:00:00)
Started scanning 0 persistent disks > 0 OK, 0 missing, 0 inactive, 0
mount-info mismatch. Done (00:00:00)
Done scanning 0 persistent disks (00:00:00)

Task 141 done

Started 2015-06-12 20:03:17 UTC
Finished 2015-06-12 20:03:27 UTC
Duration 00:00:10

Scan is complete, checking if any problems found...

Found 6 problems

Problem 1 of 6: Unknown VM (vm-e4da0933-52ba-473f-903d-a9ee09d1671f)
is not responding.
1. Ignore problem
2. Reboot VM
3. Recreate VM using last known apply spec
4. Delete VM reference (DANGEROUS!)
Please choose a resolution [1 - 4]: 2

Problem 2 of 6: hm9000_z1/0
(vm-5ea4c90d-247e-43ad-a189-ff4d2d781854) is not responding.
1. Ignore problem
2. Reboot VM
3. Recreate VM using last known apply spec
4. Delete VM reference (DANGEROUS!)
Please choose a resolution [1 - 4]: 2

Problem 3 of 6: router_z1/0
(vm-c36efd49-7ac7-4b90-9779-b5192408e4a6) is not responding.
1. Ignore problem
2. Reboot VM
3. Recreate VM using last known apply spec
4. Delete VM reference (DANGEROUS!)
Please choose a resolution [1 - 4]: 2

Problem 4 of 6: nfs_z1/0 (vm-868e2bb9-ac61-49f0-86fb-38d5c338201b)
is not responding.
1. Ignore problem
2. Reboot VM
3. Recreate VM using last known apply spec
4. Delete VM reference (DANGEROUS!)
Please choose a resolution [1 - 4]: 2

Problem 5 of 6: router_z2/0
(vm-773e83ee-c97f-4aa4-b163-d09a703a4678) is not responding.
1. Ignore problem
2. Reboot VM
3. Recreate VM using last known apply spec
4. Delete VM reference (DANGEROUS!)
Please choose a resolution [1 - 4]: 2

Problem 6 of 6: ha_proxy_z1/0
(vm-643f3c80-93cb-4ccf-9239-1f85a407a317) is not responding.
1. Ignore problem
2. Reboot VM
3. Recreate VM using last known apply spec
4. Delete VM reference (DANGEROUS!)
Please choose a resolution [1 - 4]: 2

Below is the list of resolutions you've provided
Please make sure everything is fine and confirm your changes

1. Unknown VM (vm-e4da0933-52ba-473f-903d-a9ee09d1671f) is not
responding
Reboot VM

2. hm9000_z1/0 (vm-5ea4c90d-247e-43ad-a189-ff4d2d781854) is not
responding
Reboot VM

3. router_z1/0 (vm-c36efd49-7ac7-4b90-9779-b5192408e4a6) is not
responding
Reboot VM

4. nfs_z1/0 (vm-868e2bb9-ac61-49f0-86fb-38d5c338201b) is not
responding
Reboot VM

5. router_z2/0 (vm-773e83ee-c97f-4aa4-b163-d09a703a4678) is not
responding
Reboot VM

6. ha_proxy_z1/0 (vm-643f3c80-93cb-4ccf-9239-1f85a407a317) is not
responding
Reboot VM

Apply resolutions? (type 'yes' to continue): yes
Applying resolutions...

Director task 142
Started applying problem resolutions
Started applying problem resolutions > unresponsive_agent 154:
Reboot VM. Failed: Agent is responding now, skipping resolution (00:00:00)
Started applying problem resolutions > unresponsive_agent 184:
Reboot VM. Failed: Agent is responding now, skipping resolution (00:00:00)
Started applying problem resolutions > unresponsive_agent 177:
Reboot VM. Done (00:00:30)
Started applying problem resolutions > unresponsive_agent 179:
Reboot VM. Failed: Agent is responding now, skipping resolution (00:00:00)
Started applying problem resolutions > unresponsive_agent 185:
Reboot VM. Done (00:00:11)
Started applying problem resolutions > unresponsive_agent 180: Reb

...
[Message clipped]
_______________________________________________
cf-bosh mailing list
cf-bosh(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-bosh
