cf-bosh Digest, Vol 4, Issue 66


Shaozhen Ding
 

I saw a client's environment on cf 210/211 having issues with bosh deploy
on an existing environment (a fresh install always works).
It happens to both the cloud controller and the cloud controller worker.
If I restart them manually with monit, they run fine.


But the deployment itself still fails:

Director task 594

Started unknown
Started unknown > Binding deployment. Done (00:00:00)

Started preparing deployment
Started preparing deployment > Binding releases. Done (00:00:00)
Started preparing deployment > Binding existing deployment. Done (00:00:01)
Started preparing deployment > Binding resource pools. Done (00:00:00)
Started preparing deployment > Binding stemcells. Done (00:00:00)
Started preparing deployment > Binding templates. Done (00:00:00)
Started preparing deployment > Binding properties. Done (00:00:01)
Started preparing deployment > Binding unallocated VMs. Done (00:00:00)
Started preparing deployment > Binding instance networks. Done (00:00:00)

Started preparing package compilation > Finding packages to compile. Done (00:00:00)

Started preparing dns > Binding DNS. Done (00:00:00)

Started preparing configuration > Binding configuration. Done (00:00:03)

Started updating job api_worker_z5 > api_worker_z5/0 (canary). Failed: `api_worker_z5/0' is not running after update (00:01:39)

Error 400007: `api_worker_z5/0' is not running after update

Task 594 error
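
For anyone triaging similar output, here is a minimal Python sketch (not part of the original message) that pulls the failed step and the final error out of saved task output. It assumes each step is on a single line in the "Started ... Done/Failed (hh:mm:ss)" shape shown above; the function name summarize_task is hypothetical.

```python
import re

# A few lines from the task output above, one step per line.
SAMPLE_OUTPUT = """\
Started preparing deployment > Binding releases. Done (00:00:00)
Started preparing configuration > Binding configuration. Done (00:00:03)
Started updating job api_worker_z5 > api_worker_z5/0 (canary). Failed: `api_worker_z5/0' is not running after update (00:01:39)
Error 400007: `api_worker_z5/0' is not running after update
"""

# Assumed line shapes, inferred only from the output quoted in this thread.
STEP_RE = re.compile(r"^Started (.+?)\. (Done|Failed)(?:: (.*))? \((\d{2}:\d{2}:\d{2})\)$")
ERROR_RE = re.compile(r"^Error (\d+): (.*)$")


def summarize_task(output):
    """Return (failed_steps, errors) found in BOSH task output text."""
    failed = []
    errors = []
    for line in output.splitlines():
        line = line.strip()
        step = STEP_RE.match(line)
        if step and step.group(2) == "Failed":
            # (step description, failure detail, elapsed time)
            failed.append((step.group(1), step.group(3), step.group(4)))
            continue
        error = ERROR_RE.match(line)
        if error:
            errors.append((int(error.group(1)), error.group(2)))
    return failed, errors


if __name__ == "__main__":
    failed_steps, task_errors = summarize_task(SAMPLE_OUTPUT)
    print(failed_steps)
    print(task_errors)
```

This only tells you which instance failed; the cause still has to come from the job logs on that VM.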

[2015-07-29 18:31:34+0000] ------------ STARTING cloud_controller_worker_ctl at Wed Jul 29 18:31:34 UTC 2015 --------------
[2015-07-29 18:31:36+0000] rake aborted!
[2015-07-29 18:31:36+0000] SignalException: SIGTERM
[2015-07-29 18:31:37+0000] Tasks: TOP => jobs:generic
[2015-07-29 18:31:37+0000] (See full trace by running task with --trace)
[2015-07-29 18:35:35+0000] ------------ STARTING cloud_controller_worker_ctl at Wed Jul 29 18:35:35 UTC 2015 --------------
[2015-07-29 18:35:37+0000] rake aborted!
[2015-07-29 18:35:37+0000] SignalException: SIGTERM
[2015-07-29 18:35:37+0000] Tasks: TOP => jobs:generic
[2015-07-29 18:35:37+0000] (See full trace by running task with --trace)
[2015-07-29 18:39:49+0000] ------------ STARTING cloud_controller_worker_ctl at Wed Jul 29 18:39:48 UTC 2015 --------------
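
The log above shows each start attempt being hit by SIGTERM about two seconds later. A small Python sketch (not from the original message) can make that pattern explicit when the log is longer. It assumes one entry per line with the "[YYYY-MM-DD HH:MM:SS+0000]" prefix seen above; summarize_restarts is a hypothetical helper name.

```python
import re
from datetime import datetime

# Lines copied from the cloud_controller_worker_ctl error log above,
# with each wrapped entry joined onto one line.
SAMPLE_LOG = """\
[2015-07-29 18:31:34+0000] ------------ STARTING cloud_controller_worker_ctl at Wed Jul 29 18:31:34 UTC 2015 --------------
[2015-07-29 18:31:36+0000] rake aborted!
[2015-07-29 18:31:36+0000] SignalException: SIGTERM
[2015-07-29 18:35:35+0000] ------------ STARTING cloud_controller_worker_ctl at Wed Jul 29 18:35:35 UTC 2015 --------------
[2015-07-29 18:35:37+0000] rake aborted!
[2015-07-29 18:35:37+0000] SignalException: SIGTERM
[2015-07-29 18:39:49+0000] ------------ STARTING cloud_controller_worker_ctl at Wed Jul 29 18:39:48 UTC 2015 --------------
"""

# Assumes every timestamp carries a +0000 offset, as in the sample.
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\+0000\] (.*)$")


def summarize_restarts(log_text):
    """Return (start_time, seconds_until_sigterm_or_None) per start attempt."""
    attempts = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(2)
        if "STARTING" in message:
            attempts.append([stamp, None])
        elif "SIGTERM" in message and attempts and attempts[-1][1] is None:
            attempts[-1][1] = (stamp - attempts[-1][0]).total_seconds()
    return [tuple(attempt) for attempt in attempts]


if __name__ == "__main__":
    for started, killed_after in summarize_restarts(SAMPLE_LOG):
        print(started, "SIGTERM after", killed_after, "seconds")
```

A process that is terminated seconds after every start suggests something outside the worker is stopping it, rather than the worker crashing by itself, but that is an inference and not something the log proves.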

On Wed, Jul 29, 2015 at 7:46 AM, <cf-bosh-request(a)lists.cloudfoundry.org>
wrote:

Send cf-bosh mailing list submissions to
cf-bosh(a)lists.cloudfoundry.org

To subscribe or unsubscribe via the World Wide Web, visit
https://lists.cloudfoundry.org/mailman/listinfo/cf-bosh
or, via email, send a message with subject or body 'help' to
cf-bosh-request(a)lists.cloudfoundry.org

You can reach the person managing the list at
cf-bosh-owner(a)lists.cloudfoundry.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of cf-bosh digest..."


Today's Topics:

1. Re: Failed updating job api_z1 > api_z1/0 (canary):
`api_z1/0' is not running after update (Tom Sherrod)


----------------------------------------------------------------------

Message: 1
Date: Wed, 29 Jul 2015 08:46:39 -0400
From: Tom Sherrod <tom.sherrod(a)gmail.com>
To: "Discussions about the Cloud Foundry BOSH project."
<cf-bosh(a)lists.cloudfoundry.org>
Subject: Re: [cf-bosh] Failed updating job api_z1 > api_z1/0 (canary):
`api_z1/0' is not running after update
Message-ID:
<
CADJy60EDqyMhW4Mhc9eHm+FCbSQL5BsFyf85DuWgbDfQP8otSQ(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

I am attempting a v213 install on OpenStack. First with
cf-boshworkspace and now with a manual install, api_z1/0 is not running after
update.

I did include:

debian_nfs_server:
  no_root_squash: true

I SSHed to the box, and these are the only errors across all the *.err.* logs:

[2015-07-29 12:19:38+0000] ------------ STARTING cloud_controller_ng_ctl at Wed Jul 29 12:19:38 UTC 2015 --------------
[2015-07-29 12:20:18+0000] ------------ STARTING cloud_controller_ng_ctl at Wed Jul 29 12:20:18 UTC 2015 --------------
[2015-07-29 12:20:18+0000] chown: changing ownership of '/var/vcap/nfs/shared': Operation not permitted
[2015-07-29 12:20:18+0000] chown: changing ownership of '/var/vcap/nfs/shared': Operation not permitted
[2015-07-29 12:20:18+0000] chown: changing ownership of '/var/vcap/nfs/shared': Operation not permitted

As suggested in another thread, nfs_mounter is also the last entry in the
templates list. Any pointers?

On Fri, Jul 24, 2015 at 1:45 PM, Johannes Hiemer <jvhiemer(a)gmail.com>
wrote:

Hi Matthias,
I faced the same issue on vSphere with NFS. Add the following to the
properties section of your cf-stub.yml:

debian_nfs_server:
  no_root_squash: true

Then run bosh deploy again.

On Fri, Jul 24, 2015 at 7:15 PM, Matthias Ender <Matthias.Ender(a)sas.com>
wrote:

I noticed this in /var/vcap/sys/log/routing-api/routing-api.log:

{"timestamp":"1437740486.292583704","source":"routing-api","message":"routing-api.database","log_level":1,"data":{"etcd-addresses":["http://10.10.3.8:4001"]}}
{"timestamp":"1437740486.293358803","source":"routing-api","message":"routing-api.failed to connect to etcd","log_level":2,"data":{"error":"sync cluster failed"}}
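
Since these log entries are one JSON object per line, they can be filtered mechanically. Below is a minimal Python sketch (not from the original message) that keeps only entries at or above a given log_level. It assumes that a higher log_level number means a more severe entry, which is consistent with the two lines shown here but is not confirmed in this thread; entries_at_or_above is a hypothetical helper name.

```python
import json

# The two routing-api log lines quoted above, one JSON object per line.
SAMPLE_LINES = [
    '{"timestamp":"1437740486.292583704","source":"routing-api","message":"routing-api.database","log_level":1,"data":{"etcd-addresses":["http://10.10.3.8:4001"]}}',
    '{"timestamp":"1437740486.293358803","source":"routing-api","message":"routing-api.failed to connect to etcd","log_level":2,"data":{"error":"sync cluster failed"}}',
]


def entries_at_or_above(lines, min_level):
    """Parse JSON log lines and keep those whose numeric log_level >= min_level."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # skip anything that is not a complete JSON object
        if not isinstance(entry, dict):
            continue
        level = entry.get("log_level")
        # Assumption: larger numbers are more severe.
        if isinstance(level, int) and level >= min_level:
            kept.append(entry)
    return kept


if __name__ == "__main__":
    for entry in entries_at_or_above(SAMPLE_LINES, 2):
        print(entry["message"], entry["data"])
```

To run it against a real file, pass the open file object as lines, for example entries_at_or_above(open(path), 2).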



From: cf-bosh-bounces(a)lists.cloudfoundry.org [mailto:cf-bosh-bounces(a)lists.cloudfoundry.org] On Behalf Of Johannes Hiemer
Sent: Friday, July 24, 2015 1:12 PM
To: Discussions about the Cloud Foundry BOSH project.
Subject: Re: [cf-bosh] Failed updating job api_z1 > api_z1/0 (canary): `api_z1/0' is not running after update



Hi Matthias,

Did you SSH into api_z1/0 and look at the specific error? The
problem is that this failure can arise from several different sources.



On Fri, Jul 24, 2015 at 6:57 PM, Matthias Ender <Matthias.Ender(a)sas.com>
wrote:

Dulanjalie,

did you ever get this resolved?

I am running into the same issue, with a blank-slate 212 install:



Failed updating job api_z1 > api_z1/0 (canary): `api_z1/0' is not running after update (00:12:18)



Error 400007: `api_z1/0' is not running after update



thanks,

Matthias



From: cf-bosh-bounces(a)lists.cloudfoundry.org [mailto:cf-bosh-bounces(a)lists.cloudfoundry.org] On Behalf Of Dulanjalie Dhanapala
Sent: Wednesday, June 24, 2015 1:14 PM
To: Discussions about the Cloud Foundry BOSH project.
Subject: Re: [cf-bosh] Failed updating job api_z1 > api_z1/0 (canary): `api_z1/0' is not running after update



Hi all,

I deleted my "cf" deployment and git cloned the latest https://github.com/cloudfoundry/cf-release.git

I am using the latest stemcell, 2999:

+-----------------------------------------+---------+--------------------+
| Name                                    | Version | CID                |
+-----------------------------------------+---------+--------------------+
| bosh-aws-xen-hvm-ubuntu-trusty-go_agent | 2989    | ami-8bc83be0 light |
| bosh-aws-xen-hvm-ubuntu-trusty-go_agent | 2999*   | ami-4dcc3526 light |
+-----------------------------------------+---------+--------------------+

(*) Currently in-use

Finally, I am using the 212 release:

bosh releases

+------+----------+-------------+
| Name | Versions | Commit Hash |
+------+----------+-------------+
| cf   | 193      | 54613ebb+   |
|      | 212*     | ae2ec7a5+   |
+------+----------+-------------+

(*) Currently deployed
(+) Uncommitted changes

It still fails with:

Failed updating job api_z1 > api_z1/0 (canary): `api_z1/0' is not running after update (00:12:18)

Error 400007: `api_z1/0' is not running after update



But the error logs are different:

http://fpaste.org/236236/65314143/

http://fpaste.org/236241/16579314/



It still reports "Buildpacks installation failed".



Sincerely,

Dulanjalie



On Wed, Jun 24, 2015 at 7:04 AM, Dulanjalie Dhanapala <
dulanjalie(a)gmail.com> wrote:

Thanks a lot for the response.

Pablo, Lev and Nic also gave me some pointers. I am redeploying with
cf-212.



thanks all.

Sincerely,

Dulanjalie



On Wed, Jun 24, 2015 at 2:03 AM, Gwenn Etourneau <getourneau(a)pivotal.io>
wrote:

The best way is to delete the deployment and redeploy, so that you start from a clean environment.







On Wed, Jun 24, 2015 at 5:07 PM, James Bayer <jbayer(a)pivotal.io> wrote:

This looks like the DB migrations may be messed up.



On Mon, Jun 22, 2015 at 10:32 PM, Dulanjalie Dhanapala <
dulanjalie(a)gmail.com> wrote:

Some more error messages

------------ STARTING cloud_controller_clock_ctl at Tue Jun 23 04:58:50 UTC 2015 --------------
c9/cloud_controller_ng/lib/cloud_controller/background_job_environment.rb:10:in `setup_environment'
lib/tasks/clock.rake:6:in `block (2 levels) in <top (required)>'
PG::UndefinedTable: ERROR: relation "apps" does not exist
LINE 1: SELECT * FROM "apps" LIMIT 1



------------ STARTING cloud_controller_ng_ctl at Tue Jun 23 04:58:52 UTC 2015 --------------
Preparing local package directory
Preparing local resource_pool directory
Preparing local droplet directory
{"timestamp":1435035533.863138,"message":"PG::UndefinedTable: ERROR: relation \"schema_migrations\" does not exist\nLINE 1: SELECT NULL AS \"nil\" FROM \"schema_migrations\" LIMIT 1\n ^: SELECT NULL AS \"nil\" FROM \"schema_migrations\" LIMIT 1","log_level":"error","source":"cc.db.migrations","data":{},"thread_id":69988649218860,"fiber_id":69988678772900,"process_id":2704,"file":"/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.1.0/gems/sequel-4.15.0/lib/sequel/database/logging.rb","lineno":70,"method":"block in log_each"}




=====================cloud_controller_clock_ctl.log===========================

{"timestamp":1435035531.961396,"message":"PG::UndefinedTable: ERROR: relation \"billing_events\" does not exist\nLINE 1: SELECT * FROM \"billing_events\" LIMIT 1\n ^: SELECT * FROM \"billing_events\" LIMIT 1","log_level":"error","source":"cc.background","data":{},"thread_id":69935043310380,"fiber_id":69935071682700,"process_id":2579,"file":"/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.1.0/gems/sequel-4.15.0/lib/sequel/database/logging.rb","lineno":70,"method":"block in log_each"}
{"timestamp":1435035531.9634626,"message":"PG::UndefinedTable: ERROR: relation \"billing_events\" does not exist\nLINE 1: SELECT * FROM \"billing_events\" LIMIT 1\n ^: SELECT * FROM \"billing_events\" LIMIT 1","log_level":"error","source":"cc.background","data":{},"thread_id":69935043310380,"fiber_id":69935071682700,"process_id":2579,"file":"/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.1.0/gems/sequel-4.15.0/lib/sequel/database/logging.rb","lineno":70,"method":"block in log_each"}
{"timestamp":1435035531.9643211,"message":"PG::UndefinedTable: ERROR: relation \"billing_events\" does not exist\nLINE 1: SELECT * FROM \"billing_events\" LIMIT 1\n ^: SELECT * FROM \"billing_events\" LIMIT 1","log_level":"error","source":"cc.background","data":{},"thread_id":69935043310380,"fiber_id":69935071682700,"process_id":2579,"file":"/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.1.0/gems/sequel-4.15.0/lib/sequel/database/logging.rb","lineno":70,"method":"block in log_each"}
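
The errors above all have the same shape: a relation that does not exist. A short Python sketch (not from the original message) can list which tables the logs report as missing, which helps judge whether the schema was never created at all. The sample lines are abbreviated versions of the entries above, and missing_relations is a hypothetical helper name.

```python
import json
import re

RELATION_RE = re.compile(r'relation "([^"]+)" does not exist')

# Abbreviated versions of the log entries above: one plain-text line and
# JSON lines whose "message" field carries the PostgreSQL error.
SAMPLE_LINES = [
    'PG::UndefinedTable: ERROR: relation "apps" does not exist',
    r'{"timestamp":1435035533.863138,"message":"PG::UndefinedTable: ERROR: relation \"schema_migrations\" does not exist","log_level":"error","source":"cc.db.migrations"}',
    r'{"timestamp":1435035531.961396,"message":"PG::UndefinedTable: ERROR: relation \"billing_events\" does not exist","log_level":"error","source":"cc.background"}',
    r'{"timestamp":1435035531.9634626,"message":"PG::UndefinedTable: ERROR: relation \"billing_events\" does not exist","log_level":"error","source":"cc.background"}',
]


def missing_relations(lines):
    """Return the distinct relation names reported missing, in first-seen order."""
    seen = []
    for line in lines:
        line = line.strip()
        try:
            parsed = json.loads(line)
            text = parsed.get("message", "") if isinstance(parsed, dict) else line
        except ValueError:
            text = line  # not JSON, so search the raw line instead
        for name in RELATION_RE.findall(text):
            if name not in seen:
                seen.append(name)
    return seen


if __name__ == "__main__":
    print(missing_relations(SAMPLE_LINES))
```

If schema_migrations itself shows up in the list, as it does here, that is at least consistent with the migrations never having run against this database.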



On Mon, Jun 22, 2015 at 10:17 PM, Dulanjalie Dhanapala <
dulanjalie(a)gmail.com> wrote:

I am seeing this error. I am checking other logs


=======================cloud_controller_worker_ctl.err.log==============================

rake aborted!
SignalException: SIGTERM
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.1.0/gems/httpclient-2.5.1/lib/httpclient/session.rb:20:in `require'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.1.0/gems/httpclient-2.5.1/lib/httpclient/session.rb:20:in `<top (required)>'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.1.0/gems/httpclient-2.5.1/lib/httpclient.rb:17:in `require'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.1.0/gems/httpclient-2.5.1/lib/httpclient.rb:17:in `<top (required)>'
/var/vcap/data/packages/cloud_controller_ng/27a4f1b135a8415d623996c6f3a7c668801d1ea3.1-2c0961678c940361c8069ddcac42449d59c222c9/cloud_controller_ng/app/controllers/runtime/files_controller.rb:1:in `require'
/var/vcap/data/packages/cloud_controller_ng/27a4f1b135a8415d623996c6f3a7c668801d1ea3.1-2c0961678c940361c8069ddcac42449d59c222c9/cloud_controller_ng/app/controllers/runtime/files_controller.rb:1:in `<top (required)>'
/var/vcap/data/packages/cloud_controller_ng/27a4f1b135a8415d623996c6f3a7c668801d1ea3.1-2c0961678c940361c8069ddcac42449d59c222c9/cloud_controller_ng/lib/cloud_controller/controllers.rb:4:in `require'
/var/vcap/data/packages/cloud_controller_ng/27a4f1b135a8415d623996c6f3a7c668801d1ea3.1-2c0961678c940361c8069ddcac42449d59c222c9/cloud_controller_ng/lib/cloud_controller/controllers.rb:4:in `block in <top (required)>'
/var/vcap/data/packages/cloud_controller_ng/27a4f1b135a8415d623996c6f3a7c668801d1ea3.1-2c0961678c940361c8069ddcac42449d59c222c9/cloud_controller_ng/lib/cloud_controller/controllers.rb:3:in `each'
/var/vcap/data/packages/cloud_controller_ng/27a4f1b135a8415d623996c6f3a7c668801d1ea3.1-2c0961678c940361c8069ddcac42449d59c222c9/cloud_controller_ng/lib/cloud_controller/controllers.rb:3:in `<top (required)>'
/var/vcap/data/packages/cloud_controller_ng/27a4f1b135a8415d623996c6f3a7c668801d1ea3.1-2c0961678c940361c8069ddcac42449d59c222c9/cloud_controller_ng/lib/cloud_controller.rb:46:in `require'
/var/vcap/data/packages/cloud_controller_ng/27a4f1b135a8415d623996c6f3a7c668801d1ea3.1-2c0961678c940361c8069ddcac42449d59c222c9/cloud_controller_ng/lib/cloud_controller.rb:46:in `<top (required)>'
/var/vcap/data/packages/cloud_controller_ng/27a4f1b135a8415d623996c6f3a7c668801d1ea3.1-2c0961678c940361c8069ddcac42449d59c222c9/cloud_controller_ng/Rakefile:10:in `require'
/var/vcap/data/packages/cloud_controller_ng/27a4f1b135a8415d623996c6f3a7c668801d1ea3.1-2c0961678c940361c8069ddcac42449d59c222c9/cloud_controller_ng/Rakefile:10:in `<top (required)>'
(See full trace by running task with --trace)



On Mon, Jun 22, 2015 at 9:39 PM, Gwenn Etourneau <getourneau(a)pivotal.io>
wrote:

I don't think that is an error. Please check all the log files.



On Tue, Jun 23, 2015 at 1:15 PM, Dulanjalie Dhanapala <
dulanjalie(a)gmail.com> wrote:


I am hitting an error when I run bosh deploy on a cf deployment based on
minimal-aws.yml.



Failed updating job api_z1 > api_z1/0 (canary): `api_z1/0' is not
running after update (00:12:11)

Error 400007: `api_z1/0' is not running after update

Any suggestions to resolve this? I asked in #bosh on freenode but no one
responded.



I see this entry related to api_z1 in ./cloud_controller_worker_ctl.log:

./cloud_controller_worker_ctl.log:{"timestamp":1434996874.6546366,"message":"(0.003843s) UPDATE \"delayed_jobs\" SET \"guid\" = 'a3af8feb-30b7-4b4d-91fd-112045501a0d', \"created_at\" = '2015-06-22 18:14:32.651086+0000', \"updated_at\" = CURRENT_TIMESTAMP, \"priority\" = 0, \"attempts\" = 0, \"handler\" = '--- !ruby/struct:VCAP::CloudController::Jobs::ExceptionCatchingJob\nhandler: !ruby/struct:VCAP::CloudController::Jobs::RequestJob\n job: !ruby/struct:VCAP::CloudController::Jobs::TimeoutJob\n job: !ruby/struct:VCAP::CloudController::Jobs::Runtime::BuildpackInstaller\n name: java_buildpack\n file: \"/var/vcap/packages/buildpack_java/java-buildpack-v2.5.zip\"\n opts: {}\n config: \n request_id: \n', \"last_error\" = NULL, \"run_at\" = '2015-06-22 18:14:32.660520+0000', \"locked_at\" = '2015-06-22 18:14:34.649651+0000', \"failed_at\" = NULL, \"locked_by\" = 'cc_api_worker.api_z1.0.2', \"queue\" = 'cc-api_z1-0', \"cf_api_error\" = NULL WHERE (\"id\" = 1)","log_level":"debug2","source":"cc.background","data":{},"thread_id":69964119192340,"fiber_id":69964148702360,"process_id":2727,"file":"/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.1.0/gems/sequel-4.15.0/lib/sequel/database/logging.rb","lineno":70,"method":"block in log_each"}

I searched online and it seems that others are hitting the same issue,
but I could not find a proper solution.



Sincerely,

Dulanjalie



_______________________________________________
cf-bosh mailing list
cf-bosh(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-bosh









--

Thank you,



James Bayer







--

Kind regards,

Johannes Hiemer





End of cf-bosh Digest, Vol 4, Issue 66
**************************************