Date
1 - 4 of 4
Diego docker app launch issue with Diego's v0.1443.0
Anuj Jain <anuj17280@...>
Hi,
I deployed the latest CF v226 with Diego v0.1443.0 - I was able to successfully upgrade both deployments and verified that CF is working as expected. currently seeing problem with Diego while trying to deploy any docker app - I am getting *'Server error, status code: 500, error code: 170016, message: Runner error: stop app failed: 503' *- below you can see the CF_TRACE output of last few lines. I also notice that while trying to upgrade diego v0.1443.0 - it gave me the error while trying to upgrade database job - the fix which I applied (changed debug2 to debug from diego manifest file - path: properties => consul => log_level: debug) RESPONSE: [2015-12-10T09:35:07-05:00] HTTP/1.1 500 Internal Server Error Content-Length: 110 Content-Type: application/json;charset=utf-8 Date: Thu, 10 Dec 2015 14:35:07 GMT Server: nginx X-Cf-Requestid: 8328f518-4847-41ec-5836-507d4bb054bb X-Content-Type-Options: nosniff X-Vcap-Request-Id: 324d0fc0-2146-48f0-6265-755efb556e23::5c869046-8803-4dac-a620-8ca701f5bd22 { "code": 170016, "description": "Runner error: stop app failed: 503", "error_code": "CF-RunnerError" } FAILED Server error, status code: 500, error code: 170016, message: Runner error: stop app failed: 503 FAILED Server error, status code: 500, error code: 170016, message: Runner error: stop app failed: 503 FAILED Error: Error executing cli core command Starting app testing89 in org PAAS / space dev as admin... FAILED Server error, status code: 500, error code: 170016, message: Runner error: stop app failed: 503 - Anuj |
|
Eric Malm <emalm@...>
Hi, Anuj,
toggle quoted message
Show quoted text
Thanks for the info, and sorry to hear you've run into some difficulties. It sounds like Cloud Controller is getting a 503 error from the nsync-listener service on the CC-Bridge. That most likely means it's encountering some sort of error in communicating with Diego's BBS API. You mentioned that you had some problems with the database jobs when upgrading as well. Does BOSH now report that all the VMs in the Diego deployment are running correctly? One next step to try would be to tail logs from the nsync-listener processes on the CC-Bridge VMs with `tail -f /var/vcap/sys/log/nsync/nsync_listener.stdout.log`, and from the BBS processes on the database VMs with `tail -f /var/vcap/sys/log/bbs/bbs.stdout.log`, then try restarting your app that targets Diego, and see if there are any errors in the logs. It may also help to filter the logs to contain only the ones with your app guid, which you can get from the CF CLI via `cf app APP_NAME --guid`. Also, are you able to run a buildpack-based app on the Diego backend, or do you get the same error as with this Docker-based app? Best, Eric On Thu, Dec 10, 2015 at 6:45 AM, Anuj Jain <anuj17280(a)gmail.com> wrote:
Hi, |
|
Anuj Jain <anuj17280@...>
Hi Eric – Thanks for trying to help me to resolve my issues – please check
comments inline: On Mon, Dec 14, 2015 at 12:02 AM, Eric Malm <emalm(a)pivotal.io> wrote: Hi, Anuj,=> Database issue was in manifest, which got resolved once I update/changed the log_level value from debug2 to debug One next step to try would be to tail logs from the nsync-listenerstager and/or Nsync job logs => I checked CC host and run 'netstat -anp | grep consul' couple of time and found that sometime it showing established connection with one consul server and sometime not - below is the sample output: # netstat -anp | grep consul tcp 0 0 10.5.139.156:8301 0.0.0.0:* LISTEN 4559/consul tcp 0 0 127.0.0.1:8400 0.0.0.0:* LISTEN 4559/consul tcp 0 0 127.0.0.1:8500 0.0.0.0:* LISTEN 4559/consul tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 4559/consul udp 0 0 127.0.0.1:53 0.0.0.0:* 4559/consul udp 0 0 10.5.139.156:8301 0.0.0.0:* 4559/consul root(a)07c40eae-6a8c-4fb1-996d-e638637b5caa:/var/vcap/bosh_ssh/bosh_eerpetbrz# netstat -anp | grep consul tcp 0 0 10.5.139.156:8301 0.0.0.0:* LISTEN 4559/consul tcp 0 0 127.0.0.1:8400 0.0.0.0:* LISTEN 4559/consul tcp 0 0 127.0.0.1:8500 0.0.0.0:* LISTEN 4559/consul tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 4559/consul tcp 0 0 10.5.139.156:60316 10.5.139.140:8300 ESTABLISHED 4559/consul udp 0 0 127.0.0.1:53 0.0.0.0:* 4559/consul udp 0 0 10.5.139.156:8301 0.0.0.0:* 4559/consul => After that I also checked/verify consul agent logs on CC which showing EventmemberFailed and EventmemberJoin messages ======================================================================================== logs: 2015/12/14 09:32:43 [INFO] serf: EventMemberFailed: api-z1-0 10.5.139.156 2015/12/14 09:32:44 [INFO] serf: EventMemberJoin: docker-cache-0 10.5.139.252 2015/12/14 09:32:44 [INFO] serf: EventMemberJoin: api-z1-0 10.5.139.156 2015/12/14 09:32:49 [INFO] serf: EventMemberJoin: ha-proxy-z1-0 10.5.103.103 2015/12/14 09:33:32 [INFO] memberlist: Suspect ha-proxy-z1-1 has failed, no acks received 2015/12/14 09:33:40 [INFO] serf: EventMemberFailed: ha-proxy-z1-1 10.5.103.104 2015/12/14 09:33:40 [INFO] serf: EventMemberFailed: database-z1-2 10.5.139.194 2015/12/14 09:33:40 [INFO] memberlist: Marking ha-proxy-z1-0 as failed, suspect timeout reached 2015/12/14 09:33:40 [INFO] serf: EventMemberFailed: ha-proxy-z1-0 10.5.103.103 2015/12/14 09:33:41 [INFO] serf: EventMemberJoin: database-z1-2 10.5.139.194 2015/12/14 09:33:42 [INFO] serf: EventMemberFailed: cell-z1-3 10.5.139.199 2015/12/14 09:33:43 [INFO] serf: EventMemberJoin: cell-z1-3 10.5.139.199 2015/12/14 09:33:43 [INFO] serf: EventMemberFailed: api-worker-z1-0 10.5.139.159 2015/12/14 09:33:44 [INFO] serf: EventMemberJoin: ha-proxy-z1-1 10.5.103.104 2015/12/14 09:33:44 [INFO] memberlist: Marking uaa-z1-1 as failed, suspect timeout reached 2015/12/14 09:33:44 [INFO] serf: EventMemberFailed: uaa-z1-1 10.5.139.155 2015/12/14 09:33:46 [INFO] serf: EventMemberFailed: cell-z1-1 10.5.139.197 2015/12/14 09:33:46 [INFO] serf: EventMemberJoin: uaa-z1-1 10.5.139.155 2015/12/14 09:33:47 [INFO] memberlist: Marking cc-bridge-z1-0 as failed, suspect timeout reached 2015/12/14 09:33:47 [INFO] serf: EventMemberFailed: cc-bridge-z1-0 10.5.139.200 2015/12/14 09:33:49 [INFO] serf: EventMemberFailed: database-z1-1 10.5.139.193 2015/12/14 09:33:58 [INFO] serf: EventMemberJoin: api-worker-z1-0 10.5.139.159 2015/12/14 09:33:58 [INFO] serf: EventMemberJoin: database-z1-1 10.5.139.193 2015/12/14 09:33:59 [INFO] serf: EventMemberJoin: cell-z1-1 10.5.139.197 2015/12/14 09:33:59 [INFO] serf: EventMemberJoin: cc-bridge-z1-0 10.5.139.200 2015/12/14 09:34:01 [INFO] serf: EventMemberFailed: database-z1-1 10.5.139.193 2015/12/14 09:34:07 [INFO] serf: EventMemberJoin: database-z1-1 10.5.139.193 2015/12/14 09:34:09 [INFO] memberlist: Marking cell-z1-1 as failed, suspect timeout reached 2015/12/14 09:34:09 [INFO] serf: EventMemberFailed: cell-z1-1 10.5.139.197 2015/12/14 09:34:20 [INFO] serf: EventMemberJoin: ha-proxy-z1-0 10.5.103.103 2015/12/14 09:34:28 [INFO] serf: EventMemberJoin: cell-z1-1 10.5.139.197 2015/12/14 09:34:38 [INFO] serf: EventMemberFailed: ha-proxy-z1-0 10.5.103.103 2015/12/14 09:34:42 [INFO] serf: EventMemberFailed: ha-proxy-z1-1 10.5.103.104 2015/12/14 09:34:44 [INFO] memberlist: Marking api-z1-0 as failed, suspect timeout reached 2015/12/14 09:34:44 [INFO] serf: EventMemberFailed: api-z1-0 10.5.139.156 2015/12/14 09:34:48 [INFO] serf: EventMemberJoin: ha-proxy-z1-0 10.5.103.103 2015/12/14 09:34:48 [INFO] memberlist: Marking api-worker-z1-0 as failed, suspect timeout reached 2015/12/14 09:34:48 [INFO] serf: EventMemberFailed: api-worker-z1-0 10.5.139.159 2015/12/14 09:34:49 [INFO] serf: EventMemberJoin: api-z1-0 10.5.139.156 2015/12/14 09:34:52 [INFO] serf: EventMemberJoin: ha-proxy-z1-1 10.5.103.104 2015/12/14 09:34:58 [INFO] serf: EventMemberJoin: api-worker-z1-0 10.5.139.159 ================================================================================ Also, are you able to run a buildpack-based app on the Diego backend, orverified that by enable-diego on one of the app and then tried starting it - got same 500 Error. Best, |
|
Anuj Jain <anuj17280@...>
Hi Eric - please ignore last message - I found the issue - I installed
toggle quoted message
Show quoted text
diego with collocated VM as well as other VMs - which make CC confuse and collocated VM was not having all the functionality/jobs - now I am not getting 500 message and can run buildpack apps post diego enabled. On Mon, Dec 14, 2015 at 5:23 PM, Anuj Jain <anuj17280(a)gmail.com> wrote:
Hi Eric – Thanks for trying to help me to resolve my issues – please check |
|