Re: CF on Bosh Lite- traffic controller- dialing error (read tcp i/o timeout)


Gupta, Abhik
 

Hi all,
Can anyone help me with this?

Regards,
Abhik

From: Gupta, Abhik [mailto:abhik.gupta(a)sap.com]
Sent: Monday, August 15, 2016 7:01 PM
To: Discussions about Cloud Foundry projects and the system overall. <cf-dev(a)lists.cloudfoundry.org>
Subject: [cf-dev] CF on Bosh Lite- traffic controller- dialing error (read tcp i/o timeout)

Hi,
After setting up Cloud Foundry on Bosh Lite, "cf push" and "cf logs" for a node.js application both fail. My configuration is shown below:

root@bosh-lite:/home/vagrant/cf-abacus# bosh target
Current target is https://192.168.50.4:25555 (Bosh Lite Director)

root@bosh-lite:/home/vagrant/cf-abacus# bosh vms
RSA 1024 bit CA certificates are loaded due to old openssl compatibility
Acting as user 'admin' on 'Bosh Lite Director'
Deployment 'cf-warden'

Director task 27

Task 27 done

+---------------------------------------------------------------------------+---------+-----+-----------+--------------+
| VM                                                                        | State   | AZ  | VM Type   | IPs          |
+---------------------------------------------------------------------------+---------+-----+-----------+--------------+
| api_z1/0 (783dc4f2-b18f-40a9-b1b4-6f76306c7a26)                           | running | n/a | large_z1  | 10.244.0.138 |
| blobstore_z1/0 (3d691827-5e49-4117-a381-2187e35886c4)                     | running | n/a | medium_z1 | 10.244.0.130 |
| consul_z1/0 (7fc7b3a9-9fcc-4d9c-977a-decfe863419e)                        | running | n/a | small_z1  | 10.244.0.54  |
| doppler_z1/0 (14680e42-c726-44e9-80ca-6e27bebc5a23)                       | running | n/a | medium_z1 | 10.244.0.146 |
| etcd_z1/0 (15a26ca3-b14b-44a5-a977-5b3036574564)                          | running | n/a | medium_z1 | 10.244.0.42  |
| ha_proxy_z1/0 (5ee32052-6f3d-4191-8abe-4d6a428aaf39)                      | running | n/a | router_z1 | 10.244.0.34  |
| hm9000_z1/0 (ab545dff-41ca-4927-882d-e0e5bb73f487)                        | running | n/a | medium_z1 | 10.244.0.142 |
| loggregator_trafficcontroller_z1/0 (eca6a918-9e05-4ddc-ab92-519f0c26dd26) | running | n/a | small_z1  | 10.244.0.150 |
| nats_z1/0 (026d423b-30b6-411b-87df-17cc46861664)                          | running | n/a | medium_z1 | 10.244.0.6   |
| postgres_z1/0 (245cb81c-f2e6-43dc-9098-ff8e960b7b3a)                      | running | n/a | medium_z1 | 10.244.0.30  |
| router_z1/0 (12e87356-c4d1-4683-b0fe-0c4e38e4149d)                        | running | n/a | router_z1 | 10.244.0.22  |
| runner_z1/0 (15b4c464-1d35-4792-af0d-6aad11b2bca1)                        | running | n/a | runner_z1 | 10.244.0.26  |
| uaa_z1/0 (a8e76f1b-4dcd-4581-8011-d7e0349de27b)                           | running | n/a | medium_z1 | 10.244.0.134 |
+---------------------------------------------------------------------------+---------+-----+-----------+--------------+

VMs total: 13

root@bosh-lite:/home/vagrant/cf-abacus# cf api
API endpoint: https://api.bosh-lite.com (API version: 2.59.0)

The errors I get while pushing the app:

root@bosh-lite:/home/vagrant/cf-abacus# cf logs abacus-usage-aggregator -recent

2016-08-15T12:08:55.06+0000 [API/0] OUT Updated app with guid 0b3b9d74-3ded-40d3-b6f0-e19a8def0f73 ({"name"=>"abacus-usage-aggregator", "instances"=>1, "memory"=>512, "disk_quota"=>512, "environment_json"=>"PRIVATE DATA HIDDEN"})
2016-08-15T12:09:33.28+0000 [API/0] OUT Updated app with guid 0b3b9d74-3ded-40d3-b6f0-e19a8def0f73 ({"state"=>"STOPPED"})
2016-08-15T12:09:48.84+0000 [DEA/0] OUT Got staging request for app with id 0b3b9d74-3ded-40d3-b6f0-e19a8def0f73
2016-08-15T12:09:48.85+0000 [API/0] OUT Updated app with guid 0b3b9d74-3ded-40d3-b6f0-e19a8def0f73 ({"state"=>"STARTED"})
2016-08-15T12:09:50.52+0000 [STG/0] OUT -----> Downloaded app package (1.8M)
2016-08-15T12:24:50.58+0000 [STG/0] ERR
2016-08-15T12:24:50.58+0000 [STG/0] OUT
2016-08-15T12:24:50.74+0000 [API/0] ERR Encountered error: Stager error: failed to stage application:
2016-08-15T12:24:50.74+0000 [API/0] ERR Script exited with status 255

The error I see after turning on CF_TRACE:

Warning: error tailing logs
Error dialing traffic controller server: read tcp 10.244.0.33:38578->10.244.0.34:443: i/o timeout.
Please ask your Cloud Foundry Operator to check the platform configuration (traffic controller is wss://doppler.bosh-lite.com:443).

I am a little confused by this because targeting the API endpoint works absolutely fine. I am behind a proxy, and I have set all the required proxy environment variables. I was initially getting i/o timeouts when targeting the API endpoint as well, but setting the environment variable "CF_DIAL_TIMEOUT" to a sufficiently large value resolved that (as per the release notes of the new CF CLI version).
Can someone help me solve this problem?
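
To help narrow this down, here is a minimal sketch of a reachability check for the traffic controller endpoint named in the error (doppler.bosh-lite.com, port 443). The function name can_connect is hypothetical, and the check only tests whether a plain TCP connection opens. Because the reported failure is a read timeout, the connection may well open and then stall during the TLS or WebSocket handshake, so a True result here rules out basic routing problems only.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This is only a basic reachability probe: it does not perform a TLS
    handshake or a WebSocket upgrade, and it does not go through any
    HTTP proxy configured in the environment.
    """
    try:
        # create_connection resolves the host name and applies the timeout
        # to the connect attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

Run from the same shell as the cf CLI, can_connect("doppler.bosh-lite.com", 443) can be compared with can_connect("api.bosh-lite.com", 443) to see whether the two endpoints behave differently at the TCP level.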

Regards,
Abhik

P.S.: I tried restarting the loggregator traffic controller VM to see if it helps but to no avail.
