Re: `api_z1/0' is not running after update to CF v231
Wayne Ha <wayne.h.ha@...>
Zach,
Thanks for the hints. You are right, I am not using the latest stemcell:

vagrant@agent-id-bosh-0:~$ bosh stemcells
+---------------------------------------------+---------+--------------------------------------+
| Name                                        | Version | CID                                  |
+---------------------------------------------+---------+--------------------------------------+
| bosh-warden-boshlite-ubuntu-trusty-go_agent | 389*    | cb6ee28c-a703-4a7e-581b-b63be2302e3d |

I will try the stemcell you recommended to see if it helps.

Thanks,
|
|
Re: Increase CAB meeting to 1.5 or 2 hours?
Steven Benario
One of the things we've started doing in the Runtime-specific meetings is
asking that individual team status updates be provided ahead of time. We then use the meeting time for Q&A only, instead of reading those notes aloud. I'm a big fan and recommend this model.

Cheers

On Sun, Feb 14, 2016 at 11:19 AM, Dr Nic Williams <drnicwilliams(a)gmail.com> wrote:
Since we started the CAB calls in late 2013 the size of Cloud Foundry
|
|
Re: `api_z1/0' is not running after update to CF v231
Zach Robinson
Wayne,
Can you verify that you are using the latest bosh-lite stemcell, 3147? Older stemcells are known to have issues with Consul, which many of the CF components use for service discovery. The latest bosh-lite stemcells can be found at http://bosh.io (just search for "lite").

See this similar issue: https://github.com/cloudfoundry/cf-release/issues/919

-Zach
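The version check suggested above could be scripted along these lines. This is only a sketch: it assumes the `bosh stemcells` output has been captured as text in the three-column table layout shown elsewhere in this thread, and `parse_stemcells`, `RECOMMENDED`, and `SAMPLE` are illustrative names, not part of any BOSH tooling.

```python
def parse_stemcells(table_text):
    """Return (name, version, cid) tuples from `bosh stemcells` table text."""
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip "+----+" border lines and anything else
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != 3 or cells[0] == "Name":
            continue  # skip the header row
        name, version, cid = cells
        # Drop the trailing "*" marker (as in "389*") and allow dotted versions.
        parts = tuple(int(p) for p in version.rstrip("*").split("."))
        rows.append((name, parts, cid))
    return rows


RECOMMENDED = (3147,)  # the stemcell version named in this thread

# Sample taken from the table posted in this thread.
SAMPLE = """\
+---------------------------------------------+---------+--------------------------------------+
| Name                                        | Version | CID                                  |
+---------------------------------------------+---------+--------------------------------------+
| bosh-warden-boshlite-ubuntu-trusty-go_agent | 389*    | cb6ee28c-a703-4a7e-581b-b63be2302e3d |
"""

for name, version, cid in parse_stemcells(SAMPLE):
    label = ".".join(str(p) for p in version)
    status = "ok" if version >= RECOMMENDED else "older than recommended"
    print("%s %s: %s" % (name, label, status))
```

Versions are compared as integer tuples so that a dotted version would still sort numerically rather than as a string.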
|
|
Re: `api_z1/0' is not running after update to CF v231
Amit Kumar Gupta
As of cf v231, CC has switched from using NFS to WebDAV as the default
blobstore. There are more details in the release notes: https://github.com/cloudfoundry/cf-release/releases/tag/v231

I don't know offhand how to debug the issue you're seeing, but I will reach out to some folks with more knowledge of Cloud Controller.

Best,
Amit
On Mon, Mar 7, 2016 at 8:48 AM, Wayne Ha <wayne.h.ha(a)gmail.com> wrote:
Kayode,
|
|
Re: `api_z1/0' is not running after update to CF v231
Wayne Ha <wayne.h.ha@...>
Kayode,
I am using the default bosh-lite-v231.yml file, and the number of instances for the NFS server is set to 0:

vagrant@agent-id-bosh-0:~$ egrep -i "name:.*nfs|instances" bosh-lite-v231.yml.1603041454
etc...
- instances: 0
- instances: 0
- instances: 0
name: nfs_z1
- name: debian_nfs_server
- instances: 1
- instances: 1
- instances: 1
etc...

So it is not running.

Thanks,
|
|
Update Parallelization in Cloud Foundry
Omar Elazhary <omazhary@...>
Hello everyone,
I know it is possible to update and redeploy components in parallel in Cloud Foundry by setting the "serial" property in the deployment manifest to "false". However, is this recommended? Are there particular job dependencies that I need to pay attention to?

Regards,
Omar
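To make the question concrete, here is a rough sketch of how one might list which jobs would still be updated serially. It rests on several assumptions that should be checked against the BOSH documentation: that "serial" lives in the manifest's "update" block, that it defaults to true, and that a job-level "update" block overrides the global one. The manifest below is a hypothetical fragment already parsed into Python dicts, and `effective_serial` is an illustrative helper, not a BOSH API.

```python
def effective_serial(manifest):
    """Map each job name to the `serial` value assumed to apply to it."""
    # Assumption: the global default is true when `serial` is not set.
    global_serial = (manifest.get("update") or {}).get("serial", True)
    result = {}
    for job in manifest.get("jobs", []):
        job_update = job.get("update") or {}
        # Assumption: a job-level setting overrides the global one.
        result[job["name"]] = job_update.get("serial", global_serial)
    return result


# Hypothetical manifest fragment for illustration only.
manifest = {
    "update": {"serial": False},
    "jobs": [
        {"name": "nats_z1"},
        {"name": "api_z1", "update": {"serial": True}},
    ],
}

for name, serial in effective_serial(manifest).items():
    print("%s: %s" % (name, "serial" if serial else "parallel"))
```

A listing like this does not say which ordering is safe; it only shows where a job-level setting would keep a dependency-sensitive job out of the parallel batch.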
|
|
Re: New CF Service Broker "chaos-galago" - a chaos-monkey for your Cloud Foundry
Sam Bryant
For anyone interested, we have also now added a smoke tests project for chaos-galago that can be used to monitor the service broker. It can be found at: https://github.com/FidelityInternational/chaos-galago-smoke-tests
Details are also in the README for chaos-galago.

Regards,
Sam
|
|
Reg the minimal-openstack yml files
Nithiyasri Gnanasekaran -X (ngnanase - TECH MAHINDRA LIM@Cisco) <ngnanase at cisco.com...>
Hi
We are trying to upgrade our deployment to the latest Cloud Foundry, from release 205 to 230, as per your advice. We can see that minimal-aws.yml is available in the Git repo. Could a similar file be made available for OpenStack environments, so that we can deploy a basic Cloud Foundry and make our custom changes on top of it?

In parallel, we are updating our stub to match the template yml files, guided by the errors reported by the generate_deployment_manifest script. Kindly let us know if this is the correct way to generate the manifest.

Regards,
Nithiyasri
|
|
Re: `api_z1/0' is not running after update to CF v231
Paul Bakare
Wayne, is the nfs_server-partition running?
On Mon, Mar 7, 2016 at 1:43 AM, Wayne Ha <wayne.h.ha(a)gmail.com> wrote:
I checked the blobstore is running:
|
|
Re: `api_z1/0' is not running after update to CF v231
Wayne Ha <wayne.h.ha@...>
I checked that the blobstore is running:

root@e83575d2-dfbf-4f7c-97ee-5112560fa137:/var/vcap/sys/log# monit summary
The Monit daemon 5.2.4 uptime: 4h 14m

Process 'consul_agent' running
Process 'metron_agent' running
Process 'blobstore_nginx' running
Process 'route_registrar' running
System 'system_e83575d2-dfbf-4f7c-97ee-5112560fa137' running

But there are thousands of errors saying DopplerForwarder: can't forward message, loggregator client pool is empty:

root@e83575d2-dfbf-4f7c-97ee-5112560fa137:/var/vcap/sys/log# find . -name "*.log" | xargs grep -i error | cut -c 73-500 | sort -u
,"process_id":246,"source":"metron","log_level": "error","message":"DopplerForwarder: can't forward message","data":{ "error":"loggregator client pool is empty"}, "file":"/var/vcap/data/compile/metron_agent/loggregator/src/metron/writers/dopplerforwarder/doppler_forwarder.go", "line":104, "method":"metron/writers/dopplerforwarder.(*DopplerForwarder).networkWrite"}

Not sure what is wrong.
|
|
Re: `api_z1/0' is not running after update to CF v231
Wayne Ha <wayne.h.ha@...>
Amit,
Thanks for letting me know I might have looked at the wrong log files. I saw the following in the cloud_controller log files:

root@7a1f2221-c31a-494b-b16c-d4a97c16c9ab:/var/vcap/sys/log# tail ./cloud_controller_ng_ctl.log
[2016-03-06 22:40:28+0000] ------------ STARTING cloud_controller_ng_ctl at Sun Mar 6 22:40:28 UTC 2016 --------------
[2016-03-06 22:40:28+0000] Checking for blobstore availability
[2016-03-06 22:41:03+0000] Blobstore is not available

root@7a1f2221-c31a-494b-b16c-d4a97c16c9ab:/var/vcap/sys/log# tail ./cloud_controller_worker_ctl.log
[2016-03-06 22:41:13+0000] Killing /var/vcap/sys/run/cloud_controller_ng/cloud_controller_worker_2.pid: 12145
[2016-03-06 22:41:13+0000] .Stopped
[2016-03-06 22:41:36+0000] Blobstore is not available
[2016-03-06 22:41:48+0000] ------------ STARTING cloud_controller_worker_ctl at Sun Mar 6 22:41:48 UTC 2016 --------------
[2016-03-06 22:41:48+0000] Checking for blobstore availability
[2016-03-06 22:41:48+0000] Removing stale pidfile...

So maybe the cause is that the blobstore is not available?

Thanks,
On Sun, Mar 6, 2016 at 1:15 PM, Amit Gupta <agupta(a)pivotal.io> wrote:
The log lines saying "/var/vcap/sys/run/cloud_controller_ng/cloud_controller.sock
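As a small aid for reading ctl logs like the ones in this message, the sketch below measures how long the blobstore check ran before giving up. It assumes only the bracketed timestamp format visible in the excerpt; `parse_ctl_log` and `blobstore_wait_seconds` are illustrative helpers, not part of cf-release.

```python
import re
from datetime import datetime

# Matches lines such as "[2016-03-06 22:40:28+0000] Checking for ...".
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[+-]\d{4})\] (.*)$"
)


def parse_ctl_log(text):
    """Return (timestamp, message) pairs for lines in the ctl log format."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S%z")
            entries.append((stamp, match.group(2)))
    return entries


def blobstore_wait_seconds(entries):
    """Seconds from a 'Checking' line to the next 'not available' line."""
    started = None
    for stamp, message in entries:
        if message == "Checking for blobstore availability":
            started = stamp
        elif message == "Blobstore is not available" and started is not None:
            return (stamp - started).total_seconds()
    return None  # no complete check/failure pair found


# Lines copied from the cloud_controller_ng_ctl.log excerpt in this thread.
SAMPLE = """\
[2016-03-06 22:40:28+0000] Checking for blobstore availability
[2016-03-06 22:41:03+0000] Blobstore is not available
"""

print(blobstore_wait_seconds(parse_ctl_log(SAMPLE)))
```

For the excerpt above this gives 35 seconds between the check starting and the failure being logged, which suggests the check retried for a while rather than failing immediately.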
|
|
Re: app auto-scaling in OSS CF contribution
Padmashree B
Hi,
Is the solution the same as the one offered in IBM Bluemix? Where can I find more information on IBM's solution [open-Autoscaler], such as current and planned features, roadmap, and timeline?

Kind Regards,
Padma
|
|
Re: `api_z1/0' is not running after update to CF v231
Amit Kumar Gupta
The log lines saying
"/var/vcap/sys/run/cloud_controller_ng/cloud_controller.sock is not found" are probably just a symptom of the problem, not the root cause. You're probably seeing those in the nginx logs? Cloud Controller is failing to start, hence it is not establishing a connection on the socket. You need to dig deeper into the failures in the logs in /var/vcap/sys/log/cloud_controller_ng.

On Sun, Mar 6, 2016 at 10:00 AM, sridhar vennela <sridhar.vennela(a)gmail.com> wrote:
Hi Wayne,
|
|
Re: `api_z1/0' is not running after update to CF v231
sridhar vennela
Hi Wayne,
Looks like it. It is trying to connect to loggregator and failing, I guess.

https://github.com/cloudfoundry/cloud_controller_ng/blob/master/app/controllers/runtime/syslog_drain_urls_controller.rb

Thank you,
Sridhar
|
|
Re: monit definitions
Hi,
I'm no expert, but "monit" is a component of BOSH, not Cloud Foundry. Your question would get answered if asked on the "bosh-dev" mailing list.

Cheers
On Feb 18, 2016, at 11:19, Nitta, Minoru <minoru.nitta(a)jp.fujitsu.com> wrote:
|
|
Re: `api_z1/0' is not running after update to CF v231
Wayne Ha <wayne.h.ha@...>
Since it is complaining that /var/vcap/sys/run/cloud_controller_ng/cloud_controller.sock is not found, I thought I would just touch that file. Now I get:

2016/03/06 17:14:11 [error] 18497#0: *5 connect() to unix:/var/vcap/sys/run/cloud_controller_ng/cloud_controller.sock failed (111: Connection refused) while connecting to upstream, client: <bosh director>, server: _, request: "GET /v2/syslog_drain_urls?batch_size=1000 HTTP/1.1", upstream: "http://unix:/var/vcap/sys/run/cloud_controller_ng/cloud_controller.sock:/v2/syslog_drain_urls?batch_size=1000", host: "api.bosh-lite.com"

Maybe there is a network configuration problem in my environment?
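The change from "(2: No such file or directory)" to "(111: Connection refused)" after touching the file can be reproduced without nginx. The sketch below uses a temporary stand-in path rather than the real cloud_controller.sock, and `probe_unix_socket` is an illustrative helper. It tries to connect to a Unix socket path before and after creating an ordinary empty file there.

```python
import errno
import os
import socket
import tempfile


def probe_unix_socket(path):
    """Try to connect to `path`; return None on success, else the errno name."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()


# Stand-in for the real socket path, created in a scratch directory.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "cloud_controller.sock")

missing = probe_unix_socket(path)   # nothing exists at the path yet
open(path, "w").close()             # same effect as `touch`
touched = probe_unix_socket(path)   # a regular file, with no listener

print("before touch:", missing)
print("after touch:", touched)
```

On Linux the first probe should fail with ENOENT (errno 2) and the second with ECONNREFUSED (errno 111), since a touched file has no process listening on it. That would be consistent with the earlier suggestion that the missing socket is a symptom of Cloud Controller not starting, rather than a network problem.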
|
|
Re: `api_z1/0' is not running after update to CF v231
Wayne Ha <wayne.h.ha@...>
Sridhar,
Thanks for your response. I have tried your suggestion and it doesn't help. But I might have misled you with the consul error. That error only got logged once at the beginning. So, like you said, maybe the VM was not able to join the consul server before it came up. But after that, the following error keeps getting logged every minute or so:

2016/03/06 17:04:41 [crit] 11480#0: *4 connect() to unix:/var/vcap/sys/run/cloud_controller_ng/cloud_controller.sock failed (2: No such file or directory) while connecting to upstream, server: _, request: "GET /v2/syslog_drain_urls?batch_size=1000 HTTP/1.1", upstream: "http://unix:/var/vcap/sys/run/cloud_controller_ng/cloud_controller.sock:/v2/syslog_drain_urls?batch_size=1000", host: "api.bosh-lite.com"

So maybe the above is the cause of the problem?

Thanks,

On Sun, Mar 6, 2016 at 12:51 AM, sridhar vennela <sridhar.vennela(a)gmail.com> wrote:
Hi Wayne,
|
|
Re: `api_z1/0' is not running after update to CF v231
sridhar vennela
Hi Wayne,
Somehow the VM is not able to join the consul server. You can try the steps below:

ps -ef | grep consul
kill <consul-server-pid>
monit restart <consul-job>

Thank you,
Sridhar
|
|
Re: `api_z1/0' is not running after update to CF v231
Wayne Ha <wayne.h.ha@...>
Sridhar,
Thanks for your response. I found that the VM is listening on port 8500:

root@c6822dcb-fb02-4858-ae5d-3ab45d593896:/var/vcap/sys/log# netstat -anp | grep LISTEN
tcp   0   0 127.0.0.1:8400      0.0.0.0:*   LISTEN   18162/consul
tcp   0   0 127.0.0.1:8500      0.0.0.0:*   LISTEN   18162/consul
tcp   0   0 127.0.0.1:53        0.0.0.0:*   LISTEN   18162/consul
tcp   0   0 127.0.0.1:2822      0.0.0.0:*   LISTEN   72/monit
tcp   0   0 0.0.0.0:22          0.0.0.0:*   LISTEN   31/sshd
tcp   0   0 10.244.0.138:8301   0.0.0.0:*   LISTEN   18162/consul

If I run "monit stop all", then it only listens on the following:

root@c6822dcb-fb02-4858-ae5d-3ab45d593896:/var/vcap/sys/log# netstat -anp | grep LISTEN
tcp   0   0 127.0.0.1:2822      0.0.0.0:*   LISTEN   72/monit
tcp   0   0 0.0.0.0:22          0.0.0.0:*   LISTEN   31/sshd

Note that 10.244.0.138 is the IP of this VM.

Thanks,

On Sat, Mar 5, 2016 at 12:58 AM, sridhar vennela <sridhar.vennela(a)gmail.com> wrote:
Hi Wayne,
|
|
Re: User defined variable "key" validation doesn't happen at cf set-env phase
Nicholas Calugar
Hi Ponraj,
I don't think the CC can make any determination regarding the validity of environment variables, as the CC doesn't (and shouldn't) know how each buildpack will use these environment variables.

Thanks,
Nick
On Thu, Mar 3, 2016 at 9:22 AM Ponraj E <ponraj.e(a)gmail.com> wrote:
Hi CF Colleagues,
|
|