Re: Running the app test suite within the CATs, and the admin_buildpack_lifecycle_test is failing
You can also check out the v208 tag of cf-release, then run the acceptance-tests from src/github.com/cloudfoundry/cf-acceptance-tests
Joseph
CF Release Integration Team
On Thu, Sep 24, 2015 at 2:53 PM, Christopher Piraino <cpiraino(a)pivotal.io> wrote:
Jordan,
The Cloud Foundry bosh release comes with an errand called "acceptance_tests" that contains the version of CATs which that version of CF was tested with. You can run these by doing "bosh run errand acceptance_tests".
There are also some manifest properties that you might need to set for the CATs to run correctly. The list of all possible properties for the acceptance_tests errand can be found here: https://github.com/cloudfoundry/cf-release/blob/develop/jobs/acceptance-tests/spec .
- Chris Piraino
On Thu, Sep 24, 2015 at 11:27 AM, Jordan Collier <jordanicollier(a)gmail.com> wrote:
I was unclear about what I was asking; the real question is as follows:
What is the best way to run the apps test suite within the CATs on an older version of Cloud Foundry? (For example, I am running these tests on version 208.)
|
|
Re: Running the app test suite within the CATs, and the admin_buildpack_lifecycle_test is failing
Christopher Piraino <cpiraino@...>
Jordan,
The Cloud Foundry bosh release comes with an errand called "acceptance_tests" that contains the version of CATs which that version of CF was tested with. You can run these by doing "bosh run errand acceptance_tests".
There are also some manifest properties that you might need to set for the CATs to run correctly. The list of all possible properties for the acceptance_tests errand can be found here: https://github.com/cloudfoundry/cf-release/blob/develop/jobs/acceptance-tests/spec
- Chris Piraino
On Thu, Sep 24, 2015 at 11:27 AM, Jordan Collier <jordanicollier(a)gmail.com> wrote:
I was unclear about what I was asking; the real question is as follows:
What is the best way to run the apps test suite within the CATs on an older version of Cloud Foundry? (For example, I am running these tests on version 208.)
|
|
Re: Environment variables with special characters not handled correctly?
Hi Daniel and Dieu,
After much trial and error I finally got it working. I created a user-provided service and then read it from my application. I've documented the steps for anyone else wanting to know how to work with these variables (clearer documentation with examples maybe?). Here's the documentation and example application: https://gist.github.com/jonasrosland/08b5758eaa9098a81cf8
Thanks for all your help!
Best regards,
Jonas Rosland
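For readers taking the same route, here is a minimal Python sketch of reading a value from a bound user-provided service rather than from a plain environment variable. It assumes the usual VCAP_SERVICES layout, in which user-provided instances are listed under the "user-provided" key; the service name `wordpress-creds`, the credential key `bearer`, and the sample value are hypothetical, and the example app in this thread is Ruby rather than Python.

```python
import json
import os


def user_provided_credentials(service_name, environ=None):
    """Return the credentials of a bound user-provided service instance."""
    environ = os.environ if environ is None else environ
    # VCAP_SERVICES is a JSON document keyed by service label.
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    for instance in services.get("user-provided", []):
        if instance.get("name") == service_name:
            return instance.get("credentials", {})
    raise KeyError("no user-provided service named %r" % service_name)


# Hand-built stand-in for the variable Cloud Foundry injects at runtime.
# "wordpress-creds" and "bearer" are hypothetical names.
sample_environ = {
    "VCAP_SERVICES": json.dumps({
        "user-provided": [
            {"name": "wordpress-creds",
             "credentials": {"bearer": "abc$F(def&ghi"}}
        ]
    })
}

bearer = user_provided_credentials("wordpress-creds", sample_environ)["bearer"]
print(bearer)  # the value is returned as stored, '$' included
```

The service itself would typically be created with `cf create-user-provided-service` (credentials passed via `-p`) and attached with `cf bind-service`; because the value travels as JSON, no shell-style escaping should be needed on the way in.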
|
|
Re: Environment variables with special characters not handled correctly?
Sorry, it sounds like escaping is the only option here for an environment variable. If you don't want to escape, I think you could create a user-provided service, provide the value through there, and bind that to your app. That'll come into VCAP_SERVICES which, if I read the PT story right, shouldn't need any extra escaping.
Dan
On Thu, Sep 24, 2015 at 3:10 PM, Jonas Rosland <jonas.rosland(a)emc.com> wrote:
Hi Dieu and Daniel,
I did set the environment variable as you suggested, Daniel; I should have shown that in my example. I see now that the app wrongly removes the $ characters and the character after each of them, which I didn't notice before. `cf env appname` shows the correct environment variable, so I guess I will have to do some escaping of characters in my app?
Best regards, Jonas Rosland
|
|
Re: Environment variables with special characters not handled correctly?
Hi Dieu and Daniel,
I did set the environment variable as you suggested, Daniel; I should have shown that in my example. I see now that the app wrongly removes the $ characters and the character after each of them, which I didn't notice before. `cf env appname` shows the correct environment variable, so I guess I will have to do some escaping of characters in my app?
Best regards, Jonas Rosland
|
|
Re: Running the app test suite within the CATs, and the admin_buildpack_lifecycle_test is failing
I was unclear about what I was asking; the real question is as follows:
What is the best way to run the apps test suite within the CATs on an older version of Cloud Foundry? (For example, I am running these tests on version 208.)
|
|
Re: Error 400007: `stats_z1/0' is not running after update
Okay, please let me know if you are able to fix your security group settings and whether the original problem gets resolved.
Amit
On Wed, Sep 23, 2015 at 7:03 PM, Guangcai Wang <guangcai.wang(a)gmail.com> wrote:
That did help. It showed us the real error.
==> metron_agent/metron_agent.stdout.log <==
{"timestamp":1443054247.927488327,"process_id":23472,"source":"metron","log_level":"warn","message":"Failed to create client: Could not connect to NATS: dial tcp 192.168.110.202:4222: i/o timeout","data":null,"file":"/var/vcap/data/compile/metron_agent/loggregator/src/github.com/cloudfoundry/loggregatorlib/cfcomponent/registrars/collectorregistrar/collector_registrar.go","line":51,"method":"github.com/cloudfoundry/loggregatorlib/cfcomponent/registrars/collectorregistrar.(*CollectorRegistrar).Run"}
I checked the security rule. It seems to have some problems.
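Since the useful line here was a "warn" entry buried among routine "info" entries, a small filter can help when scanning these JSON logs. The following Python sketch is only an illustration: it assumes each log line is a single JSON object with `log_level`, `source`, and `message` fields, as in the entries quoted in this thread, and the function name is made up. The sample lines are the two entries from this thread, abbreviated to the fields used.

```python
import json

# Levels treated as routine noise in this sketch.
NOISY_LEVELS = {"debug", "info"}


def interesting_entries(lines):
    """Yield (source, level, message) for JSON log lines above info level."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip tail headers such as "==> file <==" and blanks
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # skip truncated or non-JSON lines
        level = entry.get("log_level")
        if level and level not in NOISY_LEVELS:
            yield entry.get("source"), level, entry.get("message")


# Entries quoted in this thread, abbreviated to the fields used here.
sample = [
    '==> metron_agent/metron_agent.stdout.log <==',
    '{"timestamp":1443054247.927488327,"process_id":23472,"source":"metron","log_level":"warn","message":"Failed to create client: Could not connect to NATS: dial tcp 192.168.110.202:4222: i/o timeout","data":null}',
    '{"timestamp":1442987404.9433253,"message":"collector.started","log_level":"info","source":"collector","data":{}}',
]

for source, level, message in interesting_entries(sample):
    print(source, level, message)
```

On the sample above only the metron NATS timeout is reported; the repeated collector.started entry is dropped.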
On Thu, Sep 24, 2015 at 2:47 AM, Amit Gupta <agupta(a)pivotal.io> wrote:
I often take the following approach to debugging issues like this:
* Open two shell sessions to your failing VM using bosh ssh, and switch to superuser
* In one session, `watch monit summary`. You might see collector going back and forth between initializing and not monitored, but please report anything else of interest you see here
* In the other session, `cd /var/vcap/sys/log` and then `watch --differences=cumulative ls -altr **/*` to see which files are being written to while the startup processes are thrashing. Then `tail -f FILE_1 FILE_2 ...` listing all the files that were being written to, and seem relevant to the thrashing process(es) in monit
On Wed, Sep 23, 2015 at 12:21 AM, Guangcai Wang <guangcai.wang(a)gmail.com> wrote:
It frequently logs the message below, which does not seem helpful.
{"timestamp":1442987404.9433253,"message":"collector.started","log_level":"info","source":"collector","data":{},"thread_id":70132569199380,"fiber_id":70132570371720,"process_id":19392,"file":"/var/vcap/packages/collector/lib/collector/config.rb","lineno":45,"method":"setup_logging"}
The only possible error message from the bosh debug log is "ntp":{"message":"bad ntp server"}
But I don't think it is related to the failure of the stats_z1 update.
I, [2015-09-23 04:55:59 #2392] [canary_update(stats_z1/0)] INFO -- DirectorJobRunner: Checking if stats_z1/0 has been updated after 63.333333333333336 seconds D, [2015-09-23 04:55:59 #2392] [canary_update(stats_z1/0)] DEBUG -- DirectorJobRunner: SENT: agent.7d3452bd-679e-4a97-8514-63a373a54ffd {"method":"get_state","arguments":[],"reply_to":"director.c5b97fc1-b972-47ec-9412-a83ad240823b.473fda64-6ac3-4a53-9ebc-321fc7eabd7a"} D, [2015-09-23 04:55:59 #2392] [] DEBUG -- DirectorJobRunner: RECEIVED: director.c5b97fc1-b972-47ec-9412-a83ad240823b.473fda64-6ac3-4a53-9ebc-321fc7eabd7a {"value":{"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"stats_z1","release":"","template":"fluentd","version":"4c71c87bbf0144428afacd470e2a5e32b91932fc","sha1":"b141c6037d429d732bf3d67f7b79f8d7d80aac5d","blobstore_id":"d8451d63-2e4f-4664-93a8-a77e5419621d","templates":[{"name":"fluentd","version":"4c71c87bbf0144428afacd470e2a5e32b91932fc","sha1":"b141c6037d429d732bf3d67f7b79f8d7d80aac5d","blobstore_id":"d8451d63-2e4f-4664-93a8-a77e5419621d"},{"name":"collector","version":"889b187e2f6adc453c61fd8f706525b60e4b85ed","sha1":"f5ae15a8fa2417bf984513e5c4269f8407a274dc","blobstore_id":"3eeb0166-a75c-49fb-9f28-c29788dbf64d"},{"name":"metron_agent","version":"e6df4c316b71af68dfc4ca476c8d1a4885e82f5b","sha1":"42b6d84ad9368eba0508015d780922a43a86047d","blobstore_id":"e578bfb0-9726-4754-87ae-b54c8940e41a"},{"name":"apaas_collector","version":"8808f0ae627a54706896a784dba47570c92e0c8b","sha1":"b9a63da925b40910445d592c70abcf4d23ffe84d","blobstore_id":"3e6fa71a-07f7-446a-96f4-3caceea02f2f"}]},"packages":{"apaas_collector":{"name":"apaas_collector","version":"f294704d51d4517e4df3d8417a3d7c71699bc04d.1","sha1":"5af77ceb01b7995926dbd4ad7481dcb7c3d94faf","blobstore_id":"fa0e96b9-71a6-4828-416e-dde3427a73a9"},"collector":{"name":"collector","version":"ba47450ce83b8f2249b75c79b38397db249df48b.1","sha1":"0bf8ee0d69b3f21cf1878a43a9616cb7e14f6f25","blobstore_id":"722a5455-f7f7-427d-7e8d-e562552857bc"
},"common":{"name":"common","version":"99c756b71550530632e393f5189220f170a69647.1","sha1":"90159de912c9bfc71740324f431ddce1a5fede00","blobstore_id":"37be6f28-c340-4899-7fd3-3517606491bb"},"fluentd-0.12.13":{"name":"fluentd-0.12.13","version":"71d8decbba6c863bff6c325f1f8df621a91eb45f.1","sha1":"2bd32b3d3de59e5dbdd77021417359bb5754b1cf","blobstore_id":"7bc81ac6-7c24-4a94-74d1-bb9930b07751"},"metron_agent":{"name":"metron_agent","version":"997d87534f57cad148d56c5b8362b72e726424e4.1","sha1":"a21404c50562de75000d285a02cd43bf098bfdb9","blobstore_id":"6c7cf72c-9ace-40a1-4632-c27946bf631e"},"ruby-2.1.6":{"name":"ruby-2.1.6","version":"41d0100ffa4b21267bceef055bc84dc37527fa35.1","sha1":"8a9867197682cabf2bc784f71c4d904bc479c898","blobstore_id":"536bc527-3225-43f6-7aad-71f36addec80"}},"configuration_hash":"a73c7d06b0257746e95aaa2ca994c11629cbd324","networks":{"private_cf_subnet":{"cloud_properties":{"name":"random","net_id":"1e1c9aca-0b5a-4a8f-836a-54c18c21c9b9","security_groups":["az1_cf_management_secgroup_bosh_cf_ssh_cf2","az1_cf_management_secgroup_cf_private_cf2","az1_cf_management_secgroup_cf_public_cf2"]},"default":["dns","gateway"],"dns":["192.168.110.8","133.162.193.10","133.162.193.9","192.168.110.10"],"dns_record_name":"0.stats-z1.private-cf-subnet.cf-apaas.microbosh","gateway":"192.168.110.11","ip":"192.168.110.204","netmask":"255.255.255.0"}},"resource_pool":{"cloud_properties":{"instance_type":"S-1"},"name":"small_z1","stemcell":{"name":"bosh-openstack-kvm-ubuntu-trusty-go_agent","version":"2989"}},"deployment":"cf-apaas","index":0,"persistent_disk":0,"persistent_disk_pool":null,"rendered_templates_archive":{"sha1":"0ffd89fa41e02888c9f9b09c6af52ea58265a8ec","blobstore_id":"4bd01ae7-a69a-4fe5-932b-d98137585a3b"},"agent_id":"7d3452bd-679e-4a97-8514-63a373a54ffd","bosh_protocol":"1","job_state":"failing","vm":{"name":"vm-12d45510-096d-4b8b-9547-73ea5fda00c2"},"ntp":{"message":"bad ntp server"}}}
On Wed, Sep 23, 2015 at 5:13 PM, Amit Gupta <agupta(a)pivotal.io> wrote:
Please check the file collector/collector.log, it's in a subdirectory of the unpacked log tarball.
On Wed, Sep 23, 2015 at 12:01 AM, Guangcai Wang < guangcai.wang(a)gmail.com> wrote:
Actually, I checked the two files on the stats_z1 job VM. I did not find any clues. Attached for reference.
On Wed, Sep 23, 2015 at 4:54 PM, Amit Gupta <agupta(a)pivotal.io> wrote:
If you do "bosh logs stats_z1 0 --job" you will get a tarball of all the logs for the relevant processes running on the stats_z1/0 VM. You will likely find some error messages in the collector's stdout or stderr logs.
On Tue, Sep 22, 2015 at 11:30 PM, Guangcai Wang < guangcai.wang(a)gmail.com> wrote:
It does not help.
I always see the "collector" process bouncing between "running" and "does not exist" when I run "monit summary" in a while loop.
Does anyone know how to get the real error when the "collector" process is not reported as failed? Thanks.
On Wed, Sep 23, 2015 at 4:11 PM, Tony <Tonyl(a)fast.au.fujitsu.com> wrote:
My approach is to log in to the stats VM and sudo, then run "monit status" and restart the failed processes, or simply restart all processes by running "monit restart all".
Wait for a while (5~10 minutes at most). If there is still a failed process, e.g. collector, run ps -ef | grep collector and kill the processes in the list (you may need to run kill -9 sometimes),
then run "monit restart all" again.
Normally, this fixes the issue "Failed: `XXX' is not running after update".
-- View this message in context: http://cf-dev.70369.x6.nabble.com/cf-dev-Error-400007-stats-z1-0-is-not-running-after-update-tp1901p1902.html Sent from the CF Dev mailing list archive at Nabble.com.
|
|
Re: Environment variables with special characters not handled correctly?
On Thu, Sep 24, 2015 at 11:14 AM, Daniel Mikusa <dmikusa(a)pivotal.io> wrote:
It's possible that your shell is escaping the characters, like the '$'.
Try `cf set-env appname WORDPRESS_BEARER '1NNKhb5&Nfw$F(wqbqW&9nSeoonwAYz7#j2M1KKY!QU(Wbs(a)8xwjr6Q$hg(IPqcd'`. Note the single quotes around the value of the environment variable. Or set the environment variable in a manifest.yml file.
Also, run `cf env <app-name>` to confirm the value is being set correctly.
Thanks,
Dan
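When the command line is generated by a script rather than typed by hand, the same single-quoting can be produced mechanically. This is a minimal Python sketch using the standard library's `shlex.quote`, which targets POSIX shells; the helper name `cf_set_env_command` is made up for illustration, and the value is the example from this thread, not a real key.

```python
import shlex

# Example value from the thread (not a real key).
VALUE = "1NNKhb5&Nfw$F(wqbqW&9nSeoonwAYz7#j2M1KKY!QU(Wbs(a)8xwjr6Q$hg(IPqcd"


def cf_set_env_command(app, name, value):
    """Build a `cf set-env` command line with every argument shell-quoted."""
    parts = ["cf", "set-env", app, name, value]
    return " ".join(shlex.quote(part) for part in parts)


command = cf_set_env_command("appname", "WORDPRESS_BEARER", VALUE)
print(command)

# Splitting the line with POSIX shell rules gives back the exact value,
# so '$', '&', '(' and '#' reach the cf CLI untouched.
assert shlex.split(command)[-1] == VALUE
```

The generated line wraps the value in single quotes, matching the command suggested above. This only protects the value on its way into `cf set-env`; it does not change how the value is treated later when the app is started.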
On Thu, Sep 24, 2015 at 1:57 PM, Jonas Rosland <jonas.rosland(a)emc.com> wrote:
Hi all,
I am having an issue with an environment variable containing special characters that doesn't seem to be picked up correctly by CF.
I run `cf set-env appname WORDPRESS_BEARER 1NNKhb5&Nfw$F(wqbqW&9nSeoonwAYz7#j2M1KKY!QU(Wbs(a)8xwjr6Q$hg(IPqcd` (obviously not my currently correct key) and then use it in this Ruby app: https://gist.github.com/jonasrosland/08b5758eaa9098a81cf8
When I check the output the app complains about the API key being incorrect, when it is, in fact, correct. If I set it manually in the application it works, but that is of course not a good practice. I've also verified that the environment variable does get picked up by the application by adding some logging output to show the API key, but it still won't work. I'm wondering if this is because of the special characters in the environment variable?
Thanks in advance, Jonas Rosland
|
|
Re: Environment variables with special characters not handled correctly?
It's possible that your shell is escaping the characters, like the '$'.
Try `cf set-env appname WORDPRESS_BEARER '1NNKhb5&Nfw$F(wqbqW&9nSeoonwAYz7#j2M1KKY!QU(Wbs(a)8xwjr6Q$hg(IPqcd'`. Note the single quotes around the value of the environment variable. Or set the environment variable in a manifest.yml file.
Also, run `cf env <app-name>` to confirm the value is being set correctly.
Thanks,
Dan
On Thu, Sep 24, 2015 at 1:57 PM, Jonas Rosland <jonas.rosland(a)emc.com> wrote:
Hi all,
I am having an issue with an environment variable containing special characters that doesn't seem to be picked up correctly by CF.
I run `cf set-env appname WORDPRESS_BEARER 1NNKhb5&Nfw$F(wqbqW&9nSeoonwAYz7#j2M1KKY!QU(Wbs(a)8xwjr6Q$hg(IPqcd` (obviously not my currently correct key) and then use it in this Ruby app: https://gist.github.com/jonasrosland/08b5758eaa9098a81cf8
When I check the output the app complains about the API key being incorrect, when it is, in fact, correct. If I set it manually in the application it works, but that is of course not a good practice. I've also verified that the environment variable does get picked up by the application by adding some logging output to show the API key, but it still won't work. I'm wondering if this is because of the special characters in the environment variable?
Thanks in advance, Jonas Rosland
|
|
Environment variables with special characters not handled correctly?
Hi all,
I am having an issue with an environment variable containing special characters that doesn't seem to be picked up correctly by CF.
I run `cf set-env appname WORDPRESS_BEARER 1NNKhb5&Nfw$F(wqbqW&9nSeoonwAYz7#j2M1KKY!QU(Wbs(a)8xwjr6Q$hg(IPqcd` (obviously not my currently correct key) and then use it in this Ruby app: https://gist.github.com/jonasrosland/08b5758eaa9098a81cf8
When I check the output the app complains about the API key being incorrect, when it is, in fact, correct. If I set it manually in the application it works, but that is of course not a good practice. I've also verified that the environment variable does get picked up by the application by adding some logging output to show the API key, but it still won't work. I'm wondering if this is because of the special characters in the environment variable?
Thanks in advance,
Jonas Rosland
|
|
Jordan Collier email for mailing list
jordanicollier(a)gmail.com
|
|
Running the app test suite within the CATs, and the admin_buildpack_lifecycle_test is failing
`[2015-09-24 15:06:25.24 (UTC)]> cf logout
Logging out...
OK
• Failure [22.809 seconds]
Admin Buildpacks
/Users/localadmin/github.com/cloudfoundry/src/github.com/cloudfoundry/cf-acceptance-tests/apps/admin_buildpack_lifecycle_test.go:172
when the buildpack fails to detect
/Users/localadmin/github.com/cloudfoundry/src/github.com/cloudfoundry/cf-acceptance-tests/apps/admin_buildpack_lifecycle_test.go:129
fails to stage [It]
/Users/localadmin/github.com/cloudfoundry/src/github.com/cloudfoundry/cf-acceptance-tests/apps/admin_buildpack_lifecycle_test.go:128
Got stuck at: Creating app CATS-APP-5c7775a6-1753-4e2d-4415-7f6abe01a974 in org CATS-ORG-1-2015_09_24-08h01m29.448s / space CATS-SPACE-1-2015_09_24-08h01m29.448s as CATS-USER-1-2015_09_24-08h01m29.448s... OK
Creating route cats-app-5c7775a6-1753-4e2d-4415-7f6abe01a974.switchollie.allstate.com... OK
Binding cats-app-5c7775a6-1753-4e2d-4415-7f6abe01a974.switchollie.allstate.com to CATS-APP-5c7775a6-1753-4e2d-4415-7f6abe01a974... OK
Uploading CATS-APP-5c7775a6-1753-4e2d-4415-7f6abe01a974...
Uploading app files from: /var/folders/ph/tg82ppzd6kngwm_g2tbzpccc0000gn/T/matching-app824262495
Uploading 132, 1 files
Done uploading
OK
Starting app CATS-APP-5c7775a6-1753-4e2d-4415-7f6abe01a974 in org CATS-ORG-1-2015_09_24-08h01m29.448s / space CATS-SPACE-1-2015_09_24-08h01m29.448s as CATS-USER-1-2015_09_24-08h01m29.448s...
-----> Downloaded app package (4.0K)
Staging failed: An application could not be detected by any available buildpack
FAILED Server error, status code: 400, error code: 170003, message: An app was not successfully detected by any available buildpack
TIP: use 'cf logs CATS-APP-5c7775a6-1753-4e2d-4415-7f6abe01a974 --recent' for more information
Waiting for: NoAppDetectedError
/Users/localadmin/github.com/cloudfoundry/src/github.com/cloudfoundry/cf-acceptance-tests/apps/admin_buildpack_lifecycle_test.go:127`
It looks as if it is failing for the correct reason. Is there something I am missing?
|
|
Re: Security group rules to allow HTTP communication between 2 apps deployed on CF
Containers have a default iptables rule to REJECT all traffic. If no security group is configured to allow the traffic to the destination, you'll get a connection refused. Security groups can only be created and configured by admin users. Your only option is probably to have one app connect to the other using the public route bound to that app.
Joseph
CF Release Integration Team
On Wed, Sep 23, 2015 at 3:54 AM, Denilson Nastacio <dnastacio(a)gmail.com> wrote:
The message indicates this problem is unrelated to security groups. You would get something like "host not found" instead of "connection refused".
Which version of CF are you using? Can you curl a url from app2 at all?
On Wed, Sep 23, 2015, 3:27 AM Naveen Asapu <asapu.naveen(a)gmail.com> wrote:
Hi Matthew Sykes,
Actually, I'm trying to monitor usage of an app in Bluemix. For that I'm using cf-abacus, and this command is part of its example steps.
Can you suggest how to monitor app usage using curl and Cloud Foundry?
-- Thanks Naveen Asapu
|
|
Re: DEA/Warden staging error
kyle havlovitz <kylehav@...>
Ok, after more investigating the problem was that network manager was running on the machine and was trying to take control of new network interfaces after they came up, so it would cause problems with the interface that Warden created for the container. With network manager disabled I can push the app and everything is fine.
Thanks for your help everyone.
On Wed, Sep 23, 2015 at 10:45 AM, kyle havlovitz <kylehav(a)gmail.com> wrote:
Here's the output from those commands: https://gist.github.com/MrEnzyme/36592831b1c46d44f007
Soon after running those I noticed that the container loses its IPv4 address shortly after coming up and ifconfig looks like this:
root(a)cf-build:/home/cloud-user/test# ifconfig -a
docker0 Link encap:Ethernet HWaddr 56:84:7a:fe:97:99
    inet addr:172.17.42.1 Bcast:0.0.0.0 Mask:255.255.0.0
    UP BROADCAST MULTICAST MTU:1500 Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
eth0 Link encap:Ethernet HWaddr fa:16:3e:cd:f3:0a
    inet addr:172.25.1.52 Bcast:172.25.1.127 Mask:255.255.255.128
    inet6 addr: fe80::f816:3eff:fecd:f30a/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:515749 errors:0 dropped:0 overruns:0 frame:0
    TX packets:295471 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:1162366659 (1.1 GB) TX bytes:59056756 (59.0 MB)
lo Link encap:Local Loopback
    inet addr:127.0.0.1 Mask:255.0.0.0
    inet6 addr: ::1/128 Scope:Host
    UP LOOPBACK RUNNING MTU:65536 Metric:1
    RX packets:45057315 errors:0 dropped:0 overruns:0 frame:0
    TX packets:45057315 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:18042315375 (18.0 GB) TX bytes:18042315375 (18.0 GB)
w-190db6c54la-0 Link encap:Ethernet HWaddr 12:dc:ba:da:38:5b
    inet6 addr: fe80::10dc:baff:feda:385b/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1454 Metric:1
    RX packets:12 errors:0 dropped:0 overruns:0 frame:0
    TX packets:227 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:872 (872.0 B) TX bytes:35618 (35.6 KB)
Any idea what would be causing that?
On Tue, Sep 22, 2015 at 10:31 PM, Matthew Sykes <matthew.sykes(a)gmail.com> wrote:
Based on your description, it doesn't sound like warden networking or the warden iptables chains are your problem. Are you able to share all of your routes and chains via a gist?
route -n
ifconfig -a
iptables -L -n -v -t filter
iptables -L -n -v -t nat
iptables -L -n -v -t mangle
Any kernel messages that look relevant in the message buffer (dmesg)?
Have you tried doing a network capture to verify the packets look the way you expect? Are you sure your host routing rules are good? Do the warden subnets overlap with any network accessible to the host?
Based on previous notes, it doesn't sound like this is a standard deployment so it's hard to say what could be impacting you.
On Tue, Sep 22, 2015 at 1:08 PM, Kyle Havlovitz (kyhavlov) < kyhavlov(a)cisco.com> wrote:
I didn’t; I’m still having this problem. Even adding this lenient security group didn’t let me get any traffic out of the VM:
[{"name":"allow_all","rules":[{"protocol":"all","destination":"0.0.0.0/0"},{"protocol":"tcp","destination":"0.0.0.0/0","ports":"1-65535"},{"protocol":"udp","destination":"0.0.0.0/0","ports":"1-65535"}]}]
The only way I was able to get traffic out was by manually removing the reject/drop iptables rules that warden set up, and even with that the container still lost all connectivity after 30 seconds.
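For anyone reproducing this setup, the rule list can be generated rather than hand-edited, which avoids stray whitespace creeping into the destination values. The Python sketch below is an illustration only: the `rule` helper and its checks are assumptions made for this example and are not the Cloud Controller's own validation; the field names (`protocol`, `destination`, `ports`) are the ones used in the definition above.

```python
import json


def rule(protocol, destination, ports=None):
    """Build one egress rule; tcp and udp rules carry a port range."""
    # Guard against stray whitespace, e.g. from line wrapping in an email.
    entry = {"protocol": protocol, "destination": destination.strip()}
    if protocol in ("tcp", "udp"):
        if not ports:
            raise ValueError("%s rules need a ports value" % protocol)
        entry["ports"] = ports
    return entry


# The permissive rule set from the message above.
allow_all = [
    rule("all", "0.0.0.0/0"),
    rule("tcp", "0.0.0.0/0", "1-65535"),
    rule("udp", "0.0.0.0/0", "1-65535"),
]

rules_json = json.dumps(allow_all, indent=2)
print(rules_json)
```

With the cf CLI, a JSON array of rules like this is what `cf create-security-group` takes as its rules file; the surrounding "name"/"rules" wrapper in the message above is the form used when the group is defined elsewhere, such as in the deployment manifest. A group also has to be bound for staging or running before its rules apply to containers.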
From: CF Runtime <cfruntime(a)gmail.com>
Reply-To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev(a)lists.cloudfoundry.org>
Date: Tuesday, September 22, 2015 at 12:50 PM
To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev(a)lists.cloudfoundry.org>
Subject: [cf-dev] Re: Re: Re: Re: Re: Re: Re: Re: DEA/Warden staging error
Hey Kyle,
Did you make any progress?
Zak & Mikhail CF Release Integration Team
On Thu, Sep 17, 2015 at 10:28 AM, CF Runtime <cfruntime(a)gmail.com> wrote:
It certainly could be. By default the containers reject all egress traffic. CC security groups configure iptables rules that allow traffic out.
One of the default security groups in the BOSH templates allows access on port 53. If you have no security groups, the containers will not be able to make any outgoing requests.
Joseph & Natalie CF Release Integration Team
On Thu, Sep 17, 2015 at 8:44 AM, Kyle Havlovitz (kyhavlov) < kyhavlov(a)cisco.com> wrote:
On running git clone inside the container via the warden shell, I get: "Cloning into 'staticfile-buildpack'... fatal: unable to access 'https://github.com/cloudfoundry/staticfile-buildpack/': Could not resolve host: github.com". So the container can't get to anything outside of it (I also tried pinging some external IPs to make sure it wasn't a DNS thing). Would this be caused by cloud controller security group settings?
-- Matthew Sykes matthew.sykes(a)gmail.com
|
|
Re: Removing support for v1 service brokers
Mike Youngstrom <youngm@...>
My vote is to wait a couple more months. I guess we'll see if anyone else would like more months.
Mike
On Sep 23, 2015 11:52 PM, "Dieu Cao" <dcao(a)pivotal.io> wrote:
Thanks Mike. Totally understandable.
On Wed, Sep 23, 2015 at 9:23 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:
Thanks Dieu, honestly I was just trying to find an angle to bargain for a bit more time. :) Three months is generous. But six months would be glorious. :)
After the CAB call this month we got started converting our brokers over but our migration is more difficult because we use Service instance credentials quite a bit and those don't appear to be handled well when doing "migrate-service-instances". I think we can do 3 months but we'll be putting our users through a bit of a fire drill.
That said, I'll understand if you stick to 3 months, since we should have started this conversion long ago.
Mike
On Wed, Sep 23, 2015 at 1:22 AM, Dieu Cao <dcao(a)pivotal.io> wrote:
We've found NATS to be unstable under certain conditions, temporary network interruptions or network instability, around the client reconnection logic. We've seen that it could take anywhere from a few seconds to half an hour to reconnect properly. We spent a fair amount of time investigating ways to improve the reconnection logic and have made some improvements but believe that it's best to work towards not having this dependency. You can find more about this in the stories in this epic [1].
Mike, in addition to removing the NATS dependency, this will remove the burden on the team, almost a weekly fight, in terms of maintaining backwards compatibility for the v1 broker spec any time we work on adding functionality to the service broker api. I'll work with the team in the next couple of weeks on specific stories and I'll link to it here.
[1] https://www.pivotaltracker.com/epic/show/1440790
On Tue, Sep 22, 2015 at 10:07 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:
Thanks for the announcement.
To be clear, is this announcement to cease support for the old v1 brokers, or is it to eliminate support for the v1 API in the CC? Does the v1 CC code depend on NATS? None of my custom v1 brokers depend on NATS.
Mike
On Tue, Sep 22, 2015 at 6:01 PM, Dieu Cao <dcao(a)pivotal.io> wrote:
Hello all,
We plan to remove support for v1 service brokers in about 3 months, in a cf-release following 12/31/2015. We are working towards removing CF's dependency on NATS and the v1 service brokers are still dependent on NATS. Please let me know if you have questions/concerns about this timeline.
I'll be working on verifying a set of steps that you can find here [1] that document how to migrate your service broker from v1 to v2 and what is required in order to persist user data and will get that posted to the service broker api docs officially.
-Dieu CF CAPI PM
[1] https://docs.google.com/document/d/1Pl1o7mxtn3Iayq2STcMArT1cJsKkvi4Ey1-d3TB_Nhs/edit?usp=sharing
|
|
Re: How to deploy a Web application using HTTPs
Juan Antonio Breña Moral <bren at juanantonio.info...>
Hi Dieu,
many thanks for the technical info.
I will consider this factor to add this restriction in the development.
Juan Antonio
|
|
Re: Removing support for v1 service brokers
Thanks Mike. Totally understandable.
On Wed, Sep 23, 2015 at 9:23 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:
Thanks Dieu, honestly I was just trying to find an angle to bargain for a bit more time. :) Three months is generous. But six months would be glorious. :)
After the CAB call this month we got started converting our brokers over but our migration is more difficult because we use Service instance credentials quite a bit and those don't appear to be handled well when doing "migrate-service-instances". I think we can do 3 months but we'll be putting our users through a bit of a fire drill.
That said, I'll understand if you stick to 3 months, since we should have started this conversion long ago.
Mike
On Wed, Sep 23, 2015 at 1:22 AM, Dieu Cao <dcao(a)pivotal.io> wrote:
We've found NATS to be unstable under certain conditions, temporary network interruptions or network instability, around the client reconnection logic. We've seen that it could take anywhere from a few seconds to half an hour to reconnect properly. We spent a fair amount of time investigating ways to improve the reconnection logic and have made some improvements but believe that it's best to work towards not having this dependency. You can find more about this in the stories in this epic [1].
Mike, in addition to removing the NATS dependency, this will remove the burden on the team, almost a weekly fight, in terms of maintaining backwards compatibility for the v1 broker spec any time we work on adding functionality to the service broker api. I'll work with the team in the next couple of weeks on specific stories and I'll link to it here.
[1] https://www.pivotaltracker.com/epic/show/1440790
On Tue, Sep 22, 2015 at 10:07 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:
Thanks for the announcement.
To be clear, is this announcement to cease support for the old v1 brokers, or is it to eliminate support for the v1 API in the CC? Does the v1 CC code depend on NATS? None of my custom v1 brokers depend on NATS.
Mike
On Tue, Sep 22, 2015 at 6:01 PM, Dieu Cao <dcao(a)pivotal.io> wrote:
Hello all,
We plan to remove support for v1 service brokers in about 3 months, in a cf-release following 12/31/2015. We are working towards removing CF's dependency on NATS and the v1 service brokers are still dependent on NATS. Please let me know if you have questions/concerns about this timeline.
I'll be working on verifying a set of steps that you can find here [1] that document how to migrate your service broker from v1 to v2 and what is required in order to persist user data and will get that posted to the service broker api docs officially.
-Dieu CF CAPI PM
[1] https://docs.google.com/document/d/1Pl1o7mxtn3Iayq2STcMArT1cJsKkvi4Ey1-d3TB_Nhs/edit?usp=sharing
|
|
Re: How to deploy a Web application using HTTPs
Your edge load balancer should be configured to add x-forwarded-for and x-forwarded-proto headers.
On Wed, Sep 23, 2015 at 4:24 AM, Juan Antonio Breña Moral <bren(a)juanantonio.info> wrote:
@James,
who adds the headers?
"x-forwarded-for":"CLIENT_REAL_IP, CLOUD_FOUNDRY_IP", "x-forwarded-proto":"https"
the load balancer or the GoRouter?
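On the application side, these two headers are usually all an app needs in order to tell whether the original request used HTTPS and where it came from. A minimal Python sketch follows; it assumes the headers arrive as a plain name-to-value mapping (how they are exposed depends on the web framework), the function name is made up, and the values are the placeholders from the question above.

```python
def forwarded_info(headers):
    """Return (client_ip, is_https) from X-Forwarded-* request headers."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw_hops = lowered.get("x-forwarded-for", "").split(",")
    hops = [hop.strip() for hop in raw_hops if hop.strip()]
    # The first entry is the original client; later ones are proxies.
    client_ip = hops[0] if hops else None
    proto = lowered.get("x-forwarded-proto", "").strip().lower()
    return client_ip, proto == "https"


# Placeholder values copied from the question above.
example = {
    "x-forwarded-for": "CLIENT_REAL_IP, CLOUD_FOUNDRY_IP",
    "x-forwarded-proto": "https",
}
print(forwarded_info(example))
```

These values are only as trustworthy as the component that sets them, which is one reason to have the edge load balancer add or overwrite them rather than pass through whatever the client sent.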
|
|
Re: Introducing CF-Swagger
Separate from this proposal, the CAPI team has stories for spiking on a few different API documentation options for the Cloud Controller API [1]. Swagger is one of the options we are looking into, but it is not the only one.
[1] https://www.pivotaltracker.com/epic/show/2093796
On Wed, Sep 23, 2015 at 11:00 AM, Deepak Vij (A) <deepak.vij(a)huawei.com> wrote:
Hi Mohamed and Dr. Max,
I fully support this effort. Having a Swagger-based “Application Interface” capability as part of the overall CF PaaS platform would be very useful for the CF community as a whole. As a matter of fact, I also initiated a similar thread a few months ago on the cf-dev alias (see email text below). Your work exactly matches our current thinking.
Having a “Swagger”-based “Application Interface” is a very good start along those lines. It opens up lots of other possibilities, such as building out “Deployment Governance” capabilities not merely for Cloud Foundry API or Services assets but for the whole Application landscape built and deployed within the CF PaaS environment and subsequently exposed as APIs to end consumers.
As described in the earlier email below, “Deployment Governance” as part of overall API Management is what we are striving towards in order to expose comprehensive telecom API Management capabilities within the public cloud environment.
Dr. Max, as I mentioned to you during our brief discussion a few days ago, the “Heroku” folks also have a similar initiative ongoing. They have gone the lightweight “JSON” schema route versus Swagger/WADL/RAML etc.
In any case, I am fully in support of your proposal. Thanks.
Regards,
Deepak Vij
=============================
Hi folks, I would like to start a thread on the need for machine-readable “*Application Interface*” supported at the platform level. Essentially, this interface describes details such as available methods/operations, inputs/outputs data types (schema), application dependencies etc. Any standard specifications language can be used for this purpose, as long as it clearly describes the schema of the requests and responses – one can use Web Application Description Language (WADL), Swagger, RESTful API Modeling Language (RAML), JSON Schema (something like *JSON Schema for Heroku Platform APIs*) or any other language that provides similar functionality. These specifications are to be automatically derived from the code and are typically part of the application development process (e.g. generated by the build system).
Such functionality can have lots of usage scenarios:
1. First and foremost, Deployment Governance for API Management (our main vested interest) – API Versioning & Backward Compatibility, Dependency Management and many more as part of the comprehensive telecom API Management capabilities which we are currently in the process of building out.
2. Auto-creating client libraries for your favorite programming language.
3. Automatic generation of up-to-date documentation.
4. Writing automatic acceptance and integration tests etc.
From a historical perspective, in the early 2000s when SOA started out, the mindset was to author the application contract-first (the application interface, using WSDL at that time) and subsequently generate and author code from that interface. With the advent of RESTful services, the REST community initially took a stand against such metadata for applications. Nonetheless, a number of metadata standards have emerged over the last couple of years, mainly fueled by the use case scenarios described earlier.
Based on my knowledge, none of this currently exists within Cloud Foundry at the platform level. It would be highly desirable to have a standard common “*application interface*” definition at the platform level, agnostic of the underlying application development frameworks.
I hope this all makes sense. I think this is something that could be very relevant to the “Utilities” PMC. I will also copy and paste this text under the “Utilities” PMC notes on GitHub.
I would love to hear from the community on this. Thanks.
Regards,
Deepak Vij
*From:* Michael Maximilien [mailto:maxim(a)us.ibm.com]
*Sent:* Friday, September 18, 2015 4:52 PM
*To:* cf-dev(a)lists.cloudfoundry.org
*Cc:* Heiko Ludwig; Mohamed Mohamed; Alex Tarpinian; Christopher B Ferris
*Subject:* [cf-dev] Introducing CF-Swagger
Hi, all,
This email serves two purposes: 1) introduce CF-Swagger, and 2) share the results of the CF service broker compliance survey I sent out a couple of weeks ago.
------
My IBM Research colleague, Mohamed (on cc:), and I have been working on creating Swagger descriptions for some CF APIs.
Our main goal was to explore what useful tools or utilities we could build with these Swagger descriptions once created.
The initial result of this exploratory research is CF-Swagger, which is covered in the following:
See presentation here: https://goo.gl/Y16plT
Video demo here: http://goo.gl/C8Nz5p
Temp repo here: https://github.com/maximilien/cf-swagger
The gist of our work and results is:
1. We created a full Swagger description of the CF service broker
2. Using this description you can use the Swagger editor to create neat API docs that are browsable and even callable
3. Using the description you can create client and server stubs for service brokers in a variety of languages, e.g., JS, Java, Ruby, etc.
4. We've extended go-swagger to generate workable client and server stubs for service brokers in Golang. We plan to submit all changes to go-swagger back to that project
5. We've extended go-swagger to generate prototypes of working Ginkgo tests to service brokers
6. We've extended go-swagger to generate a CF service broker Ginkgo Test Compliance Kit (TCK) that anyone could use to validate their broker's compliance with any Swagger-described version of spec
7. We've created a custom Ginkgo reporter that, when run with the TCK, will give you a summary of your compliance, e.g., 100% compliant with v2.5 but 90% compliant with v2.6 due to failing test X, Y, Z... (in Ginkgo fashion)
8. The survey results (all included in the presentation) indicate that over 50% of respondents believe TCK tests for service brokers would be valuable to them. Many (over 50%) are using custom proprietary tests, and this project may be a way to get everyone to converge on a common set of tests we could all use and improve...
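To make the compliance summary in item 7 concrete, here is a small sketch of the arithmetic such a report implies. It is not the CF-Swagger Ginkgo reporter, which is written in Go; the function names, the result layout, and the test names below are all hypothetical.

```python
def compliance_summary(results):
    """Map each spec version to (percent_passed, sorted failing test names).

    `results` is assumed to look like {"v2.5": {"test name": True, ...}}.
    """
    summary = {}
    for version, tests in results.items():
        failed = sorted(name for name, passed in tests.items() if not passed)
        total = len(tests)
        percent = 100.0 * (total - len(failed)) / total if total else 0.0
        summary[version] = (percent, failed)
    return summary


def format_summary(summary):
    """Render one human-readable line per spec version."""
    lines = []
    for version in sorted(summary):
        percent, failed = summary[version]
        line = f"{percent:.0f}% compliant with {version}"
        if failed:
            line += " due to failing tests: " + ", ".join(failed)
        lines.append(line)
    return lines


# Hypothetical results: ten tests per version, one failure against v2.6.
tests_v25 = {f"test_{i}": True for i in range(10)}
tests_v26 = dict(tests_v25, test_3=False)
report = format_summary(compliance_summary({"v2.5": tests_v25,
                                            "v2.6": tests_v26}))
```

With these sample inputs, `report` holds one line for v2.5 and one for v2.6, the latter naming `test_3` as the failing test.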
------
We plan to propose this work to become a CF incubator at the next CAB and PMC calls, especially the TCK part for service brokers. The overall approach and project could be useful for other parts of the CF APIs but we will start with CF Service Brokers.
The actual Swagger descriptions should ideally come from the teams who own the APIs; for service brokers, that is the CAPI team. We are engaging them, as they have also been looking at improving API docs and descriptions. There may be potential for synergies, and at a minimum we want to make sure that what we generate ends up being useful to their pipelines.
Finally, while the repo is temporary and will change, I welcome you to take a look at the presentation, video, and code, and let us know your thoughts and feedback.
Thanks for your time and interest.
Mohamed and Max
IBM
|
|
Re: Error 400007: `stats_z1/0' is not running after update
That did help. It showed us the real error.
==> metron_agent/metron_agent.stdout.log <==
{"timestamp":1443054247.927488327,"process_id":23472,"source":"metron","log_level":"warn","message":"Failed to create client: Could not connect to NATS: dial tcp 192.168.110.202:4222: i/o timeout","data":null,"file":"/var/vcap/data/compile/metron_agent/loggregator/src/github.com/cloudfoundry/loggregatorlib/cfcomponent/registrars/collectorregistrar/collector_registrar.go","line":51,"method":"github.com/cloudfoundry/loggregatorlib/cfcomponent/registrars/collectorregistrar.(*CollectorRegistrar).Run"}
I checked the security rule. It seems to have some problems.
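Since the log points at a TCP timeout towards NATS, a quick way to confirm a security-rule problem is to try opening the same connection from the affected VM. The following is a minimal sketch using only the Python standard library; the helper name `can_connect` is hypothetical, and it checks plain TCP reachability only, not the NATS protocol itself.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return (True, None) if a TCP connection opens, else (False, reason)."""
    try:
        # create_connection performs the DNS lookup and the TCP handshake;
        # the socket is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Timeouts, refused connections and unreachable networks are all
        # subclasses of OSError.
        return False, str(exc)
```

Calling `can_connect("192.168.110.202", 4222)` from the stats_z1 VM would use the address and port from the log line above. A timeout there, combined with success from a VM in a different security group, would be consistent with a blocked port rather than a NATS outage.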
On Thu, Sep 24, 2015 at 2:47 AM, Amit Gupta <agupta(a)pivotal.io> wrote:
I often take the following approach to debugging issues like this:
* Open two shell sessions to your failing VM using bosh ssh, and switch to superuser
* In one session, `watch monit summary`. You might see collector going back and forth between initializing and not monitored, but please report anything else of interest you see here
* In the other session, `cd /var/vcap/sys/log` and then `watch --differences=cumulative ls -altr **/*` to see which files are being written to while the startup processes are thrashing. Then `tail -f FILE_1 FILE_2 ...` listing all the files that were being written to, and seem relevant to the thrashing process(es) in monit
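As a scripted variant of the "which files are being written to" step, here is a minimal sketch that lists log files modified within the last minute. It is an alternative to the `watch ... ls -altr` command, not a replacement for it; the function name `recently_written` and the 60-second window are assumptions.

```python
import os
import time


def recently_written(root, within_seconds=60, now=None):
    """Return paths under `root` modified in the last `within_seconds`,
    ordered oldest to newest by modification time."""
    now = time.time() if now is None else now
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                # The file was rotated or removed while we were walking.
                continue
            if now - mtime <= within_seconds:
                hits.append((mtime, path))
    return [path for _mtime, path in sorted(hits)]
```

Running `recently_written("/var/vcap/sys/log")` a few times while monit is thrashing should surface the same candidate files to pass to `tail -f`.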
On Wed, Sep 23, 2015 at 12:21 AM, Guangcai Wang <guangcai.wang(a)gmail.com> wrote:
It frequently logs the message below, which does not seem helpful.
{"timestamp":1442987404.9433253,"message":"collector.started","log_level":"info","source":"collector","data":{},"thread_id":70132569199380,"fiber_id":70132570371720,"process_id":19392,"file":"/var/vcap/packages/collector/lib/collector/config.rb","lineno":45,"method":"setup_logging"}
The only possible error message in the bosh debug log is "ntp":{"message":"bad ntp server"}
But I don't think it is related to the failure to update stats_z1.
I, [2015-09-23 04:55:59 #2392] [canary_update(stats_z1/0)] INFO -- DirectorJobRunner: Checking if stats_z1/0 has been updated after 63.333333333333336 seconds D, [2015-09-23 04:55:59 #2392] [canary_update(stats_z1/0)] DEBUG -- DirectorJobRunner: SENT: agent.7d3452bd-679e-4a97-8514-63a373a54ffd {"method":"get_state","arguments":[],"reply_to":"director.c5b97fc1-b972-47ec-9412-a83ad240823b.473fda64-6ac3-4a53-9ebc-321fc7eabd7a"} D, [2015-09-23 04:55:59 #2392] [] DEBUG -- DirectorJobRunner: RECEIVED: director.c5b97fc1-b972-47ec-9412-a83ad240823b.473fda64-6ac3-4a53-9ebc-321fc7eabd7a {"value":{"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"stats_z1","release":"","template":"fluentd","version":"4c71c87bbf0144428afacd470e2a5e32b91932fc","sha1":"b141c6037d429d732bf3d67f7b79f8d7d80aac5d","blobstore_id":"d8451d63-2e4f-4664-93a8-a77e5419621d","templates":[{"name":"fluentd","version":"4c71c87bbf0144428afacd470e2a5e32b91932fc","sha1":"b141c6037d429d732bf3d67f7b79f8d7d80aac5d","blobstore_id":"d8451d63-2e4f-4664-93a8-a77e5419621d"},{"name":"collector","version":"889b187e2f6adc453c61fd8f706525b60e4b85ed","sha1":"f5ae15a8fa2417bf984513e5c4269f8407a274dc","blobstore_id":"3eeb0166-a75c-49fb-9f28-c29788dbf64d"},{"name":"metron_agent","version":"e6df4c316b71af68dfc4ca476c8d1a4885e82f5b","sha1":"42b6d84ad9368eba0508015d780922a43a86047d","blobstore_id":"e578bfb0-9726-4754-87ae-b54c8940e41a"},{"name":"apaas_collector","version":"8808f0ae627a54706896a784dba47570c92e0c8b","sha1":"b9a63da925b40910445d592c70abcf4d23ffe84d","blobstore_id":"3e6fa71a-07f7-446a-96f4-3caceea02f2f"}]},"packages":{"apaas_collector":{"name":"apaas_collector","version":"f294704d51d4517e4df3d8417a3d7c71699bc04d.1","sha1":"5af77ceb01b7995926dbd4ad7481dcb7c3d94faf","blobstore_id":"fa0e96b9-71a6-4828-416e-dde3427a73a9"},"collector":{"name":"collector","version":"ba47450ce83b8f2249b75c79b38397db249df48b.1","sha1":"0bf8ee0d69b3f21cf1878a43a9616cb7e14f6f25","blobstore_id":"722a5455-f7f7-427d-7e8d-e562552857bc"
},"common":{"name":"common","version":"99c756b71550530632e393f5189220f170a69647.1","sha1":"90159de912c9bfc71740324f431ddce1a5fede00","blobstore_id":"37be6f28-c340-4899-7fd3-3517606491bb"},"fluentd-0.12.13":{"name":"fluentd-0.12.13","version":"71d8decbba6c863bff6c325f1f8df621a91eb45f.1","sha1":"2bd32b3d3de59e5dbdd77021417359bb5754b1cf","blobstore_id":"7bc81ac6-7c24-4a94-74d1-bb9930b07751"},"metron_agent":{"name":"metron_agent","version":"997d87534f57cad148d56c5b8362b72e726424e4.1","sha1":"a21404c50562de75000d285a02cd43bf098bfdb9","blobstore_id":"6c7cf72c-9ace-40a1-4632-c27946bf631e"},"ruby-2.1.6":{"name":"ruby-2.1.6","version":"41d0100ffa4b21267bceef055bc84dc37527fa35.1","sha1":"8a9867197682cabf2bc784f71c4d904bc479c898","blobstore_id":"536bc527-3225-43f6-7aad-71f36addec80"}},"configuration_hash":"a73c7d06b0257746e95aaa2ca994c11629cbd324","networks":{"private_cf_subnet":{"cloud_properties":{"name":"random","net_id":"1e1c9aca-0b5a-4a8f-836a-54c18c21c9b9","security_groups":["az1_cf_management_secgroup_bosh_cf_ssh_cf2","az1_cf_management_secgroup_cf_private_cf2","az1_cf_management_secgroup_cf_public_cf2"]},"default":["dns","gateway"],"dns":["192.168.110.8","133.162.193.10","133.162.193.9","192.168.110.10"],"dns_record_name":"0.stats-z1.private-cf-subnet.cf-apaas.microbosh","gateway":"192.168.110.11","ip":"192.168.110.204","netmask":"255.255.255.0"}},"resource_pool":{"cloud_properties":{"instance_type":"S-1"},"name":"small_z1","stemcell":{"name":"bosh-openstack-kvm-ubuntu-trusty-go_agent","version":"2989"}},"deployment":"cf-apaas","index":0,"persistent_disk":0,"persistent_disk_pool":null,"rendered_templates_archive":{"sha1":"0ffd89fa41e02888c9f9b09c6af52ea58265a8ec","blobstore_id":"4bd01ae7-a69a-4fe5-932b-d98137585a3b"},"agent_id":"7d3452bd-679e-4a97-8514-63a373a54ffd","bosh_protocol":"1","job_state":"failing","vm":{"name":"vm-12d45510-096d-4b8b-9547-73ea5fda00c2"},"ntp":{"message":"bad ntp server"}}}
On Wed, Sep 23, 2015 at 5:13 PM, Amit Gupta <agupta(a)pivotal.io> wrote:
Please check the file collector/collector.log, it's in a subdirectory of the unpacked log tarball.
On Wed, Sep 23, 2015 at 12:01 AM, Guangcai Wang <guangcai.wang(a)gmail.com> wrote:
Actually, I checked the two files in the stats_z1 job VM. I did not find any clues. Attached for reference.
On Wed, Sep 23, 2015 at 4:54 PM, Amit Gupta <agupta(a)pivotal.io> wrote:
If you do "bosh logs stats_z1 0 --job" you will get a tarball of all the logs for the relevant processes running on the stats_z1/0 VM. You will likely find some error messages in the collector's stdout or stderr logs.
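Because these component logs are mostly JSON lines dominated by info-level entries, a small filter can help surface the interesting ones once the tarball is unpacked. The sketch below assumes each entry carries `log_level`, `source` and `message` fields, as the entries quoted in this thread do; the function name `notable_entries` is hypothetical.

```python
import json


def notable_entries(lines, ignore_levels=("debug", "info")):
    """Return (level, source, message) for entries above the ignored levels."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            # Skip blank lines and tail headers such as "==> file <==".
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Truncated or non-JSON output is ignored rather than fatal.
            continue
        level = str(record.get("log_level", "")).lower()
        if level not in ignore_levels:
            entries.append((level, record.get("source"),
                            record.get("message")))
    return entries
```

For example, `notable_entries(open(path))` over a stdout log would drop entries like `collector.started` and keep warnings such as a failed NATS connection.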
On Tue, Sep 22, 2015 at 11:30 PM, Guangcai Wang <guangcai.wang(a)gmail.com> wrote:
It does not help.
I always see the "collector" process bouncing between "running" and "does not exist" when I run "monit summary" in a while loop.
Does anyone know how to get the real error when the "collector" process is not reported as failed? Thanks.
On Wed, Sep 23, 2015 at 4:11 PM, Tony <Tonyl(a)fast.au.fujitsu.com> wrote:
My approach is to log in to the stats VM and sudo, then run "monit status" and restart the failed processes, or simply restart all processes by running "monit restart all".
Wait for a while (5~10 minutes at most). If there is still a failed process, e.g. collector, run ps -ef | grep collector and kill the processes in the list (you may sometimes need to run kill -9),
then run "monit restart all" again.
Normally, this fixes the issue "Failed: `XXX' is not running after update".
|
|