Re: DEA/Warden staging error

kyle havlovitz <kylehav@...>
 

Ok, after more investigating, the problem was that NetworkManager was
running on the machine and trying to take control of new network
interfaces as they came up, which caused problems with the interface that
Warden created for the container. With NetworkManager disabled I can push
the app and everything is fine.
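
For anyone who hits the same thing, disabling the service is what worked
here. On a systemd-based host that would be (a sketch; use the
upstart/sysvinit equivalent on older distros):

sudo systemctl stop NetworkManager
sudo systemctl disable NetworkManager

An alternative I haven't tried would be configuring NetworkManager to
treat the Warden-created interfaces as unmanaged rather than disabling it
entirely.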

Thanks for your help everyone.

On Wed, Sep 23, 2015 at 10:45 AM, kyle havlovitz <kylehav(a)gmail.com> wrote:

Here's the output from those commands:
https://gist.github.com/MrEnzyme/36592831b1c46d44f007
Soon after running those I noticed that the container loses its IPv4
address shortly after coming up and ifconfig looks like this:

root(a)cf-build:/home/cloud-user/test# ifconfig -a
docker0 Link encap:Ethernet HWaddr 56:84:7a:fe:97:99
inet addr:172.17.42.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
eth0 Link encap:Ethernet HWaddr fa:16:3e:cd:f3:0a
inet addr:172.25.1.52 Bcast:172.25.1.127 Mask:255.255.255.128
inet6 addr: fe80::f816:3eff:fecd:f30a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:515749 errors:0 dropped:0 overruns:0 frame:0
TX packets:295471 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1162366659 (1.1 GB) TX bytes:59056756 (59.0 MB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:45057315 errors:0 dropped:0 overruns:0 frame:0
TX packets:45057315 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:18042315375 (18.0 GB) TX bytes:18042315375 (18.0 GB)
w-190db6c54la-0 Link encap:Ethernet HWaddr 12:dc:ba:da:38:5b
inet6 addr: fe80::10dc:baff:feda:385b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1454 Metric:1
RX packets:12 errors:0 dropped:0 overruns:0 frame:0
TX packets:227 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:872 (872.0 B) TX bytes:35618 (35.6 KB)

Any idea what would be causing that?


On Tue, Sep 22, 2015 at 10:31 PM, Matthew Sykes <matthew.sykes(a)gmail.com>
wrote:

Based on your description, it doesn't sound like warden networking or the
warden iptables chains are your problem. Are you able to share all of your
routes and chains via a gist?

route -n
ifconfig -a
iptables -L -n -v -t filter
iptables -L -n -v -t nat
iptables -L -n -v -t mangle

Any kernel messages that look relevant in the message buffer (dmesg)?

Have you tried doing a network capture to verify the packets look the
way you expect? Are you sure your host routing rules are good? Do the
warden subnets overlap with any network accessible to the host?
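
For example, a quick capture on the host side of the container interface
(the device name below is only illustrative; substitute the actual w-*
interface):

tcpdump -n -i w-190db6c54la-0

Running the same capture inside the container at the same time usually
shows where the packets stop.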

Based on previous notes, it doesn't sound like this is a standard
deployment so it's hard to say what could be impacting you.

On Tue, Sep 22, 2015 at 1:08 PM, Kyle Havlovitz (kyhavlov) <
kyhavlov(a)cisco.com> wrote:

I didn’t; I’m still having this problem. Even adding this lenient
security group didn’t let me get any traffic out of the VM:

[{"name":"allow_all","rules":[{"protocol":"all","destination":"0.0.0.0/0
"},{"protocol":"tcp","destination":"0.0.0.0/0
","ports":"1-65535"},{"protocol":"udp","destination":"0.0.0.0/0
","ports":"1-65535"}]}]

The only way I was able to get traffic out was by manually removing the
reject/drop iptables rules that warden set up, and even with that the
container still lost all connectivity after 30 seconds.

From: CF Runtime <cfruntime(a)gmail.com>
Reply-To: "Discussions about Cloud Foundry projects and the system
overall." <cf-dev(a)lists.cloudfoundry.org>
Date: Tuesday, September 22, 2015 at 12:50 PM
To: "Discussions about Cloud Foundry projects and the system overall." <
cf-dev(a)lists.cloudfoundry.org>
Subject: [cf-dev] Re: Re: Re: Re: Re: Re: Re: Re: DEA/Warden staging
error

Hey Kyle,

Did you make any progress?

Zak & Mikhail
CF Release Integration Team

On Thu, Sep 17, 2015 at 10:28 AM, CF Runtime <cfruntime(a)gmail.com>
wrote:

It certainly could be. By default the containers reject all egress
traffic. CC security groups configure iptables rules that allow traffic
out.

One of the default security groups in the BOSH templates allows access
on port 53. If you have no security groups, the containers will not be able
to make any outgoing requests.
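
For reference, that default group is typically defined along these lines
(a sketch, not copied verbatim from the BOSH templates):

[{"protocol":"tcp","destination":"0.0.0.0/0","ports":"53"},{"protocol":"udp","destination":"0.0.0.0/0","ports":"53"}]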

Joseph & Natalie
CF Release Integration Team

On Thu, Sep 17, 2015 at 8:44 AM, Kyle Havlovitz (kyhavlov) <
kyhavlov(a)cisco.com> wrote:

On running git clone inside the container via the warden shell, I get:
"Cloning into 'staticfile-buildpack'...
fatal: unable to access '
https://github.com/cloudfoundry/staticfile-buildpack/': Could not
resolve host: github.com".
So the container can't get to anything outside of it (I also tried
pinging some external IPs to make sure it wasn't a DNS thing). Would this
be caused by cloud controller security group settings?

--
Matthew Sykes
matthew.sykes(a)gmail.com


Re: Removing support for v1 service brokers

Mike Youngstrom <youngm@...>
 

My vote is to wait a couple more months. I guess we'll see if anyone else
would like more months.

Mike

On Sep 23, 2015 11:52 PM, "Dieu Cao" <dcao(a)pivotal.io> wrote:

Thanks Mike. Totally understandable.


On Wed, Sep 23, 2015 at 9:23 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Thanks Dieu, honestly I was just trying to find an angle to bargain for a
bit more time. :) Three months is generous. But six months would be
glorious. :)

After the CAB call this month we got started converting our brokers over
but our migration is more difficult because we use Service instance
credentials quite a bit and those don't appear to be handled well when
doing "migrate-service-instances". I think we can do 3 months but we'll be
putting our users through a bit of a fire drill.

That said, I'll understand if you stick to 3 months, since we should have
started this conversion long ago.

Mike

On Wed, Sep 23, 2015 at 1:22 AM, Dieu Cao <dcao(a)pivotal.io> wrote:

We've found NATS to be unstable under certain conditions (temporary
network interruptions or network instability) around the client
reconnection logic.
We've seen that it could take anywhere from a few seconds to half an
hour to reconnect properly. We spent a fair amount of time investigating
ways to improve the reconnection logic and have made some improvements but
believe that it's best to work towards not having this dependency.
You can find more about this in the stories in this epic [1].

Mike, in addition to removing the NATS dependency, this will remove the
burden on the team, almost a weekly fight, in terms of maintaining
backwards compatibility for the v1 broker spec any time we work on adding
functionality to the service broker api.
I'll work with the team in the next couple of weeks on specific stories
and I'll link to it here.

[1] https://www.pivotaltracker.com/epic/show/1440790


On Tue, Sep 22, 2015 at 10:07 PM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

Thanks for the announcement.

To be clear, is this announcement to cease support for the old v1
brokers, or is it to eliminate support for the v1 API in the CC? Does the
v1 CC code depend on NATS? None of my custom v1 brokers depend on NATS.

Mike

On Tue, Sep 22, 2015 at 6:01 PM, Dieu Cao <dcao(a)pivotal.io> wrote:

Hello all,

We plan to remove support for v1 service brokers in about 3 months, in
a cf-release following 12/31/2015.
We are working towards removing CF's dependency on NATS and the v1
service brokers are still dependent on NATS.
Please let me know if you have questions/concerns about this timeline.

I'll be working on verifying a set of steps that you can find here [1]
that document how to migrate your service broker from v1 to v2 and what is
required in order to persist user data and will get that posted to the
service broker api docs officially.
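
For anyone planning the conversion: a v2 broker is a small set of HTTP
endpoints that the CC calls directly rather than over NATS. A sketch of
the catalog response, with placeholder IDs and names:

GET /v2/catalog

{"services":[{"id":"service-guid","name":"my-service","description":"example service","bindable":true,"plans":[{"id":"plan-guid","name":"free","description":"example plan"}]}]}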

-Dieu
CF CAPI PM

[1]
https://docs.google.com/document/d/1Pl1o7mxtn3Iayq2STcMArT1cJsKkvi4Ey1-d3TB_Nhs/edit?usp=sharing




Re: How to deploy a Web application using HTTPs

Juan Antonio Breña Moral <bren at juanantonio.info...>
 

Hi Dieu,

Many thanks for the technical info.

I will take this factor into account and add this restriction during development.

Juan Antonio


Re: Removing support for v1 service brokers

Dieu Cao <dcao@...>
 

Thanks Mike. Totally understandable.

On Wed, Sep 23, 2015 at 9:23 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Thanks Dieu, honestly I was just trying to find an angle to bargain for a
bit more time. :) Three months is generous. But six months would be
glorious. :)

After the CAB call this month we got started converting our brokers over
but our migration is more difficult because we use Service instance
credentials quite a bit and those don't appear to be handled well when
doing "migrate-service-instances". I think we can do 3 months but we'll be
putting our users through a bit of a fire drill.

That said, I'll understand if you stick to 3 months, since we should have
started this conversion long ago.

Mike

On Wed, Sep 23, 2015 at 1:22 AM, Dieu Cao <dcao(a)pivotal.io> wrote:

We've found NATS to be unstable under certain conditions (temporary
network interruptions or network instability) around the client
reconnection logic.
We've seen that it could take anywhere from a few seconds to half an hour
to reconnect properly. We spent a fair amount of time investigating ways to
improve the reconnection logic and have made some improvements but believe
that it's best to work towards not having this dependency.
You can find more about this in the stories in this epic [1].

Mike, in addition to removing the NATS dependency, this will remove the
burden on the team, almost a weekly fight, in terms of maintaining
backwards compatibility for the v1 broker spec any time we work on adding
functionality to the service broker api.
I'll work with the team in the next couple of weeks on specific stories
and I'll link to it here.

[1] https://www.pivotaltracker.com/epic/show/1440790


On Tue, Sep 22, 2015 at 10:07 PM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

Thanks for the announcement.

To be clear, is this announcement to cease support for the old v1 brokers,
or is it to eliminate support for the v1 API in the CC? Does the v1 CC
code depend on NATS? None of my custom v1 brokers depend on NATS.

Mike

On Tue, Sep 22, 2015 at 6:01 PM, Dieu Cao <dcao(a)pivotal.io> wrote:

Hello all,

We plan to remove support for v1 service brokers in about 3 months, in
a cf-release following 12/31/2015.
We are working towards removing CF's dependency on NATS and the v1
service brokers are still dependent on NATS.
Please let me know if you have questions/concerns about this timeline.

I'll be working on verifying a set of steps that you can find here [1]
that document how to migrate your service broker from v1 to v2 and what is
required in order to persist user data and will get that posted to the
service broker api docs officially.

-Dieu
CF CAPI PM

[1]
https://docs.google.com/document/d/1Pl1o7mxtn3Iayq2STcMArT1cJsKkvi4Ey1-d3TB_Nhs/edit?usp=sharing




Re: How to deploy a Web application using HTTPs

Dieu Cao <dcao@...>
 

Your edge load balancer should be configured to add x-forwarded-for and
x-forwarded-proto headers.
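
A quick way to sanity-check what reaches the router and app (the router
address and host below are placeholders):

curl http://GOROUTER_IP:80/ -H "Host: my-app.example.com" -H "X-Forwarded-For: 203.0.113.10" -H "X-Forwarded-Proto: https"

If the app sees those header values with this request but not when going
through the load balancer, the load balancer configuration is the missing
piece.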

On Wed, Sep 23, 2015 at 4:24 AM, Juan Antonio Breña Moral <
bren(a)juanantonio.info> wrote:

@James,

who adds the headers?

"x-forwarded-for":"CLIENT_REAL_IP, CLOUD_FOUNDRY_IP",
"x-forwarded-proto":"https"

the load balancer or the GoRouter?


Re: Introducing CF-Swagger

Dieu Cao <dcao@...>
 

Separate from this proposal, the CAPI team has stories for spiking on a few
different api documentation options for the cloud controller api [1].
Swagger is one of the options we are looking into, but it is not the only
one.

[1] https://www.pivotaltracker.com/epic/show/2093796



On Wed, Sep 23, 2015 at 11:00 AM, Deepak Vij (A) <deepak.vij(a)huawei.com>
wrote:

Hi Mohamed and Dr. Max, I fully support this effort. Having Swagger-based
“Application Interface” capability as part of the overall CF PaaS
platform would be very useful for the CF community as a whole. As a
matter of fact, I also initiated a similar thread a few months ago on the
cf-dev alias (see email text below). Your work exactly matches up with
our current thinking.



Having a “Swagger” based “Application Interface” is a very good start
along those lines. This opens up lots of other possibilities, such as
building out “Deployment Governance” capabilities, not merely for Cloud
Foundry API or Services assets but for the whole Application landscape
built & deployed within the CF PaaS environment and subsequently exposed
as APIs to end consumers.



As described in my earlier email below, “Deployment Governance” as part
of overall API Management is what we are striving towards in order to
expose comprehensive telecom API Management capabilities within the
public cloud environment.



Dr. Max, as I mentioned to you during our brief discussion a few days
ago, the “Heroku” folks have a similar initiative ongoing. They have gone
the lightweight “JSON” schema route versus Swagger/WADL/RAML etc.



In any case, I am fully in support of your proposal. Thanks.



Regards,

Deepak Vij



=============================

Hi folks, I would like to start a thread on the need for machine-readable “*Application
Interface*” supported at the platform level. Essentially, this interface
describes details such as available methods/operations, inputs/outputs data
types (schema), application dependencies etc. Any standard specifications
language can be used for this purpose, as long as it clearly describes the
schema of the requests and responses – one can use Web Application
Description Language (WADL), Swagger, RESTful API Modeling Language (RAML),
JSON Schema (something like *JSON Schema for Heroku Platform APIs*) or
any other language that provides similar functionality. These
specifications are to be automatically derived from the code and are
typically part of the application development process (e.g. generated by
the build system).



Such functionality can have lots of usage scenarios:

1. First and foremost, Deployment Governance for API Management (our
main vested interest) – API Versioning & Backward Compatibility,
Dependency Management and many more as part of the comprehensive telecom
API Management capabilities which we are currently in the process of
building out.

2. Auto-creating client libraries for your favorite programming
language.

3. Automatic generation of up-to-date documentation.

4. Writing automatic acceptance and integration tests etc.



From a historical perspective, in the early 2000s when SOA started out,
the mindset was to author the application contract-first (the application
interface, using WSDL at that time) and subsequently generate and author
code from that interface. With the advent of RESTful services, the REST
community initially took a stand against such metadata for applications.
A number of metadata standards have nonetheless emerged over the last
couple of years, mainly fueled by the use case scenarios described earlier.



Based on my knowledge, none of this currently exists within Cloud Foundry
at the platform level. It would be highly desirable to have a standard
common “*application interface*” definition at the platform level,
agnostic of the underlying application development frameworks.



I hope this all makes sense. I think this is something that could be very
relevant to the “Utilities” PMC. I will also copy & paste this text under
the “Utilities” PMC notes on GitHub.



I would love to hear from the community on this. Thanks.



Regards,

Deepak Vij



*From:* Michael Maximilien [mailto:maxim(a)us.ibm.com]
*Sent:* Friday, September 18, 2015 4:52 PM
*To:* cf-dev(a)lists.cloudfoundry.org
*Cc:* Heiko Ludwig; Mohamed Mohamed; Alex Tarpinian; Christopher B Ferris
*Subject:* [cf-dev] Introducing CF-Swagger



Hi, all,



This email serves two purposes: 1) introduce CF-Swagger, and 2) share the
results of the CF service broker compliance survey I sent out a couple of
weeks ago.



------

My IBM Research colleague, Mohamed (on cc:), and I have been working on
creating Swagger descriptions for some CF APIs.



Our main goal was to explore what useful tools or utilities we could build
with these Swagger descriptions once created.



The initial result of this exploratory research is CF-Swagger, which is
included in the following:



See presentation here: https://goo.gl/Y16plT

Video demo here: http://goo.gl/C8Nz5p

Temp repo here: https://github.com/maximilien/cf-swagger



The gist of our work and results is:



1. We created a full Swagger description of the CF service broker

2. Using this description you can use the Swagger editor to create neat
API docs that are browsable and even callable

3. Using the description you can create client and server stubs for
service brokers in a variety of languages, e.g., JS, Java, Ruby, etc.

4. We've extended go-swagger to generate workable client and server stubs
for service brokers in Golang. We plan to submit all changes to go-swagger
back to that project

5. We've extended go-swagger to generate prototypes of working Ginkgo
tests for service brokers

6. We've extended go-swagger to generate a CF service broker Ginkgo Test
Compliance Kit (TCK) that anyone could use to validate their broker's
compliance with any Swagger-described version of the spec

7. We've created a custom Ginkgo reporter that, when run with TCK, will
give you a summary of your compliance, e.g., 100% compliant with v2.5 but
90% compliant with v2.6 due to failing tests X, Y, Z... (in Ginkgo fashion)

8. The survey results (all included in the presentation) indicate that
over 50% of respondents believe TCK tests for service brokers would be
valuable to them. Many (over 50%) are using custom proprietary tests, and
this project may be a way to get everyone to converge on a common set of
tests we could all use and improve...
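
To give a flavor of what the description looks like, here is a
hypothetical, heavily trimmed Swagger 2.0 fragment for a single broker
endpoint (the real description in the repo is much fuller):

{
  "swagger": "2.0",
  "info": { "title": "CF Service Broker API", "version": "2.5" },
  "paths": {
    "/v2/catalog": {
      "get": {
        "summary": "Fetch the broker's catalog of services and plans",
        "responses": { "200": { "description": "the catalog" } }
      }
    }
  }
}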



------

We plan to propose this work to become a CF incubator at the next CAB and
PMC calls, especially the TCK part for service brokers. The overall
approach and project could be useful for other parts of the CF APIs but we
will start with CF Service Brokers.



The actual Swagger descriptions should ideally come from the teams who
own the APIs, so for service brokers, the CAPI team. We are engaging
them, as they have also been looking at improving API docs and
descriptions. Maybe there is potential for synergy, and at a minimum we
want to make sure what we generate ends up being useful to their pipelines.



Finally, while the repo is temporary and will change, I welcome you to
take a look at the presentation, video, and code and let us know your
thoughts and feedback.



Thanks for your time and interest.



Mohamed and Max

IBM


Re: Error 400007: `stats_z1/0' is not running after update

iamflying
 

That did help. It showed us the real error.

==> metron_agent/metron_agent.stdout.log <==
{"timestamp":1443054247.927488327,"process_id":23472,"source":"metron","log_level":"warn","message":"Failed
to create client: Could not connect to NATS: dial tcp 192.168.110.202:4222:
i/o
timeout","data":null,"file":"/var/vcap/data/compile/metron_agent/loggregator/src/
github.com/cloudfoundry/loggregatorlib/cfcomponent/registrars/collectorregistrar/collector_registrar.go
","line":51,"method":"
github.com/cloudfoundry/loggregatorlib/cfcomponent/registrars/collectorregistrar.(*CollectorRegistrar).Run
"}

I checked the security rule. It seems to have some problems.

On Thu, Sep 24, 2015 at 2:47 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

I often take the following approach to debugging issues like this:

* Open two shell sessions to your failing VM using bosh ssh, and switch to
superuser
* In one session, `watch monit summary`. You might see collector going
back and forth between initializing and not monitored, but please report
anything else of interest you see here
* In the other session, `cd /var/vcap/sys/log` and then `watch
--differences=cumulative ls -altr **/*` to see which files are being
written to while the startup processes are thrashing. Then `tail -f FILE_1
FILE_2 ...` listing all the files that were being written to, and seem
relevant to the thrashing process(es) in monit
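
Concretely, the two sessions look something like this (a sketch; the job
name and index come from your deployment):

bosh ssh stats_z1 0
sudo -i

# session 1:
watch monit summary

# session 2:
cd /var/vcap/sys/log
watch --differences=cumulative ls -altr **/*
tail -f FILE_1 FILE_2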


On Wed, Sep 23, 2015 at 12:21 AM, Guangcai Wang <guangcai.wang(a)gmail.com>
wrote:

It frequently logs the message below, which doesn't seem helpful.


{"timestamp":1442987404.9433253,"message":"collector.started","log_level":"info","source":"collector","data":{},"thread_id":70132569199380,"fiber_id":70132570371720,"process_id":19392,"file":"/var/vcap/packages/collector/lib/collector/config.rb","lineno":45,"method":"setup_logging"}

the only possible error message from the bosh debug log is
"ntp":{"message":"bad ntp server"}

But I don't think, it is related to the failure of stats_z1 updating.

I, [2015-09-23 04:55:59 #2392] [canary_update(stats_z1/0)] INFO --
DirectorJobRunner: Checking if stats_z1/0 has been updated after
63.333333333333336 seconds
D, [2015-09-23 04:55:59 #2392] [canary_update(stats_z1/0)] DEBUG --
DirectorJobRunner: SENT: agent.7d3452bd-679e-4a97-8514-63a373a54ffd
{"method":"get_state","arguments":[],"reply_to":"director.c5b97fc1-b972-47ec-9412-a83ad240823b.473fda64-6ac3-4a53-9ebc-321fc7eabd7a"}
D, [2015-09-23 04:55:59 #2392] [] DEBUG -- DirectorJobRunner: RECEIVED:
director.c5b97fc1-b972-47ec-9412-a83ad240823b.473fda64-6ac3-4a53-9ebc-321fc7eabd7a
{"value":{"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"stats_z1","release":"","template":"fluentd","version":"4c71c87bbf0144428afacd470e2a5e32b91932fc","sha1":"b141c6037d429d732bf3d67f7b79f8d7d80aac5d","blobstore_id":"d8451d63-2e4f-4664-93a8-a77e5419621d","templates":[{"name":"fluentd","version":"4c71c87bbf0144428afacd470e2a5e32b91932fc","sha1":"b141c6037d429d732bf3d67f7b79f8d7d80aac5d","blobstore_id":"d8451d63-2e4f-4664-93a8-a77e5419621d"},{"name":"collector","version":"889b187e2f6adc453c61fd8f706525b60e4b85ed","sha1":"f5ae15a8fa2417bf984513e5c4269f8407a274dc","blobstore_id":"3eeb0166-a75c-49fb-9f28-c29788dbf64d"},{"name":"metron_agent","version":"e6df4c316b71af68dfc4ca476c8d1a4885e82f5b","sha1":"42b6d84ad9368eba0508015d780922a43a86047d","blobstore_id":"e578bfb0-9726-4754-87ae-b54c8940e41a"},{"name":"apaas_collector","version":"8808f0ae627a54706896a784dba47570c92e0c8b","sha1":"b9a63da925b40910445d592c70abcf4d23ffe84d","blobstore_id":"3e6fa71a-07f7-446a-96f4-3caceea02f2f"}]},"packages":{"apaas_collector":{"name":"apaas_collector","version":"f294704d51d4517e4df3d8417a3d7c71699bc04d.1","sha1":"5af77ceb01b7995926dbd4ad7481dcb7c3d94faf","blobstore_id":"fa0e96b9-71a6-4828-416e-dde3427a73a9"},"collector":{"name":"collector","version":"ba47450ce83b8f2249b75c79b38397db249df48b.1","sha1":"0bf8ee0d69b3f21cf1878a43a9616cb7e14f6f25","blobstore_id":"722a5455-f7f7-427d-7e8d-e562552857bc"},"common":{"name":"common","version":"99c756b71550530632e393f5189220f170a69647.1","sha1":"90159de912c9bfc71740324f431ddce1a5fede00","blobstore_id":"37be6f28-c340-4899-7fd3-3517606491bb"},"fluentd-0.12.13":{"name":"fluentd-0.12.13","version":"71d8decbba6c863bff6c325f1f8df621a91eb45f.1","sha1":"2bd32b3d3de59e5dbdd77021417359bb5754b1cf","blobstore_id":"7bc81ac6-7c24-4a94-74d1-bb9930b07751"},"metron_agent":{"name":"metron_agent","version":"997d87534f57cad148d56c5b8362b72e726424e4.1","sha1":"a21404c50562de75000d285a02cd43bf098bfdb9","blobstore_id":"6c7cf72c-9ace-40a1-4632-c27946bf631e"},"ruby-2.1.6":{"name":"ruby-2.1.6","version":"41d0100ffa4b21267bceef055bc84dc37527fa35.1","sha1":"8a9867197682cabf2bc784f71c4d904bc479c898","blobstore_id":"536bc527-3225-43f6-7aad-71f36addec80"}},"configuration_hash":"a73c7d06b0257746e95aaa2ca994c11629cbd324","networks":{"private_cf_subnet":{"cloud_properties":{"name":"random","net_id":"1e1c9aca-0b5a-4a8f-836a-54c18c21c9b9","security_groups":["az1_cf_management_secgroup_bosh_cf_ssh_cf2","az1_cf_management_secgroup_cf_private_cf2","az1_cf_management_secgroup_cf_public_cf2"]},"default":["dns","gateway"],"dns":["192.168.110.8","133.162.193.10","133.162.193.9","192.168.110.10"],"dns_record_name":"0.stats-z1.private-cf-subnet.cf-apaas.microbosh","gateway":"192.168.110.11","ip":"192.168.110.204","netmask":"255.255.255.0"}},"resource_pool":{"cloud_properties":{"instance_type":"S-1"},"name":"small_z1","stemcell":{"name":"bosh-openstack-kvm-ubuntu-trusty-go_agent","version":"2989"}},"deployment":"cf-apaas","index":0,"persistent_disk":0,"persistent_disk_pool":null,"rendered_templates_archive":{"sha1":"0ffd89fa41e02888c9f9b09c6af52ea58265a8ec","blobstore_id":"4bd01ae7-a69a-4fe5-932b-d98137585a3b"},"agent_id":"7d3452bd-679e-4a97-8514-63a373a54ffd","bosh_protocol":"1","job_state":"failing","vm":{"name":"vm-12d45510-096d-4b8b-9547-73ea5fda00c2"},"ntp":{"message":"bad
ntp server"}}}


On Wed, Sep 23, 2015 at 5:13 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Please check the file collector/collector.log; it's in a subdirectory of
the unpacked log tarball.

On Wed, Sep 23, 2015 at 12:01 AM, Guangcai Wang <guangcai.wang(a)gmail.com
wrote:
Actually, I checked the two files in the stats_z1 job VM. I did not find
any clues. Attached for reference.

On Wed, Sep 23, 2015 at 4:54 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

If you do "bosh logs stats_z1 0 --job" you will get a tarball of all
the logs for the relevant processes running on the stats_z1/0 VM. You will
likely find some error messages in the collectors stdout or stderr logs.

On Tue, Sep 22, 2015 at 11:30 PM, Guangcai Wang <
guangcai.wang(a)gmail.com> wrote:

It does not help.

I always see the "collector" process bouncing between "running" and
"Does not exist" when I use "monit summary" in a while loop.

Does anyone know how to get the real error when the "collector" process
hasn't actually failed? Thanks.

On Wed, Sep 23, 2015 at 4:11 PM, Tony <Tonyl(a)fast.au.fujitsu.com>
wrote:

My approach is to log in on the stats VM and sudo, then
run "monit status" and restart the failed processes, or simply restart all
processes by running "monit restart all"

wait for a while (5-10 minutes at most)
If there is still a failed process, e.g. collector,
then run ps -ef | grep collector
and kill the processes in the list (you may need to run kill -9
sometimes)

then "monit restart all"

Normally, this will fix the issue "Failed: `XXX' is not running after
update"



--
View this message in context:
http://cf-dev.70369.x6.nabble.com/cf-dev-Error-400007-stats-z1-0-is-not-running-after-update-tp1901p1902.html
Sent from the CF Dev mailing list archive at Nabble.com.


Re: Loggregator/Doppler Syslog Drain Missing Logs

Michael Schwartz
 

That does make sense. Especially since I'm seeing a consistent percentage. At one point, one app was getting 100% throughput and another was only getting 75%. So maybe one of the dopplers didn't have the drain binding.


Re: Loggregator/Doppler Syslog Drain Missing Logs

Michael Schwartz
 

We are running with 2 zones, 2 loggregators in each zone. I thought the same thing. Stopping all but one loggregator showed the same results.

If it helps, yesterday I was seeing 75% of the logs make it through with 4 loggregators and about 90% when I bumped the node count to 8. So it isn't always 50/50.

Also, after shutting down or restarting a node, I see almost 100% of the logs come through at first. Then it slowly degrades back to 50% after a few minutes.


Re: Loggregator/Doppler Syslog Drain Missing Logs

Matthew Sykes <matthew.sykes@...>
 

v210 has quite a few bugs in this area. One fairly major one is a
connection leak [1] in the syslog_drain_binder component. When this
happens, changes to the syslog drain bindings do not make their way into
the doppler servers.

I'd strongly recommend you try to move to a newer release.

[1]:
https://github.com/cloudfoundry/loggregator/commit/b8d14b7fdc65b9d0d4a11cffa6b6f855e4d640ae

On Wed, Sep 23, 2015 at 2:48 PM, Michael Schwartz <mschwartz1411(a)gmail.com>
wrote:

The system is currently running ~200 apps and they all bind to an external
syslog drain.


--
Matthew Sykes
matthew.sykes(a)gmail.com


Re: Loggregator/Doppler Syslog Drain Missing Logs

Erik Jasiak
 

Hi Michael

First question that springs to mind when I see ~50% - how many zones are
you running as part of your setup? ("every other log" sounds like a
round-robin to something dead or misconfigured.)

Have to run but will follow up more soon,
Erik

Michael Schwartz wrote:


The system is currently running ~200 apps and they all bind to an
external syslog drain.


Re: Loggregator/Doppler Syslog Drain Missing Logs

Michael Schwartz
 

The system is currently running ~200 apps and they all bind to an external syslog drain.


Loggregator/Doppler Syslog Drain Missing Logs

Michael Schwartz
 

Loggregator appears to be dropping logs without notice. I'm noticing about 50% of the logs do not make it to our external log service. If I tail the logs using "cf logs ...", all logs are visible. They just never make it to the external drain.

I have seen the "TB: Output channel too full. Dropped 100 messages for app..." message in the doppler logs, and that makes sense for applications producing many logs. What's confusing is that I'm seeing logs missing from apps that produce very little logs (like 10 per minute). If I send curl requests to a test app very slowly, I notice approx. every other log is missing.

We've been using an ELK stack for persisting application logs. All applications bind to a user-provided service containing the syslog drain URL. Logstash does not appear to be the bottleneck here because I've tested other endpoints, such as a netcat listener, and I see the same issue. Even after doubling our logstash server count to 6, I see the exact same drop rate.
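
For reference, the drains are bound the usual user-provided-service way,
along these lines (the endpoint is a placeholder):

cf create-user-provided-service elk-drain -l syslog://logs.example.internal:5000
cf bind-service my-app elk-drain
cf restage my-app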

Our current CF (v210) deployment contains 4 loggregator instances each running doppler, syslog_drain_binder, and metron_agent. I tried bumping the loggregator instance count to 8 and noticed very little improvement.

Monitoring CPU, memory, and disk space on the loggregator nodes shows no abnormalities. CPU is under 5%.

Is this expected behavior?

Thank you.


Re: Introducing CF-Swagger

Deepak Vij
 

Hi Mohamed and Dr. Max, I fully support this effort. Having Swagger-based “Application Interface” capability as part of the overall CF PaaS platform would be very useful for the CF community as a whole. As a matter of fact, I also initiated a similar thread a few months ago on the cf-dev alias (see email text below). Your work exactly matches up with our current thinking.

Having a “Swagger” based “Application Interface” is a very good start along those lines. This opens up lots of other possibilities, such as building out “Deployment Governance” capabilities, not merely for Cloud Foundry API or Services assets but for the whole Application landscape built & deployed within the CF PaaS environment and subsequently exposed as APIs to end consumers.

As described in my earlier email below, “Deployment Governance” as part of overall API Management is what we are striving towards in order to expose comprehensive telecom API Management capabilities within the public cloud environment.

Dr. Max, as I mentioned to you during our brief discussion a few days ago, the “Heroku” folks have a similar initiative ongoing. They have gone the lightweight “JSON” schema route versus Swagger/WADL/RAML etc.

In any case, I am fully in support of your proposal. Thanks.

Regards,
Deepak Vij

=============================
Hi folks, I would like to start a thread on the need for machine-readable “Application Interface” supported at the platform level. Essentially, this interface describes details such as available methods/operations, inputs/outputs data types (schema), application dependencies etc. Any standard specifications language can be used for this purpose, as long as it clearly describes the schema of the requests and responses – one can use Web Application Description Language (WADL), Swagger, RESTful API Modeling Language (RAML), JSON Schema (something like JSON Schema for Heroku Platform APIs) or any other language that provides similar functionality. These specifications are to be automatically derived from the code and are typically part of the application development process (e.g. generated by the build system).

Such functionality can have lots of usage scenarios:

1. First and foremost, Deployment Governance for API Management (our main vested interest) – API Versioning & Backward Compatibility, Dependency Management and many more as part of the comprehensive telecom API Management capabilities which we are currently in the process of building out.

2. Auto-creating client libraries for your favorite programming language.

3. Automatic generation of up-to-date documentation.

4. Writing automatic acceptance and integration tests etc.

From a historical perspective, in the early 2000s when SOA started out, the mindset was to author the application contract-first (the application interface, using WSDL at that time) and subsequently generate and author code from that interface. With the advent of RESTful services, the REST community initially took a stand against such metadata for applications. A number of metadata standards have nonetheless emerged over the last couple of years, mainly fueled by the use case scenarios described earlier.

Based on my knowledge, none of this currently exists within Cloud Foundry at the platform level. It would be highly desirable to have a standard common “application interface” definition at the platform level, agnostic of the underlying application development frameworks.

I hope this all makes sense. I think this is something that could be very relevant to the “Utilities” PMC. I will also copy & paste this text under the “Utilities” PMC notes on GitHub.

I would love to hear from the community on this. Thanks.

Regards,
Deepak Vij

From: Michael Maximilien [mailto:maxim(a)us.ibm.com]
Sent: Friday, September 18, 2015 4:52 PM
To: cf-dev(a)lists.cloudfoundry.org
Cc: Heiko Ludwig; Mohamed Mohamed; Alex Tarpinian; Christopher B Ferris
Subject: [cf-dev] Introducing CF-Swagger

Hi, all,


This email serves two purposes: 1) introduce CF-Swagger, and 2) share the results of the CF service broker compliance survey I sent out a couple of weeks ago.


------
My IBM Research colleague, Mohamed (on cc:), and I have been working on creating Swagger descriptions for some CF APIs.


Our main goal was to explore what useful tools or utilities we could build with these Swagger descriptions once created.


The initial result of this exploratory research is CF-Swagger, which is included in the following:


See presentation here: https://goo.gl/Y16plT
Video demo here: http://goo.gl/C8Nz5p
Temp repo here: https://github.com/maximilien/cf-swagger


The gist of our work and results is:


1. We created a full Swagger description of the CF service broker
2. Using this description you can use the Swagger editor to create neat API docs that are browsable and even callable
3. Using the description you can create client and server stubs for service brokers in a variety of languages, e.g., JS, Java, Ruby, etc.
4. We've extended go-swagger to generate workable client and server stubs for service brokers in Golang. We plan to submit all changes to go-swagger back to that project
5. We've extended go-swagger to generate prototypes of working Ginkgo tests for service brokers
6. We've extended go-swagger to generate a CF service broker Ginkgo Test Compliance Kit (TCK) that anyone could use to validate their broker's compliance with any Swagger-described version of the spec
7. We've created a custom Ginkgo reporter that, when run with TCK, will give you a summary of your compliance, e.g., 100% compliant with v2.5 but 90% compliant with v2.6 due to failing tests X, Y, Z... (in Ginkgo fashion)
8. The survey results (all included in the presentation) indicate that over 50% of respondents believe TCK tests for service brokers would be valuable to them. Many (over 50%) are using custom proprietary tests, and this project may be a way to get everyone to converge on a common set of tests we could all use and improve...


------
We plan to propose this work to become a CF incubator at the next CAB and PMC calls, especially the TCK part for service brokers. The overall approach and project could be useful for other parts of the CF APIs but we will start with CF Service Brokers.


The actual Swagger descriptions should ideally come from the teams who own the APIs, so for service brokers, the CAPI team. We are engaging them, as they have also been looking at improving API docs and descriptions. Maybe there is potential for synergy, and at a minimum we want to make sure what we generate ends up being useful to their pipelines.


Finally, while the repo is temporary and will change, I welcome you to take a look at the presentation, video, and code and let us know your thoughts and feedback.


Thanks for your time and interest.


Mohamed and Max
IBM


Re: Error 400007: `stats_z1/0' is not running after update

Amit Kumar Gupta
 

I often take the following approach to debugging issues like this:

* Open two shell sessions to your failing VM using bosh ssh, and switch to
superuser
* In one session, `watch monit summary`. You might see collector going back
and forth between initializing and not monitored, but please report
anything else of interest you see here
* In the other session, `cd /var/vcap/sys/log` and then `watch
--differences=cumulative ls -altr **/*` to see which files are being
written to while the startup processes are thrashing. Then `tail -f FILE_1
FILE_2 ...` listing all the files that were being written to, and seem
relevant to the thrashing process(es) in monit


On Wed, Sep 23, 2015 at 12:21 AM, Guangcai Wang <guangcai.wang(a)gmail.com>
wrote:

It frequently logs the message below, which doesn't seem helpful.


{"timestamp":1442987404.9433253,"message":"collector.started","log_level":"info","source":"collector","data":{},"thread_id":70132569199380,"fiber_id":70132570371720,"process_id":19392,"file":"/var/vcap/packages/collector/lib/collector/config.rb","lineno":45,"method":"setup_logging"}

The only possible error message from the bosh debug log is
"ntp":{"message":"bad ntp server"}

But I don't think it is related to the failure of the stats_z1 update.

I, [2015-09-23 04:55:59 #2392] [canary_update(stats_z1/0)] INFO --
DirectorJobRunner: Checking if stats_z1/0 has been updated after
63.333333333333336 seconds
D, [2015-09-23 04:55:59 #2392] [canary_update(stats_z1/0)] DEBUG --
DirectorJobRunner: SENT: agent.7d3452bd-679e-4a97-8514-63a373a54ffd
{"method":"get_state","arguments":[],"reply_to":"director.c5b97fc1-b972-47ec-9412-a83ad240823b.473fda64-6ac3-4a53-9ebc-321fc7eabd7a"}
D, [2015-09-23 04:55:59 #2392] [] DEBUG -- DirectorJobRunner: RECEIVED:
director.c5b97fc1-b972-47ec-9412-a83ad240823b.473fda64-6ac3-4a53-9ebc-321fc7eabd7a
{"value":{"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"stats_z1","release":"","template":"fluentd","version":"4c71c87bbf0144428afacd470e2a5e32b91932fc","sha1":"b141c6037d429d732bf3d67f7b79f8d7d80aac5d","blobstore_id":"d8451d63-2e4f-4664-93a8-a77e5419621d","templates":[{"name":"fluentd","version":"4c71c87bbf0144428afacd470e2a5e32b91932fc","sha1":"b141c6037d429d732bf3d67f7b79f8d7d80aac5d","blobstore_id":"d8451d63-2e4f-4664-93a8-a77e5419621d"},{"name":"collector","version":"889b187e2f6adc453c61fd8f706525b60e4b85ed","sha1":"f5ae15a8fa2417bf984513e5c4269f8407a274dc","blobstore_id":"3eeb0166-a75c-49fb-9f28-c29788dbf64d"},{"name":"metron_agent","version":"e6df4c316b71af68dfc4ca476c8d1a4885e82f5b","sha1":"42b6d84ad9368eba0508015d780922a43a86047d","blobstore_id":"e578bfb0-9726-4754-87ae-b54c8940e41a"},{"name":"apaas_collector","version":"8808f0ae627a54706896a784dba47570c92e0c8b","sha1":"b9a63da925b40910445d592c70abcf4d23ffe84d","blobstore_id":"3e6fa71a-07f7-446a-96f4-3caceea02f2f"}]},"packages":{"apaas_collector":{"name":"apaas_collector","version":"f294704d51d4517e4df3d8417a3d7c71699bc04d.1","sha1":"5af77ceb01b7995926dbd4ad7481dcb7c3d94faf","blobstore_id":"fa0e96b9-71a6-4828-416e-dde3427a73a9"},"collector":{"name":"collector","version":"ba47450ce83b8f2249b75c79b38397db249df48b.1","sha1":"0bf8ee0d69b3f21cf1878a43a9616cb7e14f6f25","blobstore_id":"722a5455-f7f7-427d-7e8d-e562552857bc"},"common":{"name":"common","version":"99c756b71550530632e393f5189220f170a69647.1","sha1":"90159de912c9bfc71740324f431ddce1a5fede00","blobstore_id":"37be6f28-c340-4899-7fd3-3517606491bb"},"fluentd-0.12.13":{"name":"fluentd-0.12.13","version":"71d8decbba6c863bff6c325f1f8df621a91eb45f.1","sha1":"2bd32b3d3de59e5dbdd77021417359bb5754b1cf","blobstore_id":"7bc81ac6-7c24-4a94-74d1-bb9930b07751"},"metron_agent":{"name":"metron_agent","version":"997d87534f57cad148d56c5b8362b72e726424e4.1","sha1":"a21404c50562de75000d285a02cd43bf098bfdb9","blobstore_id":"6c7cf72c-9ace-40a1-4632-c27946bf631e"},"ruby-2.1.6":{"name":"ruby-2.1.6","version":"41d0100ffa4b21267bceef055bc84dc37527fa35.1","sha1":"8a9867197682cabf2bc784f71c4d904bc479c898","blobstore_id":"536bc527-3225-43f6-7aad-71f36addec80"}},"configuration_hash":"a73c7d06b0257746e95aaa2ca994c11629cbd324","networks":{"private_cf_subnet":{"cloud_properties":{"name":"random","net_id":"1e1c9aca-0b5a-4a8f-836a-54c18c21c9b9","security_groups":["az1_cf_management_secgroup_bosh_cf_ssh_cf2","az1_cf_management_secgroup_cf_private_cf2","az1_cf_management_secgroup_cf_public_cf2"]},"default":["dns","gateway"],"dns":["192.168.110.8","133.162.193.10","133.162.193.9","192.168.110.10"],"dns_record_name":"0.stats-z1.private-cf-subnet.cf-apaas.microbosh","gateway":"192.168.110.11","ip":"192.168.110.204","netmask":"255.255.255.0"}},"resource_pool":{"cloud_properties":{"instance_type":"S-1"},"name":"small_z1","stemcell":{"name":"bosh-openstack-kvm-ubuntu-trusty-go_agent","version":"2989"}},"deployment":"cf-apaas","index":0,"persistent_disk":0,"persistent_disk_pool":null,"rendered_templates_archive":{"sha1":"0ffd89fa41e02888c9f9b09c6af52ea58265a8ec","blobstore_id":"4bd01ae7-a69a-4fe5-932b-d98137585a3b"},"agent_id":"7d3452bd-679e-4a97-8514-63a373a54ffd","bosh_protocol":"1","job_state":"failing","vm":{"name":"vm-12d45510-096d-4b8b-9547-73ea5fda00c2"},"ntp":{"message":"bad
ntp server"}}}


On Wed, Sep 23, 2015 at 5:13 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Please check the file collector/collector.log; it's in a subdirectory of
the unpacked log tarball.

On Wed, Sep 23, 2015 at 12:01 AM, Guangcai Wang <guangcai.wang(a)gmail.com>
wrote:

Actually, I checked the two files in the stats_z1 job VM. I did not find
any clues. Attached for reference.

On Wed, Sep 23, 2015 at 4:54 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

If you do "bosh logs stats_z1 0 --job" you will get a tarball of all
the logs for the relevant processes running on the stats_z1/0 VM. You will
likely find some error messages in the collectors stdout or stderr logs.

On Tue, Sep 22, 2015 at 11:30 PM, Guangcai Wang <
guangcai.wang(a)gmail.com> wrote:

It does not help.

I always see the "collector" process bouncing between "running" and
"Does not exist" when I use "monit summary" in a while loop.

Does anyone know how to get the real error when the "collector" process
hasn't actually failed? Thanks.

On Wed, Sep 23, 2015 at 4:11 PM, Tony <Tonyl(a)fast.au.fujitsu.com>
wrote:

My approach is to log in on the stats VM and sudo, then
run "monit status" and restart the failed processes, or simply restart all
processes by running "monit restart all"

wait for a while (5-10 minutes at most)
If there is still a failed process, e.g. collector,
then run ps -ef | grep collector
and kill the processes in the list (you may need to run kill -9
sometimes)

then "monit restart all"

Normally, this will fix the issue "Failed: `XXX' is not running after
update"



--
View this message in context:
http://cf-dev.70369.x6.nabble.com/cf-dev-Error-400007-stats-z1-0-is-not-running-after-update-tp1901p1902.html
Sent from the CF Dev mailing list archive at Nabble.com.


Re: Removing support for v1 service brokers

Mike Youngstrom <youngm@...>
 

Thanks Dieu, honestly I was just trying to find an angle to bargain for a
bit more time. :) Three months is generous. But six months would be
glorious. :)

After the CAB call this month we got started converting our brokers over
but our migration is more difficult because we use Service instance
credentials quite a bit and those don't appear to be handled well when
doing "migrate-service-instances". I think we can do 3 months but we'll be
putting our users through a bit of a fire drill.

That said, I'll understand if you stick to 3 months, since we should have
started this conversion long ago.

Mike

On Wed, Sep 23, 2015 at 1:22 AM, Dieu Cao <dcao(a)pivotal.io> wrote:

We've found NATS to be unstable under certain conditions (temporary
network interruptions or network instability) around the client
reconnection logic.
We've seen that it could take anywhere from a few seconds to half an hour
to reconnect properly. We spent a fair amount of time investigating ways to
improve the reconnection logic and have made some improvements but believe
that it's best to work towards not having this dependency.
You can find more about this in the stories in this epic [1].

Mike, in addition to removing the NATS dependency, this will remove the
burden on the team, almost a weekly fight, in terms of maintaining
backwards compatibility for the v1 broker spec any time we work on adding
functionality to the service broker api.
I'll work with the team in the next couple of weeks on specific stories
and I'll link to it here.

[1] https://www.pivotaltracker.com/epic/show/1440790


On Tue, Sep 22, 2015 at 10:07 PM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

Thanks for the announcement.

To be clear, is this announcement to cease support for the old v1 brokers,
or is it to eliminate support for the v1 API in the CC? Does the v1 CC
code depend on NATS? None of my custom v1 brokers depend on NATS.

Mike

On Tue, Sep 22, 2015 at 6:01 PM, Dieu Cao <dcao(a)pivotal.io> wrote:

Hello all,

We plan to remove support for v1 service brokers in about 3 months, in a
cf-release following 12/31/2015.
We are working towards removing CF's dependency on NATS and the v1
service brokers are still dependent on NATS.
Please let me know if you have questions/concerns about this timeline.

I'll be working on verifying a set of steps that you can find here [1]
that document how to migrate your service broker from v1 to v2 and what is
required in order to persist user data and will get that posted to the
service broker api docs officially.

-Dieu
CF CAPI PM

[1]
https://docs.google.com/document/d/1Pl1o7mxtn3Iayq2STcMArT1cJsKkvi4Ey1-d3TB_Nhs/edit?usp=sharing




Re: DEA/Warden staging error

kyle havlovitz <kylehav@...>
 

Here's the output from those commands:
https://gist.github.com/MrEnzyme/36592831b1c46d44f007
Soon after running those I noticed that the container loses its IPv4
address shortly after coming up and ifconfig looks like this:

root(a)cf-build:/home/cloud-user/test# ifconfig -a
docker0 Link encap:Ethernet HWaddr 56:84:7a:fe:97:99
inet addr:172.17.42.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
eth0 Link encap:Ethernet HWaddr fa:16:3e:cd:f3:0a
inet addr:172.25.1.52 Bcast:172.25.1.127 Mask:255.255.255.128
inet6 addr: fe80::f816:3eff:fecd:f30a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:515749 errors:0 dropped:0 overruns:0 frame:0
TX packets:295471 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1162366659 (1.1 GB) TX bytes:59056756 (59.0 MB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:45057315 errors:0 dropped:0 overruns:0 frame:0
TX packets:45057315 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:18042315375 (18.0 GB) TX bytes:18042315375 (18.0 GB)
w-190db6c54la-0 Link encap:Ethernet HWaddr 12:dc:ba:da:38:5b
inet6 addr: fe80::10dc:baff:feda:385b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1454 Metric:1
RX packets:12 errors:0 dropped:0 overruns:0 frame:0
TX packets:227 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:872 (872.0 B) TX bytes:35618 (35.6 KB)

Any idea what would be causing that?


On Tue, Sep 22, 2015 at 10:31 PM, Matthew Sykes <matthew.sykes(a)gmail.com>
wrote:

Based on your description, it doesn't sound like warden networking or the
warden iptables chains are your problem. Are you able to share all of your
routes and chains via a gist?

route -n
ifconfig -a
iptables -L -n -v -t filter
iptables -L -n -v -t nat
iptables -L -n -v -t mangle

Any kernel messages that look relevant in the message buffer (dmesg)?

Have you tried doing a network capture to verify the packets look the
way you expect? Are you sure your host routing rules are good? Do the
warden subnets overlap with any network accessible to the host?

Based on previous notes, it doesn't sound like this is a standard
deployment so it's hard to say what could be impacting you.

On Tue, Sep 22, 2015 at 1:08 PM, Kyle Havlovitz (kyhavlov) <
kyhavlov(a)cisco.com> wrote:

I didn’t; I’m still having this problem. Even adding this lenient
security group didn’t let me get any traffic out of the VM:

[{"name":"allow_all","rules":[{"protocol":"all","destination":"0.0.0.0/0
"},{"protocol":"tcp","destination":"0.0.0.0/0
","ports":"1-65535"},{"protocol":"udp","destination":"0.0.0.0/0
","ports":"1-65535"}]}]

The only way I was able to get traffic out was by manually removing the
reject/drop iptables rules that warden set up, and even with that the
container still lost all connectivity after 30 seconds.

From: CF Runtime <cfruntime(a)gmail.com>
Reply-To: "Discussions about Cloud Foundry projects and the system
overall." <cf-dev(a)lists.cloudfoundry.org>
Date: Tuesday, September 22, 2015 at 12:50 PM
To: "Discussions about Cloud Foundry projects and the system overall." <
cf-dev(a)lists.cloudfoundry.org>
Subject: [cf-dev] Re: Re: Re: Re: Re: Re: Re: Re: DEA/Warden staging
error

Hey Kyle,

Did you make any progress?

Zak & Mikhail
CF Release Integration Team

On Thu, Sep 17, 2015 at 10:28 AM, CF Runtime <cfruntime(a)gmail.com> wrote:

It certainly could be. By default the containers reject all egress
traffic. CC security groups configure iptables rules that allow traffic
out.

One of the default security groups in the BOSH templates allows access
on port 53. If you have no security groups, the containers will not be able
to make any outgoing requests.

Joseph & Natalie
CF Release Integration Team

On Thu, Sep 17, 2015 at 8:44 AM, Kyle Havlovitz (kyhavlov) <
kyhavlov(a)cisco.com> wrote:

On running git clone inside the container via the warden shell, I get:
"Cloning into 'staticfile-buildpack'...
fatal: unable to access '
https://github.com/cloudfoundry/staticfile-buildpack/': Could not
resolve host: github.com".
So the container can't get to anything outside of it (I also tried
pinging some external IPs to make sure it wasn't a DNS thing). Would this
be caused by cloud controller security group settings?

--
Matthew Sykes
matthew.sykes(a)gmail.com


Re: Curious why CF UAA uses DNS

Filip Hanik
 

hi Anna,

Can you elaborate a little on what you are referring to?
I'm not quite sure what you are asking.

Filip


On Wed, Sep 23, 2015 at 8:23 AM, Anna Muravieva <ana-mur21s(a)yandex.ru>
wrote:

Hello,

We are using the CF product in development. The question relates to UAA;
any help with our research would be much appreciated. What are the
benefits of CF UAA using DNS for route management, as opposed to checking
identity, for instance, via a request header?

Thanks in advance,
Anna


Curious why CF UAA uses DNS

Anna Muravieva
 

Hello,

We are using the CF product in development. The question relates to UAA; any help with our research would be much appreciated. What are the benefits of CF UAA using DNS for route management, as opposed to checking identity, for instance, via a request header?

Thanks in advance,
Anna