
Re: Error 400007: `stats_z1/0' is not running after update

iamflying
 

Actually, I checked the two files on the stats_z1 job VM and did not find any
clues. Both are attached for reference.

On Wed, Sep 23, 2015 at 4:54 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

If you do "bosh logs stats_z1 0 --job" you will get a tarball of all the
logs for the relevant processes running on the stats_z1/0 VM. You will
likely find some error messages in the collector's stdout or stderr logs.

On Tue, Sep 22, 2015 at 11:30 PM, Guangcai Wang <guangcai.wang(a)gmail.com>
wrote:

It does not help.

I always see the "collector" process bouncing between "running" and "does
not exist" when I run "monit summary" in a while loop.

Does anyone know how to get the real error when the "collector" process is
not reported as failed? Thanks.

On Wed, Sep 23, 2015 at 4:11 PM, Tony <Tonyl(a)fast.au.fujitsu.com> wrote:

My approach is to log in to the stats VM and sudo, then
run "monit status" and restart the failed processes, or simply restart all
processes by running "monit restart all".

Wait for a while (5~10 minutes at most).
If there is still a failed process, e.g. collector,
then run ps -ef | grep collector
and kill the processes in the list (you may sometimes need to run
kill -9).

Then run "monit restart all" again.

Normally, this fixes the issue "Failed: `XXX' is not running after
update".



--
View this message in context:
http://cf-dev.70369.x6.nabble.com/cf-dev-Error-400007-stats-z1-0-is-not-running-after-update-tp1901p1902.html
Sent from the CF Dev mailing list archive at Nabble.com.


Re: Error 400007: `stats_z1/0' is not running after update

Amit Kumar Gupta
 

If you do "bosh logs stats_z1 0 --job" you will get a tarball of all the
logs for the relevant processes running on the stats_z1/0 VM. You will
likely find some error messages in the collector's stdout or stderr logs.
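
Once the tarball is downloaded, it can be searched without unpacking it by hand. The following is a minimal sketch, assuming the archive contains log files whose paths include the word "collector"; the function name, the name hint, and the error pattern are illustrative choices for this note, not part of BOSH.

```python
import re
import tarfile

# Illustrative pattern only; adjust it to whatever the logs actually contain.
ERROR_RE = re.compile(r"error|exception|fatal", re.IGNORECASE)


def find_error_lines(tarball_path, name_hint="collector"):
    """Return a list of (member_name, line_number, line) for suspicious lines."""
    hits = []
    with tarfile.open(tarball_path, "r:*") as archive:
        for member in archive.getmembers():
            # Only look at regular files whose path mentions the hint.
            if not member.isfile() or name_hint not in member.name:
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            for number, raw in enumerate(handle, start=1):
                line = raw.decode("utf-8", errors="replace").rstrip("\n")
                if ERROR_RE.search(line):
                    hits.append((member.name, number, line))
    return hits
```

Calling find_error_lines with the path of the downloaded tarball returns the matching lines together with the file and line number they came from.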

On Tue, Sep 22, 2015 at 11:30 PM, Guangcai Wang <guangcai.wang(a)gmail.com>
wrote:

It does not help.

I always see the "collector" process bouncing between "running" and "does
not exist" when I run "monit summary" in a while loop.

Does anyone know how to get the real error when the "collector" process is
not reported as failed? Thanks.

On Wed, Sep 23, 2015 at 4:11 PM, Tony <Tonyl(a)fast.au.fujitsu.com> wrote:

My approach is to log in to the stats VM and sudo, then
run "monit status" and restart the failed processes, or simply restart all
processes by running "monit restart all".

Wait for a while (5~10 minutes at most).
If there is still a failed process, e.g. collector,
then run ps -ef | grep collector
and kill the processes in the list (you may sometimes need to run
kill -9).

Then run "monit restart all" again.

Normally, this fixes the issue "Failed: `XXX' is not running after
update".





Re: Introducing CF-Swagger

Guillaume Berche
 

Thanks Mohamed and Max for sharing this great work. Besides supporting
an official TCK, the cf-swagger repo seems great for easing the delivery of
acceptance tests as part of a service broker release (e.g. scheduled
through bosh errands).

+1 for a formal description of CF APIs allowing (partly?) automated client
generation, and lowering the maintenance burden w.r.t. existing CC API v2
manually maintained clients (e.g. cf-java-client, go-cfclient, nodejs, php
clients...). I had also suggested swagger for consideration in the CC API
v3 [1].

It seems the CAPI team was initially considering Swagger as a documentation
medium for CC API v3 in [2]. Dieu, would it be possible to share the "Doc
of comparisons of pros and cons of different options" at [3], which does not
yet seem to be public?

Thanks,

Guillaume.

[1] https://github.com/cloudfoundry/cc-api-v3-style-guide/issues/46
[2] https://www.pivotaltracker.com/n/projects/966314/stories/99237980
[3]
https://docs.google.com/a/pivotal.io/document/d/1aVOZfd0n7BOLuJvK0_Sgie9Y3D7GT6NUF4V-bVG-BCs/edit?usp=sharing

On Tue, Sep 22, 2015 at 9:12 PM, Michael Maximilien <maxim(a)us.ibm.com>
wrote:

Since I know various folks are looking at better API docs, I went ahead
and did some quick investigation into what other kinds of doc formats could
be generated from Swagger.

Found a bunch, but experimented with Swagger2Markup
<https://github.com/Swagger2Markup/swagger2markup> and was able to
generate the following from the Service Broker Swagger definition here:
https://github.com/maximilien/cf-swagger/blob/master/descriptions/cloudfoundry/service_broker/service_broker.json

1. AsciiDoc:
https://github.com/maximilien/cf-swagger/tree/master/markup/cloudfoundry/service_broker/assciidoc
2. GitHub Markdown:
https://github.com/maximilien/cf-swagger/tree/master/markup/cloudfoundry/service_broker/markdown

These are generated from the JSON above without any customization or
changes.

Best,

------
dr.max
ibm cloud labs
silicon valley, ca
maximilien.org


*Michael Maximilien/Almaden/IBM*

09/18/2015 04:51 PM
To
cf-dev(a)lists.cloudfoundry.org
cc
Mohamed Mohamed/Almaden/IBM(a)ibmus, Christopher B Ferris/Waltham/IBM(a)ibmus,
Alex Tarpinian/Austin/IBM(a)ibmus, Heiko Ludwig/Watson/IBM(a)ibmus
Subject
Introducing CF-Swagger




Hi, all,

This email serves two purposes: 1) introduce CF-Swagger, and 2) share the
results of the CF service broker compliance survey I sent out a couple of
weeks ago.

------
My IBM Research colleague, Mohamed (on cc:), and I have been working on
creating Swagger descriptions for some CF APIs.

Our main goal was to explore what useful tools or utilities we could build
with these Swagger descriptions once created.

The initial result of this exploratory research is CF-Swagger, which is
described in the following:

See presentation here: https://goo.gl/Y16plT
Video demo here: http://goo.gl/C8Nz5p
Temp repo here: https://github.com/maximilien/cf-swagger

The gist of our work and results is:

1. We created a full Swagger description of the CF service broker
2. Using this description you can use the Swagger editor to create neat
API docs that are browsable and even callable
3. Using the description you can create client and server stubs for
service brokers in a variety of languages, e.g., JS, Java, Ruby, etc.
4. We've extended go-swagger to generate workable client and server stubs
for service brokers in Golang. We plan to submit all changes to go-swagger
back to that project
5. We've extended go-swagger to generate prototypes of working Ginkgo
tests for service brokers
6. We've extended go-swagger to generate a CF service broker Ginkgo Test
Compliance Kit (TCK) that anyone could use to validate their broker's
compliance with any Swagger-described version of the spec
7. We've created a custom Ginkgo reporter that, when run with the TCK, will
give you a summary of your compliance, e.g., 100% compliant with v2.5 but 90%
compliant with v2.6 due to failing tests X, Y, Z... (in Ginkgo fashion)
8. The survey results (all included in the presentation) indicate that
over 50% of respondents believe TCK tests for service brokers would be
valuable to them. Many (over 50%) are using custom proprietary tests, and
this project may be a way to get everyone to converge on a common set of
tests we could all use and improve...
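
To make the compliance summary in item 7 concrete, here is a rough sketch of the arithmetic behind a line such as "90% compliant with v2.6". It is written in Python for brevity and is not the Ginkgo reporter from the repo; the function name and the input shape (a mapping from spec version to per-test pass/fail results) are assumptions made for this illustration.

```python
def compliance_summary(results):
    """Summarize pass rates per spec version.

    `results` maps a spec version to a {test_name: passed} dict.
    Returns {version: (percent_passed, sorted_failed_test_names)}.
    """
    summary = {}
    for version, tests in results.items():
        failed = sorted(name for name, passed in tests.items() if not passed)
        total = len(tests)
        # Guard against an empty test set to avoid dividing by zero.
        percent = 100.0 * (total - len(failed)) / total if total else 0.0
        summary[version] = (percent, failed)
    return summary
```

A broker that passes every test for one spec version and nine of ten for another would be reported as 100.0 and 90.0 respectively, with the failing test names listed alongside.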

------
We plan to propose this work to become a CF incubator at the next CAB and
PMC calls, especially the TCK part for service brokers. The overall
approach and project could be useful for other parts of the CF APIs but we
will start with CF Service Brokers.

The actual Swagger descriptions should ideally come from the teams who own
the APIs. So for service brokers, the CAPI team. We are engaging them as
they have also been looking at improving API docs and descriptions. Maybe
there is potential for synergies and, at a minimum, for making sure what we
generate ends up being useful to their pipelines.

Finally, while the repo is temporary and will change, I welcome you to
take a look at the presentation, video, and code and let us know your
thoughts and feedback.

Thanks for your time and interest.

Mohamed and Max
IBM


Re: Error 400007: `stats_z1/0' is not running after update

iamflying
 

It does not help.

I always see the "collector" process bouncing between "running" and "does
not exist" when I run "monit summary" in a while loop.

Does anyone know how to get the real error when the "collector" process is
not reported as failed? Thanks.
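
For anyone who wants to capture the bouncing more systematically than with a shell loop, here is a minimal sketch. It assumes that "monit summary" prints one line per process in the form Process 'name' followed by a status, that "monit" is on the PATH, and that the script runs as root on the VM; the function names are made up for this note.

```python
import re
import subprocess
import time

# Assumed line shape of `monit summary`, e.g.:
#   Process 'collector'                 running
LINE_RE = re.compile(r"^Process\s+'(?P<name>[^']+)'\s+(?P<status>.+?)\s*$")


def parse_monit_summary(text):
    """Return {process_name: lowercased_status} for every Process line."""
    statuses = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            statuses[match.group("name")] = match.group("status").lower()
    return statuses


def record_transitions(samples, process):
    """Given successive parsed samples, list the (old, new) status changes."""
    transitions = []
    previous = None
    for sample in samples:
        current = sample.get(process)
        if previous is not None and current != previous:
            transitions.append((previous, current))
        previous = current
    return transitions


def poll(process="collector", interval=2.0, rounds=30):
    """Poll monit repeatedly and return the observed status changes."""
    samples = []
    for _ in range(rounds):
        result = subprocess.run(
            ["monit", "summary"], capture_output=True, text=True
        )
        samples.append(parse_monit_summary(result.stdout))
        time.sleep(interval)
    return record_transitions(samples, process)
```

The list of transitions shows how often the process flips and between which states, which can then be lined up against timestamps in the process logs.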

On Wed, Sep 23, 2015 at 4:11 PM, Tony <Tonyl(a)fast.au.fujitsu.com> wrote:

My approach is to log in to the stats VM and sudo, then
run "monit status" and restart the failed processes, or simply restart all
processes by running "monit restart all".

Wait for a while (5~10 minutes at most).
If there is still a failed process, e.g. collector,
then run ps -ef | grep collector
and kill the processes in the list (you may sometimes need to run
kill -9).

Then run "monit restart all" again.

Normally, this fixes the issue "Failed: `XXX' is not running after
update".





Re: Error 400007: `stats_z1/0' is not running after update

Tony
 

My approach is to log in to the stats VM and sudo, then
run "monit status" and restart the failed processes, or simply restart all
processes by running "monit restart all".

Wait for a while (5~10 minutes at most).
If there is still a failed process, e.g. collector,
then run ps -ef | grep collector
and kill the processes in the list (you may sometimes need to run kill -9).

Then run "monit restart all" again.

Normally, this fixes the issue "Failed: `XXX' is not running after update".
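
The "ps -ef | grep collector" step can be sketched as a small parser that picks out the PIDs to kill while skipping the grep process itself. This is only an illustration that assumes the usual eight-column "ps -ef" layout (UID PID PPID C STIME TTY TIME CMD); the function name is hypothetical, and the actual kill is deliberately left to the operator.

```python
def matching_pids(ps_output, needle="collector"):
    """Return PIDs of `ps -ef` rows whose command mentions `needle`."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # Split into at most 8 fields so the command keeps its spaces.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        pid, command = fields[1], fields[7]
        # Ignore the grep process that a shell pipeline would also list.
        if needle in command and not command.startswith("grep "):
            pids.append(int(pid))
    return pids
```

The returned PIDs are the candidates for kill (or kill -9 if they do not exit) before running "monit restart all" again.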





Error 400007: `stats_z1/0' is not running after update

iamflying
 

Hi all,

I am installing cf 212 with stemcell 2989. However, I got a failure on
stats_z1.

Started preparing configuration > Binding configuration. Done (00:00:04)

Started updating job ha_proxy_z1 > ha_proxy_z1/0 (canary). Done (00:00:55)
Started updating job nats_z1 > nats_z1/0 (canary). Done (00:00:56)
Started updating job etcd_z1 > etcd_z1/0 (canary). Done (00:01:35)
Started updating job stats_z1 > stats_z1/0 (canary). Failed: `stats_z1/0' is not running after update (00:10:26)

Error 400007: `stats_z1/0' is not running after update.

I checked the debug log but could not find any useful information.
It is attached for reference.

I also checked the logs on the stats_z1 job VM, but I could not find any
error message there.

Any suggestions on how to investigate the issue? Thanks.


Re: Removing support for v1 service brokers

Mike Youngstrom
 

Thanks for the announcement.

To be clear, is this announcement to cease support for the old v1 brokers, or
is it to eliminate support for the v1 API in the CC? Does the v1 CC code
depend on NATS? None of my custom v1 brokers depend on NATS.

Mike

On Tue, Sep 22, 2015 at 6:01 PM, Dieu Cao <dcao(a)pivotal.io> wrote:

Hello all,

We plan to remove support for v1 service brokers in about 3 months, in a
cf-release following 12/31/2015.
We are working towards removing CF's dependency on NATS and the v1 service
brokers are still dependent on NATS.
Please let me know if you have questions/concerns about this timeline.

I'll be working on verifying a set of steps, which you can find here [1],
that document how to migrate your service broker from v1 to v2 and what is
required in order to persist user data. I will get that posted to the
service broker API docs officially.

-Dieu
CF CAPI PM

[1]
https://docs.google.com/document/d/1Pl1o7mxtn3Iayq2STcMArT1cJsKkvi4Ey1-d3TB_Nhs/edit?usp=sharing




Re: F5 Load Balancer Configuration for Cloud Foundry Loggregator

Mike Youngstrom
 

If you are sharing a VIP for HTTP and websocket traffic, then 443 would be
correct. But Anthony, you can try creating a layer 4 virtual server on 4443
that goes to the same pool on the back end, and configure the CC to use that
port instead for loggregator connections.

Mike

On Tue, Sep 22, 2015 at 10:32 PM, Johannes Hiemer <jvhiemer(a)gmail.com>
wrote:

Are you sure your loggregator endpoint is configured on 443 and not 4443?



On 23.09.2015, at 05:26, Anthony <lee.apc(a)gmail.com> wrote:

Yep. --recent works. Other cf commands and cf curl also work.

It definitely is the websockets for loggregator. Just not sure what the
right config for F5 (version 10.4) should be.

Regards,
Anthony

On Sep 22, 2015, at 9:53 PM, Rohit Kumar <rokumar(a)pivotal.io> wrote:

Does `cf logs --recent` work for you? The recent logs request goes over
HTTP. If that goes through that means only the websocket requests to
loggregator servers are a problem.

Rohit

On Tue, Sep 22, 2015 at 8:18 PM, Anthony <lee.apc(a)gmail.com> wrote:

Thanks Mike! Unfortunately, upgrading is not an option since it's a really
loaded enterprise device. The interesting part is that there is a
similarly set up websockets VIP (a plain old server, .NET I think) that is
working on the same device.

We'll work with our network folks to find other devices with newer
software we can use.

Would appreciate if anyone has other ideas?

Regards,
Anthony

On Sep 22, 2015, at 7:49 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

We are running 11.4 and 11.6. I'd give an upgrade a try before digging
too much deeper.

Mike
On Sep 22, 2015 6:36 PM, "Anthony" <lee.apc(a)gmail.com> wrote:

The version we are testing in is 10.4.

Regards,
Anthony

On Sep 22, 2015, at 6:41 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

What version of F5 software are you running?

Mike

On Tue, Sep 22, 2015 at 5:20 PM, Anthony Lee <lee.apc(a)gmail.com> wrote:

Does any one have any experience configuring F5 load balancers in front
of the CF routers? We have configured F5 and app https and cf push requests
are working fine. However, the connectivity with loggregator is not
working. Taking a look at the documentation, it requires "websocket
support" on the load balancer. We've done the configuration specified here:

https://support.f5.com/kb/en-us/solutions/public/14000/800/sol14814.html

With the following irule basically, applying the default TCP profile if
it detects websocket traffic:

when HTTP_REQUEST {
if { [string tolower [HTTP::header Upgrade]] contains "websocket" }{
HTTP::disable
}
}
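
For readers less familiar with iRules, the condition in the rule quoted above can be restated in Python. This is only an illustration of the header test (lowercase the Upgrade header and look for "websocket"); it is not something that runs on the F5, and the function name is made up for this note.

```python
def is_websocket_upgrade(headers):
    """Mirror the iRule test on a dict of HTTP request headers.

    Header names are matched case-insensitively, and the value is
    lowercased before checking that it contains "websocket".
    """
    for name, value in headers.items():
        if name.lower() == "upgrade":
            return "websocket" in value.lower()
    return False
```

In the iRule, a request for which this condition holds has HTTP processing disabled, so the rest of the connection is passed through as plain TCP.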

However, we are running into errors. Doing `cf logs myapp1` yields:

Error dialing loggregator server: read tcp <ip redacted>:443:
connection reset by peer.
Please ask your Cloud Foundry Operator to check the platform
configuration (loggregator endpoint is wss://loggregator.<sys domain
redacted>:443).

Does anyone have a clue?

Thanks!
Anthony


Re: F5 Load Balancer Configuration for Cloud Foundry Loggregator

Johannes Hiemer <jvhiemer@...>
 

Are you sure your loggregator endpoint is configured on 443 and not 4443?

On 23.09.2015, at 05:26, Anthony <lee.apc(a)gmail.com> wrote:

Yep. --recent works. Other cf commands and cf curl also work.

It definitely is the websockets for loggregator. Just not sure what the right config for F5 (version 10.4) should be.

Regards,
Anthony

On Sep 22, 2015, at 9:53 PM, Rohit Kumar <rokumar(a)pivotal.io> wrote:

Does `cf logs --recent` work for you? The recent logs request goes over HTTP. If that goes through that means only the websocket requests to loggregator servers are a problem.

Rohit

On Tue, Sep 22, 2015 at 8:18 PM, Anthony <lee.apc(a)gmail.com> wrote:
Thanks Mike! Unfortunately, upgrading is not an option since it's a really loaded enterprise device. The interesting part is that there is a similarly set up websockets VIP (a plain old server, .NET I think) that is working on the same device.

We'll work with our network folks to find other devices with newer software we can use.

Would appreciate if anyone has other ideas?

Regards,
Anthony

On Sep 22, 2015, at 7:49 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

We are running 11.4 and 11.6. I'd give an upgrade a try before digging too much deeper.

Mike

On Sep 22, 2015 6:36 PM, "Anthony" <lee.apc(a)gmail.com> wrote:
The version we are testing in is 10.4.

Regards,
Anthony

On Sep 22, 2015, at 6:41 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

What version of F5 software are you running?

Mike

On Tue, Sep 22, 2015 at 5:20 PM, Anthony Lee <lee.apc(a)gmail.com> wrote:
Does any one have any experience configuring F5 load balancers in front of the CF routers? We have configured F5 and app https and cf push requests are working fine. However, the connectivity with loggregator is not working. Taking a look at the documentation, it requires "websocket support" on the load balancer. We've done the configuration specified here:

https://support.f5.com/kb/en-us/solutions/public/14000/800/sol14814.html

With the following irule basically, applying the default TCP profile if it detects websocket traffic:

when HTTP_REQUEST {
if { [string tolower [HTTP::header Upgrade]] contains "websocket" }{
HTTP::disable
}
}

However, we are running into errors. Doing `cf logs myapp1` yields:

Error dialing loggregator server: read tcp <ip redacted>:443: connection reset by peer.
Please ask your Cloud Foundry Operator to check the platform configuration (loggregator endpoint is wss://loggregator.<sys domain redacted>:443).

Does anyone have a clue?

Thanks!
Anthony


Re: F5 Load Balancer Configuration for Cloud Foundry Loggregator

Anthony
 

Yep. --recent works. Other cf commands and cf curl also work.

It definitely is the websockets for loggregator. Just not sure what the right config for F5 (version 10.4) should be.

Regards,
Anthony

On Sep 22, 2015, at 9:53 PM, Rohit Kumar <rokumar(a)pivotal.io> wrote:

Does `cf logs --recent` work for you? The recent logs request goes over HTTP. If that goes through that means only the websocket requests to loggregator servers are a problem.

Rohit

On Tue, Sep 22, 2015 at 8:18 PM, Anthony <lee.apc(a)gmail.com> wrote:
Thanks Mike! Unfortunately, upgrading is not an option since it's a really loaded enterprise device. The interesting part is that there is a similarly set up websockets VIP (a plain old server, .NET I think) that is working on the same device.

We'll work with our network folks to find other devices with newer software we can use.

Would appreciate if anyone has other ideas?

Regards,
Anthony

On Sep 22, 2015, at 7:49 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

We are running 11.4 and 11.6. I'd give an upgrade a try before digging too much deeper.

Mike

On Sep 22, 2015 6:36 PM, "Anthony" <lee.apc(a)gmail.com> wrote:
The version we are testing in is 10.4.

Regards,
Anthony

On Sep 22, 2015, at 6:41 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

What version of F5 software are you running?

Mike

On Tue, Sep 22, 2015 at 5:20 PM, Anthony Lee <lee.apc(a)gmail.com> wrote:
Does any one have any experience configuring F5 load balancers in front of the CF routers? We have configured F5 and app https and cf push requests are working fine. However, the connectivity with loggregator is not working. Taking a look at the documentation, it requires "websocket support" on the load balancer. We've done the configuration specified here:

https://support.f5.com/kb/en-us/solutions/public/14000/800/sol14814.html

With the following irule basically, applying the default TCP profile if it detects websocket traffic:

when HTTP_REQUEST {
if { [string tolower [HTTP::header Upgrade]] contains "websocket" }{
HTTP::disable
}
}

However, we are running into errors. Doing `cf logs myapp1` yields:

Error dialing loggregator server: read tcp <ip redacted>:443: connection reset by peer.
Please ask your Cloud Foundry Operator to check the platform configuration (loggregator endpoint is wss://loggregator.<sys domain redacted>:443).

Does anyone have a clue?

Thanks!
Anthony


Re: F5 Load Balancer Configuration for Cloud Foundry Loggregator

Rohit Kumar
 

Does `cf logs --recent` work for you? The recent logs request goes over
HTTP. If that goes through that means only the websocket requests to
loggregator servers are a problem.
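
To illustrate the difference, a websocket connection starts as an ordinary HTTP request carrying Upgrade headers, and the server answers with an accept token derived from the client's key as defined in RFC 6455. The sketch below builds such a request and computes the expected token; the host and path passed in would be whatever the loggregator endpoint is, and the function names are hypothetical helpers for this note.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_upgrade_request(host, path, key):
    """Return the raw HTTP/1.1 request that opens a websocket handshake."""
    lines = [
        "GET %s HTTP/1.1" % path,
        "Host: %s" % host,
        "Upgrade: websocket",
        "Connection: Upgrade",
        "Sec-WebSocket-Key: %s" % key,
        "Sec-WebSocket-Version: 13",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


def expected_accept(key):
    """Compute the Sec-WebSocket-Accept value the server should send back."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

If the load balancer resets the connection before a "101 Switching Protocols" response with the matching accept value comes back, the problem is in how the Upgrade request is handled rather than in plain HTTP routing.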

Rohit

On Tue, Sep 22, 2015 at 8:18 PM, Anthony <lee.apc(a)gmail.com> wrote:

Thanks Mike! Unfortunately, upgrading is not an option since it's a really
loaded enterprise device. The interesting part is that there is a
similarly set up websockets VIP (a plain old server, .NET I think) that is
working on the same device.

We'll work with our network folks to find other devices with newer
software we can use.

Would appreciate if anyone has other ideas?

Regards,
Anthony

On Sep 22, 2015, at 7:49 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

We are running 11.4 and 11.6. I'd give an upgrade a try before digging
too much deeper.

Mike
On Sep 22, 2015 6:36 PM, "Anthony" <lee.apc(a)gmail.com> wrote:

The version we are testing in is 10.4.

Regards,
Anthony

On Sep 22, 2015, at 6:41 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

What version of F5 software are you running?

Mike

On Tue, Sep 22, 2015 at 5:20 PM, Anthony Lee <lee.apc(a)gmail.com> wrote:

Does any one have any experience configuring F5 load balancers in front
of the CF routers? We have configured F5 and app https and cf push requests
are working fine. However, the connectivity with loggregator is not
working. Taking a look at the documentation, it requires "websocket
support" on the load balancer. We've done the configuration specified here:

https://support.f5.com/kb/en-us/solutions/public/14000/800/sol14814.html

With the following irule basically, applying the default TCP profile if
it detects websocket traffic:

when HTTP_REQUEST {
if { [string tolower [HTTP::header Upgrade]] contains "websocket" }{
HTTP::disable
}
}

However, we are running into errors. Doing `cf logs myapp1` yields:

Error dialing loggregator server: read tcp <ip redacted>:443: connection
reset by peer.
Please ask your Cloud Foundry Operator to check the platform
configuration (loggregator endpoint is wss://loggregator.<sys domain
redacted>:443).

Does anyone have a clue?

Thanks!
Anthony


Re: DEA/Warden staging error

Matthew Sykes <matthew.sykes@...>
 

Based on your description, it doesn't sound like warden networking or the
warden iptables chains are your problem. Are you able to share all of your
routes and chains via a gist?

route -n
ifconfig -a
iptables -L -n -v -t filter
iptables -L -n -v -t nat
iptables -L -n -v -t mangle

Any kernel messages that look relevant in the message buffer (dmesg)?

Have you tried doing a network capture to verify the packets look the
way you expect? Are you sure your host routing rules are good? Do the
warden subnets overlap with any network accessible to the host?
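
The subnet overlap question can be checked mechanically once the warden network and the host's networks are known (for example from "route -n" and "ifconfig -a"). A minimal sketch using Python's standard ipaddress module follows; the function name is made up here, and any CIDR values passed in are whatever your deployment uses.

```python
import ipaddress


def overlapping_networks(warden_cidr, host_cidrs):
    """Return the host CIDRs that overlap the warden container network."""
    warden = ipaddress.ip_network(warden_cidr, strict=False)
    overlaps = []
    for cidr in host_cidrs:
        # strict=False tolerates addresses with host bits set, e.g. 10.0.0.5/8.
        if warden.overlaps(ipaddress.ip_network(cidr, strict=False)):
            overlaps.append(cidr)
    return overlaps
```

Any CIDR returned here is a network the host can reach that collides with the container range, which is worth ruling out before digging further into iptables.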

Based on previous notes, it doesn't sound like this is a standard
deployment so it's hard to say what could be impacting you.

On Tue, Sep 22, 2015 at 1:08 PM, Kyle Havlovitz (kyhavlov) <
kyhavlov(a)cisco.com> wrote:

I didn’t; I’m still having this problem. Even adding this lenient security
group didn’t let me get any traffic out of the VM:

[{"name":"allow_all","rules":[
{"protocol":"all","destination":"0.0.0.0/0"},
{"protocol":"tcp","destination":"0.0.0.0/0","ports":"1-65535"},
{"protocol":"udp","destination":"0.0.0.0/0","ports":"1-65535"}]}]

The only way I was able to get traffic out was by manually removing the
reject/drop iptables rules that warden set up, and even with that the
container still lost all connectivity after 30 seconds.
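
Since mail clients tend to wrap long JSON lines, it can help to generate a security group like the one above from data and sanity-check it before applying it. The sketch below rebuilds the quoted allow_all rules and flags stray whitespace in destinations; the check that tcp/udp rules carry a ports field simply mirrors the shape of the quoted rules and is an assumption, and the function names are illustrative.

```python
import json


def build_allow_all_group():
    """Build the allow_all group quoted above as plain Python data."""
    rules = [
        {"protocol": "all", "destination": "0.0.0.0/0"},
        {"protocol": "tcp", "destination": "0.0.0.0/0", "ports": "1-65535"},
        {"protocol": "udp", "destination": "0.0.0.0/0", "ports": "1-65535"},
    ]
    return [{"name": "allow_all", "rules": rules}]


def find_rule_problems(groups):
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for group in groups:
        for index, rule in enumerate(group.get("rules", [])):
            label = "%s rule %d" % (group.get("name", "?"), index)
            destination = rule.get("destination", "")
            # Line wrapping can leave newlines or spaces inside the string.
            if destination != destination.strip():
                problems.append(label + ": whitespace in destination")
            if rule.get("protocol") in ("tcp", "udp") and "ports" not in rule:
                problems.append(label + ": tcp/udp rule without ports")
    return problems


def to_json(groups):
    """Serialize compactly, the way the group would be written to a file."""
    return json.dumps(groups, separators=(",", ":"))
```

Running find_rule_problems on the parsed contents of the file actually given to the cloud controller would show whether wrapped lines slipped into the destinations.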

From: CF Runtime <cfruntime(a)gmail.com>
Reply-To: "Discussions about Cloud Foundry projects and the system
overall." <cf-dev(a)lists.cloudfoundry.org>
Date: Tuesday, September 22, 2015 at 12:50 PM
To: "Discussions about Cloud Foundry projects and the system overall." <
cf-dev(a)lists.cloudfoundry.org>
Subject: [cf-dev] Re: Re: Re: Re: Re: Re: Re: Re: DEA/Warden staging error

Hey Kyle,

Did you make any progress?

Zak & Mikhail
CF Release Integration Team

On Thu, Sep 17, 2015 at 10:28 AM, CF Runtime <cfruntime(a)gmail.com> wrote:

It certainly could be. By default the containers reject all egress traffic.
CC security groups configure iptables rules that allow traffic out.

One of the default security groups in the BOSH templates allows access on
port 53. If you have no security groups, the containers will not be able to
make any outgoing requests.

Joseph & Natalie
CF Release Integration Team

On Thu, Sep 17, 2015 at 8:44 AM, Kyle Havlovitz (kyhavlov) <
kyhavlov(a)cisco.com> wrote:

On running git clone inside the container via the warden shell, I get:
"Cloning into 'staticfile-buildpack'...
fatal: unable to access '
https://github.com/cloudfoundry/staticfile-buildpack/': Could not
resolve host: github.com".
So the container can't get to anything outside of it (I also tried
pinging some external IPs to make sure it wasn't a DNS thing). Would this
be caused by cloud controller security group settings?

--
Matthew Sykes
matthew.sykes(a)gmail.com


Re: F5 Load Balancer Configuration for Cloud Foundry Loggregator

Anthony
 

Thanks Mike! Unfortunately, upgrading is not an option since it's a really loaded enterprise device. The interesting part is that there is a similarly set up websockets VIP (a plain old server, .NET I think) that is working on the same device.

We'll work with our network folks to find other devices with newer software we can use.

Would appreciate it if anyone has other ideas.

Regards,
Anthony

On Sep 22, 2015, at 7:49 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

We are running 11.4 and 11.6. I'd give an upgrade a try before digging too much deeper.

Mike

On Sep 22, 2015 6:36 PM, "Anthony" <lee.apc(a)gmail.com> wrote:
The version we are testing in is 10.4.

Regards,
Anthony

On Sep 22, 2015, at 6:41 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

What version of F5 software are you running?

Mike

On Tue, Sep 22, 2015 at 5:20 PM, Anthony Lee <lee.apc(a)gmail.com> wrote:
Does any one have any experience configuring F5 load balancers in front of the CF routers? We have configured F5 and app https and cf push requests are working fine. However, the connectivity with loggregator is not working. Taking a look at the documentation, it requires "websocket support" on the load balancer. We've done the configuration specified here:

https://support.f5.com/kb/en-us/solutions/public/14000/800/sol14814.html

With the following irule basically, applying the default TCP profile if it detects websocket traffic:

when HTTP_REQUEST {
if { [string tolower [HTTP::header Upgrade]] contains "websocket" }{
HTTP::disable
}
}

However, we are running into errors. Doing `cf logs myapp1` yields:

Error dialing loggregator server: read tcp <ip redacted>:443: connection reset by peer.
Please ask your Cloud Foundry Operator to check the platform configuration (loggregator endpoint is wss://loggregator.<sys domain redacted>:443).

Does anyone have a clue?

Thanks!
Anthony


join

Chunhua Zhang <chzhang@...>
 

--
Thanks & Best Regards,
chunhua, zhang(张春华)
M: +86 187 5198 6615
Department: CONSULTING
Manager: Leon Cheng
IT issue? Mail to: ask(a)pivotal.io


Re: Packaging CF app as bosh-release

Amit Kumar Gupta
 

Great. Let us know if you have further questions, or share if you end up
getting something cool deployed and working, since I haven't seen Spark on
BOSH or CF yet.

Best,
Amit

On Mon, Sep 21, 2015 at 11:33 PM, Kayode Odeyemi <dreyemi(a)gmail.com> wrote:

Yes Amit. Thanks

I'm trying the 2 approaches since they both have their pros and cons.

is your compute environment a multi-tenant one that will be running
multiple different workloads?
Yes. Devs can push their own Spark-based apps and non-Spark apps. The
Spark-based apps would rely on the existing Spark cluster.

it's also likely to be a more efficient use of resources, since a BOSH VM
can only run one of these spark-job-processors,
I think a Spark cluster (using YARN) of BOSH VMs should be able to run
multiple Spark jobs concurrently.

With the app deployment approach, I did set up a UPS for the Spark cluster
and I've been able to submit Spark jobs to the cluster programmatically
through the Spark API. I'll stay with app deployment for now until I get a
stronger use case for a boshrelease.
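
For others trying the same approach, an app bound to a user-provided service (UPS) can read the cluster details from the VCAP_SERVICES environment variable. The following is a minimal sketch; the service name and credential keys are whatever was supplied when the UPS was created, and the function name is made up for this note.

```python
import json
import os


def user_provided_credentials(service_name, environ=None):
    """Return the credentials dict of a bound user-provided service.

    Looks under the "user-provided" key of VCAP_SERVICES and returns
    None when no service with that name is bound.
    """
    environ = os.environ if environ is None else environ
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    for instance in services.get("user-provided", []):
        if instance.get("name") == service_name:
            return instance.get("credentials", {})
    return None
```

The returned credentials (for example a master URL, if that is what was stored in the UPS) can then be handed to the Spark API when submitting jobs.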

On Tue, Sep 22, 2015 at 12:21 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hey Kayode,

Were you able to make any progress with the deployments you were trying
to do?

Best,
Amit

On Wed, Sep 16, 2015 at 12:48 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

My very limited understanding is that NFS writes to the actual
filesystem, and achieves persistence by having centralized NFS servers
where it writes to a real mounted device, whereas the clients write to an
ephemeral nfs-mount.

My very limited understanding of HDFS is that it's all userland FS, does
not write to the actual filesystem, and relies on replication to other
nodes in the HDFS cluster. Being a userland FS, you don't have to worry
about the data being wiped when a container is shut down, if you were to
run it as an app.

I think one main issue is going to be ensuring that you never lose too
many instances (whether they are containers or VMs), since you might then
lose all replicas of a given data shard. Whether you go with apps or BOSH
VMs doesn't make a big difference here.

Deploying as an app may be a better way to go; it's simpler right now to
configure and deploy an app than to configure and deploy a full BOSH
release. It's also likely to be a more efficient use of resources, since a
BOSH VM can only run one of these spark-job-processors, but a CF
container-runner can run lots of other things. That actually brings up a
different question: is your compute environment a multi-tenant one that
will be running multiple different workloads? E.g. could someone also use
the CF to push their own apps? Or is the whole thing just for your spark
jobs, in which case you might only be running one container per VM anyways?

Assuming you can make use of the VMs for other workloads, I think this
would be an ideal use case for Diego. You probably don't need all the
extra logic around apps, like staging and routing, you just need Diego to
efficiently schedule containers for you.

On Wed, Sep 16, 2015 at 1:13 PM, Kayode Odeyemi <dreyemi(a)gmail.com>
wrote:

Thanks Dmitriy,

Just for clarity, are you saying multiple instances of a VM cannot
share a single shared filesystem?

On Wed, Sep 16, 2015 at 6:59 PM, Dmitriy Kalinin <dkalinin(a)pivotal.io>
wrote:

BOSH allocates a persistent disk per instance. It never shares
persistent disks between multiple instances at the same time.

If you need a shared file system, you will have to use some kind of a
release for it. It's not any different from what people do with nfs
server/client.

On Wed, Sep 16, 2015 at 7:09 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

The shared file system aspect is an interesting wrinkle to the
problem. Unless you use some network layer for writing to the shared
file system, e.g. SSHFS, I think apps will not work: they are
isolated to run in a container and given a chroot "jail" for their
file system, which gets blown away whenever the app is stopped or
restarted (which will commonly happen, e.g. during a rolling deploy of the
container-runner VMs).

Do you have something that currently works? How do your VMs
currently access this shared FS? I'm not sure BOSH has the abstractions
for choosing a shared, already-existing "persistent disk" to be attached to
multiple VMs. I also don't know what happens when you scale your VMs down,
because BOSH would generally destroy the associated persistent disk, but
you don't want to destroy the shared data.

Dmitriy, any idea how BOSH can work with a shared filesystem (e.g.
HDFS)?

Amit

On Wed, Sep 16, 2015 at 6:54 AM, Kayode Odeyemi <dreyemi(a)gmail.com>
wrote:


On Wed, Sep 16, 2015 at 3:44 PM, Amit Gupta <agupta(a)pivotal.io>
wrote:

Are the spark jobs tasks that you expect to end, or apps that you
expect to run forever?
They are tasks that run forever. The jobs are subscribers to
RabbitMQ queues that process
messages in batches.


Do your jobs need to write to the file system, or do they access a
shared/distributed file system somehow?
The jobs write to shared filesystem.


Do you need things like a static IP allocated to your jobs?
No.


Are your spark jobs serving any web traffic?
No.




Re: Removing support for v1 service brokers

Camilo Aguilar
 

I'm curious about the reasoning for the migration away from NATS. Was
there any limitation you hit?

On Tue, Sep 22, 2015 at 8:01 PM Dieu Cao <dcao(a)pivotal.io> wrote:

Hello all,

We plan to remove support for v1 service brokers in about 3 months, in a
cf-release following 12/31/2015.
We are working towards removing CF's dependency on NATS and the v1 service
brokers are still dependent on NATS.
Please let me know if you have questions/concerns about this timeline.

I'll be working on verifying a set of steps that you can find here [1]
that document how to migrate your service broker from v1 to v2 and what is
required in order to persist user data and will get that posted to the
service broker api docs officially.

-Dieu
CF CAPI PM

[1]
https://docs.google.com/document/d/1Pl1o7mxtn3Iayq2STcMArT1cJsKkvi4Ey1-d3TB_Nhs/edit?usp=sharing




Re: F5 Load Balancer Configuration for Cloud Foundry Loggregator

Mike Youngstrom
 

We are running 11.4 and 11.6. I'd give an upgrade a try before digging too
much deeper.

Mike

On Sep 22, 2015 6:36 PM, "Anthony" <lee.apc(a)gmail.com> wrote:

The version we are testing in is 10.4.

Regards,
Anthony

On Sep 22, 2015, at 6:41 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

What version of F5 software are you running?

Mike

On Tue, Sep 22, 2015 at 5:20 PM, Anthony Lee <lee.apc(a)gmail.com> wrote:

Does anyone have any experience configuring F5 load balancers in front
of the CF routers? We have configured F5 and app https and cf push requests
are working fine. However, the connectivity with loggregator is not
working. Taking a look at the documentation, it requires "websocket
support" on the load balancer. We've done the configuration specified here:

https://support.f5.com/kb/en-us/solutions/public/14000/800/sol14814.html

We also added the following iRule, which basically applies the default
TCP profile if it detects websocket traffic:

when HTTP_REQUEST {
    if { [string tolower [HTTP::header Upgrade]] contains "websocket" } {
        HTTP::disable
    }
}

However, we are running into errors. Doing `cf logs myapp1` yields:

Error dialing loggregator server: read tcp <ip redacted>:443: connection
reset by peer.
Please ask your Cloud Foundry Operator to check the platform
configuration (loggregator endpoint is wss://loggregator.<sys domain
redacted>:443).

Does anyone have a clue?

Thanks!
Anthony
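
One way to narrow down where the reset happens is to send a bare WebSocket opening handshake at the load balancer and see what comes back. The following is a minimal sketch, not the `cf` CLI's actual behaviour: the helper names (`build_upgrade_request`, `parse_status_code`, `probe`), the default path, and the timeout are all assumptions made for illustration, and the real loggregator endpoint will likely also require an authorization header that is not included here. A `ConnectionResetError` from `probe` would reproduce the symptom above, whereas any HTTP status code (even an error such as 401) would suggest the request made it past the load balancer.

```python
import base64
import os
import socket
import ssl


def build_upgrade_request(host, path="/"):
    """Return the bytes of a minimal RFC 6455 WebSocket opening handshake."""
    # The key is 16 random bytes, base64-encoded, as the RFC requires.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_code(response_bytes):
    """Extract the numeric status code from the first line of an HTTP response."""
    status_line = response_bytes.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def probe(host, port=443, path="/", timeout=10.0):
    """Send the handshake over TLS and return the status code of the reply.

    Hypothetical helper: raises ConnectionResetError if the peer resets the
    connection, which is the failure mode reported in this thread.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_upgrade_request(host, path))
            return parse_status_code(tls.recv(4096))
```

Calling `probe("loggregator.<sys domain>")` once against the F5 virtual server and once directly against a router VM (with the port adjusted as needed) should show whether the reset originates at the load balancer.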


Re: F5 Load Balancer Configuration for Cloud Foundry Loggregator

Anthony
 

The version we are testing in is 10.4.

Regards,
Anthony

On Sep 22, 2015, at 6:41 PM, Mike Youngstrom <youngm(a)gmail.com> wrote:

What version of F5 software are you running?

Mike

On Tue, Sep 22, 2015 at 5:20 PM, Anthony Lee <lee.apc(a)gmail.com> wrote:
Does anyone have any experience configuring F5 load balancers in front of the CF routers? We have configured F5 and app https and cf push requests are working fine. However, the connectivity with loggregator is not working. Taking a look at the documentation, it requires "websocket support" on the load balancer. We've done the configuration specified here:

https://support.f5.com/kb/en-us/solutions/public/14000/800/sol14814.html

We also added the following iRule, which basically applies the default TCP profile if it detects websocket traffic:

when HTTP_REQUEST {
    if { [string tolower [HTTP::header Upgrade]] contains "websocket" } {
        HTTP::disable
    }
}

However, we are running into errors. Doing `cf logs myapp1` yields:

Error dialing loggregator server: read tcp <ip redacted>:443: connection reset by peer.
Please ask your Cloud Foundry Operator to check the platform configuration (loggregator endpoint is wss://loggregator.<sys domain redacted>:443).

Does anyone have a clue?

Thanks!
Anthony


Re: User cannot do CF login when UAA is being updated

Yunata, Ricky <rickyy@...>
 

Hi Amit,

Thank you very much. Sure, I will do that.

Regards,
Ricky


From: Amit Gupta [mailto:agupta(a)pivotal.io]
Sent: Wednesday, 23 September 2015 9:58 AM
To: Discussions about Cloud Foundry projects and the system overall.
Subject: [cf-dev] Re: Re: Re: Re: Re: Re: Re: User cannot do CF login when UAA is being updated

Hi Ricky,

I reviewed the files you sent to Joseph (via Dies). It looks like you're experiencing several different issues (you have a deployment that has no login job, and then later does? you're doing "cf auth" in a loop and at some point the CLI complains that you never set the API?). Also looks like this is being done in a bunch of different scenarios: update stemcell, change flavour, etc. Finally, it's hard for the community to benefit from the investigation because the information is in files that aren't public.

Can I recommend the following courses of action:

* open up a Github issue (it is easier for the core developers to track, and @-mention developers from CLI and UAA, etc.)
* make sure there are separate issues created for each distinct error or deployment scenario; if two errors seem closely related, it's reasonable to mention them in the same issue, otherwise please create separate issues and provide links to other issues that are loosely related
* include any manifest, scripts, or terminal output in text format that can be copied, pasted, and searched. Sites like gist.github.com and pastebin make this easy, instead of using screenshots.

We look forward to helping you!

Best,
Amit

On Wed, Sep 16, 2015 at 2:07 AM, CF Runtime <cfruntime(a)gmail.com> wrote:
If you can't get the list to accept the attachment, you can give it to Dies and he should be able to get it to us.

Joseph
OSS Release Integration Team

On Tue, Sep 15, 2015 at 7:19 PM, Yunata, Ricky <rickyy(a)fast.au.fujitsu.com> wrote:
Hi Joseph,

Yes, that is the case. I have sent my test results, but it seems that my e-mail does not get through. How can I send an attachment to this mailing list?

Regards,
Ricky


From: CF Runtime [mailto:cfruntime(a)gmail.com]
Sent: Tuesday, 15 September 2015 8:10 PM
To: Discussions about Cloud Foundry projects and the system overall.
Subject: [cf-dev] Re: Re: Re: Re: User cannot do CF login when UAA is being updated

Couple of updates here for clarity. No databases are stored on NFS in any default installation. NFS is only used to store blobstore data. If you are using the postgres job from cf-release, since it is single node there will be downtime during a stemcell deploy.

I talked with Dies from Fujitsu earlier and confirmed they are NOT using the postgres job but an external non-cf deployed postgres instance. So during a deploy, the UAA db should be up and available the entire time.

The issue they are seeing is that even though the database is up, and I'm guessing there is at least a single node of UAA up during the deploy, there are still login failures.

Joseph
OSS Release Integration Team

On Mon, Sep 14, 2015 at 6:39 PM, Filip Hanik <fhanik(a)pivotal.io> wrote:
Amit, see previous comment.

Postgresql database is stored on NFS that is restarted during nfs job update.
UAA, while being up, is non functional while the NFS job is updated because it can't get to the DB.



On Mon, Sep 14, 2015 at 5:09 PM, Amit Gupta <agupta(a)pivotal.io> wrote:
Hi Ricky,

My understanding is that you still need help, and the issues Jiang and Alexander raised are different. To avoid confusion, let's keep this thread focused on your issue.

Can you confirm that you have two UAA VMs in separate bosh jobs, separate AZs, etc.? Can you confirm that when you roll the UAAs, only one goes down at a time? The simplest way to effect a roll is to change some trivial property in the manifest for your UAA jobs. If you're using v215, any of the properties referenced here will do:

https://github.com/cloudfoundry/cf-release/blob/v215/jobs/uaa/spec#L321-L335

You should confirm that only one UAA is down at a time, and comes back up before bosh moves on to updating the other UAA.

While this roll is happening, can you just do `CF_TRACE=true cf auth USERNAME PASSWORD` in a loop? If you see one that fails, post the output, and note the state of the bosh deploy when the error happens.
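
The loop described above could be scripted along these lines. This is only a sketch under stated assumptions: `run_in_loop` is a hypothetical helper, the attempt count and delay are arbitrary, and it assumes the `cf` CLI is on the PATH and exits with a non-zero status when authentication fails. It records a UTC timestamp with each failure so the output can be lined up against the state of the bosh deploy.

```python
import os
import subprocess
import time
from datetime import datetime, timezone


def run_in_loop(cmd, attempts, delay_seconds=1.0, extra_env=None):
    """Run cmd `attempts` times and collect the failures.

    Returns a list of (utc_timestamp, return_code, combined_output) tuples,
    one for every run that exited with a non-zero status.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    failures = []
    for i in range(attempts):
        result = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep trace output and errors together
            text=True,
        )
        if result.returncode != 0:
            stamp = datetime.now(timezone.utc).isoformat()
            failures.append((stamp, result.returncode, result.stdout))
        if i < attempts - 1:
            time.sleep(delay_seconds)
    return failures


# Example call (USERNAME and PASSWORD are placeholders, as in the message):
# failures = run_in_loop(
#     ["cf", "auth", "USERNAME", "PASSWORD"],
#     attempts=600,
#     extra_env={"CF_TRACE": "true"},
# )
# for stamp, code, output in failures:
#     print(f"--- failure at {stamp} (exit code {code}) ---")
#     print(output)
```

Only failing runs are kept, since the trace output of every successful `cf auth` would otherwise bury the one or two that matter.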

Thanks,
Amit

On Mon, Sep 14, 2015 at 10:51 AM, Amit Gupta <agupta(a)pivotal.io> wrote:
Ricky, Jiang, Alexander, are the three of you working together? It's hard to tell since you've got Fujitsu, Gmail, and Altoros email addresses. Are you folks talking about the same issue with the same deployment, or three separate issues?

Ricky, if you still need assistance with your issue, please let us know.

On Mon, Sep 14, 2015 at 10:16 AM, Lomov Alexander <alexander.lomov(a)altoros.com> wrote:
Yes, the problem is that postgresql database is stored on NFS that is restarted during nfs job update. I’m sure that you’ll be able to run updates without outage with several customizations.

It is hard to tell without knowing your environment, but in the common case the steps will be as follows:


1. Add additional instances to the nfs job and customize it to replicate data (for instance, use these docs for release customization [1])
2. Make your NFS job update sequentially, without other jobs updating in parallel (as is done for postgresql [2])
3. Check your options in the update section [3].

[1] https://help.ubuntu.com/community/HighlyAvailableNFS
[2] https://github.com/cloudfoundry/cf-release/blob/master/example_manifests/minimal-aws.yml#L115-L116
[3] https://github.com/cloudfoundry/cf-release/blob/master/example_manifests/minimal-aws.yml#L57-L62

On Sep 14, 2015, at 9:47 AM, Yitao Jiang <jiangyt.cn(a)gmail.com> wrote:

When upgrading the deployment, UAA stopped working because the uaadb filesystem hung. In my environment, the nfs-wal-server's IP changed, which caused uaadb and ccdb to hang. Hard rebooting uaadb and restarting the uaa service solved the issue.

Hope this helps.

On Mon, Sep 14, 2015 at 2:13 PM, Yunata, Ricky <rickyy(a)fast.au.fujitsu.com> wrote:
Hello,

I have a question regarding UAA in Cloud Foundry. I’m currently running Cloud Foundry on Openstack.
I have 2 availability zones and redundancy of the important VMs including UAA.
Whenever I do an upgrade of either the stemcell or the CF release, users are not able to do CF login while CF is updating the UAA VM.
My question is, is this normal behaviour? If I have redundant UAA VMs, shouldn’t users still be able to log in to the apps even while UAA is being updated?
I’ve done this test a few times, with different CF versions and stemcells, and all of them give me the same result. The latest test that I’ve done was to upgrade CF from version 212 to 215.
Has anyone experienced the same issue?

Regards,
Ricky
Disclaimer

The information in this e-mail is confidential and may contain content that is subject to copyright and/or is commercial-in-confidence and is intended only for the use of the above named addressee. If you are not the intended recipient, you are hereby notified that dissemination, copying or use of the information is strictly prohibited. If you have received this e-mail in error, please telephone Fujitsu Australia Software Technology Pty Ltd on + 61 2 9452 9000 or by reply e-mail to the sender and delete the document and all copies thereof.


Whereas Fujitsu Australia Software Technology Pty Ltd would not knowingly transmit a virus within an email communication, it is the receiver’s responsibility to scan all communication and any files attached for computer viruses and other defects. Fujitsu Australia Software Technology Pty Ltd does not accept liability for any loss or damage (whether direct, indirect, consequential or economic) however caused, and whether by negligence or otherwise, which may result directly or indirectly from this communication or any files attached.


If you do not wish to receive commercial and/or marketing email messages from Fujitsu Australia Software Technology Pty Ltd, please email unsubscribe(a)fast.au.fujitsu.com




--

Regards,

Yitao
jiangyt.github.io






Removing support for v1 service brokers

Dieu Cao <dcao@...>
 

Hello all,

We plan to remove support for v1 service brokers in about 3 months, in a
cf-release following 12/31/2015.
We are working towards removing CF's dependency on NATS and the v1 service
brokers are still dependent on NATS.
Please let me know if you have questions/concerns about this timeline.

I'll be working on verifying a set of steps that you can find here [1] that
document how to migrate your service broker from v1 to v2 and what is
required in order to persist user data and will get that posted to the
service broker api docs officially.

-Dieu
CF CAPI PM

[1]
https://docs.google.com/document/d/1Pl1o7mxtn3Iayq2STcMArT1cJsKkvi4Ey1-d3TB_Nhs/edit?usp=sharing
