Re: Reg combining vms in cf-231

Amit Kumar Gupta
 

Your first job has consul services, so you'll need to colocate the
consul_agent there too. You also don't have any consul servers running
anywhere.

On Mon, Apr 25, 2016 at 7:25 AM, Nithiyasri Gnanasekaran -X (ngnanase -
TECH MAHINDRA LIM at Cisco) <ngnanase(a)cisco.com> wrote:

Hi Amit

I am using cf-231 release.

In the spiff-generated yml, I could combine the api_worker and
cloud_controller jobs into one, and dea-spare with router.

But when I remove consul as a separate VM and combine it with api_worker
and cloud_controller, the consul process does not start on router and
haproxy.

If I have a separate consul vm, this problem doesn’t occur.



Following is the consul process error in router.



error during start: timeout exceeded
2016/04/25 08:57:06 [ERR] agent.client: Failed to decode response header: EOF
++ logger -p user.info -t vcap.consul-agent
++ tee -a /var/vcap/sys/log/consul_agent/consul_agent.stdout.log
error during start: timeout exceeded
2016/04/25 08:58:15 [ERR] agent.client: Failed to decode response header: EOF
++ logger -p user.info -t vcap.consul-agent
++ tee -a /var/vcap/sys/log/consul_agent/consul_agent.stdout.log



Please let me know how I can fix this.



Regards

Nithiyasri



Re: HTTP request status text is changed

Stanley Shen <meteorping@...>
 

Actually, one CF instance is a bosh-lite one deployed on AWS, and I didn't change anything related to haproxy.
I can also reproduce it on this bosh-lite instance.

- default_networks:
- name: cf1
static_ips: null
instances: 1
name: ha_proxy_z1
networks:
- name: cf1
static_ips:
- 10.244.0.34
properties:
ha_proxy:
ssl_pem: |+
-----BEGIN CERTIFICATE-----
MIICsjCCAZoCCQC+xvE/1ZQgFzANBgkqhkiG9w0BAQUFADAaMRgwFgYDVQQDFA8q
LmJvc2gtbGl0ZS5jb20wIBcNMTUxMDA4MjIwNDQ3WhgPMjI4OTA3MjIyMjA0NDda
MBoxGDAWBgNVBAMUDyouYm9zaC1saXRlLmNvbTCCASIwDQYJKoZIhvcNAQEBBQAD
ggEPADCCAQoCggEBAK09Q520xrKx75uew3mAS+y4uyRPZPEjt/pYdBl40PXIwaqO
X7LGoc9lNZS/eAPX6xeVFmZbLZReQ5+Fm0moeLzsh58W9jjkWWk7oGISmxfoQz9B
X9Eh0NHCrtKXMrCPlr+2RI/qLinJDqn87UEZqwX+84JU8hBZ8RD8D7YnfuDteySV
SYOEUjkiN/pIWmbJQY1sjEyk1zH1Hiy8kmnait2sX8Td2S/aV6EJBgODOstzEtnf
HFDIfoTJxbSK/0TbF6qBaSl0CLaOop9FX2ULEZUgAuIW4dG2k/xnpMLdz7A0ZsSU
Haw9okZ5wNuYk1RSqhnqw+9KUWgXwV6RlMvtXMkCAwEAATANBgkqhkiG9w0BAQUF
AAOCAQEAShOqAFLIc93yIjhcnN7L4ZXFo+CvOgklJqFeBbwRshsEptbaddDJYmRr
ZXzOE7MiTOBM8YzKqtHvl/ZguXmIAXSZlnq6kuJHdPtcZOqu1x2GAvWWOzn9Xl4m
T3RmwF3NgiX0jgNMkkm8i8jfT7uN9BnHxMv65b9yKeM0sRFN5XigA43DDQnfF3j4
FQ9jwpmS7zOx2wn6FayOgoE4rgJfV/9637ZprQOMfUbZPKgQQplDn6bvK13rj9g9
zCC9W0fy29l7VDuAOOSI5xzsoYyH6DfX7oySxn291hidSCb/buadNG4dgI4keMGw
u5K8QQYmlSY91IJtuRRITYXGmIiPpg==
-----END CERTIFICATE-----
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEArT1DnbTGsrHvm57DeYBL7Li7JE9k8SO3+lh0GXjQ9cjBqo5f
ssahz2U1lL94A9frF5UWZlstlF5Dn4WbSah4vOyHnxb2OORZaTugYhKbF+hDP0Ff
0SHQ0cKu0pcysI+Wv7ZEj+ouKckOqfztQRmrBf7zglTyEFnxEPwPtid+4O17JJVJ
g4RSOSI3+khaZslBjWyMTKTXMfUeLLySadqK3axfxN3ZL9pXoQkGA4M6y3MS2d8c
UMh+hMnFtIr/RNsXqoFpKXQIto6in0VfZQsRlSAC4hbh0baT/Gekwt3PsDRmxJQd
rD2iRnnA25iTVFKqGerD70pRaBfBXpGUy+1cyQIDAQABAoIBACXzdt2UnbbF3jzU
QfRbE8bvDSg+MFnXPlWcjQqLehNuAGcxu2s5snbxsBQ/Abat1XWcFoUj0k9feyb2
KPew7YpNssQ6ToRWGfRAuLjjZJCPNDQmSSxSYSGiqZO+xb8CJb8n2ctBPQ2wWwMI
Qp1xVxMAMC5MF59XZMUYwwRfkJ8LawB90+S9BjHcU3GqoPECLFkgEeIj3mrnmpAD
vhIeYvQj2W5JCpxLUA+7lnyoqnx8OTOXvBPAsKwO1Hx88yCitnxXro7i0ZAw4ErH
zrnMgWkFDvRiS3ta/QS2RcBBiZHKX/gRRT/AvqJ+Erveu0BcZ9AVy1UpPB0w9rBK
PTxS2BECgYEA3MLd6Og+xQpw4UNhy9EjeDE/b/rZK4w/vfD3WE5J3Nm4HGdSA6Q4
YmQYVg+VuCLR+HHsk58LxEf+cU0MNgDJR1/rFZRmociF+G0i7/7DuhFm891wWWGW
Iz7XeGWHi+LIeYWkteuflrkmvy/7xqArgcNqnirGhba6706MZz0G0YUCgYEAyOR5
aF7qRpLXHgMOPOzJKC4ceWA5rY8rcdJZFI7aNq5MJF9o+fNNt8YRJ1hQTzs5K/R+
HwBJel8J6CoPQo9WUXnj0md4M67sCZSBqWANMO/J0f4VkbLS/lwch+ZPS8jt3Z4z
umYW4QnloIKXxORfySo7r9DzZSgmxuDE8PVWn3UCgYAFTwpXF36q7l1YjW5EoHrh
4Q1NfBLM4UqHHsxT604LaZDr3fAy9jgE5bNQHn/TNcMm3lZ6FlEKH1EXGGs6wToV
5VCZ7D+rlE7kcntsmgvK5bA8HQ8elyItJs23r3la+9EmWvhjB4+G6FzuLBE57ZAe
RrzBoPW1MXe9WX423VjUoQKBgGea5T49jSc+fbDdtI8ZMxkExuyWAskOyEIYUJa4
obOHqn8rsZEOuKspfBlFg42JJpATtKO6WyrALvTMFDiogcTdTvBpKmXFNbgvHbvD
bKorUHN7TZZpmkVSLeisj4KvKnWcLGNaWTxQBVwFXc5OVVQC8utWoOAvl+gDba4z
aSwtAoGANdquHRNbigPj2y0cRoexYJwKgpfGEK4HXitsKZUUg09gVfagM1HynVFz
RA0LVac0oJZFdMYZyU/PXCySS237xUD2/0oySYJIK9E0C4ZxKD+DoAk5Z097z0LM
7rxStMCBWB2x4ommvEnpdgntEKkl4buIDatvmbdmdwkY3+X65Ks=
-----END RSA PRIVATE KEY-----
metron_agent:
zone: z1
router:
servers:
z1:
- 10.244.0.22
z2: []
resource_pool: router_z1
templates:
- name: haproxy
release: cf
- name: metron_agent
release: cf
- name: consul_agent
release: cf
update: {}


Re: Persistent problem with -p option for cups and uups

Sunil Babu <cloudgrp.assist@...>
 

Please change the lookup location and it works.

On Monday, April 25, 2016, Don Nelson <dieseldonx(a)gmail.com> wrote:

Thanks, Dies. I opened GitHub issue #824 at the cf repo.
--
Thanks & Regards
Sunil Babu K C
+91-81970-35608


Reg combining vms in cf-231

Nithiyasri Gnanasekaran -X (ngnanase - TECH MAHINDRA LIM@Cisco) <ngnanase at cisco.com...>
 

Hi Amit
I am using cf-231 release.
In the spiff-generated yml, I could combine the api_worker and cloud_controller jobs into one, and dea-spare with router.
But when I remove consul as a separate VM and combine it with api_worker and cloud_controller, the consul process does not start on router and haproxy.
If I have a separate consul vm, this problem doesn't occur.

Following is the consul process error in router.

error during start: timeout exceeded
2016/04/25 08:57:06 [ERR] agent.client: Failed to decode response header: EOF
++ logger -p user.info -t vcap.consul-agent
++ tee -a /var/vcap/sys/log/consul_agent/consul_agent.stdout.log
error during start: timeout exceeded
2016/04/25 08:58:15 [ERR] agent.client: Failed to decode response header: EOF
++ logger -p user.info -t vcap.consul-agent
++ tee -a /var/vcap/sys/log/consul_agent/consul_agent.stdout.log

Please let me know how I can fix this.

Regards
Nithiyasri


Re: HTTP request status text is changed

Sunil Babu <cloudgrp.assist@...>
 

Can you check the haproxy config on the web component?

On Monday, April 25, 2016, Stanley Shen <meteorping(a)gmail.com> wrote:

yes, the request goes to my APP.

I can reproduce it with a very small zip file as ZIP format is forbidden
in my system.
The default value configured in CF is "client_max_body_size: 15M"
I am using HA_Proxy, no LB is introduced in my system.

in gorouter I can see the log of my request, and its status is 413 indeed.
I am not sure where the statusText is changed.
--
Thanks & Regards
Sunil Babu K C
+91-81970-35608


Re: HTTP request status text is changed

Stanley Shen <meteorping@...>
 

yes, the request goes to my APP.

I can reproduce it with a very small zip file as ZIP format is forbidden in my system.
The default value configured in CF is "client_max_body_size: 15M"
I am using HA_Proxy, no LB is introduced in my system.

in gorouter I can see the log of my request, and its status is 413 indeed.
I am not sure where the statusText is changed.


Re: HTTP request status text is changed

Daniel Mikusa
 

Are you sure the request is actually making it to your app? Requests that
come into CF go through a load balancer and the gorouters before they hit
your app. ie. browser -> your LB -> gorouters -> app.

If the upload is too large for the LB or gorouter it would probably
generate a 413 with the standard response.

Take a look at the access logs for your app and for the gorouter (i.e. RTR
tag in `cf logs`). If you don't see either, it's your LB. If you just see
the RTR entry, then it's the gorouter. If you see an access log from your
app, then the request is hitting your app and it would seem that something
else is causing this.
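
The triage above can be written down as a tiny decision function. This is only a sketch of the reasoning in this message: the function name and its two boolean inputs are made up for illustration, and you would fill them in by hand after reading `cf logs`.

```python
def locate_response_source(saw_rtr_log: bool, saw_app_log: bool) -> str:
    """Guess which hop answered, based on which access logs show the request.

    Hypothetical helper: the inputs are what you observed in `cf logs`,
    not something this code can detect by itself.
    """
    if saw_app_log:
        # The request reached the app, so something else is changing the response.
        return "app"
    if saw_rtr_log:
        # Only the RTR entry is present: the gorouter answered.
        return "gorouter"
    # Neither log shows the request: it never got past the load balancer.
    return "load balancer"


print(locate_response_source(saw_rtr_log=True, saw_app_log=False))  # gorouter
```
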

Dan

On Mon, Apr 25, 2016 at 9:25 AM, Stanley Shen <meteorping(a)gmail.com> wrote:

Hello all

I have a web application running on CF/Diego with version 233, and I
noticed a strange thing about my app.
I have some code that checks attachments before they are uploaded to the
system.

If the attachment doesn't pass the check, we use code like
response.sendError(413, "The file cannot be uploaded due to file extension
jar ")
where response is an HttpServletResponse.

In the front end, we have a file upload widget that uploads files to the
system.

Before, we always got the statusText that we set in the response,
like "The file cannot be uploaded due to file extension jar".
But right now, I always get the statusText "Request Entity Too Large", which
is the standard status text of error code 413.
I tried in a non-CF environment, and it works as expected.

It looks like somewhere CF changes the statusText based on the
statusCode, but I haven't found a clue yet.

Any information about this?

Thanks in advance.






HTTP request status text is changed

Stanley Shen <meteorping@...>
 

Hello all

I have a web application running on CF/Diego with version 233, and I noticed a strange thing about my app.
I have some code that checks attachments before they are uploaded to the system.

If the attachment doesn't pass the check, we use code like
response.sendError(413, "The file cannot be uploaded due to file extension jar ")
where response is an HttpServletResponse.

In the front end, we have a file upload widget that uploads files to the system.

Before, we always got the statusText that we set in the response, like "The file cannot be uploaded due to file extension jar".
But right now, I always get the statusText "Request Entity Too Large", which is the standard status text of error code 413.
I tried in a non-CF environment, and it works as expected.

It looks like somewhere CF changes the statusText based on the statusCode, but I haven't found a clue yet.
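
For readers unfamiliar with where statusText comes from: with HTTP/1.x it is the reason phrase on the response status line, so any hop that rewrites that line changes what the browser reports. The sketch below only illustrates what such a rewrite would look like. The thread does not establish which component (if any) does this, and the function names and the one-entry phrase table are assumptions made for the example.

```python
# Hypothetical table of "standard" reason phrases; only the code from this
# thread is listed, using the phrase quoted in the message.
STANDARD_REASONS = {413: "Request Entity Too Large"}


def split_status_line(line: str):
    """Split 'HTTP/1.1 413 Some reason' into (version, code, reason)."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def normalize_status_line(line: str) -> str:
    """Replace a custom reason phrase with the standard one, if known."""
    version, code, reason = split_status_line(line)
    reason = STANDARD_REASONS.get(code, reason)
    return f"{version} {code} {reason}"


custom = "HTTP/1.1 413 The file cannot be uploaded due to file extension jar"
print(normalize_status_line(custom))  # HTTP/1.1 413 Request Entity Too Large
```

If an intermediary behaves like this, putting the custom message in the response body rather than the reason phrase would survive the trip, since the body is not touched by the rewrite shown here.
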

Any information about this?

Thanks in advance.


Brokered route services only receiving traffic for routes mapped to started apps

Guillaume Berche
 

Hi Shannon and the routing team,

Testing the route service support in v230, I observe that brokered route
services only receive traffic for routes mapped to started apps. In
other words, if a route is mapped to an app in the "crashed", "starting" or
"stopped" state, then any fully-brokered route service bound to that route
won't receive traffic sent to that route; instead, the gorouter directly
responds with a 404.

I wonder whether this is a product decision (I had missed this from the
design proposal [3]) or rather an intermediate implementation choice, in
which case forwarding any traffic received on a route to the bound route
service, regardless of the status of mapped apps, could be considered (I
have not yet found a related story in the routing backlog).
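
To make the difference concrete, here is a minimal sketch of the two behaviors as described in this message. It is not gorouter code: the Route type, the next_hop function and the `unconditional` flag are invented for illustration, and the example URL and endpoint are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Route:
    # Endpoints of started app instances currently mapped to the route.
    app_endpoints: List[str] = field(default_factory=list)
    # URL of a bound route service, if any.
    route_service_url: Optional[str] = None


def next_hop(route: Route, unconditional: bool = False) -> str:
    """Pick where a request goes: a route service, an app endpoint, or a 404.

    unconditional=False models the observed v230 behavior (route service only
    sees traffic when a started app is mapped); unconditional=True models the
    behavior proposed in this message.
    """
    if route.route_service_url and (route.app_endpoints or unconditional):
        return route.route_service_url
    if route.app_endpoints:
        return route.app_endpoints[0]
    return "404"


stopped_app_route = Route(app_endpoints=[], route_service_url="https://rs.example.com")
print(next_hop(stopped_app_route))                      # 404
print(next_hop(stopped_app_route, unconditional=True))  # https://rs.example.com
```
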

I collected at the end of this email a list of use-cases where I believe
route services will benefit from unconditionally receiving traffic from
bound routes.

In addition, the "unconditional routing of traffic to route services" would
also offer a more consistent behavior between "static route services" and
"fully brokered services" to developers interested in consuming route
services. App developers could be guaranteed that fully-brokered services
receive traffic even if the app is unavailable (CRASHED, or during a
transient Diego cell unavailability...), just like "static route services"
would.

Lastly, the "unconditional routing of traffic to route services" also seems
more consistent with the current CLI UX: the binding of a route service to
a route is independent of app mapping to the same route. The "cf
bind-route-service" command does not require the route to be bound to an
app. The "cf routes" command lists routes that have bound route services
but are not mapped to any apps, etc.

Trying to imagine drawbacks/impacts of such "unconditional routing", I
could so far only spot:
a- slightly more traffic handled by route services that don't wish to
account for/modify the default 404 response on unavailable app
b- slightly more traffic for the gorouter for handling requests sent to
routes mapped to unavailable apps: the request would now be proxied to
route services, which will query back the gorouter
c- potentially slightly larger gorouter routing table that needs to be kept
in memory (route entries for route services but no app endpoints).

I believe these impacts are acceptable. b) could potentially be reduced by
passing the app status to the route service (e.g. via an additional header
"X-CF-Route-Status" with value "404: route does not exist" ).

Implementation-wise, I'm not sure how deep/strong the current assumption
that "an active route is associated to at least one endpoint" is in the
different CF components (gorouter, its nats messages and routing-api, diego
route emitter and BBS models), and therefore how much effort is required to
implement the "unconditional routing to route services" behavior.

Thanks in advance for your thoughts on this,

Guillaume.


Related use-cases:

*1- returning custom response when app is unavailable (crashed, starting,
stopped, or zero available app instances)*

For apps returning HTML, this may be a custom HTML response (rather than the
default gorouter 404 response page), or a specific HTTP response code such
as "503 service unavailable" to suggest that clients retry. For route
services dealing with routes serving APIs (e.g. SOAP), a route service may
return a proper SOAP-formatted fault response.

Multi-site aware route services may choose to redirect users to a route
hosted on a second CF instance through a "307 temporary redirect" status
code.

A caching service may choose to return (potentially stale) cached content
when the mapped app is in the CRASHED state, rather than returning a 404.

*2- Applying side effects upon unavailability of app*

A SOX-compliant lossless logging service (unlike the potentially lossy
loggregator-based logging) may wish to log full details of the requests
sent to the route, including those that never reached an available app
instance.

An API gateway route service that maintains measurements of the
performance and availability of the exposed APIs that transit through its
bound routes would need to receive traffic when bound apps are crashed.

The autosleep service [1] that I'm working on would be able to dynamically
start an app that was previously stopped in order to save RAM during
inactivity.

[1]
https://docs.google.com/document/d/1tMhIBX3tw7kPEOMCzKhUgmtmr26GVxyXwUTwMO71THI/edit
<https://github.com/Orange-OpenSource/autosleep>
[2] http://docs.cloudfoundry.org/services/route-services.html#architecture
[3]
https://docs.google.com/document/d/1bGOQxiKkmaw6uaRWGd-sXpxL0Y28d3QihcluI15FiIA/edit#heading=h.8djffzes9pnb


Re: Persistent problem with -p option for cups and uups

Don Nelson
 

Thanks, Dies. I opened GitHub issue #824 at the cf repo.


Re: Persistent problem with -p option for cups and uups

Koper, Dies <diesk@...>
 

Hi Don,

-p should work. Please raise a GitHub issue at the CLI repo. Please include a (sanitized) version of your JSON file.

Regards,
Dies Koper
CF CLI PM
Sent from my iPhone

On 25 Apr 2016, at 12:32, Don Nelson <dieseldonx(a)gmail.com> wrote:

Hi all,

Using the latest version of cf (6.17.0+5d0be0a-2016-04-15), I notice that there is still a problem with the -p option for cups and uups. The online help indicates that one can use -p to point to a JSON file with credentials. Some of the docs also indicate this. However, cf takes the path to the file as a prompt and wants input.

Perhaps this is just a synchronization problem with the docs, and it's not really supposed to work this way. Personally, I would vote for a -f option to point to a file, since IMHO having the credentials in a secure and potentially versionable file would be more secure and repeatable than entering them on the CL.

Don


Persistent problem with -p option for cups and uups

Don Nelson
 

Hi all,

Using the latest version of cf (6.17.0+5d0be0a-2016-04-15), I notice that there is still a problem with the -p option for cups and uups. The online help indicates that one can use -p to point to a JSON file with credentials. Some of the docs also indicate this. However, cf takes the path to the file as a prompt and wants input.
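
For reference, the file-based usage described above would look roughly like the sketch below. It only writes a credentials file and builds the argument list; it does not run cf. The helper names, the service name and the credential keys are made up for the example, and whether -p accepts a file path in this cf version is exactly what this message reports as not working.

```python
import json
import os
import tempfile


def write_credentials(creds: dict) -> str:
    """Write credentials to a private temp file and return its path."""
    # mkstemp creates the file readable and writable by the owner only.
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as fh:
        json.dump(creds, fh)
    return path


def cups_command(service_name: str, creds_path: str) -> list:
    """Build the argument list for `cf cups NAME -p PATH` (not executed here)."""
    return ["cf", "cups", service_name, "-p", creds_path]


# Hypothetical service name and credentials, for illustration only.
path = write_credentials({"username": "admin", "password": "example"})
print(" ".join(cups_command("my-service", path)))
os.remove(path)  # clean up the example file
```
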

Perhaps this is just a synchronization problem with the docs, and it's not really supposed to work this way. Personally, I would vote for a -f option to point to a file, since IMHO having the credentials in a secure and potentially versionable file would be more secure and repeatable than entering them on the CL.

Don


Re: Static IP setup for routers on AWS

Amit Kumar Gupta
 

Thanks Dan,

So it sounds like *if* your routers have public IPs, then you need to tell
the UAA about those IPs so it knows to trust them and handle the
x-forwarded-* headers from them. Going back to Johannes's original question,
I think he's right: in the typical AWS configuration there's no reason to
give the routers static IPs. I'll go ahead and submit a PR for this change.

Cheers,
Amit

On Thu, Apr 14, 2016 at 6:57 AM, Daniel Mikusa <dmikusa(a)pivotal.io> wrote:

On Fri, Apr 8, 2016 at 7:04 AM, Engelke, Johannes <
info(a)johannes-engelke.de> wrote:

Hi Amit,
thanks for your answer. I deployed cloud foundry without using static
IP’s. It is working well.

As far as I understood the uaa config the entire 10.x.x.x network is
allowed to access the UAA Servers anyway, so there is no reason to place
the dedicated static IP's of the routers into the config.
Are you referring to the RemoteIpValve that is configured for UAA?


https://github.com/cloudfoundry/uaa-release/blob/develop/jobs/uaa/templates/tomcat.server.xml.erb#L70-L73

I ask because the RemoteIpValve doesn't restrict access to Tomcat / UAA. It
controls how (and if) Tomcat handles the x-forwarded-* headers. In short,
it will only process those headers if it "trusts" them (by trust, it really
means if the regex matches).

My understanding is that the UAA job will take the gorouter IP's and
prepend them to the front of this regex so that it will always match at
least the IP's for the gorouter. If you're using private IP's, it's not
really necessary as the default regex used by Tomcat will match all private
IP's.

If you're using public IP's for some reason, you'd need to configure this
or UAA might not detect the incoming connections as HTTPS and it would very
likely detect the wrong remote IP address (necessary for audit records in
the logs).
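
The "trust by regex" idea can be sketched as follows. This is an illustration only: the private-range pattern is a simplified stand-in and NOT Tomcat's exact default, the function names are invented, and prepending the router IPs mirrors what this message describes as the UAA job's behavior ("my understanding") rather than verified code.

```python
import re

# Simplified pattern for common private IPv4 ranges (assumption for this
# sketch; Tomcat's real default pattern covers more cases).
PRIVATE_RE = (
    r"10\.\d{1,3}\.\d{1,3}\.\d{1,3}"
    r"|192\.168\.\d{1,3}\.\d{1,3}"
    r"|172\.(1[6-9]|2\d|3[01])\.\d{1,3}\.\d{1,3}"
)


def build_trusted_pattern(router_ips):
    """Prepend the literal router IPs to the private-range pattern."""
    escaped = [re.escape(ip) for ip in router_ips]
    return re.compile("|".join(escaped + [PRIVATE_RE]))


def is_trusted(pattern, remote_ip: str) -> bool:
    """x-forwarded-* headers are only honored when the peer IP fully matches."""
    return pattern.fullmatch(remote_ip) is not None


private_only = build_trusted_pattern([])
print(is_trusted(private_only, "10.244.0.22"))  # True
print(is_trusted(private_only, "52.1.2.3"))     # False

# A router with a public IP has to be listed explicitly to be trusted.
with_public = build_trusted_pattern(["52.1.2.3"])
print(is_trusted(with_public, "52.1.2.3"))      # True
```
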


Do you see any security improvements, if only routers are allowed to
access the UAA?
As long as we're talking about RemoteIpValve (sorry if I'm not following
the conversation completely, I jumped in a little late) and you're using
private IP addresses for your VMs, then I don't see any difference in
behavior.

If you have public IP's assigned to your gorouter VMs then you may see
some issues with how the x-forwarded-for and x-forwarded-proto headers are
processed, which in turn could affect the accuracy of the audit messages in
the logs.

Hope that helps!

Dan



On 08 Apr 2016, at 02:19, Amit Gupta <agupta(a)pivotal.io> wrote:

The UAA needs to know the router IPs to know which IPs to accept inbound
requests from. If you don't care about this, you can try configuring UAA
to allow requests from many IPs, and remove the static IPs from gorouter.
I would be interested to find out the result of this experiment should you
try it out.

Best,
Amit

On Thu, Apr 7, 2016 at 6:28 AM, Engelke, Johannes <
info(a)johannes-engelke.de> wrote:

Hi,
does anybody know why the routers got static IPs in the
cf-infrastructure-aws.yml file?
https://github.com/cloudfoundry/cf-release/blob/master/templates/cf-infrastructure-aws.yml#L173

Bosh is assigning the instances to ELB’s during deploy time, so there
should be no need to have static addresses here.

If nobody knows a good reason, should we remove them? ;-)

Cheers
Johannes


Re: Java Buildpack v3.7

Josh Long <starbuxman@...>
 

Congratulations on the massive new release!

On Sat, Apr 23, 2016 at 18:43 James Bayer <jbayer(a)pivotal.io> wrote:

impressive list of new stuff ben and team!

On Thu, Apr 21, 2016 at 12:19 PM, Danny Rosen <drosen(a)pivotal.io> wrote:

Great work!

On Thu, Apr 21, 2016 at 3:07 PM, Dieu Cao <dcao(a)pivotal.io> wrote:

Congrats on the release! Very exciting stuff.

On Thu, Apr 21, 2016 at 10:16 AM, Ben Hale <bhale(a)pivotal.io> wrote:

I'm pleased to announce the release of the java-buildpack, version 3.7.
This release contains the addition of a number of frameworks and updates to
the dependencies.

* Container Certificate Trust Store Framework
* Ruxit APM Framework (via Alois Mayr)
* Dynatrace Framework Enabled (via Mike Villiger)
* Tomcat Configuration Extension Point (via Violeta Georgieva)
* Improved Debug Framework Documentation (via Mike Youngstrom)
* Improved Configuration Diagnostics (via Yann Robert)

For a more detailed look at the changes in 3.7, please take a look at
the commit log[1]. Packaged versions of the buildpack, suitable for use
with create-buildpack and update-buildpack, can be found attached to the
release.


-Ben Hale
Cloud Foundry Java Experience


## Packaged Dependencies

AppDynamics 4.1.8_5
Dynatrace 6.3.0_1305
GemFire Modules Tomcat7 8.2.0
GemFire Modules 8.2.0
GemFire Security 8.2.0
GemFire 8.2.0
Groovy 2.4.6
JRebel 6.4.2
Log4j API 2.1.0
Log4j Core 2.1.0
Log4j Jcl 2.1.0
Log4j Jul 2.1.0
Log4j Slf4j 2.1.0
MariaDB JDBC 1.4.2
Memory Calculator (mountainlion) 2.0.2_RELEASE
Memory Calculator (precise) 2.0.2_RELEASE
Memory Calculator (trusty) 2.0.2_RELEASE
New Relic Agent 3.27.0
OpenJDK JRE (mountainlion) 1.8.0_91
OpenJDK JRE (precise) 1.8.0_73
OpenJDK JRE (trusty) 1.8.0_91
Play Framework JPA Plugin 1.10.0_RELEASE
PostgreSQL JDBC 9.4.1208
RedisStore 1.2.0_RELEASE
Ruxit 1.91.271
SLF4J API 1.7.7
SLF4J JDK14 1.7.7
Spring Auto-reconfiguration 1.10.0_RELEASE
Spring Boot CLI 1.3.3_RELEASE
Spring Boot Container Customizer 1.0.0_RELEASE
Tomcat Access Logging Support 2.5.0_RELEASE
Tomcat Lifecycle Support 2.5.0_RELEASE
Tomcat Logging Support 2.5.0_RELEASE
Tomcat 8.0.33
YourKit Profiler (mountainlion) 2016.02.34
YourKit Profiler (precise) 2016.02.33
YourKit Profiler (trusty) 2016.02.34


[1]: https://github.com/cloudfoundry/java-buildpack/compare/v3.6...v3.7


--
Thank you,

James Bayer


Re: Java Buildpack v3.7

James Bayer
 

impressive list of new stuff ben and team!

On Thu, Apr 21, 2016 at 12:19 PM, Danny Rosen <drosen(a)pivotal.io> wrote:

Great work!

On Thu, Apr 21, 2016 at 3:07 PM, Dieu Cao <dcao(a)pivotal.io> wrote:

Congrats on the release! Very exciting stuff.

On Thu, Apr 21, 2016 at 10:16 AM, Ben Hale <bhale(a)pivotal.io> wrote:

I'm pleased to announce the release of the java-buildpack, version 3.7.
This release contains the addition of a number of frameworks and updates to
the dependencies.

* Container Certificate Trust Store Framework
* Ruxit APM Framework (via Alois Mayr)
* Dynatrace Framework Enabled (via Mike Villiger)
* Tomcat Configuration Extension Point (via Violeta Georgieva)
* Improved Debug Framework Documentation (via Mike Youngstrom)
* Improved Configuration Diagnostics (via Yann Robert)

For a more detailed look at the changes in 3.7, please take a look at
the commit log[1]. Packaged versions of the buildpack, suitable for use
with create-buildpack and update-buildpack, can be found attached to the
release.


-Ben Hale
Cloud Foundry Java Experience


## Packaged Dependencies

AppDynamics 4.1.8_5
Dynatrace 6.3.0_1305
GemFire Modules Tomcat7 8.2.0
GemFire Modules 8.2.0
GemFire Security 8.2.0
GemFire 8.2.0
Groovy 2.4.6
JRebel 6.4.2
Log4j API 2.1.0
Log4j Core 2.1.0
Log4j Jcl 2.1.0
Log4j Jul 2.1.0
Log4j Slf4j 2.1.0
MariaDB JDBC 1.4.2
Memory Calculator (mountainlion) 2.0.2_RELEASE
Memory Calculator (precise) 2.0.2_RELEASE
Memory Calculator (trusty) 2.0.2_RELEASE
New Relic Agent 3.27.0
OpenJDK JRE (mountainlion) 1.8.0_91
OpenJDK JRE (precise) 1.8.0_73
OpenJDK JRE (trusty) 1.8.0_91
Play Framework JPA Plugin 1.10.0_RELEASE
PostgreSQL JDBC 9.4.1208
RedisStore 1.2.0_RELEASE
Ruxit 1.91.271
SLF4J API 1.7.7
SLF4J JDK14 1.7.7
Spring Auto-reconfiguration 1.10.0_RELEASE
Spring Boot CLI 1.3.3_RELEASE
Spring Boot Container Customizer 1.0.0_RELEASE
Tomcat Access Logging Support 2.5.0_RELEASE
Tomcat Lifecycle Support 2.5.0_RELEASE
Tomcat Logging Support 2.5.0_RELEASE
Tomcat 8.0.33
YourKit Profiler (mountainlion) 2016.02.34
YourKit Profiler (precise) 2016.02.33
YourKit Profiler (trusty) 2016.02.34


[1]: https://github.com/cloudfoundry/java-buildpack/compare/v3.6...v3.7

--
Thank you,

James Bayer


Re: How can i configure HA Doppler at cf.yml?

inho cho
 

My question missed a point.
Yes, metron gets doppler addresses from etcd, but only for dopplers in the
same zone. If possible, I would like to know how to set and get a different
zone's dopplers.
For example,
- cf.yml
metron-agent:
- zone : z1, z2
On Apr 23, 2016 at 3:50 AM, "Warren Fernandes" <wfernandes(a)pivotal.io> wrote:

Metron reads the doppler addresses from etcd. Each doppler advertises its
address to etcd.

Increasing the number of doppler instances will automatically spread the
load from all the metrons over all the dopplers.



On Mon, Apr 18, 2016 at 5:36 AM, 인호 조 <ihocho(a)crossent.com> wrote:

I read "Overview of the Loggregator System " -
https://docs.cloudfoundry.org/loggregator/architecture.html

In that document, metron_agent can forward metrics or logs to N dopplers.

But I don't know how to do it.

Could you let me know how to configure it in cf.yml?

Thanks & Regards


Re: Remarks about the “confab” wrapper for consul

Amit Kumar Gupta
 

Hi Benjamin,

Re bumping consul:

Yes, soon: https://www.pivotaltracker.com/story/show/113007637

Re updating confab so that people can tweak their consul settings directly
from BOSH deployment:

Currently that's not in the plans, no. However, we'd very much like to
understand your findings after changing configurations.

Re updating this “skip_leave_on_interrupt” config in confab:

We don't currently plan on changing it. Because RAFT needs to be very
carefully orchestrated when rolling clusters, scaling up, scaling down,
etc. we would need to see that there is a problem and that this is the root
cause. There are Consul acceptance tests (CONSATS) that exercise all this
orchestration:
https://github.com/cloudfoundry-incubator/consul-release#acceptance-tests

If you're having a lot of flapping, suspicions, failed acks, etc. this
points to a different root cause. This often has to do with restricted UDP
traffic, network ACLs, etc. I opened an issue about this on the consul
repo a year ago, and it's still open:
https://github.com/hashicorp/consul/issues/916.

Please keep us posted on your discoveries.

Best,
Amit

On Fri, Apr 15, 2016 at 5:32 AM, Benjamin Gandon <benjamin(a)gandon.org>
wrote:

As an update, it looks like I’m running into the Node health flapping
<https://github.com/hashicorp/consul/issues/1212> issue that is more
frequent with consul 0.5.x servers compared to 0.6.x servers.

→ Q1: Are you planning to upgrade the consul version used in CF and Diego
from 0.5.2 to 0.6.4 in the near future?


Also, people recommend the following settings to mitigate the issue.

"dns_config": {
  "allow_stale": true,
  "node_ttl": "5s",
  "service_ttl": {
    "*": "5s"
  }
}

I’ll try those and keep you updated with the results next week.
Unfortunately, I’ll have to fork the consul-release
<https://github.com/cloudfoundry-incubator/consul-release> because those
settings are also hardwired to their default
<https://github.com/cloudfoundry-incubator/consul-release/blob/master/src/confab/config/consul_config_definer.go#L13-L35> in
confab.

→ Q2: Are you planning to update confab so that people can tweak their
consul settings directly from BOSH deployment?


Regarding my previous remark about properly configuring “
skip_leave_on_interrupt” and “leave_on_terminate” in confab, I understand
that the default value of “true” for “leave_on_terminate” might be
necessary to properly scale down a consul cluster with BOSH.

But I saw today that skip_leave_on_interrupt will default to true
<https://github.com/hashicorp/consul/blob/master/CHANGELOG.md> for consul
*servers* in the upcoming version 0.7.0. Currently, this config is
hard-wired to its default value of “false” in confab.

→ Q3: Are you planning to update this “skip_leave_on_interrupt” config in
confab?


/Benjamin


On 14 Apr 2016, at 17:00, Benjamin Gandon <benjamin(a)gandon.org> wrote:

Thank you Amit for your answer.


I ran into the “all-consuls-go-crazy” situation again today, as I do almost
every day, actually. As soon as they start this flapping membership issue,
the whole cf+diego deployment goes down.

When I restart the consul servers before deleting the content of the
persistent storage, they don’t manage to elect a leader:
https://gist.github.com/bgandon/08707466324be7c9a093a56fd95a64e4

After I delete */var/vcap/store/consul_agent* on all 3 consul servers, a
consul leader is properly elected, but the cluster rapidly starts
flapping again with failure suspicions, missing acks, and timeouts:
https://gist.github.com/bgandon/cab53c22da66b24beff46389ba7f0bdc

At that point, the load of the bosh-lite VM goes up to 280+ and
everything becomes very unresponsive.

How is it possible to bring the consul cluster back to a healthy state? I
don’t want to reboot the bosh-lite VM and recreate all deployments with
cloudchecks anymore.


/Benjamin


On 11 Apr 2016, at 22:40, Amit Gupta <agupta(a)pivotal.io> wrote:

Orchestrating a raft cluster in a way that requires no manual intervention
is incredibly difficult. We write the PID file late for a specific reason:

https://www.pivotaltracker.com/story/show/112018069

For dealing with wedged states like the one you encountered, we have some
recommendations in the documentation:

https://github.com/cloudfoundry-incubator/consul-release/#disaster-recovery

We have acceptance tests we run in CI that exercise rolling a 3 node
cluster, so if you hit a failure it would be useful to get logs if you have
any.

Cheers,
Amit

On Mon, Apr 11, 2016 at 9:38 AM, Benjamin Gandon <benjamin(a)gandon.org>
wrote:

Actually, doing some further tests, I realize a mere 'join' is definitely
not enough.

Instead, you need to restore the raft/peers.json on each one of the 3
consul server nodes:

monit stop consul_agent
echo '["10.244.0.58:8300","10.244.2.54:8300","10.244.0.54:8300"]' > /var/vcap/store/consul_agent/raft/peers.json


And make sure you start them at roughly the same time with “monit start
consul_agent”
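
If you need to do this for a different set of servers, the peers.json content can be generated rather than typed by hand. A minimal sketch, assuming the same address:port list format and port 8300 as in the echo command above; the function name is made up, and you would still write the result to the file on each server yourself.

```python
import json

RAFT_PORT = 8300  # port taken from the example in this message


def peers_json(server_ips):
    """Render the raft peers list as compact JSON, one 'ip:port' per server."""
    peers = [f"{ip}:{RAFT_PORT}" for ip in server_ips]
    return json.dumps(peers, separators=(",", ":"))


print(peers_json(["10.244.0.58", "10.244.2.54", "10.244.0.54"]))
# ["10.244.0.58:8300","10.244.2.54:8300","10.244.0.54:8300"]
```
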

So this argues strongly for setting *skip_leave_on_interrupt=true*
and *leave_on_terminate=false* in confab, because losing the peers.json
is really something we don't want in our CF deployments!

/Benjamin


On 11 Apr 2016, at 18:15, Benjamin Gandon <benjamin(a)gandon.org> wrote:

Hi cf devs,


I’m running a CF deployment with redundancy, and I just experienced my
consul servers not being able to elect any leader.
That’s a VERY frustrating situation that keeps the whole CF deployment
down, until you get a deeper understanding of consul, and figure out they
just need a silly manual 'join' so that they get back together.

But that was definitely not easy to nail down because at first look, I
could just see monit restarting the “agent_ctl” every 60 seconds because
confab was not writing the damn PID file.


More specifically, the 3 consul servers (i.e. consul_z1/0, consul_z1/1
and consul_z2/0) had properly left one another upon a graceful shutdown.
This state was persisted in /var/vcap/store/raft/peers.json being “null” on
each one of them, so they would not get back together on restart. A manual
'join' was necessary. But it took me hours to get there because I’m no
expert with consul.

Until the 'join' was made, VerifySynced() was negative in confab, and
monit kept starting and stopping it every 60 seconds. But once
you step back, you realize confab was actually waiting for the new leader
to be elected before writing the PID file, which is questionable.

So, I’m asking 3 questions here:

1. Does writing the PID file in confab *that* late really make sense?
2. Could someone please write some minimal documentation about confab, at
least to tell what it is supposed to do?
3. Wouldn’t it be wiser for the cluster to become unhealthy whenever any of
the consul servers is missing?

With this 3rd question, I mean that even on a graceful TERM or INT, no
consul server should perform any graceful 'leave'. With this different
approach, they would properly come back up even when performing a
complete graceful restart of the cluster.

This can be done with those extra configs from the “confab” wrapper:

{
"skip_leave_on_interrupt": true,
"leave_on_terminate": false
}
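
As an illustration only, the sketch below checks that a consul agent configuration, given as parsed JSON, carries both settings. The function name is hypothetical, and how confab merges extra config into the agent configuration is not shown here.

```python
import json


def keeps_peers_on_shutdown(config):
    """True when the config disables a graceful 'leave' on both INT and TERM.

    Both keys must be set explicitly; missing keys count as "not configured".
    """
    return (
        config.get("skip_leave_on_interrupt") is True
        and config.get("leave_on_terminate") is False
    )


if __name__ == "__main__":
    extra = json.loads(
        '{"skip_leave_on_interrupt": true, "leave_on_terminate": false}'
    )
    print(keeps_peers_on_shutdown(extra))
```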

What do you guys think of it?


/Benjamin




removing global "domain" property from Cloud Foundry manifest

Amit Kumar Gupta
 

Hi all,

Every week, I see a user confused by the domain, system_domain, and
app_domains properties in the Cloud Foundry manifest. The confusion is
primarily around domain and system_domain. These properties can be
confusing even for the core development teams, since these global
properties are used by many different jobs coming from many different
projects, and the usage may be inconsistent.

I propose we consolidate on system_domain. Here's where the other
property, domain, is currently used:

$ grep -r ' domain:' -- */spec
blobstore/spec: domain:
cloud_controller_clock/spec: domain:
cloud_controller_ng/spec: domain:
cloud_controller_worker/spec: domain:
dea_next/spec: domain:
hm9000/spec: domain:
uaa/spec: domain:
uaa/spec: domain: <(String) domain for cookie, default is incoming
request domain>
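
For anyone wanting to run the same check without grep, here is a rough Python sketch. It assumes a directory with one sub-directory per job, each containing a spec file (the same layout the */spec glob above implies); the function names are hypothetical.

```python
import re
from pathlib import Path


def mentions_property(spec_text, prop="domain"):
    """Roughly mirror grep ' domain:': the name must follow a space or tab."""
    return re.search(rf"[ \t]{re.escape(prop)}:", spec_text) is not None


def jobs_mentioning(jobs_dir, prop="domain"):
    """Return the sorted names of jobs whose spec file mentions the property."""
    return sorted(
        spec.parent.name
        for spec in Path(jobs_dir).glob("*/spec")
        if mentions_property(spec.read_text(), prop)
    )


if __name__ == "__main__":
    # "." is a placeholder for the jobs directory of a release checkout.
    for job in jobs_mentioning("."):
        print(job)
```

Because the match requires whitespace right before the name, system_domain and app_domains are not reported.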

Details:

- *uaa* job spec actually says it's deprecated, so we can just delete it
- *hm9000* doesn't actually use it, it probably used to when it
registered its own route
- *dea* seems to only use it for directory server, and can probably
safely use system_domain instead
- *cloud_controller_ng & friends* use both domain and system_domain, but
the CAPI team has concluded that only system_domain is needed.

I just wanted to give the community a heads-up on this coming change. I'll
ask the PMs of the various teams to try to remove these properties from
their BOSH job specs, and the Release Integration team will then make sure
to remove them from any manifest generation templates and documentation.

Cheers,
Amit


Re: How can i configure HA Doppler at cf.yml?

Warren Fernandes
 

Metron reads the doppler addresses from etcd. Each doppler advertises its
address to etcd.

Increasing the number of doppler instances will automatically spread the
load from all the metrons over all the dopplers.

On Mon, Apr 18, 2016 at 5:36 AM, 인호 조 <ihocho(a)crossent.com> wrote:

I read "Overview of the Loggregator System " -
https://docs.cloudfoundry.org/loggregator/architecture.html

In that document, metron_agent can forward metrics or logs to N dopplers.

But I don't know how to do it.

Could you let me know how to configure it in cf.yml?

Thanks & Regards


Re: pg gem and ruby app deployment issues

John Shahid
 

Hi Evgeniy,

Are you using a customized stack with that Cloud Foundry deployment? I’m
looking at the cflinuxfs2 stack/rootfs and libpq-dev has been there for a while
now. This package has provided pg_config since version 9.3.4
<http://packages.ubuntu.com/trusty/arm64/libpq-dev/filelist>.

Another (unlikely) possibility: have you set --with-pg-config locally? If
you had, there would be a .bundle/config inside your app with contents
similar to the following:


BUNDLE_BUILD__PG: "--with-pg-config=/usr/pgsql-9.1/bin/pg_config"
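
To check an app for such a setting before pushing, something like the following Python sketch could be used. It assumes .bundle/config consists of simple "KEY: value" lines like the one above and does not use a full YAML parser; pg_build_flags is a hypothetical helper name.

```python
def pg_build_flags(bundle_config_text):
    """Return the BUNDLE_BUILD__PG value from .bundle/config text, or None."""
    for line in bundle_config_text.splitlines():
        # Split on the first colon only, so the value may contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "BUNDLE_BUILD__PG":
            return value.strip().strip('"')
    return None


if __name__ == "__main__":
    sample = 'BUNDLE_BUILD__PG: "--with-pg-config=/usr/pgsql-9.1/bin/pg_config"\n'
    print(pg_build_flags(sample))
```

A non-None result means a locally configured pg_config path is shipped with the app and may not exist in the staging container.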

On Fri, Apr 22, 2016 at 5:44 AM Evgeniy Litvinenko <mirakl577(a)gmail.com>
wrote:

Good day,

We have a Ruby application which requires the 'pg' gem to connect to a
Postgres db. For some reason I can't deploy it: it fails in the bundle install
phase. I tried different Ruby buildpacks and Ruby versions but without any
luck. Our cf version is 212.

Gemfile:
source "https://rubygems.org"

ruby '2.3.0'
gem "sinatra", "~> 1.4.3"
gem 'json'
gem 'rest-client'
gem 'pg'


ERROR:
Cloning into '/tmp/buildpacks/ruby-buildpack'...
Submodule 'compile-extensions' (
https://github.com/cloudfoundry/compile-extensions) registered for path
'compile-extensions'
Cloning into 'compile-extensions'...
Submodule path 'compile-extensions': checked out
'4a0e48afc46c1d467b7c75a8ae5e6f3a044d3d64'
-------> Buildpack version 1.6.16
Downloaded [
https://pivotal-buildpacks.s3.amazonaws.com/ruby/binaries/shared/bundler-1.11.2.tgz
]
-----> Compiling Ruby/Rack
Downloaded [
https://pivotal-buildpacks.s3.amazonaws.com/concourse-binaries/ruby/ruby-2.3.0-linux-x64.tgz
]
-----> Using Ruby version: ruby-2.3.0
-----> Installing dependencies using bundler 1.11.2
Downloaded [
https://pivotal-buildpacks.s3.amazonaws.com/ruby/binaries/cflinuxfs2/libyaml-0.1.6.tgz
]
Running: bundle install --without development:test --path
vendor/bundle --binstubs vendor/bundle/bin -j4 --deployment
Fetching gem metadata from https://rubygems.org/.........
Fetching version metadata from https://rubygems.org/..
Using json 1.8.3
Installing netrc 0.11.0
Installing unf_ext 0.0.7.2 with native extensions
Installing mime-types 2.99.1
Installing pg 0.18.4 with native extensions
Installing rack 1.6.4
Installing tilt 2.0.2
Using bundler 1.11.2
Installing rack-protection 1.5.3
Installing sinatra 1.4.7
Gem::Ext::BuildError: ERROR: Failed to build gem native extension.
current directory:
/tmp/staged/app/vendor/bundle/ruby/2.3.0/gems/pg-0.18.4/ext
/tmp/staged/app/vendor/ruby-2.3.0/bin/ruby -r
./siteconf20160422-297-z2ly6c.rb extconf.rb
checking for pg_config... no
No pg_config... trying anyway. If building fails, please try again
with
--with-pg-config=/path/to/pg_config
checking for libpq-fe.h... no
Can't find the 'libpq-fe.h header
*** extconf.rb failed ***
