
Re: Issues on Diego Deployment on Openstack

Tom Sherrod <tom.sherrod@...>
 

Johannes,

You got diego and v212 running successfully in Openstack? Can you share the manifests?
I just got v214 running successfully and wish to get diego into it. Hoping your trial and error can help.

Thanks,
Tom


Re: Running Docker private images on CF

Gwenn Etourneau
 

t "#<RspecApiDocumentation::Views::HtmlExample:0x0000000bb883e0>" To be a
bug in the Documentation tracked here
https://www.pivotaltracker.com/n/projects/1003146/stories/100845526

A real example is (
https://github.com/cloudfoundry/cloud_controller_ng/blob/dfcb9553bab8ca8c430a399620b5b4a028bdd2f7/spec/api/documentation/apps_api_spec.rb#L49
)

{ 'docker_user' => 'user name',
'docker_password' => 's3cr3t',
'docker_email' => 'email(a)example.com',
'docker_login_server' => 'https://index.docker.io/v1/' }


But your 404 error just means something was not found (the host or the
docker image; not sure which one).

On Wed, Aug 12, 2015 at 1:32 AM, dharmi <dharmi(a)gmail.com> wrote:

We have CF v214 with Diego deployed on AWS.

I am able to successfully create apps from the Docker public repo, as per the
apidocs <http://apidocs.cloudfoundry.org/214/apps/creating_an_app.html>,
but while creating apps from Docker private repos, I see the below
error from 'cf logs' when starting the app.

[API/0] OUT Updated app with guid bcb8f363-xyz
({"route"=>"5af6948b-xyz"})
[API/0] OUT Updated app with guid bcb8f363-xyz ({"state"=>"STARTED"})
[STG/0] OUT Creating container
[STG/0] OUT Successfully created container
[STG/0] OUT Staging...
[STG/0] OUT Staging process started ...
[STG/0] ERR Staging process failed: Exit trace for group:
[STG/0] ERR builder exited with error: failed to fetch metadata from
[:dockerid/go-app] with tag [latest] and insecure registries [] due to HTTP
code: 404
[STG/0] OUT Exit status 2
[STG/0] ERR Staging Failed: Exited with status 2
[API/0] ERR Failed to stage application: staging failed


cf curl command for reference.

cf curl /v2/apps -X POST -H "Content-Type: application/json" \
  -H "Authorization: bearer *accessToken*" -d '
{"name": "myapp",
"space_guid": "71b22eba-xyz",
"docker_image": ":dockerid/go-app",
"diego": true,
"docker_credentials_json":
{"docker_login_server": "https://index.docker.io/v1/",
"docker_user": ":dockerid",
"docker_password": ":dockerpwd",
"docker_email": ":email"
}
}'

Looking at the apidocs, the 'Example value' for 'docker_credentials_json'
indicates a Hash value
(#<RspecApiDocumentation::Views::HtmlExample:0x0000000bb883e0>), but looking
inside the code, we found the format below.

let(:docker_credentials) do
  {
    docker_login_server: login_server,
    docker_user: user,
    docker_password: password,
    docker_email: email
  }
end

Please correct me if I am missing something.

Thanks,
Dharmi



--
View this message in context:
http://cf-dev.70369.x6.nabble.com/Running-Docker-private-images-on-CF-tp1148.html
Sent from the CF Dev mailing list archive at Nabble.com.


Re: Diego log grouping

MJ
 

I’m wondering if there are any updates on this.

thanks,
Mike

On Jul 21, 2015, at 10:14 AM, Mike Jacobi wrote:

The CF messages ingest OK into Splunk; none of the Diego messages ingest, not even the first line of their payload.

The problematic deployment:

• CF v212
• Stemcell 3012 (bosh-stemcell-3012-vsphere-esxi-ubuntu-trusty-go_agent.tgz)
• Diego 1304


A single CF syslog message payload:

2015-07-21T00:26:51.619730+00:00 10.5.139.228 vcap.hm9000.listener [job=hm9000_z1 index=1] {"timestamp":1437438411.619571447,"process_id":9750,"source":"vcap.hm9000.listener","log_level":"info","message":"Saved Heartbeats - {\"Duration\":\"217.887334ms\",\"Heartbeats to Save\":\"3\"}","data":null}

A single Diego syslog message payload:

2015-07-21T00:27:02.389177+00:00 10.5.139.241 vcap.receptor [job=cell_z1 index=1] {"timestamp":"1437438422.389095783","source":"receptor","message":"receptor.task-handler.get-all.succeeded-fetching-tasks-from-store","log_level":1,"data":{"domain":"","session":"3.3949"}}
<13>2015-07-21T00:27:02.389399+00:00 10.5.139.241 vcap.receptor [job=cell_z1 index=1] {"timestamp":"1437438422.389337301","source":"receptor","message":"receptor.request.done","log_level":1,"data":{"method":"GET","request":"/v1/tasks","session":"20204"}}
<13>2015-07-21T00:27:02.389982+00:00 10.5.139.241 vcap.receptor [job=cell_z1 index=1] {"timestamp":"1437438422.389916658","source":"receptor","message":"receptor.request.serving","log_level":1,"data":{"method":"GET","request":"/v1/desired_lrps","session":"20205"}}
<13>2015-07-21T00:27:02.620949+00:00 10.5.139.241 vcap.receptor [job=cell_z1 index=1] {"timestamp":"1437438422.620829105","source":"receptor","message":"receptor.request.done","log_level":1,"data":{"method":"GET","request":"/v1/desired_lrps","session":"20205"}}
<13>2015-07-21T00:27:02.621660+00:00 10.5.139.241 vcap.receptor [job=cell_z1 index=1] {"timestamp":"1437438422.621593714","source":"receptor","message":"receptor.request.serving","log_level":1,"data":{"method":"GET","request":"/v1/actual_lrps","session":"20206"}}
<13>2015-07-21T00:27:03.145810+00:00 10.5.139.241 vcap.receptor [job=cell_z1 index=1] {"timestamp":"1437438423.145678282","source":"receptor","message":"receptor.request.done","log_level":1,"data":{"method":"GET","request":"/v1/actual_lrps","session":"20206"}}
<13>2015-07-21T00:27:03.146939+00:00 10.5.139.241 vcap.receptor [job=cell_z1 index=1] {"timestamp":"1437438423.146871328","source":"receptor","message":"receptor.request.serving","log_level":1,"data":{"method":"GET","request":"/v1/domains","session":"20207"}}
<13>2015-07-21T00:27:03.192371+00:00 10.5.139.241 vcap.receptor [job=cell_z1 index=1] {"timestamp":"1437438423.192266226","source":"receptor","message":"receptor.request.done","log_level":1,"data":{"method":"GET","request":"/v1/domains","session":"20207"}}
<12>2015-07-21T00:27:03.692338+00:00 10.5.139.241 vmsvc [job=cell_z1 index=1] [ warning] [guestinfo] Failed to get vmstats.
<13>2015-07-21T00:27:03.936841+00:00 10.5.139.241 vcap.rep [job=cell_z1 index=1] {"timestamp":"1437438423.935924530","source":"rep","message":"rep.running-bulker.sync.starting","log_level":1,"data":{"session":"11.7938"}}
<13>2015-07-21T00:27:03.937079+00:00 10.5.139.241 vcap.rep [job=cell_z1 index=1] {"timestamp":"1437438423.936018705","source":"rep","message":"rep.running-bulker.sync.batch-operations.started","log_level":1,"data":{"session":"11.7938.1"}}
<13>2015-07-21T00:27:03.937110+00:00 10.5.139.241 vcap.rep [job=cell_z1 index=1] {"timestamp":"1437438423.936035156","source":"rep","message":"rep.running-bulker.sync.batch-operations.getting-containers-lrps-and-tasks","log_level":1,"data":{"session":"11.7938.1"}}
<13>2015-07-21T00:27:03.937132+00:00 10.5.139.241 vcap.rep [job=cell_z1 index=1] {"timestamp":"1437438423.936167717","source":"rep","message":"rep.running-bulker.sync.batch-operations.fetching-tasks-from-store","log_level":1,"data":{"session":"11.7938.1"}}
<13>2015-07-21T00:27:03.968190+00:00 10.5.139.241 vcap.rep [job=cell_z1 index=1] {"timestamp":"1437438423.968116999","source":"rep","message":"rep.running-bulker.sync.batch-operations.succeeded-fetching-tasks-from-store","log_level":1,"data":{"session":"11.7938.1"}}
<13>2015-07-21T00:27:06.495263+00:00 10.5.139.241 vcap.rep [job=cell_z1 index=1] {"timestamp":"1437438426.495136499","source":"rep","message":"rep.running-bulker.sync.batch-operations.succeeded-getting-containers-lrps-and-tasks","log_level":1,"data":{"session":"11.7938.1"}}
<13>2015-07-21T00:27:06.495341+00:00 10.5.139.241 vcap.rep [job=cell_z1 index=1] {"timestamp":"1437438426.495275497","source":"rep","message":"rep.running-bulker.sync.batch-operations.succeeded","log_level":1,"data":{"batch-size":0,"session":"11.7938.1"}}

Another (single) Diego syslog message payload:

2015-07-21T00:27:10.310490+00:00 10.5.139.246 vcap.route-emitter [job=route_emitter_z1 index=0] {"timestamp":"1437438430.310252428","source":"route-emitter","message":"route-emitter.watcher.sync.emitting-messages","log_level":1,"data":{"messages":{"RegistrationMessages":null,"UnregistrationMessages":null},"session":"5.10080"}}
<13>2015-07-21T00:27:10.312037+00:00 10.5.139.246 vcap.route-emitter [job=route_emitter_z1 index=0] {"timestamp":"1437438430.311890841","source":"route-emitter","message":"route-emitter.watcher.sync.complete","log_level":1,"data":{"session":"5.10080"}}
<12>2015-07-21T00:27:11.049315+00:00 10.5.139.246 vmsvc [job=route_emitter_z1 index=0] [ warning] [guestinfo] Failed to get vmstats.

-Mike

From: cf-dev-bounces(a)lists.cloudfoundry.org [mailto:cf-dev-bounces(a)lists.cloudfoundry.org] On Behalf Of Eric Malm
Sent: Tuesday, July 21, 2015 12:01 AM
To: Discussions about Cloud Foundry projects and the system overall.
Subject: Re: [cf-dev] Diego log grouping

Hi, Mike,

Thanks for the report! From your packet captures or on-VM logs, do you have an example of the log line groups that Splunk is failing to ingest? Is it all the log lines, or just ones coming from particular Diego components?

The github.com/tedsuo/ifrit dependency hasn't changed in diego-release between 1099 and 1304, but it's possible that our use of it in diego-release has. Likewise, the github.com/pivotal-golang/lager package that's emitting logs has changed in only trivial ways between those releases. We have upgraded the release to use Go 1.4.2 instead of 1.4, though.

Also, what stemcell versions are you using in the deployments? I'm assuming that if CF is deployed alongside these Diego deployments, it's at the corresponding recommended final version (v207 and v212, respectively). If so, are there any problems with the syslog messages coming from those deployments?

Thanks,
Eric, CF Runtime Diego PM



On Mon, Jul 20, 2015 at 6:51 PM, Mike Jacobi wrote:
We have a Diego 1099 deployment with syslog_daemon_config configured. We see a 1:1 mapping from Diego platform messages to syslog messages. In other words, for each syslog message that hits the wire, there is one platform message as its payload. This works well with Splunk, which is ultimately where the messages end up.

We have another deployment on Diego 1304, with its syslog_daemon_config identical to the other, but Splunk is *not* ingesting its logs. We ran a packet capture and discovered that this deployment is grouping its log messages in a 1:n manner: for each syslog message on the wire, we have multiple platform messages within, separated by newlines. I suspect this is the reason the logs aren’t being ingested.

I took a quick glance at the code and it seems like this might be due to ifrit/grouper, but I can’t say for sure.

Has anyone run into this issue?

Thanks,
Mike




Missing routing logs from "cf logs app-name"

Simon Johansson <simon@...>
 

Howdie! Not really a dev-related question, but since there is no cf-user list I'm trying my luck here. :)

I'm looking into an issue where we don't get enough RTR logs from cf logs / firehose-to-syslog.
I'm using cf-env[1] as the test app, and I'm hitting app.domain.com/kehehehe.
For each request, the app itself logs the path requested, and the gorouters log the same information but a bit more verbose.

$ cf logs cf-env-test
...
2015-08-11T11:43:41.24+0200 [RTR/0] OUT cf-env-test.domain.com - [11/08/2015:09:43:41 +0000] "GET /kehehehe HTTP/1.1" 404 0 18 "-" "curl/7.43.0" 10.230.15.4:59728 x_forwarded_for:"10.230.15.4" vcap_request_id:4a186d9e-046a-47bc-74d0-f0c3a8cb1257 response_time:0.008503156 app_id:7f1944a2-2197-43b5-9334-0be7c8b9b40e
2015-08-11T11:43:41.31+0200 [App/0] ERR 10.230.15.8 - - [11/Aug/2015 09:43:41] "GET /kehehehe HTTP/1.1" 404 18 0.0006

I shot off 1000 requests and would expect to see 1000 of each log type, but only the App logs are correct.

$ grep "App/" /tmp/cf-logs | wc -l # /tmp/cf-logs comes from cf logs cf-env-test > /tmp/cf-logs
1000
$ grep "RTR/" /tmp/cf-logs | wc -l
145

Looking at the gorouters
nsaadmin(a)a99a5339-4308-43db-b3a4-59442169861d:~$ grep "GET /kehehehe" /var/vcap/sys/log/gorouter/access.log | wc -l
484
nsaadmin(a)1d8ef2c2-2ec1-417d-8882-0234de251c60:~$ grep "GET /kehehehe" /var/vcap/sys/log/gorouter/access.log | wc -l
516
So the gorouter processes definitely log the right number of requests to disk.

In metron_agent.stdout.log on the gorouters there are a lot of lines like

{"timestamp":1439286941.958369255,"process_id":25749,"source":"metron","log_level":"warn","message":"no matching HTTP start message found for {low:6001753075995010215 high:3404432699330643838 1}","data":null,"file":"/var/vcap/data/compile/metron_agent/loggregator/src/metron/messageaggregator/message_aggregator.go","line":97,"method":"metron/messageaggregator.(*MessageAggregator).handleHTTPStop"}
{"timestamp":1439286941.959420443,"process_id":25749,"source":"metron","log_level":"warn","message":"no matching HTTP start message found for {low:8668459744461365080 high:13197025281073682796 2}","data":null,"file":"/var/vcap/data/compile/metron_agent/loggregator/src/metron/messageaggregator/message_aggregator.go","line":97,"method":"metron/messageaggregator.(*MessageAggregator).handleHTTPStop"}
{"timestamp":1439286941.964927912,"process_id":25749,"source":"metron","log_level":"warn","message":"no matching HTTP start message found for {low:13205430422726438703 high:7208042014820494920 1}","data":null,"file":"/var/vcap/data/compile/metron_agent/loggregator/src/metron/messageaggregator/message_aggregator.go","line":97,"method":"metron/messageaggregator.(*MessageAggregator).handleHTTPStop"}

Has anyone seen this before?

[1] https://github.com/cloudfoundry-community/cf-env


Re: Strategies for limiting metric updates with a clustered nozzle

Mike Youngstrom
 

Sounds great. I think the random solution works for me now. I'm glad you
are aware of the use case and have tentative plans to improve it in the
future. Thanks Erik!

Mike

On Tue, Aug 11, 2015 at 1:10 PM, Erik Jasiak <ejasiak(a)pivotal.io> wrote:

(list resend #1)
Hi Mike,

I think your random approach is workable; what you are doing in effect
is taking fewer polling samples off of the firehose stream.

Short of the aggregation answer James pointed out, this has the
potential to mess with a few things, like averages, but it's better than
nothing if you have to rate-control at ingest, and are looking for a
low-cost solution.

In the longer-term, we are looking closely at how to make it easier to
aggregate metrics at either end of loggregator to help with the amount of
data, and hope to have more info shortly. Hopefully that will help with
controlling data flow no matter how often a component emits metrics.

Erik

On Sat, Aug 8, 2015 at 10:49 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Thanks James,

A little more complicated with more moving parts than I was hoping for
but if I don't want to miss anything I probably don't have much of a choice.

I think for now I'm going to go with some kind of random approach. At
least for the dropsonde generated metrics since they are by far the most
frequent/expensive and I think grabbing a random smattering of them will be
good enough for my current uses.

Mike

On Sat, Aug 8, 2015 at 7:02 AM, James Bayer <jbayer(a)pivotal.io> wrote:

warning, thinking out loud here...

your nozzle will tap the firehose, and filter for the metrics you care
about

currently you're publishing these events to your metrics backend as
fast as they come in, across a horizontally scalable tier that doesn't
coordinate, and that can be expensive if your backend charges by the
transaction

to slow down the stream, you could consider having the work in two
phases:
1) aggregation phase
2) publish phase

the aggregation phase could have each instance of the horizontally
scaled-out tier put the metric in a temporary data store such as redis or
another in-memory data grid with HA, like apache geode [1].

the publish phase would have something like a cron / spring batch
capability to occasionally (as often as made sense for your costs) flush
the metrics from the temporary data store to the per-transaction-cost
backend

[1] http://geode.incubator.apache.org/
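
A rough sketch of those two phases, in JavaScript for concreteness (the
plain object stands in for redis/geode, and publishToBackend is a
hypothetical client call, not a real API):

var latest = {}; // metric name -> most recent value seen

function onFirehoseMetric(name, value) { // phase 1: aggregate
  latest[name] = value;
}

setInterval(function () { // phase 2: publish on a cost-driven schedule
  Object.keys(latest).forEach(function (name) {
    publishToBackend(name, latest[name]); // hypothetical backend client
  });
  latest = {};
}, 60 * 1000); // e.g. flush once a minute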

On Fri, Aug 7, 2015 at 9:26 AM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

I suppose one relatively simple solution to this problem is I can have
each cluster member randomly decide if it should log each metric. :) If I
pick a number between 1 and 6 I suppose odds are I would log about every
6th message on average or something like that. :)

Another idea, I could have each member pick a random number between 1
and 10 and I would skip that many messages before publishing then pick a
new random number.

I think it is mostly the dropsonde messages that are killing me. A
technique like this probably wouldn't really work for metrics derived from
http events and such.
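
For what it's worth, the random gate itself is tiny. A sketch in
JavaScript (backend.send below is hypothetical, standing in for whatever
publish call the nozzle makes); since each member samples independently,
the cluster needs no coordination:

var SAMPLE_RATE = 6; // publish roughly 1 in 6 metrics on average

function shouldPublish() {
  return Math.random() < 1 / SAMPLE_RATE;
}

// in the nozzle's event handler:
// if (shouldPublish()) { backend.send(metric); }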

Anyone have any other ideas?

Mike

On Wed, Aug 5, 2015 at 12:06 PM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

I'm working on adding support for Firehose metrics to our monitoring
solution. The firehose is working great. However, each component appears
to send updates every 10 seconds or so. This might be a great interval for
some use cases, but for my monitoring provider it can get expensive. Any
ideas on how I might limit the frequency of metric updates from the
firehose?

The obvious initial solution is to just do that in my nozzle.
However, I plan to cluster my nozzle using a subscriptionId. My
understanding is that when using a subscriptionId, events get balanced
between the subscribers. That would mean one nozzle instance might know
when it last sent a particular metric, but the other instances wouldn't,
without making the solution more complex than I'd like it to be.

Any thoughts on how I might approach this problem?

Mike

--
Thank you,

James Bayer


Re: How to call Cloud Foundry API from a node.js application deployed?

Amit Kumar Gupta
 

Have you tried the same thing that worked before:
https://api.MY_PUBLIC_IP.xip.io/v2/info?

That public IP is your load balancer, HA Proxy, or router; traffic there
will know how to get routed to the CCs advertising the
"api.MY_PUBLIC_IP.xip.io" route.

The things like CF_INSTANCE_IP are the IP of the container (running your
node app), so it will definitely not work.
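
For example, from the node app (a minimal sketch; the xip.io host is the
placeholder from above, and a deployment with a self-signed certificate may
additionally need an options object with rejectUnauthorized: false):

var https = require('https');

https.get('https://api.MY_PUBLIC_IP.xip.io/v2/info', function (res) {
  var body = '';
  res.on('data', function (chunk) { body += chunk; });
  res.on('end', function () { console.log(body); });
}).on('error', console.error);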

Best,
Amit

On Tue, Aug 11, 2015 at 12:05 PM, Juan Antonio Breña Moral <
bren(a)juanantonio.info> wrote:

I have tested these combinations:

var API_URL = "http://api." + process.env.VCAP_APP_HOST + ".xip.io/v2/info";
var API_URL = "http://api." + process.env.VCAP_APP_HOST + "/v2/info";
var API_URL = "http://api." + process.env.CF_INSTANCE_IP + "/v2/info";
var API_URL = "http://api." + process.env.CF_INSTANCE_IP + ".xip.io/v2/info";
var API_URL = "http://api." + process.env.CF_INSTANCE_ADDR + ".xip.io/v2/info";
var API_URL = "http://api." + process.env.CF_INSTANCE_IP + ".xip.io/v2/info";
var API_URL = "//api." + process.env.VCAP_APP_HOST + "/v2/info";

but I failed to get this data:
http://apidocs.cloudfoundry.org/214/info/get_info.html

How to connect?


Re: Announcing version 2.6 of the Service Broker API

Dieu Cao <dcao@...>
 

Thanks for the reminder, Mike. I must have missed that email.
Yes, that sounds like a reasonable ask as well. I'll add some stories.

On Thu, Aug 6, 2015 at 9:13 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Thanks Dieu! Any thoughts on Mike's follow-up asking for the same type of
change on the async polling API?


http://cf-dev.70369.x6.nabble.com/cf-dev-Service-Broker-API-updating-instances-tp392p495.html

Mike

On Wed, Aug 5, 2015 at 12:37 AM, Dieu Cao <dcao(a)pivotal.io> wrote:

On behalf of the CAPI team I'm pleased to announce version 2.6 of the
Service Broker API. Two minor changes were introduced in this update.

Support for Service Keys was introduced with cf-release v213 and CLI
v6.12.1. This feature enables service providers to support creation of
credentials, called service keys, without requiring the user to bind the
service to an application. This allows for easier provisioning of service
credentials for use by external applications and clients.


In support of Service Keys, the field app_guid is no longer guaranteed
with the bind request
<http://docs.cloudfoundry.org/services/api.html#binding>. When users
bind a service instance to an application, the field will be included with
the request. When users create a service key the bind request is also made,
but app_guid will not be included. Brokers that require this field can
reject the request with a specific error code which causes Cloud Foundry to
return a meaningful error to the user.
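
For brokers that do require an app, the rejection described in the linked
binding docs looks roughly like this (treat the exact strings below as
assumptions to confirm against those docs): an HTTP 422 response with a
body such as

{
  "error": "RequiresApp",
  "description": "This service supports generation of credentials through binding an application only."
}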

With cf-release v213, the field service_id is now included in the update
request
<http://docs.cloudfoundry.org/services/api.html#updating_service_instance>.
Thanks to Mike Heath for suggesting this improvement.


Documentation:

- http://docs.cloudfoundry.org/services/api.html

- http://docs.cloudfoundry.org/devguide/services/managing-services.html



-Dieu

CF CAPI PM



Overcommit on Diego Cells

Mike Youngstrom
 

Today my org manages our DEA resources using a heavy overcommit strategy.
Rather than being conservative and ensuring that none of our DEAs commit to
more than they can handle, we have instead decided to overcommit to the
point where we basically turn off DEA resource management.

All our DEAs have the same amount of RAM and Disk and we closely monitor
these resources. When load gets beyond a threshold we deploy more DEAs.
We use Org quotas as ceilings to help stop an app from accidentally killing
everything.

So far this strategy has worked out great for us. It's allowed us to
provide much more friendly defaults for RAM and Disk and allowed us to get
more value out of our DEA dollar.

As we move into Diego we're attempting to implement the same strategy. We
want to be sure to do it correctly since we're less comfortable with Diego
at this point.

Diego doesn't have the friendly "overcommit" property DEAs do. Instead I
see "diego.executor.memory_capacity_mb" and
"diego.executor.disk_capacity_mb". Can I overcommit these values and get
the same behaviour I would overcommitting DEAs?
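
If these knobs do behave like the DEA overcommit property (which is the
open question here), the manifest stanza in play would look something like
this sketch, with purely illustrative numbers:

properties:
  diego:
    executor:
      memory_capacity_mb: 65536   # e.g. 2x a cell VM with 32 GB physical RAM
      disk_capacity_mb: 131072    # sized the same way; illustrative only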

I'd also like some advice on what "diego.garden-linux.btrfs_store_size_mb"
is and how it might apply to my overcommit plans.

Thanks,
Mike


Re: Logstash and Multiline Log Entry

Erik Jasiak
 

Hi Steve and Simon; hello again Mike,

First, apologies for the delay in replying on this one; I've also
been trying to come up with a simple, short answer to this problem. I
failed.

Here are the high-level, non-technical answers:
1) Yes, we'd love to enable multi-line logging. Regardless of any
other challenges, we know that there's interest.
2) The problem is multi-layered, and extends beyond loggregator.
2a) Most of the problems with multi-line logging that overlap
loggregator also overlap "general scalability" - problems we've been
handling as part of moving toward collector retirement.
3) We have a hack day project looking at anything "quick and dirty"
to help fix this.
4) Redirecting app logs has known workarounds (eg in Java: via
log4j or similar) while we tackle this - not preferred at all, but doable.

#########

Technical answers: Loggregator's goals are "Fast, thorough, dumb."
Multi-line logging - as handled by loggregator - has no clean way of
working today without violating the "fast" or "dumb" principles.
We're getting there, though.

Here's how we've been working towards a fix:
* Syslog drains were not performant enough, or could not handle
large java traces - something we recently fixed[1][2][3] and are going
to email about separately.
* Horizontal scalability allows for overall better performance
and reliability, but pushes the cost of data consistency to the edges of
loggregator (hence nozzles, injectors).
* Loggregator's dropsonde protocol didn't allow for a clean way
to enforce/tag multi-line data consistency - something we are about to
put forward a proposal to remedy.
** Timestamps are not a clean mechanism for reliably
re-assembling a multi-line log - some combination of app-instance and
order-of-output would need to be tacked on, or a decent vector-time
implementation. We'd need a way to add this metadata that would allow
for re-assembly (see protocol item above). We'd also have to add extra
info at DEA or garden without sacrificing performance - and we know that
the DEA logging agent today already has questions around "acceptable"
performance.

So a multiline fix intersects our goals today. I will do my best
to highlight stories that help us with multi-line logging, and we need
to do a better job at communicating that we're working toward it, even
if it's not the obvious target goal.

Erik

[1] https://www.pivotaltracker.com/story/show/99494586
[2] https://www.pivotaltracker.com/story/show/97928938
[3] https://www.pivotaltracker.com/story/show/100163298

Steve Wall wrote:

Now I see what that means. Each line of a multiline log message could
be sent to a different logstash server. Definitely problematic.
Especially with the ephemeral nature of CF logs, there needs to be
a viable solution to persist the logs, and syslog seems to be a natural
solution. I'm located in Denver and attend the local CF meetups held
in the Pivotal offices. I believe some LAMB devs attend. I'll be sure
to bring it up with them.
-Steve

On Wed, Jul 29, 2015 at 9:47 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Thanks Steve. Though I'm no logstash expert, I assume this won't
work if you have multiple logstash machines doing filtering like
Simon mentioned, right? The same is true for us with Splunk if you
are forwarding logs to more than one indexer via the REST API. I'd
still like to have a discussion with Erik about this problem to see
if he thinks there is anything that can be done in loggregator to
help.

Mike

On Wed, Jul 29, 2015 at 9:00 AM, Steve Wall
<steve.wall(a)primetimesoftware.com> wrote:

Here's a suggested pattern to handle stack traces.

http://stackoverflow.com/questions/31657863/logstash-and-multiline-log-entry-from-cloud-foundry?noredirect=1#comment51279061_31657863


On Mon, Jul 27, 2015 at 11:02 AM, Mike Youngstrom
<youngm(a)gmail.com> wrote:

Yet another request for improved multi-line log message
handling. Is there any update from the LAMB team on plans
to improve this problem? There have been several proposed
solutions, but I'm not aware of anything actually making it
into the LAMB tracker. It would be great if we could hear
from Erik on this issue. Does the LAMB team believe it is
not an issue? Are there plans to improve this situation?
Whatever the perspective, let's discuss it as a community
and see if there are any options better than the current
one. I'd really like to see something turned into a tracker
issue if there are better options.

Mike

[0] http://lists.cloudfoundry.org/pipermail/cf-dev/2015-June/000423.html
[1] http://lists.cloudfoundry.org/pipermail/cf-dev/2015-May/000083.html
[2] https://groups.google.com/a/cloudfoundry.org/forum/?utm_medium=email&utm_source=footer#!msg/vcap-dev/B1W6_vO0oyo/84X1eAtFsKoJ

On Mon, Jul 27, 2015 at 9:47 AM, Simon Johansson
<simon(a)simonjohansson.com> wrote:

This is a tricky one, especially if you have more than
one logstash machine doing filtering, as they will do
filtering independently of each other as the events
come in.

The reason CF adds a timestamp to each line is because
of how syslog works, where each line is its own event.

What we tend to do in my company is log this kind
of stuff via GELF or with Sentry.

On Mon, Jul 27, 2015 at 5:41 PM, Steve Wall
<stevewallone(a)gmail.com> wrote:

Hello,
We are sending CF log messages to an ELK stack.
Multiline log messages are broken out into several
log messages in Logstash, one event per line of the
multiline log message. This is problematic when
stack traces are dumped to the log. Each line of the
stack trace is translated into a log message, and
trying to view this through Kibana is nearly
impossible. Logstash provides a Grok feature
allowing for the manipulation of the log messages.
One common solution is to create a Grok filter
that uses a timestamp to indicate when a log
entry starts and combines all lines until the
next timestamp into one log message. The problem
is that CF adds a timestamp to every line. Has
anyone come up with a good Grok expression to
handle multiline log messages coming out of CF?
Thanks!
Steve





Re: Questions about removal of the heartbeat message type from dropsonde-protocol

Erik Jasiak <ejasiak@...>
 

(resend #2)
Hi again Mike,

There were quite a few pros and cons that went into it; the high (low?)
lights from my notes are below. I'll have the rest of the team check in
if they have more info.

1) A ruby version of the dropsonde-protocol would require some amount of
state maintenance by the consumer, which is more challenging in ruby.
2) How to shoehorn a heartbeat mechanism into the statsd injector (by
its nature, statsd sends the last known value; is a heartbeat a binary
yes/no, or milliseconds of uptime, with a component dead when there's no
increase?)
3) Whose job is it to maintain heartbeat state to begin with?
Metron's, as the aggregator of dropsonde counters? A Nozzle's?
4) Is the correct model to use heartbeats as the 'source of truth' about
a component being alive, regardless of the data being broadcast, or does a
component / developer prefer the non-statsd-model of wanting metric updates
to serve as a heartbeat? (We've leaned toward the statsd model of 'last
update is valid', but then that implies everyone agrees a heartbeat is
really a running uptime counter or similar.)

We didn't have answers to all of these questions; what we did find was
that dropsonde-protocol heartbeats were rarely being used, and largely
being ignored. Because they were also in the way of figuring out a path
forward for things like dropsonde with ruby, we went for their removal
until we had a clearer use case and strategy, or we could handle them in a
cleaner, generally agreed upon way.

Hope that helps,
Erik

On Sat, Aug 8, 2015 at 11:32 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

I noticed that heartbeat messages are no longer a part of the
dropsonde-protocol.

Can I get a quick summary of the thinking behind this change?

Is there an assumption that we should be using the bosh health manager and
not the firehose for this type of thing?

I'd just like some background and help understanding the LAMB team's
monitoring mindset regarding the removal of this message.

Thanks,
Mike


Re: Strategies for limiting metric updates with a clustered nozzle

Erik Jasiak <ejasiak@...>
 

(list resend #1)
Hi Mike,

I think your random approach is workable; what you are doing in effect is
taking fewer polling samples off of the firehose stream.

Short of the aggregation answer James pointed out, this has the potential
to mess with a few things, like averages, but it's better than nothing if
you have to rate-control at ingest, and are looking for a low-cost solution.

In the longer-term, we are looking closely at how to make it easier to
aggregate metrics at either end of loggregator to help with the amount of
data, and hope to have more info shortly. Hopefully that will help with
controlling data flow no matter how often a component emits metrics.

Erik

On Sat, Aug 8, 2015 at 10:49 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

Thanks James,

A little more complicated with more moving parts than I was hoping for but
if I don't want to miss anything I probably don't have much of a choice.

I think for now I'm going to go with some kind of random approach. At
least for the dropsonde generated metrics since they are by far the most
frequent/expensive and I think grabbing a random smattering of them will be
good enough for my current uses.

Mike

On Sat, Aug 8, 2015 at 7:02 AM, James Bayer <jbayer(a)pivotal.io> wrote:

warning, thinking out loud here...

your nozzle will tap the firehose, and filter for the metrics you care
about

currently you're publishing these events to your metrics backend as fast
as they come in, across a horizontally scalable tier that doesn't
coordinate, and that can be expensive if your backend charges by the
transaction

to slow down the stream, you could consider having the work in two phases:
1) aggregation phase
2) publish phase

the aggregation phase could have each instance of the horizontally
scaled-out tier put the metric in a temporary data store such as redis or
another in-memory data grid with HA, like apache geode [1].

the publish phase would have something like a cron / spring batch
capability to occasionally (as often as made sense for your costs) flush
the metrics from the temporary data store to the per-transaction-cost
backend

[1] http://geode.incubator.apache.org/

On Fri, Aug 7, 2015 at 9:26 AM, Mike Youngstrom <youngm(a)gmail.com> wrote:

I suppose one relatively simple solution to this problem is I can have
each cluster member randomly decide if it should log each metric. :) If I
pick a number between 1 and 6 I suppose odds are I would log about every
6th message on average or something like that. :)

Another idea, I could have each member pick a random number between 1
and 10 and I would skip that many messages before publishing then pick a
new random number.

I think it is mostly the dropsonde messages that are killing me. A
technique like this probably wouldn't really work for metrics derived from
http events and such.

Anyone have any other ideas?

Mike

On Wed, Aug 5, 2015 at 12:06 PM, Mike Youngstrom <youngm(a)gmail.com>
wrote:

I'm working on adding support for Firehose metrics to our monitoring
solution. The firehose is working great. However, each component appears
to send updates every 10 seconds or so. This might be a great interval
for some use cases, but for my monitoring provider it can get expensive.
Any ideas on how I might limit the frequency of metric updates from the
firehose?

The obvious initial solution is to just do that in my nozzle. However,
I plan to cluster my nozzle using a subscriptionId. My understanding is
that when using a subscriptionId, events get balanced between the
subscribers. That would mean one nozzle instance might know when it last
sent a particular metric, but the other instances wouldn't, without making
the solution more complex than I'd like it to be.

Any thoughts on how I might approach this problem?

Mike

--
Thank you,

James Bayer


Re: How to call Cloud Foundry API from a node.js application deployed?

Juan Antonio Breña Moral <bren at juanantonio.info...>
 

I have tested these combinations:

var API_URL = "http://api." + process.env.VCAP_APP_HOST + ".xip.io/v2/info";
var API_URL = "http://api." + process.env.VCAP_APP_HOST + "/v2/info";
var API_URL = "http://api." + process.env.CF_INSTANCE_IP + "/v2/info";
var API_URL = "http://api." + process.env.CF_INSTANCE_IP + ".xip.io/v2/info";
var API_URL = "http://api." + process.env.CF_INSTANCE_ADDR + ".xip.io/v2/info";
var API_URL = "http://api." + process.env.CF_INSTANCE_IP + ".xip.io/v2/info";
var API_URL = "//api." + process.env.VCAP_APP_HOST + "/v2/info";

but I failed to get this data:
http://apidocs.cloudfoundry.org/214/info/get_info.html

How to connect?


How to call Cloud Foundry API from a node.js application deployed?

Juan Antonio Breña Moral <bren at juanantonio.info...>
 

Hi,

I would like to know how to call the Cloud Foundry API from a node.js application.

From local, it is very easy to call the API using an absolute address:

https://api.MY_PUBLIC_IP.xip.io/v2/info
http://apidocs.cloudfoundry.org/214/info/get_info.html

but when I deploy the node application to the platform, I can't connect to the API and I receive the following error:

{"error":{"code":"ECONNREFUSED","errno":"ECONNREFUSED","syscall":"connect"}}

I checked Application Security Groups, but no rule is associated with the space used by the application.
http://docs.pivotal.io/pivotalcf/adminguide/app-sec-groups.html
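
For reference, an application security group is a JSON array of rules along
these lines (the destination and port here are illustrative; they would need
to cover whatever the API route resolves to):

[
  {
    "protocol": "tcp",
    "destination": "10.0.0.0/8",
    "ports": "443"
  }
]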

Is there a way to call the API from a deployed application?
Is there a VCAP variable, similar to process.env.VCAP_APP_PORT, that avoids this problem?

Many thanks in advance

Juan Antonio


Re: How to build binaries for buildpacks?

Alexander Lomov <alexander.lomov@...>
 

Thank you for that useful answer, Matthew.


------------------------
Alex Lomov
*Altoros* — Cloud Foundry deployment, training and integration
*Twitter:* @code1n <https://twitter.com/code1n> *GitHub:* @allomov
<https://gist.github.com/allomov>

On Tue, Aug 11, 2015 at 8:33 PM, Matthew Horan <mhoran(a)pivotal.io> wrote:

Hey Alex -

Check out the binary-builder repo on GitHub [1]. In particular, the Ruby
blueprint [2].

The binary builder is used by our CI pipeline [3] to produce the binaries
bundled in the buildpacks. The README [4] includes instructions on how to
run binary-builder locally, and the CI repository includes a script [5] for
building binaries.

Best,

Matt

[1] https://github.com/cloudfoundry/binary-builder
[2]
https://github.com/cloudfoundry/binary-builder/blob/master/templates/ruby_blueprint.sh.erb
[3]
https://github.com/cloudfoundry/buildpacks-ci/blob/master/pipelines/binary-builder.yml
[4] https://github.com/cloudfoundry/binary-builder/blob/master/README.md
[5]
https://github.com/cloudfoundry/buildpacks-ci/blob/master/scripts/build-binary.rb

On Tue, Aug 11, 2015 at 10:50 AM, Lomov Alexander <
alexander.lomov(a)altoros.com> wrote:

I was able to solve the problem with ruby by adding
--enable-load-relative to the ./configure command.

Still, it would be useful to know how you build binaries for buildpacks.

Thank you,
Alex L.

On Aug 11, 2015, at 4:44 PM, Lomov Alexander <alexander.lomov(a)altoros.com>
wrote:

Hello, everyone.

I am trying to create a custom ruby-buildpack to support Power8. This means I
need to rebuild the ruby binaries. I've written some scripts that build the
necessary binaries and upload them to S3 [1]. Here is a script that builds
ruby [2]. Still, after I tried to run an app I got this error [3], which said
"`require': cannot load such file -- rubygems.rb (LoadError)". I think the
potential problem could be in the way I build the ruby binary.

Could you please tell me what you use to build binaries for buildpacks?

[1] https://github.com/Altoros/ruby-buildpack/tree/power/power
[2]
https://github.com/Altoros/ruby-buildpack/blob/power/power/scripts/ruby-2.1.5.sh
[3] https://gist.github.com/allomov-altoros/4cfbd463a8bde056680d

Thank you,
Alex L.



Mailing list service disruption

Eric Searcy <eric@...>
 

During continued work on the new Mailman 3 list server, a configuration change caused mail to get stuck and eventually bounce. If you sent a message between Aug 8 17:50 UTC and Aug 11 10:35 UTC and received a bounce message, you will need to resend the email. We apologize for the delay in discovering and fixing the problem.

--
Eric Searcy, Infrastructure Manager
The Linux Foundation


Re: CAB call for August scheduled Wednesday 08/12/15

Chip Childers <cchilders@...>
 

You can subscribe to the public calendar here:
https://www.google.com/calendar/ical/cloudfoundry.org_oedb0ilotg5udspdlv32a5vc78%40group.calendar.google.com/public/basic.ics

I believe Dieu has an invite for those that might be local to the
Pivotal SF office.

Chip Childers | VP Technology | Cloud Foundry Foundation

On Tue, Aug 11, 2015 at 12:56 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Thanks, will do.

On Tue, Aug 11, 2015 at 9:54 AM, Michael Maximilien <mmaximilien(a)gmail.com>
wrote:
Yup, Dieu manages it now, I believe. Before it was James.

Please ask her.

Best,

Max


Sent from Mailbox <https://www.dropbox.com/mailbox>


On Wed, Aug 12, 2015 at 12:35 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hey Max,

Is there a calendar event you can invite me to?

Amit

On Tue, Aug 11, 2015 at 2:11 AM, Michael Maximilien <maxim(a)us.ibm.com>
wrote:

Hi, all,

Hope you are ready for the August CAB call. I am in Beijing but fear
not, I am ready to rock and roll.

As usual, please take some time to update the google docs (link below)
with any highlights (PMs) or any items you would like to bring up to the
community.

Find all important info here:

-------
CF community call
USA 888-426-6840; 215-861-6239 | Leader code: 66850163 | Participant
code: 1985291

All other countries can find dial-in numbers here: http://goo.gl/RnNfc1

*6 to mute/unmute

Agenda

https://docs.google.com/document/d/1SCOlAquyUmNM-AQnekCOXiwhLs6gveTxAcduvDcW_xI/edit
-------

Talk soon,

James, Chip, and Max
ibm cloud labs
silicon valley, ca


Re: How to build binaries for buildpacks?

Matthew Horan
 

Hey Alex -

Check out the binary-builder repo on GitHub [1]. In particular, the Ruby
blueprint [2].

The binary builder is used by our CI pipeline [3] to produce the binaries
bundled in the buildpacks. The README [4] includes instructions on how to
run binary-builder locally, and the CI repository includes a script [5] for
building binaries.

Best,

Matt

[1] https://github.com/cloudfoundry/binary-builder
[2]
https://github.com/cloudfoundry/binary-builder/blob/master/templates/ruby_blueprint.sh.erb
[3]
https://github.com/cloudfoundry/buildpacks-ci/blob/master/pipelines/binary-builder.yml
[4] https://github.com/cloudfoundry/binary-builder/blob/master/README.md
[5]
https://github.com/cloudfoundry/buildpacks-ci/blob/master/scripts/build-binary.rb

On Tue, Aug 11, 2015 at 10:50 AM, Lomov Alexander <
alexander.lomov(a)altoros.com> wrote:

I was able to solve the problem with ruby by adding --enable-load-relative
to the ./configure command.

Still, it would be useful to know how you build binaries for buildpacks.

Thank you,
Alex L.

On Aug 11, 2015, at 4:44 PM, Lomov Alexander <alexander.lomov(a)altoros.com>
wrote:

Hello, everyone.

I am trying to create a custom ruby-buildpack to support Power8. This means I
need to rebuild the ruby binaries. I've written some scripts that build the
necessary binaries and upload them to S3 [1]. Here is a script that builds
ruby [2]. Still, after I tried to run an app I got this error [3], which said
"`require': cannot load such file -- rubygems.rb (LoadError)". I think the
potential problem could be in the way I build the ruby binary.

Could you please tell me what you use to build binaries for buildpacks?

[1] https://github.com/Altoros/ruby-buildpack/tree/power/power
[2]
https://github.com/Altoros/ruby-buildpack/blob/power/power/scripts/ruby-2.1.5.sh
[3] https://gist.github.com/allomov-altoros/4cfbd463a8bde056680d

Thank you,
Alex L.



Re: CAB call for August scheduled Wednesday 08/12/15

Amit Kumar Gupta
 

Thanks, will do.

On Tue, Aug 11, 2015 at 9:54 AM, Michael Maximilien <mmaximilien(a)gmail.com>
wrote:

Yup, Dieu manages it now, I believe. Before it was James.

Please ask her.

Best,

Max


Sent from Mailbox <https://www.dropbox.com/mailbox>


On Wed, Aug 12, 2015 at 12:35 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hey Max,

Is there a calendar event you can invite me to?

Amit

On Tue, Aug 11, 2015 at 2:11 AM, Michael Maximilien <maxim(a)us.ibm.com>
wrote:

Hi, all,

Hope you are ready for the August CAB call. I am in Beijing but fear
not, I am ready to rock and roll.

As usual, please take some time to update the google docs (link below)
with any highlights (PMs) or any items you would like to bring up to the
community.

Find all important info here:

-------
CF community call
USA 888-426-6840; 215-861-6239 | Leader code: 66850163 | Participant
code: 1985291

All other countries can find dial-in numbers here: http://goo.gl/RnNfc1

*6 to mute/unmute

Agenda

https://docs.google.com/document/d/1SCOlAquyUmNM-AQnekCOXiwhLs6gveTxAcduvDcW_xI/edit
-------

Talk soon,

James, Chip, and Max
ibm cloud labs
silicon valley, ca


Re: CAB call for August scheduled Wednesday 08/12/15

Michael Maximilien
 

Yup, Dieu manages it now, I believe. Before it was James.

Please ask her.

Best,

Max


Sent from Mailbox

On Wed, Aug 12, 2015 at 12:35 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

Hey Max,
Is there a calendar event you can invite me to?
Amit
On Tue, Aug 11, 2015 at 2:11 AM, Michael Maximilien <maxim(a)us.ibm.com>
wrote:
Hi, all,

Hope you are ready for the August CAB call. I am in Beijing but fear not,
I am ready to rock and roll.

As usual, please take some time to update the google docs (link below)
with any highlights (PMs) or any items you would like to bring up to the
community.

Find all important info here:

-------
CF community call
USA 888-426-6840; 215-861-6239 | Leader code: 66850163 | Participant
code: 1985291

All other countries can find dial-in numbers here: http://goo.gl/RnNfc1

*6 to mute/unmute

Agenda

https://docs.google.com/document/d/1SCOlAquyUmNM-AQnekCOXiwhLs6gveTxAcduvDcW_xI/edit
-------

Talk soon,

James, Chip, and Max
ibm cloud labs
silicon valley, ca


Emails not delivered to the mailing list - connection refused error

Jean-Sebastien Delfino
 

Hi,

The emails I'm trying to send to this list are getting returned with the
following delivery error:

===
From: Mail Delivery System <MAILER-DAEMON(a)smtp1.linuxfoundation.org>
Date: Tue, Aug 11, 2015 at 4:46 AM
Subject: Undelivered Mail Returned to Sender
To: jsdelfino(a)gmail.com

This is the mail system at host smtp1.linuxfoundation.org.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

The mail system

<cf-dev(a)lists.cloudfoundry.org>: connect to 172.17.197.36[172.17.197.36]:25:
Connection refused

Final-Recipient: rfc822; cf-dev(a)lists.cloudfoundry.org
Original-Recipient: rfc822;cf-dev(a)lists.cloudfoundry.org
Action: failed
Status: 4.4.1
Diagnostic-Code: X-Postfix; connect to 172.17.197.36[172.17.197.36]:25:
Connection refused

===

Is anyone else seeing this? Is the list not accepting emails anymore?
(I'm sending this from the Nabble forum.)

Thanks

- Jean-Sebastien





--
View this message in context: http://cf-dev.70369.x6.nabble.com/Emails-not-delivered-to-the-mailing-list-connection-refused-error-tp1149.html
Sent from the CF Dev mailing list archive at Nabble.com.
