Re: Identifying all DEA's on a CF install
Matt Curry
I believe you could use the BOSH API:
https://bosh.io/docs/director-api-v1.html#list-vms-detailed

From: john mcteague <john.mcteague(a)gmail.com>
Reply-To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev(a)lists.cloudfoundry.org>
Date: Monday, December 7, 2015 at 3:29 PM
To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev(a)lists.cloudfoundry.org>
Subject: [cf-dev] Re: Re: Identifying all DEA's on a CF install

I hadn't considered that. I see that there are now docs for the BOSH API; however, is the BOSH API suited to concurrent access by multiple processes? We are writing a component that will need to frequently validate that a particular machine is a DEA.

On Mon, Dec 7, 2015 at 10:26 PM, Amit Gupta <agupta(a)pivotal.io> wrote:
What about with BOSH?

On Mon, Dec 7, 2015 at 2:17 PM, john mcteague <john.mcteague(a)gmail.com> wrote:
I am trying to enumerate all the ways I could identify all the DEAs in a CF cluster in an IaaS-agnostic way (e.g. not querying OpenStack to read metadata on the VMs that lists the job type). I think the only reliable way is to listen on NATS and look for the DEA advertisement messages. Am I correct, or is there a method I am missing (I'd rather call an API than subscribe to NATS)?
Thanks, John
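A minimal sketch of how the result of the "list VMs detailed" call linked above might be filtered for DEAs. It assumes the task result is newline-delimited JSON with `job_name`, `index`, and `ips` fields, and that DEA jobs are named with a `runner` prefix (as in `runner_z1`); both should be checked against the BOSH documentation and the deployment manifest. The function name `find_vms_by_job` and the sample records are made up for illustration, and no director is contacted.

```python
import json


def find_vms_by_job(result_text, job_prefix):
    """Return (job_name, index, ips) for each VM whose job name starts with job_prefix.

    result_text is assumed to be newline-delimited JSON, one VM record per line.
    """
    matches = []
    for line in result_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        vm = json.loads(line)
        if vm.get("job_name", "").startswith(job_prefix):
            matches.append((vm["job_name"], vm.get("index"), vm.get("ips", [])))
    return matches


# Hypothetical sample data standing in for a real director response.
SAMPLE = "\n".join([
    json.dumps({"job_name": "runner_z1", "index": 0, "ips": ["10.0.16.10"]}),
    json.dumps({"job_name": "router_z1", "index": 0, "ips": ["10.0.16.20"]}),
    json.dumps({"job_name": "runner_z1", "index": 1, "ips": ["10.0.16.11"]}),
])

dea_vms = find_vms_by_job(SAMPLE, "runner")
dea_ips = {ip for _, _, ips in dea_vms for ip in ips}
print(sorted(dea_ips))
```

If a component needs to check machines frequently, one option is to build a set like `dea_ips` periodically and answer each check from that cached set rather than querying the director every time.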
|
|
Re: Identifying all DEA's on a CF install
john mcteague <john.mcteague@...>
I hadn't considered that. I see that there are now docs for the BOSH API; however, is the BOSH API suited to concurrent access by multiple processes? We are writing a component that will need to frequently validate that a particular machine is a DEA.
On Mon, Dec 7, 2015 at 10:26 PM, Amit Gupta <agupta(a)pivotal.io> wrote:
What about with BOSH?
|
|
Re: Identifying all DEA's on a CF install
Amit Kumar Gupta
What about with BOSH?
On Mon, Dec 7, 2015 at 2:17 PM, john mcteague <john.mcteague(a)gmail.com> wrote: I am trying to enumerate all the ways I could identify all the DEA's in a
|
|
Re: v226 release notes
Amit Kumar Gupta
Hi John,
Sorry for the delay. The notes are in draft mode until all sections have been populated. I will ask the respective Product Managers for the remaining sections to add their notes.
Thanks, Amit
On Mon, Dec 7, 2015 at 2:19 PM, john mcteague <john.mcteague(a)gmail.com> wrote:
v226 was cut 4 days ago, is there any sign of the release notes?
|
|
v226 release notes
john mcteague <john.mcteague@...>
v226 was cut 4 days ago, is there any sign of the release notes?
Thanks, John
|
|
Identifying all DEA's on a CF install
john mcteague <john.mcteague@...>
I am trying to enumerate all the ways I could identify all the DEAs in a CF cluster in an IaaS-agnostic way (e.g. not querying OpenStack to read metadata on the VMs that lists the job type). I think the only reliable way is to listen on NATS and look for the DEA advertisement messages. Am I correct, or is there a method I am missing (I'd rather call an API than subscribe to NATS)?
Thanks, John
|
|
Re: - Reducing number of instances on a cloud foundry deployment
Amit Kumar Gupta
Hey Kinjal,
Your best bet, if you choose to use cf-boshworkspace, would be to open up issues against their GitHub repo. The maintainers there will be able to tell you if their stuff is up-to-date and expected to work with v222, what sort of properties you need to put in your stub, etc.
For starters, it looks like you're missing values for "bulk_api_password", "staging_upload_user", "db_encryption_key", and "staging_upload_password" under the "cc" section of "properties".
Amit
On Fri, Dec 4, 2015 at 7:43 AM, Kinjal Doshi <kindoshi(a)gmail.com> wrote:
Thanks a lot for directing me towards this project. When I am trying to
|
|
Re: doppler issue which fails to emit logs with syslog protocol on CFv212
Amit Kumar Gupta
Hey Masumi,
Glad you've gotten further, thanks for the update. Best, Amit
On Mon, Dec 7, 2015 at 6:20 AM, Masumi Ito <msmi10f(a)gmail.com> wrote:
Hi Amit, This could be due to several reasons -- network issues, etcd slowness, etcd
|
|
Bosh version and stemcell for 225
Mike Youngstrom
We are preparing to release 225 and noticed the release notes don't list a BOSH and stemcell version. Does anyone have that info?
Mike
|
|
Re: [cf-env] [abacus] Changing how resources are organized
Jean-Sebastien Delfino
Hi Daniel,
Is that related to GitHub issue #38?
https://github.com/cloudfoundry-incubator/cf-abacus/issues/38
Thanks - Jean-Sebastien
On Fri, Dec 4, 2015 at 11:22 AM, dmangin <dmangin(a)us.ibm.com> wrote:
With the current way we have resource_ids being used for metering,
|
|
- Urgent - Cloud Foundry Deployment is failing on dea.yml.erb
Kinjal Doshi
Hi,
I am trying to deploy Cloud Foundry with the stemcell light-bosh-stemcell-3147-aws-xen-hvm-ubuntu-trusty-go_agent.tgz and the Cloud Foundry release manifest cf-226.yml. I am also using minimal-aws.yml for configuration data.
During the 'bosh deploy' command, I run into the following deployment error:
  Started preparing deployment
  Started preparing deployment > Binding releases. Done (00:00:00)
  Started preparing deployment > Binding existing deployment. Done (00:00:01)
  Started preparing deployment > Binding resource pools. Done (00:00:00)
  Started preparing deployment > Binding stemcells. Done (00:00:00)
  Started preparing deployment > Binding templates. Done (00:00:00)
  Started preparing deployment > Binding properties. Done (00:00:00)
  Started preparing deployment > Binding unallocated VMs. Done (00:00:00)
  Started preparing deployment > Binding instance networks. Done (00:00:00)
  Started preparing package compilation > Finding packages to compile. Done (00:00:00)
  Started preparing dns > Binding DNS. Done (00:00:00)
  Started preparing configuration > Binding configuration. Failed: Error filling in template `dea.yml.erb' for `runner_z1/0' (line 86: bad component(expected user component): Oro(a)1602) (00:00:01)
  Error 100: Error filling in template `dea.yml.erb' for `runner_z1/0' (line 86: bad component(expected user component): Oro(a)1602)
I noticed that the property cc.internal_api_user is missing from the global properties and have added it to minimal-aws.yml, but the deployment still fails.
I need to have the CF deployment up and running tonight. It would be great if someone could help me with this as a priority.
Regards, Kinjal
|
|
How to estimate reconnection / failover time between gorouter and nats
Masumi Ito
Hi,
Can anyone explain the expected reconnection / failover time for gorouter when one of the NATS VMs hangs unexpectedly?
The background of this question is that I found the gorouter temporarily returned "404 Not found Err" for app requests for some time when one of the clustered NATS servers was not responsive. This started after about 2 min and then recovered in another 2-3 min.
I understand it is mainly due to the pruning of stale routes and the time gorouter takes to reconnect / fail over to a healthy NATS server. The first 2 min can be explained by the droplet_stale_threshold value. However, I am wondering what exactly happened in the other 2-3 min.
Note that the BOSH health monitor detected the unresponsive NATS server and eventually recreated it; however, the gorouter had received "router.register" from DEAs before the recreation was complete. Therefore I think this indicates a failover to the other NATS server rather than a reconnection to the recreated NATS server which was previously down.
I believe some connection parameters in the yagnats and apcera/nats clients are key to this:
- Timeout: timeout to create a new connection
- ReconnectWait: wait time before a reconnect happens
- MaxReconnect: unlimited reconnect attempts if this value is -1
- PingInterval: interval between pings that check whether a connection is healthy
- MaxPingOut: number of ping attempts before determining that a reconnection is necessary
My understanding of the sequence is:
1. When one of the NATS servers hangs, the connection might still exist until the TCP timeout has been reached.
2. The PingTimer periodically sends pings to check whether the connection is stale; after a total of (PingInterval * MaxPingOut) it concludes that it is necessary to reconnect to the next NATS server.
3. Before reconnecting, the gorouter waits for ReconnectWait.
4. It creates a new connection to the next NATS server within Timeout.
5. After that, the gorouter starts to register app routes from DEAs through the connected NATS server.
Therefore my rough estimation is:
PingInterval(2 min) * MaxPingOut(2) + ReconnectWait(500 millisec) + Timeout(2 sec)
I would appreciate it if someone could correct this rough explanation or give some more details.
Regards, Masumi
--
View this message in context: http://cf-dev.70369.x6.nabble.com/How-to-estimate-reconnection-failover-time-between-gorouter-and-nats-tp2980.html
Sent from the CF Dev mailing list archive at Nabble.com.
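The rough estimation in the message above can be worked through numerically. This is only a sketch of the arithmetic using the values quoted in the post; the function name `estimate_failover` is made up for illustration, and whether these values match the defaults of the NATS client actually in use is an assumption that should be verified.

```python
from datetime import timedelta


def estimate_failover(ping_interval, max_ping_out, reconnect_wait, timeout):
    """Estimate = PingInterval * MaxPingOut + ReconnectWait + Timeout."""
    return ping_interval * max_ping_out + reconnect_wait + timeout


# Values as quoted in the post, not verified against any client version.
estimate = estimate_failover(
    ping_interval=timedelta(minutes=2),
    max_ping_out=2,
    reconnect_wait=timedelta(milliseconds=500),
    timeout=timedelta(seconds=2),
)
print(estimate.total_seconds())  # 242.5
```

With these inputs the estimate comes to 242.5 seconds, a little over 4 minutes, dominated almost entirely by the PingInterval * MaxPingOut term.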
|
|
Re: doppler issue which fails to emit logs with syslog protocol on CFv212
Masumi Ito
Hi Amit,
> This could be due to several reasons -- network issues, etcd slowness, etcd down, etc.
All right. I might ask questions again if the error messages are encountered frequently; however, this is a separate issue and not a high priority so far. So I am going to close this thread because the original issue has been resolved thanks to you.
Thanks for your help.
Regards, Masumi
--
View this message in context: http://cf-dev.70369.x6.nabble.com/doppler-issue-which-fails-to-emit-logs-with-syslog-protocol-on-CFv212-tp2418p2979.html
Sent from the CF Dev mailing list archive at Nabble.com.
|
|
Re: Cloud Foundry Elastic Clusters - Proposal
Dieu Cao <dcao@...>
Hi All,
I've updated the Elastic Clusters proposal [1] with more detail based on additional discussions we've had among CAPI, Routing, Diego, Loggregator, and UAA.
Please have a look at this updated version and use the Google doc to provide feedback and ideas. There are still a few open questions, but I think the proposal is much more fleshed out now, and based on feedback we can hopefully get started on the work soon.
-Dieu
CF CAPI PM / CF Runtime PMC Lead
[1] https://docs.google.com/document/d/1BZL4GwluaRtUVfeW-j8_ezBJQmwnhGqcMNvQIy9ltZE/edit?usp=sharing
On Fri, Oct 2, 2015 at 7:19 PM, Mark Kropf <markkropf(a)gmail.com> wrote:
Hello CF Community,
|
|
CF CAB call for December is this Wednesday Dec. 9th, 2015 @ 8a PDT
Michael Maximilien
Hi, all, 您好 Nín hǎo,
Quick reminder that the last CAB call for 2015 is this week, Wednesday December 9th @ 8a PDT. I'll be calling from Beijing where it will be 1a, so I plan to start and end on time :)
Product managers, please add project updates to the Agenda here:
https://docs.google.com/document/d/1SCOlAquyUmNM-AQnekCOXiwhLs6gveTxAcduvDcW_xI/edit#heading=h.o44xhgvum2we
Everyone, if you have something to share, please also add an entry at the end in the Other section.
Best, Chip, James, and Max
PS: call info and details are in the link above; just visit it
dr.max
ibm cloud labs
silicon valley, ca
Sent from my iPhone
|
|
Replacing nfs with WebDAV
Dieu Cao <dcao@...>
Hello All,
The CAPI team has been investigating replacing nfs as the default blobstore option in cf-release with WebDAV.
There are a number of reasons for doing this:
- The flakiness of the nfs_mounter job.
- The requirement for rpc-bind and nfs-common packages in the bosh stemcell, and thus open ports on jobs that didn't need them [1].
- There is a feature on the bosh backlog [2] to have the agent poll process state on "STOP" messages, which is blocked by cloud controller's use of nfs because cc would get stuck if the nfs job should change IP.
We successfully completed a spike to investigate whether we could replace nfs with webdav in a backwards compatible way, without requiring migration of blobs, building on some work that the bosh team did.
We'll be working on the bosh packaging for webdav and making sure that we can provide instructions to make this change as seamless as possible, and hopefully we'll have this available "soon".
-Dieu
CF CAPI PM
[1] https://www.pivotaltracker.com/story/show/88810548
[2] https://www.pivotaltracker.com/story/show/62354622
|
|
Re: [abacus] Refactor Aggregated Usage and Aggregated Rated Usage data model
Saravanakumar A. Srinivasan
> c) a middle-ground approach where we'll store the aggregated usage per app in separate docs, but maintain the aggregated usage at the upper levels (org, space, resource, plan) in the parent doc linking the app usage docs together, and explore what constraints or limitations that would impose on our ability to trigger real time usage limit alerts at any org, space, resource, plan, app etc level.
As a first step (refer to [1] for more details) to refactor the usage data model using the middle-ground approach, we have removed the Usage Rating Service from the Abacus pipeline (refer to the commit at [2]) and moved the entire rating implementation from the Usage Rating Service to the Usage Aggregator (refer to the commit at [3]).
With these commits, if you are using Abacus, be aware that the Abacus pipeline has become shorter and you have one less application (Usage Rating Service) to manage.
[1] https://github.com/cloudfoundry-incubator/cf-abacus/issues/184
[2] https://github.com/cloudfoundry-incubator/cf-abacus/commit/1488e1ae2e4547a010151ad2245f3a3f1ff2e488
[3] https://github.com/cloudfoundry-incubator/cf-abacus/commit/c661b7bdd35e70e985583570cb9920b90ced44a8
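To make the middle-ground approach (c) easier to picture, here is a rough sketch of a parent doc that keeps the upper-level aggregate and links to separate per-app docs. It is not Abacus code or its actual schema: the class names, field names, and doc id format are all hypothetical, and only an org level is shown where the post also mentions space, resource, and plan.

```python
from dataclasses import dataclass, field


@dataclass
class AppUsageDoc:
    """Hypothetical per-app doc holding that app's aggregated usage."""
    doc_id: str
    app_id: str
    quantity: float = 0.0


@dataclass
class OrgUsageDoc:
    """Hypothetical parent doc: upper-level aggregate plus links to app docs."""
    org_id: str
    quantity: float = 0.0
    app_doc_ids: list = field(default_factory=list)


def record_usage(org_doc, app_docs, app_id, amount):
    """Add usage to the app's own doc and to the parent aggregate."""
    doc_id = f"{org_doc.org_id}/{app_id}"
    app_doc = app_docs.get(doc_id)
    if app_doc is None:
        # First usage for this app: create its doc and link it from the parent.
        app_doc = AppUsageDoc(doc_id=doc_id, app_id=app_id)
        app_docs[doc_id] = app_doc
        org_doc.app_doc_ids.append(doc_id)
    app_doc.quantity += amount
    org_doc.quantity += amount
    return org_doc.quantity


org = OrgUsageDoc(org_id="org-1")
apps = {}
record_usage(org, apps, "app-a", 2.0)
record_usage(org, apps, "app-b", 3.0)
total = record_usage(org, apps, "app-a", 1.5)
print(total, org.app_doc_ids)
```

In this sketch, a usage limit at the org level can be checked from the parent doc alone, while per-app limits require reading the linked app doc.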
|
|
Re: Dev and Production environment inconsistent
CF Runtime
Hi Juan,
That is correct. The "production" flag on apps in the CC should not be used. Instead you should push apps to different spaces with separate domains associated to each app/space to create a staging/production pattern as you described.
Best, Zak Auerbach, CF Release Integration
On Fri, Nov 27, 2015 at 4:31 AM, Juan Antonio Breña Moral <bren(a)juanantonio.info> wrote:
Hi Alex,
|
|
Re: Passwords visible in infrastructure logs
Amit Kumar Gupta
Hey Momchil,
Do you know whether it's the DEA or Warden that's logging that sensitive data when you say "runner"? I would recommend opening issues against the relevant projects:
API: https://github.com/cloudfoundry/cloud_controller_ng/issues
DEA or Warden: https://github.com/cloudfoundry/dea_ng/issues or https://github.com/cloudfoundry/warden/issues
As for NATS, you may be able to change the logging level. Alternatively, NATS is not a Cloud Foundry project, but you could ask over there about encrypting log output: https://github.com/nats-io/gnatsd
In Pivotal's production environments, we run 100% on Diego, so we are not concerned with DEA/Warden logging, and this move also removes NATS from flows like create-user-provided-service. CC is likely still an issue, so it would be a good one to raise against their GitHub project.
Best, Amit
On Fri, Dec 4, 2015 at 2:09 AM, Momchil Atanassov <momchil.atanassov(a)sap.com> wrote:
Hi,
|
|
Re: Garden Port Assignment Story
Mike Youngstrom
Thanks for the update Will. I'll keep waiting patiently. :)
On Fri, Dec 4, 2015 at 10:44 AM, Will Pragnell <wpragnell(a)pivotal.io> wrote:
Hi Mike,
|
|