
Re: Environment variable names (was RE: Environment variable changes in DIEGO)

Mike Youngstrom
 

+1 We'd love a metadata service as well when the dust settles.

Mike

On Wed, Sep 16, 2015 at 1:17 PM, Christopher B Ferris <chrisfer(a)us.ibm.com>
wrote:

Onsi,

I can tell you that we (IBM) would welcome a secure metadata service for
retrieving credentials, etc. I'm sure we'd help make that happen.

Cheers,

Christopher Ferris
IBM Distinguished Engineer, CTO Open Technology
IBM Cloud, Open Technologies
email: chrisfer(a)us.ibm.com
twitter: @christo4ferris
blog: http://thoughtsoncloud.com/index.php/author/cferris/
phone: +1 508 667 0402



----- Original message -----
From: Onsi Fakhouri <ofakhouri(a)pivotal.io>
To: "Discussions about Cloud Foundry projects and the system overall." <
cf-dev(a)lists.cloudfoundry.org>
Cc:
Subject: [cf-dev] Re: Environment variable names (was RE: Environment
variable changes in DIEGO)
Date: Wed, Sep 16, 2015 11:40 AM


Would if we could! Environment variables are part of the surface area
that constitutes the contract between the platform and the application.
While we could have duplicates with CF_* and eventually deprecate VCAP_*
the new runtime needs to support droplets staged by the legacy runtime
seamlessly.

Once the dust settles I'd sooner see us step back and reconsider this
particular abstraction. For example, instead of env vars an authenticated
metadata service a la AWS's http://169.254.169.254/ would give us more
flexibility, allow us to dynamically control metadata made available to
apps, and even version metadata sanely.
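
Purely as an illustration (the endpoint, token, and JSON shape below are
hypothetical; no such CF API exists today), an app-side lookup against an
authenticated metadata service could look roughly like this in Go:

    package main

    import (
        "encoding/json"
        "fmt"
        "log"
        "net/http"
        "os"
        "time"
    )

    // Hypothetical only: the sketch just illustrates replacing VCAP_* env
    // vars with an authenticated, versioned metadata lookup, a la the AWS
    // instance metadata service.
    func main() {
        client := &http.Client{Timeout: 5 * time.Second}

        req, err := http.NewRequest("GET", "http://169.254.169.254/v1/application", nil)
        if err != nil {
            log.Fatal(err)
        }
        // Assumption: the platform injects a short-lived per-instance token
        // instead of putting credentials directly in the environment.
        req.Header.Set("Authorization", "Bearer "+os.Getenv("INSTANCE_IDENTITY_TOKEN"))

        resp, err := client.Do(req)
        if err != nil {
            log.Fatal(err)
        }
        defer resp.Body.Close()

        var metadata map[string]interface{}
        if err := json.NewDecoder(resp.Body).Decode(&metadata); err != nil {
            log.Fatal(err)
        }
        fmt.Println("application name:", metadata["application_name"])
    }

Because the platform would serve these values on demand, it could rotate
credentials or version the response schema without re-staging the app.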

Onsi

Sent from my iPad

On Sep 16, 2015, at 8:30 AM, john mcteague <john.mcteague(a)gmail.com>
wrote:


On a related, but slightly off, topic, while renaming the VCAP_* vars
would have a big impact, is it not time we thought about renaming these to
CF_* ?

John.
On 16 Sep 2015 16:09, "Matthew Sykes" <matthew.sykes(a)gmail.com> wrote:

The changes, in general, were intentional. The `application_uris` data was
always broken as it didn't reflect route changes. I can't speak directly to
the time stamp data.

The host is present still so I don't know why you don't see it.

We also have a migration guide [1]. If you think additional information is
needed there, pull requests would be welcome.

[1]:
https://github.com/cloudfoundry-incubator/diego-design-notes/blob/master/migrating-to-diego.md

On Wed, Sep 16, 2015 at 10:19 AM, Jack Cai <greensight(a)gmail.com> wrote:

I notice the below changes in the environment variables of DIEGO:
1. VCAP_APP_HOST & VCAP_APP_PORT are removed.
2. These fields are removed from VCAP_APPLICATION value: application_uris,
started_at, start, started_at_timestamp, host, state_timestamp

I suppose #1 is intentional. How about #2?

Jack




--
Matthew Sykes
matthew.sykes(a)gmail.com




Re: Packaging CF app as bosh-release

Amit Kumar Gupta
 

My very limited understanding is that NFS writes to the actual filesystem,
and achieves persistence by having centralized NFS servers where it writes
to a real mounted device, whereas the clients write to an ephemeral
nfs-mount.

My very limited understanding of HDFS is that it's all userland FS, does
not write to the actual filesystem, and relies on replication to other
nodes in the HDFS cluster. Being a userland FS, you don't have to worry
about the data being wiped when a container is shut down, if you were to
run it as an app.

I think one main issue is going to be ensuring that you never lose too many
instances (whether they are containers or VMs), since you might then lose
all replicas of a given data shard. Whether you go with apps or BOSH VMs
doesn't make a big difference here.

Deploying as an app may be a better way to go; it's simpler right now to
configure and deploy an app than to configure and deploy a full BOSH
release. It's also likely to be a more efficient use of resources, since a
BOSH VM can only run one of these spark-job-processors, but a CF
container-runner can run lots of other things. That actually brings up a
different question: is your compute environment a multi-tenant one that
will be running multiple different workloads? E.g. could someone also use
the CF to push their own apps? Or is the whole thing just for your spark
jobs, in which case you might only be running one container per VM anyways?

Assuming you can make use of the VMs for other workloads, I think this
would be an ideal use case for Diego. You probably don't need all the
extra logic around apps, like staging and routing; you just need Diego to
efficiently schedule containers for you.

On Wed, Sep 16, 2015 at 1:13 PM, Kayode Odeyemi <dreyemi(a)gmail.com> wrote:

Thanks Dmitriy,

Just for clarity, are you saying multiple instances of a VM cannot share a
single shared filesystem?

On Wed, Sep 16, 2015 at 6:59 PM, Dmitriy Kalinin <dkalinin(a)pivotal.io>
wrote:

BOSH allocates a persistent disk per instance. It never shares persistent
disks between multiple instances at the same time.

If you need a shared file system, you will have to use some kind of a
release for it. It's not any different from what people do with nfs
server/client.

On Wed, Sep 16, 2015 at 7:09 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

The shared file system aspect is an interesting wrinkle to the problem.
Unless you use some network layer for how you write to the shared file
system, e.g. SSHFS, I think apps will not work because they get isolated to
run in a container, they're given a chroot "jail" for their file system,
and it gets blown away whenever the app is stopped or restarted (which will
commonly happen, e.g. during a rolling deploy of the container-runner VMs).

Do you have something that currently works? How do your VMs currently
access this shared FS? I'm not sure BOSH has the abstractions for choosing
a shared, already-existing "persistent disk" to be attached to multiple
VMs. I also don't know what happens when you scale your VMs down, because
BOSH would generally destroy the associated persistent disk, but you don't
want to destroy the shared data.

Dmitriy, any idea how BOSH can work with a shared filesystem (e.g. HDFS)?

Amit

On Wed, Sep 16, 2015 at 6:54 AM, Kayode Odeyemi <dreyemi(a)gmail.com>
wrote:


On Wed, Sep 16, 2015 at 3:44 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Are the spark jobs tasks that you expect to end, or apps that you
expect to run forever?
They are tasks that run forever. The jobs are subscribers to RabbitMQ
queues that process
messages in batches.


Do your jobs need to write to the file system, or do they access a
shared/distributed file system somehow?
The jobs write to shared filesystem.


Do you need things like a static IP allocated to your jobs?
No.


Are your spark jobs serving any web traffic?
No.




Re: Environment variable names (was RE: Environment variable changes in DIEGO)

Christopher B Ferris <chrisfer@...>
 

Onsi,
 
I can tell you that we (IBM) would welcome a secure metadata service for retrieving credentials, etc. I'm sure we'd help make that happen.
 
Cheers,

Christopher Ferris
IBM Distinguished Engineer, CTO Open Technology
IBM Cloud, Open Technologies
email: chrisfer@...
twitter: @christo4ferris
blog: http://thoughtsoncloud.com/index.php/author/cferris/
phone: +1 508 667 0402
 
 

----- Original message -----
From: Onsi Fakhouri <ofakhouri@...>
To: "Discussions about Cloud Foundry projects and the system overall." <cf-dev@...>
Cc:
Subject: [cf-dev] Re: Environment variable names (was RE: Environment variable changes in DIEGO)
Date: Wed, Sep 16, 2015 11:40 AM
 
 
Would if we could!  Environment variables are part of the surface area that constitutes the contract between the platform and the application.  While we could have duplicates with CF_* and eventually deprecate VCAP_* the new runtime needs to support droplets staged by the legacy runtime seamlessly.
 
Once the dust settles I'd sooner see us step back and reconsider this particular abstraction.  For example, instead of env vars an authenticated metadata service a la AWS's http://169.254.169.254/ would give us more flexibility, allow us to dynamically control metadata made available to apps, and even version metadata sanely.
 
Onsi

Sent from my iPad

On Sep 16, 2015, at 8:30 AM, john mcteague <john.mcteague@...> wrote:
 

On a related, but slightly off, topic, while renaming the VCAP_* vars would have a big impact, is it not time we thought about renaming these to CF_*  ?

John.

On 16 Sep 2015 16:09, "Matthew Sykes" <matthew.sykes@...> wrote:
The changes, in general, were intentional. The `application_uris` data was always broken as it didn't reflect route changes. I can't speak directly to the time stamp data.
 
The host is present still so I don't know why you don't see it.
 
We also have a migration guide [1]. If you think additional information is needed there, pull requests would be welcome.
 
 
On Wed, Sep 16, 2015 at 10:19 AM, Jack Cai <greensight@...> wrote:
I notice the below changes in the environment variables of DIEGO:
1. VCAP_APP_HOST & VCAP_APP_PORT are removed.
2. These fields are removed from VCAP_APPLICATION value: application_uris, started_at, start, started_at_timestamp, host, state_timestamp
 
I suppose #1 is intentional. How about #2?
 
Jack
 
 
 
--
Matthew Sykes
matthew.sykes@...
 


Re: consolidated routing api

Shannon Coen
 

Checking once more that no one is using the Routing API yet, so we can
proceed with backward incompatible API changes. I believe we still have
this opportunity as we have not yet announced the component is ready for
use.

Our current plan is that the first consumers of the Routing API will be:
- CC (to register its own route: api.system-domain)
- TCP Route Emitter (to register routes for Diego LRPs with TCP Routes)
- TCP Router (to fetch routing table of TCP routes)

Thank you,

Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.

On Wed, Sep 9, 2015 at 6:05 PM, Shannon Coen <scoen(a)pivotal.io> wrote:

Some of the proposed changes to the Routing API are backward incompatible.
We don't believe anyone is using it yet, as adoption has generally been
blocked on securing connections to Consul, but we'd like to confirm.

Please raise your hand if you're using the routing API.

Thank you!

Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.

On Wed, Sep 9, 2015 at 12:10 PM, Shannon Coen <scoen(a)pivotal.io> wrote:

We currently have two routing APIs in CF.
1. HTTP Routing API in cf-release:
https://github.com/cloudfoundry-incubator/routing-api
2. TCP Routing API in cf-routing-release:
https://github.com/cloudfoundry-incubator/cf-routing-release

The TCP Routing API is quite basic and we want to extend it for high
availability, authentication, etc. However, instead of enhancing the
existing TCP Routing API, we plan to add support for TCP route registration
to the Routing API in cf-release, as it already supports many of the
desired features. We'll get rid of the current API in cf-routing-release
and submodule in the Routing API from cf-release. Eventually we'll move the
Routing API (along with GoRouter and HAProxy) from cf-release into
cf-routing-release and submodule them into cf-release.

This consolidation, along with our not having any API consumer besides
GoRouter yet, gives us the opportunity to consider a common behavior for
routing API endpoints. We welcome your comments in our design doc:


https://docs.google.com/document/d/1v941oy3Y7RRI80gmLfhPZlaahElK_Q0C3pCQewK8Z3g/edit?usp=sharing

Thank you,

Shannon Coen
Product Manager, Cloud Foundry
Pivotal, Inc.


Re: Relationship between HM9000 and router jobs

CF Runtime
 

The DEAs are responsible for broadcasting the routes for the apps they are
running. I can't think of why an hm9000 problem would cause routes to get
lost, unless there was some problem with NATS itself.

Joseph
OSS Release Integration Team
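
For background on the mechanics: route registration is a NATS publish. The
route emitter (the DEA itself, for DEA-hosted apps) periodically sends
router.register messages that the router subscribes to, and routes that stop
being refreshed are eventually pruned, which is why losing the emitter loses
the routes. A rough Go sketch using the nats.go client follows; the
host/port/uris fields are just the common subset and the interval is
illustrative:

    package main

    import (
        "encoding/json"
        "log"
        "time"

        "github.com/nats-io/nats.go"
    )

    // Sketch of the router.register message an emitter broadcasts over NATS
    // so the router keeps a route alive. Real emitters include more fields.
    type routerRegister struct {
        Host string   `json:"host"`
        Port int      `json:"port"`
        URIs []string `json:"uris"`
    }

    func main() {
        nc, err := nats.Connect(nats.DefaultURL) // e.g. nats://127.0.0.1:4222
        if err != nil {
            log.Fatal(err)
        }
        defer nc.Close()

        msg, _ := json.Marshal(routerRegister{
            Host: "10.0.16.21",
            Port: 61001,
            URIs: []string{"myapp.example.com"},
        })

        // Routes stay alive only as long as they keep being re-registered;
        // if the emitter stops publishing, the router eventually prunes them.
        for {
            if err := nc.Publish("router.register", msg); err != nil {
                log.Fatal(err)
            }
            time.Sleep(20 * time.Second)
        }
    }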

On Wed, Sep 16, 2015 at 1:37 AM, Sylvain FAUVE <sylvain.fauve(a)sap.com>
wrote:

Hello,

My team was working on solving an inconsistency issue with the etcd jobs,
and realized that two hm9000 jobs were running at the same time.
While fixing this, we experienced route loss to our apps (we then needed to
restage the apps).

As far as I can read/understand, there is no direct communication between
the router and hm9000...
The router gets its info from NATS, and NATS gets it from...? hm9000?
I wonder which component sends route updates to the router to keep
them alive?


Thank you for your help
Regards,
Sylvain.


Re: cc metric: total_users

CF Runtime
 

Yeah, just as it is possible to create users in UAA, it is also possible to
delete them there, thus orphaning the user accounts in the Cloud Controller.

The orphaned users won't cause any problems in the Cloud Controller, but
the metrics may not be what you expect at that point.

The cf CLI deletes users from both when it is used to delete a user.

Joseph
OSS Release Integration Team

On Wed, Sep 16, 2015 at 4:29 AM, Klevenz, Stephan <stephan.klevenz(a)sap.com>
wrote:

Hi,

I looked deeper into the implementation. There are actually two databases,
ccdb and uaadb, and each has its own users table. The user count from
uaadb/users is reported to the admin UI, and the user count from ccdb/users
is used for the total_users metric. A ccdb/users table entry contains just a
reference ID that is used to fetch user details from UAA.

So we have two totals, which can differ if users are created in UAA that
are not CC users. That's fine, and it is what I took from the first
answer.

But there is a remaining open point. The total of our ccdb/users is bigger
than the total of uaadb/users. This is an inconsistency: ccdb/users
contains references to UAA users that do not exist. If this difference
grows over time, it may become a problem.

Regards,
Stephan







From: CF Runtime
Reply-To: "Discussions about Cloud Foundry projects and the system
overall."
Date: Wednesday, 16 September 2015 10:56
To: "Discussions about Cloud Foundry projects and the system overall."
Subject: [cf-dev] Re: Re: Re: Re: cc metric: total_users

The users reported by CloudController.total_users are the users the Cloud
Controller has in its database. This is normally the same set of users that
exist in UAA.

However, there is nothing that prevents you from creating users via the
UAAC cli tool, or creating new UAA clients that can create users themselves.

Joseph
OSS Release Integration Team

On Wed, Sep 16, 2015 at 12:03 AM, Voelz, Marco <marco.voelz(a)sap.com>
wrote:

Hi John,

Thanks for your answer; however, I don't understand it completely. For the
open-source version of CF, what does "registered in the CF console" mean?
And what might be an example of the "other applications" you are referring
to?

Thanks and warm regards
Marco

On 16/09/15 08:29, "Klevenz, Stephan" <stephan.klevenz(a)sap.com> wrote:

Thanks for clarification

-- Stephan

From: John Liptak
Reply-To: "Discussions about Cloud Foundry projects and the system
overall."
Date: Tuesday, 15 September 2015 18:17
To: "Discussions about Cloud Foundry projects and the system overall."
Subject: [cf-dev] Re: cc metric: total_users

Cloud Controller reports the number of users registered in the CF console.
UAAC reports additional users who may have access to other applications. So
they are both correct, depending on what you need.

For example, if you call the Cloud Controller REST API for a UAA user that
isn't in the CF console, you will get a 404.

On Tue, Sep 15, 2015 at 10:10 AM, Klevenz, Stephan <
stephan.klevenz(a)sap.com> wrote:

Hi all,

I have a question, hopefully a small one :-)

The CloudController.total_users metric
(/CF\.CloudController\.0\..*\.total_users/) differs from the number of users
reported by uaac (the uaac users command / admin UI). Can someone explain
why they differ, or which number is the correct one?

Regards,
Stephan





Re: Packaging CF app as bosh-release

Paul Bakare
 

Thanks Dmitriy,

Just for clarity, are you saying multiple instances of a VM cannot share a
single shared filesystem?

On Wed, Sep 16, 2015 at 6:59 PM, Dmitriy Kalinin <dkalinin(a)pivotal.io>
wrote:

BOSH allocates a persistent disk per instance. It never shares persistent
disks between multiple instances at the same time.

If you need a shared file system, you will have to use some kind of a
release for it. It's not any different from what people do with nfs
server/client.

On Wed, Sep 16, 2015 at 7:09 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

The shared file system aspect is an interesting wrinkle to the problem.
Unless you use some network layer for how you write to the shared file
system, e.g. SSHFS, I think apps will not work because they get isolated to
run in a container, they're given a chroot "jail" for their file system,
and it gets blown away whenever the app is stopped or restarted (which will
commonly happen, e.g. during a rolling deploy of the container-runner VMs).

Do you have something that currently works? How do your VMs currently
access this shared FS? I'm not sure BOSH has the abstractions for choosing
a shared, already-existing "persistent disk" to be attached to multiple
VMs. I also don't know what happens when you scale your VMs down, because
BOSH would generally destroy the associated persistent disk, but you don't
want to destroy the shared data.

Dmitriy, any idea how BOSH can work with a shared filesystem (e.g. HDFS)?

Amit

On Wed, Sep 16, 2015 at 6:54 AM, Kayode Odeyemi <dreyemi(a)gmail.com>
wrote:


On Wed, Sep 16, 2015 at 3:44 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Are the spark jobs tasks that you expect to end, or apps that you
expect to run forever?
They are tasks that run forever. The jobs are subscribers to RabbitMQ
queues that process
messages in batches.


Do your jobs need to write to the file system, or do they access a
shared/distributed file system somehow?
The jobs write to shared filesystem.


Do you need things like a static IP allocated to your jobs?
No.


Are your spark jobs serving any web traffic?
No.




Re: Packaging CF app as bosh-release

Dmitriy Kalinin <dkalinin@...>
 

BOSH allocates a persistent disk per instance. It never shares persistent
disks between multiple instances at the same time.

If you need a shared file system, you will have to use some kind of a
release for it. It's not any different from what people do with nfs
server/client.

On Wed, Sep 16, 2015 at 7:09 AM, Amit Gupta <agupta(a)pivotal.io> wrote:

The shared file system aspect is an interesting wrinkle to the problem.
Unless you use some network layer for how you write to the shared file
system, e.g. SSHFS, I think apps will not work because they get isolated to
run in a container, they're given a chroot "jail" for their file system,
and it gets blown away whenever the app is stopped or restarted (which will
commonly happen, e.g. during a rolling deploy of the container-runner VMs).

Do you have something that currently works? How do your VMs currently
access this shared FS? I'm not sure BOSH has the abstractions for choosing
a shared, already-existing "persistent disk" to be attached to multiple
VMs. I also don't know what happens when you scale your VMs down, because
BOSH would generally destroy the associated persistent disk, but you don't
want to destroy the shared data.

Dmitriy, any idea how BOSH can work with a shared filesystem (e.g. HDFS)?

Amit

On Wed, Sep 16, 2015 at 6:54 AM, Kayode Odeyemi <dreyemi(a)gmail.com> wrote:


On Wed, Sep 16, 2015 at 3:44 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Are the spark jobs tasks that you expect to end, or apps that you expect
to run forever?
They are tasks that run forever. The jobs are subscribers to RabbitMQ
queues that process
messages in batches.


Do your jobs need to write to the file system, or do they access a
shared/distributed file system somehow?
The jobs write to shared filesystem.


Do you need things like a static IP allocated to your jobs?
No.


Are your spark jobs serving any web traffic?
No.




Re: Packaging CF app as bosh-release

Jim Park
 

I don't know if BOSH has the abstractions for shared filesystems, but a
helper monit job could make sure that the filesystem is mounted and use
deployment-manifest-supplied parameters to drive the ctl script. Monit also
has a file_exists? status check; the job could wait on this as a monit-level
dependency.

On Wed, Sep 16, 2015 at 8:11 AM Kayode Odeyemi <dreyemi(a)gmail.com> wrote:

On Wed, Sep 16, 2015 at 4:09 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

How do your VMs currently access this shared FS?
This is the big ????. What I planned on using is HDFS over http or
SWIFTFS. I would have to see what this gives.

I also don't know what happens when you scale your VMs down, because BOSH
would generally destroy the associated persistent disk
Until you mentioned this, my understanding was that this wasn't going to be
an issue for BOSH.

I don't have something useful yet since I'm still experimenting. I would
push it somewhere once I'm able to package it as a bosh release.


CloudFoundry not finding cloud profile to connect using Spring Cloud

Flávio Henrique Schuindt da Silva <flavio.schuindt at gmail.com...>
 

Hi, guys.

I'm following this guide to connect a Spring Boot app deployed on CF to a MySQL service bound to the app (accessing it via Spring Data JPA, etc.): https://github.com/cf-platform-eng/spring-boot-cities/blob/master/cities-service/demo-script.adoc

Everything works fine until Spring Cloud. When the app starts, I get this in the logs: org.springframework.cloud.CloudException: No unique service matching interface javax.sql.DataSource found. Expected 1, found 0

The app is bound to the service, but it seems that CF is not setting the "cloud" profile.

What am I missing?

Thanks in advance, guys!


Re: DEA/Warden staging error

kyle havlovitz <kylehav@...>
 

It's just the staticfile buildpack from
https://github.com/cloudfoundry/staticfile-buildpack.git. I can try git
cloning it from inside the warden container, though.

On Wed, Sep 16, 2015 at 9:30 AM, Mike Dalessio <mdalessio(a)pivotal.io> wrote:

Worth noting that the git repo also needs to allow anonymous access. If
it's a private repo, then the 'git clone' is going to fail.

Can you verify that you can download the buildpack from your repo without
authenticating?

On Tue, Sep 15, 2015 at 7:43 PM, CF Runtime <cfruntime(a)gmail.com> wrote:

It's not something we've ever seen before.

In theory, the warden container needs the git binary, which I think it
gets from the cflinuxfs2 stack; and internet access to wherever the git
repo lives.

If the warden container has both of those things, I can't think of any
reason why it wouldn't work.

Joseph
OSS Release Integration Team

On Tue, Sep 15, 2015 at 2:06 PM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

I tried deploying by uploading a buildpack to the CC (I had to set up
nginx first; I didn't have it running/configured before) and that worked! So
that's awesome, but I'm not sure what the problem with using a remote
buildpack is. Even with nginx, I still get the exact same error as before
when pushing using a remote buildpack from git.

On Tue, Sep 15, 2015 at 6:57 AM, CF Runtime <cfruntime(a)gmail.com> wrote:

Looking at the logs, we can see it finishing downloading the app
package. The next step should be to download and run the buildpack. Since
you mention there is no output after this, I'm guessing it doesn't get that
far.

It might be having trouble downloading the buildpack from the remote
git url. Could you try uploading the buildpack to Cloud Controller and then
having it use that buildpack to see if that makes a difference?


http://apidocs.cloudfoundry.org/217/buildpacks/creates_an_admin_buildpack.html

http://apidocs.cloudfoundry.org/217/buildpacks/upload_the_bits_for_an_admin_buildpack.html

Joseph
OSS Release Integration Team

On Mon, Sep 14, 2015 at 5:37 PM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

Here's the full dea_ng and warden debug logs:
https://gist.github.com/MrEnzyme/6dcc74174482ac62c1cf

Are there any other places I should look for logs?

On Mon, Sep 14, 2015 at 8:14 PM, CF Runtime <cfruntime(a)gmail.com>
wrote:

That's not an error we normally get. It's not clear if the
staging_info.yml error is the source of the problem or an artifact of it.
Having more logs would allow us to speculate more.

Joseph & Dan
OSS Release Integration Team

On Mon, Sep 14, 2015 at 2:24 PM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

I have the cloudfoundry components built, configured and running on
one VM (not in BOSH), and when I push an app I'm getting a generic 'FAILED
StagingError' message after '-----> Downloaded app package (460K)'.

There's nothing in the logs for the dea/warden that seems suspect
other than these 2 things:


{
  "timestamp": 1441985105.8883495,
  "message": "Exited with status 1 (35.120s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\", \"/var/warden/containers/18vf956il5v/jobs/8/cursors\", \"/var/warden/containers/18vf956il5v/jobs/8\"]",
  "log_level": "warn",
  "source": "Warden::Container::Linux",
  "data": {
    "handle": "18vf956il5v",
    "stdout": "",
    "stderr": ""
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}



{
  "timestamp": 1441985105.94083,
  "message": "Exited with status 23 (0.023s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\", \"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket /var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\", \"--links\", \"vcap(a)container:/tmp/staged/staging_info.yml\", \"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
  "log_level": "warn",
  "source": "Warden::Container::Linux",
  "data": {
    "handle": "18vf956il5v",
    "stdout": "",
    "stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\" failed: No such file or directory (2)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}


And I think the second error is just during cleanup, only failing
because the staging process didn't get far enough in to create the
'staging_info.yml'. The one about iomux-link exiting with status 1 is
pretty mysterious though and I have no idea what caused it. Does anyone
know why this might be happening?


Re: Environment variable names (was RE: Environment variable changes in DIEGO)

Onsi Fakhouri <ofakhouri@...>
 

Would if we could! Environment variables are part of the surface area that constitutes the contract between the platform and the application. While we could have duplicates with CF_* and eventually deprecate VCAP_* the new runtime needs to support droplets staged by the legacy runtime seamlessly.

Once the dust settles I'd sooner see us step back and reconsider this particular abstraction. For example, instead of env vars an authenticated metadata service a la AWS's http://169.254.169.254/ would give us more flexibility, allow us to dynamically control metadata made available to apps, and even version metadata sanely.

Onsi

Sent from my iPad

On Sep 16, 2015, at 8:30 AM, john mcteague <john.mcteague(a)gmail.com> wrote:

On a related, but slightly off, topic, while renaming the VCAP_* vars would have a big impact, is it not time we thought about renaming these to CF_* ?

John.

On 16 Sep 2015 16:09, "Matthew Sykes" <matthew.sykes(a)gmail.com> wrote:
The changes, in general, were intentional. The `application_uris` data was always broken as it didn't reflect route changes. I can't speak directly to the time stamp data.

The host is present still so I don't know why you don't see it.

We also have a migration guide [1]. If you think additional information is needed there, pull requests would be welcome.

[1]: https://github.com/cloudfoundry-incubator/diego-design-notes/blob/master/migrating-to-diego.md

On Wed, Sep 16, 2015 at 10:19 AM, Jack Cai <greensight(a)gmail.com> wrote:
I notice the below changes in the environment variables of DIEGO:
1. VCAP_APP_HOST & VCAP_APP_PORT are removed.
2. These fields are removed from VCAP_APPLICATION value: application_uris, started_at, start, started_at_timestamp, host, state_timestamp

I suppose #1 is intentional. How about #2?

Jack


--
Matthew Sykes
matthew.sykes(a)gmail.com


Environment variable names (was RE: Environment variable changes in DIEGO)

john mcteague <john.mcteague@...>
 

On a related, but slightly off, topic, while renaming the VCAP_* vars would
have a big impact, is it not time we thought about renaming these to CF_* ?

John.

On 16 Sep 2015 16:09, "Matthew Sykes" <matthew.sykes(a)gmail.com> wrote:

The changes, in general, were intentional. The `application_uris` data was
always broken as it didn't reflect route changes. I can't speak directly to
the time stamp data.

The host is present still so I don't know why you don't see it.

We also have a migration guide [1]. If you think additional information is
needed there, pull requests would be welcome.

[1]:
https://github.com/cloudfoundry-incubator/diego-design-notes/blob/master/migrating-to-diego.md

On Wed, Sep 16, 2015 at 10:19 AM, Jack Cai <greensight(a)gmail.com> wrote:

I notice the below changes in the environment variables of DIEGO:
1. VCAP_APP_HOST & VCAP_APP_PORT are removed.
2. These fields are removed from VCAP_APPLICATION value:
application_uris, started_at, start, started_at_timestamp, host,
state_timestamp

I suppose #1 is intentional. How about #2?

Jack


--
Matthew Sykes
matthew.sykes(a)gmail.com


Re: Packaging CF app as bosh-release

Paul Bakare
 

On Wed, Sep 16, 2015 at 4:09 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

How do your VMs currently access this shared FS?
This is the big ????. What I planned on using is HDFS over http or
SWIFTFS. I would have to see what this gives.

I also don't know what happens when you scale your VMs down, because BOSH
would generally destroy the associated persistent disk
Until you mentioned this, my understanding was that this wasn't going to be
an issue for BOSH.

I don't have something useful yet since I'm still experimenting. I would
push it somewhere once I'm able to package it as a bosh release.
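
In case it helps with the "HDFS over http" experiment: if that ends up
meaning WebHDFS, a minimal write might look like the sketch below. The
host, port, user and path are placeholders, not anything from this thread;
WebHDFS answers the initial namenode PUT with a redirect to a datanode,
which Go's HTTP client follows, re-sending the body:

    package main

    import (
        "bytes"
        "fmt"
        "io/ioutil"
        "log"
        "net/http"
    )

    // Minimal WebHDFS write sketch ("HDFS over http"). All names below are
    // placeholders for illustration only.
    func main() {
        data := bytes.NewReader([]byte("batch results\n"))

        url := "http://namenode.example.com:50070/webhdfs/v1/spark/output/part-0000" +
            "?op=CREATE&overwrite=true&user.name=vcap"

        req, err := http.NewRequest("PUT", url, data)
        if err != nil {
            log.Fatal(err)
        }

        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            log.Fatal(err)
        }
        defer resp.Body.Close()

        body, _ := ioutil.ReadAll(resp.Body)
        fmt.Println(resp.Status, string(body)) // expect 201 Created on success
    }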


Re: Environment variable changes in DIEGO

Matthew Sykes <matthew.sykes@...>
 

The changes, in general, were intentional. The `application_uris` data was
always broken as it didn't reflect route changes. I can't speak directly to
the time stamp data.

The host is present still so I don't know why you don't see it.

We also have a migration guide [1]. If you think additional information is
needed there, pull requests would be welcome.

[1]:
https://github.com/cloudfoundry-incubator/diego-design-notes/blob/master/migrating-to-diego.md

On Wed, Sep 16, 2015 at 10:19 AM, Jack Cai <greensight(a)gmail.com> wrote:

I notice the below changes in the environment variables of DIEGO:
1. VCAP_APP_HOST & VCAP_APP_PORT are removed.
2. These fields are removed from VCAP_APPLICATION value: application_uris,
started_at, start, started_at_timestamp, host, state_timestamp

I suppose #1 is intentional. How about #2?

Jack

--
Matthew Sykes
matthew.sykes(a)gmail.com


Environment variable changes in DIEGO

Jack Cai
 

I notice the below changes in the environment variables of DIEGO:
1. VCAP_APP_HOST & VCAP_APP_PORT are removed.
2. These fields are removed from VCAP_APPLICATION value: application_uris,
started_at, start, started_at_timestamp, host, state_timestamp

I suppose #1 is intentional. How about #2?

Jack
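
For droplets that read these values, one defensive approach is to treat the
removed fields as optional when parsing VCAP_APPLICATION. A small Go sketch
(the field names are only the ones listed above; it makes no claims about
what else Diego provides or omits):

    package main

    import (
        "encoding/json"
        "fmt"
        "os"
    )

    // Parse VCAP_APPLICATION without assuming the fields removed on Diego
    // (application_uris, started_at, host, ...) are present. A missing
    // application_uris simply decodes to a nil slice, so check its length.
    type vcapApplication struct {
        ApplicationName string   `json:"application_name"`
        ApplicationURIs []string `json:"application_uris"` // may be absent on Diego
        Host            string   `json:"host"`             // may be absent on Diego
        StartedAt       string   `json:"started_at"`       // may be absent on Diego
    }

    func main() {
        raw := os.Getenv("VCAP_APPLICATION")
        if raw == "" {
            fmt.Println("not running on Cloud Foundry")
            return
        }

        var app vcapApplication
        if err := json.Unmarshal([]byte(raw), &app); err != nil {
            fmt.Println("could not parse VCAP_APPLICATION:", err)
            return
        }

        if len(app.ApplicationURIs) == 0 {
            fmt.Println("no application_uris in VCAP_APPLICATION; falling back")
        } else {
            fmt.Println("first route:", app.ApplicationURIs[0])
        }
        fmt.Println("app name:", app.ApplicationName)
    }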


Re: Packaging CF app as bosh-release

Amit Kumar Gupta
 

The shared file system aspect is an interesting wrinkle to the problem.
Unless you use some network layer for how you write to the shared file
system, e.g. SSHFS, I think apps will not work because they get isolated to
run in a container, they're given a chroot "jail" for their file system,
and it gets blown away whenever the app is stopped or restarted (which will
commonly happen, e.g. during a rolling deploy of the container-runner VMs).

Do you have something that currently works? How do your VMs currently
access this shared FS? I'm not sure BOSH has the abstractions for choosing
a shared, already-existing "persistent disk" to be attached to multiple
VMs. I also don't know what happens when you scale your VMs down, because
BOSH would generally destroy the associated persistent disk, but you don't
want to destroy the shared data.

Dmitriy, any idea how BOSH can work with a shared filesystem (e.g. HDFS)?

Amit

On Wed, Sep 16, 2015 at 6:54 AM, Kayode Odeyemi <dreyemi(a)gmail.com> wrote:


On Wed, Sep 16, 2015 at 3:44 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Are the spark jobs tasks that you expect to end, or apps that you expect
to run forever?
They are tasks that run forever. The jobs are subscribers to RabbitMQ
queues that process
messages in batches.


Do your jobs need to write to the file system, or do they access a
shared/distributed file system somehow?
The jobs write to shared filesystem.


Do you need things like a static IP allocated to your jobs?
No.


Are your spark jobs serving any web traffic?
No.




Re: Packaging CF app as bosh-release

Paul Bakare
 

On Wed, Sep 16, 2015 at 3:44 PM, Amit Gupta <agupta(a)pivotal.io> wrote:

Are the spark jobs tasks that you expect to end, or apps that you expect
to run forever?
They are tasks that run forever. The jobs are subscribers to RabbitMQ
queues that process
messages in batches.


Do your jobs need to write to the file system, or do they access a
shared/distributed file system somehow?
The jobs write to shared filesystem.


Do you need things like a static IP allocated to your jobs?
No.


Are your spark jobs serving any web traffic?
No.


Re: Packaging CF app as bosh-release

Amit Kumar Gupta
 

Are the spark jobs tasks that you expect to end, or apps that you expect to
run forever?
Do your jobs need to write to the file system, or do they access a
shared/distributed file system somehow?
Do you need things like a static IP allocated to your jobs?
Are your spark jobs serving any web traffic?

On Wed, Sep 16, 2015 at 1:32 AM, Kayode Odeyemi <dreyemi(a)gmail.com> wrote:


On Wed, Sep 16, 2015 at 5:15 AM, Amit Gupta <amitkgupta84(a)gmail.com>
wrote:

Can you say a bit more about what you're trying to do?

I'm working on an experimental analytics project that leverages logsearch
+ Apache Spark.

So instead of having the Spark jobs as apps, I'm thinking of building a
bosh release job for it.

Can possibly package anything, but that way the application will run inside
a VM rather than on the platform.

What are the cons of apps running on VMs instead of Warden?


Re: DEA/Warden staging error

Mike Dalessio
 

Worth noting that the git repo also needs to allow anonymous access. If
it's a private repo, then the 'git clone' is going to fail.

Can you verify that you can download the buildpack from your repo without
authenticating?

On Tue, Sep 15, 2015 at 7:43 PM, CF Runtime <cfruntime(a)gmail.com> wrote:

It's not something we've ever seen before.

In theory, the warden container needs the git binary, which I think it
gets from the cflinuxfs2 stack; and internet access to wherever the git
repo lives.

If the warden container has both of those things, I can't think of any
reason why it wouldn't work.

Joseph
OSS Release Integration Team

On Tue, Sep 15, 2015 at 2:06 PM, kyle havlovitz <kylehav(a)gmail.com> wrote:

I tried deploying by uploading a buildpack to the CC (I had to set up
nginx first; I didn't have it running/configured before) and that worked! So
that's awesome, but I'm not sure what the problem with using a remote
buildpack is. Even with nginx, I still get the exact same error as before
when pushing using a remote buildpack from git.

On Tue, Sep 15, 2015 at 6:57 AM, CF Runtime <cfruntime(a)gmail.com> wrote:

Looking at the logs, we can see it finishing downloading the app
package. The next step should be to download and run the buildpack. Since
you mention there is no output after this, I'm guessing it doesn't get that
far.

It might be having trouble downloading the buildpack from the remote git
url. Could you try uploading the buildpack to Cloud Controller and then
having it use that buildpack to see if that makes a difference?


http://apidocs.cloudfoundry.org/217/buildpacks/creates_an_admin_buildpack.html

http://apidocs.cloudfoundry.org/217/buildpacks/upload_the_bits_for_an_admin_buildpack.html

Joseph
OSS Release Integration Team

On Mon, Sep 14, 2015 at 5:37 PM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

Here's the full dea_ng and warden debug logs:
https://gist.github.com/MrEnzyme/6dcc74174482ac62c1cf

Are there any other places I should look for logs?

On Mon, Sep 14, 2015 at 8:14 PM, CF Runtime <cfruntime(a)gmail.com>
wrote:

That's not an error we normally get. It's not clear if the
staging_info.yml error is the source of the problem or an artifact of it.
Having more logs would allow us to speculate more.

Joseph & Dan
OSS Release Integration Team

On Mon, Sep 14, 2015 at 2:24 PM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

I have the cloudfoundry components built, configured and running on
one VM (not in BOSH), and when I push an app I'm getting a generic 'FAILED
StagingError' message after '-----> Downloaded app package (460K)'.

There's nothing in the logs for the dea/warden that seems suspect
other than these 2 things:


{
  "timestamp": 1441985105.8883495,
  "message": "Exited with status 1 (35.120s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\", \"/var/warden/containers/18vf956il5v/jobs/8/cursors\", \"/var/warden/containers/18vf956il5v/jobs/8\"]",
  "log_level": "warn",
  "source": "Warden::Container::Linux",
  "data": {
    "handle": "18vf956il5v",
    "stdout": "",
    "stderr": ""
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}



{
  "timestamp": 1441985105.94083,
  "message": "Exited with status 23 (0.023s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\", \"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket /var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\", \"--links\", \"vcap(a)container:/tmp/staged/staging_info.yml\", \"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
  "log_level": "warn",
  "source": "Warden::Container::Linux",
  "data": {
    "handle": "18vf956il5v",
    "stdout": "",
    "stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\" failed: No such file or directory (2)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}


And I think the second error is just during cleanup, only failing
because the staging process didn't get far enough in to create the
'staging_info.yml'. The one about iomux-link exiting with status 1 is
pretty mysterious though and I have no idea what caused it. Does anyone
know why this might be happening?
