Warden: staging error when pushing app


Alexander Lomov <alexander.lomov@...>
 

Hey, Kyle.

Just a quick tip I use to debug failed containers: I comment out the contents of the do_destroy method in /var/vcap/jobs/dea_next/packages/warden/warden/lib/warden/container/linux.rb to prevent warden from removing any containers at all; after that you’ll need to run “sudo /var/vcap/bosh/bin/monit -I restart warden”. This change can lead to dangerous behavior, so don’t do it in any sort of production environment.
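Roughly, the edited method ends up as a no-op. This is only a sketch, not the exact warden source; the commented-out line is a stand-in for whatever cleanup the real do_destroy performs:

# Sketch only: making do_destroy a no-op means warden never cleans up
# containers, so failed staging containers stay on disk under
# /var/warden/containers for inspection.
def do_destroy
  # original cleanup commented out while debugging, e.g. something like:
  # sh File.join(container_path, "destroy.sh")
  nil
end

Remember to revert the edit and restart warden again once you’re done poking around.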

You can probably find something useful in my blog post [1].

[1] http://blog.altoros.com/creating-a-custom-cloud-foundry-buildpack-from-scratch-whats-under-the-hood.html

On Sep 12, 2015, at 12:44 AM, kyle havlovitz <kylehav(a)gmail.com> wrote:

I built all the components on my own and have them running on one VM, with config changed from the bosh defaults as needed (I already had this working with an older CF version, v203, and now I'm updating).

It seems like the app and buildpack are getting placed in the container successfully, but the push never gets past the '-----> Downloaded app package (460K)'.
I also tried creating a container with the warden client; it says it was created successfully, but warden deletes it soon (about 30 seconds) afterward. I have 'container_grace_time' set to 300, so I thought it would stick around longer.

On Fri, Sep 11, 2015 at 5:07 PM, CF Runtime <cfruntime(a)gmail.com> wrote:
Hi Kyle,

I'm not sure what you mean by "running it locally". Can you explain in more detail how you've deployed your CF installation?

Zak + Dan

On Fri, Sep 11, 2015 at 1:12 PM, kyle havlovitz <kylehav(a)gmail.com> wrote:
I also see this error in the DEA logs: {"timestamp":1441999712.400334,"message":"Error copying files out of container: command exited with failure","log_level":"warn","source":"Staging","data":{"app_guid":"207dd67c-415a-4108-a31d-7e82590b1e28","task_id":"01db0928ddf14426b47e21459bf1e57a"},"thread_id":70184860078900,"fiber_id":70184883843040,"process_id":23523,"file":"/opt/cloudfoundry/dea_next/lib/dea/task.rb","lineno":126,"method":"rescue in copy_out_request"}
{"timestamp":1441999712.4004574,"message":"staging.task-log.copying-out","log_level":"info","source":"Staging","data":{"app_guid":"207dd67c-415a-4108-a31d-7e82590b1e28","task_id":"01db0928ddf14426b47e21459bf1e57a","source":"/tmp/staged/logs/staging_task.log","destination":"/tmp/dea_ng/staging/d20150911-23523-ms2lax/staging_task.log"},"thread_id":70184860078900,"fiber_id":70184883400540,"process_id":23523,"file":"/opt/cloudfoundry/dea_next/lib/dea/staging/staging_task.rb","lineno":258,"method":"block in promise_task_log"}

Could it be something wrong with the DEA's directory server? I don't see anything in its logs after it starts, so maybe it isn't being used.

On Fri, Sep 11, 2015 at 1:55 PM, kyle havlovitz <kylehav(a)gmail.com> wrote:
I'm not using bosh, just trying to run it locally. I can post the dea and warden configs though:

dea.yml:
base_dir: /tmp/dea_ng

domain: local.example.com

logging:
file: /opt/cloudfoundry/logs/dea_ng.log
level: debug

loggregator:
router: "127.0.0.1:3456"
shared_secret: "secret"

resources:
memory_mb: 8000
memory_overcommit_factor: 1
disk_mb: 40960
disk_overcommit_factor: 1

nats_servers:
- nats://127.0.0.1:4222

pid_filename: /tmp/dea_ng.pid

warden_socket: /tmp/warden.sock

evacuation_delay_secs: 10

default_health_check_timeout: 120

index: 0

intervals:
heartbeat: 10
advertise: 5
router_register_in_seconds: 20

staging:
enabled: true
memory_limit_mb: 4096
disk_limit_mb: 6144
disk_inode_limit: 200000
cpu_limit_shares: 512
max_staging_duration: 900

instance:
disk_inode_limit: 200000
memory_to_cpu_share_ratio: 8
max_cpu_share_limit: 256
min_cpu_share_limit: 1

dea_ruby: /usr/bin/ruby

# For Go-based directory server
directory_server:
protocol: 'http'
v1_port: 4385
v2_port: 5678
file_api_port: 1234
streaming_timeout: 10
logging:
file: /opt/cloudfoundry/logs/dea_dirserver.log
level: debug

stacks:
- name: cflinuxfs2
package_path: /var/warden/rootfs

placement_properties:
zone: "zone"

warden test_vm.yml:

server:
container_klass: Warden::Container::Linux

# Wait this long before destroying a container, after the last client
# referencing it disconnected. The timer is cancelled when during this
# period, another client references the container.
#
# Clients can be forced to specify this setting by setting the
# server-wide variable to an invalid value:
# container_grace_time: invalid
#
# The grace time can be disabled by setting it to nil:
# container_grace_time: ~
#
container_grace_time: 300

unix_domain_permissions: 0777
unix_domain_path: /tmp/warden.sock

# Specifies the path to the base chroot used as the read-only root
# filesystem
container_rootfs_path: /var/warden/rootfs

# Specifies the path to the parent directory under which all containers
# will live.
container_depot_path: /var/warden/containers

# See getrlimit(2) for details. Integer values are passed verbatim.
container_rlimits:
core: 0

quota:
disk_quota_enabled: false

allow_nested_warden: false

health_check_server:
port: 2345

logging:
file: /opt/cloudfoundry/logs/warden.log
level: debug2

network:
# Use this /30 network as offset for the network pool.
pool_start_address: 10.254.0.0

# Pool this many /30 networks.
pool_size: 256

# Interface MTU size
# (for OpenStack use 1454 to avoid problems with rubygems with GRE tunneling)
mtu: 1500

user:
pool_start_uid: 11000
pool_size: 256

This is all using the latest CF release (v217).

On Fri, Sep 11, 2015 at 1:31 PM, CF Runtime <cfruntime(a)gmail.com> wrote:
Hey Kyle,

Can we take a look at your deployment manifest (with all the secrets redacted)?

Zak + Dan, CF OSS Integration team

On Fri, Sep 11, 2015 at 8:55 AM, kyle havlovitz <kylehav(a)gmail.com> wrote:
I'm getting an error pushing any app during the staging step. cf logs returns only this:
2015-09-11T15:24:24.33+0000 [API] OUT Created app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:24.41+0000 [API] OUT Updated app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"route"=>"5737c5f5-b017-43da-9013-2b6fe7db03f7"})
2015-09-11T15:24:29.54+0000 [DEA/0] OUT Got staging request for app with id 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:30.71+0000 [API] OUT Updated app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"state"=>"STARTED"})
2015-09-11T15:24:30.76+0000 [STG/0] OUT -----> Downloaded app package (4.0K)
2015-09-11T15:25:06.00+0000 [API] ERR encountered error: Staging error: failed to stage application:
2015-09-11T15:25:06.00+0000 [API] ERR Script exited with status 1

In the warden logs, there are a few suspect messages:
{
"timestamp": 1441985105.8883495,
"message": "Exited with status 1 (35.120s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\", \"/var/warden/containers/18vf956il5v/jobs/8/cursors\", \"/var/warden/containers/18vf956il5v/jobs/8\"]",
"log_level": "warn",
"source": "Warden::Container::Linux",
"data": {
"handle": "18vf956il5v",
"stdout": "",
"stderr": ""
},
"thread_id": 69890836968240,
"fiber_id": 69890849112480,
"process_id": 17063,
"file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,
"method": "set_deferred_success"
}

{
"timestamp": 1441985105.94083,
"message": "Exited with status 23 (0.023s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\", \"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket /var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\", \"--links\", \"vcap(a)container:/tmp/staged/staging_info.yml\", \"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
"log_level": "warn",
"source": "Warden::Container::Linux",
"data": {
"handle": "18vf956il5v",
"stdout": "",
"stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\" failed: No such file or directory (2)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
},
"thread_id": 69890836968240,
"fiber_id": 69890849112480,
"process_id": 17063,
"file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,
"method": "set_deferred_success"
}

{
"timestamp": 1441985106.0086887,
"message": "Killing oom-notifier process",
"log_level": "debug",
"source": "Warden::Container::Features::MemLimit::OomNotifier",
"data": {

},
"thread_id": 69890836968240,
"fiber_id": 69890848620580,
"process_id": 17063,
"file": "/opt/cloudfoundry/warden/warden/lib/warden/container/features/mem_limit.rb",
"lineno": 51,
"method": "kill"
}

{
"timestamp": 1441985106.0095143,
"message": "Exited with status 0 (35.427s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"/opt/cloudfoundry/warden/warden/src/oom/oom\", \"/tmp/warden/cgroup/memory/instance-18vf956il5v\"]",
"log_level": "warn",
"source": "Warden::Container::Features::MemLimit::OomNotifier",
"data": {
"stdout": "",
"stderr": ""
},
"thread_id": 69890836968240,
"fiber_id": 69890849112480,
"process_id": 17063,
"file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,
"method": "set_deferred_success"
}


Obviously something is misconfigured, but I'm not sure what. I don't know why the out-of-memory messages are appearing, as the memory used by the test app I've pushed is tiny (a 64M app with the staticfile buildpack), and the dea config has resources.memory_mb set to 8 gigs and staging.memory_limit_mb set to 1 gig. Is there some config I'm missing that's causing this to fail?
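(As a sanity check on the OOM side, you could read the cgroup files for the instance named in the warden log above and see what memory limit was actually applied to the staging container. A sketch, assuming the standard cgroup-v1 memory controller file names under the path the oom notifier printed:

# Inspect the memory cgroup warden created for the staging container.
cgroup = "/tmp/warden/cgroup/memory/instance-18vf956il5v"

%w[memory.limit_in_bytes memory.max_usage_in_bytes memory.failcnt].each do |name|
  path = File.join(cgroup, name)
  puts "#{name}: #{File.read(path).strip}" if File.exist?(path)
end

If limit_in_bytes comes back far below what staging.memory_limit_mb suggests, the limit being handed to warden is the thing to chase.)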


CF Runtime
 

I think that the DEA will reap any warden containers it finds every 30
seconds. The containers in a grace period are containers it is still
keeping track of.

We're not sure about the error you're getting. If you provide the warden
and dea logs we might be able to trace through what is happening.

Joseph & Dan
OSS Release Integration Team
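For reference, one way to watch this is to talk to warden over its socket the same way the DEA does and poll the container list. The sketch below is only an illustration and leans on assumptions (that the warden-client gem of this era exposes Warden::Client with connect/call plus the Warden::Protocol request classes); adjust to whatever your checkout actually provides:

require "warden/client"
require "warden/protocol"

# Create a container by hand, then poll the list to see roughly when the
# handle disappears (grace-time expiry vs. the DEA reaping it).
client = Warden::Client.new("/tmp/warden.sock")
client.connect

handle = client.call(Warden::Protocol::CreateRequest.new).handle
puts "created #{handle}"

10.times do
  handles = client.call(Warden::Protocol::ListRequest.new).handles
  puts "#{Time.now.utc} still alive: #{handles.include?(handle)}"
  sleep 15
end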

kyle havlovitz <kylehav@...>
 

I'm not sure if the "rsync: link_stat \"/tmp/staged/staging_info.yml\"
failed: No such file or directory (2)" is the cause of the staging failure
or just a side effect of the staging not happening correctly. Does anyone
have any ideas on why this is happening?
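One way to tell the two apart would be to keep the staging container around (for example with the do_destroy trick mentioned at the top of the thread) and look inside it. A sketch, under the same assumed Warden::Client / Warden::Protocol API as above; the handle is just the one from the warden log and should be replaced with a live one:

require "warden/client"
require "warden/protocol"

# Run a command inside an existing container to see whether the buildpack
# ever wrote /tmp/staged/staging_info.yml or whether staging died earlier.
client = Warden::Client.new("/tmp/warden.sock")
client.connect

response = client.call(Warden::Protocol::RunRequest.new(
  handle: "18vf956il5v",  # placeholder taken from the warden log
  script: "ls -l /tmp/staged && tail -n 50 /tmp/staged/logs/staging_task.log"
))

puts "exit status: #{response.exit_status}"
puts response.stdout
puts response.stderr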

kyle havlovitz <kylehav@...>
 

I built all the components on my own and have them running on one VM, with
config changed from the bosh defaults as needed (I had already had this
working, but with an older CF version (203) and now i'm updating).

It seems like the app and buildpack are getting placed in the container
successfully, but the push never gets past the '-----> Downloaded app
package (460K)'.
I also tried creating a container with the warden client and it says it was
created successfully, then gets deleted by warden soon (about 30s)
afterward. I have the 'container_grace_time' set to 300, so I thought it
would be longer.

On Fri, Sep 11, 2015 at 5:07 PM, CF Runtime <cfruntime(a)gmail.com> wrote:

Hi Kyle,

I'm not sure what mean by "running it locally". Can you explain in more
detail how you've deployed your CF installation?

Zak + Dan

On Fri, Sep 11, 2015 at 1:12 PM, kyle havlovitz <kylehav(a)gmail.com> wrote:

I also see this error in the DEA logs:
{"timestamp":1441999712.400334,"message":"Error copying files out of
container: command exited with
failure","log_level":"warn","source":"Staging","data":{"app_guid":"207dd67c-415a-4108-a31d-7e82590b1e28","task_id":"01db0928ddf14426b47e21459bf1e57a"},"thread_id":70184860078900,"fiber_id":70184883843040,"process_id":23523,"file":"/opt/cloudfoundry/dea_next/lib/dea/task.rb","lineno":126,"method":"rescue
in copy_out_request"}
{"timestamp":1441999712.4004574,"message":"staging.task-log.copying-out","log_level":"info","source":"Staging","data":{"app_guid":"207dd67c-415a-4108-a31d-7e82590b1e28","task_id":"01db0928ddf14426b47e21459bf1e57a","source":"/tmp/staged/logs/staging_task.log","destination":"/tmp/dea_ng/staging/d20150911-23523-ms2lax/staging_task.log"},"thread_id":70184860078900,"fiber_id":70184883400540,"process_id":23523,"file":"/opt/cloudfoundry/dea_next/lib/dea/staging/staging_task.rb","lineno":258,"method":"block
in promise_task_log"}

Could it be something wrong with the DEA's directory server? I don't
anything in its logs after it starts so maybe it isn't being used.

On Fri, Sep 11, 2015 at 1:55 PM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

I'm not using bosh, just trying to run it locally. I can post the dea
and warden configs though:

dea.yml:

base_dir: /tmp/dea_ng


domain: local.example.com

logging:
file: /opt/cloudfoundry/logs/dea_ng.log

level: debug


loggregator:
router: "127.0.0.1:3456"

shared_secret: "secret"


resources:
memory_mb: 8000

memory_overcommit_factor: 1

disk_mb: 40960

disk_overcommit_factor: 1


nats_servers:
- nats://127.0.0.1:4222


pid_filename: /tmp/dea_ng.pid

warden_socket: /tmp/warden.sock

evacuation_delay_secs: 10

default_health_check_timeout: 120

index: 0

intervals:
heartbeat: 10

advertise: 5

router_register_in_seconds: 20


staging:
enabled: true

memory_limit_mb: 4096

disk_limit_mb: 6144

disk_inode_limit: 200000

cpu_limit_shares: 512

max_staging_duration: 900


instance:
disk_inode_limit: 200000

memory_to_cpu_share_ratio: 8

max_cpu_share_limit: 256

min_cpu_share_limit: 1


dea_ruby: /usr/bin/ruby

# For Go-based directory server
directory_server:

protocol: 'http'

v1_port: 4385

v2_port: 5678

file_api_port: 1234

streaming_timeout: 10

logging:

file: /opt/cloudfoundry/logs/dea_dirserver.log

level: debug


stacks:
- name: cflinuxfs2

package_path: /var/warden/rootfs


placement_properties:
zone: "zone"

warden test_vm.yml:

server:

container_klass: Warden::Container::Linux


# Wait this long before destroying a container, after the last client
# referencing it disconnected. The timer is cancelled when during this

# period, another client references the container.

#

# Clients can be forced to specify this setting by setting the

# server-wide variable to an invalid value:

# container_grace_time: invalid

#

# The grace time can be disabled by setting it to nil:

# container_grace_time: ~

#

container_grace_time: 300


unix_domain_permissions: 0777
unix_domain_path: /tmp/warden.sock


# Specifies the path to the base chroot used as the read-only root
# filesystem

container_rootfs_path: /var/warden/rootfs


# Specifies the path to the parent directory under which all
containers
# will live.

container_depot_path: /var/warden/containers


# See getrlimit(2) for details. Integer values are passed verbatim.
container_rlimits:

core: 0


quota:
disk_quota_enabled: false


allow_nested_warden: false

health_check_server:
port: 2345


logging:
file: /opt/cloudfoundry/logs/warden.log

level: debug2


network:
# Use this /30 network as offset for the network pool.

pool_start_address: 10.254.0.0


# Pool this many /30 networks.
pool_size: 256


# Interface MTU size
# (for OpenStack use 1454 to avoid problems with rubygems with GRE
tunneling)
mtu: 1500


user:
pool_start_uid: 11000

pool_size: 256


This is all using the latest CFv217

On Fri, Sep 11, 2015 at 1:31 PM, CF Runtime <cfruntime(a)gmail.com> wrote:

Hey Kyle,

Can we take a look at your deployment manifest (with all the secrets
redacted)?

Zak + Dan, CF OSS Integration team

On Fri, Sep 11, 2015 at 8:55 AM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

I'm getting an error pushing any app during the staging step. The cf
logs returns only this:

2015-09-11T15:24:24.33+0000 [API] OUT Created app with guid
47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:24.41+0000 [API] OUT Updated app with guid
47efa472-e2d5-400b-9135-b1a1dbe3ba88
({"route"=>"5737c5f5-b017-43da-9013-2b6fe7db03f7"})
2015-09-11T15:24:29.54+0000 [DEA/0] OUT Got staging request for
app with id 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:30.71+0000 [API] OUT Updated app with guid
47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"state"=>"STARTED"})
2015-09-11T15:24:30.76+0000 [STG/0] OUT -----> Downloaded app
package (4.0K)
2015-09-11T15:25:06.00+0000 [API] ERR encountered error:
Staging error: failed to stage application:
2015-09-11T15:25:06.00+0000 [API] ERR Script exited with
status 1
In the warden logs, there are a few suspect messages:

{
"timestamp": 1441985105.8883495,

"message": "Exited with status 1 (35.120s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"],
\"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\",
\"/var/warden/containers/18vf956il5v/jobs/8/cursors\",
\"/var/warden/containers/18vf956il5v/jobs/8\"]",
"log_level": "warn",

"source": "Warden::Container::Linux",

"data": {

"handle": "18vf956il5v",

"stdout": "",

"stderr": ""

},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}



{

"timestamp": 1441985105.94083,

"message": "Exited with status 23 (0.023s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\",
\"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket
/var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\",
\"--links\", \"vcap(a)container:/tmp/staged/staging_info.yml\",
\"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
"log_level": "warn",

"source": "Warden::Container::Linux",

"data": {

"handle": "18vf956il5v",

"stdout": "",

"stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\"
failed: No such file or directory (2)\nrsync error: some files/attrs were
not transferred (see previous errors) (code 23) at main.c(1655)
[Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}



{

"timestamp": 1441985106.0086887,

"message": "Killing oom-notifier process",

"log_level": "debug",

"source": "Warden::Container::Features::MemLimit::OomNotifier",

"data": {



},

"thread_id": 69890836968240,

"fiber_id": 69890848620580,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/features/mem_limit.rb",
"lineno": 51,

"method": "kill"

}



{

"timestamp": 1441985106.0095143,

"message": "Exited with status 0 (35.427s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"],
\"/opt/cloudfoundry/warden/warden/src/oom/oom\",
\"/tmp/warden/cgroup/memory/instance-18vf956il5v\"]",
"log_level": "warn",

"source": "Warden::Container::Features::MemLimit::OomNotifier",

"data": {

"stdout": "",

"stderr": ""

},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}




Obviously something is misconfigured, but I'm not sure what. I don't
know why the out of memory thing is appearing as the memory being used by
the test app i've pushed is tiny (64M app with staticfile buildpack) and
the dea config has resource.memory_mb set to 8 gigs and
staging.memory_limit_mb set to 1 gigs. Is there some config I'm lacking
that's causing this to fail?


CF Runtime
 

Hi Kyle,

I'm not sure what mean by "running it locally". Can you explain in more
detail how you've deployed your CF installation?

Zak + Dan

On Fri, Sep 11, 2015 at 1:12 PM, kyle havlovitz <kylehav(a)gmail.com> wrote:

I also see this error in the DEA logs:
{"timestamp":1441999712.400334,"message":"Error copying files out of
container: command exited with
failure","log_level":"warn","source":"Staging","data":{"app_guid":"207dd67c-415a-4108-a31d-7e82590b1e28","task_id":"01db0928ddf14426b47e21459bf1e57a"},"thread_id":70184860078900,"fiber_id":70184883843040,"process_id":23523,"file":"/opt/cloudfoundry/dea_next/lib/dea/task.rb","lineno":126,"method":"rescue
in copy_out_request"}
{"timestamp":1441999712.4004574,"message":"staging.task-log.copying-out","log_level":"info","source":"Staging","data":{"app_guid":"207dd67c-415a-4108-a31d-7e82590b1e28","task_id":"01db0928ddf14426b47e21459bf1e57a","source":"/tmp/staged/logs/staging_task.log","destination":"/tmp/dea_ng/staging/d20150911-23523-ms2lax/staging_task.log"},"thread_id":70184860078900,"fiber_id":70184883400540,"process_id":23523,"file":"/opt/cloudfoundry/dea_next/lib/dea/staging/staging_task.rb","lineno":258,"method":"block
in promise_task_log"}

Could it be something wrong with the DEA's directory server? I don't
anything in its logs after it starts so maybe it isn't being used.

On Fri, Sep 11, 2015 at 1:55 PM, kyle havlovitz <kylehav(a)gmail.com> wrote:

I'm not using bosh, just trying to run it locally. I can post the dea and
warden configs though:

dea.yml:

base_dir: /tmp/dea_ng


domain: local.example.com

logging:
file: /opt/cloudfoundry/logs/dea_ng.log

level: debug


loggregator:
router: "127.0.0.1:3456"

shared_secret: "secret"


resources:
memory_mb: 8000

memory_overcommit_factor: 1

disk_mb: 40960

disk_overcommit_factor: 1


nats_servers:
- nats://127.0.0.1:4222


pid_filename: /tmp/dea_ng.pid

warden_socket: /tmp/warden.sock

evacuation_delay_secs: 10

default_health_check_timeout: 120

index: 0

intervals:
heartbeat: 10

advertise: 5

router_register_in_seconds: 20


staging:
enabled: true

memory_limit_mb: 4096

disk_limit_mb: 6144

disk_inode_limit: 200000

cpu_limit_shares: 512

max_staging_duration: 900


instance:
disk_inode_limit: 200000

memory_to_cpu_share_ratio: 8

max_cpu_share_limit: 256

min_cpu_share_limit: 1


dea_ruby: /usr/bin/ruby

# For Go-based directory server
directory_server:

protocol: 'http'

v1_port: 4385

v2_port: 5678

file_api_port: 1234

streaming_timeout: 10

logging:

file: /opt/cloudfoundry/logs/dea_dirserver.log

level: debug


stacks:
- name: cflinuxfs2

package_path: /var/warden/rootfs


placement_properties:
zone: "zone"

warden test_vm.yml:

server:

container_klass: Warden::Container::Linux


# Wait this long before destroying a container, after the last client
# referencing it disconnected. The timer is cancelled when during this

# period, another client references the container.

#

# Clients can be forced to specify this setting by setting the

# server-wide variable to an invalid value:

# container_grace_time: invalid

#

# The grace time can be disabled by setting it to nil:

# container_grace_time: ~

#

container_grace_time: 300


unix_domain_permissions: 0777
unix_domain_path: /tmp/warden.sock


# Specifies the path to the base chroot used as the read-only root
# filesystem

container_rootfs_path: /var/warden/rootfs


# Specifies the path to the parent directory under which all containers
# will live.

container_depot_path: /var/warden/containers


# See getrlimit(2) for details. Integer values are passed verbatim.
container_rlimits:

core: 0


quota:
disk_quota_enabled: false


allow_nested_warden: false

health_check_server:
port: 2345


logging:
file: /opt/cloudfoundry/logs/warden.log

level: debug2


network:
# Use this /30 network as offset for the network pool.

pool_start_address: 10.254.0.0


# Pool this many /30 networks.
pool_size: 256


# Interface MTU size
# (for OpenStack use 1454 to avoid problems with rubygems with GRE
tunneling)
mtu: 1500


user:
pool_start_uid: 11000

pool_size: 256


This is all using the latest CFv217

On Fri, Sep 11, 2015 at 1:31 PM, CF Runtime <cfruntime(a)gmail.com> wrote:

Hey Kyle,

Can we take a look at your deployment manifest (with all the secrets
redacted)?

Zak + Dan, CF OSS Integration team

On Fri, Sep 11, 2015 at 8:55 AM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

I'm getting an error pushing any app during the staging step. The cf
logs returns only this:

2015-09-11T15:24:24.33+0000 [API] OUT Created app with guid
47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:24.41+0000 [API] OUT Updated app with guid
47efa472-e2d5-400b-9135-b1a1dbe3ba88
({"route"=>"5737c5f5-b017-43da-9013-2b6fe7db03f7"})
2015-09-11T15:24:29.54+0000 [DEA/0] OUT Got staging request for
app with id 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:30.71+0000 [API] OUT Updated app with guid
47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"state"=>"STARTED"})
2015-09-11T15:24:30.76+0000 [STG/0] OUT -----> Downloaded app
package (4.0K)
2015-09-11T15:25:06.00+0000 [API] ERR encountered error:
Staging error: failed to stage application:
2015-09-11T15:25:06.00+0000 [API] ERR Script exited with status
1
In the warden logs, there are a few suspect messages:

{
"timestamp": 1441985105.8883495,

"message": "Exited with status 1 (35.120s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"],
\"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\",
\"/var/warden/containers/18vf956il5v/jobs/8/cursors\",
\"/var/warden/containers/18vf956il5v/jobs/8\"]",
"log_level": "warn",

"source": "Warden::Container::Linux",

"data": {

"handle": "18vf956il5v",

"stdout": "",

"stderr": ""

},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}



{

"timestamp": 1441985105.94083,

"message": "Exited with status 23 (0.023s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\",
\"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket
/var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\",
\"--links\", \"vcap(a)container:/tmp/staged/staging_info.yml\",
\"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
"log_level": "warn",

"source": "Warden::Container::Linux",

"data": {

"handle": "18vf956il5v",

"stdout": "",

"stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\"
failed: No such file or directory (2)\nrsync error: some files/attrs were
not transferred (see previous errors) (code 23) at main.c(1655)
[Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}



{

"timestamp": 1441985106.0086887,

"message": "Killing oom-notifier process",

"log_level": "debug",

"source": "Warden::Container::Features::MemLimit::OomNotifier",

"data": {



},

"thread_id": 69890836968240,

"fiber_id": 69890848620580,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/features/mem_limit.rb",
"lineno": 51,

"method": "kill"

}



{

"timestamp": 1441985106.0095143,

"message": "Exited with status 0 (35.427s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"],
\"/opt/cloudfoundry/warden/warden/src/oom/oom\",
\"/tmp/warden/cgroup/memory/instance-18vf956il5v\"]",
"log_level": "warn",

"source": "Warden::Container::Features::MemLimit::OomNotifier",

"data": {

"stdout": "",

"stderr": ""

},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}




Obviously something is misconfigured, but I'm not sure what. I don't
know why the out of memory thing is appearing as the memory being used by
the test app i've pushed is tiny (64M app with staticfile buildpack) and
the dea config has resource.memory_mb set to 8 gigs and
staging.memory_limit_mb set to 1 gigs. Is there some config I'm lacking
that's causing this to fail?


kyle havlovitz <kylehav@...>
 

I also see this error in the DEA logs:
{"timestamp":1441999712.400334,"message":"Error copying files out of
container: command exited with
failure","log_level":"warn","source":"Staging","data":{"app_guid":"207dd67c-415a-4108-a31d-7e82590b1e28","task_id":"01db0928ddf14426b47e21459bf1e57a"},"thread_id":70184860078900,"fiber_id":70184883843040,"process_id":23523,"file":"/opt/cloudfoundry/dea_next/lib/dea/task.rb","lineno":126,"method":"rescue
in copy_out_request"}
{"timestamp":1441999712.4004574,"message":"staging.task-log.copying-out","log_level":"info","source":"Staging","data":{"app_guid":"207dd67c-415a-4108-a31d-7e82590b1e28","task_id":"01db0928ddf14426b47e21459bf1e57a","source":"/tmp/staged/logs/staging_task.log","destination":"/tmp/dea_ng/staging/d20150911-23523-ms2lax/staging_task.log"},"thread_id":70184860078900,"fiber_id":70184883400540,"process_id":23523,"file":"/opt/cloudfoundry/dea_next/lib/dea/staging/staging_task.rb","lineno":258,"method":"block
in promise_task_log"}

Could it be something wrong with the DEA's directory server? I don't
anything in its logs after it starts so maybe it isn't being used.

On Fri, Sep 11, 2015 at 1:55 PM, kyle havlovitz <kylehav(a)gmail.com> wrote:

I'm not using bosh, just trying to run it locally. I can post the dea and
warden configs though:

dea.yml:

base_dir: /tmp/dea_ng


domain: local.example.com

logging:
file: /opt/cloudfoundry/logs/dea_ng.log

level: debug


loggregator:
router: "127.0.0.1:3456"

shared_secret: "secret"


resources:
memory_mb: 8000

memory_overcommit_factor: 1

disk_mb: 40960

disk_overcommit_factor: 1


nats_servers:
- nats://127.0.0.1:4222


pid_filename: /tmp/dea_ng.pid

warden_socket: /tmp/warden.sock

evacuation_delay_secs: 10

default_health_check_timeout: 120

index: 0

intervals:
heartbeat: 10

advertise: 5

router_register_in_seconds: 20


staging:
enabled: true

memory_limit_mb: 4096

disk_limit_mb: 6144

disk_inode_limit: 200000

cpu_limit_shares: 512

max_staging_duration: 900


instance:
disk_inode_limit: 200000

memory_to_cpu_share_ratio: 8

max_cpu_share_limit: 256

min_cpu_share_limit: 1


dea_ruby: /usr/bin/ruby

# For Go-based directory server
directory_server:

protocol: 'http'

v1_port: 4385

v2_port: 5678

file_api_port: 1234

streaming_timeout: 10

logging:

file: /opt/cloudfoundry/logs/dea_dirserver.log

level: debug


stacks:
- name: cflinuxfs2

package_path: /var/warden/rootfs


placement_properties:
zone: "zone"

warden test_vm.yml:

server:

container_klass: Warden::Container::Linux


# Wait this long before destroying a container, after the last client
# referencing it disconnected. The timer is cancelled when during this

# period, another client references the container.

#

# Clients can be forced to specify this setting by setting the

# server-wide variable to an invalid value:

# container_grace_time: invalid

#

# The grace time can be disabled by setting it to nil:

# container_grace_time: ~

#

container_grace_time: 300


unix_domain_permissions: 0777
unix_domain_path: /tmp/warden.sock


# Specifies the path to the base chroot used as the read-only root
# filesystem

container_rootfs_path: /var/warden/rootfs


# Specifies the path to the parent directory under which all containers
# will live.

container_depot_path: /var/warden/containers


# See getrlimit(2) for details. Integer values are passed verbatim.
container_rlimits:

core: 0


quota:
disk_quota_enabled: false


allow_nested_warden: false

health_check_server:
port: 2345


logging:
file: /opt/cloudfoundry/logs/warden.log

level: debug2


network:
# Use this /30 network as offset for the network pool.

pool_start_address: 10.254.0.0


# Pool this many /30 networks.
pool_size: 256


# Interface MTU size
# (for OpenStack use 1454 to avoid problems with rubygems with GRE
tunneling)
mtu: 1500


user:
pool_start_uid: 11000

pool_size: 256


This is all using the latest CFv217

On Fri, Sep 11, 2015 at 1:31 PM, CF Runtime <cfruntime(a)gmail.com> wrote:

Hey Kyle,

Can we take a look at your deployment manifest (with all the secrets
redacted)?

Zak + Dan, CF OSS Integration team

On Fri, Sep 11, 2015 at 8:55 AM, kyle havlovitz <kylehav(a)gmail.com>
wrote:

I'm getting an error pushing any app during the staging step. The cf
logs returns only this:

2015-09-11T15:24:24.33+0000 [API] OUT Created app with guid
47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:24.41+0000 [API] OUT Updated app with guid
47efa472-e2d5-400b-9135-b1a1dbe3ba88
({"route"=>"5737c5f5-b017-43da-9013-2b6fe7db03f7"})
2015-09-11T15:24:29.54+0000 [DEA/0] OUT Got staging request for
app with id 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:30.71+0000 [API] OUT Updated app with guid
47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"state"=>"STARTED"})
2015-09-11T15:24:30.76+0000 [STG/0] OUT -----> Downloaded app
package (4.0K)
2015-09-11T15:25:06.00+0000 [API] ERR encountered error: Staging
error: failed to stage application:
2015-09-11T15:25:06.00+0000 [API] ERR Script exited with status 1

In the warden logs, there are a few suspect messages:

{
"timestamp": 1441985105.8883495,

"message": "Exited with status 1 (35.120s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"],
\"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\",
\"/var/warden/containers/18vf956il5v/jobs/8/cursors\",
\"/var/warden/containers/18vf956il5v/jobs/8\"]",
"log_level": "warn",

"source": "Warden::Container::Linux",

"data": {

"handle": "18vf956il5v",

"stdout": "",

"stderr": ""

},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}



{

"timestamp": 1441985105.94083,

"message": "Exited with status 23 (0.023s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\",
\"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket
/var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\",
\"--links\", \"vcap(a)container:/tmp/staged/staging_info.yml\",
\"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
"log_level": "warn",

"source": "Warden::Container::Linux",

"data": {

"handle": "18vf956il5v",

"stdout": "",

"stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\"
failed: No such file or directory (2)\nrsync error: some files/attrs were
not transferred (see previous errors) (code 23) at main.c(1655)
[Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}



{

"timestamp": 1441985106.0086887,

"message": "Killing oom-notifier process",

"log_level": "debug",

"source": "Warden::Container::Features::MemLimit::OomNotifier",

"data": {



},

"thread_id": 69890836968240,

"fiber_id": 69890848620580,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/features/mem_limit.rb",
"lineno": 51,

"method": "kill"

}



{

"timestamp": 1441985106.0095143,

"message": "Exited with status 0 (35.427s):
[[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\",
\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"],
\"/opt/cloudfoundry/warden/warden/src/oom/oom\",
\"/tmp/warden/cgroup/memory/instance-18vf956il5v\"]",
"log_level": "warn",

"source": "Warden::Container::Features::MemLimit::OomNotifier",

"data": {

"stdout": "",

"stderr": ""

},

"thread_id": 69890836968240,

"fiber_id": 69890849112480,

"process_id": 17063,

"file":
"/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
"lineno": 135,

"method": "set_deferred_success"

}




Obviously something is misconfigured, but I'm not sure what. I don't
know why the out of memory thing is appearing as the memory being used by
the test app i've pushed is tiny (64M app with staticfile buildpack) and
the dea config has resource.memory_mb set to 8 gigs and
staging.memory_limit_mb set to 1 gigs. Is there some config I'm lacking
that's causing this to fail?


kyle havlovitz <kylehav@...>
 

I'm not using bosh, just trying to run it locally. I can post the dea and
warden configs though:

dea.yml:

base_dir: /tmp/dea_ng


domain: local.example.com

logging:
file: /opt/cloudfoundry/logs/dea_ng.log

level: debug


loggregator:
router: "127.0.0.1:3456"

shared_secret: "secret"


resources:
memory_mb: 8000

memory_overcommit_factor: 1

disk_mb: 40960

disk_overcommit_factor: 1


nats_servers:
- nats://127.0.0.1:4222


pid_filename: /tmp/dea_ng.pid

warden_socket: /tmp/warden.sock

evacuation_delay_secs: 10

default_health_check_timeout: 120

index: 0

intervals:
heartbeat: 10

advertise: 5

router_register_in_seconds: 20


staging:
enabled: true

memory_limit_mb: 4096

disk_limit_mb: 6144

disk_inode_limit: 200000

cpu_limit_shares: 512

max_staging_duration: 900


instance:
disk_inode_limit: 200000

memory_to_cpu_share_ratio: 8

max_cpu_share_limit: 256

min_cpu_share_limit: 1


dea_ruby: /usr/bin/ruby

# For Go-based directory server
directory_server:

protocol: 'http'

v1_port: 4385

v2_port: 5678

file_api_port: 1234

streaming_timeout: 10

logging:

file: /opt/cloudfoundry/logs/dea_dirserver.log

level: debug


stacks:
- name: cflinuxfs2

package_path: /var/warden/rootfs


placement_properties:
zone: "zone"

warden test_vm.yml:

server:
  container_klass: Warden::Container::Linux

  # Wait this long before destroying a container, after the last client
  # referencing it disconnected. The timer is cancelled when, during this
  # period, another client references the container.
  #
  # Clients can be forced to specify this setting by setting the
  # server-wide variable to an invalid value:
  #   container_grace_time: invalid
  #
  # The grace time can be disabled by setting it to nil:
  #   container_grace_time: ~
  #
  container_grace_time: 300

  unix_domain_permissions: 0777
  unix_domain_path: /tmp/warden.sock

  # Specifies the path to the base chroot used as the read-only root
  # filesystem
  container_rootfs_path: /var/warden/rootfs

  # Specifies the path to the parent directory under which all containers
  # will live.
  container_depot_path: /var/warden/containers

  # See getrlimit(2) for details. Integer values are passed verbatim.
  container_rlimits:
    core: 0

  quota:
    disk_quota_enabled: false

  allow_nested_warden: false

health_check_server:
  port: 2345

logging:
  file: /opt/cloudfoundry/logs/warden.log
  level: debug2

network:
  # Use this /30 network as offset for the network pool.
  pool_start_address: 10.254.0.0

  # Pool this many /30 networks.
  pool_size: 256

  # Interface MTU size
  # (for OpenStack use 1454 to avoid problems with rubygems with GRE tunneling)
  mtu: 1500

user:
  pool_start_uid: 11000
  pool_size: 256


This is all using the latest CF release (v217).
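
When a push fails like this, it helps to pull just the failures out of warden.log rather than scrolling through everything. A rough filter, assuming the log path from the config above and one JSON record per line (the field names match the entries quoted below):

# warden_log_triage.py - print warden log records that look like failures:
# non-zero exit statuses and anything from the OOM notifier.
# Assumes one JSON object per line, as in the excerpts in this thread.
import json
import re

LOG_PATH = "/opt/cloudfoundry/logs/warden.log"  # from the warden config above

exit_re = re.compile(r"Exited with status (\d+)")

with open(LOG_PATH) as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except ValueError:
            continue  # skip anything that isn't a single-line JSON record
        message = record.get("message", "")
        source = record.get("source", "")
        match = exit_re.search(message)
        nonzero_exit = bool(match) and match.group(1) != "0"
        if nonzero_exit or "OomNotifier" in source:
            print(record.get("timestamp"), source, message[:120])
            stderr = record.get("data", {}).get("stderr")
            if stderr:
                print("  stderr:", stderr.strip())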

On Fri, Sep 11, 2015 at 1:31 PM, CF Runtime <cfruntime(a)gmail.com> wrote:

Hey Kyle,

Can we take a look at your deployment manifest (with all the secrets
redacted)?

Zak + Dan, CF OSS Integration team

On Fri, Sep 11, 2015 at 8:55 AM, kyle havlovitz <kylehav(a)gmail.com> wrote:

I'm getting an error during the staging step when pushing any app. cf logs returns only this:

2015-09-11T15:24:24.33+0000 [API] OUT Created app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:24.41+0000 [API] OUT Updated app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"route"=>"5737c5f5-b017-43da-9013-2b6fe7db03f7"})
2015-09-11T15:24:29.54+0000 [DEA/0] OUT Got staging request for app with id 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:30.71+0000 [API] OUT Updated app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"state"=>"STARTED"})
2015-09-11T15:24:30.76+0000 [STG/0] OUT -----> Downloaded app package (4.0K)
2015-09-11T15:25:06.00+0000 [API] ERR encountered error: Staging error: failed to stage application:
2015-09-11T15:25:06.00+0000 [API] ERR Script exited with status 1

In the warden logs, there are a few suspect messages:

{
  "timestamp": 1441985105.8883495,
  "message": "Exited with status 1 (35.120s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\", \"/var/warden/containers/18vf956il5v/jobs/8/cursors\", \"/var/warden/containers/18vf956il5v/jobs/8\"]",
  "log_level": "warn",
  "source": "Warden::Container::Linux",
  "data": {
    "handle": "18vf956il5v",
    "stdout": "",
    "stderr": ""
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}

{
  "timestamp": 1441985105.94083,
  "message": "Exited with status 23 (0.023s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\", \"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket /var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\", \"--links\", \"vcap@container:/tmp/staged/staging_info.yml\", \"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
  "log_level": "warn",
  "source": "Warden::Container::Linux",
  "data": {
    "handle": "18vf956il5v",
    "stdout": "",
    "stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\" failed: No such file or directory (2)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}

{
  "timestamp": 1441985106.0086887,
  "message": "Killing oom-notifier process",
  "log_level": "debug",
  "source": "Warden::Container::Features::MemLimit::OomNotifier",
  "data": {},
  "thread_id": 69890836968240,
  "fiber_id": 69890848620580,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/features/mem_limit.rb",
  "lineno": 51,
  "method": "kill"
}

{
  "timestamp": 1441985106.0095143,
  "message": "Exited with status 0 (35.427s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"/opt/cloudfoundry/warden/warden/src/oom/oom\", \"/tmp/warden/cgroup/memory/instance-18vf956il5v\"]",
  "log_level": "warn",
  "source": "Warden::Container::Features::MemLimit::OomNotifier",
  "data": {
    "stdout": "",
    "stderr": ""
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}




Obviously something is misconfigured, but I'm not sure what. I don't know why the out-of-memory notifier is firing: the test app I've pushed is tiny (a 64M app using the staticfile buildpack), and the DEA config has resources.memory_mb set to 8 gigs and staging.memory_limit_mb set to 1 gig. Is there some config I'm lacking that's causing this to fail?


CF Runtime
 

Hey Kyle,

Can we take a look at your deployment manifest (with all the secrets
redacted)?

Zak + Dan, CF OSS Integration team

On Fri, Sep 11, 2015 at 8:55 AM, kyle havlovitz <kylehav(a)gmail.com> wrote:

I'm getting an error during the staging step when pushing any app. cf logs returns only this:

2015-09-11T15:24:24.33+0000 [API] OUT Created app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:24.41+0000 [API] OUT Updated app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"route"=>"5737c5f5-b017-43da-9013-2b6fe7db03f7"})
2015-09-11T15:24:29.54+0000 [DEA/0] OUT Got staging request for app with id 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:30.71+0000 [API] OUT Updated app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"state"=>"STARTED"})
2015-09-11T15:24:30.76+0000 [STG/0] OUT -----> Downloaded app package (4.0K)
2015-09-11T15:25:06.00+0000 [API] ERR encountered error: Staging error: failed to stage application:
2015-09-11T15:25:06.00+0000 [API] ERR Script exited with status 1

In the warden logs, there are a few suspect messages:

{
  "timestamp": 1441985105.8883495,
  "message": "Exited with status 1 (35.120s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\", \"/var/warden/containers/18vf956il5v/jobs/8/cursors\", \"/var/warden/containers/18vf956il5v/jobs/8\"]",
  "log_level": "warn",
  "source": "Warden::Container::Linux",
  "data": {
    "handle": "18vf956il5v",
    "stdout": "",
    "stderr": ""
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}

{
  "timestamp": 1441985105.94083,
  "message": "Exited with status 23 (0.023s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\", \"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket /var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\", \"--links\", \"vcap@container:/tmp/staged/staging_info.yml\", \"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
  "log_level": "warn",
  "source": "Warden::Container::Linux",
  "data": {
    "handle": "18vf956il5v",
    "stdout": "",
    "stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\" failed: No such file or directory (2)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}

{
  "timestamp": 1441985106.0086887,
  "message": "Killing oom-notifier process",
  "log_level": "debug",
  "source": "Warden::Container::Features::MemLimit::OomNotifier",
  "data": {},
  "thread_id": 69890836968240,
  "fiber_id": 69890848620580,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/features/mem_limit.rb",
  "lineno": 51,
  "method": "kill"
}

{
  "timestamp": 1441985106.0095143,
  "message": "Exited with status 0 (35.427s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"/opt/cloudfoundry/warden/warden/src/oom/oom\", \"/tmp/warden/cgroup/memory/instance-18vf956il5v\"]",
  "log_level": "warn",
  "source": "Warden::Container::Features::MemLimit::OomNotifier",
  "data": {
    "stdout": "",
    "stderr": ""
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}




Obviously something is misconfigured, but I'm not sure what. I don't know why the out-of-memory notifier is firing: the test app I've pushed is tiny (a 64M app using the staticfile buildpack), and the DEA config has resources.memory_mb set to 8 gigs and staging.memory_limit_mb set to 1 gig. Is there some config I'm lacking that's causing this to fail?


kyle havlovitz <kylehav@...>
 

I'm getting an error during the staging step when pushing any app. cf logs returns only this:

2015-09-11T15:24:24.33+0000 [API] OUT Created app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:24.41+0000 [API] OUT Updated app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"route"=>"5737c5f5-b017-43da-9013-2b6fe7db03f7"})
2015-09-11T15:24:29.54+0000 [DEA/0] OUT Got staging request for app with id 47efa472-e2d5-400b-9135-b1a1dbe3ba88
2015-09-11T15:24:30.71+0000 [API] OUT Updated app with guid 47efa472-e2d5-400b-9135-b1a1dbe3ba88 ({"state"=>"STARTED"})
2015-09-11T15:24:30.76+0000 [STG/0] OUT -----> Downloaded app package (4.0K)
2015-09-11T15:25:06.00+0000 [API] ERR encountered error: Staging error: failed to stage application:
2015-09-11T15:25:06.00+0000 [API] ERR Script exited with status 1

In the warden logs, there are a few suspect messages:

{
  "timestamp": 1441985105.8883495,
  "message": "Exited with status 1 (35.120s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"/var/warden/containers/18vf956il5v/bin/iomux-link\", \"-w\", \"/var/warden/containers/18vf956il5v/jobs/8/cursors\", \"/var/warden/containers/18vf956il5v/jobs/8\"]",
  "log_level": "warn",
  "source": "Warden::Container::Linux",
  "data": {
    "handle": "18vf956il5v",
    "stdout": "",
    "stderr": ""
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}

{
  "timestamp": 1441985105.94083,
  "message": "Exited with status 23 (0.023s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"rsync\", \"-e\", \"/var/warden/containers/18vf956il5v/bin/wsh --socket /var/warden/containers/18vf956il5v/run/wshd.sock --rsh\", \"-r\", \"-p\", \"--links\", \"vcap@container:/tmp/staged/staging_info.yml\", \"/tmp/dea_ng/staging/d20150911-17093-1amg6y8\"]",
  "log_level": "warn",
  "source": "Warden::Container::Linux",
  "data": {
    "handle": "18vf956il5v",
    "stdout": "",
    "stderr": "rsync: link_stat \"/tmp/staged/staging_info.yml\" failed: No such file or directory (2)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.0]\nrsync: [Receiver] write error: Broken pipe (32)\n"
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}

{
  "timestamp": 1441985106.0086887,
  "message": "Killing oom-notifier process",
  "log_level": "debug",
  "source": "Warden::Container::Features::MemLimit::OomNotifier",
  "data": {},
  "thread_id": 69890836968240,
  "fiber_id": 69890848620580,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/features/mem_limit.rb",
  "lineno": 51,
  "method": "kill"
}

{
  "timestamp": 1441985106.0095143,
  "message": "Exited with status 0 (35.427s): [[\"/opt/cloudfoundry/warden/warden/src/closefds/closefds\", \"/opt/cloudfoundry/warden/warden/src/closefds/closefds\"], \"/opt/cloudfoundry/warden/warden/src/oom/oom\", \"/tmp/warden/cgroup/memory/instance-18vf956il5v\"]",
  "log_level": "warn",
  "source": "Warden::Container::Features::MemLimit::OomNotifier",
  "data": {
    "stdout": "",
    "stderr": ""
  },
  "thread_id": 69890836968240,
  "fiber_id": 69890849112480,
  "process_id": 17063,
  "file": "/opt/cloudfoundry/warden/warden/lib/warden/container/spawn.rb",
  "lineno": 135,
  "method": "set_deferred_success"
}




Obviously something is misconfigured, but I'm not sure what. I don't know why the out-of-memory notifier is firing: the test app I've pushed is tiny (a 64M app using the staticfile buildpack), and the DEA config has resources.memory_mb set to 8 gigs and staging.memory_limit_mb set to 1 gig. Is there some config I'm lacking that's causing this to fail?