bosh-lite / cf fails in ~15 minutes


Kris Rice
 

All,
I’ve run the install several times now on both OEL 6.4 and Fedora 20 with VirtualBox 4.3.28 w/extras (current). They have all very consistently stopped being able to access the network at about the 15 minute mark with this failure:

System call error while talking to director: No route to host - connect(2) (https://192.168.50.4:25555)
Pings to 192.168.50.4 also fail (as expected from the message).

I’ve checked that iptables is clean. Is there something else I should be looking at?
-kris
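
To pin down when the director actually drops off the network, a timestamped probe can be left running in a second terminal. The following is a minimal sketch, assuming bash and coreutils `timeout` are available on the host; `probe_tcp` is a hypothetical helper name, and the address and port are the ones from the error message above.

```shell
#!/usr/bin/env bash
# probe_tcp HOST PORT: print "up" if a TCP connection opens within 3 seconds,
# otherwise print "down". Hypothetical helper; it relies on bash's /dev/tcp
# redirection and on coreutils `timeout`.
probe_tcp() {
  if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo up
  else
    echo down
  fi
}

# Example loop (commented out so that sourcing this file has no side effects):
# while true; do
#   echo "$(date '+%H:%M:%S') director $(probe_tcp 192.168.50.4 25555)"
#   sleep 30
# done
```

The first timestamp that flips from "up" to "down" gives the time of failure to compare against host and VM logs.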


Dmitriy Kalinin
 

Can you successfully use `vagrant ssh` to get into the VM? If so, how are
the system resources (CPU, RAM...)?

It sounds like some kind of virtualbox bug on your OS or a misconfiguration.

_______________________________________________
cf-bosh mailing list
cf-bosh(a)lists.cloudfoundry.org
https://lists.cloudfoundry.org/mailman/listinfo/cf-bosh


Kris Rice
 

I’m sure it’s something os/vbox, but on two machines, repeating nothing but the steps in the doc, I get the same results.

Here’s another example of what’s happening right now. I stop/start vagrant/vbox and then issue a bosh vms, and everything is unresponsive but 1:
http://pastebin.com/4THpYher

I’m the only person on this machine, which has 48 cores and 256m of ram in it, so it should be more than enough.


-kris


Dmitriy Kalinin
 

The issue of not being able to ping 192.168.50.4 is different from bosh
reporting unresponsive VMs. The bottom of the readme mentions the cck
command:
https://github.com/cloudfoundry/bosh-lite/blob/master/docs/bosh-cck.md
This should resolve unresponsive agents.


Kris Rice
 

Right. I've done that, and 15 minutes later the same exact thing happens again.

-kris


Joshua McKenty <jmckenty@...>
 

Sounds like DHCP lease time or some kind of external network config issue - is there a software firewall running on your local machine? Weird NATTING or ARP Proxy?


Dmitriy Kalinin
 

Is your vagrant VM getting shut down by something/someone after 15 minutes?
The only time bosh-managed containers inside bosh-lite get destroyed is
when the VM is rebooted/shut down.


Kris Rice
 

Nope, vbox shows the VM still running, just no longer accessible. So I do a vagrant halt / up, which brings it back up, and the only one from running bosh vms that is ever running is the postgres one. This machine is brand newly created, so everything is bone stock. The only thing I’ve wondered about is that we internally use the 10.244.x.x ranges; however, I’ve added the routes to the linux host, and we do not use 192.168.x.x at all. Is there a way to move what’s used internally? However, this doesn’t explain why 192.168.50.4 becomes inaccessible, unless that is simply doing a port-fwd to one of the 10.x IPs?

└─>netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.88.68.1      0.0.0.0         UG        0 0          0 eth0
10.88.68.0      0.0.0.0         255.255.252.0   U         0 0          0 eth0
10.244.0.0      192.168.50.4    255.255.224.0   UG        0 0          0 vboxnet0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 usb0
169.254.182.0   0.0.0.0         255.255.255.0   U         0 0          0 usb0
192.168.50.0    0.0.0.0         255.255.255.0   U         0 0          0 vboxnet0
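
In the table above, the 10.244.0.0 network (netmask 255.255.224.0) is reached through the gateway 192.168.50.4 on vboxnet0, so it is a route rather than a port forward: once 192.168.50.4 stops answering, the 10.244.x addresses behind it become unreachable as well. As a small sketch for checking this on another host, the helper below pulls the gateway and interface for one destination out of `netstat -rn` output; `route_for` is a hypothetical name, not part of bosh-lite.

```shell
# route_for NET: read `netstat -rn` output on stdin and print "GATEWAY IFACE"
# for every row whose Destination column equals NET. Hypothetical helper name.
route_for() {
  awk -v net="$1" '$1 == net { print $2, $NF }'
}

# Usage (commented out so that sourcing this file has no side effects):
# netstat -rn | route_for 10.244.0.0
```

With the routing table shown above, `route_for 10.244.0.0` prints `192.168.50.4 vboxnet0`.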


Sabha
 

Did you install the vbox client extensions toolkit? I had issues with
networking when the client extensions were not installed.

-Sabha



--
View this message in context: http://cf-bosh.70367.x6.nabble.com/cf-bosh-bosh-lite-cf-fails-in-15-minutes-tp412p422.html
Sent from the CF BOSH mailing list archive at Nabble.com.


Kris Rice
 

Yup. I saw that note somewhere and did it before even starting at all.

-kris


Kris Rice
 

Here’s what just happened when testing it again. Note the times are a little longer this time, so it’s not a hard 15 minutes.


┌─[13:07:09]─[klrice]─[denab214]:/scratch/klrice/workspace/bosh-lite$
└─>bosh vms
Acting as user 'admin' on 'Bosh Lite Director'
Deployment `cf-warden'

Director task 106

Task 106 done

+------------------------------------+---------+---------------+--------------+
| Job/index                          | State   | Resource Pool | IPs          |
+------------------------------------+---------+---------------+--------------+
| api_z1/0                           | running | large_z1      | 10.244.0.134 |
| consul_z1/0                        | running | medium_z1     | 10.244.0.54  |
| doppler_z1/0                       | running | medium_z1     | 10.244.0.142 |
| etcd_z1/0                          | running | medium_z1     | 10.244.0.42  |
| ha_proxy_z1/0                      | running | router_z1     | 10.244.0.34  |
| hm9000_z1/0                        | running | medium_z1     | 10.244.0.138 |
| loggregator_trafficcontroller_z1/0 | running | small_z1      | 10.244.0.146 |
| nats_z1/0                          | running | medium_z1     | 10.244.0.6   |
| postgres_z1/0                      | running | medium_z1     | 10.244.0.30  |
| router_z1/0                        | running | router_z1     | 10.244.0.22  |
| runner_z1/0                        | running | runner_z1     | 10.244.0.26  |
| uaa_z1/0                           | running | medium_z1     | 10.244.0.130 |
+------------------------------------+---------+---------------+--------------+

VMs total: 12

...

└─>ping -i 30 10.244.0.42
PING 10.244.0.42 (10.244.0.42) 56(84) bytes of data.
64 bytes from 10.244.0.42: icmp_seq=1 ttl=63 time=0.456 ms
64 bytes from 10.244.0.42: icmp_seq=2 ttl=63 time=0.428 ms
64 bytes from 10.244.0.42: icmp_seq=3 ttl=63 time=0.506 ms
64 bytes from 10.244.0.42: icmp_seq=4 ttl=63 time=0.437 ms
From 192.168.50.1 icmp_seq=8 Destination Host Unreachable
From 192.168.50.1 icmp_seq=10 Destination Host Unreachable
From 192.168.50.1 icmp_seq=12 Destination Host Unreachable
^C
--- 10.244.0.42 ping statistics ---
12 packets transmitted, 4 received, +3 errors, 66% packet loss, time 338926ms
rtt min/avg/max/mdev = 0.428/0.456/0.506/0.039 ms
┌─[13:33:52]─[klrice]─[denab214]:/scratch/klrice/workspace/bosh-lite$
└─>bosh vms
System call error while talking to director: No route to host - connect(2) (https://192.168.50.4:25555)
┌─[13:34:00]─[klrice]─[denab214]:/scratch/klrice/workspace/bosh-lite$
└─>
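
In the ping transcript above, the last reply is icmp_seq=4 and the first error is icmp_seq=8, so with `-i 30` the link dropped somewhere in that roughly two-minute window. As a sketch for extracting that point from a saved ping log, the helper below prints the sequence number of the first unreachable error; `first_unreachable_seq` and the `ping.log` file name are hypothetical.

```shell
# first_unreachable_seq: read ping output on stdin and print the icmp_seq of
# the first "Destination Host Unreachable" line (prints nothing if none).
first_unreachable_seq() {
  sed -n 's/.*icmp_seq=\([0-9][0-9]*\) Destination Host Unreachable.*/\1/p' | head -n 1
}

# Usage (commented out so that sourcing this file has no side effects):
# ping -i 30 10.244.0.42 | tee ping.log    # stop with Ctrl-C after the failure
# first_unreachable_seq < ping.log
```

Multiplying the result minus one by the ping interval gives the approximate number of seconds between starting ping and the first reported error.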


Sabha
 

So, ping to 10.244.xxx fails, and ping to 192.168.50.4 also fails?

What happens if you disconnect from the external network (to avoid the
10.244...) or delete the cf deployment from bosh-lite? Don't delete the
release, let it stay; delete just the deployment using bosh delete deployment
cf-warden (you can redeploy again afterwards) and see if the 10.244.x range
is the cause of the problem.

Also, you can probably enable the vbox gui option to see if any errors are
being reported:
config.vm.provider "virtualbox" do |v|
  v.gui = true
end

-Sabha


Kris Rice
 

It is indeed the virtualbox ethernet going bad or something:
https://dl.dropboxusercontent.com/u/12188201/Screen%20Shot%202015-07-08%20at%2017.03.46.png

I’ll drop the vbox team a note. Thanks, the v.gui setting helped a ton to see this.

-kris


Sabha
 

Can you try this:
VBoxManage modifyvm "VM name" --natdnshostresolver1 on

From: https://www.virtualbox.org/ticket/12441
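
The "VM name" placeholder in the command above has to be replaced with the name VirtualBox uses for the Vagrant-created VM, and `modifyvm` generally applies only to a VM that is not running. A minimal sketch for listing the registered names follows; `vm_names` is a hypothetical helper that strips the UUID column from `VBoxManage list vms` output.

```shell
# vm_names: read `VBoxManage list vms` output (one `"name" {uuid}` per line)
# on stdin and print only the names. Hypothetical helper name.
vm_names() {
  sed -n 's/^"\(.*\)" {[0-9a-fA-F-]*}$/\1/p'
}

# Usage (commented out so that sourcing this file has no side effects):
# VBoxManage list vms | vm_names
# vagrant halt
# VBoxManage modifyvm "<name printed above>" --natdnshostresolver1 on
# vagrant up
```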


Kris Rice
 

Thanks, but that appears to not make a difference. I did drop the team a note, so hopefully I will get an answer back shortly.
-kris


Armin Ranjbar
 

Hmm, it's a known (but old) bug. Can you try changing the NIC type?
Also try this on the VM:
add pcie_aspm=off to the kernel cmdline; this is supposed to disable the power
management features of PCI-Express components. If that doesn't help, try
disabling NIC TSO:
ethtool -K ethX tso off

I have also noticed that you are using OEL 6.4 and Fedora 20; both are
pretty old (2013, I guess). Can you try a more recent distro (Ubuntu 14.04
LTS, for example)?


---
Armin ranjbar
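
To confirm whether the two settings suggested above are actually in effect inside the VM, something like the following sketch could be used. Both helper names are hypothetical, and `eth0` in the usage lines is an assumption standing in for the `ethX` placeholder.

```shell
# has_kernel_param PARAM: read a kernel command line on stdin and exit 0 if
# PARAM appears as a whole space-separated word, non-zero otherwise.
has_kernel_param() {
  tr ' ' '\n' | grep -qxF -- "$1"
}

# tso_state: read `ethtool -k <iface>` output on stdin and print "on" or "off"
# for the tcp-segmentation-offload feature (ignores a trailing "[fixed]").
tso_state() {
  awk -F': ' '/^tcp-segmentation-offload:/ { split($2, f, " "); print f[1] }'
}

# Usage (commented out so that sourcing this file has no side effects):
# has_kernel_param pcie_aspm=off < /proc/cmdline && echo "pcie_aspm=off is set"
# ethtool -k eth0 | tso_state
```

Note that `ethtool -k` (lowercase) only reports the current offload settings, while `ethtool -K` (uppercase) changes them.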


Kris Rice
 

First, I really like that 2 years means old; you should tell some of my customers that!

I work at Oracle, so I have an in with the vbox team :) They are looking at it for me now.


Armin Ranjbar
 

hahaha :))
I think it's kernel related, so you might want to tip them off about that.


---
Armin ranjbar



Kris Rice
 

FYI for everyone.

Looks like this is a vbox bug, and the team is working on it with debug vbox builds.

https://forums.virtualbox.org/viewtopic.php?f=2&t=67437
-kris



Kris Rice
 

So…

The E1000 emulation in vbox is being overloaded because the internal traffic for 10.244.x.x is being routed out to the host, which then has to send it back into the VM.

From the vbox team:
So if you really want to route your traffic to 10.244.0.0/19 through VM for whatever processing you need to apply, you must arrange the routing in the VM in such a way that it doesn't route 10.244.0.0/19 through the host again (which would create the loop), so NAT or NAT Network are not suitable as outgoing interfaces. You can either use bridged, or if you must use NAT you will have to set up some tunnel elsewhere I guess to bypass host's route back to the VM.

Now the question is: can the vagrant VM be changed so its internal route table keeps the 10.244.x.x traffic inside the VM instead of bouncing it out just to come back in?
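
To make the loop described above concrete, here is a toy Ruby model of it. It is only a sketch: the route tables, the hop names (:vm, :host, :container, :unreachable) and the sample addresses are made up for illustration and are not taken from a real bosh-lite host or VM. It applies longest-prefix matching to show how an address in 10.244.0.0/19 with no container behind it could bounce between host and VM, and how a catch-all entry for that range inside the VM would stop the bounce. On Linux the analogous change might be something like `ip route add unreachable 10.244.0.0/19` inside the VM, but that is an untested assumption, not something the vbox team suggested.

```ruby
require "ipaddr"

# Toy routing tables as [CIDR, next hop] pairs. All entries are hypothetical.
HOST_ROUTES = [
  ["10.244.0.0/19", :vm],        # host sends the warden range into the VM
  ["0.0.0.0/0",     :internet]
].freeze

VM_ROUTES = [
  ["10.244.0.0/30", :container], # a container network that actually exists
  ["0.0.0.0/0",     :host]       # NAT default route: everything else leaves the VM
].freeze

# Same VM table plus a catch-all for the whole range, so unknown
# 10.244.x addresses are rejected inside the VM instead of leaving it.
VM_ROUTES_FIXED = (VM_ROUTES + [["10.244.0.0/19", :unreachable]]).freeze

# Longest-prefix match: of all routes containing dest, pick the most specific.
def next_hop(routes, dest)
  ip = IPAddr.new(dest)
  match = routes
          .select { |cidr, _hop| IPAddr.new(cidr).include?(ip) }
          .max_by { |cidr, _hop| cidr.split("/").last.to_i }
  match && match[1]
end

# Follow a packet that starts on the host. Stops when it reaches something
# that is not a router in this model, or appends :loop after max_hops.
def trace(dest, vm_routes, max_hops = 8)
  tables = { host: HOST_ROUTES, vm: vm_routes }
  at = :host
  path = [at]
  max_hops.times do
    at = next_hop(tables[at], dest)
    path << at
    return path unless tables.key?(at)
  end
  path << :loop
end

p trace("10.244.0.2", VM_ROUTES)        # address with a container behind it
p trace("10.244.9.9", VM_ROUTES)        # address with no container behind it
p trace("10.244.9.9", VM_ROUTES_FIXED)  # same address with the catch-all route
```

In this model the more specific /30 entry still wins over the /19 catch-all, so existing containers stay reachable while unassigned addresses never leave the VM.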


-kris


