Get stuck at 96% when running "bosh upload stemcell"


Diego Lin
 

Hi all,

I am new to Cloud Foundry. I am trying to set up Cloud Foundry on bosh-lite to demonstrate a complete microservice solution based on Cloud Foundry.

I have run into a strange issue: the stemcell upload progress gets stuck at 96% when I run the following command from /home/arch/sourceCode/gSource/bosh-lite: "bosh upload stemcell bosh-stemcell-3312.9-warden-boshlite-ubuntu-trusty-go_agent.tgz".

I have read everything I could find online, but none of it helped:
https://github.com/cloudfoundry/bosh/issues/1352
https://github.com/cloudfoundry-incubator/bosh-openstack-cpi-release/issues/44

I have carefully tried several times, both with a regular account and with the root account, following the guide: http://docs.cloudfoundry.org/deploying/boshlite/index.html.
I can log in to the VirtualBox VM with both "ssh -p 2222 vagrant@127.0.0.1" and "vagrant ssh".

This is somewhat urgent, and I would be very grateful for any help!


My machine is an HP PC with an i5-4570 CPU @ 3.20GHz (4 cores), 32 GB of memory, and CentOS 7.3.

Current Settings:

1. The output of "bosh upload stemcell" (I don't even know where to find an error log, as I would for a Java app.)
root@05:47:58
/home/arch/sourceCode/gSource/bosh-lite>bosh upload stemcell bosh-stemcell-3312.9-warden-boshlite-ubuntu-trusty-go_agent.tgz
RSA 1024 bit CA certificates are loaded due to old openssl compatibility
Acting as user 'admin' on 'Bosh Lite Director'

Verifying stemcell...
File exists and readable OK
Verifying tarball...
Read tarball OK
Manifest exists OK
Stemcell image file OK
Stemcell properties OK

Stemcell info
-------------
Name: bosh-warden-boshlite-ubuntu-trusty-go_agent
Version: 3312.9

Checking if stemcell already exists...
No

Uploading stemcell...

bosh-stemcell: 96% |ooooooooooooooooooooooooooo | 347.4MB 39.4MB/s ETA: 00:00:00
Director task 1


^C
Do you want to cancel task 1? [yN] (^C again to detach): ^C
bosh-stemcell: 96% |ooooooooooooooooooooooooooo | 348.2MB 269.4KB/s Time: 00:22:03
Task 1 is still running
-----------------------------


2. The output of "vagrant --version"
[root@hadoopdatanode3 ~]# vagrant --version
Vagrant 1.9.1
-----------------------------


3. The output of "VBoxManage --version"
[root@hadoopdatanode3 ~]# VBoxManage --version
5.1.10r112026
-----------------------------


4. The output of "vagrant up"
root@03:36:11
/home/arch/sourceCode/gSource/bosh-lite>vagrant up --provider=virtualbox
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Box 'cloudfoundry/bosh-lite' could not be found. Attempting to find and install...
default: Box Provider: virtualbox
default: Box Version: 9000.131.0
==> default: Loading metadata for box 'cloudfoundry/bosh-lite'
default: URL: https://atlas.hashicorp.com/cloudfoundry/bosh-lite
==> default: Adding box 'cloudfoundry/bosh-lite' (v9000.131.0) for provider: virtualbox
default: Downloading: https://atlas.hashicorp.com/cloudfoundry/boxes/bosh-lite/versions/9000.131.0/providers/virtualbox.box
==> default: Successfully added box 'cloudfoundry/bosh-lite' (v9000.131.0) for 'virtualbox'!
==> default: Importing base box 'cloudfoundry/bosh-lite'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'cloudfoundry/bosh-lite' is up to date...
==> default: Setting the name of the VM: bosh-lite_default_1482223891678_3283
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
default: Adapter 1: nat
default: Adapter 2: hostonly
==> default: Forwarding ports...
default: 22 (guest) => 2222 (host) (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
default: SSH address: 127.0.0.1:2222
default: SSH username: vagrant
default: SSH auth method: private key
default:
default: Vagrant insecure key detected. Vagrant will automatically replace
default: this with a newly generated keypair for better security.
default:
default: Inserting generated public key within guest...
default: Removing insecure key from the guest if it's present...
default: Key inserted! Disconnecting and reconnecting using new SSH key...
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
default: The guest additions on this VM do not match the installed version of
default: VirtualBox! In most cases this is fine, but in rare cases it can
default: prevent things such as shared folders from working properly. If you see
default: shared folder errors, please make sure the guest additions within the
default: virtual machine match the version of VirtualBox you have installed on
default: your host and reload your VM.
default:
default: Guest Additions Version: 5.0.0
default: VirtualBox Version: 5.1
==> default: Setting hostname...
==> default: Configuring and enabling network interfaces...
==> default: Mounting shared folders...
default: /vagrant => /home/arch/sourceCode/gSource/bosh-lite
-----------------------------


5. The output of "route"
[root@hadoopdatanode3 ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         xx_core_1       0.0.0.0         UG    100    0        0 eno1
cdc04.xxxx.     xx_core_1       255.255.255.255 UGH   100    0        0 eno1
10.244.0.0      bogon           255.255.224.0   UG    0      0        0 vboxnet1
192.168.50.0    0.0.0.0         255.255.255.0   U     0      0        0 vboxnet1
192.168.56.0    0.0.0.0         255.255.255.0   U     0      0        0 vboxnet0
192.168.80.0    0.0.0.0         255.255.255.0   U     100    0        0 eno1
-----------------------------


6. The output of "vagrant global-status"
[root@hadoopdatanode3 ~]# vagrant global-status
id name provider state directory
----------------------------------------------------------------------------
34b83c6 default virtualbox running /home/arch/sourceCode/gSource/bosh-lite
.............
-----------------------------


7. The output of "bosh status"
[root@hadoopdatanode3 ~]# bosh status
Config
/root/.bosh_config

Director
RSA 1024 bit CA certificates are loaded due to old openssl compatibility
Name Bosh Lite Director
URL https://192.168.50.4:25555
Version 1.3262.3.0 (00000000)
User admin
UUID fb70f16a-4d6f-41f8-bc2e-0181a03dff18
CPI warden_cpi
dns disabled
compiled_package_cache enabled (provider: local)
snapshots disabled

Deployment
Manifest /home/arch/sourceCode/gSource/cf-release/bosh-lite/deployments/cf.yml
-----------------------------


Krannich, Bernd <bernd.krannich@...>
 

Hi Diego,

I ran into the same issue recently. A preliminary analysis by BOSH colleagues points to a race condition between Vagrant and BOSH, which both change the hostname at the same time.

We were able to resolve the issue as follows:

# ssh into the bosh-lite VM
sudo su -
monit restart all # ignore error messages
# wait for
monit summary
# to show all processes as being restarted.
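
The "wait for monit summary" step above could be scripted roughly as follows. This is only a sketch: it assumes that `monit summary` prints one line per process in the form `Process 'name' status`, and the helper names `all_running` and `wait_for_monit` are made up for illustration rather than taken from monit or BOSH.

```shell
#!/bin/sh
# Sketch only. Assumes `monit summary` prints lines such as
#   Process 'director'    running
# all_running and wait_for_monit are hypothetical helper names.

# Reads `monit summary` output on stdin. Succeeds only if at least one
# "Process" line is present and every such line ends in "running".
all_running() {
  awk '$1 == "Process" { total++; if ($NF != "running") pending++ }
       END { if (total > 0 && pending == 0) exit 0; exit 1 }'
}

# Polls `monit summary` up to $1 times, sleeping $2 seconds between polls.
# Returns 0 as soon as all processes are running, 1 if it gives up.
wait_for_monit() {
  tries="${1:-30}"
  delay="${2:-10}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if monit summary | all_running; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

Inside the VM, as root, this might be used as `monit restart all; wait_for_monit 30 10`. An empty or failed `monit summary` is treated as "not ready" so that the loop does not report success by accident.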

Regards,
Bernd

P.S.: The 96% display is a different issue, tracked here: https://github.com/cloudfoundry/bosh/issues/1352

On 20/12/2016, 12:54, "Diego Lin" <diegol(a)synnex.com> wrote:

[...]


Ronak Banka
 

Hi Diego,

You can try something that helped me:

1. Run bosh upload stemcell.
2. While the upload is stuck at 96%, open a new terminal, log in to the
Vagrant VM with vagrant ssh, and run monit restart for each director
worker (one at a time).

The last time I did this, soon after I restarted worker_1 with monit, the
CLI in the other terminal got a successful response for the stemcell upload.
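
The "one at a time" restart could be sketched as a small loop like the one below. The names are assumptions made for illustration: `restart_one_by_one` and `MONIT_CMD` are hypothetical, and any worker names other than worker_1 should be checked against `monit summary` in your own VM first.

```shell
#!/bin/sh
# Sketch only. MONIT_CMD and restart_one_by_one are hypothetical names.
# The default path is an assumption about where monit lives in the VM.
MONIT_CMD="${MONIT_CMD:-/var/vcap/bosh/bin/monit}"

# Usage: restart_one_by_one PAUSE_SECONDS NAME [NAME ...]
# Restarts each named process in turn, pausing between restarts so the
# stuck upload can be checked in the other terminal after every step.
restart_one_by_one() {
  pause="$1"
  shift
  for name in "$@"; do
    "$MONIT_CMD" restart "$name" || return 1
    sleep "$pause"
  done
}
```

For example, as root inside the VM: `restart_one_by_one 15 worker_1 worker_2 worker_3`. Setting `MONIT_CMD=echo` beforehand prints the restart commands instead of running them.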

Thanks
Ronak





Ballok, Istvan Zoltan <istvan.zoltan.ballok@...>
 

Hi,

I faced the same issue and, with help from my colleagues, came up with the following workaround: restart the BOSH director when initialising the Vagrant virtual machine.

--- a/Vagrantfile
+++ b/Vagrantfile
@@ -39,4 +39,6 @@ Vagrant.configure('2') do |config|
 #we no longer build current boxes for vmware_workstation
 #ensure that this fails. otherwise the user gets an old box
 end
+
+ config.vm.provision "shell", inline: "sudo /var/vcap/bosh/bin/monit restart director"
 end



In my case the root cause might be the corporate proxy. I configured the proxy settings with the “vagrant-proxyconf” plugin [1].

Note that bosh-lite, for some reason, hard-codes the Google DNS server (8.8.8.8) as the first resolver entry [2]. If that DNS server is not reachable from your environment [3], you might also want to comment out that resolvconf entry so that the default DNS forwarding mechanism provided by VirtualBox is used:

+ config.vm.provision "shell", inline: "sudo sed -i 's/\\(^nameserver 8\\.8\\.8\\.8.*\\)/#\\1/' /etc/resolvconf/resolv.conf.d/head"
+ config.vm.provision "shell", inline: "sudo resolvconf -u"
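
Before editing the file in place, it may help to preview what the substitution does. The sketch below is a dry run that prints the result instead of using `-i`. The function name `comment_out_google_dns` is hypothetical, and the expression is a slight variant of the one above, with the `^` anchor moved outside the group, which should behave the same way.

```shell
#!/bin/sh
# Sketch only. comment_out_google_dns is a hypothetical helper name.
# Prints the given resolv.conf-style file with any line starting with
# "nameserver 8.8.8.8" commented out. The file itself is not modified.
comment_out_google_dns() {
  sed 's/^\(nameserver 8\.8\.8\.8.*\)/#\1/' "$1"
}
```

Inside the VM, `comment_out_google_dns /etc/resolvconf/resolv.conf.d/head` shows the lines that the provisioning step would produce; other nameserver entries are left untouched.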



Cheers,
Istvan

[1] https://github.com/tmatilai/vagrant-proxyconf
[2] https://github.com/cloudfoundry/bosh-lite/commit/d984f8c34ae14ca8f53e3b6266dfa7d3f0ecfe0b
[3] Test it with: $ nslookup www.google.com 8.8.8.8



Diego Lin
 

Thank you very much, Ronak!
You have been a big help!

It works now after I ran the command "monit -v restart all", following your suggestion.