We recently scaled down a number of our DEAs. We did this by simply reducing the instances count in our job and deploying. When we did, it didn't appear that bosh shut down these extra VMs and allowed them to evacuate prior to deleting them. We got messages like:
"Started deleting unneeded instances"
Is this expected behaviour?
Mike
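The scale-down described above amounts to a manifest change roughly like the following. This is an illustrative sketch only; the job name and instance counts are examples, not values from the original deployment:

```yaml
# Illustrative BOSH v1 manifest fragment (names and sizes are examples).
# Scaling down is done by lowering `instances` and running `bosh deploy`;
# the deploy then reports "Started deleting unneeded instances".
jobs:
- name: dea_next
  instances: 10   # reduced from, e.g., 20
```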
Hi, did you decrease the resource pool size too?
Thanks
_______________________________________________ cf-bosh mailing list cf-bosh(a)lists.cloudfoundry.org https://lists.cloudfoundry.org/mailman/listinfo/cf-bosh
Ah, we aren't specifying resource pool sizes. We're letting bosh manage that.
Mike
You should not have to specify resource pool sizes (in fact we are deprecating this feature soon). I'll look into this tomorrow morning.
Sent from my iPhone
Thanks, for the record I'm on release 2889.
Mike
I've looked at the debug logs for when instances are decreased. The director does make the correct calls: drain, then stop, before deleting the VM:
I, [2015-07-16 00:18:30 #6206] [] INFO -- DirectorJobRunner: Delete unneeded instance: i-1202c2c1
D, [2015-07-16 00:18:30 #6206] [] DEBUG -- DirectorJobRunner: SENT: agent.ec02e6e0-eaa7-413c-aa65-13c96a6d77fe {"method":"drain","arguments":["shutdown"],"reply_to":"director.cb29deb2-699f-4128-91a3-52d3c061e41b.dfa7737f-8ae7-457c-8c5a-f1481fb5496b"} # <----------------- DRAIN
D, [2015-07-16 00:18:30 #6206] [] DEBUG -- DirectorJobRunner: RECEIVED: director.cb29deb2-699f-4128-91a3-52d3c061e41b.dfa7737f-8ae7-457c-8c5a-f1481fb5496b {"value":{"agent_task_id":"3b56a992-6bcb-4059-7594-4d31e7dbb706","state":"running"}}
D, [2015-07-16 00:18:30 #6206] [] DEBUG -- DirectorJobRunner: SENT: agent.ec02e6e0-eaa7-413c-aa65-13c96a6d77fe {"method":"get_task","arguments":["3b56a992-6bcb-4059-7594-4d31e7dbb706"],"reply_to":"director.cb29deb2-699f-4128-91a3-52d3c061e41b.12d27422-27c3-4d74-a16c-2616bebcb2df"}
D, [2015-07-16 00:18:30 #6206] [] DEBUG -- DirectorJobRunner: RECEIVED: director.cb29deb2-699f-4128-91a3-52d3c061e41b.12d27422-27c3-4d74-a16c-2616bebcb2df {"value":{"agent_task_id":"3b56a992-6bcb-4059-7594-4d31e7dbb706","state":"running"}}
D, [2015-07-16 00:18:31 #6206] [] DEBUG -- DirectorJobRunner: SENT: agent.ec02e6e0-eaa7-413c-aa65-13c96a6d77fe {"method":"get_task","arguments":["3b56a992-6bcb-4059-7594-4d31e7dbb706"],"reply_to":"director.cb29deb2-699f-4128-91a3-52d3c061e41b.3bda4644-8005-4deb-82c8-c1225641d343"}
D, [2015-07-16 00:18:31 #6206] [] DEBUG -- DirectorJobRunner: RECEIVED: director.cb29deb2-699f-4128-91a3-52d3c061e41b.3bda4644-8005-4deb-82c8-c1225641d343 {"value":0}
D, [2015-07-16 00:18:31 #6206] [] DEBUG -- DirectorJobRunner: SENT: agent.ec02e6e0-eaa7-413c-aa65-13c96a6d77fe {"method":"stop","arguments":[],"reply_to":"director.cb29deb2-699f-4128-91a3-52d3c061e41b.23fcaba6-aa57-4614-ae03-ce7304d6ef4e"} # <----------------- STOP
D, [2015-07-16 00:18:31 #6206] [] DEBUG -- DirectorJobRunner: RECEIVED: director.cb29deb2-699f-4128-91a3-52d3c061e41b.23fcaba6-aa57-4614-ae03-ce7304d6ef4e {"value":{"agent_task_id":"2e7de824-f7ab-40a0-6e86-7d958745c452","state":"running"}}
D, [2015-07-16 00:18:31 #6206] [] DEBUG -- DirectorJobRunner: SENT: agent.ec02e6e0-eaa7-413c-aa65-13c96a6d77fe {"method":"get_task","arguments":["2e7de824-f7ab-40a0-6e86-7d958745c452"],"reply_to":"director.cb29deb2-699f-4128-91a3-52d3c061e41b.352f1cca-af22-44dc-89e4-2725b837a7c0"}
D, [2015-07-16 00:18:31 #6206] [] DEBUG -- DirectorJobRunner: RECEIVED: director.cb29deb2-699f-4128-91a3-52d3c061e41b.352f1cca-af22-44dc-89e4-2725b837a7c0 {"value":{"agent_task_id":"2e7de824-f7ab-40a0-6e86-7d958745c452","state":"running"}}
D, [2015-07-16 00:18:32 #6206] [] DEBUG -- DirectorJobRunner: Renewing lock: lock:deployment:dummy-on-uaa
D, [2015-07-16 00:18:32 #6206] [] DEBUG -- DirectorJobRunner: SENT: agent.ec02e6e0-eaa7-413c-aa65-13c96a6d77fe {"method":"get_task","arguments":["2e7de824-f7ab-40a0-6e86-7d958745c452"],"reply_to":"director.cb29deb2-699f-4128-91a3-52d3c061e41b.259e6f72-b304-4827-b133-d69cf5cc4986"}
D, [2015-07-16 00:18:32 #6206] [] DEBUG -- DirectorJobRunner: RECEIVED: director.cb29deb2-699f-4128-91a3-52d3c061e41b.259e6f72-b304-4827-b133-d69cf5cc4986 {"value":"stopped"}
D, [2015-07-16 00:18:32 #6206] [] DEBUG -- DirectorJobRunner: External CPI sending request: {"method":"delete_vm","arguments":["i-1202c2c1"],"context":{"director_uuid":"9d00a46b-e311-4b41-8916-16f0e5a92e21"}} with command: /var/vcap/jobs/cpi/bin/cpi # <----------------- DELETE
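The order of operations in that log can be sketched as follows. The agent and CPI objects here are hypothetical stand-ins to illustrate the sequence, not the real director code:

```python
# Sketch of the call sequence the director makes when deleting an
# unneeded instance, reconstructed from the debug log above.
# FakeAgent/FakeCPI are made-up stand-ins, not a real BOSH API.

def delete_unneeded_instance(agent, cpi, vm_cid):
    """Drain, stop, then delete -- mirroring the DRAIN/STOP/DELETE markers."""
    calls = []

    # 1. Ask the job to drain for shutdown; the log shows the drain
    #    task completing with {"value":0} (no extra wait requested).
    wait = agent.drain("shutdown")
    calls.append(("drain", wait))

    # 2. Stop the jobs on the VM.
    agent.stop()
    calls.append(("stop", None))

    # 3. Only then delete the VM through the CPI.
    cpi.delete_vm(vm_cid)
    calls.append(("delete_vm", vm_cid))
    return calls


class FakeAgent:
    def drain(self, mode):
        assert mode == "shutdown"
        return 0  # drain finished immediately, as in the log

    def stop(self):
        pass


class FakeCPI:
    def delete_vm(self, cid):
        self.deleted = cid


if __name__ == "__main__":
    cpi = FakeCPI()
    order = delete_unneeded_instance(FakeAgent(), cpi, "i-1202c2c1")
    print([name for name, _ in order])  # ['drain', 'stop', 'delete_vm']
```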
On Wed, Jul 15, 2015 at 3:13 PM, Aristoteles Neto <neto(a)orcon.net.nz> wrote:
Despite that message being displayed (it’s displayed before the stop/shutdown commands are sent to the VMs), it should’ve still triggered the drain script [1].
Looks like the drain log file isn’t forwarded to an aggregator, so you won’t have its logs anymore… I wonder if other DEAs would contain logs pertaining to an attempt to evacuate?
[1] https://github.com/cloudfoundry/cf-release/blob/master/jobs/dea_next/templates/deterministic_drain.rb
Regards,
Neto
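For anyone unfamiliar with drain scripts, here is a minimal sketch of how a drain result can be interpreted, assuming the documented contract (the script prints an integer on stdout: zero or a positive N means proceed after N seconds; a negative N means dynamic drain, re-check after |N| seconds). The helper names are made up:

```python
# Hypothetical poller illustrating the BOSH drain-script contract.
# `run_drain_script` stands in for invoking the job's drain script
# and parsing the integer it prints.

import time

def wait_for_drain(run_drain_script, sleep=time.sleep):
    """Call run_drain_script() until it signals that draining is done."""
    while True:
        value = run_drain_script()  # integer from the script's stdout
        if value >= 0:
            sleep(value)   # static drain: wait, then proceed
            return
        sleep(-value)      # dynamic drain: wait, then ask again


if __name__ == "__main__":
    # Simulate a dynamic drain (e.g. a DEA evacuating apps) that needs
    # two extra polls before reporting completion.
    results = iter([-1, -1, 0])
    sleeps = []
    wait_for_drain(lambda: next(results), sleep=sleeps.append)
    print(sleeps)  # [1, 1, 0]
```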
Yup, you're right. I looked through the debug logs of my deploy and they look the same. I guess it threw me off that it was draining and deleting so fast (10 seconds in my case). Thanks for helping me work through that question.
Mike