Mike Youngstrom <youngm@...>
Today my org manages our DEA resources using a heavy overcommit strategy. Rather than being conservative and ensuring that none of our DEAs commit to more than they can handle, we have instead decided to overcommit to the point where we basically turn off DEA resource management.
All our DEAs have the same amount of RAM and Disk and we closely monitor these resources. When load gets beyond a threshold we deploy more DEAs. We use Org quotas as ceilings to help stop an app from accidentally killing everything.
So far this strategy has worked out great for us. It's allowed us to provide much more friendly defaults for RAM and Disk and allowed us to get more value out of our DEA dollar.
As we move into Diego we're attempting to implement the same strategy. We want to be sure to do it correctly since we're less comfortable with Diego at this point.
Diego doesn't have the friendly "overcommit" property DEAs do. Instead I see "diego.executor.memory_capacity_mb" and "diego.executor.disk_capacity_mb". Can I overcommit these values and get the same behaviour I would get by overcommitting DEAs?
I'd also like some advice on what "diego.garden-linux.btrfs_store_size_mb" is and how it might apply to my overcommit plans.
Thanks, Mike
|
|
I know that Onsi and Eric have discussed this. I've heard that Eric is working on a reply.
In reply to Mike Youngstrom <youngm(a)gmail.com>, Tue, Aug 11, 2015 at 12:50 PM.
-- Thank you,
James Bayer
|
|
Hi, Mike,
Apologies, I emailed this to cf-dev a few days ago, but it seems not to have gone through. Anyway, thanks for asking about the different configuration values Diego exposes for disk and memory. Yes, you can use the 'diego.executor.memory_capacity_mb' and 'diego.executor.disk_capacity_mb' properties to specify overcommits in absolute terms rather than the relative factors configurable on the DEAs. The cell reps will advertise those values as their maximum memory and disk capacity, and subtract memory and disk for allocated containers when reporting their available capacity during auctions.
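As a rough illustration of the difference between a relative overcommit factor and the absolute capacities described above, the following sketch converts a factor into values one might place in 'diego.executor.memory_capacity_mb' and 'diego.executor.disk_capacity_mb', and mimics how a cell's available capacity shrinks as containers are allocated. The function and class names, the example VM sizes, and the factor of 2 are all assumptions made for illustration; this is a toy model, not Diego's actual rep or auction code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


def absolute_capacity_mb(physical_mb: int, overcommit_factor: float) -> int:
    """Turn a DEA-style relative factor into an absolute capacity in MB."""
    if physical_mb <= 0 or overcommit_factor <= 0:
        raise ValueError("physical_mb and overcommit_factor must be positive")
    return int(physical_mb * overcommit_factor)


@dataclass
class Cell:
    """Toy model of a cell: advertised totals minus allocated containers."""
    memory_capacity_mb: int
    disk_capacity_mb: int
    containers: List[Tuple[int, int]] = field(default_factory=list)

    def available(self) -> Tuple[int, int]:
        # Available capacity is the advertised maximum less what is allocated.
        used_mem = sum(mem for mem, _ in self.containers)
        used_disk = sum(disk for _, disk in self.containers)
        return (self.memory_capacity_mb - used_mem,
                self.disk_capacity_mb - used_disk)

    def allocate(self, memory_mb: int, disk_mb: int) -> bool:
        # Refuse the container if either resource would be exceeded.
        free_mem, free_disk = self.available()
        if memory_mb > free_mem or disk_mb > free_disk:
            return False
        self.containers.append((memory_mb, disk_mb))
        return True


# Hypothetical VM: 16 GB RAM and 100 GB ephemeral disk, overcommitted 2x.
cell = Cell(
    memory_capacity_mb=absolute_capacity_mb(16 * 1024, 2.0),
    disk_capacity_mb=absolute_capacity_mb(100 * 1024, 2.0),
)
cell.allocate(1024, 2048)
print(cell.memory_capacity_mb, cell.disk_capacity_mb, cell.available())
```

Because the values are absolute, they would need to be recomputed whenever the VM type changes, which a relative factor would have handled implicitly.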
The 'btrfs_store_size_mb' property on garden-linux is more of a moving target as garden-linux settles in on that filesystem as a backing store. As of garden-linux-release 0.292.0, which diego-release 0.1412.0 and later consume, that property accepts a '-1' value that allows it to grow up to the full size of the available disk on the /var/vcap/data ephemeral disk volume. The btrfs volume itself is sparse, so it will start at effectively zero size and grow as needed to accommodate the container layers. Since you're already monitoring disk usage on your VMs carefully and scaling out when you hit certain limits, this might be a good option for you. This is also effectively how the DEAs operate today, without an explicit limit on the total amount of disk they allocate for containers.
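For readers unfamiliar with the term, the sketch below shows what "sparse" means in general: a file can report a large apparent size while occupying almost no blocks on disk until data is written. It uses an ordinary temporary file and assumes a POSIX system whose filesystem supports sparse files; it is not how garden-linux creates its btrfs volume, and the helper name is made up for this example.

```python
import os
import tempfile
from typing import Tuple


def sparse_file_sizes(apparent_bytes: int) -> Tuple[int, int]:
    """Create a sparse file and return (apparent size, allocated bytes)."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        path = handle.name
    try:
        # Extending a file with truncate leaves a hole: no data is written.
        os.truncate(path, apparent_bytes)
        info = os.stat(path)
        # st_blocks is counted in 512-byte units on POSIX systems.
        return info.st_size, info.st_blocks * 512
    finally:
        os.remove(path)


apparent, allocated = sparse_file_sizes(1024 ** 3)
print(f"apparent={apparent} bytes, allocated={allocated} bytes")
```

On a filesystem with sparse-file support, the allocated figure should stay far below the 1 GiB apparent size, which is why filesystem-level usage rather than file size is the number to watch.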
If you do want more certainty in the maximum size that the garden-linux btrfs volume will grow to, or if you're on a version of diego-release earlier than 0.1412.0, you should set 'btrfs_store_size_mb' to a positive value, and garden-linux will create the volume to grow only up to that size. One strategy to determine that value would be to use the maximum size of the ephemeral disk, less the size of the BOSH-deployed packages (for the executor, currently around 1.3 GB, including the untarred cflinuxfs2 rootfs), less the size allocated to the executor cache in the 'diego.executor.max_cache_size_in_bytes' property (which currently defaults to 10 GB).
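The subtraction described above can be sketched as follows. The 1.3 GB package size and 10 GB cache default come from this message; the function name, the 100 GB example disk, and the choice to treat 1 GB as 1024 MB (and 10 GB as 10 * 1024^3 bytes) are assumptions, so treat the result as an estimate to adjust for your own cells.

```python
import math

MB = 1024 ** 2  # bytes per MB, assuming binary units


def btrfs_store_size_mb(
    ephemeral_disk_mb: int,
    packages_mb: int = 1332,               # about 1.3 GB of BOSH packages
    max_cache_size_in_bytes: int = 10 * 1024 ** 3,  # assumed 10 GB default
) -> int:
    """Estimate a positive btrfs_store_size_mb for a cell's ephemeral disk."""
    # Round the cache up to whole MB so the estimate errs on the safe side.
    cache_mb = math.ceil(max_cache_size_in_bytes / MB)
    store_mb = ephemeral_disk_mb - packages_mb - cache_mb
    if store_mb <= 0:
        raise ValueError("ephemeral disk too small for packages plus cache")
    return store_mb


# Hypothetical cell with a 100 GB ephemeral disk.
print(btrfs_store_size_mb(100 * 1024))
```

If the executor cache size is changed in the manifest, pass the same value here so the two settings stay consistent.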
Best, Eric
|
|
Mike Youngstrom <youngm@...>
Thanks for the response Eric. It was very helpful.
One last question. Any thoughts on the best way to monitor free ephemeral disk space in my overcommitted situation? With btrfs_store_size_mb=-1, will btrfs free ephemeral disk space when less is being used, or does it only grow when it needs more? Looking at the firehose stats in 1398, I don't see any btrfs usage metrics being sent from garden-linux.
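Absent btrfs metrics, one stopgap is to check usage of the ephemeral disk at the filesystem level. The sketch below is a minimal example using Python's standard shutil.disk_usage; the function names and the 80% threshold are assumptions, and on a cell the path would be /var/vcap/data rather than the "/" used here so the example runs anywhere. It reports what the filesystem currently has allocated, so it does not by itself answer whether btrfs gives space back.

```python
import shutil


def disk_used_fraction(path: str) -> float:
    """Return the fraction of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def needs_scale_out(path: str, threshold: float = 0.8) -> bool:
    """True when usage at `path` has reached the scale-out threshold."""
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    return disk_used_fraction(path) >= threshold


# On a cell this would be "/var/vcap/data"; "/" keeps the example portable.
fraction = disk_used_fraction("/")
print(f"used: {fraction:.1%}, scale out: {needs_scale_out('/')}")
```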
Thanks, Mike
In reply to Eric Malm <emalm(a)pivotal.io>, Mon, Aug 17, 2015 at 9:14 PM.
|
|
Will Pragnell <wpragnell@...>
Apparently my last reply to this thread never made it through. Hope this one does!
Mike, you're right that there are currently no btrfs metrics being emitted from garden-linux. There are no immediate plans to implement them, but such metrics are clearly useful, so I'll raise this with the team and see where we land.
As for your question about btrfs freeing disk space, I'm afraid I don't know offhand. I'll have to do some investigation and get back to you on that next week.
In reply to Mike Youngstrom <youngm(a)gmail.com>, 19 August 2015 at 23:46.
|
|
Mike Youngstrom <youngm@...>
Thanks Will. If btrfs does free disk space, then I can just use the BOSH ephemeral disk metric to monitor. If it doesn't, then I'll need Garden to provide me with something.
Thanks, Mike
In reply to Will Pragnell <wpragnell(a)pivotal.io>, Thu, Aug 20, 2015 at 10:58 AM.
|
|