Re: Overcommit on Diego Cells

Eric Malm <emalm@...>

Hi, Mike,

Apologies, I emailed this to cf-dev a few days ago, but it seems not to have gone through. Anyway, thanks for asking about the different configuration values Diego exposes for disk and memory. Yes, you can use the 'diego.executor.memory_capacity_mb' and 'diego.executor.disk_capacity_mb' properties to specify overcommits in absolute terms, rather than as the relative factors configurable on the DEAs. The cell reps will advertise those values as their maximum memory and disk capacity, and will subtract the memory and disk already allocated to containers when reporting their available capacity during auctions.
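As a rough illustration of that bookkeeping, here is a toy model in Python. It is not Diego's actual code: the Container type, the available_capacity function, and the example numbers are all made up for this sketch. It only shows that the advertised capacity is whatever you configure, and that availability is that figure minus what has been allocated.

```python
from dataclasses import dataclass


@dataclass
class Container:
    """Hypothetical record of one allocated container's reservations."""
    memory_mb: int
    disk_mb: int


def available_capacity(memory_capacity_mb, disk_capacity_mb, containers):
    """Return (memory_mb, disk_mb) still available on a cell.

    The capacities stand in for diego.executor.memory_capacity_mb and
    diego.executor.disk_capacity_mb; they are absolute values and may
    exceed the physical resources of the VM (that is the overcommit).
    """
    used_memory = sum(c.memory_mb for c in containers)
    used_disk = sum(c.disk_mb for c in containers)
    return memory_capacity_mb - used_memory, disk_capacity_mb - used_disk


# Example with assumed numbers: a cell configured to advertise
# 48 GB of memory and 200 GB of disk, with two containers allocated.
remaining = available_capacity(
    49152, 204800, [Container(1024, 2048), Container(512, 1024)]
)
print(remaining)
```

With the DEAs you would express the same overcommit as a factor applied to the physical size; here you set the resulting absolute number directly.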

The 'btrfs_store_size_mb' property on garden-linux is more of a moving target as garden-linux settles on btrfs as a backing store. As of garden-linux-release 0.292.0, which diego-release 0.1412.0 and later consume, that property accepts a value of '-1', which allows the btrfs volume to grow up to the full size of the /var/vcap/data ephemeral disk. The btrfs volume itself is sparse, so it will start at effectively zero size and grow as needed to accommodate the container layers. Since you're already monitoring disk usage on your VMs carefully and scaling out when you hit certain limits, this might be a good option for you. This is also effectively how the DEAs operate today, without an explicit limit on the total amount of disk they allocate for containers.

If you do want more certainty about the maximum size that the garden-linux btrfs volume will grow to, or if you're on a version of diego-release earlier than 0.1412.0, you should set 'btrfs_store_size_mb' to a positive value, and garden-linux will cap the volume at that size. One strategy to determine that value would be to take the total size of the ephemeral disk, less the size of the BOSH-deployed packages (for the executor, currently around 1.3 GB, including the untarred cflinuxfs2 rootfs), less the size allocated to the executor cache in the 'diego.executor.max_cache_size_in_bytes' property (which currently defaults to 10 GB).
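To make that arithmetic concrete, here is a small Python sketch. The function name and the 100 GB disk are made up for the example, and it assumes 1 GB = 1024 MB and a cache default of exactly 10 * 1024^3 bytes, so check those figures against your own deployment before using the result.

```python
BYTES_PER_MB = 1024 * 1024


def btrfs_store_size_mb(ephemeral_disk_mb,
                        packages_mb=1332,
                        max_cache_size_in_bytes=10 * 1024 ** 3):
    """Estimate a positive value for btrfs_store_size_mb.

    ephemeral_disk_mb: total size of the /var/vcap/data ephemeral disk.
    packages_mb: BOSH-deployed packages, about 1.3 GB rounded up to 1332 MB
        (assumed figure; measure it on your own cells).
    max_cache_size_in_bytes: value of diego.executor.max_cache_size_in_bytes
        (assumed here to be 10 * 1024^3 bytes).
    """
    # Round the cache size up to a whole number of MB.
    cache_mb = -(-max_cache_size_in_bytes // BYTES_PER_MB)
    size_mb = ephemeral_disk_mb - packages_mb - cache_mb
    if size_mb <= 0:
        raise ValueError("ephemeral disk too small for packages plus cache")
    return size_mb


# Example: a hypothetical 100 GB ephemeral disk.
print(btrfs_store_size_mb(100 * 1024))
```

You may want to leave some extra headroom below that number for logs and anything else that lands on the ephemeral disk.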

