On 23.12.2016 at 10:36, Graham Bleach wrote:
On 23 December 2016 at 09:21, Daniel Jones wrote:
Hmm, here's an idea that I haven't thought through and so is probably rubbish...
How about an immutability enforcer? Recursively checksum the expanded
contents of a droplet, and kill-with-fire anything that doesn't match it.
It'd need to be optional for folks storing ephemeral data on their ephemeral
disk, and a non-invasive (ie no changes to CF components) implementation
would depend on `cf ssh` or a chained buildpack, but maybe that's a nice
compromise that could be quicker to develop than waiting for mainline code
changes to CF?

An idea we've been kicking around is to ensure that app instance
containers never live longer than a certain time (eg. 3, 6, 12 or 24 hours).
This would ensure that we'd catch cases where apps weren't able to
cope with being rescheduled to different cells. It'd also strongly
discourage manual tweaks via ssh. It'd probably be useful for people
deploying apps to be able to initiate an aggressive version of this
behaviour to run in their testing pipelines, prior to production
deployment, to catch regressions in keeping state in app instances.
There's a naive implementation in my head that would work fine on
smaller installations by looping through app instances returned by the
API and restarting them.
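
To make the naive loop a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: `instances_to_restart` and `rolling_restart` are made-up names, the stats dictionary is only modelled on the kind of per-instance data the CF API reports (instance index mapped to a state and an uptime in seconds), and the actual restart is passed in as a function so it could wrap something like `cf restart-app-instance`.

```python
from typing import Callable, Dict, List, Tuple


def instances_to_restart(stats: Dict[str, dict], max_age_seconds: int) -> List[int]:
    """Return the indices of running instances older than max_age_seconds.

    Assumed shape of `stats` (hypothetical, for illustration only):
    {"0": {"state": "RUNNING", "stats": {"uptime": 90000}}, ...}
    """
    expired = []
    for index, info in stats.items():
        if info.get("state") != "RUNNING":
            continue  # ignore instances that are not running right now
        uptime = info.get("stats", {}).get("uptime", 0)
        if uptime > max_age_seconds:
            expired.append(int(index))
    return sorted(expired)


def rolling_restart(
    apps: Dict[str, Dict[str, dict]],
    max_age_seconds: int,
    restart: Callable[[str, int], None],
) -> List[Tuple[str, int]]:
    """Loop over all apps and restart every instance that is too old.

    `apps` maps an app name to its stats dictionary; `restart` is supplied
    by the caller and performs the real restart of one instance.
    """
    restarted = []
    for app_name, stats in apps.items():
        for index in instances_to_restart(stats, max_age_seconds):
            restart(app_name, index)
            restarted.append((app_name, index))
    return restarted
```

Passing a much smaller `max_age_seconds` would give the "aggressive" variant for testing pipelines. On a large installation a loop like this would presumably need rate limiting, which the sketch leaves out.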
How to cope with the following issues?
Temporary data: some software still uses sessions, file uploads or
caches which are buffered or written to disk (Java/Tomcat, PHP, ...).
While it is okay to lose this data when a container is restarted (after
you had some time to work with this data), it becomes a problem when
every write can cause the recreation of this container. How should an
upload form work if every upload can kill the container? I'm only
referring to the processing of the upload, not to storing it permanently.
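
To illustrate the concern, here is a minimal Python sketch of a recursive checksum over a directory tree. It is not the proposed enforcer, just an assumption of how such a check might work: `tree_checksum` is a hypothetical helper, and the temporary directory only stands in for an expanded droplet. The checksum changes as soon as a partial upload is buffered to disk.

```python
import hashlib
import os


def tree_checksum(root: str) -> str:
    """Checksum the relative path and contents of every file below root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a deterministic order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest.update(os.path.relpath(path, root).encode("utf-8"))
            digest.update(b"\0")
            with open(path, "rb") as handle:
                # read in chunks so large files do not need to fit in memory
                for chunk in iter(lambda: handle.read(65536), b""):
                    digest.update(chunk)
            digest.update(b"\0")
    return digest.hexdigest()


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as droplet:
        with open(os.path.join(droplet, "app.py"), "w") as f:
            f.write("print('hello')\n")
        baseline = tree_checksum(droplet)

        # Simulate an upload that is buffered to disk while being processed.
        os.makedirs(os.path.join(droplet, "tmp"))
        with open(os.path.join(droplet, "tmp", "upload.part"), "wb") as f:
            f.write(b"partial upload")

        print(tree_checksum(droplet) != baseline)  # prints True
```

With a check like this, an enforcer that kills on any mismatch would kill the container in the middle of handling the upload.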
Single instances: recreating app containers when there are more than two
should not cause too many issues. But if there is only one instance you
have two choices:
- kill the running container and start a new one -> short downtime
- start a second instance and kill the first one afterwards -> problem
if the application is only allowed to run with one instance (singleton).
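
The two choices can be written down as a small decision function. This is only a sketch of the trade-off described above; `recreation_plan` and its `singleton` flag are hypothetical, and nothing like them exists in CF as far as I know.

```python
from typing import List


def recreation_plan(instance_count: int, singleton: bool) -> List[str]:
    """Return the ordered steps for recreating the containers of one app."""
    if instance_count < 1:
        raise ValueError("an app needs at least one instance")
    if instance_count > 1:
        # Other instances keep serving while one is being replaced.
        return ["restart instances one at a time"]
    if singleton:
        # Only one instance may ever run: accept a short downtime.
        return ["kill the running container", "start a new one"]
    # A second instance is allowed: avoid the downtime.
    return ["start a second instance", "kill the first one afterwards"]
```

For example, `recreation_plan(1, singleton=True)` yields the kill-then-start order with its short downtime, while `recreation_plan(1, singleton=False)` starts the replacement first.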
One-shot tasks: a slight variation of the single instance problem, and
the question of whether you are allowed to restart a one-shot task.