container restart on logout
DHR
Hi,
Last year when ‘cf ssh’ functionality was being discussed, I’m pretty sure that the concept of automatically restarting containers following an SSH session was discussed. It was to protect against creating app container snowflakes. I’m fairly sure that protection hasn’t been introduced yet: I tested cf ssh-ing into a PCF Dev container today, writing a file, and was able to log back in to the container later and see that it was still present.

Is this feature, or any other app container snowflake protection, still planned? I couldn’t see anything in the Diego backlog (https://www.pivotaltracker.com/n/projects/1003146).

Thanks
Dave
Jon Price
This is something that has been on our wishlist as well but I haven't seen any discussion about it in quite some time. Here is one of the original discussions about it: https://lists.cloudfoundry.org/archives/list/cf-dev(a)lists.cloudfoundry.org/thread/GCFOOYRUT5ARBMUHDGINID46KFNORNYM/
It would go a long way with our security team if we could have some sort of recycling policy for containers in some of our more secure environments.

Jon Price
Intel Corporation
DHR
Thanks Jon. The financial services clients I have worked with would also like the ability to turn on ‘cf ssh’ support in production, safe in the knowledge that app teams won’t abuse it by creating app snowflakes.
I see that the audit trail mentioned in the thread you posted has been implemented in ‘cf events’. Like this:

time                          event                      actor   description
2016-12-19T16:20:36.00+0000   audit.app.ssh-authorized   user    index: 0
2016-12-19T15:30:33.00+0000   audit.app.ssh-authorized   user    index: 0
2016-12-19T12:00:53.00+0000   audit.app.ssh-authorized   user    index: 0

That said: I still think the container recycle functionality, available as say a feature flag, would be really appreciated by the large enterprise community.

On 19 Dec 2016, at 18:25, Jon Price <jon.price(a)intel.com> wrote:
Daniel Jones
Plus one!
An implementation whereby the recycling behaviour can be feature-flagged by space or globally would be nice, so you could turn it off whilst debugging in a space, and then re-enable it when you've finished debugging via a series of short-lived SSH sessions.

Regards,
Daniel Jones - CTO
+44 (0)79 8000 9153
@DanielJonesEB <https://twitter.com/DanielJonesEB>
*EngineerBetter* Ltd <http://www.engineerbetter.com> - UK Cloud Foundry Specialists

On Tue, Dec 20, 2016 at 8:06 AM, DHR <lists(a)dhrapson.com> wrote:
> Thanks Jon. The financial services clients I have worked with would also
David Illsley <davidillsley@...>
I have no idea why the idea hasn't been implemented, but pondering it, it seems like it's hard to do because of the cases you mention. Some people need a policy that 'app teams won’t abuse it by creating app snowflakes', and in some (most?) cases you need the flexibility to do debugging as you mentioned.

I think it's possible to combine the SSH authorized events and the instance uptime details from the API to build audit capability - identify instances which have been SSH'd to and not recycled within some time period (eg 1 hour). You could have either some escalation process to get a human to do something about it (in case there's a reason an hour wasn't enough), or more brutally, give the audit code the ability to do a restart of the instance.

On Tue, Dec 20, 2016 at 12:48 PM, Daniel Jones <daniel.jones(a)engineerbetter.com> wrote:
> Plus one!
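A minimal sketch of the audit idea above, assuming you have already pulled the `audit.app.ssh-authorized` timestamps (e.g. from `cf events`) and instance start times from the API into plain dictionaries; the function name and data shapes are illustrative, not any real CF client API:

```python
from datetime import datetime, timedelta

def snowflake_candidates(ssh_events, instance_starts,
                         grace=timedelta(hours=1), now=None):
    """Return instance GUIDs that were SSH'd into but not recycled
    within the grace period afterwards.

    ssh_events: {instance_guid: [datetime of each ssh-authorized event]}
    instance_starts: {instance_guid: datetime the instance last started}
    """
    now = now or datetime.utcnow()
    flagged = []
    for guid, events in ssh_events.items():
        started = instance_starts.get(guid)
        for ssh_at in events:
            # Flag once the grace period has elapsed without the
            # instance having been restarted since the SSH session.
            if now >= ssh_at + grace and (started is None or started <= ssh_at):
                flagged.append(guid)
                break
    return flagged
```

The flagged list could then feed either the human escalation process or the "more brutal" automatic restart.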
Daniel Jones
Hmm, here's an idea that I haven't thought through and so is probably rubbish...

How about an immutability enforcer? Recursively checksum the expanded contents of a droplet, and kill-with-fire anything that doesn't match it. It'd need to be optional for folks storing ephemeral data on their ephemeral disk, and a non-invasive (ie no changes to CF components) implementation would *depend* on `cf ssh` or a chained buildpack, but maybe that's a nice compromise that could be quicker to develop than waiting for mainline code changes to CF?

Regards,
Daniel Jones - CTO
+44 (0)79 8000 9153
@DanielJonesEB <https://twitter.com/DanielJonesEB>
*EngineerBetter* Ltd <http://www.engineerbetter.com> - UK Cloud Foundry Specialists

On Thu, Dec 22, 2016 at 10:01 AM, David Illsley <davidillsley(a)gmail.com> wrote:
> I have no idea why the idea hasn't been implemented, but pondering it, it
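A rough sketch of the checksumming half of that enforcer, assuming the expanded droplet sits under a known directory; this only builds and compares manifests, the kill-with-fire step is left out:

```python
import hashlib
import os

def checksum_tree(root):
    """Recursively hash every regular file under root, returning a
    {relative_path: sha256_hexdigest} manifest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            manifest[rel] = h.hexdigest()
    return manifest

def drift(baseline, current):
    """Compare two manifests; any added, removed, or changed path
    means the instance has drifted from its droplet."""
    added = set(current) - set(baseline)
    removed = set(baseline) - set(current)
    changed = {p for p in set(baseline) & set(current)
               if baseline[p] != current[p]}
    return added | removed | changed
```

Paths holding legitimately ephemeral data (tmp dirs, caches) would need to be excluded from the baseline for this to be usable in practice.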
Graham Bleach
On 23 December 2016 at 09:21, Daniel Jones <daniel.jones(a)engineerbetter.com> wrote:
> Hmm, here's an idea that I haven't thought through and so is probably rubbish...

An idea we've been kicking around is to ensure that app instance containers never live longer than a certain time (eg. 3, 6, 12 or 24 hours). This would ensure that we'd catch cases where apps weren't able to cope with being rescheduled to different cells. It'd also strongly discourage manual tweaks via ssh.

It'd probably be useful for people deploying apps to be able to initiate an aggressive version of this behaviour to run in their testing pipelines, prior to production deployment, to catch regressions in keeping state in app instances.

There's a naive implementation in my head that would work fine on smaller installations by looping through app instances returned by the API and restarting them.

Cheers,
Graham
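The naive loop described above might look something like this; the `client` object and its four methods are hypothetical stand-ins for whatever CF API wrapper is available, not a real library:

```python
from datetime import datetime, timedelta

def recycle_old_instances(client, max_age=timedelta(hours=24), now=None):
    """Restart any app instance older than max_age, one at a time,
    waiting for each replacement to report healthy before moving on,
    so a single pass never takes out a whole app at once."""
    now = now or datetime.utcnow()
    restarted = []
    for app in client.list_apps():
        for index, started_at in client.instance_start_times(app):
            if now - started_at > max_age:
                client.restart_instance(app, index)
                client.wait_until_healthy(app, index)  # block before the next restart
                restarted.append((app, index))
    return restarted
```

Run from a cron-style job, this would give the "never older than N hours" guarantee on small installations; larger ones would need pagination and rate limiting.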
Stefan Mayr
On 23.12.2016 at 10:36, Graham Bleach wrote:
> On 23 December 2016 at 09:21, Daniel Jones

How to cope with the following issues?

Temporary data: some software still uses sessions, file uploads or caches which are buffered or written to disk (Java/Tomcat, PHP, ...). While it is okay to lose this data when a container is restarted (after you had some time to work with this data), it becomes a problem when every write can cause the recreation of this container. How should an upload form work if every upload can kill the container? I'm only referring to the processing of the upload - not permanently storing it.

Single instances: recreating app containers when there are more than two should not cause too many issues. But if there is only one instance you have two choices:
- kill the running container and start a new one -> short downtime
- start a second instance and kill the first one afterwards -> problem if the application is only allowed to run with one instance (singleton).

One-shot tasks: a slight variation of the single instance problem, and the question of whether you are allowed to restart a one-shot task.

Happy holidays,
Stefan
Graham Bleach
Hi Stefan,
On 23 December 2016 at 13:52, Stefan Mayr <stefan(a)mayr-stefan.de> wrote:
> On 23.12.2016 at 10:36, Graham Bleach wrote:
> How to cope with the following issues?

I think this was in response to Dan's immutability enforcement proposal, so I'll let him respond :)

> Single instances: recreating app containers when there are more than two

App instances go away when the cells get replaced (eg. stemcell update) or fail, so apps need to be able to cope with it. If you're not comfortable with downtime then the app probably shouldn't be single instance. For my naive "loop through all the app instances" script I'd be inclined to check that the restarted instance was healthy again before moving onto the next one.

> One-shot tasks: a slight variation of the single instance problem and

Tasks feel less safe to interrupt than app instances. I'm unclear what happens to a running task when the cell gets destroyed, and therefore whether there's some reasonable upper bound on how long a task should take to complete.

--
Technical Architect
Government Digital Service