Just for clarity, are you saying multiple instances of a VM cannot share a single shared filesystem?
On Wed, Sep 16, 2015 at 6:59 PM, Dmitriy Kalinin <dkalinin(a)pivotal.io> wrote:
BOSH allocates a persistent disk per instance. It never shares persistent disks between multiple instances at the same time.
If you need a shared file system, you will have to use some kind of release for it. It's no different from what people do with an NFS server/client.
On Wed, Sep 16, 2015 at 7:09 AM, Amit Gupta <agupta(a)pivotal.io> wrote:
The shared file system aspect is an interesting wrinkle to the problem. Unless you write to the shared file system through some network layer, e.g. SSHFS, I think apps will not work: they are isolated in a container and given a chroot "jail" for their file system, which gets blown away whenever the app is stopped or restarted (which will commonly happen, e.g. during a rolling deploy of the container-runner VMs).
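To make the point about the container file system concrete, here is a rough Python sketch. It is only a simulation under stated assumptions: two temporary directories stand in for the container's chroot and for a network-mounted shared file system, and the names `run_app_once`, `restart_container`, and `demo` are hypothetical, not part of BOSH or Cloud Foundry.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Tuple


def run_app_once(container_root: Path, shared_mount: Path, message: str) -> None:
    """Write the same message to the container-local FS and to the shared mount."""
    (container_root / "local.log").write_text(message)
    (shared_mount / "shared.log").write_text(message)


def restart_container(container_root: Path) -> Path:
    """Simulate a stop/restart: the old root is destroyed and a fresh one is created."""
    shutil.rmtree(container_root)
    return Path(tempfile.mkdtemp(prefix="container-"))


def demo() -> Tuple[bool, bool]:
    """Return (local file survived restart, shared file survived restart)."""
    # Assumption: a plain temp directory stands in for a network mount (e.g. SSHFS).
    shared_mount = Path(tempfile.mkdtemp(prefix="shared-"))
    container_root = Path(tempfile.mkdtemp(prefix="container-"))
    try:
        run_app_once(container_root, shared_mount, "batch 1 processed")
        container_root = restart_container(container_root)
        local_survived = (container_root / "local.log").exists()
        shared_survived = (shared_mount / "shared.log").exists()
        return local_survived, shared_survived
    finally:
        # Clean up both simulated file systems.
        shutil.rmtree(container_root, ignore_errors=True)
        shutil.rmtree(shared_mount, ignore_errors=True)
```

Calling `demo()` shows the local write disappearing with the container while the write to the stand-in shared mount remains, which is why anything the jobs need to keep has to go through a mount that lives outside the container.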
Do you have something that currently works? How do your VMs currently access this shared FS? I'm not sure BOSH has the abstractions for choosing a shared, already-existing "persistent disk" to be attached to multiple VMs. I also don't know what happens when you scale your VMs down, because BOSH would generally destroy the associated persistent disk, but you don't want to destroy the shared data.
Dmitriy, any idea how BOSH can work with a shared filesystem (e.g. HDFS)?
Amit
On Wed, Sep 16, 2015 at 6:54 AM, Kayode Odeyemi <dreyemi(a)gmail.com> wrote:
On Wed, Sep 16, 2015 at 3:44 PM, Amit Gupta <agupta(a)pivotal.io> wrote:
Are the spark jobs tasks that you expect to end, or apps that you expect to run forever?
They are tasks that run forever. The jobs are subscribers to RabbitMQ queues that process messages in batches.
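A minimal sketch of that kind of job, in Python, might look like the following. It is an illustration only: an in-process `queue.Queue` stands in for a RabbitMQ queue, the names `drain_batch` and `run_subscriber` are hypothetical, and `max_idle_polls` exists only so the sketch terminates (the real jobs loop forever).

```python
import queue
from typing import Callable, List


def drain_batch(q: "queue.Queue[str]", max_batch: int, timeout: float) -> List[str]:
    """Collect up to max_batch messages, waiting at most `timeout` seconds for each."""
    batch: List[str] = []
    while len(batch) < max_batch:
        try:
            batch.append(q.get(timeout=timeout))
        except queue.Empty:
            break  # nothing more available right now; hand back a partial batch
    return batch


def run_subscriber(
    q: "queue.Queue[str]",
    handle_batch: Callable[[List[str]], None],
    max_batch: int = 3,
    timeout: float = 0.05,
    max_idle_polls: int = 1,
) -> None:
    """Process messages in batches until the queue stays empty for max_idle_polls polls."""
    idle = 0
    while idle < max_idle_polls:
        batch = drain_batch(q, max_batch, timeout)
        if batch:
            handle_batch(batch)
            idle = 0
        else:
            idle += 1


def example() -> List[List[str]]:
    """Feed seven messages through the subscriber and return the batches it saw."""
    q: "queue.Queue[str]" = queue.Queue()
    for i in range(7):
        q.put(f"msg-{i}")
    seen: List[List[str]] = []
    run_subscriber(q, seen.append)
    return seen
```

With seven messages and a batch size of three, `example()` yields two full batches followed by one partial batch.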
Do your jobs need to write to the file system, or do they access a shared/distributed file system somehow?
The jobs write to a shared filesystem.
Do you need things like a static IP allocated to your jobs?