Docker for parallel tasks - parallel-processing

I just started with Docker because I'd like to use it to run parallel tasks.
My problem is that I don't understand how Docker handles the host's resources (CPU, RAM, etc.): that is, how can I evaluate the maximum number of containers to run at the same time?
Thanks for your suggestions.
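There is no fixed limit; a first-order estimate divides the memory you are willing to give to containers by the footprint of one container (which you can measure with `docker stats`). A minimal sketch, where both figures are assumptions you would replace with your own measurements:

```shell
# Back-of-the-envelope upper bound on simultaneous containers.
# Both numbers below are assumptions -- measure your own host and workload.
host_mem_mib=2048        # memory budget for containers on this host
per_container_mib=128    # observed footprint of one container (docker stats)
max_containers=$(( host_mem_mib / per_container_mib ))
echo "$max_containers"
```

In practice you would also cap each container explicitly, e.g. `docker run --memory 128m --cpus 0.5 ...`, so that one task cannot starve the others; CPU is usually oversubscribable for bursty work, while memory is the hard limit.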

Related

How to use AWS sbatch (SLURM) inside docker on an EC2 instance?

I am trying to get OpenFOAM to run on an AWS EC2 cluster using AWS parallelCluster.
One possibility is to compile OpenFOAM. Another is to use a docker container. I am trying to get the second option to work.
However, I am running into trouble understanding how I should orchestrate the various operations. Basically, what I need is:
copy an OpenFOAM case from S3 to the FSx file system on the master node
run the Docker container containing OpenFOAM
perform OpenFOAM operations, some of them using the cluster (running the computation in parallel being the most important one)
I want to put all of this into scripts to make it reproducible, but I am wondering how I should structure the scripts so that SLURM handles the parallel side of things.
My problem at the moment is that the master node's shell knows commands such as sbatch, but when I launch Docker to access the OpenFOAM command, it "forgets" the sbatch commands.
How could I easily export all SLURM-related commands (sbatch, ...) into Docker? Is this the correct way to handle the problem?
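One common pattern is to keep SLURM on the host side: submit with sbatch from the master node and let the job script invoke the container for the OpenFOAM step, rather than calling sbatch from inside the container (the image has neither the SLURM binaries nor its cluster configuration). A sketch of such a job script, where the image name, case path, solver, and core count are all placeholder assumptions:

```shell
#!/bin/bash
#SBATCH --job-name=openfoam-case
#SBATCH --ntasks=36

# Hypothetical image and path -- substitute your own.
IMAGE=my-openfoam-image
CASE_DIR=/fsx/my-openfoam-case

# sbatch/srun run on the host; only the solver runs inside the container.
srun docker run --rm -v "$CASE_DIR":/case -w /case "$IMAGE" \
    simpleFoam -parallel
```

As the answer below notes, Docker is a poor fit for multi-node MPI jobs; an HPC container runtime such as Singularity/Apptainer slots into the same pattern more naturally (`srun singularity exec image.sif simpleFoam -parallel`).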
Thanks for the support
For the first option there is a workshop that walks you through it: cfd-on-pcluster.
For the second option, I created a container workshop that uses HPC container runtimes: containers-on-pcluster.
I incorporated a section about GROMACS but I am happy to add OpenFOAM as well. I am using Spack to create the container images. While I only documented single-node runs, we can certainly add multi-node runs.
Running Docker via sbatch is not going to get you very far, because Docker is not a user-land runtime. For more info: FOSDEM21 Talk about Containers in HPC
Cheers
Christian (full disclosure: AWS Developer Advocate HPC/Batch)

Docker Container not running simultaneously

I am trying to run multiple Docker containers in a single network, but as soon as I reach 7 containers this problem starts: if I start one container, another exits automatically. I have increased memory to 3 GB and CPU to 100%. This looks like a resource problem, but how do we solve it?
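A first diagnostic step is to check whether the containers are being OOM-killed (`docker inspect --format '{{.State.OOMKilled}}' <name>` on an exited container) and what the running ones actually consume (`docker stats --no-stream`). A sketch of summing a stats snapshot, using invented sample figures in place of real output:

```shell
# Sample per-container memory figures standing in for real
# `docker stats --no-stream` output -- the numbers are made up.
sample='web1 410MiB
web2 420MiB
web3 415MiB'

# Sum the MiB column; if the total approaches the 3 GB limit,
# the exits are an out-of-memory problem, not a Docker bug.
total=$(printf '%s\n' "$sample" | awk '{gsub(/MiB/,"",$2); sum+=$2} END {print sum}')
echo "${total}MiB"
```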

Docker performance IO

I'm running a benchmark to test IO runtime differences between Docker containers and its host, and I noticed something strange. I've performed random writes/reads.
The storage driver for the container is aufs.
If the file to be written/read is smaller than or equal to 1 GB, Docker is faster than the host; if the file is bigger, Docker is slower.
Why do I get those results for small files?
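One plausible explanation (an assumption, not something the numbers above prove) is that files up to about 1 GB fit in the page cache, so the benchmark partly measures RAM rather than the aufs layer hitting disk; forcing writeback makes the host/container comparison fairer. A sketch using `dd` with `conv=fsync`, with an arbitrary file name and size:

```shell
# Write 64 MiB and force it to disk before dd returns (conv=fsync),
# so the page cache cannot flatter either side of the comparison.
dd if=/dev/zero of=bench.tmp bs=1M count=64 conv=fsync 2>/dev/null
size=$(stat -c %s bench.tmp)   # GNU stat; use `stat -f %z` on BSD/macOS
echo "$size"
rm -f bench.tmp
```

Timing this command (e.g. with `time`) inside and outside the container should narrow the gap for small files if caching is the cause.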

Docker container running Mesos cluster and running other docker containers on cluster (using Marathon)

I'm just starting off with Mesos, Docker and Marathon but I can't find anywhere where this specific question is answered.
I want to set up a Mesos cluster running on Docker - there are a couple of internet resources to do this, but then I want to run Docker containers on top of Mesos itself. This would then mean Docker containers running inside other Docker containers.
Is there a problem with this? It doesn't intuitively seem right somehow, but it would be really handy. Ideally I want to run a Mesos cluster (with Marathon, Chronos, etc.) and then run Hadoop within Docker containers on top of that. Is this possible or a standard way of doing things? Any other suggestions as to what good practice is would be appreciated.
Thanks
You should be able to run it, taking care of some issues when running the Mesos (with Docker) containers, like running in privileged mode. Take a look at jpetazzo/dind to see how you can install and run Docker in Docker. Then you can set up Mesos in that container to have one container with Mesos and Docker installed.
There are some references on the Internet similar to what you want to do. Check this article and this project, which I think you will find very interesting.
There are definitely people running Mesos in docker containers, but you'll need to use privileged mode and set up some volumes if you want mesos to access the outer docker binary (see this thread).
Current biggest caveat: don't name your mesos-slave containers "mesos-*" or MESOS-2016 will bite you. See epic MESOS-2115 for other remaining issues related to running mesos-slave in Docker containers.
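The two approaches mentioned above boil down to a couple of commands; a sketch, where `my-mesos-slave-image` is a hypothetical image name and a running Docker daemon is assumed:

```shell
# Docker-in-Docker: the inner daemon needs extended kernel capabilities,
# hence --privileged. Image name is the one from the linked project.
docker run --privileged -d --name mesos-dind jpetazzo/dind

# Lighter alternative: share the host's Docker socket and binary, so
# containers started "inside" are really siblings of the outer container,
# all managed by the host daemon.
docker run -d -v /var/run/docker.sock:/var/run/docker.sock \
    -v "$(which docker)":/usr/bin/docker my-mesos-slave-image
```

The socket-sharing variant avoids nested daemons entirely, which is why many Mesos-on-Docker setups prefer it despite the reduced isolation.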

How many Docker containers can I run simultaneously on a single host?

I am new to LXC and Docker. Does the maximum container count depend solely on CPU and RAM, or are there other factors associated with running multiple containers simultaneously?
As mentioned in the comments to your question, it will largely depend on the requirements of the applications inside the containers.
What follows is anecdotal data I collected for this answer (this is on a MacBook Pro with 8 cores and 16 GB of RAM, with Docker running in a 2 GB boot2docker VM in VirtualBox, using 2 of the MBP's cores):
I was able to launch 242 (idle) redis containers before getting:
2014/06/30 08:07:58 Error: Cannot start container c4b49372111c45ae30bb4e7edb322dbffad8b47c5fa6eafad890e8df4b347ffa: pipe2: too many open files
After that, top inside the VM reports CPU use at around 30%-55% user and 10%-12% system (every redis process seems to use 0.2%). Also, I get timeouts while trying to connect to a redis server.
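The `pipe2: too many open files` error above is the Docker daemon hitting its open-file-descriptor limit, not CPU or RAM exhaustion, which is why the ceiling appeared well before the VM ran out of memory. Checking and raising that limit (the 65536 figure is just a common choice, not a recommendation from the answer):

```shell
# Current soft limit on open file descriptors for this shell.
nofile=$(ulimit -n)
echo "$nofile"
# To raise it for the daemon: `ulimit -n 65536` in its start-up script,
# or `LimitNOFILE=65536` in the systemd unit that starts dockerd.
```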
