Using OpenMP and OpenMPI together under Slurm - openmp

I have written a C++ code that uses both OpenMP and OpenMPI. I want to use (let's say) 3 nodes (so size_Of_Cluster should be 3) and use OpenMP in each node to parallelize the for loop (There are 24 cores in a node). In essence I want MPI ranks be assigned to nodes. The Slurm script I have written is as follows. (I have tried many variations but could not come up with the "correct" one. I would be grateful if you could help me.)
#!/bin/bash
#SBATCH -N 3
#SBATCH -n 72
#SBATCH -p defq
#SBATCH -A akademik
#SBATCH -o %J.out
#SBATCH -e %J.err
#SBATCH --job-name=MIXED
module load slurm
module load shared
module load gcc
module load openmpi
export OMP_NUM_THREADS=24
mpirun -n 3 --bynode ./program
Using srun did not help.

The relevant lines are:
#SBATCH -N 3
#SBATCH -n 72
export OMP_NUM_THREADS=24
This means you have 72 MPI processes, and each creates 24 thread. For that to be efficient you probably need 24x72 cores. Which you don't have. You should specify:
#SBATCH -n 3
Then you will have 3 processes, with 24 threads per process.
You don't have to worry about the placement of the ranks on the nodes: that is done by the run time. You could for instance let each process print MPI_Get_processor_name to confirm.

Related

Repeat one task for 100 times in parallel on slurm

I am new to cluster computation and I want to repeat one empirical experiment 100 times on python. For each experiment, I need to generate a set of data and solve an optimization problem, then I want to obtain the averaged value. To save time, I hope to do it in parallel. For example, suppose I can use 20 cores, I only need to repeat 5 times on each core.
Here's an example of a test.slurm script that I use for running the test.py script on a single core:
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=72:00:00
#SBATCH --mail-type=begin
#SBATCH --mail-type=end
#SBATCH --mail-user=address#email
module purge
module load anaconda3/2018.12
source activate py36
python test.py
If I want to run it in multiple cores, how should I modify the slurm file accordingly?
To run the test on multiple cores, you can use srun -n option. After -n specify the number of processes, you need to launch.
srun -n 20 python test.py
srun is the launcher in slurm.
Or you can change the ntasks, cpus-per-task in slurm file.
The slurm file will look like this:
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --nodes=1
#SBATCH --ntasks=20
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=72:00:00
#SBATCH --mail-type=begin
#SBATCH --mail-type=end
#SBATCH --mail-user=address#email
module purge
module load anaconda3/2018.12
source activate py36
python test.py

SLURM How to start a script once per node

I have a Big Cluster available through SLURM.
I want to start my script e.g. ./calc on every requested node with a specified amount of cores. So for example on 2 nodes, 16 cores each.
I start with sbatch script
#SBATCH -N 2
#SBATCH --ntasks-per-node=16
srun -N 1 ./calc 2 &
srun -N 1 ./clac 2 &
wait
It doesn't work as intended though.
I tried many configurations of --ntask --nodes --cpus-per-task but nothing worked and I'm very lost.
I also don't understand the difference between task and CPUs in SLURM
In your example, you ask slurm to launch 16 tasks per node, on 2 nodes. At the end of the job, slurm will probably runs 8x(srun)x2 nodes tasks.
For your needs, you don't need to specify that you want 2 nodes specifically instead of the jobs have to run on 2 two different nodes.
For your example, run the following sbatch :
#!/bin/bash
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=16
#SBATCH --hint=nomultithread
srun <my program>
In this example, slurm will run 2 times the program with 16 cores. The nomultithread is optional and depends the cluster configuration. If the hyper-threading is activated, this will be 16 virtual cpus.
I found this to be a working solution. It turned out the most important thing was to define all parameters nodes tasks cpus
#!/bin/bash
#SBATCH -N 2
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16
srun -N 1 -n 1 -c 16 ./calc 2 &
srun -N 1 -n 1 -c 16 ./calc 2 &
wait

What does the keyword --exclusive mean in slurm?

This is a follow up question from [How to run jobs in paralell using one slurm batch script?]. The goal was to create a single SBatch-Script, which can start multiple processes and run them in parallel. The Answer given by
damienfrancois was very detailed and looked something like this.
#!/bin/bash
#
#SBATCH --job-name=test
#SBATCH --output=/dev/null
#SBATCH --error=/dev/null
#SBATCH --partition=All
srun -n 1 -c 1 --exclusive sleep 60 &
srun -n 1 -c 1 --exclusive sleep 60 &
....
wait
However, I am not able to understand the exclusive keyword. If I use the keyword, one node of the cluster is chosen and all processes are launched there. However, I would like Slurm to distribute the ["sleeps"/steps] over the entire cluster.
So how does the keyword exclusive work ? According to the Slurm documentaion, the restriction to one node should not happen, since the keyword is used within a step-allocation.
[I am new to Slurm]

How to submit a script partly as array

I'm using Slurm on my lab's server, and I would like to submit a job that looks like this:
#SBATCH ...
mkdir my/file/architecture
echo "#HEADER" > my/file/architecture/output_summary.txt
for f in my/dir/*.csv; do
python3 myscript.py $f
done
Is there any way to run this so that it will complete the first instructions, then run the for loop in parallel? Each step is independant, so they can run at the same time.
The initial steps are not very complex, so if needed I could separate it into a separate SBATCH script. my/dir/ however contains about 7000 csv files to processes, so typing them all out manually would be a pain.
GNU Parallel might be a good fit here, or xargs, though I prefer parallel in Slurm jobs.
Here's an example of an sbatch script running an 8-way parallel:
#!/bin/sh
#SBATCH ...
#SBATCH --nodes=1
#SBATCH --ntasks=
srun="srun --exclusive -N1 -n1"
# -j is the number of tasks parallel runs so we set it to $SLURM_NTASKS
# Note that --ntasks=1 and --cpus-per-task=8 will have srun start one copy of the program at a time. We use "find" to generate a list of files to operate on.
find /my/dir/*.csv -type f | parallel -j $SLURM_NTASKS "$srun python3 myscript.py {}"
The easiest way is to run on a single node, though parallel can use SSH (I believe) to run on multiple computers.

Do I need a single bash file for each task in SLURM?

I am trying to launch several task in a SLURM-managed cluster, and would like to avoid dealing with dozens of files.
Right now, I have 50 tasks (subscripted i, and for simplicity, i is also the input parameter of my program), and for each one a single bash file slurm_run_i.sh which indicates the computations configuration, and the srun command:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH -J pltCV
#SBATCH --mem=30G
srun python plotConvergence.py i
I am then using another bash file to submit all these tasks, slurm_run_all.sh
#!/bin/bash
for i in {1..50}:
sbatch slurm_run_$i.sh
done
This works (50 jobs are running on the cluster), but I find it troublesome to have more than 50 input files. Searching a solution, I came up with the & command, obtaining something as:
#!/bin/bash
#SBATCH --ntasks=50
#SBATCH --cpus-per-task=1
#SBATCH -J pltall
#SBATCH --mem=30G
# Running jobs
srun python plotConvergence.py 1 &
srun python plotConvergence.py 2 &
...
srun python plotConvergence.py 49 &
srun python plotConvergence.py 50 &
wait
echo "All done"
Which seems to run as well. However, I cannot manage each of these jobs independently: the output of squeue shows I have a single job (pltall) running on a single node. As there are only 12 cores on each node in the partition I am working in, I am assuming most of my jobs are waiting on the single node I've been allocated to. Setting the -N option doesn't change anything too.. Moreover, I cannot cancel some jobs individually anymore if I realize there's a mistake or something, which sounds problematic to me.
Is my interpretation right, and is there a better way (I guess) than my attempt to process several jobs in slurm without being lost among many files ?
What you are looking for is the jobs array feature of Slurm.
In your case, you would have a single submission file (slurm_run.sh) like this:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH -J pltCV
#SBATCH --mem=30G
#SBATCH --array=1-50
srun python plotConvergence.py ${SLURM_ARRAY_TASK_ID}
and then submit the array of jobs with
sbatch slurm_run.sh
You will see that you will have 50 jobs submitted. You can cancel all of them at once or one by one. See the man page of sbatch for details.

Resources