How to run multiple unique parallel jobs on independent nodes using Slurm with a master / agent setup

I have a physical model optimization program that uses a master / agent design to run unique parameterizations of the model across multiple nodes in parallel. I reserve the nodes and create the working directories using a batch script that ultimately calls the optimization software (PEST++) with an srun --multi-prog pest.conf command. The optimization program then calls a bash script which ultimately calls the model executable. I've been using something like srun -n 20 process.exe, but I keep getting a "step creation temporarily disabled" error.
So the workflow is: (1) call the batch script, which sets up the directories and creates the multi-prog .conf file:
#SBATCH -N 4
#SBATCH --hint=nomultithread
#SBATCH -p workq
#SBATCH --time=1:00:00
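(The rest of the batch script is not shown in the question; as a rough sketch, the part that builds the directories and launches the configuration might look like the following. The worker count, the directory layout and the write_pest_conf.sh helper are placeholders, not the actual code.)
# Sketch only -- not the author's actual script: make one working directory
# per worker, generate pest.conf (its contents are shown in step 2 below),
# then launch the manager and the workers together as a single job step.
export WORKER_DIR=$PWD/worker_              # assumed layout: worker_1 ... worker_3
for i in 1 2 3; do mkdir -p "${WORKER_DIR}${i}"; done

./write_pest_conf.sh > pest.conf            # hypothetical helper that emits the lines in step 2

srun --ntasks=4 --multi-prog pest.conf      # one task per line/range in pest.conf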
(2) The resulting multi-prog pest.conf script looks like this:
0 bash -c 'cd /caldera/projects/usgs/water/waiee/wrftest/base_pp_dir_3593956 && pestpp-glm wrftest.v2.pst /h :10497'
1-3 bash -c 'cd ${WORKER_DIR}${SLURM_PROCID} && pestpp-glm wrftest.v2.pst /h nid00413:10497'
(3) wrftest.v2.pst calls a bash script which ultimately calls the model:
printf "Running WRF-H \n"
srun -n 20 ./wrf_hydro_NoahMP.exe
wait
printf "Finished WRF-H Run.\n\n"
Simply calling srun -n 20 ./wrf_hydro.exe from the command line works as expected, so I'm wondering whether Slurm isn't recognizing the final srun command, and whether that is what causes the "step creation temporarily disabled" error.

Related

Why doesn't SLURM provide the nodes as requested?

My setup is a cluster consisting of 3 PCs (Raspbian with Slurm 18), all connected together with shared file storage mounted as /storage.
The task file is /storage/multiple_hello.sh:
#!/bin/bash
#SBATCH --ntasks-per-node=1
#SBATCH --nodes=3
#SBATCH --ntasks=3
cd /storage
srun echo "Hello World from $(hostname)" >> ./"$SLURM_JOB_ID"_$(hostname).txt
It is run as sbatch /storage/multiple_hello.sh, and the expected outcome is the creation of 3 files in /storage named 120_node1.txt, 121_node2.txt and 122_node3.txt (arbitrary job numbers), since:
3 nodes were requested
3 tasks were requested
a limit of 1 task per node was set
Actual output: only one file was created, 120_node1.txt
How to make it work as intended?
Weirdly enough, srun --nodes=3 hostname works as expected and returns:
node1
node2
node3
To get the expected result, modify the last line as
srun bash -c 'echo "Hello World from $(hostname)" >> ./"$SLURM_JOB_ID"_$(hostname).txt'
Bash parses that line differently from what you are expecting. First, $(hostname) and $SLURM_JOB_ID are expanded on the first node of the allocation (the one that runs the submission script), then srun is run, and its output is appended to the file. You need to make explicit that the redirection >> is part of what you want srun to run. With the above solution, the variable and command expansions, as well as the redirection, are done on each node.
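For completeness, the whole submission script with that fix applied would be (same options as above, only the last line changes):
#!/bin/bash
#SBATCH --ntasks-per-node=1
#SBATCH --nodes=3
#SBATCH --ntasks=3
cd /storage
# Quote the command so that expansion and the >> redirection happen on each node:
srun bash -c 'echo "Hello World from $(hostname)" >> ./"$SLURM_JOB_ID"_$(hostname).txt'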

How do I create a new directory for a Slurm job prior to setting the working directory?

I want to create a unique directory for each Slurm job I run. However, mkdir appears to interrupt SBATCH commands. E.g. when I try:
#!/bin/bash
#SBATCH blah blah other Slurm commands
mkdir /path/to/my_dir_$SLURM_JOB_ID
#SBATCH --chdir=/path/to/my_dir_$SLURM_JOB_ID
touch test.txt
...the Slurm execution faithfully creates the directory at /path/to/my_dir_$SLURM_JOB_ID, but skips over the --chdir command and executes the sbatch script from the working directory the batch was called from.
Is there a way to create a unique directory for the output of a job and set the working directory there within a single sbatch script?
First off, the #SBATCH options must be at the top of the file; quoting the documentation:
before any executable commands
So it is expected behaviour that --chdir is not honoured in this case. The rationale is that the #SBATCH options, and --chdir in particular, are used by Slurm to set up the environment in which the job starts. That environment must be decided before the job starts and cannot be modified by Slurm afterwards.
For similar reasons, environment variables are not processed in #SBATCH options; they are simply ignored by Bash because they sit on a commented line, and Slurm makes no effort to expand them itself.
Also note that --chdir is used to
Set the working directory of the batch script to directory before it is executed.
and that directory must exist. Slurm will not create it for you.
What you need to do is call the cd command in your script.
#!/bin/bash
#SBATCH blah blah other Slurm commands
WORKDIR=/path/to/my_dir_$SLURM_JOB_ID
mkdir -p "$WORKDIR" && cd "$WORKDIR" || exit -1
touch test.txt
Note the exit -1 so that if the directory creation fails, your job stops rather than continuing in the submission directory.
As a side note, it is often useful to add a set -euo pipefail line to your script. It makes sure the script stops if any command in it fails (or if an unset variable is referenced).
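Putting that side note into practice, the script could look like the following sketch (the #SBATCH line stands in for whatever options you already use); with set -e in place, the explicit exit is no longer needed:
#!/bin/bash
#SBATCH blah blah other Slurm commands
set -euo pipefail                      # stop on any failing command, unset variable, or broken pipe

WORKDIR=/path/to/my_dir_$SLURM_JOB_ID
mkdir -p "$WORKDIR"                    # create the unique directory...
cd "$WORKDIR"                          # ...and move into it; set -e aborts the job if either step fails
touch test.txt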

How to adjust a bash file to execute on a single node

I would like to know whether it is possible (and if so, how) to adjust the bash file below.
I have a principal Matlab script main.m, which in turn calls another Matlab script f.m.
f.m should be executed many times with different inputs.
I structure this as an array job.
I typically use the following bash file, called td.sh, to execute the array job on my university's HPC cluster:
#$ -S /bin/bash
#$ -l h_vmem=5G
#$ -l tmem=5G
#$ -l h_rt=480:0:0
#$ -cwd
#$ -j y
#Run 237 tasks where each task has a different $SGE_TASK_ID ranging from 1 to 237
#$ -t 1-237
#$ -N mod
date
hostname
#Output the Task ID
echo "Task ID is $SGE_TASK_ID"
/share/[...]/matlab -nodisplay -nodesktop -nojvm -nosplash -r "main; ID = $SGE_TASK_ID; f; exit"
What I do in the terminal is
cd to the folder where the scripts main.m, f.m, td.sh are located
type qsub td.sh in the terminal
Question: I need to change the bash file above because the script f.m calls a solver (Gurobi) whose license is single node single user. This is what I have been told:
" This license has been installed already and works only on node A.
You will not be able to qsub your scripts as the jobs have to run on this node.
Instead you should ssh into node A and run the job on this node directly instead
of submitting to the scheduler. "
Could you guide me through understanding how I should change the bash file above? In particular, how should I force the execution onto node A?
Even though I am restricted to one node only, am I still able to parallelise using array jobs? Or are array jobs by definition executed on multiple nodes?
If you cannot use your scheduler, then you cannot use its array jobs, and you will have to find another way to parallelize those jobs. Array jobs are not, by definition, executed on multiple nodes (though they usually end up spread across multiple nodes depending on resource availability).
Regarding the adaptation of your script, just follow the guidelines provided by your sysadmins: forget about SGE and start your computations through ssh, directly on the node you were told to use:
date
hostname
for TASK_ID in {1..237}
do
#Output the Task ID
echo "Task ID is $TASK_ID"
ssh user@A "/share/[...]/matlab -nodisplay -nodesktop -nojvm -nosplash -r \"main; ID = $TASK_ID; f; exit\""
done
If the license is single node and single user (but allows multiple simultaneous executions), you can try to parallelize the computations. You will have to take into account the resources available on node A (number of CPUs, memory, ...) and the resources needed by each single execution, and then start as many runs simultaneously as possible without overloading the node (otherwise they will take longer or even fail).
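As a rough sketch of that idea, run directly on node A after ssh-ing in (the limit of 8 simultaneous MATLAB runs is an assumption to be tuned to node A's CPUs and memory):
#!/bin/bash
# Sketch: run the 237 tasks on node A itself, at most MAXJOBS at a time.
MAXJOBS=8                                 # assumed concurrency limit -- adjust to node A's resources
for TASK_ID in {1..237}
do
    echo "Starting task $TASK_ID"
    /share/[...]/matlab -nodisplay -nodesktop -nojvm -nosplash \
        -r "main; ID = $TASK_ID; f; exit" &
    # Throttle: wait while MAXJOBS runs are already in flight
    while [ "$(jobs -rp | wc -l)" -ge "$MAXJOBS" ]; do
        sleep 10
    done
done
wait                                      # let the last batch of runs finish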

How can I convert my script for submitting SLURM jobs from Bash to Perl?

I have the following Bash script for job submission to SLURM on a cluster:
#!/bin/bash
#SBATCH -A 1234
#SBATCH -t 2-00:00
#SBATCH -n 24
module add xxx
srun resp.com
The #SBATCH lines are SLURM commands:
#SBATCH -A 1234 is the project number (1234)
#SBATCH -t 2-00:00 is the job time
#SBATCH -n 24 is the number of cores
module add xxx loads the Environment Module xxx (in this case I'm actually using module add gaussian, where gaussian is a computational quantum-chemistry program).
srun is the SLURM command to launch a job. resp.com includes commands for gaussian and atom coordinates.
I tried converting the Bash script to the following Perl script, but it didn't work. How can I do this in Perl?
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
system ("#SBATCH -A 1234");
system ("#SBATCH -t 2-00:00");
system ("#SBATCH -n 24");
system ("module add xxx");
system ("srun resp.com ");
Each of your system calls creates a child process to run the program in question and returns when the child process dies.
The whole point of module is to configure the current shell by, among other things, modifying its environment. When that process completes (dies), say goodbye to those changes. The call to srun, in its shiny new process with a shiny new environment, hasn't got a chance.
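A quick shell illustration of the same effect (each system call behaves like the bash -c child here):
unset FOO                       # make sure FOO is not set in the parent shell
bash -c 'export FOO=bar'        # the child process sets FOO, then exits
echo "FOO is: ${FOO:-unset}"    # prints "FOO is: unset" -- the parent never sees the change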
Steps forward:
Understand SLURM & bash and exactly why system("#SBATCH whatever"); might not be of any value. Hint: # marks the beginning of a comment in both Bash & Perl.
Understand what module add is doing with xxx and how you might replicate what it's doing inside the shell within the Perl interpreter. ThisSuitIsBlackNot recommends use Env::Modulecmd { load => 'foo/1.0' }; to replicate this functionality.
Barring any understanding of module add, system ('module add xxx; srun resp.com') would put those two commands in the same shell process, but at this point you need to ask yourself what you've gained by adding a Perl interpreter to the mix.
What you need to do is write
#!/usr/bin/perl
#SBATCH -A 1234
#SBATCH -t 2-00:00
#SBATCH -n 24
use strict;
use warnings;
use diagnostics;
system ("module add xxx && srun resp.com ");
and then submit it with
sbatch my_perl_script.pl
The #SBATCH lines are comments destined to be parsed by the sbatch command. They must be comments in the submission script.
The module command modifies the environment, but that environment is lost as soon as the call returns if you invoke module on its own with system, because system creates a subshell. You need to either invoke it in the same subshell as srun, as shown above, or use Perl tools to load the module into the environment of the Perl script itself so that it is available to srun, with use Env::Modulecmd { load => 'foo/1.0' }; as mentioned elsewhere.

Providing standard input to a Fortran code running on a cluster running SLURM

I have a code that I have successfully installed on several computing clusters that use a PBS queuing system; however, I have hit a substantial stumbling block installing it on a cluster that uses the SLURM queuing system. The bulk of the code runs fine, but it needs to be provided with its filename (which changes with each calculation), and it expects to receive it on standard input:
character*8 name
read (5,'(a8)') name
and I provide this standard input to the cluster by:
srun_ps $1/$2.exe << EOD
$2
EOD
where $1 is the path of the executable, $2 is the filename, and srun_ps seems to be the cluster's own mpi-exec wrapper script. For the record, this bit of code works fine on the clusters I have used with a PBS queuing system.
However what I get out here is an "end-of-file during read, unit 5, file stdin" error.
Also if I run a similar command on the command line of the login server (where the jobs are submitted through):
! helloworld.for
      character*5 name
      read (5,'(a5)') name
      write (6,'(a5)') name
      end
command line:
ifort -o helloworld.exe helloworld.for
./helloworld.exe << EOD
hello
EOD
provides the correct output of "hello". If I submit the same job to the cluster I again get an "end-of-file" error.
The full job submission script is:
#!/bin/bash
#SBATCH -o /home/Simulation/file.job.o
#SBATCH -D /home/Simulation/
#SBATCH -J file.job
#SBATCH --clusters=mpp1
#SBATCH --get-user-env
#SBATCH --ntasks=12
#SBATCH --time=1:00:00
source /etc/profile.d/modules.sh
/home/script/runjob /home/Simulation/ file
and relevant part of the runjob script is (the rest of the script is copying relevant input files, and file clean up after the calculation has completed):
#!/bin/sh
time srun_ps $1/$2.exe << EOD
$2
EOD
I realise this is probably an entirely too specific problem, but any advice would be appreciated.
David.
Try adding a line such as
#SBATCH -i filename
to your job submission script, replacing filename by whatever cryptic macro ($3 or whatever) will be expanded when you submit the script. Or, you might put this in your srun command, something like
srun_ps -i filename $1/$2.exe
but I admit to some confusion about what gets called when in your scripts.
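For example, a variant of the runjob script along those lines might be the sketch below. It assumes the site's srun_ps wrapper passes srun's --input option through to srun, which is worth checking with the admins:
#!/bin/sh
# Sketch: write the model name to a small file and hand it to the tasks as
# standard input via --input instead of the heredoc (assumes srun_ps forwards
# srun options such as --input; $1 = executable path, $2 = filename).
echo "$2" > "$2.stdin"
time srun_ps --input="$2.stdin" "$1/$2.exe"
rm -f "$2.stdin"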
