I am running a bash script to run jobs on Linux clusters, using SLURM. The relevant part of the script is given below (slurm.sh):
#!/bin/bash
#SBATCH -p parallel
#SBATCH --qos=short
#SBATCH --exclusive
#SBATCH -o out.log
#SBATCH -e err.log
#SBATCH --open-mode=append
#SBATCH --cpus-per-task=1
#SBATCH -J hadoopslurm
#SBATCH --time=01:30:00
#SBATCH --mem-per-cpu=1000
#SBATCH --mail-user=amukherjee708#gmail.com
#SBATCH --mail-type=ALL
#SBATCH -N 5
I am calling this script from another script (ext.sh), a part of which is given below:
#!/bin/bash
for i in {1..3}
do
source slurm.sh
done
..
I want to manipulate the value of the N variable in slurm.sh (#SBATCH -N 5) by setting it to values like 3, 6, 8, etc., inside the for loop of ext.sh. How do I access this variable programmatically from ext.sh? Please help.
First note that if you simply source the shell script, you will not submit a job to Slurm; you will simply run the job on the submission node. So you need to write:
#!/bin/bash
for i in {1..3}
do
sbatch slurm.sh
done
Now if you want to change the -N programmatically, one option is to remove it from the file slurm.sh and add it as an argument to the sbatch command:
#!/bin/bash
for i in {1..3}
do
sbatch -N $i slurm.sh
done
The above script will submit three jobs, requesting 1, 2, and 3 nodes respectively.
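If you want the specific node counts mentioned in the question (3, 6, 8) rather than the loop counter, a minimal sketch is to loop over those values directly:
#!/bin/bash
# Submit one job per requested node count; 3, 6 and 8 are the values from the question.
for n in 3 6 8
do
sbatch -N "$n" slurm.sh
done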
Related
I'm trying to submit an array of jobs using slurm, but it's not working as I expected. My bash script is test.sh:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=10G
#SBATCH --account=myaccount
#SBATCH --partition=partition
#SBATCH --time=10:00:00
###Array setup here
#SBATCH --array=1-6
#SBATCH --output=test_%a.out
echo TEST MESSAGE 1
echo $SLURM_ARRAY_TASK_ID
python test.py
The test.py code:
print('TEST MESSAGE 2')
I then submitted this job by doing:
sbatch --wrap="bash test.sh"
I'm not even sure if this is how I should run it. Because there are already SBATCH commands in the bash script, should I just be running bash test.sh?
I was expecting that six jobs would be submitted (one per array index) and that $SLURM_ARRAY_TASK_ID would increase incrementally, but that's not happening. Just one job is submitted, and the output is:
TEST MESSAGE 1
TEST MESSAGE 2
So $SLURM_ARRAY_TASK_ID never gets printed, and that seems to be the problem. Can anyone tell me what I'm doing wrong?
You just need to submit the script with sbatch test.sh. Using --wrap the way you've done it runs test.sh as a plain Bash script inside the wrapped job, so the #SBATCH lines (including --array) are treated as ordinary comments and none of the Slurm-specific parts are used.
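For example, assuming the script above is saved as test.sh, submitting it directly should create one array job with six tasks:
sbatch test.sh
Each task then sees its own SLURM_ARRAY_TASK_ID (1 through 6) and, because of --output=test_%a.out, writes to its own file, test_1.out through test_6.out.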
I usually run two separate jobs (program1 and program2) on two different GPUs.
I would like to be able to run these two jobs from a single bash script, but still on two different GPUs and with a slurm .out file for each program. Is this possible?
#!/bin/bash -l
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:v100:1
#SBATCH --mem=90g
#SBATCH --cpus-per-task=6 -N 1
program1
#!/bin/bash -l
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:v100:1
#SBATCH --mem=90g
#SBATCH --cpus-per-task=6 -N 1
program2
The script below seems to run both programs on the same GPU with a single .out file as output.
#!/bin/bash -l
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:v100:1
#SBATCH --mem=90g
#SBATCH --cpus-per-task=6 -N 1
program1 &
program2 &
wait
Thanks for your help.
First way
You could write a submit script that gets the name of the executable as a command line argument and another script that calls the submit script. The submit script "submit.sh" could look like this:
#!/bin/bash -l
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:v100:1
#SBATCH --mem=90g
#SBATCH --cpus-per-task=6 -N 1
$1
The second script "run_all.sh" could look like this:
#!/bin/bash
sbatch submit.sh program1
sbatch submit.sh program2
Now you can start your jobs with:
./run_all.sh
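If you also want each .out file to be named after its program (by default each job writes to its own slurm-<jobid>.out), one possible variation, with the file names here chosen purely as an illustration, is to set the output file at submission time:
#!/bin/bash
# run_all.sh variant with explicit per-program output files
sbatch --output=program1.out submit.sh program1
sbatch --output=program2.out submit.sh program2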
Second way
You don't have to use a script to provide all the information to Slurm. It is possible to pass the job options as arguments on the sbatch call, following its general synopsis: sbatch [OPTIONS(0)...] [ : [OPTIONS(N)...]] script(0) [args(0)...]
A script like this could then be useful:
#!/bin/bash -l
slurm_opt="--time=1:00:00 --gres=gpu:v100:1 --mem=90g --cpus-per-task=6 -N 1 --wrap"
sbatch $slurm_opt program1
sbatch $slurm_opt program2
Note the --wrap option. It allows you to have any executable, not just a script, after it.
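If the programs take arguments of their own, a single string of options becomes fragile; a slightly more robust sketch (same options, just kept in a Bash array, with the extra flags below being placeholders) could look like this:
#!/bin/bash -l
# Keep the sbatch options in an array so each option stays a separate word.
slurm_opt=(--time=1:00:00 --gres=gpu:v100:1 --mem=90g --cpus-per-task=6 -N 1)
# --wrap takes the full command line as one quoted argument.
sbatch "${slurm_opt[@]}" --wrap="program1 --some-flag"
sbatch "${slurm_opt[@]}" --wrap="program2 --other-flag"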
Hello, I need some help.
In fact, I need to execute several bash files, e.g.:
file1.sh
file2.sh
file3.sh
file4.sh
Those files will generate data that will be used by another bash file called final.sh.
So, in order to save time, I want to execute the fileN.sh files simultaneously on a cluster by doing:
for file in file*.sh; do sbatch $file; done
and then, when all the jobs are done, I would like to automatically execute the final.sh file.
Does someone have an idea?
Thank you very much.
One clean option is to reorganise the set of jobs as a job array and then add a dependency for the final job on the whole array.
Assuming fileN.sh looks like this:
#!/bin/bash
#SBATCH --<some option>
#SBATCH --<some other option>
./my_program input_fileN
you can make this a job array. In a single submission file, file.sh, write this:
#!/bin/bash
#SBATCH --<some option>
#SBATCH --<some other option>
#SBATCH --array=1-4
./my_program input_file${SLURM_ARRAY_TASK_ID}
Then run
JOBID=$(sbatch --parsable file.sh)
sbatch --dependency=afterok:$JOBID final.sh
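A minimal sketch of final.sh for this pattern could be the following (its actual contents are not given in the question, so ./my_final_program is just a placeholder). Because of the afterok dependency, it only starts once every task of the array job has completed successfully:
#!/bin/bash
#SBATCH --<some option>
#SBATCH --<some other option>
# Only runs after the whole array job has finished successfully (afterok).
./my_final_program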
In case your jobs cannot be parametrised by an integer directly, create a Bash array like this:
#!/bin/bash
#SBATCH --<some option>
#SBATCH --<some other option>
#SBATCH --array=0-2
ARGS=(SRR63563 SRR63564 SRR63565)
fasterq-dump --threads 10 ${ARGS[$SLURM_ARRAY_TASK_ID]} -O /path1/path2/path3/
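Submission and the dependency for final.sh then work exactly as in the integer case:
JOBID=$(sbatch --parsable file.sh)
sbatch --dependency=afterok:$JOBID final.sh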
You could do:
sbatch --wait file1.sh &
sbatch --wait file2.sh &
sbatch --wait file3.sh &
sbatch --wait file4.sh &
wait
sbatch final.sh
Or, more simply with GNU Parallel:
parallel -j4 sbatch --wait ::: file*.sh
sbatch final.sh
Is this no good?
for file in file*.sh; do sbatch $file; done; ./final.sh
I have a shell file script.sh with the following commands:
#!/bin/sh
#SBATCH --partition=univ2
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=13
mpirun -n 25 benchmark.out $param
where param is an integer from the set {1, 2, ..., 10}. Here param is a command-line argument that is passed to the executable benchmark.out. I want to create another shell file, master.sh (in the same directory as script.sh), which would contain a loop over param (from 1 to 10), such that on each iteration script.sh is executed with the given value of param. What should this file look like? Thank you.
Master
#!/bin/bash
for param in $(seq 1 10); do
    sbatch script.sh "$param"
done
Script
#!/bin/sh
#SBATCH --partition=univ2
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=13
mpirun -n 25 benchmark.out $1
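An alternative sketch, in the spirit of the job-array answers above, is to drop master.sh entirely and let Slurm iterate over param as an array job (this assumes the ten parameter values really are just 1 to 10):
#!/bin/sh
#SBATCH --partition=univ2
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=13
#SBATCH --array=1-10
mpirun -n 25 benchmark.out $SLURM_ARRAY_TASK_ID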
I am using slurm on a cluster to run jobs, and I submit a script that looks like the one below with sbatch:
#!/usr/bin/env bash
#SBATCH -o slurm.sh.out
#SBATCH -p defq
#SBATCH --mail-type=ALL
#SBATCH --mail-user=my.email#something.com
echo "hello"
Can I somehow comment out a #SBATCH line, e.g. the #SBATCH --mail-user=my.email#something.com line in this script? Since the slurm instructions are bash comments themselves, I don't know how to achieve this.
Just add another # at the beginning:
##SBATCH --mail-user...
That line will then not be processed by Slurm.