How to run programs from a bash script on different GPUs? - bash

I usually run two separate jobs (program1 and program2) on two different GPUs.
I would like to be able to run these two jobs from a single bash script but still on two different GPUs with a slurm .out file for each programs. Is this possible?
#!/bin/bash -l
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:v100:1
#SBATCH --mem=90g
#SBATCH --cpus-per-task=6 -N 1
program1
#!/bin/bash -l
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:v100:1
#SBATCH --mem=90g
#SBATCH --cpus-per-task=6 -N 1
program2
The script below seems to run both programs on the same GPU with a single .out file as output.
#!/bin/bash -l
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:v100:1
#SBATCH --mem=90g
#SBATCH --cpus-per-task=6 -N 1
program1 &
program2 &
wait
Thanks for your help.

First way
You could write a submit script that gets the name of the executable as a command line argument and another script that calls the submit script. The submit script "submit.sh" could look like this:
#!/bin/bash -l
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:v100:1
#SBATCH --mem=90g
#SBATCH --cpus-per-task=6 -N 1
$1
The second script "run_all.sh" could look like this:
#!/bin/bash
sbatch submit.sh program1
sbatch submit.sh program2
Now you can start your jobs with:$ ./run_all.sh
Second way
You don't have to use scripts to provide all the information for slurm. It is possible to pass the job information as arguments from the sbatch call: sbatch [OPTIONS(0)...] [ : [OPTIONS(N)...]] script(0) [args(0)...]
A script like this then could be useful:
#!/bin/bash -l
slurm_opt= --time=1:00:00 --gres=gpu:v100:1 --mem=90g --cpus-per-task=6 -N 1 --wrap
sbatch $slurm_opt program1
sbatch $slurm_opt program2
Note the --wrap option. It allows you to have any executable not just a script after it.

Related

Add exception to slurm array

I have the following slurm script:
#!/bin/bash
#SBATCH -A XXX-CPU
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH -p cclake
#SBATCH -D analyses/
#SBATCH -c 12
#SBATCH -t 01:00:00
#SBATCH --mem=10G
#SBATCH -J splitBAM
#SBATCH -a 1-12
#SBATCH -o analyses/splitBAM_%a.log
sed -n ${SLURM_ARRAY_TASK_ID}p analyses/slurm/commands1.csv | bash
sed -n ${SLURM_ARRAY_TASK_ID}p analyses/slurm/commands2.csv | bash
sed -n ${SLURM_ARRAY_TASK_ID}p analyses/slurm/commands3.csv | bash
Normally I would run another slurm script with a single bash command to remove some files with the option #SBATCH --dependency=afterok:job_id(first job).
What I want to do is include this in the above script, but when I add the line rm file1 file2 file3 it will obviously do this for each job in the array, but I only want to run the command once after all the jobs in the array have finished.
Is there a way to mark this command so that is not part of the array? This would allow me to do all the things with one script instead of two.
There is not specific Slurm syntax to do that, but you can add an if statement at the end of the script, looking for are any other jobs from that array.
if [[ $(squeue -h -j $SLURM_JOB_ID | wc -l) == 1 ]] ; then
rm file1 file2 file3
fi
If there is no other job running from that job array, then delete the files.

How to correctly submit an array of jobs using slurm

I'm trying to submit an array of jobs using slurm, but it's not working as I expected. My bash script is test.sh:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=10G
#SBATCH --account=myaccount
#SBATCH --partition=partition
#SBATCH --time=10:00:00
###Array setup here
#SBATCH --array=1-6
#SBATCH --output=test_%a.out
echo TEST MESSAGE 1
echo $SLURM_ARRAY_TASK_ID
python test.py
The test.py code:
print('TEST MESSAGE 2')
I then submitted this job by doing:
sbatch --wrap="bash test.sh"
I'm not even sure if this is how I should run it. Because there are already SBATCH commands in the bash script, should I just be running bash test.sh?
I was expecting that 5 jobs would be submitted and that $SLURM_ARRAY_TASK_ID would increase incrementally, but that's not happening. Just one job is submitting and the output is:
TEST MESSAGE 1
TEST MESSAGE 2
So the $SLURM_ARRAY_TASK_ID never get's printed and seems to be the problem. Can anyone tell me what I'm doing wrong?
You just need to submit the script with sbatch test.sh. Using --wrap in the way you've done it just runs test.sh as a simple Bash script, so none of the Slurm-specific parts are used.

execute bash files when previous ones have finished to run

Hello I would need help
In fact I need to execute several bash files ex:
file1.sh
file2.sh
file3.sh
file4.sh
those file will generate data that will be used for another bash file call final.sh
So in order to gain time I want to execute the fileNb.sh files sumultany on a cluster by doing :
for file in file*.sh; do sbatch $file; done
, and then when all job have been done, I would like to execute automatically the final.sh file.
Does someone have an idea ?
Thank you very much
One clean option is to reorganise the set of jobs as a job array and then add a dependency for final job on the whole array.
Assuming fileN.sh looks like this:
#!/bin/bash
#SBATCH --<some option>
#SBATCH --<some other option>
./my_program input_fileN
you can make this a job array. In a single submission file file.sh, write this
#!/bin/bash
#SBATCH --<some option>
#SBATCH --<some other option>
#SBATCH --array=1-4
./my_program input_file${SLURM_ARRAY_TASK_ID}
Then run
JOBID=$(sbatch --parsable file.sh)
sbatch --dependency after:$JOBID final.sh
In case your jobs cannot be parametrised by an integer directly, create a Bash array like this:
#!/bin/bash
#SBATCH --<some option>
#SBATCH --<some other option>
#SBATCH --array=0-2
ARGS=(SRR63563 SRR63564 SRR63565)
fasterq-dump --threads 10 ${ARGS[$SLURM_ARRAY_TASK_ID]} -O /path1/path2/path3/
You could do:
sbatch --wait file1.sh &
sbatch --wait file2.sh &
sbatch --wait file3.sh &
sbatch --wait file4.sh &
wait
sbatch final.sh
Or, more simply with GNU Parallel:
parallel -j4 sbatch --wait ::: file*.sh
sbatch final.sh
Is this no good?
for file in file*.sh; do sbatch $file; done; ./final.sh

comment in bash script processed by slurm

I am using slurm on a cluster to run jobs and submit a script that looks like below with sbatch:
#!/usr/bin/env bash
#SBATCH -o slurm.sh.out
#SBATCH -p defq
#SBATCH --mail-type=ALL
#SBATCH --mail-user=my.email#something.com
echo "hello"
Can I somehow comment out a #SBATCH line, e.g. the #SBATCH --mail-user=my.email#something.com in this script? Since the slurm instructions are bash comments themselves I would not know how to achieve this.
just add another # at the beginning.
##SBATCH --mail-user...
This will not be processed by Slurm

Change the value of external SLURM variable

I am running a bash script to run jobs on Linux clusters, using SLURM. The relevant part of the script is given below (slurm.sh):
#!/bin/bash
#SBATCH -p parallel
#SBATCH --qos=short
#SBATCH --exclusive
#SBATCH -o out.log
#SBATCH -e err.log
#SBATCH --open-mode=append
#SBATCH --cpus-per-task=1
#SBATCH -J hadoopslurm
#SBATCH --time=01:30:00
#SBATCH --mem-per-cpu=1000
#SBATCH --mail-user=amukherjee708#gmail.com
#SBATCH --mail-type=ALL
#SBATCH -N 5
I am calling this script from another script (ext.sh), a part of which is given below:
#!/bin/bash
for i in {1..3}
do
source slurm.sh
done
..
I want to manipulate the value of the N variable is slurm.sh (#SBATCH -N 5) by setting it to values like 3,6,8 etc, inside the for loop of ext.sh. How do I access the variable programmatically from ext.sh? Please help.
First note that if you simply source the shell script, you will not submit a job to Slurm, you will simply run the job on the submission node. So you need to write
#!/bin/bash
for i in {1..3}
do
sbatch slurm.sh
done
Now if you want to change the -N programmatically, one option is to remove it from the file slurm.sh and add it as argument to the sbatch command:
#!/bin/bash
for i in {1..3}
do
sbatch -N $i slurm.sh
done
The above script will submit three jobs with respectively 1, 2, and 3 nodes requested.

Resources