Using a for loop with qsub for batch job submission

Could I please be advised how I could use a for loop to qsub files for batch job submission?
At the moment, this only works if I submit a single file for job submission using the command:
qsub -v /path/to/file.txt script.sh
However if I run a for loop through files using the following commands:
files=`pwd`/*pattern*  # this gives a list of files whose names contain a certain common pattern
for i in $files;
do
qsub -v $i script.sh
done
This always gets rejected with the error that the file.txt was not provided.
I have double checked if $i from the for loop is providing the right file.txt by doing:
for i in $files;
do
echo $i
done
and this works out fine. As such I am unsure why the for loop with qsub is not working. Could I please get advice on how I could alter the code to get it to work?
Thanks.

Using -v requires you to give the variable a name: qsub -v filepath=$i script.sh, after which you can access the file path inside script.sh as $filepath.
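For example, a minimal corrected loop (the variable name filepath is just an illustration; use whatever name script.sh actually reads):
for i in `pwd`/*pattern*; do
    # pass the file path into the job script as $filepath
    qsub -v filepath="$i" script.sh
done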

Related

Hold remainder of shell script commands until PBS qsub array job completes

I am very new to shell scripting, and I am trying to write a shell pipeline that submits multiple qsub jobs but has several commands to run in between these qsubs, each contingent on the most recent job completing. I have been researching multiple ways to hold the shell script from proceeding after submission of a qsub job, but none have been successful.
The simplest chunk of code I can provide to illustrate the issue is as follows:
THREADS=`wc -l < list1.txt`
qsub -V -t 1-$THREADS firstjob.sh
echo "firstjob.sh completed"
There are obviously other lines of code after this that are actually contingent on firstjob.sh finishing, but I have omitted them here for clarity. I have tried the following methods of pausing/holding the script:
1) Only using wait, which is supposed to stop the script until all background programs are completed. This pushed right past the wait and printed the echo statement to the terminal while the array job was still running. My guess is this happens because once the job is submitted, qsub exits and wait thinks it has completed?
qsub -V -t 1-$THREADS firstjob.sh
wait
echo "firstjob.sh completed"
2) Setting the job to a variable, echoing that variable to show the job ID, and passing the entire job ID to wait to pause. The final echo should not run until all elements of the array job have completed. The error message is shown following the code, within the code block.
job1=$(qsub -V -t 1-$THREADS firstjob.sh)
echo "$job1"
wait $job1
echo "firstjob.sh completed"
####ERROR RECEIVED####
-bash: wait: `4585057[].cluster-name.local': not a pid or valid job spec
3) Using the -sync y option of qsub. This should prevent qsub from exiting until the job is complete, acting as an effective pause... or so I had hoped. The error message follows the commands. For some reason it is not reading the -sync option correctly?
qsub -V -sync y -t 1-$THREADS firstjob.sh
echo "firstjob.sh completed"
####ERROR RECEIVED####
qsub: script file 'y' cannot be loaded - No such file or directory
4) Using a dummy shell script (the dummy just makes an empty file) so that I could use the -W depend=afterok: option of qsub to pause the script. This again pushes right past to the echo statement without any pause for submitting the dummy script. Both jobs get submitted, one right after the other, no pause.
job1=$(qsub -V -t 1-$THREADS demux.sh)
echo "$job1"
check=$(qsub -V -W depend=afterok:$job1 dummy.sh)
echo "$check"
echo "firstjob.sh completed"
Some further details regarding the script:
Each job submission is an array job.
The pipeline is being run in the terminal using a command resembling the following, so that I may provide it with 3 inputs: source Pipeline.sh -r list1.txt -d /workingDir/ -s list2.txt
I am certain that firstjob.sh has not actually completed running, because I can see the jobs in the queue when I use showq.
Perhaps there is an easy fix in most of these scenarios, but being new to all this, I am really struggling. I have to use this method in 8-10 places throughout the script, so it is really hindering progress. Would appreciate any assistance. Thanks.
POST EDIT 1
Here is the code contained in firstjob.sh, though I doubt it will help. Everything in it functions as expected and always produces the correct results.
#! /bin/bash
#PBS -S /bin/bash
#PBS -N demux
#PBS -l walltime=72:00:00
#PBS -j oe
#PBS -l nodes=1:ppn=4
#PBS -l mem=15gb
module load biotools
cd ${WORKDIR}/rawFQs/
INFILE=`head -$PBS_ARRAYID ${WORKDIR}${RAWFQ} | tail -1`
BASE=`basename "$INFILE" .fq.gz`
zcat $INFILE | fastx_barcode_splitter.pl --bcfile ${WORKDIR}/rawFQs/DemuxLists/${BASE}_sheet4splitter.txt --prefix ${WORKDIR}/fastqs/ --bol --suffix ".fq"
I just tried using -sync y, and that worked for me, so good idea there... Not sure what's different about your setup.
But a couple other things you could try involve your main script knowing the status of the qsub jobs you're running. One idea is that you could have your main script check the status of your job using qstat and wait until it finishes before proceeding.
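For example, a rough polling sketch (this assumes a Torque/PBS-style qstat; the sleep interval is arbitrary, and some clusters keep finished jobs visible in a completed state for a while, which would extend the wait):
jobid=$(qsub -V -t 1-$THREADS firstjob.sh)
echo "Submitted $jobid"
# poll until qstat no longer lists the job, i.e. it has left the queue
while qstat "$jobid" >/dev/null 2>&1; do
    sleep 60
done
echo "firstjob.sh completed"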
Alternatively, you could have the first job write to a file as its last step (or, as you suggested, set up a dummy job that waits for the first job to finish). Then in your main script, you can test to see whether that file has been written before going on.
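A sketch of the sentinel-file idea (the marker file name is made up; with an array job you would have the dummy afterok job create it, or have each task write its own marker):
qsub -V -t 1-$THREADS firstjob.sh
# block until the job (or its dummy follow-up job) creates the marker file
until [ -e firstjob.done ]; do
    sleep 60
done
echo "firstjob.sh completed"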

Pass command line arguments via sbatch

Suppose that I have the following simple bash script which I want to submit to a batch server through SLURM:
#!/bin/bash
#SBATCH -o "outFile"$1".txt"
#SBATCH -e "errFile"$1".txt"
hostname
exit 0
In this script, I simply want to write the output of hostname on a textfile whose full name I control via the command-line, like so:
login-2:jobs$ sbatch -D `pwd` exampleJob.sh 1
Submitted batch job 203775
Unfortunately, it seems that my last command-line argument (1) is not parsed through sbatch, since the files created do not have the suffix I'm looking for and the string "$1" is interpreted literally:
login-2:jobs$ ls
errFile$1.txt exampleJob.sh outFile$1.txt
I've looked around places in SO and elsewhere, but I haven't had any luck. Essentially what I'm looking for is the equivalent of the -v switch of the qsub utility in Torque-enabled clusters.
Edit: As mentioned in the underlying comment thread, I solved my problem the hard way: instead of having one single script that would be submitted several times to the batch server, each with different command line arguments, I created a "master script" that simply echoed and redirected the same content onto different scripts, the content of each being changed by the command line parameter passed. Then I submitted all of those to my batch server through sbatch. However, this does not answer the original question, so I hesitate to add it as an answer to my question or mark this question solved.
I thought I'd offer some insight because I was also looking for the replacement to the -v option in qsub, which for sbatch can be accomplished using the --export option. I found a nice site here that shows a list of conversions from Torque to Slurm, and it made the transition much smoother.
You can set the environment variable in your shell ahead of time and ask sbatch to propagate it (note that sbatch options must come before the script name, or they are treated as arguments to the script):
$ export var_name='1'
$ sbatch --export=var_name -D `pwd` exampleJob.sh
Or define it directly within the sbatch command, just as qsub allowed:
$ sbatch --export=var_name='1' -D `pwd` exampleJob.sh
Whether this works inside the #SBATCH directives of exampleJob.sh is another question, but I assume it should give the same functionality found in Torque.
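As a rough sketch, a variable propagated with --export is available in the body of the job script, although as far as I know it is not expanded inside the #SBATCH lines themselves (the output pattern below uses Slurm's %j job-ID placeholder instead):
#!/bin/bash
#SBATCH -o slurm-%j.out               # %j is expanded by Slurm; $var_name would not be
hostname > "outFile${var_name}.txt"   # var_name comes from --export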
Using a wrapper is more convenient. I found this solution in this thread.
Basically, the problem is that the #SBATCH directives are seen as comments by the shell, so you can't use the passed arguments in them. Instead, you can use a here document to feed your bash script to sbatch after the arguments have been substituted.
In the case of your question, you can replace the shell script with this:
#!/bin/bash
sbatch <<EOT
#!/bin/bash
#SBATCH -o "outFile"$1".txt"
#SBATCH -e "errFile"$1".txt"
hostname
exit 0
EOT
You then run the shell script like this:
bash [script_name].sh [suffix]
and the output will be saved to outFile[suffix].txt and errFile[suffix].txt.
If you pass your options via the command line, you can bypass the issue of not being able to use command-line arguments in the batch script. For instance, at the command line:
var1="my_error_file.txt"
var2="my_output_file.txt"
sbatch --error=$var1 --output=$var2 batch_script.sh
The lines starting with #SBATCH are not interpreted by bash; they are read as submission options by sbatch.
The sbatch options do not support $1-style variables (only %j and a few other replacement symbols; replacing $1 with %1 will not work).
When you don't have different sbatch processes running in parallel, you could try
#!/bin/bash
touch outFile${1}.txt errFile${1}.txt
rm link_out.sbatch link_err.sbatch 2>/dev/null # remove links from previous runs
ln -s outFile${1}.txt link_out.sbatch
ln -s errFile${1}.txt link_err.sbatch
#SBATCH -o link_out.sbatch
#SBATCH -e link_err.sbatch
hostname
# I do not know about the background processing of sbatch, are the jobs still running
# at this point? When they are, you can not delete the temporary symlinks yet.
exit 0
Alternative:
As you said in a comment yourself, you could create a master script.
This script can contain lines like
cat exampleJob.sh.template | sed -e 's/File.txt/File'$1'.txt/' > exampleJob.sh
# I do not know, is the following needed with sbatch?
chmod +x exampleJob.sh
In your template the #SBATCH lines look like
#SBATCH -o "outFile.txt"
#SBATCH -e "errFile.txt"
This is an old question but I just stumbled into the same task and I think this solution is simpler:
Let's say I have the variable $OUT_PATH in the bash script launch_analysis.bash and I want to pass this variable to task_0_generate_features.sl which is my SLURM file to send the computation to a batch server. I would have the following in launch_analysis.bash:
sbatch --export=OUT_PATH=$OUT_PATH task_0_generate_features.sl
which is then directly accessible in task_0_generate_features.sl.
In @Jason's case we would have:
sbatch -D `pwd` --export=hostname=$hostname exampleJob.sh
Reference: Using Variables in SLURM Jobs
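For completeness, a minimal sketch of what the receiving side might look like (the job name and body of task_0_generate_features.sl are purely illustrative):
#!/bin/bash
#SBATCH -J generate_features     # illustrative job name
# OUT_PATH was propagated from launch_analysis.bash via --export
echo "Writing results to $OUT_PATH"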
Something like this works for me with Torque:
echo "$(pwd)/slurm.qsub 1" | qsub -S /bin/bash -N Slurm-TEST
slurm.qsub:
#!/bin/bash
hostname > outFile${1}.txt 2>errFile${1}.txt
exit 0

Issue with scheduling in Linux

I scheduled a script using the at scheduler in Linux.
The job ran fine, but the echo statements that I had redirected to a file are nowhere to be found.
The at scheduling command is as follows:
at -f /app/data/scripts/func_test.sh >> /app/data/log/log.txt 2>&1 -v 09:50
Can anyone point out the issue with the above command?
I cannot see any echo statements from the script in the log.txt file.
To include shell syntax like I/O redirection, you'll need to either fold it into your script, or pass the input to at via standard input, like so:
at -v 09:50 <<EOF
sh /app/data/scripts/func_test.sh >> /app/data/log/log.txt 2>&1
EOF
If func_test.sh is already executable, you can omit the sh from the beginning of the command; it's there to ensure that you are passing a valid command line to at.
You can also simply ensure that your script itself redirects all its output to a specific log file. As an example,
#!/bin/bash
echo foo
echo bar
becomes
#!/bin/bash
{
echo foo
echo bar
} >> /app/data/log/log.txt 2>&1
Then you can simply run your script with at using
at -f /app/data/scripts/func_test.sh -v 09:50
with no output redirection, because the script itself already redirects all its output to that file.

How to know the PBS batch job submit time inside the script being executed?

I'm using PBS qsub to run a script on a cluster, and the script must output a report file named with the batch job submit time.
The batch job submit time is the time the job joins the PBS batch queue.
I checked all the PBS default variables, but I didn't find anything related to the job submit time.
I would like to know how I can get this time without creating a new input variable.
Thanks.
I figured this out by myself.
Add the following function to your PBS batch job script to get the job submit time.
getsubmitdate(){
    # pull the "qtime" (time the job entered the queue) line out of qstat -f
    # and keep the date/time fields
    local datestring=`qstat -f $PBS_JOBID | grep -F qtime | awk '{for(i=3;i<8;i++) printf $i" "}'`
    # reformat it as YYYYMMDD
    local result=`date -d "$datestring" +%Y%m%d`
    local outputvar=$1
    if [[ "$outputvar" ]] ; then
        # if a variable name was supplied, store the result in it
        eval $outputvar="'$result'"
    else
        # otherwise print it to stdout
        echo "$result"
    fi
}
getsubmitdate SUBMITDATE
echo $SUBMITDATE
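Since the original goal was a report file named with the submit time, an illustrative follow-up (the file name pattern is made up):
# name the report file after the submit date obtained above
REPORT="report_${SUBMITDATE}.txt"
echo "report contents go here" > "$REPORT"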

How to properly pass on an environment variable to Sun Grid Engine?

I'm trying to submit a (series of) jobs to SGE (FWIW, it's a sequence of Gromacs molecular dynamics simulations), in which all the jobs are identical except for a suffix, such as input01, input02, etc. I wrote the commands to run in a way that the suffix is properly handled by the sequence of commands.
However, I can't find a way to get the execution environment to receive that variable. According to the qsub man page, -v var should do it.
$ export i=19
$ export | grep ' i='
declare -x i="19"
$ env | grep '^i='
i=19
Then, I submit the following script (run.sh) to see if it's received:
if [ "x" == "x$i" ]; then
echo "ERROR: \$i not set"
else
echo "SUCCESS: \$i is set"
fi
I submit the job as follows (in the same session as the export command above):
$ qsub -N "test_env" -cwd -v i run.sh
Your job 4606 ("test_env") has been submitted
The error stream is empty, and the output stream has:
$ cat test_env.o4606
ERROR: $i not set
I also tried the following commands, unsuccessfully:
$ qsub -N "test_env" -cwd -v i -V run.sh
$ qsub -N "test_env" -cwd -V run.sh
$ qsub -N "test_env" -cwd -v i=19 -V run.sh
$ qsub -N "test_env" -cwd -v i=19 run.sh
If I add a line i=19 to the beginning of run.sh, then the output is:
$ cat test_env.o4613
SUCCESS: $i is set as 19
I'm now considering generating a separate file per job, each essentially the same but with an i=xx line as the first line. It doesn't look very practical, but it would be a solution.
Would there be a better solution?
What I've always been doing is the following:
##send.sh
export a=10
qsub ./run.sh
and the script run.sh:
##run.sh
#$ -V
echo $a
When I call send.sh, the .o output file contains 10.
Assuming that your variable is just an incrementing counter, you can use array jobs to achieve this. This sets the $SGE_TASK_ID environment variable to the task index, which you can then copy to $i or use directly (see the sketch below).
If the variable is anything else, then I think you'll have to generate multiple job scripts and submit each; that's the "solution" I use when I have loads of jobs with differing parameters.
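A minimal sketch of the array-job approach mentioned above (the task range and input naming are illustrative):
# submit 20 tasks; SGE runs run.sh once per task with $SGE_TASK_ID set to 1..20
qsub -N "test_env" -cwd -t 1-20 run.sh
# inside run.sh, the task index can stand in for $i:
i=$(printf '%02d' "$SGE_TASK_ID")
echo "processing input${i}"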
I'm not certain you can pass variables by their name through qsub. I've had success with passing values (you should probably write a front-end script for this instead of doing it interactively):
$ export ii=19
$ qsub -N "test_env" -cwd -v i=$ii run.sh
