I want to pick up a number of models from a folder and use them in an SGE script for an array job. So I do the following in the SGE script:
MODELS=/home/sahil/Codes/bistable/models
numModels=(`ls $MODELS|wc -l`)
echo $numModels
#$ -S /bin/bash
#$ -cwd
#$ -V
#$ -t 1-$[numModels] # Running array job over all files in the models directory.
model=(`ls $MODELS`)
echo "Starting ${model[$SGE_TASK_ID-1]}..."
But I get the following error:
Unable to read script file because of error: Numerical value invalid!
The initial portion of string "$numModels" contains no decimal number
I have also tried to use
#$ -t 1-${numModels}
and
#$ -t 1-(`$numModels`)
but none of these work. Any suggestions or alternate methods are welcome, but they must use the array-job functionality of qsub.
Beware that to Bash, #$ -t 1-$[numModels] is nothing more than a comment, so Bash never applies variable expansion to it; and qsub, which does parse #$ lines, performs no shell expansion of its own.
One option is to pass the -t argument on the command line instead; remove the -t line from your script:
#$ -S /bin/bash
#$ -cwd
#$ -V
model=(`ls $MODELS`)
echo "Starting ${model[$SGE_TASK_ID-1]}..."
and submit the script with the following command (thanks to the #$ -V directive, the MODELS variable set on the command line is exported into the job's environment):
MODELS=/home/sahil/Codes/bistable/models qsub -t 1-$(ls $MODELS|wc -l) submit.sh
If you prefer a self-contained submission script, another option is to feed the whole script to qsub through stdin with a here document, computing the task range before submitting:
#!/bin/bash
MODELS=/home/sahil/Codes/bistable/models
numModels=$(ls $MODELS | wc -l)
echo $numModels

qsub <<EOT
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -V
#$ -t 1-$numModels
model=( \$(ls $MODELS) )
echo "Starting \${model[\$SGE_TASK_ID-1]}..."
EOT
Then you execute that script directly to submit your job array (./submit.sh rather than qsub submit.sh), since the qsub command is part of the script itself. The model count and the path are expanded by the submitting shell when the here document is built, while the expansions escaped with a backslash (\$) are left intact for the job to evaluate at run time.
I am running Sun Grid Engine to submit jobs, and I want a single bash script that submits whatever file I need to run, instead of a different qsub command with a different bash file for each job. I have been able to generate output and error files that share the name of the input file, but now I am struggling to set a different job name for each file. My approach has been the following:
#!/bin/bash
#
#$ -cwd
#$ -S /bin/bash
#$ -N $1
#
python -u $1 >/output_dir/$1.out 2>/error_dir/$1.error
This way, running qsub send_to_sge.sh foo executes the program and creates the files foo.error and foo.out with the errors and printouts, respectively. However, the job appears in the SGE queue under the literal name $1. Instead, I would like the job name to be foo. Is there any way to achieve this?
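A common fix, sketched here rather than taken from the original thread: Bash never expands $1 inside a #$ directive, but options given on the qsub command line override the directives embedded in the script, so you can set the name at submission time from a small wrapper (submit_wrapper.sh is a hypothetical name):
#!/bin/bash
# submit_wrapper.sh (hypothetical) -- "#$" directives are comments to bash,
# but qsub command-line flags override them, so set the job name here.
qsub -N "$1" send_to_sge.sh "$1"
Running ./submit_wrapper.sh foo then queues the job under the name foo.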
I have the following shell script.
#!/bin/bash --login
#BSUB -q q_ab_mpc_work
#BSUB -J psipred
#BSUB -W 01:00
#BSUB -n 64
#BSUB -o psipred.out
#BSUB -e psipred.err
module load compiler/gnu-4.8.0
module load R/3.0.1
export OMP_NUM_THREADS=4
code=${HOME}/Phd/script_dev/rfpipeline.sh
MYPATH=$HOME/Phd/script_dev/
cd ${MYPATH}
${code} myfile.txt
which I can submit to the cluster with bsub:
bsub < myprogram.sh
However, if I change the last line of my script to:
${code} $1
so that the input file is specified by a command-line argument, how can I pass that argument through bsub?
I have tried:
bsub < myprogram.sh myfile.text
but bsub will not accept myfile.text as an argument to the script.
I have also tried
bsub <<< myprogram.sh myfile.text
./myprogram.sh myfile.text | bsub
bsub "sh ./myprogram.sh myfile.text"
What do I need to do?
Can I answer my own question?
It seems that I can use sed to modify the file on the fly. My original file is now:
#!/bin/bash --login
#BSUB -q q_ab_mpc_work
#BSUB -J psipred
#BSUB -W 01:00
#BSUB -n 64
#BSUB -o psipred.out
#BSUB -e psipred.err
module load compiler/gnu-4.8.0
module load R/3.0.1
export OMP_NUM_THREADS=4
code=${HOME}/Phd/script_dev/rfpipeline.sh
MYPATH=$HOME/Phd/script_dev/
cd ${MYPATH}
${code} myfile
and I wrote a bash script, sender.sh, that both replaces the placeholder myfile with the command-line argument and pipes the modified script to bsub:
#!/bin/bash
sed "s/myfile/$1/g" < myprogram.sh | bsub
being careful to use double quotes so that bash expands $1 rather than treating the $ literally. I then simply run ./sender.sh jobfile.txt, which works!
Hope this helps anybody.
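An alternative to the sed templating, sketched here using the same directives as the original script: build the job script on the fly with a here document and pipe it straight to bsub, the same trick the qsub answers elsewhere on this page use, so no placeholder is needed:
#!/bin/bash
# sender.sh (alternative sketch) -- $1 is expanded by the submitting
# shell while the here document is built; \$HOME is escaped so it is
# evaluated by the job itself at run time.
bsub <<EOF
#!/bin/bash --login
#BSUB -q q_ab_mpc_work
#BSUB -J psipred
#BSUB -W 01:00
#BSUB -n 64
#BSUB -o psipred.out
#BSUB -e psipred.err
module load compiler/gnu-4.8.0
module load R/3.0.1
export OMP_NUM_THREADS=4
cd \$HOME/Phd/script_dev/
\$HOME/Phd/script_dev/rfpipeline.sh $1
EOF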
This answer should resolve your problem:
https://unix.stackexchange.com/questions/144518/pass-argument-to-script-then-redirect-script-as-input-to-bsub
Just pass the script, with its arguments, at the end of the bsub command. Example:
example.sh
#!/bin/bash
export input=${1}
echo "arg input: ${input}"
bsub command (the job's output would then contain arg input: arg1):
bsub [bsub args] "path/to/example.sh arg1"
I want to pass the PBS_ARRAYID to the main argument vector (argv) through qsub, but after reading through pages of Google results I cannot get this to work. A constant argument submits fine.
#
#$ -cwd
#$ -S /bin/bash
#$ -j y
#$ -t 1-3
#$ -pe fah 1
var1=$(echo "$PBS_ARRAYID" -l)
const1=1
./daedalus_linux_1.3_64 $const1 $var1
I lifted the array code from the solution given here: Using a loop variable in a bash script to pass different command-line args. From everything I have read this should work. And it does, except for the line var1=$(echo "$PBS_ARRAYID" -l).
It turns out the answer is fairly simple: our university runs a Sun Grid Engine (SGE) queue, while the tutorials I had found were all, by chance, for PBS queues. Under SGE the task index lives in $SGE_TASK_ID, not $PBS_ARRAYID:
#
#$ -cwd
#$ -S /bin/bash
#$ -j y
#$ -t 1-9
#$ -pe fah 3
const1=1
./daedalus_linux_1.3_64 $const1 $SGE_TASK_ID
Is there a way to pass parameters directly to a .pbs script when submitting a job? I need to loop over a list of files, indicated by different numbers, and apply a script to analyze each file.
The best I've been able to come up with is the following:
#!/bin/sh
for ((i= 1; i<= 10; i++))
do
export FILENUM=$i
qsub pass_test.pbs
done
where pass_test.pbs is the following script:
#!/bin/sh
#PBS -V
#PBS -S /bin/sh
#PBS -N pass_test
#PBS -l nodes=1:ppn=1,walltime=00:02:00
#PBS -M XXXXXX@XXX.edu
cd /scratch/XXXXXX/pass_test
./run_test $FILENUM
But this feels a bit wonky. In particular, I want to avoid having to create an environment variable to handle this.
The qsub utility can read the script from standard input, so by using a here document you can create the scripts on the fly:
#!/bin/sh
for i in `seq 1 10`
do
cat <<EOS | qsub -
#!/bin/sh
#PBS -V
#PBS -S /bin/sh
#PBS -N pass_test
#PBS -l nodes=1:ppn=1,walltime=00:02:00
#PBS -M XXXXXX@XXX.edu
cd /scratch/XXXXXX/pass_test
./run_test $i
EOS
done
Personally, I would use a more compact version. Note that in both forms $i is expanded by the submitting shell while the here document is built, so each generated script carries its own literal file number:
#!/bin/sh
for i in `seq 1 10`
do
cat <<EOS | qsub -V -S /bin/sh -N pass_test -l nodes=1:ppn=1,walltime=00:02:00 -M XXXXXX@XXX.edu -
cd /scratch/XXXXXX/pass_test
./run_test $i
EOS
done
You can use the -F option, as described here:
-F
Specifies the arguments that will be passed to the job script when the script is launched. The accepted syntax is:
qsub -F "myarg1 myarg2 myarg3=myarg3value" myscript2.sh
Note: Quotation marks are required. qsub will fail with an error message if the argument following -F is not a quoted value. The pbs_mom server will pass the quoted value as arguments to the job script when it launches the script.
See also this answer
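For completeness, a minimal sketch (not taken from the linked answer) of how the submitted script would pick those arguments up; the filenum name is illustrative:
#!/bin/sh
#PBS -N pass_test
# arguments supplied via qsub -F "..." arrive as $1, $2, ...
filenum=$1
cd /scratch/XXXXXX/pass_test
./run_test "$filenum"
which you would submit as qsub -F "3" pass_test.pbs.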
If you just need to pass numbers and run a list of jobs that differ only in the input file number, it is better to use a job array than a for loop, since a job array puts far less burden on the job scheduler.
To run, refer to the file number through PBS_ARRAYID in the .pbs file like this:
./run_test ${PBS_ARRAYID}
And to invoke it, on command line, type:
qsub -t 1-10 pass_test.pbs
where the range of array IDs to use is given after the -t option.
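Put together, a minimal array-job version of pass_test.pbs might look like this (directives copied from the question, email placeholder kept as-is):
#!/bin/sh
#PBS -V
#PBS -S /bin/sh
#PBS -N pass_test
#PBS -l nodes=1:ppn=1,walltime=00:02:00
#PBS -M XXXXXX@XXX.edu
cd /scratch/XXXXXX/pass_test
# PBS_ARRAYID holds this task's index from qsub -t 1-10
./run_test ${PBS_ARRAYID}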
I'm trying to submit a (series of) jobs to SGE (FWIW, it's a sequence of Gromacs molecular dynamics simulations), in which all the jobs are identical except for a suffix, such as input01, input02, etc. I wrote the commands to run in a way that the suffix is properly handled by the sequence of commands.
However, I can't find a way to get the execution environment to receive that variable. According to the qsub man page, -v var should do it:
$ export i=19
$ export | grep ' i='
declare -x i="19"
$ env | grep '^i='
i=19
Then, I submit the following script (run.sh) to see if it's received:
if [ "x" == "x$i" ]; then
echo "ERROR: \$i not set"
else
echo "SUCCESS: \$i is set"
fi
I submit the job as follows (in the same session as the export command above):
$ qsub -N "test_env" -cwd -v i run.sh
Your job 4606 ("test_env") has been submitted
The error stream is empty, and the output stream has:
$ cat test_env.o4606
ERROR: $i not set
I also tried the following commands, unsuccessfully:
$ qsub -N "test_env" -cwd -v i -V run.sh
$ qsub -N "test_env" -cwd -V run.sh
$ qsub -N "test_env" -cwd -v i=19 -V run.sh
$ qsub -N "test_env" -cwd -v i=19 run.sh
If I add a line i=19 to the beginning of run.sh, then the output is:
$ cat test_env.o4613
SUCCESS: $i is set as 19
I'm now considering generating a separate file per job, essentially identical except for an i=xx line at the top. It doesn't look very practical, but it would be a solution.
Would there be a better solution?
What I've always done is the following:
##send.sh
export a=10
qsub ./run.sh
and the script run.sh:
##run.sh
#$ -V
echo $a
When I call send.sh, the .o file contains 10: the #$ -V directive tells qsub to import the submitting shell's environment into the job, so the exported a is visible there.
Assuming that your variable is just an incrementing counter: you can use array jobs to achieve this. SGE sets the $SGE_TASK_ID environment variable to the count, which you can then copy to $i or use directly, as in the sketch below.
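A minimal sketch of that approach for run.sh (the range 1-10 is illustrative):
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -t 1-10
# copy the task index into the name the rest of the script expects
i=$SGE_TASK_ID
echo "SUCCESS: \$i is set as $i"
submitted with a plain qsub run.sh.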
If the variable is anything else, then I think you'll have to generate multiple job scripts and submit each one; that's the "solution" I use when I have loads of jobs with differing parameters.
I'm not certain you can pass variables by their name through qsub, but I've had success with passing values (you should probably write a front-end script for this instead of doing it interactively):
$ export ii=19
$ qsub -N "test_env" -cwd -v i=$ii run.sh