How to use variables on qsub? - bash

I'm new to qsub and I'm trying to figure out how to use the task queue optimally. I have this script that works well:
#!/bin/bash
##PBS -V # Export all environment variables from the qsub command environment to the batch job.
#PBS -N run
#PBS -q normal.q
#PBS -e archivo.err
#PBS -o archivo.out
#PBS -pe mpirun 8
#PBS -d ~/ # Working directory (PBS_O_WORKDIR)
#PBS -l nodes=1:ppn=8
~/brinicle/step-2/onephase_3/./main.x --mesh ~/brinicle/step-2/onephase_3/results/mesh.msh -Rmin 0 -Rmax 10 -Zmin 0 -Zmax 10 -o 2 -r 2 -T_f -10 -a_l 7.8 -a_s 70.8 -dt 0.01 -t_f 1 -v_s 10 -ode 12 -reltol 0.00001 -abstol 0.00001
The problem, as you can see, is that the command line is huge and hard to edit from the shell. I would like to split it into variables such as
#MESH="--mesh ~/brinicle/step-2/onephase_3/results/mesh.msh"
#EXE="~/brinicle/step-2/onephase_3/./main.x"
.
.
.
$EXE $MESH $PARAMETERS
And for the other parameters too.
But when I do this, the program doesn't run and says that there's an illegal variable or that the variable is undefined. It is also very important to me to be able to change the parameters -o, -r and -ode easily and to submit multiple jobs at once: for example, 5 identical jobs with -o 1, then 5 with -o 2, and so on. I also want to be able to vary -r and -ode in the same way. The problem is that without using variables I really don't know how to do that.
If someone can tell me how to automate the script in this way, it would be a huge help.

Use bash arrays.
exe=(~/brinicle/step-2/onephase_3/./main.x)
mesh=(--mesh ~/brinicle/step-2/onephase_3/results/mesh.msh)
parms=(
-Rmin 0
-Rmax 10
-Zmin 0
-Zmax 10
# ... etc.
)
"${exe[@]}" "${mesh[@]}" "${parms[@]}"
Research bash arrays and how to use them, and quoting in the shell. Prefer lower case variable names. Research the order of expansions in the shell.
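To connect the arrays to the parameters the question wants to vary (-o, -r, -ode), here is a minimal sketch. The paths are taken from the question; the array name fixed, the function build_cmd and the array cmd are hypothetical names chosen for illustration. It only prints the command (a dry run) rather than executing it.

```shell
#!/bin/bash
# Paths copied from the question; adjust them to your own setup.
exe=("$HOME/brinicle/step-2/onephase_3/main.x")
mesh=(--mesh "$HOME/brinicle/step-2/onephase_3/results/mesh.msh")
fixed=(-Rmin 0 -Rmax 10 -Zmin 0 -Zmax 10)   # parameters that never change

# build_cmd O R ODE: fill the global array "cmd" (hypothetical helper).
build_cmd() {
    cmd=("${exe[@]}" "${mesh[@]}" "${fixed[@]}" -o "$1" -r "$2" -ode "$3")
}

build_cmd 2 2 12
# Dry run: print one argument per line instead of executing.
printf '%s\n' "${cmd[@]}"
# To run it for real, expand the array as a command: "${cmd[@]}"
```

Because every element of the array stays a separate word, values containing spaces or glob characters survive intact, which is the main advantage over building one long string.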

One alternative if you have a lot of static parameters and a lot of dynamic ones is to refactor into a function where you hard-code what doesn't change, and interpolate the parts which do change.
qrunmesh () {
qsub <<:
#!/bin/bash
##PBS -V # Export all environment variables from the qsub command environment to the batch job.
#PBS -N run
#PBS -q normal.q
#PBS -e archivo.err
#PBS -o archivo.out
#PBS -pe mpirun 8
#PBS -d ~/ # Working directory (PBS_O_WORKDIR)
#PBS -l nodes=1:ppn=8
"$1" --mesh "$2" -Rmin 0 -Rmax 10 -Zmin 0 -Zmax 10 \
-o "$3" -r "$4" -T_f -10 -a_l 7.8 -a_s 70.8 \
-dt 0.01 -t_f 1 -v_s 10 -ode "$5" \
-reltol 0.00001 -abstol 0.00001
:
}
for o in 1 2 3; do
for r in 5 10 15; do
for x in onephase_3 onephase_2 twophase_3; do
for ode in 12 13 15; do
for mesh in onephase_3 otherphase_2; do
qrunmesh "$x" "$mesh" "$o" "$r" "$ode"
done
done
done
done
done
(I'm not very familiar with qsub; I assume it accepts the script on standard input if you don't pass in a script name. If not, maybe you have to store the here document in a temporary file, submit it, and remove the temporary file.)
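If standard input turns out not to work, the temporary-file fallback mentioned above could look roughly like this. It is a sketch under assumptions: submit_cmd and submit_via_tempfile are hypothetical names, and submit_cmd defaults to cat so the example can be run as a dry run on a machine without qsub.

```shell
#!/bin/bash
# Hypothetical knob: "cat" only prints the job script (dry run);
# set submit_cmd=qsub on a real cluster.
submit_cmd=${submit_cmd:-cat}

submit_via_tempfile() {
    local script status
    script=$(mktemp) || return 1
    # Store the job script, read from standard input, in the temporary file.
    cat >"$script"
    # Submit it, remember the exit status, then clean up.
    "$submit_cmd" "$script"
    status=$?
    rm -f "$script"
    return "$status"
}

submit_via_tempfile <<EOF
#!/bin/bash
#PBS -N run
echo "job body goes here"
EOF
```

The function reads the here document exactly as qsub would have, so the body of qrunmesh would only need to swap qsub for submit_via_tempfile.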

Related

Iterations of a bash script to run in parallel

I have a bash script that looks like below.
$TOOL is another script, which is run twice with different inputs (VAR1 and VAR2).
#Iteration 1
${TOOL} -ip1 ${VAR1} -ip2 ${FINAL_PML}/$1$2.txt -p ${IP} -output_format ${MODE} -o ${FINAL_MODE_DIR1}
rename mods mode_c_ ${FINAL_MODE_DIR1}/*.xml
#Iteration 2
${TOOL} -ip1 ${VAR2} -ip2 ${FINAL_PML}/$1$2.txt -p ${IP} -output_format ${MODE} -o ${FINAL_MODE_DIR2}
rename mods mode_c_ ${FINAL_MODE_DIR2}/*.xml
Can I make these 2 iterations in parallel inside a bash script without submitting it in a queue?
If I read this right, what you want is to run them in the background.
See https://linuxize.com/post/how-to-run-linux-commands-in-background/
More importantly, if you are going to be writing scripts, PLEASE read the following closely:
https://www.gnu.org/software/bash/manual/html_node/index.html#SEC_Contents
https://mywiki.wooledge.org/BashFAQ/001
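As a minimal sketch of running the two iterations in the background: run_iteration below is a hypothetical stand-in for each "${TOOL} ... ; rename ..." pair from the question, and here it merely sleeps and writes a marker file into a scratch directory.

```shell
#!/bin/bash
outdir=$(mktemp -d)   # scratch directory for the demonstration

# Hypothetical placeholder for one iteration (tool run plus rename).
run_iteration() {
    sleep 1
    echo "done with $1" >"$outdir/$1.log"
}

run_iteration iteration1 &   # start in the background
pid1=$!
run_iteration iteration2 &   # starts immediately, without waiting for the first
pid2=$!

# Wait for each job separately so a failure in either one is noticed.
wait "$pid1" || echo "iteration 1 failed" >&2
wait "$pid2" || echo "iteration 2 failed" >&2

cat "$outdir/iteration1.log" "$outdir/iteration2.log"
```

Wrapping each tool call and its rename in one function (or in a { ...; } & group) keeps the rename from starting before its own tool run has finished.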

Write a configuration file for each run in a separate directory, then launch mpirun

I need to run a set of calculations, changing one parameter each time. A calculation directory contains a control file named 'test.ctrl', a job submission file named 'job-test' and a bunch of data files. Each calculation has to be submitted with the same control file name (it is written inside job-test), and the output goes into those data files without changing their names, which creates an overwriting problem. For this reason, I want to automate the job submission process with a bash script so that I don't need to submit each calculation by hand.
As an example, I have done the first calculation in directory b1-k1-a1 (I choose this format of dir names to indicate calc. parameters). This test.ctrl file has the parameters:
Beta=1
Kappa=1
Alpha=0 1
and I submitted this job using the 'sbatch job-test' command. For the following calculations, my code should copy this whole directory under the name bX-kY-aZ, make the changes in the control file, and finally submit the job. I naively tried writing the whole thing in the job-test file, as you can see in the MWE below:
#!/bin/sh
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --time=0:15:00 ##hh:mm:ss
for n in $(seq 0 5)
do
for m in $(seq 0 5)
do
for v in $(seq 0 5)
do
mkdir b$n-k$m-a$v
cd b$n-k$m-a$v
cp ~/home/b01-k1-a01/* .
sed "s/Beta=1/Beta=$n/" test.ctrl
sed "s/Kappa=1/Kappa=$m/" test.ctrl
sed "s/Alpha=0 1/Alpha=0 $v/" test.ctrl
cd ..<<EOF
EOF
mpirun soft.x test.ctrl
sleep 5
done
done
done
I would appreciate any suggestions on how to make it work this way.
It worked after I moved cd .. to the very end of the loops and removed sed, as suggested in the comments. Hence this works now:
#!/bin/sh
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --time=0:15:00 ##hh:mm:ss
for n in $(seq 0 5)
do
for m in $(seq 0 5)
do
for v in $(seq 0 5)
do
mkdir b$n-k$m-a$v
cd b$n-k$m-a$v
cp ~/home/b01-k1-a01/* .
cat >test.ctrl <<EOF
Beta=$n
Kappa=$m
Alpha=0 $v
EOF
mpirun soft.x test.ctrl
sleep 5
cd ..
done
done
done
The immediate problem is that sed without any options does not modify the file at all; it just prints the results to standard output.
It is frightfully unclear what you were hoping the here document was accomplishing. cd does not read its standard input, so it wasn't accomplishing anything at all, anyway.
#!/bin/sh
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --time=0:15:00 ##hh:mm:ss
for n in $(seq 0 5); do
for m in $(seq 0 5); do
for v in $(seq 0 5); do
mkdir "b$n-k$m-a$v"
cd "b$n-k$m-a$v"
cp ~/home/b01-k1-a01/* .
sed -e "s/Beta=1/Beta=$n/" \
-e "s/Kappa=1/Kappa=$m/" \
-e "s/Alpha=0 1/Alpha=0 $v/" ~/home/b01-k1-a01/test.ctrl >test.ctrl
mpirun soft.x test.ctrl
cd ..
sleep 5
done
done
done
Notice also the merging of multiple sed commands into a single script (though as noted elsewhere, maybe printf would be even better if that's everything which you have in the configuration file).
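The printf variant could be sketched as follows, assuming test.ctrl really contains nothing but the three keys shown in the question. The function name write_ctrl is hypothetical, and the example writes into a scratch directory rather than a real calculation directory.

```shell
#!/bin/sh
# write_ctrl BETA KAPPA ALPHA: print a complete control file.
# Assumes the file consists of only these three lines.
write_ctrl() {
    printf 'Beta=%s\nKappa=%s\nAlpha=0 %s\n' "$1" "$2" "$3"
}

workdir=$(mktemp -d)      # scratch directory for the demonstration
write_ctrl 2 3 4 >"$workdir/test.ctrl"
cat "$workdir/test.ctrl"
```

Inside the loops this would replace the sed command with write_ctrl "$n" "$m" "$v" >test.ctrl, and it no longer depends on the template containing exactly Beta=1, Kappa=1 and Alpha=0 1.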

Creating separate output file per input file

I'm using kofamscan by KEGG to annotate a bunch of fasta files. I'm running it on multiple fasta files, so whenever a new file is analyzed the output file is overwritten. I really want a separate output file per input file (i.e. a.fasta -> a.txt; b.fasta -> b.txt, etc.). I have tried the following, but it does not seem to work:
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -pe def_slot 8
#$ -N coral_kofam
#$ -o stdout
#$ -e stderr
#$ -l os7
# perform kofam operation from file 1 to file 47
#$ -t 1-47:1
#$ -tc 10
#setting
source ~/.bash_profile
readarray -t files < kofam_files #input files
TASK_ID=$((SGE_TASK_ID - 1))
~/kofamscan/bin/exec_annotation -o kofam_out_[$TASK_ID].txt --tmp-dir $(mktemp -d) ${files[$TASK_ID]}
The following part of the code is what I need to change (obviously, as it is not working for me now):
-o kofam_out_[$TASK_ID].txt
Could anybody help me make this work?
Do you want to name the output file after $TASK_ID?
Then drop the square brackets and write the file name like this: kofam_out_${TASK_ID}.txt
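If the goal is the a.fasta -> a.txt naming from the question rather than a numbered file, parameter expansion can derive the output name from the input name. This is a sketch; out_name is a hypothetical helper and the file names are made up for the demonstration.

```shell
#!/bin/sh
# out_name INPUT: print the output file name for one input file.
out_name() {
    base=${1##*/}                        # strip any leading directory
    printf '%s\n' "${base%.fasta}.txt"   # swap the .fasta suffix for .txt
}

for f in a.fasta data/b.fasta; do
    echo "$f -> $(out_name "$f")"
done
```

In the array job, the -o argument would then become something like -o "$(out_name "${files[$TASK_ID]}")".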

Variable Not getting recognized in shell script

I use the following shell script to run a simulation on my cluster.
#PBS -N 0.05_0.05_m_1_200k
#PBS -l nodes=1:ppn=1,pmem=1000mb
#PBS -S /bin/bash
#$ -m n
#$ -j oe
FOLDER= 0.57
WDIR=/home/vikas/ala_1_free_energy/membrane_200k/restraint_decoupling_pullinit_$FOLDER
cd /home/vikas/ala_1_free_energy/membrane_200k/restraint_decoupling_pullinit_$FOLDER
LAMBDA= 0.05
/home/durba/gmx455/bin/mdrun -np 1 -deffnm md0.05 -v
############################
Now my problem is that my script doesn't recognize variable FOLDER and throws an error
couldn't find md0.05.tpr
which exists in the folder. If I write 0.57 in place of $FOLDER, it works fine, which makes me think that the variable FOLDER is not being recognized. LAMBDA is recognized perfectly in both cases. If somebody can help me here, I will be grateful.
There should not be a space between the = and the value you wish to assign to the variables:
FOLDER="0.57"
WDIR="/home/vikas/ala_1_free_energy/membrane_200k/restraint_decoupling_pullinit_$FOLDER"
cd "/home/vikas/ala_1_free_energy/membrane_200k/restraint_decoupling_pullinit_$FOLDER"
LAMBDA="0.05"
/home/durba/gmx455/bin/mdrun -np 1 -deffnm md0.05 -v
############################
The double quotes "" I added are not strictly necessary for this example; however, using them is a good habit to get into.
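To see why the space matters, the following small sketch can be run in any POSIX shell. With the space, the shell treats 0.57 as a command to run, with FOLDER set to the empty string for that command only, so FOLDER is never assigned in the script itself.

```shell
#!/bin/sh
unset FOLDER

# "FOLDER= 0.57" tries to run a command called 0.57; the error message
# is hidden here and replaced by a note about the failure.
FOLDER= 0.57 2>/dev/null || echo "command '0.57' could not be run"
echo "with space:    FOLDER='${FOLDER}'"

# Without the space it is an ordinary assignment.
FOLDER=0.57
echo "without space: FOLDER='${FOLDER}'"
```

With FOLDER empty, the cd in the original script goes to a directory name ending in pullinit_ and fails, so mdrun runs in the wrong place and cannot find md0.05.tpr.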

How to read values from command line in bash script as given?

I want to pass arguments to a script in the form
./myscript.sh -r [1,4] -p [10,20,30]
where in myscript.sh if I do:
echo $@
But I'm getting the output as
-r 1 4 -p 1 2 3
How do I get output in the form of
-r [1,4] -p [10,20,30]
I'm using Ubuntu 12.04 and bash version 4.2.37
You have files named 1 2 3 & 4 in your working directory.
Use more quotes.
./myscript.sh -r "[1,4]" -p "[10,20,30]"
[1,4] gets expanded by bash to filenames named 1 or , or 4 (whichever are actually present on your system).
Similarly, [10,20,30] gets expanded to filenames named 1 or 0 or , or 2 or 3.
On a similar note, you should also change echo $@ to echo "$@".
On another note, if you really want to distinguish between the arguments, use printf '%s\n' "$@" instead of just echo "$@".
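The effect can be reproduced in a scratch directory. In this sketch, show_args is a hypothetical stand-in for myscript.sh that prints each argument on its own line, and the files 1 and 4 are created only for the demonstration.

```shell
#!/bin/sh
# show_args prints every argument it receives on its own line.
show_args() { printf '%s\n' "$@"; }

demo=$(mktemp -d)          # scratch directory for the demonstration
cd "$demo" || exit 1
touch 1 4                  # file names that match the pattern [1,4]

echo "unquoted:"
show_args -r [1,4]         # the shell replaces [1,4] with the names 1 and 4
echo "quoted:"
show_args -r "[1,4]"       # the quotes keep the argument literal
```

In an empty directory the unquoted form would be passed through unchanged, which is why the problem only shows up when matching file names happen to exist.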
You can turn off filename expansion
set -f
./myscript.sh -r [1,4] -p [10,20,30]
Don't expect other users to want to do this, if you share your script.
The best answer is anishane's: just quote the arguments
./myscript.sh -r "[1,4]" -p "[10,20,30]"
You can also just escape the brackets [], like this:
./verify.sh -r \[1,4\] -p \[10,20,30\]
You can print the arguments using echo "$@".
