Running shell scripts in parallel from a list of files - bash

I have a list of scripts, each doing its own thing (they are actually Rscripts that read, modify, and write files), like this:
## Script 1
echo "1" > file1.out
## Script 2
echo "2" > file2.out
## Script 3
echo "3" > file3.out
These are saved in separate script files as follows:
## Writing script 1
echo "echo \"1\" > file1.out" > script1.task
## Writing script 2
echo "echo \"2\" > file2.out" > script2.task
## Writing script 3
echo "echo \"3\" > file3.out" > script3.task
Is there a way to run all these scripts in parallel using the file names?
In a loop it'd look like this:
for task_file in *.task
do
sh ${task_file}
done

If you only have 3 scripts, the answer is to use &.
But if you have thousands, use GNU parallel:
seq 10000 | parallel 'echo {} > file{}.out'

Following advice from Andre Wildberg and user1934428, here's a solution using & just for the record:
for task_file in *.task
do
# start each task in the background
sh "${task_file}" &
done
# block until every background task has finished
wait
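The loop with & starts every task at once, which may be too much with thousands of files. Here is a minimal sketch of capping the number of concurrent jobs, assuming an xargs that supports -P and -0 (the GNU and BSD versions do, but not every system guarantees them). The scratch directory and the limit of 4 are illustrative choices, not part of the original question.

```shell
# Work in a scratch directory so the example does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the three task files from the question.
for n in 1 2 3; do
    printf 'echo "%s" > file%s.out\n' "$n" "$n" > "script$n.task"
done

# Run one `sh <task file>` per file, at most 4 at a time.
# xargs returns only after all of its children have finished.
printf '%s\0' *.task | xargs -0 -n 1 -P 4 sh
```

With GNU parallel installed, `parallel sh ::: *.task` should behave similarly, running roughly one job per CPU core by default.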

Related

How to parallel process a function, with loops

I have this function, and I want everything it contains to run at the same time. So far it isn't working, even though, according to other sources, this is how you do it. The function itself works when it's not run in parallel.
#!/bin/bash
foo () {
cd ${HOME}/sh/path/to/script/execute
for f in *.sh; do #goes to "execute" directory and executes all
#scripts the current directory "execute" basically run-parts without cron
cd ~/sh/path/to/script
while IFS= read -r l1 #Line 1 in master.txt
IFS= read -r l2 #Line 2 in master.txt
IFS= read -r l3 #Line 3 in master.txt
do
cd /dev/shm/arb
echo ${l1} > arg.txt & echo ${l2} > arg2.txt & echo ${l3} > arg3.txt
cd ${HOME}/sh/path/to/script/execute
bash -H ${f} #executes all scripts inside "execute" folder
cd ~/sh/path/to/script/here
./here.sh &
cd ~/sh/path/to/script &
done <master.txt
done
}
export -f foo
parallel ::: foo
Results in
#No result at all....., just buffers. htop doesn't acknowledge any
#processes, and when this runs its pretty taxing on the cores.
master.txt content
In case this is relevant:
apple_fruit
apple_veggie
veggie_fruit
#apple changes
pear_fruit
pear_veggie
veggie_fruit
#pear changes
cucumber_fruit
...
I'm very new to using parallel and don't know how it works in advanced (or even basic) situations, so would the loops interfere? And if they do interfere, is there a workaround?
The result is probably going to be something like:
inner() {
script="$1"
parallel -N3 "'$script' {}; here.sh {}" :::: master.txt
}
export -f inner
parallel inner ::: ${HOME}/sh/path/to/script/execute/*.sh
This will call each of the scripts in ${HOME}/sh/path/to/script/execute/ (and here.sh) with 3 arguments from master.txt like this:
${HOME}/sh/path/to/script/execute/script1.sh apple_fruit apple_veggie veggie_fruit
You need to change the scripts so that:
They get the arguments from the command line (not from arg.txt, arg2.txt, arg3.txt).
They send their output to stdout
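As a rough illustration of those two changes, here is a sketch of a worker that takes its three values as positional parameters and prints to stdout. The function name process_triple and the output format are hypothetical; the real scripts would do their actual work in place of the printf.

```shell
# Hypothetical worker: reads its three values from the command line
# (instead of arg.txt, arg2.txt, arg3.txt) and writes the result to stdout.
process_triple() {
    l1=$1
    l2=$2
    l3=$3
    printf 'first=%s second=%s third=%s\n' "$l1" "$l2" "$l3"
}

# Example call with the first three lines of master.txt.
process_triple apple_fruit apple_veggie veggie_fruit
```

Because nothing is written to a shared file such as arg.txt, several copies can run at once without overwriting each other's input.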

bash for script and input parameter

Can anyone help me modify my script? It does not work. There are three scripts:
1) pb.sh, which uses the delphicpp_release software to read 1brs.ab.sh and writes the output to 1brs.ab.out.
2) 1brs.ab.sh, which holds the input parameters: a.sh (another script, for the protein structures), charmm.siz and charmm.crg (files for atom size and charge), and the remaining parameters for running delphicpp_release.
3) a.sh, which is meant to read several protein structures located in the same directory.
my script_1 = pb.sh:
./delphicpp_release 1brs.ab.sh >1brs.ab.out
echo PB-Energy-AB = $(grep -oP '(?<=Energy> Corrected:).*' 1brs.ab.out) >>PB-energy.dat
cat PB-energy.dat
script_2 = 1brs.ab.sh:
in(pdb,file="a.sh")
in(siz,file="charmm.siz")
in(crg,file="charmm.crg")
perfil=70
scale=2.0
indi=4
exdi=80.0
prbrad=1.4
salt=0.15
bndcon=2
maxc=0.0001
linit=800
energy(s)
script_3 = a.sh:
for i in $(seq 90000 20 90040); do
$i.pdb
done
As we don't know what the software is, something like this:
for ((i=90000;i<=100000;i+=20)); do
# Unquoted delimiter so that $i is expanded inside the here-document
./software << DATA_END > "1brs.$i.a.out"
scale=2.0
in(pdb,file="../$i.ab.pdb")
in(siz,file="charmm.siz")
in(crg,file="charmm.crg")
indi=z
exdi=x
prbrad=y
DATA_END
echo Energy-A = $(grep -oP '(?<=Energy>:).*' 1brs.$i.a.out) >>PB-energy.dat
done
A more POSIX-compliant version of the loop:
i=90000
while [ "$i" -le 100000 ]; do
...
i=$((i + 20))
done
EDIT: Without heredoc
{
echo 'scale=2.0'
echo 'in(pdb,file="../'"$i"'.ab.pdb")'
echo 'in(siz,file="charmm.siz")'
echo 'in(crg,file="charmm.crg")'
echo 'indi=z'
echo 'exdi=x'
echo 'prbrad=y'
} > $i.ab.sh
./software <$i.ab.sh >$i.ab.out
However, the question has since been changed, so I am not sure I understand it.
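One detail of the here-document variant is worth checking: whether $i is expanded inside a here-document depends on whether the delimiter is quoted. The sketch below uses cat as a stand-in for the unknown software and writes to a scratch directory; both are assumptions made only for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
i=90000

# Unquoted delimiter: the shell expands $i before the text is passed on.
cat << DATA_END > unquoted.txt
in(pdb,file="../$i.ab.pdb")
DATA_END

# Quoted delimiter: the text is passed on literally, $i included.
cat << "DATA_END" > quoted.txt
in(pdb,file="../$i.ab.pdb")
DATA_END

cat unquoted.txt quoted.txt
```

For a parameter file that must contain the structure number, the unquoted form is the one that substitutes it.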

Bash Scripting : How to loop over X number of files, take input and write to a file in the same line

I have a program written in C, call it allcell, that takes some parameters.
some sample parameters: -m 1800 -n 9
the files being analyzed: cfdT100-0.trj, cfdT100-1.trj, cfdT100-2.trj, cfdT100-3.trj, ... cfdT100-19.trj
file being fed: template.file
out file: result.file
$ allcell -m 1800 -n 9 cfdT100-[0-19].trj < template.file > result.file
But when I check with htop, I see that only cfdT100-0.trj, cfdT100-1.trj and cfdT100-9.trj are being read. How do I make the shell read all the files from 0 to 19?
Additionally, when I write a script file to automate this, how should I enclose the line? Will this work:
"$($ allcell -m 1800 -n 9 cfdT100-[0-19].trj < template.file > result.file)"
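I believe you want brace expansion instead of a glob: cfdT100-{0..19}.trj. The bracket expression [0-19] matches a single character that is either in the range 0-1 or is 9, which is why only files 0, 1 and 9 were picked up.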
$ echo cfdT100-{0..19}.trj
cfdT100-0.trj cfdT100-1.trj cfdT100-2.trj cfdT100-3.trj cfdT100-4.trj cfdT100-5.trj cfdT100-6.trj cfdT100-7.trj cfdT100-8.trj cfdT100-9.trj cfdT100-10.trj cfdT100-11.trj cfdT100-12.trj cfdT100-13.trj cfdT100-14.trj cfdT100-15.trj cfdT100-16.trj cfdT100-17.trj cfdT100-18.trj cfdT100-19.trj
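In the scripted version, drop the leading $ (it is only the prompt) and the surrounding "$( ... )", and write the command on a line of its own with the brace expansion in place of the glob.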
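To see what the bracket pattern really matches, here is a small sketch that creates twenty empty files in a scratch directory and counts the matches. It also shows a glob-only alternative using two patterns, which works in plain sh where brace expansion is not available. The scratch directory and the variable names are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create empty files cfdT100-0.trj ... cfdT100-19.trj.
i=0
while [ "$i" -le 19 ]; do
    : > "cfdT100-$i.trj"
    i=$((i + 1))
done

# [0-19] is ONE character from the set {0, 1, 9}.
set -- cfdT100-[0-19].trj
bracket_count=$#

# Two patterns: a single digit, or a 1 followed by a digit (10-19).
set -- cfdT100-[0-9].trj cfdT100-1[0-9].trj
two_pattern_count=$#

echo "$bracket_count $two_pattern_count"
```

Unlike brace expansion, the two-pattern glob only produces names of files that exist, and it lists them in glob sort order rather than numeric order.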
Use a recursive function for an infinite loop:
a()
{
echo "apple"
a
}
a
This will create an infinite loop (in practice the shell eventually aborts once the call stack is exhausted).

Write a script to put a series of files in sequence

I am new to scripting and I am trying to write a script in bash. I need a script that writes a sequence of file names, numbered from 1 to 50, into a single file. These are trajectory files from MD simulations. My idea was to write something like:
for valor in {1..50}
do
echo "
#!/bin/bash
catdcd -o Traj-all.dcd -stride 10 -dcd traj-$valor.dcd" > Traj.bash
exit
However, I just got one file with the following line:
#!/bin/bash
catdcd -o Traj-all.dcd -stride 10 -dcd traj-50.dcd
exit
But what I really want is something like:
#!/bin/bash
catdcd -o Traj-all.dcd -stride 10 -dcd traj-1.dcd -dcd traj-2.dcd -dcd traj-3.dcd ... -dcd traj-50.dcd
exit
How can I solve this problem?
You need to read a bit more about bash brace expansion. You can do this:
{
echo '#!/bin/bash'
echo "catdcd -o Traj-all.dcd -stride 10" "-dcd traj-"{1..50}".dcd"
#                                        ^^^^^^^^^^^^^^^^^^^^^^^^^
} > Traj.bash
The underlined part is where the brace expansion will get expanded by the shell into
-dcd traj-1.dcd -dcd traj-2.dcd ... -dcd traj-50.dcd
You don't need to explicitly end your script with exit -- the shell will exit by itself when it runs out of commands.
> truncates the file on open. Either only use it once before the loop to create the file and then append (>>) within the loop, or redirect the entire loop.
> foo
for ...
do ...
echo ... >> foo
done
...
{
for ...
do ...
echo ...
done
} > foo
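Applied to the catdcd case, redirecting the whole group once could look like the sketch below. It uses printf without a trailing newline so that all the -dcd arguments end up on a single line. The scratch directory is an assumption for illustration, and the loop uses plain sh syntax rather than brace expansion.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

{
    echo '#!/bin/bash'
    # Start the command line, without a newline yet.
    printf 'catdcd -o Traj-all.dcd -stride 10'
    valor=1
    while [ "$valor" -le 50 ]; do
        # Append one "-dcd traj-N.dcd" argument per iteration.
        printf ' -dcd traj-%s.dcd' "$valor"
        valor=$((valor + 1))
    done
    # Finish the command line.
    printf '\n'
} > Traj.bash
```

The file is opened and truncated once, when the group starts, so nothing written inside the loop is lost.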

How to get names of variables are available/set in shell scripts

I want to get the names of all variables that are set in a shell script. I have a file containing key-value pairs; I read its content and store the values in variables. I want to run some processing only if a given variable is set, and skip it otherwise. How can I achieve this?
For example, I want a loop in the script that yields, on each iteration, the name of one of the variables set before it.
If the code looks like this:
a=test1
b=test2
c=test3
for i in ???
do
echo $i
done
then I want output like this
a
b
c
What command can be used to achieve this?
You could run set before and after setting the variables and compare the output, e.g.:
$ set > aux1
$ c=345
$ set > aux2
$ diff aux1 aux2
57c57
< PIPESTATUS=([0]="141" [1]="0")
---
> PIPESTATUS=([0]="0")
112a113
> c=345
If you have a pre-defined list of such variable names, then you can loop over it like this:
for i in a b c; do
echo "$i"
done
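To actually test whether each name in such a list is set, the ${name+x} expansion can be used: it yields x when the variable is set (even if empty) and nothing otherwise. A minimal sketch follows, with a hypothetical helper is_set that builds the expansion through eval so it works in plain sh. The names passed to it must be trusted, since eval executes them.

```shell
# Hypothetical helper: succeeds if the variable NAMED by $1 is set.
# eval turns the name "a" into the test [ -n "${a+x}" ].
is_set() {
    eval "[ -n \"\${$1+x}\" ]"
}

a=test1
b=test2
unset c

for name in a b c; do
    if is_set "$name"; then
        echo "$name is set"
    else
        echo "$name is not set"
    fi
done
```

In bash 4.2 or later, `[[ -v name ]]` performs the same check without eval.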
Perhaps this helps:
#!/bin/bash
# First approach: read the list of tests from a file (whitespace-separated names)
LIST_TESTS=$(cat list_test.txt)
for TEST in ${LIST_TESTS}
do
vartest=${TEST}
if [ -z "${vartest}" ]
then
# No test name found
echo "*** WARNING *** Test not found"
else
echo "${vartest} is available"
fi
done
# Second approach: keep the list of tests in an array (requires bash, not plain sh)
tabTest=('test1' 'test2' 'test3')
i=0
# Stop at the first empty (or missing) array element
while [ "${tabTest[$i]}" != "" ]
do
echo "${tabTest[$i]} is available"
i=$((i+1))
done
