How to make a loop for getting input and output - bash

I have a command line like this:
myscript constant/tap.txt -n base.dat -c normal/sta0.grs -o normal/brs0.opm
I have 100 .grs files and I need to generate 100 .opm files.
I want to put the command above into a loop that does the following:
myscript constant/tap.txt -n base.dat -c normal/sta0.grs -o normal/brs0.opm
myscript constant/tap.txt -n base.dat -c normal/sta1.grs -o normal/brs1.opm
myscript constant/tap.txt -n base.dat -c normal/sta2.grs -o normal/brs2.opm
myscript constant/tap.txt -n base.dat -c normal/sta3.grs -o normal/brs3.opm
myscript constant/tap.txt -n base.dat -c normal/sta4.grs -o normal/brs4.opm
.
.
.
myscript constant/tap.txt -n base.dat -c normal/sta100.grs -o normal/brs100.opm
I was trying to make it like below:
#!/bin/bash
# Basic until loop
counter=100
until [ $counter -gt 100 ]
do
myscript constant/tap.txt -n base.dat -c normal/sta100.grs -o normal/brs100.opm
done
echo All done
but I could not find a way to make the parameters change during the loop.
In the above command these are constant for each run:
myscript constant/tap.txt -n base.dat -c
The only thing that changes in each loop is the following input and output:
normal/sta100.grs
normal/brs100.opm
I have 100 of sta.grs in the normal folder and I want to create 100 of brs.opm in the normal folder.

#!/bin/bash
counter=0
until ((counter>100))
do
myscript constant/tap.txt -n base.dat -c normal/sta$counter.grs -o normal/brs$counter.opm
((++counter))
done
echo 'All done'

This is an excellent use case for GNU parallel:
find normal -name '*.grs' |
parallel myscript constant/tap.txt -n base.dat -c {} -o {.}.opm
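The less code you write, the fewer errors you make. This generalizes nicely to cases where your files are named in more complex patterns, and you get parallelization for free (you can turn it off with -j1). Note that {.} is the input path with its extension removed, so as written the output goes to normal/sta0.opm, normal/sta1.opm and so on, rather than normal/brs0.opm.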
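Where the output name is not simply the input name with a new extension (here sta becomes brs), the derivation can be sketched with plain parameter expansion. This is only an illustration: out_name is a hypothetical helper, and the echo in front of myscript makes the loop a dry run that prints the commands instead of running them.

```bash
#!/usr/bin/env bash
# out_name (hypothetical helper): map normal/sta7.grs to normal/brs7.opm.
out_name() {
    in=$1
    dir=${in%/*}          # directory part: normal
    base=${in##*/}        # file part: sta7.grs
    base=${base%.grs}     # strip the extension: sta7
    base=brs${base#sta}   # swap the sta prefix for brs: brs7
    printf '%s/%s.opm\n' "$dir" "$base"
}

for f in normal/sta*.grs; do
    [ -e "$f" ] || continue   # the glob matched nothing
    # Dry run: remove the echo to actually run myscript.
    echo myscript constant/tap.txt -n base.dat -c "$f" -o "$(out_name "$f")"
done
```

Because the loop globs for the files that exist, it does not need to know how many there are or how they are numbered.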

Instead of incrementing the counter manually, you could use a for loop like this:
for i in {0..100}; do
myscript constant/tap.txt -n base.dat -c normal/sta"$i".grs -o normal/brs"$i".opm
done
Also, consider that the output files will sort in an unintuitive way:
brs1.opm
brs10.opm
brs100.opm
brs11.opm
brs12.opm
so maybe use padded numbers everywhere with for i in {000..100}; do. This requires Bash 4.0 or newer; if you don't have that, you could do something like
for i in {0..100}; do
printf -v ipad '%03d' "$i"
myscript constant/tap.txt -n base.dat -c normal/sta"$ipad".grs \
-o normal/brs"$ipad".opm
done
where the printf line puts a padded version of the counter into the ipad variable.
(And if you have Bash older than 3.1, you can't use printf -v and have to do
ipad=$(printf '%03d' "$i")
instead.)
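One thing to watch with padded counters: Bash treats a number with a leading zero as octal in arithmetic, so a value like 008 is rejected unless base 10 is forced. A small sketch of both the padding and the conversion back (the variable names padded, plain and next_padded are made up for this example):

```bash
#!/usr/bin/env bash
# Build zero-padded file names from a plain counter.
for i in 7 42 100; do
    printf -v ipad '%03d' "$i"      # 7 -> 007, 42 -> 042, 100 -> 100
    echo "normal/sta$ipad.grs -> normal/brs$ipad.opm"
done

# A padded value such as 008 is an invalid octal number in arithmetic,
# so force base 10 with the 10# prefix before doing any math on it.
padded=008
plain=$((10#$padded))
next=$((plain + 1))
printf -v next_padded '%03d' "$next"
echo "$padded + 1 = $next_padded"
```

As long as the padded value is only pasted into file names, as in the loops above, no conversion is needed.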

Related

Shellscript evaluate command vs command difference: multiple $(uuidgen) get me same result

uuidgen is supposed to generate a random UUID on every call. Consider the command below:
╰─○ cat a.txt | xargs -I {} -L 1 sh -c "uuidgen"
54693322-1ABF-4FCB-96E5-90EC0F4AC33E
9F1BA4CF-5612-46D7-90E9-EE653F0396FE
25F5D853-03BA-42F7-9FF8-1D3E124D09B3
046A348E-3FC0-414A-8469-21A016147245
This is good, but the command below gives me the same UUID every time:
╰─○ cat a.txt | xargs -I {} -L 1 sh -c "echo $(uuidgen)"
7477A621-331C-4727-8471-677528BC79AC
7477A621-331C-4727-8471-677528BC79AC
7477A621-331C-4727-8471-677528BC79AC
7477A621-331C-4727-8471-677528BC79AC
The command substitution $(...) inside double quotes is expanded by your shell before the command is run. So you are actually running:
xargs -I {} -L 1 sh -c "echo 7477A621-331C-4727-8471-677528BC79AC"
and it prints the same value each time. You can debug this with set -x, or look at the output of xargs -t. Check your scripts with shellcheck.net.
If you want to pass the string echo $(uuidgen) as-is to the shell spawned by xargs, you have to quote it so that the outer shell does not expand it:
xargs sh -c "echo \$(uuidgen)"
# or
xargs sh -c 'echo $(uuidgen)'
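The same effect can be observed without uuidgen. The sketch below uses $$ (the PID of the shell that expands it) as a stand-in, on the assumption that each sh started by xargs gets its own PID; the variable names early and late are made up for the example.

```bash
#!/usr/bin/env bash
# $$ is the PID of the shell that expands it, so it shows where expansion happens.

# Double quotes: the outer shell expands $$ once, before xargs even starts.
early=$(printf '1\n2\n3\n' | xargs -I {} sh -c "echo $$" | sort -u | wc -l)

# Single quotes: each sh started by xargs expands $$ itself.
late=$(printf '1\n2\n3\n' | xargs -I {} sh -c 'echo $$' | sort -u | wc -l)

echo "distinct values with double quotes: $early"
echo "distinct values with single quotes: $late"
```

With double quotes there is a single distinct value, because all three child shells receive the same already-expanded string.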

How to count md5sum of executed command in bash

I have a wrapper script around a compile command that computes the md5sum of the executed command and times it (along with some other stuff). The point is that I have to calculate the md5sum inside the wrapper script.
The problem is that md5sum returns the same output for gcc main.c and gcc "main.c", even though the command is different.
Here is some simple code.
$ cat sc.sh
#!/usr/bin/env bash
cmd="$@"
cmdh=$(echo "$cmd" | md5sum - | cut -f1 -d" ")
echo "CMD ${cmd}"
echo "MD5 ${cmdh}"
time "$@"
Here is one output:
$ ./sc.sh gcc -c main.c -o out
CMD gcc -c main.c -o out
MD5 b671a0f3b1235aa91e5f86011449c698
real 0m0.019s
user 0m0.009s
sys 0m0.010s
Here is the second. I would like to have a different md5sum.
$ ./sc.sh gcc -c "main.c" -o out
CMD gcc -c main.c -o out
MD5 b671a0f3b1235aa91e5f86011449c698
real 0m0.017s
user 0m0.007s
sys 0m0.011s
Like here:
$ echo 'gcc -c "main.c" -o out' | md5sum - | cut -f1 -d" "
94d2bafbec690940d1b908678e9c9b7d
$ echo 'gcc -c main.c -o out' | md5sum - | cut -f1 -d" "
b671a0f3b1235aa91e5f86011449c698
Is such a thing possible with bash? It would be awesome not to have it bound to any specific bash version, but if there is no other choice, that's fine too.
Removing quotes when parsing the line that you have typed into the terminal is part of how the shell works. The two commands you are typing are the same. Read up on how the shell works, starting with a basic introduction to shell quoting.
Like here:
Then pass it "like here". The single quotes are missing from your commands, but they are present in the "like here" snippet.
$ ./sc.sh gcc -c main.c -o out
is exactly the same as
$ ./sc.sh gcc -c "main.c" -o out
is exactly the same as
$ ./sc.sh 'g''c''c' "-""c" 'main.c' '''''-o' 'ou't
and it happens to work the same way as the following, only because of your IFS and how you are using cmd="$@". Research what $@ does and research IFS:
$ ./sc.sh 'gcc -c main.c -o out'
But the following command is different - the double quotes inside single quotes are preserved.
$ ./sc.sh 'gcc -c "main.c" -o out'
As a follow-up, research word splitting. Remember to check your scripts with https://shellcheck.net
Inside the script, gcc main.c and gcc "main.c" are the same command.
The script receives $1 = gcc and $2 = main.c in both variants.
The shell removes the quotes before the script starts, so the script cannot tell the two apart and cannot produce different checksums for them.
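The original quoting cannot be recovered, but the argument boundaries can still be made part of the hash. A minimal sketch, assuming Bash's printf %q and an md5sum binary are available; hash_args is a hypothetical helper, not something from the question:

```bash
#!/usr/bin/env bash
# hash_args (hypothetical helper): hash the argument list so that argument
# boundaries matter, by re-quoting every argument with printf %q first.
hash_args() {
    local quoted
    printf -v quoted '%q ' "$@"
    printf '%s\n' "$quoted" | md5sum | cut -d' ' -f1
}

a=$(hash_args gcc -c main.c -o out)      # five arguments
b=$(hash_args gcc -c "main.c" -o out)    # the same five arguments
c=$(hash_args 'gcc -c main.c -o out')    # one argument containing spaces

echo "a=$a"
echo "b=$b"
echo "c=$c"
```

Here a and b are equal, since the shell hands over identical arguments in both cases, while c differs because the whole command arrives as a single argument.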

How to loop through a script with SLURM? (sbatch and srun)

I'm new to Slurm. I have a script that runs the same command many times with different inputs and outputs. If I keep those commands in a separate shell script, is there a way to loop through it and run each line with its own srun command? My thought was something along these lines:
shell script:
#!/bin/bash
ExCommand -f input1a -b input2a -c input3a -o outputa
ExCommand -f input1b -b input2b -c input3b -o outputb
ExCommand -f input1c -b input2c -c input3c -o outputc
ExCommand -f input1d -b input2d -c input3d -o outputd
ExCommand -f input1e -b input2e -c input3e -o outpute
sbatch script
#!/bin/bash
## Job Name
#SBATCH --job-name=collectAlignmentMetrics
## Allocation Definition
## Resources
## Nodes
#SBATCH --nodes=1
## Time limit
#SBATCH --time=4:00:00
## Memory per node
#SBATCH --mem=64G
## Specify the working directory for this job
for line in shellscript
do
srun command
done
Any ideas?
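Try replacing your for loop with this:
while IFS= read -r line
do
# skip blank lines and comments such as the #!/bin/bash line
if [[ -z $line || $line == \#* ]]; then continue; fi
# </dev/null keeps srun from consuming the rest of shellscript on stdin
srun $line </dev/null
done < shellscript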
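Before submitting anything, it can help to check which lines the loop would actually run. The sketch below wraps the loop in a function whose first parameter is the runner, so echo can stand in for srun; run_commands and its parameters are hypothetical names, and the command list is a throwaway copy of two lines from the question.

```bash
#!/usr/bin/env bash
# run_commands RUNNER FILE (hypothetical helper): run each non-comment line
# of FILE through RUNNER. Pass echo for a dry run, srun for the real thing.
run_commands() {
    local runner=$1 file=$2 line
    while IFS= read -r line; do
        # Skip blank lines and comments, including the #!/bin/bash line.
        if [[ -z $line || $line == \#* ]]; then continue; fi
        # $line is unquoted on purpose so it splits into command and arguments.
        # stdin comes from /dev/null so the runner cannot eat the rest of FILE.
        $runner $line < /dev/null
    done < "$file"
}

# Dry run against a temporary command list.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
#!/bin/bash
ExCommand -f input1a -b input2a -c input3a -o outputa

ExCommand -f input1b -b input2b -c input3b -o outputb
EOF
dry=$(run_commands echo "$tmp")
rm -f "$tmp"
printf '%s\n' "$dry"
```

Inside the sbatch script the call would then be run_commands srun shellscript.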

xargs output buffering -P parallel

I have a bash function that I call in parallel using xargs -P, like so:
echo ${list} | xargs -n 1 -P 24 -I# bash -l -c 'myAwesomeShellFunction #'
Everything works fine, but the output is interleaved for obvious reasons (no buffering).
I'm trying to figure out a way to buffer the output effectively. I was thinking I could use awk, but I'm not good enough to write such a script and I can't find anything worthwhile on Google. Can someone help me write this "output buffer" in sed or awk? Nothing fancy, just accumulate the output and spit it out after the process terminates. I don't care about the order in which the shell functions execute, I just need their output buffered. Something like:
echo ${list} | xargs -n 1 -P 24 -I# bash -l -c 'myAwesomeShellFunction # | sed -u ""'
P.S. I tried to use stdbuf as per https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe but it did not work. I specified line buffering on stdout and stderr, but the output is still unbuffered:
echo ${list} | xargs -n 1 -P 24 -I# stdbuf -i0 -oL -eL bash -l -c 'myAwesomeShellFunction #'
Here's my first attempt, which only captures the first line of output:
$ bash -c "echo stuff;sleep 3; echo more stuff" | awk '{while (( getline line) > 0 )print "got ",$line;}'
$ got stuff
This isn't quite atomic if your output is longer than a page (typically 4 KB), but for most cases it'll do:
xargs -P 24 bash -c 'for arg; do printf "%s\n" "$(myAwesomeShellFunction "$arg")"; done' _
The magic here is the command substitution: $(...) creates a subshell (a fork()ed-off copy of your shell), runs the code ... in it, and then reads that in to be substituted into the relevant position in the outer script.
Note that we don't need -n 1 when dealing with a large number of arguments (for a small number it may improve parallelization), since each of the 24 parallel bash instances iterates over however many arguments it is passed.
If you want to make it truly atomic, you can do that with a lockfile:
# generate a lockfile, arrange for it to be deleted when this shell exits
lockfile=$(mktemp -t lock.XXXXXX); export lockfile
trap 'rm -f "$lockfile"' 0
xargs -P 24 bash -c '
for arg; do
{
output=$(myAwesomeShellFunction "$arg")
flock -x 99
printf "%s\n" "$output"
} 99>"$lockfile"
done
' _
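A self-contained way to try the command-substitution approach is sketched below. The function work is a made-up stand-in for myAwesomeShellFunction, and export -f is an assumption about how the function reaches the child bash processes (the question instead relies on bash -l loading it).

```bash
#!/usr/bin/env bash
# work (stand-in for myAwesomeShellFunction): prints two lines with a pause
# in between, which would interleave across jobs if nothing buffered them.
work() {
    echo "start $1"
    sleep 0.1
    echo "end $1"
}
export -f work   # make the function visible to the bash children of xargs

# Each child collects the function's output with $(...) and prints it in one go.
out=$(printf '%s\n' a b c d | xargs -n 1 -P 4 bash -c '
    for arg; do printf "%s\n" "$(work "$arg")"; done' _)

printf '%s\n' "$out"
```

The order of the four jobs can vary between runs, but each start line should be followed directly by its matching end line.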

Use Bash with script text from stdin and options from command line

I want to use /bin/bash (possibly /bin/sh) with the option -f passed to, and handled by, the script.
Precisely,
while getopts f OPT
do
case $OPT in
"f" ) readonly FLG_F="TRUE"
esac
done
if [ $FLG_F ]; then
rm -rf $KIBRARY_DIR
fi
and when these lines are in a file http://hoge.com/hoge.sh,
I can do this, for instance,
wget http://hoge.com/hoge.sh
/bin/bash hoge.sh -f
but not
/bin/bash -f hoge.sh
I know the reason, but I want to do it like this,
wget -O - http://hoge.com/hoge.sh | /bin/bash
with the -f option going to hoge.sh, not to /bin/bash.
Is there a good way to do this?
/bin/bash <(wget -O - http://hoge.com/hoge.sh) -f
worked, but this is only for bash users, right?
Using bash you can do
wget -O - http://hoge.com/hoge.sh | /bin/bash -s -- -f
because with -s, commands are read from standard input, and this option allows the positional parameters to be set as well.
It should work with other POSIX shells too.
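This can be tried locally without wget. In the sketch below the script text is held in a variable (script, with_flag and without_flag are names made up for the example) and piped to bash, once with -f after the -- separator and once without:

```bash
#!/usr/bin/env bash
# A cut-down version of the option handling, kept in a variable instead of
# being downloaded.
script='
while getopts f OPT; do
    case $OPT in
        f) FLG_F=TRUE ;;
    esac
done
echo "FLG_F=${FLG_F:-unset}"
'

# -s reads the script from stdin; everything after -- becomes $1, $2, ...
with_flag=$(printf '%s\n' "$script" | bash -s -- -f)
without_flag=$(printf '%s\n' "$script" | bash -s)

echo "$with_flag"
echo "$without_flag"
```

The first run reports FLG_F=TRUE and the second FLG_F=unset, which shows that -f reached the script rather than bash itself.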
