bash loop in parallel - bash

I am trying to run this script in parallel, with at most 4 iterations of the loop running at a time. runspr.py is itself parallel, and that's fine. What I want is for only 4 iterations of the i loop to be running at any instant.
My present code starts everything at once.
#!/bin/bash
for i in *
do
  if [[ -d $i ]]; then
    echo "$i is dir"
    cd "$i"
    python3 ~/bin/runspr.py SCF &
    cd ..
  else
    echo "$i not dir"
  fi
done
I have followed https://www.biostars.org/p/63816/ and https://unix.stackexchange.com/questions/35416/four-tasks-in-parallel-how-do-i-do-that
but I am unable to make the code run in parallel.

You don't need a for loop. You can use GNU parallel with find like this:
find . -mindepth 1 -maxdepth 1 -type d -print0 |
parallel -0 --jobs 4 'cd {}; python3 ~/bin/runspr.py SCF'

Another possible solution uses xargs (note the -0, which is needed to read the NUL-separated names that -print0 emits):
find . -mindepth 1 -maxdepth 1 -type d -print0 |
xargs -0 -I {} -P 4 sh -c 'cd {} && python3 ~/bin/runspr.py SCF'
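If neither GNU parallel nor xargs -P is available, the same limit can be sketched in plain bash. This is only an illustration, not part of the answers above: run_in_subdirs is a hypothetical helper name, and it assumes bash 4.3 or later for `wait -n`.

```shell
#!/bin/bash
# Hypothetical helper: run a command in every immediate subdirectory,
# keeping at most max_jobs background jobs alive at once.
# Assumes bash >= 4.3 (for `wait -n`).
run_in_subdirs() {
  local max_jobs=$1
  shift
  local running=0 d
  for d in ./*/; do
    [[ -d $d ]] || continue          # skip the literal pattern if nothing matched
    if (( running >= max_jobs )); then
      wait -n                        # block until any one job finishes
      running=$((running - 1))
    fi
    ( cd "$d" && "$@" ) &            # subshell, so no `cd ..` is needed
    running=$((running + 1))
  done
  wait                               # wait for the remaining jobs
}

# Usage for the case in the question:
#   run_in_subdirs 4 python3 ~/bin/runspr.py SCF
```

The counter only tracks how many jobs this function started and reaped, so it should not be mixed with other background jobs in the same shell.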

Related

Loop over find result in bash

I have a bash script written by a former colleague at my company. Its shellcheck result is horrible and I, who use zsh, can't run the script. He seems to use the notorious find-in-a-for-loop pattern in bash, but I can't figure out how to improve it.
At the moment I have a temporary fix.
This is his code:
#!/bin/bash
releases=$(for d in $(find ${DELIVERIES} -maxdepth 1 -type d -name "*_delivery_33_SR*" | sort) ; do echo ${d##*_} ; done)
for sr in ${releases[@]}
do
echo "Release $sr"
deliveries=$(find ${deliveries_path}/*${sr}/ -type f -name "*.ear" -o -name "*.war" | sort)
if [ ! -e ${sr}.txt ]
then
for d in ${deliveries[@]}
do
echo "$(basename $d)" | tee -a ${sr}.txt
done
fi
echo
done
And this is my code, which at least manages to loop over the first part:
#!/bin/bash
for release in $(for d in $(find "${DELIVERIES}" -maxdepth 1 -type d -name "*_delivery_33_SR*" | sort) ; do echo "${d##*_}" ; done)
do
echo "Release $release"
done
As you can see, I needed to put the find inside the loop. I can't save its output in a variable, because when I try to loop over it, the newlines stay embedded and the whole result behaves like a single element. Could anyone suggest how I should solve this problem? This former colleague uses this kind of find search a lot.
EDIT:
The script went to each folder with a specific name and then created a file X.X.X.txt with the version number in the X part. And appended the filenames inside the subfolder to the X.X.X.txt
Blindly refactoring gets me something like
#!/bin/bash
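for d in "$DELIVERIES"/*_delivery_33_SR*/; do
  d=${d%/}      # drop the trailing slash left by the glob
  sr=${d##*_}
  echo "Release $sr"
  if [ ! -e "${sr}.txt" ]
  then
    # group the -name tests so that -type f applies to both of them
    find "${deliveries_path}"/*"${sr}"/ -type f \( -name "*.ear" -o -name "*.war" \) |
      sort |
      xargs -n 1 basename |
      tee -a "$sr.txt"
  fi
  echo
done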
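If the find results really do need to be kept in a variable, one option is a bash array filled from NUL-separated output. The following is a minimal sketch, not the colleague's method: collect_releases is a hypothetical name, and it assumes bash 4.4 or later (for `mapfile -d`) and GNU sort (for `-z`).

```shell
#!/bin/bash
# Hypothetical helper: store matching directories in an array, one element
# per directory, then print the part of each name after the last underscore.
# Assumes bash >= 4.4 (mapfile -d '') and GNU sort (-z).
collect_releases() {
  local base=$1 d
  local -a dirs
  mapfile -d '' -t dirs < <(
    find "$base" -maxdepth 1 -type d -name '*_delivery_33_SR*' -print0 | sort -z
  )
  for d in "${dirs[@]}"; do
    printf '%s\n' "${d##*_}"
  done
}

# Usage: collect_releases "$DELIVERIES"
```

Because the array is expanded as "${dirs[@]}", each directory stays a single element even if its name contains spaces or newlines.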

Count filenumber in directory with blank in its name

If you want a breakdown of how many files are in each dir under your current dir:
for i in $(find . -maxdepth 1 -type d) ; do
echo -n $i": " ;
(find $i -type f | wc -l) ;
done
It does not work when a directory name contains a blank. Can anyone tell me how I must edit this shell script so that such directory names are also accepted when counting their files?
Thanks
Your code suffers from a common issue described in http://mywiki.wooledge.org/BashPitfalls#for_i_in_.24.28ls_.2A.mp3.29.
In your case you could do this instead:
for i in */; do
echo -n "${i%/}: "
find "$i" -type f | wc -l
done
This will work with all types of file names:
find . -maxdepth 1 -type d -exec sh -c 'printf "%s: %i\n" "$1" "$(find "$1" -type f | wc -l)"' Counter {} \;
How it works
find . -maxdepth 1 -type d
This finds the directories just as you were doing
-exec sh -c 'printf "%s: %i\n" "$1" "$(find "$1" -type f | wc -l)"' Counter {} \;
This feeds each directory name to a shell script which counts the files, similarly to what you were doing.
There are some tricks here: Counter and {} are passed as arguments to the shell script. Counter becomes $0 (which is only used if the shell script generates an error). find replaces {} with the name of a directory it found, and this is available to the shell script as $1. This is done in a way that is safe for all types of file names.
Note that, wherever $1 is used in the script, it is inside double quotes. This protects it from word splitting and other unwanted shell expansions.
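The way the arguments after the script text map to $0 and $1 can be checked in isolation. This is a small sketch; show_args is a hypothetical name used only for the demonstration.

```shell
# Hypothetical demo: the first word after the script text becomes $0,
# the next one becomes $1 and arrives intact even with a space in it.
show_args() {
  sh -c 'printf "%s|%s\n" "$0" "$1"' Counter "$1"
}

show_args "my dir"   # prints: Counter|my dir
```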
I found a solution that works for me, which is to change IFS temporarily so that blanks no longer split the names:
#!/bin/bash
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
for i in $(find . -maxdepth 1 -type d); do
echo -n " $i: ";
(find $i -type f | wc -l) ;
done
IFS=$SAVEIFS
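A variant that leaves IFS alone can be sketched with a glob and NUL-separated find output. count_files_per_dir is a hypothetical name, and the sketch assumes a find that supports -print0 (GNU and BSD find both do).

```shell
# Hypothetical helper: print "<dir>: <number of regular files below it>"
# for every immediate subdirectory of $1.
# Counting NUL bytes rather than lines keeps the count right even for
# file names that contain newlines.
count_files_per_dir() {
  local d n
  for d in "$1"/*/; do
    [ -d "$d" ] || continue          # skip the literal pattern if nothing matched
    n=$(find "$d" -type f -print0 | tr -cd '\0' | wc -c)
    printf '%s: %d\n' "$(basename "$d")" "$n"
  done
}

# Usage: count_files_per_dir .
```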

Execute command in all immediate subdirectories

I'm trying to add a shell function (zsh) mexec to execute the same command in all immediate subdirectories e.g. with the following structure
~
-- folder1
-- folder2
mexec pwd would show for example
/home/me/folder1
/home/me/folder2
I'm using find to pull the immediate subdirectories. The problem is getting the passed-in command to execute. Here's my first function definition:
mexec() {
find . -mindepth 1 -maxdepth 1 -type d | xargs -I'{}' \
/bin/zsh -c "cd {} && $@;";
}
This only executes the command itself but doesn't pass in the arguments, i.e. mexec ls -al behaves exactly like ls.
Changing the second line to /bin/zsh -c "(cd {} && $@);", mexec works for just mexec ls but shows this error for mexec ls -al:
zsh:1: parse error near `ls'
Going the exec route with find
find . -mindepth 1 -maxdepth 1 -type d -exec /bin/zsh -c "(cd {} && $@)" \;
gives me the same thing, which leads me to believe there's a problem with how I'm passing the arguments to zsh. This also seems to be a problem if I use bash: the error shown is:
-a);: -c: line 1: syntax error: unexpected end of file
What would be a good way to achieve this?
Can you try this simple loop, which visits every sub-directory one level deep and executes the command in each of them:
for d in ./*/ ; do (cd "$d" && ls -al); done
(cmd1 && cmd2) opens a sub-shell to run the commands. Since it is a child shell, the parent shell (the shell from which you're running this command) retains its current folder and other environment variables.
Wrap it in a function in a proper zsh script as
#!/bin/zsh
function runCommand() {
  # "$@" keeps every argument as its own word; the subshell keeps the cd local
  for d in ./*/ ; do (cd "$d" && "$@"); done
}
runCommand ls -al
should work just fine for you.
#!/bin/zsh
# A simple script with a function...
mexec()
{
export THE_COMMAND=$@
find . -maxdepth 1 -mindepth 1 -type d -print0 | xargs -0 -I{} zsh -c 'cd "{}" && echo "{}" && echo "$('$THE_COMMAND')" && echo -e'
}
mexec ls -al
This one uses https://github.com/sharkdp/fd, but you could just as well use plain old find instead of fdfind:
function inDirs() { fdfind --type d --max-depth 1 --exec bash -c "x={} && echo && echo \$x && echo \${x//?/=} && cd {} && echo '-> '$* && $*" ; }
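The parse errors in the question come from "$@" being expanded inside the double-quoted -c string: only the first word stays in the script text, and the remaining words become separate arguments of the inner shell. One way around this, sketched here with sh rather than zsh, is to keep the script text fixed and pass both the directory and the command as positional parameters. It assumes an xargs that supports -0 and -I together (GNU xargs does).

```shell
# Sketch: the inner script never contains user text. The directory arrives
# as $1, and after `shift` the remaining parameters are the command to run.
mexec() {
  find . -mindepth 1 -maxdepth 1 -type d -print0 |
    xargs -0 -I{} sh -c 'cd "$1" && shift && "$@"' sh {} "$@"
}

# Usage: mexec ls -al
```

One caveat of this sketch: xargs -I also substitutes any literal {} that appears in the command's own arguments.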

BASH script more smart with cat

I have multiple files in multiple folders
[tiagocastro@cascudo clean_reads]$ ls
11 13 14 16 17 18 3 4 5 6 8 9
and I want to make a tiny bash script to concatenate these files inside :
11]$ ls
FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L6_1.fq FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L7_1.fq
FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L6_2.fq FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L7_2.fq
But only L6 with L6 and L7 with L7
Right now I am at a basic level. I want to learn how to do this more smartly, instead of just reproducing in the script the commands I would type in the terminal.
Thank you everybody, for helping me.
This isn't a free programming service, but you can learn something from the following:
#!/bin/bash
echo2() { echo "$@" >&2; }
get_Lnums() {
  # -maxdepth must come before the tests; prints the unique lane tags (L6, L7, ...)
  find . -maxdepth 1 -type f -regextype posix-extended -iregex '.*_L[0-9]+_[0-9]+\.fq' -print | grep -oP '_\KL\d+' | sort -u
}
docat() {
  echo2 "doing $(pwd)"
  for lnum in $(get_Lnums)
  do
    echo cat *_${lnum}_*.fq "> new_${lnum}.all" #remove (comment out) this line when satisfied
    #cat *_${lnum}_*.fq > new_${lnum}.all #and uncomment this
  done
}
while IFS= read -r -d '' dir
do
  (cd "$dir" && docat) #subshell - don't need cd back
done < <(find . -maxdepth 1 -mindepth 1 -type d -print0)
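The lane tags can also be extracted with parameter expansion alone, which avoids the GNU-specific -regextype and grep -P. This is a sketch under the assumption that every file follows the `<prefix>_L<number>_<read>.fq` pattern shown in the question; list_lanes is a hypothetical name.

```shell
# Hypothetical helper: print the unique lane tags (L6, L7, ...) found among
# the .fq files directly inside directory $1.
list_lanes() {
  local f lane
  for f in "$1"/*_L[0-9]*_[0-9]*.fq; do
    [ -e "$f" ] || continue          # skip the literal pattern if nothing matched
    lane=${f##*_L}                   # strip through the last "_L", e.g. "6_1.fq"
    lane=L${lane%%_*}                # keep the number and restore the L, e.g. "L6"
    printf '%s\n' "$lane"
  done | sort -u
}

# Usage: for lnum in $(list_lanes .); do cat ./*_"${lnum}"_*.fq > "new_${lnum}.all"; done
```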

Perform an action in every sub-directory using Bash

I am working on a script that needs to perform an action in every sub-directory of a specific folder.
What is the most efficient way to write that?
A version that avoids creating a sub-process:
for D in *; do
if [ -d "${D}" ]; then
echo "${D}" # your processing here
fi
done
Or, if your action is a single command, this is more concise:
for D in *; do [ -d "${D}" ] && my_command; done
Or an even more concise version (thanks @enzotib). Note that in this version each value of D will have a trailing slash:
for D in */; do my_command; done
for D in `find . -type d`
do
# Do whatever you need with D
done
The simplest non recursive way is:
for d in */; do
echo "$d"
done
The / at the end of the pattern restricts the matches to directories.
There is no need for find, awk, ...
Use the find command.
In GNU find, you can use the -execdir parameter:
find . -type d -execdir realpath "{}" ';'
or by using -exec parameter:
find . -type d -exec sh -c 'cd -P "$0" && pwd -P' {} \;
or with xargs command:
find . -type d -print0 | xargs -0 -L1 sh -c 'cd "$0" && pwd && echo Do stuff'
Or using for loop:
for d in */; { echo "$d"; }
For recursion, try the recursive glob (**/) instead (enable it in bash 4+ with: shopt -s globstar).
For more examples, see: How to go to each directory and execute a command? at SO
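In bash, the recursive `**/` pattern is controlled by the `globstar` option (bash 4.0 or later). A minimal sketch of a recursive directory loop follows; list_dirs_recursive is a hypothetical name.

```shell
#!/bin/bash
# With globstar set, "**/" matches zero or more directory levels, so the
# loop visits the starting directory itself and every directory below it.
# nullglob makes the loop body run zero times if nothing matches.
shopt -s globstar nullglob

list_dirs_recursive() {
  local d
  for d in "$1"/**/; do
    printf '%s\n' "${d%/}"           # strip the trailing slash added by the glob
  done
}

# Usage: list_dirs_recursive .
```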
Handy one-liners
for D in *; do echo "$D"; done
for D in *; do find "$D" -type d; done ### Option A
find * -type d ### Option B
Option A is correct for folders with spaces in their names. It is also generally faster, since it doesn't print each word in a folder name as a separate entity.
# Option A
$ time for D in ./big_dir/*; do find "$D" -type d > /dev/null; done
real 0m0.327s
user 0m0.084s
sys 0m0.236s
# Option B
$ time for D in `find ./big_dir/* -type d`; do echo "$D" > /dev/null; done
real 0m0.787s
user 0m0.484s
sys 0m0.308s
find . -type d -print0 | xargs -0 -n 1 my_command
This will create a subshell (which means that variable values will be lost when the while loop exits):
find . -type d | while read -r dir
do
something
done
This won't:
while read -r dir
do
something
done < <(find . -type d)
Either one will work if there are spaces in directory names (names containing newlines will still be split).
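The difference between the two forms can be seen by counting inside the loop. This sketch assumes bash with its default settings (in particular `lastpipe` off); the function names are hypothetical.

```shell
#!/bin/bash
# Piped form: the while loop runs in a subshell, so its changes to n are lost.
count_with_pipe() {
  local n=0 dir
  find "$1" -type d | while IFS= read -r dir; do
    n=$((n + 1))
  done
  echo "$n"
}

# Process substitution: the loop runs in the current shell, so n survives.
count_with_procsub() {
  local n=0 dir
  while IFS= read -r dir; do
    n=$((n + 1))
  done < <(find "$1" -type d)
  echo "$n"
}
```

For a directory containing two subdirectories, count_with_pipe prints 0 and count_with_procsub prints 3 (the directory itself plus the two below it).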
You could try:
#!/bin/bash
### $1 == the first argument to this script
### usage: script.sh /path/to/dir/
for f in $(find "$1" -maxdepth 1 -mindepth 1 -type d); do
  (
    # subshell: the cd does not affect the next iteration
    cd "$f" || exit
    <your job here>
  )
done
or similar...
Explanation:
find "$1" -maxdepth 1 -mindepth 1 -type d :
Only find directories with a maximum recursive depth of 1 (only the subdirectories of $1) and a minimum depth of 1 (which excludes the starting folder itself).
The accepted answer will break on whitespace if the directory names contain any, and the preferred syntax is $() for bash/ksh. Use the GNU find -exec option with {} +, e.g.
find .... -exec mycommand {} + #this is the same as passing to xargs
or use a while loop
find .... | while read -r D
do
# use variable `D` or whatever variable name you defined instead here
done
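When the action has to run inside each directory, `-exec ... {} +` can hand a batch of names to a small inline script. This is a sketch only; print_subdir_paths is a hypothetical name and `pwd -P` stands in for the real job.

```shell
# Sketch: find passes many directory names at once to a single sh process.
# "for d do" loops over the positional parameters, and each cd happens in a
# subshell so that it does not affect the next iteration.
print_subdir_paths() {
  find "$1" -mindepth 1 -maxdepth 1 -type d -exec sh -c '
    for d do
      (cd "$d" && pwd -P)
    done' sh {} +
}

# Usage: print_subdir_paths .
```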
If you want to perform an action INSIDE the folder and not ON the folder.
Explanation: you have many PDFs and you would like to concentrate them inside a single folder.
My folders:
AV 001/
AV 002/
for D in *; do cd "$D"; # VERY DANGEROUS COMMAND - DON'T USE
#-- missing "/", it will list files too. It can go up too.
for d in */; do cd "$d"; echo $d; cd ..; done; # works successfully
for D in "$(ls -d */)"; do cd "$D"; done; # bash: cd: $'Athens Voice 001/\nAthens Voice 002/' - there is no such folder
for D in "$(*/)"; do cd "$D"; done; # bash: Athens Voice 001/: is folder
for D in "$(`find . -type d`)"; do cd $D; done; # bash: ./Athens: there is no such folder or file
for D in *; do if [ -d "${D}" ] then cd ${D}; done; # many arguments
