Correct usage of find and while-read loop in different formats? - bash

After reading multiple answers on Stack Overflow, I came up with the following solution to read directory paths from find's output:
find "$searchdir" -type d -execdir test -d {}/.git \; -prune -print0 | while read -r -d $'\0' dir; do
# do stuff
done
However, most sources recommend something like the following approach:
while IFS= read -r -d '' file; do
some command "$file"
done < <(find . -type f -name '*.mp3' -print0)
Why are they using process substitution? Does this change anything about the whole process, or is it just another way to do the same thing?
Is the read argument -d '' different from -d $'\0', or is it again the same thing? Does an empty string always contain at least \0, making the bash-specific $'' syntax completely unnecessary?
I also tried doing it directly in find -exec/-execdir by passing it multiple times and failed. Maybe filtering and testing can be done in one command?
Non-working example:
find "$repositories_root_dir" -type d -execdir test -d {}/.git \; -prune -execdir sh -c "if git ls-remote --exit-code . \"origin/${target_branch_name}\" &> /dev/null; then echo \"Found branch '${target_branch_name}' in {}\"; git checkout \"${target_branch_name}\"; fi" \;
Sources:
https://github.com/koalaman/shellcheck/wiki/Sc2044
https://mywiki.wooledge.org/BashPitfalls#for_f_in_.24.28ls_.2A.mp3.29

In your non-working example, if you test the existence of a .git sub-directory to process only git clones and discard the other directories, then you should probably not prune because it does the exact opposite: skip only git clones.
Moreover, when using -execdir sh -c SCRIPT, you should pass positional parameters to your script instead of trying to embed the current directory name in the script with {}, which is not portable. And you could do the same for the branch name. Note that the directory name is not needed for what you try to accomplish in each git clone, because your script is executed from there.
Try this, maybe:
find "$repositories_root_dir" -type d -name '.git' -execdir sh -c '
if git ls-remote --exit-code . "origin/$1" > /dev/null 2>&1; then
printf "Found branch %s in " "$1"; pwd
echo git checkout "$1"
fi' _ "$target_branch_name" \;
(_ is assigned to positional parameter $0). Remove the echo if the result looks correct.
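The question's first two points (pipe versus process substitution, and -d '' versus -d $'\0') can be checked with a small experiment. This is only a sketch, assuming bash with its default settings (the lastpipe option off); the temporary directory and the counter names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: compare a piped while-read loop with process substitution.
# Assumes bash with default options (lastpipe off).

tmp=$(mktemp -d)
mkdir "$tmp/a dir" "$tmp/b"

count_pipe=0
find "$tmp" -mindepth 1 -type d -print0 | while IFS= read -r -d '' dir; do
    count_pipe=$((count_pipe + 1))   # runs in a subshell, so the change is lost
done

count_procsub=0
# $'\0' expands to the empty string in bash, so this is the same as -d ''
while IFS= read -r -d $'\0' dir; do
    count_procsub=$((count_procsub + 1))   # runs in the current shell
done < <(find "$tmp" -mindepth 1 -type d -print0)

echo "pipe: $count_pipe, process substitution: $count_procsub"
rm -rf -- "$tmp"
```

With these assumptions the pipe version reports 0 and the process substitution version reports 2: both loops see the same file names, but only the second one can change variables that are still visible after the loop.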

Related

find + cp spaces in path AND need to rename. Howto?

I need to find all files recursively with the name 'config.xml' and set them aside for analysis. The paths have spaces in them just to keep it interesting. However, I need them to be unique or they will collide in the same folder. What I would like to do is basically copy them off but using the name of the directory they were found in. The command I want is something like from this question except I need it to do something like $(dirname {}). When I do that, nothing gets moved (but I get no error)
Sample, but non-functional command:
find . -name 'config.xml' -exec sh -c 'cp "$1" "$2.xml"' -- {} "$HOME/data/$(dirname {})" \;
To do this with just one shell, not one per file found (as used by prior answers):
while IFS= read -r -d '' filename; do
outFile="$HOME/data/${filename%/*}.xml"
mkdir -p -- "${outFile%/*}"
cp -- "$filename" "$outFile"
done < <(find . -name 'config.xml' -print0)
This way your find emits a NUL-delimited stream of filenames, consumed one-by-one by the while read loop in the parent shell.
(You could use "$HOME/data/$(dirname "$filename").xml", but from a performance perspective that's really silly: $() fork()s off a subshell, and dirname is an external executable that needs to be exec'd, linked and loaded; no point to all that overhead when you can just do the string manipulation internal to the shell itself).
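As a minimal illustration of the in-shell string manipulation mentioned above (the sample path and variable names are invented for the example):

```shell
# Sketch: parameter expansion as a cheap stand-in for dirname/basename.
filename='./some dir/sub dir/config.xml'

parent=${filename%/*}     # strip the shortest trailing /... match (like dirname)
leaf=${filename##*/}      # strip the longest leading .../ match (like basename)
out_file="data/${parent}.xml"

printf '%s\n' "$parent" "$leaf" "$out_file"
```

One difference to keep in mind: for a name without any slash, ${filename%/*} returns the name unchanged, whereas dirname would print a dot. Paths printed by find . always contain a slash, so this does not matter here.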
You may use it like this:
find . -name 'config.xml' -exec bash -c \
'd="$HOME/data/${1%/*}/"; mkdir -p "$d"; command cp -p "$1" "$d"' - {} \;
-exec sh is a little hard to handle, but not impossible. The $(dirname ...) is expanded by your shell before find (and sh) is run, so it is equivalent to dirname {}, the dirname of a file literally named {}. Instead, use -exec sh -c ' .... ' -- {} and put the $(dirname ...) inside the sh script, using $1.
find . -name 'config.xml' -exec sh -c 'cp "$1" "$2/data/$(dirname "$1").xml"' -- {} "$HOME" \;
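The difference in expansion time can be seen without find at all. The following sketch uses a made-up sample path and passes _ as a placeholder for $0 (an assumption of this example, not what the command above uses):

```shell
# Sketch: when is $(dirname ...) evaluated?

# Expanded by the calling shell: dirname only ever sees the literal string {}
early="$HOME/data/$(dirname {})"

# Expanded inside sh -c: dirname sees the real path passed as $1
late=$(sh -c 'printf "%s" "$2/data/$(dirname "$1")"' _ './some dir/config.xml' "$HOME")

printf '%s\n' "$early" "$late"
```

The first value always ends in /data/. regardless of which file find visits, which is why the command from the question copies nothing useful; the second one contains the directory of the file.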

Strip ./ from filename in find -execdir

Whole story: I am writing the script that will link all files from one directory to another. New file name will contain an original directory name. I use find at this moment with -execdir option.
This is how I want to use it:
./linkPictures.sh 2017_wien 2017/10
And it will create a symbolic link 2017_wien_picture.jpg in 2017/10 pointing to a file 2017_wien/picture.jpg.
This is my current script:
#!/bin/bash
UPLOAD="/var/www/wordpress/wp-content/uploads"
SOURCE="$UPLOAD/photo-gallery/$1/"
DEST="$UPLOAD/$2/"
find $SOURCE -type f -execdir echo ln -s {} $DEST/"$1"_{} ";"
It prints:
ln -s ./DSC03278.JPG /var/www/wordpress/wp-content/uploads/2017/10/pokus_./DSC03278.JPG
This is what I want:
ln -s ./DSC03278.JPG /var/www/wordpress/wp-content/uploads/2017/10/pokus_DSC03278.JPG
How to implement it? I do not know how to incorporate basename to strip the ./.
To run basename on {} you would need to execute a command through sh:
find "$SOURCE" -type f -execdir sh -c "echo ln -s '{}' \"$DEST/${1}_\$(basename \"{}\")\"" ";"
This won't win any speed contests (because of the sh for every file), but it will work.
All the quoting may look a bit crazy, but it's necessary to make it safe for files that may contain spaces.
You can use this find with bash -c:
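find "$SOURCE" -type f -execdir bash -c 'echo ln -s "$3" "$1/${2}_${3#./}"' - "$DEST" "$1" '{}' \;
${3#./} will strip the leading ./ from each entry that find passes in.
$DEST and $1 are passed as arguments because variables of the calling script are not visible inside the single-quoted bash -c script unless they are exported.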
If you have a large number of files to process, I suggest this while loop with process substitution for faster execution, since it doesn't spawn a new bash for every file. It also handles filenames with whitespace and other special characters, provided read is told to use NUL as the delimiter to match -print0:
while IFS= read -r -d '' file; do
echo ln -s "$file" "$DEST/${1}_${file##*/}"
done < <(find "$SOURCE" -type f -print0)
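For experimenting, the same idea can be wrapped in a function and tried in a throwaway directory. This is a sketch only: link_pictures is a hypothetical helper name, the directory layout is invented, and files with the same name in different subdirectories would collide.

```shell
#!/usr/bin/env bash
# link_pictures SRC DEST PREFIX  (hypothetical helper, not from the question)
# SRC should be an absolute path so that the symlinks resolve from DEST.
link_pictures() {
    local src=$1 dest=$2 prefix=$3 file
    while IFS= read -r -d '' file; do
        # ${file##*/} keeps only the file name, so no ./ or directory part remains
        ln -s -- "$file" "$dest/${prefix}_${file##*/}"
    done < <(find "$src" -type f -print0)
}

# Demo in a temporary directory (mktemp -d returns an absolute path)
demo=$(mktemp -d)
mkdir -p "$demo/2017_wien" "$demo/2017/10"
touch "$demo/2017_wien/my picture.jpg"

link_pictures "$demo/2017_wien" "$demo/2017/10" 2017_wien
ls -l "$demo/2017/10"
```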

How to cd into grep output?

I have a shell script which basically searches all folders inside a location and I use grep to find the exact folder I want to target.
for dir in /root/*; do
grep "Apples" "${dir}"/*.* || continue
While grep successfully finds my target directory, I'm stuck on how I can move the folders I want to move in my target directory. An idea I had was to cd into grep output but that's where I got stuck. Tried some Google results, none helped with my case.
Example grep output: Binary file /root/ant/containers/secret/Documents/2FD412E0/file.extension matches
I want to cd into 2FD412E0 and move two folders inside that directory.
dirname is the key to that:
cd "$(dirname "$(grep -l "...." ...)")"
will let you enter the directory.
As people mentioned, dirname is the right tool to strip off the file name from the path.
I would use find for this kind of task:
while IFS= read -r file
do
target_dir=$(dirname "$file")
# do something with "$target_dir"
done < <(find /root/ -type f \
-exec grep "Apples" --files-with-matches {} \;)
Consider using find's -maxdepth option. See the man page for find.
Well, there is actually a simpler solution :) I just like to write bash scripts. You might simply use a single find command like this:
find /root/ -type f -exec grep Apples {} ';' -exec ls -l {} ';'
Note the second -exec. It will be executed, if the previous -exec command exited with status 0 (success). From the man page:
-exec command ;
Execute command; true if 0 status is returned. All following arguments to find are taken to be arguments to the command until an argument consisting of ; is encountered. The string {} is replaced by the current file name being processed everywhere it occurs in the arguments to the command, not just in arguments where it is alone, as in some versions of find.
Replace the ls -l command with your stuff.
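The chaining behaviour can be tried on two throwaway files. In this sketch the file names and contents are made up, and basename stands in for "your stuff":

```shell
# Sketch: the second -exec only runs for files where the first one succeeded.
tmp=$(mktemp -d)
printf 'Apples\n' > "$tmp/match.txt"
printf 'Pears\n'  > "$tmp/other.txt"

# grep -q exits 0 only for match.txt, so basename is run for that file alone
matched=$(find "$tmp" -type f -exec grep -q Apples {} \; -exec basename {} \;)

printf '%s\n' "$matched"
rm -rf -- "$tmp"
```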
And if you want to execute dirname within the -exec command, you may do the following trick:
find /root/ -type f -exec grep -q Apples {} ';' \
-exec sh -c 'cd "$(dirname "$0")" && pwd' {} ';'
Replace pwd with your stuff.
When find is not available
In the comments you write that find is not available on your system. The following solution works without find:
grep -R --files-with-matches Apples "${dir}" | while IFS= read -r file
do
target_dir=$(dirname "$file")
# do something with "$target_dir"
echo "$target_dir"
done

how to get basename in -exec of find?

I cannot get the following piece of script (which is part of a larger backup script) to work correctly:
BACKUPDIR=/BACKUP/db01/physical/incremental # Backups base directory
FULLBACKUPDIR=$BACKUPDIR/full # Full backups directory
INCRBACKUPDIR=$BACKUPDIR/incr # Incremental backups directory
KEEP=5 # Number of full backups (and its incrementals) to keep
...
FIRST_DELETE=`expr $KEEP + 1` # add one to the number of backups to keep, this will be the first deleted
FILE0=`ls -ltr $FULLBACKUPDIR | awk '{print $9}' | tail -$FIRST_DELETE | head -1` # search for the first backup to be deleted
...
find $FULLBACKUPDIR -maxdepth 1 -type d ! -newer $FULLBACKUPDIR/$FILE0 -execdir echo "removing: "$FULLBACKUPDIR/$(basename {}) \; -execdir bash -c 'rm -rf $FULLBACKUPDIR/$(basename {})' \; -execdir echo "removing: "$INCRBACKUPDIR/$(basename {}) \; -execdir bash -c 'rm -rf $INCRBACKUPDIR/$(basename {})' \;
So the find works correctly which on its own will output something like this:
/BACKUPS/db01/physical/incremental/full/2013-08-12_17-51-28
/BACKUPS/db01/physical/incremental/full/2013-08-12_17-51-28
/BACKUPS/db01/physical/incremental/full/2013-08-12_17-25-07
What I want is the -exec to echo a line showing what is being removed and then remove the folder from both directories.
I've tried various ways to get just the basename but nothing seems to be working. I get this:
removing: /BACKUPS/mysql/physical/incremental/full/"/BACKUPS/mysql/physical/incremental/full/2013-08-12_17-51-28"
removing: /BACKUPS/mysql/physical/incremental/incr/"/BACKUPS/mysql/physical/incremental/full/2013-08-12_17-51-28"
removing: /BACKUPS/mysql/physical/incremental/full/"/BACKUPS/mysql/physical/incremental/full/2013-08-12_17-25-07"
And of course the folders aren't deleted because they don't exist; rm just fails silently because of the -f option. If I remove the -f I get the 'cannot be found' error on each rm.
How do I accomplish this? Because backups and parts of backups may be stored across different storage systems I really need the ability to just get the folder name for use in any known path.
The $(basename {}) is expanded by your shell before find runs, turning "removing: "$INCRBACKUPDIR/$(basename {}) into "removing: "$INCRBACKUPDIR/{}; only then does find substitute the path for {}.
A way around it may be to have find print the commands and pipe them to bash:
-exec echo "echo \"removing: \\\"$INCRBACKUPDIR/\$(basename {})\\\"\"" \; | bash
Lots of broken here.
All caps variables are by convention env vars and should not be used in scripts.
Using legacy backticks instead of $()
Parsing the output of ls (!)
Parsing the output of ls -l (!!!)
Expanding variables known to contain paths without full quotes.
All you absolutely need in order to improve this is to -exec bash properly, e.g.
-execdir bash -c 'filepath="$1" ; base=$(basename "$filepath") ; echo use $filepath and $base here' -- {} \;
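A small self-contained variant of this pattern, as a sketch: it uses sh instead of bash, passes the target directory as an extra positional parameter (single-quoted scripts cannot see unexported variables of the caller), and only prints what would be removed. The incr_dir variable and the temporary directory are assumptions of the example.

```shell
# Sketch: pass both the base directory and the found path as positional parameters.
tmp=$(mktemp -d)
mkdir "$tmp/2013-08-12_17-51-28"
incr_dir=/BACKUP/db01/physical/incremental/incr   # only printed, never touched

# $0 is the placeholder _, $1 is incr_dir, $2 is the path found by find
report=$(find "$tmp" -mindepth 1 -maxdepth 1 -type d \
    -exec sh -c 'printf "removing: %s/%s\n" "$1" "$(basename "$2")"' _ "$incr_dir" {} \;)

printf '%s\n' "$report"
rm -rf -- "$tmp"
```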
But how about this instead:
#!/usr/bin/env bash
backup_base=/BACKUP/db01/physical/incremental
full_backup="$backup_base"/full
incremental_backup="$backup_base"/incr
keep=5
rm=echo
let n=0
while IFS= read -r -d $'\0' line ; do
file="${line#* }"
if [[ $n -lt $keep ]] ; then
let n=n+1
continue
fi
base=$(basename "$file")
echo "removing: $full_backup/$base"
"$rm" -rf -- "$full_backup"/"$base"
echo "removing: $incremental_backup/$base"
"$rm" -rf -- "$incremental_backup"/"$base"
done < <(find "$full_backup" -mindepth 1 -maxdepth 1 -printf '%T@ %p\0' 2>/dev/null | sort -z -r -n)
Iterate over files and directories immediately under the backup dir and skip the first 5 newest. Delete from the full and incremental dirs files matching the names of the rest.
This is an essentially safe version, except of course for timing attacks.
I have defined rm as being echo to avoid accidental deletes; swap it back to rm for actual deletion once you're sure it's correct.
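To see the keep-the-newest-N logic in isolation, here is a dry-run sketch on a temporary directory. It assumes GNU find (-printf with '%T@ %p\0'), GNU sort (-z) and GNU touch (-d); the directory names, dates and the doomed array are invented for the example, and nothing is deleted except the temporary directory itself.

```shell
#!/usr/bin/env bash
# Sketch: list everything except the `keep` newest entries, newest first.
tmp=$(mktemp -d)
for i in 1 2 3 4; do
    mkdir "$tmp/backup$i"
    touch -d "2013-08-1$i 12:00:00" "$tmp/backup$i"   # backup4 is the newest
done

keep=2
n=0
doomed=()
while IFS= read -r -d '' line; do
    path=${line#* }                 # drop the leading timestamp and space
    if (( n < keep )); then
        n=$((n + 1))                # still within the newest `keep` entries
        continue
    fi
    doomed+=("${path##*/}")
done < <(find "$tmp" -mindepth 1 -maxdepth 1 -type d -printf '%T@ %p\0' | sort -z -rn)

printf 'would remove: %s\n' "${doomed[@]}"
rm -rf -- "$tmp"
```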

How to go to each directory and execute a command?

How do I write a bash script that goes through each directory inside a parent_directory and executes a command in each directory.
The directory structure is as follows:
parent_directory (name could be anything - doesn't follow a pattern)
    001 (directory names follow this pattern)
        0001.txt (filenames follow this pattern)
        0002.txt
        0003.txt
    002
        0001.txt
        0002.txt
        0003.txt
        0004.txt
    003
        0001.txt
the number of directories is unknown.
This answer posted by Todd helped me.
find . -maxdepth 1 -type d \( ! -name . \) -exec bash -c 'cd "$1" && pwd' _ {} \;
The \( ! -name . \) avoids executing the command in current directory.
You can do the following, when your current directory is parent_directory:
for d in [0-9][0-9][0-9]
do
( cd "$d" && your-command-here )
done
The ( and ) create a subshell, so the current directory isn't changed in the main script.
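A quick way to convince yourself of this, sketched with a temporary directory standing in for parent_directory (the names 001 and 002 and the start variable are assumptions of the example):

```shell
# Sketch: cd inside ( ... ) does not affect the surrounding script.
start=$PWD
tmp=$(mktemp -d)
mkdir "$tmp/001" "$tmp/002"

for d in "$tmp"/[0-9][0-9][0-9]; do
    ( cd "$d" && pwd )      # pwd stands in for your command
done

# Back in the main shell the working directory is unchanged
[ "$PWD" = "$start" ] && echo "still in $start"
rm -rf -- "$tmp"
```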
You can achieve this by piping the directory names to xargs. The catch is that you need the -I flag, which replaces each occurrence of the placeholder ({} here) in your bash command with the item xargs read from its input.
ls -d */ | xargs -I {} bash -c "cd '{}' && pwd"
You may want to replace pwd with whatever command you want to execute in each directory.
If you're using GNU find, you can try -execdir parameter, e.g.:
find . -type d -execdir realpath "{}" ';'
or (as per @gniourf_gniourf's comment):
find . -type d -execdir sh -c 'printf "%s/%s\n" "$PWD" "$0"' {} \;
Note: You can use ${0#./} instead of $0 to fix ./ in the front.
or more practical example:
find . -name .git -type d -execdir git pull -v ';'
If you want to include the current directory, it's even simpler by using -exec:
find . -type d -exec sh -c 'cd -P -- "$1" && pwd -P' sh {} \;
or using xargs:
find . -type d -print0 | xargs -0 -L1 sh -c 'cd "$0" && pwd && echo Do stuff'
Or a similar example suggested by @gniourf_gniourf:
find . -type d -print0 | while IFS= read -r -d '' file; do
# ...
done
The above examples support directories with spaces in their name.
Or by assigning into bash array:
dirs=($(find . -type d))
for dir in "${dirs[@]}"; do
( cd "$dir" && echo "$PWD" )
done
Change . to your specific folder name. If you don't need to run recursively, you can use: dirs=(*) instead. The above example doesn't support directories with spaces in the name.
So as @gniourf_gniourf suggested, the only proper way to put the output of find in an array without using an explicit loop is available since Bash 4.4:
mapfile -t -d '' dirs < <(find . -type d -print0)
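A short sketch of how this behaves with a directory name containing a space. It assumes Bash 4.4 or newer and a sort that supports -z (used here only to make the order predictable); the directory names are made up.

```shell
#!/usr/bin/env bash
# Sketch: read NUL-delimited find output into an array (needs Bash 4.4+).
tmp=$(mktemp -d)
mkdir "$tmp/plain" "$tmp/with space"

mapfile -t -d '' dirs < <(find "$tmp" -mindepth 1 -type d -print0 | sort -z)

echo "found ${#dirs[@]} directories"
for dir in "${dirs[@]}"; do
    printf '<%s>\n' "${dir##*/}"    # each name arrives as one array element
done
rm -rf -- "$tmp"
```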
Or not a recommended way (which involves parsing of ls):
ls -d */ | awk '{print $NF}' | xargs -n1 sh -c 'cd $0 && pwd && echo Do stuff'
The above example would ignore the current dir (as requested by OP), but it'll break on names with the spaces.
See also:
Bash: for each directory at SO
How to enter every directory in current path and execute script? at SE Ubuntu
If the toplevel folder is known you can just write something like this:
for dir in "$YOUR_TOP_LEVEL_FOLDER"/*/
do
for subdir in "$dir"*/
do
[ -d "$subdir" ] || continue
: # put your commands here, e.g. using "$subdir"
done
done
In place of the : placeholder you can put as much code as you want.
Note that I didn't "cd" on any directory.
Cheers,
for dir in PARENT/*
do
test -d "$dir" || continue
# Do something with $dir...
done
While one-liners are good for quick and dirty usage, I prefer the more verbose version below for writing scripts. This is the template I use; it takes care of many edge cases and allows you to write more complex code to execute in a folder. You can write your bash code in the function dir_command. Below, dir_command implements tagging each git repository as an example. The rest of the script calls dir_command for each folder in the directory. An example of iterating through only a given set of folders is also included.
#!/bin/bash
#Use set -x if you want to echo each command while getting executed
#set -x
#Save current directory so we can restore it later
cur=$PWD
#Save command line arguments so functions can access it
args=("$@")
#Put your code in this function
#To access command line arguments use syntax ${args[1]} etc
function dir_command {
#This example command tags the git repository in the folder and pushes the tags
cd "$1" || return
echo "$(tput setaf 2)$1$(tput sgr 0)"
git tag -a "${args[0]}" -m "${args[1]}"
git push --tags
cd "$cur" || return
}
#This loop will go to each immediate child and execute dir_command
find . -maxdepth 1 -type d \( ! -name . \) | while IFS= read -r dir; do
dir_command "$dir/"
done
#This example loop only loops through a given set of folders
declare -a dirs=("dir1" "dir2" "dir3")
for dir in "${dirs[@]}"; do
dir_command "$dir/"
done
#Restore the folder
cd "$cur"
I don't get the point of the file naming pattern, since you only want to iterate through folders... Are you looking for something like this?
cd parent
find . -type d | while IFS= read -r d; do
ls "$d"/
done
You can use
find .
to search all files and directories in the current directory recursively.
Then you can pipe the output to the xargs command like so:
find . | xargs 'command here'
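A concrete sketch of such a pipeline, using -print0 and xargs -0 so that names with spaces survive (both are extensions found in GNU and BSD tools). The directory names are invented and basename stands in for the real command:

```shell
# Sketch: run one command per directory found, NUL-delimited for safety.
tmp=$(mktemp -d)
mkdir "$tmp/a b" "$tmp/c"

# -n 1 makes xargs call basename once per directory
listing=$(find "$tmp" -mindepth 1 -type d -print0 | xargs -0 -n 1 basename | sort)

printf '%s\n' "$listing"
rm -rf -- "$tmp"
```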
#!/bin/bash
# -mindepth 1 / -maxdepth 1 means a depth of exactly one folder
# You can use a narrower pattern instead of "*"; as written it matches any folder
# Note: this loop breaks on folder names that contain whitespace
for folder_to_go in $(find . -mindepth 1 -maxdepth 1 -type d \( -name "*" \) ) ;
do
cd "$folder_to_go" || continue
echo "$folder_to_go" "########################################## "
# whatever you want to do goes here
cd ../ # if mindepth/maxdepth = 2, cd ../../
done
# You can try adding more inner for loops with other patterns to reach deeper folders
You could run a sequence of commands in each folder in one line like:
for d in PARENT_FOLDER/*; do [ -d "$d" ] && (cd "$d" && tar -cvzf "../${d##*/}.tar.gz" -- *.*); done
for p in [0-9][0-9][0-9];do
(
cd "$p" || exit
for f in [0-9][0-9][0-9][0-9]*.txt;do
ls "$f" # your commands go here
done
)
done
