How should I search for a group of files using the Linux `find` command? - bash

I have a group of files in a certain directory. Now I want to search for them in two different directories. I used the code below
(after jumping to the directory that contains that group of files):
ls | while read name; do find ~/dir1 ~/dir2 -name "$name"; done
But I guess it is too slow, since dir1 and dir2 are searched once for each file, so the search runs many times.
Is my guess right? And if so, what should I write instead?

find supports -o for OR operation.
You can use this:
files=(); # Initialize an empty bash array
for i in *; do files+=(-o -name "$i"); done # Add names of all the files to the array
find dir1/ dir2/ -type f '(' "${files[@]:1}" ')' # Search for those files
e.g., consider this case:
$ touch a b c
$ ls
a b c
$ files=()
$ for i in *; do files+=(-o -name "$i"); done
$ printf '%s ' "${files[@]}"; echo
-o -name a -o -name b -o -name c
$ printf '%s ' "${files[@]:1}"; echo
-name a -o -name b -o -name c
$ printf '%s ' find dir1/ dir2/ -type f '(' "${files[@]:1}" ')'; echo
find dir1/ dir2/ -type f ( -name a -o -name b -o -name c ) # This is the command that actually runs.

How do I use parens '()' in a find command when building options from array?

I have a function that looks like this. I have stripped error handling, and the commands outside the function are to make sure I have something to look for in the example.
#!/bin/bash
findfiles() {
local path=$1
local mtime=$2
local prunedirs=$3
local -a fopts
fopts+=("$path")
[[ -n $prunedirs ]] && {
fopts+=('-type' 'd')
fopts+=('(' '-path')
fopts+=("${prunedirs// / -o -path }")
fopts+=(')' '-prune' '-o')
}
fopts+=('-type' 'f')
fopts+=('-writable')
fopts+=('-mtime' "+$mtime")
[[ -n $prunedirs ]] && fopts+=('-print')
echo "find ${fopts[*]}"
find "${fopts[@]}"
}
mkdir -p dir1/{dir2,dir3}
touch dir1/5daysago.txt -mt "$(date -d 'now - 5 days' +%Y%m%d%H%M)"
touch dir1/dir2/6daysago.txt -mt "$(date -d 'now - 6 days' +%Y%m%d%H%M)"
touch dir1/dir3/10daysago.txt -mt "$(date -d 'now - 10 days' +%Y%m%d%H%M)"
echo '---------------------------------------------'
findfiles dir1 4
echo '---------------------------------------------'
findfiles dir1 4 'dir1/dir2'
echo '---------------------------------------------'
findfiles dir1 4 "dir1/dir2 dir1/dir3"
This outputs the following:
---------------------------------------------
find dir1 -type f -writable -mtime +4
dir1/dir2/6daysago.txt
dir1/dir3/10daysago.txt
dir1/5daysago.txt
---------------------------------------------
find dir1 -type d ( -path dir1/dir2 ) -prune -o -type f -writable -mtime +4 -print
dir1/dir3/10daysago.txt
dir1/5daysago.txt
---------------------------------------------
find dir1 -type d ( -path dir1/dir2 -o -path dir1/dir3 ) -prune -o -type f -writable -mtime +4 -print
dir1/dir2/6daysago.txt
dir1/dir3/10daysago.txt
dir1/5daysago.txt
Notice that the third attempt does not prune the directories. If I copy and paste the find (escaping the parens) it works correctly.
$ find dir1 -type d \( -path dir1/dir2 -o -path dir1/dir3 \) -prune -o -type f -writable -mtime +4 -print
dir1/5daysago.txt
What am I doing wrong?
You need to add -o and the -path primary as separate array elements. Each directory to prune should be passed as a separate argument, not a single space-separated string.
findfiles() {
local path=$1
local mtime=$2
shift 2
n=$# # Remember for later
local -a fopts
fopts+=("$path")
if (( $# > 0 )); then
fopts+=(-type d '(')
while (( $# > 1 )); do
fopts+=(-path "$1" -o)
shift
done
fopts+=(-path "$1" ')' -prune -o)
fi
fopts+=('-type' 'f')
fopts+=('-writable')
fopts+=('-mtime' "+$mtime")
# Now it's later
((n > 0)) && fopts+=('-print')
echo "find ${fopts[*]}"
find "${fopts[@]}"
}
findfiles dir1 4 "dir1/dir2" "dir1/dir3"
Change echo "find ${fopts[*]}" to declare -p fopts to unambiguously print the options. Doing so will show that the -o -path part is being added as a single word:
$ declare -p fopts
declare -a fopts=(
[0]="dir1" [1]="-type" [2]="d" [3]="(" [4]="-path"
[5]="dir1/dir2 -o -path dir1/dir3" [6]=")" [7]="-prune" [8]="-o" [9]="-type" [10]="f"
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[11]="-writable" [12]="-mtime" [13]="+4" [14]="-print"
)
To fix it you'll want to add each directory to prune to the array individually, something like:
local prunedirs=("${@:3}")
...
fopts+=(-type d '(' -false)
for dir in "${prunedirs[@]}"; do
fopts+=(-o -path "$dir")
done
fopts+=(')' -prune -o)
I've switched prunedirs to an array so it can handle directory names with whitespace.
It starts with an initial -false check so there's no need to check if prunedirs is empty. If it's empty the whole thing is still added but since it just says -type d '(' -false ')' -prune -o it's a no-op.
Also, notice you don't have to quote every single argument. It's fine to write -type d and such unquoted, the same as you would if you typed them at the command line. Only '(' and ')' need single quotes.
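Putting those pieces together, the whole function might look like this (a sketch only, assembled from the fragments above; the name `findfiles` and the option layout come from the question):

```shell
#!/bin/bash
findfiles() {
    local path=$1
    local mtime=$2
    local -a prunedirs=("${@:3}")   # each directory to prune is its own argument
    local -a fopts
    fopts+=("$path")
    fopts+=(-type d '(' -false)     # -false keeps the group a harmless no-op when empty
    local dir
    for dir in "${prunedirs[@]}"; do
        fopts+=(-o -path "$dir")
    done
    fopts+=(')' -prune -o)
    fopts+=(-type f -writable -mtime "+$mtime" -print)
    find "${fopts[@]}"
}
```

With the question's test layout, `findfiles dir1 4 dir1/dir2 dir1/dir3` then prints only `dir1/5daysago.txt`.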

BASH: If condition uses result from find command to determine which file will be written

I want to list all files in a nested directory, but some files in that directory have spaces in their names. So I want to write the paths of the files that don't have a space in their name and of those that do into 2 different files.
So far, I just know how to find those having space in their name by this command:
find /<my directory> -type f -name '* *'
I want something like:
find /<my directory> -type f
if [ name has space]
then > a.txt
else > b.txt
fi
Thank you in advance.
You can put a condition in a brief -exec. This is somewhat more complex than you would hope because -exec cannot directly contain shell builtins.
find "$path" -type f -exec sh -c 'for f; do
case $f in *\ *) dest=a;; *) dest=b;; esac;
echo "$f" >>$dest.txt
done' _ {} +
In other words, pass the found files to the following sh -c ... script. (The underscore is to populate $0 with something inside the subshell.)
If the directory tree isn't too deep, perhaps it would be a lot easier to just run find twice.
find "$path" -type f -name '* *' >a.txt
find "$path" -type f \! -name '* *' >b.txt
Use two separate commands:
find "$path" -type f -name '* *' > a.txt
find "$path" -type f -not -name '* *' > b.txt
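With GNU find you can also do the split in a single traversal: the -fprintf action writes each match straight to a named file (a GNU extension; this is a sketch using a hypothetical `demo` directory):

```shell
#!/bin/bash
# hypothetical demo directory with one spaced and one plain file name
mkdir -p demo
touch demo/'a b.txt' demo/c.txt

# one traversal: names containing a space go to a.txt, the rest to b.txt
# (note: -fprintf creates/truncates both output files up front)
find demo -type f \( -name '* *' -fprintf a.txt '%p\n' \
                  -o -fprintf b.txt '%p\n' \)
```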

Count filenumber in directory with blank in its name

If you want a breakdown of how many files are in each dir under your current dir:
for i in $(find . -maxdepth 1 -type d) ; do
echo -n $i": " ;
(find $i -type f | wc -l) ;
done
It does not work when the directory name has a blank in it. Can anyone here tell me how I must edit this shell script so that such directory names are also accepted when counting their file contents?
Thanks
Your code suffers from a common issue described in http://mywiki.wooledge.org/BashPitfalls#for_i_in_.24.28ls_.2A.mp3.29.
In your case you could do this instead:
for i in */; do
echo -n "${i%/}: "
find "$i" -type f | wc -l
done
This will work with all types of file names:
find . -maxdepth 1 -type d -exec sh -c 'printf "%s: %i\n" "$1" "$(find "$1" -type f | wc -l)"' Counter {} \;
How it works
find . -maxdepth 1 -type d
This finds the directories just as you were doing
-exec sh -c 'printf "%s: %i\n" "$1" "$(find "$1" -type f | wc -l)"' Counter {} \;
This feeds each directory name to a shell script which counts the files, similarly to what you were doing.
There are some tricks here: Counter {} are passed as arguments to the shell script. Counter becomes $0 (which is only used if the shell script generates an error). find replaces {} with the name of a directory it found, and this is available to the shell script as $1. This is done in a way that is safe for all types of file names.
Note that, wherever $1 is used in the script, it is inside double quotes. This protects it from word splitting and other unwanted shell expansions.
I found the solution. This is what I have to consider:
#!/bin/bash
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
for i in $(find . -maxdepth 1 -type d); do
echo -n " $i: ";
(find $i -type f | wc -l) ;
done
IFS=$SAVEIFS
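An alternative that avoids touching IFS at all is to have find emit NUL-delimited names and read them back in a while loop; a sketch of the same per-directory count:

```shell
#!/bin/bash
while IFS= read -r -d '' i; do
    printf '%s: %s\n' "$i" "$(find "$i" -type f | wc -l)"
done < <(find . -maxdepth 1 -type d -print0)
```

Because the names are delimited by NUL bytes, this handles blanks and even newlines in directory names.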

Specify multiple directories for recursive search

I have a bash script which searches through all the sub-directories (at all levels) given a target directory:
#! /bin/bash
DIRECTORIES="/home/me/target_dir_1"
for curr in $DIRECTORIES
do
...
Now I want the script to search multiple target directories such as target_dir_1, target_dir_2, target_dir_3. How should I modify the script to do this?
use find instead.
find /home/me/target_dir_1 -type d
You can put that in a for loop:
for d in target_dir_1 target_dir_2
do
find /home/me/"$d" -type d
done
If it is always /home/me, and you want to search all the directories under that, do the following:
find /home/me -type d
#!/bin/bash
GIVEN_DIR=$1 ## Or you could just set the value here instead of using $1.
while read -r DIR; do
echo "$DIR" ## do something with subdirectory.
done < <(exec find "$GIVEN_DIR" -type d -mindepth 1)
Run with:
bash script.sh dir
Note that word splitting is a bad idea so don't do this:
IFS=$'\n'
for DIR in $(find "$GIVEN_DIR" -type d -mindepth 1); do
echo "$DIR" ## do something with subdirectory.
done
Nor with other forms of splitting, e.g. when you use -print0 with find; that output is fine as long as you still read it with a while loop:
while read -r -d '' DIR; do
echo "$DIR" ## do something with subdirectory.
done < <(exec find "$GIVEN_DIR" -type d -mindepth 1 -print0)
Lastly you could record those on an array:
readarray -t SUBDIRS < <(exec find "$GIVEN_DIR" -type d -mindepth 1)
for DIR in "${SUBDIRS[@]}"; do
echo "$DIR" ## do something with subdirectory.
done
Say:
for i in /home/me/target_dir_{1..5}; do
echo $i;
done
This would result in:
/home/me/target_dir_1
/home/me/target_dir_2
/home/me/target_dir_3
/home/me/target_dir_4
/home/me/target_dir_5
Alternatively, you can specify the variable as an array and loop over it:
DIRECTORIES=( /home/me/target_dir_1 /home/me/target_dir_2 /home/me/target_dir_3 )
for i in "${DIRECTORIES[@]}"; do echo "$i"; done
which would result in
/home/me/target_dir_1
/home/me/target_dir_2
/home/me/target_dir_3

Perform an action in every sub-directory using Bash

I am working on a script that needs to perform an action in every sub-directory of a specific folder.
What is the most efficient way to write that?
A version that avoids creating a sub-process:
for D in *; do
if [ -d "${D}" ]; then
echo "${D}" # your processing here
fi
done
Or, if your action is a single command, this is more concise:
for D in *; do [ -d "${D}" ] && my_command; done
Or an even more concise version (thanks @enzotib). Note that in this version each value of D will have a trailing slash:
for D in */; do my_command; done
for D in `find . -type d`
do
# Do whatever you need with "$D"
done
The simplest non recursive way is:
for d in */; do
echo "$d"
done
The / at the end restricts the matches to directories only.
There is no need for
find
awk
...
Use find command.
In GNU find, you can use -execdir parameter:
find . -type d -execdir realpath "{}" ';'
or by using -exec parameter:
find . -type d -exec sh -c 'cd -P "$0" && pwd -P' {} \;
or with xargs command:
find . -type d -print0 | xargs -0 -L1 sh -c 'cd "$0" && pwd && echo Do stuff'
Or using for loop:
for d in */; { echo "$d"; }
For recursion, try globbing with (**/) instead (enable it with: shopt -s globstar).
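A recursive pure-bash variant, assuming bash 4+ with the globstar option enabled, might look like:

```shell
#!/bin/bash
shopt -s globstar          # bash 4+: make ** match recursively
for d in **/; do           # trailing / restricts matches to directories
    echo "$d"
done
```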
For more examples, see: How to go to each directory and execute a command? at SO
Handy one-liners
for D in *; do echo "$D"; done
for D in *; do find "$D" -type d; done ### Option A
find * -type d ### Option B
Option A is correct for folders with spaces in their names. It is also generally faster, since it doesn't split a folder name into a separate entry for each word.
# Option A
$ time for D in ./big_dir/*; do find "$D" -type d > /dev/null; done
real 0m0.327s
user 0m0.084s
sys 0m0.236s
# Option B
$ time for D in `find ./big_dir/* -type d`; do echo "$D" > /dev/null; done
real 0m0.787s
user 0m0.484s
sys 0m0.308s
find . -type d -print0 | xargs -0 -n 1 my_command
This will create a subshell (which means that variable values will be lost when the while loop exits):
find . -type d | while read -r dir
do
something
done
This won't:
while read -r dir
do
something
done < <(find . -type d)
Either one will work if there are spaces in directory names.
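The difference becomes visible if you try to accumulate state, such as a counter, in the loop body; a minimal sketch:

```shell
#!/bin/bash
count=0
find . -type d | while read -r dir; do
    count=$((count + 1))   # increments a copy inside the pipeline's subshell
done
echo "$count"              # prints 0: the subshell's changes are lost

count=0
while read -r dir; do
    count=$((count + 1))   # runs in the current shell
done < <(find . -type d)
echo "$count"              # prints the real directory count
```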
You could try:
#!/bin/bash
### $1 == the first args to this script
### usage: script.sh /path/to/dir/
for f in `find . -maxdepth 1 -mindepth 1 -type d`; do
cd "$f"
<your job here>
done
or similar...
Explanation:
find . -maxdepth 1 -mindepth 1 -type d :
Only find directories, at a maximum recursive depth of 1 (only the immediate subdirectories) and a minimum depth of 1 (excluding the current folder .)
the accepted answer will break on white spaces if the directory names have them, and the preferred syntax is $() for bash/ksh. Use GNU find's -exec option with +; e.g.
find .... -exec mycommand {} + # this is the same as passing to xargs
or use a while loop
find .... | while read -r D
do
# use variable `D` or whatever variable name you defined instead here
done
This is for when you want to perform an action INSIDE each folder, not ON the folder.
Explanation: you have many PDFs and you would like to concatenate them inside a single folder.
My folders:
AV 001/
AV 002/
for D in *; do cd "$D"; done # VERY DANGEROUS COMMAND - DON'T USE: it will list plain files too, and it can go up out of the tree.
for d in */; do cd "$d"; echo "$d"; cd ..; done # works successfully
for D in "$(ls -d */)"; do cd "$D"; done # fails: bash: cd: $'Athens Voice 001/\nAthens Voice 002/': there is no such folder
for D in "$(*/)"; do cd "$D"; done # fails: bash: Athens Voice 001/: is a folder
for D in "$(`find . -type d`)"; do cd "$D"; done # fails: bash: ./Athens: there is no such folder or file
for D in *; do if [ -d "${D}" ] then cd ${D}; done # fails: too many arguments
