Loop through all files in a directory and subdirectories using Bash [duplicate] - bash

I know how to loop through all the files in a directory, for example:
for i in *
do
<some command>
done
But I would like to go through all the files in a directory, including (particularly!) all the ones in the subdirectories. Is there a simple way of doing this?

The find command is well suited to this. The simplest form works only if no file name contains white space or other special characters, for example:
for i in $(find . -type f -print)
do
stuff
done
The command generates path names relative to the starting point of the search (the first argument).
As pointed out, this will fail if your file names contain spaces or glob characters, because the shell splits and expands the unquoted output of find before the loop sees it.
You can also use the -exec option which avoids the problem with spaces in file names. It executes the given command for each file found. The braces are a placeholder for the filename:
find . -type f -exec command {} \;
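As a minimal sketch of the difference, the following builds a throwaway tree (the directory and file names are made up for illustration) that contains a name with a space, then counts loop iterations both ways. Handing `{} +` to a small `sh -c` loop is one way to run several commands per file without word splitting.

```shell
#!/bin/sh
# Hypothetical demo tree; the names are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
: > "$demo/plain.txt"
: > "$demo/sub dir/with space.txt"

# Unsafe: the unquoted command substitution is split on whitespace,
# so "sub dir/with space.txt" arrives as several broken words.
unsafe_count=0
for i in $(find "$demo" -type f -print); do
    unsafe_count=$((unsafe_count + 1))
done

# Safe: find hands each path to the inner shell as one argument.
safe_count=$(find "$demo" -type f -exec sh -c '
    for f; do
        printf "x\n"
    done
' sh {} + | wc -l)
safe_count=$((safe_count))   # normalise any padding added by wc

echo "unsafe loop iterations: $unsafe_count"
echo "safe loop iterations:   $safe_count"
rm -rf "$demo"
```

The tree holds two files, so the safe loop should run twice, while the unsafe loop runs once per whitespace-separated word.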

find and xargs are great tools for recursively processing the contents of directories and sub-directories. For example
find . -type f -print0 | xargs -0 command
will run command on batches of files from the current directory and its sub-directories. The -print0 and -0 arguments avoid the usual problems with filenames that contain spaces, quotes or other metacharacters.
If command takes just one argument, you can limit each invocation to a single file with -L1 (or with -n 1, which counts arguments rather than input lines).
find . -type f -print0 | xargs -0 -L1 command
And as suggested by alexgirao, xargs can also name arguments, using -I, which gives some flexibility if command takes options. -I implies -L1.
find . -type f -print0 | xargs -0 -Iarg command arg --option
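To see the batching behaviour, here is a small sketch on a temporary directory (hypothetical file names). The inner `sh -c 'echo "$#"'` just reports how many arguments each invocation received. It uses -n 1 rather than -L1 because -n counts arguments directly.

```shell
#!/bin/sh
demo=$(mktemp -d)
: > "$demo/a b.txt"
: > "$demo/c.txt"
: > "$demo/d.txt"

# Batched: all three paths normally fit into a single invocation.
batched=$(find "$demo" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

# One file per invocation: -n 1 caps each command line at one argument.
per_file=$(find "$demo" -type f -print0 | xargs -0 -n 1 sh -c 'echo "$#"' sh | wc -l)
per_file=$((per_file))   # normalise any padding added by wc

echo "arguments in the batched call: $batched"
echo "invocations with -n 1:         $per_file"
rm -rf "$demo"
```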

recurse() {
    local path=$1 i
    if [ -d "$path" ]; then
        # Note: * does not match hidden files unless dotglob is set
        for i in "$path"/*
        do
            recurse "$i"
        done
    elif [ -f "$path" ]; then
        do-something "$path"
    fi
}
Call recurse with the directory you want to start from as its first argument.
Ex: recurse /path

Related

Rename all files in directory and (deeply nested) sub-directories

What is the shell command for renaming all files in a directory and sub-directory (recursively)?
I would like to add an underscore to all the files ending with *scss from filename.scss to _filename.scss in all the directories and sub-directories.
I have found answers relating to this but most if not all require you to know the filename itself, and I do not want this because the filenames differ and are a lot to know by heart or even type them manually and some of them are deeply nested in directories.
Edit: I was under the impression that the bash -c bit was somehow necessary for multiple expansion of the found element; anubhava's answer proved me wrong. I am leaving that bit in the answer for now as it worked for the OP.
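find . -type f -name '*scss' -exec bash -c 'mv "$1" "${1%/*}/_${1##*/}"' _ {} \;
find . -- find in current directory (recursively)
-type f -- regular files only
-name '*scss' -- matching the pattern *scss (quoted so that the shell passes it to find unexpanded)
-exec -- execute for each element found
bash -c '...' -- execute the command string in a new shell
_ -- placeholder that becomes $0 of that shell
{} -- expands to the path of the element found (which becomes the positional parameter $1 for the bash -c command)
"${1%/*}/_${1##*/}" -- the same path with an underscore inserted before the file name
\; -- end the -exec command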
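One detail worth checking before running a rename like this: find passes the whole path (for example ./a/b/file.scss), so the underscore has to go in front of the last path component, not in front of the path. A sketch of that split using parameter expansion follows; the `prefixed` helper and the sample paths are hypothetical and only print the target name.

```shell
#!/bin/sh
# prefixed PATH: print PATH with "_" inserted before its final component.
# Assumes PATH contains at least one "/", which holds for find output
# when the starting point is "." or a directory.
prefixed() {
    dir=${1%/*}      # everything before the last slash
    base=${1##*/}    # everything after the last slash
    printf '%s/_%s\n' "$dir" "$base"
}

prefixed "./src/components/button.scss"
prefixed "./my dir/theme.scss"
```

Printing the target first is a cheap dry run; the same two expansions can then be used in the mv command.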
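You can use the -execdir option here, which runs the command from the directory that contains each match. GNU find passes the name as ./file.scss, so strip the leading ./ before adding the underscore:
find ./src/components -iname "*.scss" -execdir sh -c 'mv "$1" "_${1#./}"' _ {} \;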
You are close to a solution:
find ./src/components -iname "*.scss" -print0 | xargs -0 -I{} sh -c 'mv "$1" "${1%/*}/_${1##*/}"' _ {}
In this approach, the "loop" is executed by xargs. I prefer this solution over using -exec in find; the syntax is clearer to me. The small sh -c wrapper is needed because {} is the whole path, so the underscore has to be inserted before the file name rather than before the path. -I already runs one command per file, so -n 1 is not needed.
Also, if you want to repeat the command without adding a second underscore to files that were already processed, use a regular expression that selects only the files not yet processed:
find ./src/components -iregex ".*/[^_][^/]*\.scss" -print0 | xargs -0 -I{} sh -c 'mv "$1" "${1%/*}/_${1##*/}"' _ {}
The -print0/-0 options also avoid problems with whitespace in file names.
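Another way to skip files that already carry the prefix is to negate a -name test instead of writing a regular expression. This is a sketch on a throwaway tree (hypothetical file names), run twice to show that the second pass leaves everything alone:

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/src/nested dir"
: > "$demo/src/app.scss"
: > "$demo/src/nested dir/_done.scss"     # already prefixed
: > "$demo/src/nested dir/todo.scss"

rename_pass() {
    # ! -name "_*" skips anything that already starts with an underscore.
    find "$demo" -type f -name '*.scss' ! -name '_*' -exec sh -c '
        for f; do
            mv -- "$f" "${f%/*}/_${f##*/}"
        done
    ' sh {} +
}

rename_pass
rename_pass    # second run finds nothing left to rename

result=$(cd "$demo" && find . -type f | sort)
printf '%s\n' "$result"
rm -rf "$demo"
```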
#!/bin/sh
EXTENSION='.scss'
cd YOURDIR || exit 1
# IFS= and -r keep leading blanks and backslashes in names intact
find . -type f | while IFS= read -r LINE; do
    FILE="$( basename "$LINE" )"
    case "$LINE" in
        *"$EXTENSION")
            DIRNAME="$( dirname "$LINE" )"
            mv -v "$DIRNAME/$FILE" "$DIRNAME/_$FILE"
            ;;
    esac
done

Get all occurrences of a string within a directory(including subdirectories) in .gz file using bash?

I want to find all the occurrences of "getId" inside a directory which has subdirectories as follows:
*/*/*/*/*/*/myfile.gz
I tried this:
find -name *myfile.gz -print0 | xargs -0 zgrep -i "getId"
but it didn't work. Can anyone tell me the best and simplest approach to get this?
find ./ -name '*gz' -exec zgrep -aiH 'getSorById' {} \;
find allows you to execute a command on each file using "-exec"; it replaces "{}" with the file name, and you terminate the command with "\;".
I added "-H" to zgrep so that it also prints the file path when it has a match, as it's helpful. "-a" treats binary files as text (since you might get tar-ed gzipped files).
Lastly, it's best to quote your patterns in case bash starts globbing them.
https://linux.die.net/man/1/grep
https://linux.die.net/man/1/find
Use the following find approach:
find . -name '*myfile.gz' -exec zgrep -ai 'getSORByID' {} \;
This will print every line containing the substring getSORByID, ignoring case because of -i.
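To try the approach without touching real data, here is a hedged sketch that builds two small gzip files in a temporary directory (the contents and directory names are invented) and searches them for getId, the string from the question. It assumes gzip and zgrep are installed.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/a/b/c"
printf 'int getId();\nint other();\n' | gzip > "$demo/a/b/c/myfile.gz"
printf 'nothing here\n' | gzip > "$demo/a/myfile.gz"

# The quoted pattern reaches find intact; -H asks zgrep to print the
# file name alongside each matching line.
matches=$(find "$demo" -type f -name '*myfile.gz' -exec zgrep -i -H 'getId' {} \;)

printf '%s\n' "$matches"
rm -rf "$demo"
```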

find piped to xargs with complex command

I am trying to process DVD files that are in many different locations on a disk. The thing they have in common is that they (each set of input files) are in a directory named VIDEO_TS. The output in each case will be a single file named for the parent of this directory.
I know I can get a fully qualified path to each directory with:
find /Volumes/VolumeName -type d -name "VIDEO_TS" -print0
and I can get the parent directory by piping to xargs:
find /Volumes/VolumeName -type d -name "VIDEO_TS" -print0 | xargs -0 -I{} dirname {}
and I also know that I can get the parent directory name on its own by appending:
| xargs -I{} basename {}
What I can't figure out is how do I then pass these parameters to, e.g. HandBrakeCLI:
./HandBrakeCLI -i /path/to/filename/VIDEO_TS -o /path/to/convertedfiles/filename.m4v
I have read here about expansion capability of the shell and suspect that's going to help here (not using dirname or basename for a start), but the more I read the more confused I am getting!
You don't actually need xargs for this at all: You can read a NUL-delimited stream into a shell loop, and run the commands you want directly from there.
#!/bin/bash
source_dir=/Volumes/VolumeName
dest_dir=/Volumes/OtherName
while IFS= read -r -d '' dir; do
name=${dir%/VIDEO_TS} # trim /VIDEO_TS off the end of dir, assign to name
name=${name##*/} # remove everything before last remaining / from name
./HandBrakeCLI -i "$dir" -o "$dest_dir/$name.m4v"
done < <(find "$source_dir" -type d -name "VIDEO_TS" -print0)
See the article Using Find on Greg's wiki, or BashFAQ #001 for general information on processing input streams in bash, or BashFAQ #24 to understand the value of using process substitution (the <(...) construct here) rather than piping from find into the loop.
Also, find contains an -exec action which can be used as follows:
source_dir=/Volumes/VolumeName
dest_dir=/Volumes/OtherName
export dest_dir # export allows use by subprocesses!
find "$source_dir" -type d -name "VIDEO_TS" -exec bash -c '
for dir; do
name=${dir%/VIDEO_TS}
name=${name##*/}
./HandBrakeCLI -i "$dir" -o "$dest_dir/$name.m4v"
done
' _ {} +
This passes the found directory names directly on the argument list to the shell invoked with bash -c. Since a for loop with no in clause iterates over "$@", the argument list, it implicitly iterates over the directories found by find.
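Before starting long conversions it can help to print the commands first. The following dry-run sketch uses an invented directory layout and a placeholder destination; echo stands in front of HandBrakeCLI, so nothing is converted.

```shell
#!/bin/sh
# Hypothetical layout: two DVD rips under a throwaway source directory.
source_dir=$(mktemp -d)
dest_dir=/path/to/convertedfiles      # placeholder; nothing is written there
mkdir -p "$source_dir/Movie One/VIDEO_TS" "$source_dir/box/Movie Two/VIDEO_TS"
export dest_dir

# "for dir" without an "in" list iterates over the positional parameters.
# echo makes this a dry run; remove it to run the real command.
plan=$(find "$source_dir" -type d -name VIDEO_TS -exec sh -c '
    for dir; do
        name=${dir%/VIDEO_TS}    # strip the trailing /VIDEO_TS
        name=${name##*/}         # keep only the parent directory name
        echo ./HandBrakeCLI -i "$dir" -o "$dest_dir/$name.m4v"
    done
' _ {} +)

printf '%s\n' "$plan"
rm -rf "$source_dir"
```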
If I understand what you are trying to do, the simplest solution would be to create a little wrapper which takes a path and invokes your CLI:
File: CLIWrapper
#!/bin/bash
for dir in "$@"; do
./HandBrakeCLI -i "${dir%/*}" -o "/path/to/convertedfiles/${dir##*/}.m4v"
done
Edit: I think I misunderstood the question. It's possible that the above script should read:
./HandBrakeCLI -i "$dir" -o "/path/to/convertedfiles/${dir##*/}.m4v"
or perhaps something slightly different. But the theory is valid. :)
Then you can invoke that script using the -exec option to find. The script loops over its arguments, making it possible for find to send multiple arguments to a single invocation using the + terminator:
find /Volumes/VolumeName -type d -name "VIDEO_TS" -exec ./CLIWrapper {} +

shell script : find a string by searching inside all the files in a folder? [duplicate]

How do I find a string contained in (possibly multiple) files in a folder including hidden files and subfolders?
I tried this command:
find . -maxdepth 1 -name "tes1t" -print0 | sed 's,\.\/,,g'
But this yielded no results.
grep -Hnr PATTERN . if your grep supports -r (recursive, = -d recurse). Note there would be no limit on recursion depths then.
Or try grep -d skip -Hn PATTERN {,.[!.]}*{,/{,.[!.]}*}; this should work since grep accepts multiple file arguments. Just throw away the -d skip stuff if your version of grep doesn't support it. For shells without the brace expansion, use the manually expanded form * */* */.[!.]* .[!.]* .[!.]*/* .[!.]*/.[!.]*.
First of all, your -maxdepth should have been 2 instead of 1; with 1, find does not descend into subdirectories at all. Furthermore, you can simply have find run grep on each file it locates. This can be achieved as follows:
find . -maxdepth 2 -type f -exec grep 'pattern here' '{}' \;
Explanation:
find . runs find in the current directory.
-maxdepth 2 descends into subdirectories, but no further.
-type f matches regular files only.
-exec grep 'pattern' '{}' \; runs grep with the given pattern; {} is replaced by the path of each file found.
Add options to grep for color highlighting, outputting line numbers and/or the file name.
For more information see man find and man grep.
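A small sketch of the same command with file names and line numbers switched on, run against a temporary tree (invented contents) that includes a hidden file and a file below the depth limit:

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/sub/deeper"
printf 'needle here\n' > "$demo/.hidden"
printf 'no match\nneedle again\n' > "$demo/sub/file.txt"
printf 'needle too deep\n' > "$demo/sub/deeper/file.txt"

# -H prints the file name, -n the line number; "{} +" batches files per grep call.
hits=$(cd "$demo" && find . -maxdepth 2 -type f -exec grep -Hn 'needle' {} + | sort)

printf '%s\n' "$hits"
rm -rf "$demo"
```

The hidden file is searched like any other, while sub/deeper/file.txt sits at depth 3 and is skipped.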

Unix find: list of files from stdin

I'm working in Linux & bash (or Cygwin & bash).
I have a huge--huge--directory structure, and I have to find a few needles in the haystack.
Specifically, I'm looking for these files (20 or so):
foo.c
bar.h
...
quux.txt
I know that they are in a subdirectory somewhere under the current directory (.).
I know I can find any one of them with
find . -name foo.c -print. This command takes a few minutes to execute.
How can I print the names of these files with their full directory name? I don't want to execute 20 separate finds--it will take too long.
Can I give find the list of files from stdin? From a file? Is there a different command that does what I want?
Do I have to first assemble a command line for find with -o using a loop or something?
If your directory structure is huge but not changing frequently, it is good to run
cd /to/root/of/the/files
find . -type f -print > ../LIST_OF_FILES.txt # a list of directories is sometimes handy too:
find . -type d -print > ../LIST_OF_DIRS.txt
After that you can find anything very quickly (with grep, sed, etc.) and update the file lists only when the tree changes. (It is a simplified replacement if you don't have locate.)
So,
grep '/foo\.c$' ../LIST_OF_FILES.txt # list all foo.c in the tree
When you want to find a list of files, you can try the following:
grep -F -f wanted_file_list.txt < ../LIST_OF_FILES.txt
or directly with the find command:
find . -type f -print | grep -F -f wanted_file_list.txt
The -f option tells grep to read its patterns from a file, so you can easily search the input for multiple patterns. (grep -F is the standard spelling of fgrep.)
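One caveat: fixed-string patterns match anywhere in the line, so foo.c also matches foo.cpp and notfoo.c. As a rough sketch, assuming one path per line in the list, awk can compare only the last path component instead. The sample list contents below are invented for the demonstration.

```shell
#!/bin/sh
demo=$(mktemp -d)
cat > "$demo/LIST_OF_FILES.txt" <<'EOF'
./src/foo.c
./src/foo.cpp
./include/bar.h
./docs/notfoo.c
./docs/quux.txt
EOF
cat > "$demo/wanted_file_list.txt" <<'EOF'
foo.c
bar.h
quux.txt
EOF

# Substring match: also reports foo.cpp and notfoo.c.
loose=$(grep -F -f "$demo/wanted_file_list.txt" "$demo/LIST_OF_FILES.txt")

# Exact match on the last path component: the first file fills the lookup
# table, the second is filtered against it.
exact=$(awk -F/ 'NR == FNR { want[$0]; next } $NF in want' \
    "$demo/wanted_file_list.txt" "$demo/LIST_OF_FILES.txt")

printf 'loose:\n%s\n\nexact:\n%s\n' "$loose" "$exact"
rm -rf "$demo"
```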
You shouldn't need to run find twenty times.
You can construct a single command with multiple file name tests joined by -o:
find . \( -name 'file1' -o -name 'file2' -o -name 'file3' \) -exec echo {} \;
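With twenty names, the -o chain could be assembled from a list file instead of typed by hand. The sketch below is one possible way, assuming one name per line in a hypothetical wanted_file_list.txt. It accumulates the expression in the positional parameters, which replaces any arguments the script itself received. Note that -name treats each entry as a glob pattern, so names containing *, ? or [ would need escaping.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/tree/a/b" "$demo/tree/c"
: > "$demo/tree/a/b/foo.c"
: > "$demo/tree/c/bar.h"
: > "$demo/tree/c/other.txt"
printf '%s\n' foo.c bar.h quux.txt > "$demo/wanted_file_list.txt"

# Accumulate "-name X -o -name Y ..." in the positional parameters so that
# each name stays a single argument, whatever characters it contains.
set --
while IFS= read -r name; do
    [ -n "$name" ] || continue
    if [ "$#" -eq 0 ]; then
        set -- -name "$name"
    else
        set -- "$@" -o -name "$name"
    fi
done < "$demo/wanted_file_list.txt"

# Guard against an empty list: "\( \)" would be a find syntax error.
found=
if [ "$#" -gt 0 ]; then
    found=$(cd "$demo/tree" && find . -type f \( "$@" \) -print | sort)
fi

printf '%s\n' "$found"
rm -rf "$demo"
```

This walks the tree once, however many names are in the list.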
Is the locate(1) command an acceptable answer? Nightly it builds an index, and you can query the index quite quickly:
$ time locate id_rsa
/home/sarnold/.ssh/id_rsa
/home/sarnold/.ssh/id_rsa.pub
real 0m0.779s
user 0m0.760s
sys 0m0.010s
I gave up executing a similar find command in my home directory at 36 seconds. :)
If nightly doesn't work, you could run the updatedb(8) program by hand once before running locate(1) queries. /etc/updatedb.conf (updatedb.conf(5)) lets you select specific directories or filesystem types to include or exclude.
Yes, assemble your command line.
Here's a way to process a list of files from stdin and assemble your (FreeBSD) find command to use extended regular expression matching (n1|n2|n3).
For GNU find you may have to use one of the following options to enable extended regular expression matching:
-regextype posix-egrep
-regextype posix-extended
echo '
foo\\.c
bar\\.h
quux\\.txt
' | xargs bash -c '
IFS="|";
find -E "$PWD" -type f -regex "^.*/($*)$" -print
echo find -E "$PWD" -type f -regex "^.*/($*)$" -print
' arg0
# note: "$*" uses the first character of the IFS variable as array item delimiter
(
IFS='|'
set -- 1 2 3 4 5
echo "$*" # 1|2|3|4|5
)

Resources