How to find all file paths in a directory with bash - bash

I have a script that looks like this:
function main() {
for source in "$#"; do
sort_imports "${source}"
done
}
main "$#"
Right now if I pass in a file ./myFile.m the script works as expected.
I want to change it so that I can pass in ./myClassPackage and have it find all the files and call sort_imports on each of them.
I tried:
for source in $(find "$#"); do
sort_imports "${source}"
done
but when I call it I get an error that I'm passing it a directory.

Using the output of a command substitution for a for loop has pitfalls due to word splitting. A truly rock-solid solution will use null-byte delimiters to properly handle even files with newlines in their names (which is not common, but valid).
Assuming you only want regular files (and not directories), try this:
while IFS= read -r -d '' source; do
sort_imports "$source"
done < <(find "$#" -type f -print0)
The -print0 option causes find to separate entries with null bytes, and the -d '' option for read allows these to be used as record separators.
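Plugged into your script, that could look like this (a sketch; it assumes sort_imports is defined elsewhere in the script, as in your original):
function main() {
    while IFS= read -r -d '' source; do
        sort_imports "${source}"
    done < <(find "$@" -type f -print0)
}
main "$@"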

You should use find with -exec:
find "$#" -type f -exec sort_imports "{}" \;
For more information see https://www.everythingcli.org/find-exec-vs-find-xargs/
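If sort_imports can accept several file names per invocation (an assumption, since the question doesn't say), you can also let find pass the files in batches instead of starting one process per file:
find "$@" -type f -exec sort_imports {} +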

If you don't want find to enumerate directories, then exclude them:
for source in $(find "$#" -not -type d); do
sort_imports "${source}"
done

Related

Iterating over directory and replacing '+' from a file name to '_' in bash

I have a directory named assets, which in turn has a set of directories.
user_images
|-content_images
|-original
|-cropped
|-resize
|-gallery_images
|-slider_images
|-logo
These folders can contain folders like original, cropped, and resize, and those folders in turn contain images. The images are named something like this: 14562345+Image.jpeg. I need to rename all the images/files that have a + so that it becomes a _.
for f in $(ls -a);
do
if [[ $f == *+* ]]
then
cp "$f" "${f//+/_}"
fi
done
I was able to do this in the current directory, but I need to iterate over many other directories as well. How can I do that?
You can use this loop using find in a process substitution:
cd user_images
while IFS= read -r -d '' f; do
echo "$f"
mv "$f" "${f//+/_}"
done < <(find . -name '*+*' -type f -print0)
With find -exec:
find user_images -type f \
-exec bash -c '[[ $0 == *+* ]] && mv "$0" "${0//+/_}"' {} \;
Notice that this uses mv and not cp, since the question asks for a rename; simply replace it with cp if you want to keep the original files.
The bash -c is required to be able to manipulate the file names, otherwise we could use {} directly in the -exec action.
The following will run recursively and will rename all files replacing + with a _ :
find . -name '*+*' -type f -execdir bash -c 'for f; do mv "$f" "${f//+/_}"; done' _ {} +
Notice the use of -execdir:
Like -exec, but the specified command is run from the subdirectory containing the matched file, which is not normally the directory in which you started find -- Quoted from man find.
This protects us in case a directory name also matches the pattern *+*, since you do not want to rename directories.
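As an illustration, suppose a directory name itself contained a + (a hypothetical layout, not from the question):
# user_images/gallery+images/14562345+Image.jpeg
# With -execdir the command runs inside gallery+images/ and only sees ./14562345+Image.jpeg,
# so ${f//+/_} renames the file while the directory name is left untouched.
find user_images -name '*+*' -type f -execdir bash -c 'for f; do mv "$f" "${f//+/_}"; done' _ {} +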

Execute bash function from find command

I have defined a function in bash which checks if two files exist, compares whether they are equal, and deletes one of them.
function remodup {
F=$1
G=${F/.mod/}
if [ -f "$F" ] && [ -f "$G" ]
then
cmp --silent "$F" "$G" && rm "$F" || echo "$G was modified"
fi
}
Then I want to call this function from a find command:
find $DIR -name "*.mod" -type f -exec remodup {} \;
I have also tried the | xargs syntax. Both find and xargs tell me that remodup does not exist.
I can move the function into a separate bash script and call the script, but I don't want to copy that function into a directory on my PATH (yet), so I would either need to call the function script with an absolute path or always call the calling script from the same location.
(I probably can use fdupes for this particular task, but I would like to find a way to either
call a function from the find command;
call one script from a relative path of another script; or
use ${F/.mod/} syntax (or other bash variable manipulation) on files found with a find command.)
You need to export the function first using:
export -f remodup
then use it as:
find $DIR -name "*.mod" -type f -exec bash -c 'remodup "$1"' - {} \;
You could manually loop over find's results.
while IFS= read -rd $'\0' file; do
remodup "$file"
done < <(find "$dir" -name "*.mod" -type f -print0)
-print0 and -d $'\0' use NUL as the delimiter, allowing for newlines in the file names. IFS= ensures spaces at the beginning of file names aren't stripped. -r disables backslash escapes. The sum total of all of these options is to allow as many special characters as possible in file names without mangling.
Given that you aren't using many features of find, you can use a pure bash solution instead to iterate over the desired files.
shopt -s globstar nullglob
for fname in ./"$DIR"/**/*.mod; do
[[ -f $fname ]] || continue
remodup "$fname"
done
To throw in a third option:
find "$dir" -name "*.mod" -type f \
-exec bash -s -c "$(declare -f remodup)"$'\n'' for arg; do remodup "$arg"; done' _ {} +
This passes the function through the argv, as opposed to through the environment, and (by virtue of using {} + rather than {} ;) uses as few shell instances as possible.
I would use John Kugelman's answer as my first choice, and this as my second.

count number of lines for each file found

I think that I don't understand very well how the find command in Unix works; I have this code for counting the number of files in each folder, but I want to count the number of lines of each file found and save the total in a variable.
find "$d_path" -type d -maxdepth 1 -name R -print0 | while IFS= read -r -d '' file; do
nb_fichier_R="$(find "$file" -type f -maxdepth 1 -iname '*.R' | wc -l)"
nb_ligne_fichier_R= "$(find "$file" -type f -maxdepth 1 -iname '*.R' -exec wc -l {} +)"
echo "$nb_ligne_fichier_R"
done
output:
43 .//system d exploi/r-repos/gbm/R/basehaz.gbm.R
90 .//system d exploi/r-repos/gbm/R/calibrate.plot.R
45 .//system d exploi/r-repos/gbm/R/checks.R
178 total: File name too long
Can I just save the total number of lines in my variable? Here in my example that would be 178, and I want that for each folder found in "$d_path".
Many Thanks
Maybe I'm missing something, but wouldn't this do what you want?
wc -l R/*.[Rr]
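And if you specifically want the grand total in a variable rather than per-file counts, one way (a sketch assuming the R directory sits directly under "$d_path", as in your find command) is to concatenate the files and count once:
total=$(find "$d_path/R" -maxdepth 1 -type f -iname '*.R' -exec cat {} + | wc -l)
echo "$total"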
Solution:
find "$d_path" -type d -maxdepth 1 -name R | while IFS= read -r file; do
nb_fichier_R="$(find "$file" -type f -maxdepth 1 -iname '*.R' | wc -l)"
echo "$nb_fichier_R" #here is fine
find "$file" -type f -maxdepth 1 -iname '*.R' | while IFS= read -r fille; do
wc -l "$fille" # here is the problem, nothing shown
done
done
Explanation:
Because the first find used -print0, it produced no newlines, so you had to use -d '' to tell read not to look for a newline. Your subsequent finds output newlines, so you can use read without changing the delimiter. I removed -print0 and -d '' from all calls so it is consistent and idiomatic. Newlines are good in the Unix world.
For the command:
find "$d_path" -type d -maxdepth 1 -name R -print0
there can be at most one directory that matches ("$d_path/R"). For that one directory, you want to print:
The number of files matching *.R
For each such file, the number of lines in it.
Allowing for spaces in $d_path and in the file names is most easily handled, I find, with an auxiliary shell script. The auxiliary script processes the directories named on its command line. You then invoke that script from the main find command.
counter.sh
shopt -s nullglob;
for dir in "$#"
do
count=0
for file in "$dir"/*.R; do ((count++)); done
echo "$count"
wc -l "$dir"/*.R </dev/null
done
The shopt -s nullglob option means that if there are no .R files (with names that don't start with a .), then the glob expands to nothing rather than expanding to a string containing *.R at the end. It is convenient in this script. The I/O redirection on wc ensures that if there are no files, it reads from /dev/null, reporting 0 lines (rather than sitting around waiting for you to type something).
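You can see the nullglob effect directly (using a hypothetical empty_dir that contains no .R files):
shopt -u nullglob; echo empty_dir/*.R   # prints the literal pattern empty_dir/*.R
shopt -s nullglob; echo empty_dir/*.R   # prints an empty line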
On the other hand, the find command will find names that start with a . as well as those that do not, whereas the globbing notation will not. The easiest way around that is to use two globs:
for file in "$dir"/*.R "$dir"/.*.R; do ((count++)); done
or use find (rather carefully):
find . -type f -name '*.R' -exec sh -c 'echo $#' arg0 {} +
Using counter.sh
find "$d_path" -type d -maxdepth 1 -name R -exec sh ./counter.sh {} +
This script allows for the possibility of more than one sub-directory (if you remove -maxdepth 1) and invokes counter.sh with all the directories to be examined as arguments. The script itself carefully handles file names so that whether there are spaces, tabs or newlines (or any other character) in the names, it will work correctly. The sh ./counter.sh part of the find command assumes that the counter.sh script is in the current directory. If it can be found on $PATH, then you can drop the sh and the ./.
Discussion
The technique of having find execute a command with the list of file name arguments is powerful. It avoids issues with -print0 and using xargs -0, but gives you the same reliable handling of arbitrary file names, including names with spaces, tabs and newlines. If there isn't already a command that does what you need (but you could write one as a shell script), then do so and use it. If you might need to do the job more than once, you can keep the script. If you're sure you won't, you can delete it after you're done with it. It is generally much easier to handle files with awkward names like this than it is to fiddle with $IFS.
Consider this solution:
# If `"$dir"/*.R` doesn't match anything, yield nothing instead of giving the pattern.
shopt -s nullglob
# Allows matching both `*.r` and `*.R` in one expression. Using them separately would
# give double results.
shopt -s nocaseglob
while IFS= read -ru 4 -d '' dir; do
files=("$dir"/*.R)
echo "${#files[#]}"
for file in "${files[#]}"; do
wc -l "$file"
done
# Use process substitution to prevent going to a subshell. This may not be
# necessary for now but it could be useful to future modifications.
# Let's also use a custom fd to keep troubles isolated.
# It works with `-u 4`.
done 4< <(exec find "$d_path" -type d -maxdepth 1 -name R -print0)
Another form is to use readarray, which collects all found directories at once. The only caveat is that it can only read normal newline-terminated paths.
shopt -s nullglob
shopt -s nocaseglob
readarray -t dirs < <(exec find "$d_path" -type d -maxdepth 1 -name R)
for dir in "${dirs[#]}"; do
files=("$dir"/*.R)
echo "${#files[#]}"
for file in "${files[#]}"; do
wc -l "$file"
done
done

Bash. When I find a file using files=`find...`, then use a for loop "for file in $files". How do I access the path to the found file?

files=`find C:/PATH/TO/DIRECTORY -name *.txt`
for file in $files; do
#need code to rename $file, by moving it into the same directory
done
E.g. $file was found in C:/PATH/TO/DIRECTORY/2014-05-08.
How do I rename $file back into that directory and not into C:/PATH/TO/DIRECTORY?
You can use -execdir option in find:
find C:/PATH/TO/DIRECTORY -name '*.txt' -execdir mv '{}' '{}'-new \;
As per man find:
-execdir utility [argument ...] ;
The -execdir primary is identical to the -exec primary with the exception that utility will be executed from the directory that holds the current file.
You would be better served by using this structure:
while read fname
do
....
done < <(find ...)
Or, if you're not using bash:
find ... | while read fname
do
....
done
The problem with storing the output of find in a variable, or even doing for fname in $(find ...), is with word splitting on whitespace. The above structures still fail if you have a file name with a newline in it, since they assume that you have one file name per line, but they're better than what you have now.
An even better solution would be something like this:
find ... -print0 | xargs -0 -n1 -Ixxx somescript.sh "xxx"
But even that might have issues if filenames have quotes or other things in them.
The bottom line is that parsing arbitrary data (which filenames can be) is hard...
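For completeness, the while read structure above can also be fed null-delimited names, which survives even newlines in file names (bash-only, a sketch):
while IFS= read -r -d '' fname
do
    # rename "$fname" here; dirname "$fname" gives the directory it was found in
    echo "$fname"
done < <(find C:/PATH/TO/DIRECTORY -name '*.txt' -print0)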
find C:/PATH/TO/DIRECTORY -name \*.txt | while read file
do
dir=$(dirname "$file")
base=$(basename "$file")
mv "$file" "$dir/new_file_name"
done

Apply a script to subdirectories

I have read many times that if I want to execute something over all subdirectories I should run something like one of these:
find . -name '*' -exec command arguments {} \;
find . -type f -print0 | xargs -0 command arguments
find . -type f | xargs -I {} command arguments {} arguments
The problem is that it works well with core utilities, but not as expected when the command is a user-defined function or a script. How can I fix that?
So what I am looking for is a line of code or a script in which I can replace command for myfunction or myscript.sh and it goes to every single subdirectory from current directory and executes such function or script there, with whatever arguments I supply.
To put it another way, I want something that works over all subdirectories as nicely as for file in *; do command_myfunction_or_script.sh arguments $file; done works over the current directory.
Instead of -exec, try -execdir.
It may be that in some cases you need to use bash:
foo () { echo $1; }
export -f foo
find . -type f -name '*.txt' -exec bash -c 'foo arg arg' \;
The last line could be:
find . -type f -name '*.txt' -exec bash -c 'foo "$#"' _ arg arg \;
Depending on what args might need expanding and when. The underscore represents $0.
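You can see how the underscore ends up in $0 with a quick test:
bash -c 'echo "$0 / $1 / $2"' _ arg1 arg2   # prints: _ / arg1 / arg2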
You could use -execdir where I have -exec if that's needed.
The examples that you give, such as:
find . -name '*' -exec command arguments {} \;
do not go to every single subdirectory from the current directory and execute command there, but rather execute command from the current directory with the path to each file listed by find as an argument.
If what you want is to actually change directory and execute a script, you could try something like this:
STDIR=$PWD; IFS=$'\n'; for dir in $(find . -type d); do cd $dir; /path/to/command; cd $STDIR; done; unset IFS
Here the current directory is saved to STDIR and the bash Internal Field Separator is set to a newline so names won't split on spaces. Then for each directory (-type d) that find returns, we cd to that directory, execute the command (using the full path here as changing directories will break a relative path) and then cd back to the starting directory.
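A variation on the same idea runs the cd in a subshell, so there is no need to cd back or to change IFS at all (still assuming directory names without newlines):
find . -type d | while IFS= read -r dir
do
    ( cd "$dir" && /path/to/command )
done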
There may be some way to use find with a function, but it won't be terribly elegant. If you have bash 4, what you probably want to do is use globstar:
shopt -s globstar
for file in **/*; do
myfunction "$file"
done
If you're looking for compatibility with POSIX or older versions of bash, you will be forced to source the file defining your function when you invoke bash. So something like this:
find <args> -exec bash -c '. funcfile;
for file; do
myfunction "$file"
done' _ {} +
But that's just ugly. When I get to this point, I usually just put my function in a script on my PATH and live with it.
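For instance, such a script can be as small as this (the file name ~/bin/myfunction and the echo body are just placeholders):
#!/usr/bin/env bash
# ~/bin/myfunction -- wrapper so that find can call the function directly
myfunction() {
    echo "processing $1"    # replace with the real function body
}
for file in "$@"; do
    myfunction "$file"
done
After chmod +x, and assuming ~/bin is on your PATH, you can then run: find . -type f -name '*.txt' -exec myfunction {} +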
If you want to use a bash function, this is one way.
work ()
{
local file="$1"
local dir=$(dirname "$file")
pushd "$dir"
echo "in directory $(pwd) working with file $(basename "$file")"
popd
}
find . -name '*' | while read line;
do
work "$line"
done
