Recursively search for files - bash

I am trying to find all files under a given directory, including everything in its subdirectories, so the process is recursive. Here is my code:
myrecursive() {
if [ -f $1 ]; then
echo $1
elif [ -d $1 ]; then
for i in $(ls $1); do
if [ -f $1 ]; then
echo $i
else
myrecursive $i
fi
done
else
echo " sorry"
fi
}
myrecursive $1
However, when I pass a directory that contains another directory, I get "sorry" twice. Where is my mistake?

What you are trying to achieve can be done simply with the find command:
# will search for all files recursively in current directory
find . -type f -exec echo {} \;
# will search for all *.txt files recursively in current directory
find . -name "*.txt" -exec echo {} \;
# will search for all *.txt files recursively in current directory
# but depth is limited to 3
find . -maxdepth 3 -name "*.txt" -exec echo {} \;
See man find for the manual, and the related question "How to run find -exec?".

The problem with your code is quite simple.
The ls command returns a list of bare filenames without the directory prefix,
so they aren't valid paths for the recursive call. Use globbing instead. The
loop below simply replaces $(ls $1) with $1/*
myrecursive() {
if [ -f $1 ]; then
echo $1
elif [ -d $1 ]; then
for i in $1/*; do
if [ -f $i ]; then
echo $i
else
myrecursive $i
fi
done
else
echo " sorry"
fi
}
myrecursive $1
Hope that helps
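To see why the names returned by ls break the recursion, here is a minimal sketch. It assumes a throwaway tree created with mktemp, and the names top, sub and file.txt are made up for the demonstration. It tests the same entry twice, once by the bare name ls prints and once by the path the glob produces.

```bash
#!/usr/bin/env bash
# Throwaway tree: top/sub/file.txt (all names here are illustrative).
tmp=$(mktemp -d)
mkdir -p "$tmp/top/sub"
touch "$tmp/top/sub/file.txt"
start=$PWD
cd "$tmp" || exit 1

# ls prints bare names ("sub"), which are relative to top/, not to
# the current directory, so the -d test cannot find them.
for name in $(ls top); do
    if [ -d "$name" ]; then ls_result="found"; else ls_result="missing"; fi
done

# The glob keeps the directory prefix ("top/sub"), so the test works.
for path in top/*; do
    if [ -d "$path" ]; then glob_result="found"; else glob_result="missing"; fi
done

echo "ls: $ls_result, glob: $glob_result"
cd "$start" && rm -rf "$tmp"
```

In the original function the "missing" case is what falls through to the final else branch and prints "sorry" once per entry.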

#!/bin/bash
myrecursive() {
if [ -f "$1" ]; then
echo "$1"
elif [ -d "$1" ]; then
for i in "$1"/*; do
if [ -f "$i" ]; then #here now our file is $i
echo "$i"
else
myrecursive "$i"
fi
done
else
echo " sorry"
fi
}
myrecursive "$1"

Related

How to refactor a find | xargs one liner to a human readable code

I've written an OCR wrapper batch & service script for tesseract and abbyyocr11 found here: https://github.com/deajan/pmOCR
The main function is a find command that passes its results to xargs, using -print0 in order to deal with special filenames.
The find command became more and more complex and ended up as a VERY long one-liner that has become difficult to maintain:
find "$DIRECTORY_TO_PROCESS" -type f -iregex ".*\.$FILES_TO_PROCES" ! -name "$find_excludes" -print0 | xargs -0 -I {} bash -c 'export file="{}"; function proceed { eval "\"'"$OCR_ENGINE_EXEC"'\" '"$OCR_ENGINE_INPUT_ARG"' \"$file\" '"$OCR_ENGINE_ARGS"' '"$OCR_ENGINE_OUTPUT_ARG"' \"${file%.*}'"$FILENAME_ADDITION""$FILENAME_SUFFIX$FILE_EXTENSION"'\" && if [ '"$_BATCH_RUN"' -eq 1 ] && [ '"$_SILENT"' -ne 1 ];then echo \"Processed $file\"; fi && echo -e \"$(date) - Processed $file\" >> '"$LOG_FILE"' && if [ '"$DELETE_ORIGINAL"' == \"yes\" ]; then rm -f \"$file\"; fi"; }; if [ "'$CHECK_PDF'" == "yes" ]; then if ! pdffonts "$file" 2>&1 | grep "yes" > /dev/null; then proceed; else echo "$(date) - Skipping file $file already containing text." >> '"$LOG_FILE"'; fi; else proceed; fi'
Is there a nicer way to pass the find results to a human-readable function (without hurting speed too much)?
Thanks.
Don't use bash -c. You are already committed to starting a new bash process for each file from the find command, so just save the code to a file and run that with
find "$DIRECTORY_TO_PROCESS" -type f -iregex ".*\.$FILES_TO_PROCES" \
! -name "$find_excludes" -print0 |
xargs -0 -I {} bash script.bash {}
You can replace find altogether. It's easier in bash 4 (which I'll show here), but doable in bash 3.
proceed () {
...
}
shopt -s globstar
extensions=(pdf tif tiff jpg jpeg bmp pcx dcx)
for ext in "${extensions[@]}"; do
for file in /some/path/**/*."$ext"; do
[[ ! -f $file || $file = *_ocr.pdf ]] && continue
# Rest of script here
done
done
Prior to bash 4, you can write your own recursive function to descend through a directory hierarchy.
descend () {
for fd in "$1"/*; do
if [[ -d $fd ]]; then
descend "$fd"
elif [[ ! -f $fd || $fd != *."$ext" || $fd = *_ocr.pdf ]]; then
continue
else
# Rest of script here
fi
done
}
for ext in "${extensions[@]}"; do
descend /some/path "$ext"
done
OK, create the script, then run find.
#!/bin/bash
trap cleanup EXIT
cleanup() { rm "$script"; }
script=$(mktemp)
cat <<'END' > "$script"
########################################################################
file="$1"
function proceed {
"$OCR_ENGINE_EXEC" "$OCR_ENGINE_INPUT_ARG" "$file" "$OCR_ENGINE_ARGS" "$OCR_ENGINE_OUTPUT_ARG" "${file%.*}$FILENAME_ADDITION$FILENAME_SUFFIX$FILE_EXTENSION"
if [ "$_BATCH_RUN" -eq 1 ] && [ "$_SILENT" -ne 1 ]; then
echo "Processed $file"
fi
echo -e "$(date) - Processed $file" >> "$LOG_FILE"
if [ "$DELETE_ORIGINAL" == "yes" ]; then
rm -f "$file"
fi
}
if [ "$CHECK_PDF" == "yes" ]; then
if ! pdffonts "$file" 2>&1 | grep "yes" > /dev/null; then
proceed
else
echo "$(date) - Skipping file $file already containing text." >> "$LOG_FILE"
fi
else
proceed
fi
########################################################################
END
find "$DIRECTORY_TO_PROCESS" -type f \
-iregex ".*\.$FILES_TO_PROCES" \
! -name "$find_excludes" \
-exec bash "$script" '{}' \;
The 'END' of the heredoc is quoted, so the variables are not expanded until the script is actually executed.
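A small sketch of that difference, using a made-up variable called name: the same heredoc body is captured once with an unquoted delimiter and once with a quoted one.

```bash
#!/usr/bin/env bash
name="outer"   # illustrative variable, not part of the OCR script

# Unquoted delimiter: $name is expanded while the heredoc is read.
expanded=$(cat <<END
value is $name
END
)

# Quoted delimiter: the body is kept literally, so $name survives
# and would only be expanded when the generated script runs later.
literal=$(cat <<'END'
value is $name
END
)

printf '%s\n' "$expanded"
printf '%s\n' "$literal"
```

Keep in mind that the generated script runs in a separate bash process, so variables such as LOG_FILE are only visible to it if the calling script exports them.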
I ended up using a while loop fed by a find command through process substitution, i.e.:
while IFS= read -r -d $'\0' file; do
if ! lsof -f -- "$file" > /dev/null 2>&1; then
if [ "$_BATCH_RUN" == true ]; then
Logger "Preparing to process [$file]." "NOTICE"
fi
OCR "$file" "$fileExtension" "$ocrEngineArgs" "$csvHack"
else
if [ "$_BATCH_RUN" == true ]; then
Logger "Cannot process file [$file] currently in use." "ALWAYS"
else
Logger "Deferring file [$file] currently being written to." "ALWAYS"
kill -USR1 $SCRIPT_PID
fi
fi
done < <(find "$directoryToProcess" -type f -iregex ".*\.$FILES_TO_PROCES" ! -name "$findExcludes" -and ! -wholename "$moveSuccessExclude" -and ! -wholename "$moveFailureExclude" -and ! -name "$failedFindExcludes" -print0)
The while loop reads each file name produced by the find command into the file variable.
Using -d $'\0' with read and -print0 with find helps deal with special filenames.
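Here is a stripped-down sketch of the same pattern, assuming a temporary directory with three made-up file names, one of which contains a newline. It writes -d '' instead of -d $'\0'; in bash both give read an empty delimiter, which it treats as NUL.

```bash
#!/usr/bin/env bash
# Throwaway directory with awkward file names (names are illustrative).
tmp=$(mktemp -d)
touch "$tmp/plain.txt" "$tmp/with space.txt" "$tmp/with"$'\n'"newline.txt"

count=0
# Each NUL-terminated name from find lands in $file unmodified:
# IFS= keeps leading/trailing blanks, -r keeps backslashes.
while IFS= read -r -d '' file; do
    count=$((count + 1))
    printf 'got: %q\n' "$file"
done < <(find "$tmp" -type f -print0)

echo "files seen: $count"
rm -rf "$tmp"
```

Because the loop reads from a process substitution rather than a pipe, it runs in the current shell and count is still set after the loop ends.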

Recursively iterating through subdirectories and removing certain file

I have a music archive with lots of folders and sub-folders (Cover Art etc.), so instead of manually removing hundreds of Folder.jpg, Desktop.ini and Thumb.db files, I decided to write a simple bash script, but things got really messy.
I did a simple test by creating dummy folders like this:
/home/dummy/sub1 -
sub1sub1
sub1sub1sub1
sub1sub1sub2
sub2 -
sub2sub1
sub2sub2
sub2sub2sub1
and copied some random .jpg, .mp3, .ini files across these folders. My bash script looks currently like this:
function delete_jpg_ini_db {
if [[ $f == *.jpg ]]; then
echo ".jpg file, removing $f"
gvfs-trash $f
elif [[ $f == *.ini ]]; then
echo ".ini file, removing $f"
gvfs-trash -f $f
elif [[ $f == *.db ]]; then
echo ".db file, removing $f"
gvfs-trash -f $f
else echo "not any .jpg, .ini or .db file, skipping $f"
fi
}
function iterate_dir {
for d in *; do
if [ -d $d ]; then
echo "entering sub-directory: $d" && cd $d
pwd
for f in *; do
if [ -f $f ]; then #check if .jpg, .ini or .db, if so delete
delete_jpg_ini_db
elif [ -d $f ]; then #enter sub-dir and iterate again
if [ "$(ls -A $f)" ]; then
iterate_dir
else
echo "sub-directory $f is empty!"
fi
fi
done
fi
done
}
pwd
iterate_dir
When I run it, it successfully iterates through sub1, sub1sub1 and sub1sub1sub1, but it halts there instead of going back up and searching sub2 next.
I am new to Bash scripting, so all help is appreciated.
Thanks.
And in one command you can run:
find /home/dummy/sub1 \( -name "*.jpg" -o -name "*.ini" -o -name "*.db" \) -delete
And if you want to see which files would be deleted, replace -delete with -print (just filenames) or with -ls (like ls -l output).
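One thing to watch with several -name tests joined by -o: the implicit "and" between a test and the action that follows it binds tighter than -o, so an action at the end only applies to the last test unless the tests are grouped with \( ... \). A quick sketch with -print on a throwaway directory (the file names are made up) shows the difference:

```bash
#!/usr/bin/env bash
# Throwaway directory with one file per pattern plus one to keep.
tmp=$(mktemp -d)
touch "$tmp/a.jpg" "$tmp/b.ini" "$tmp/c.db" "$tmp/keep.mp3"

# Without grouping, -print is attached only to the last -name test.
ungrouped=$(find "$tmp" -name "*.jpg" -o -name "*.ini" -o -name "*.db" -print | wc -l | tr -d ' ')

# With \( ... \), -print applies to a match of any of the three tests.
grouped=$(find "$tmp" \( -name "*.jpg" -o -name "*.ini" -o -name "*.db" \) -print | wc -l | tr -d ' ')

echo "ungrouped: $ungrouped, grouped: $grouped"
rm -rf "$tmp"
```

The same applies to -delete, so it is worth doing a dry run with -print before deleting anything.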
Here is the changed code:
function delete_jpg_ini_db {
if [[ $f == *.jpg ]]; then
echo ".jpg file, removing $f"
gvfs-trash "$f"
elif [[ $f == *.ini ]]; then
echo ".ini file, removing $f"
gvfs-trash -f "$f"
elif [[ $f == *.db ]]; then
echo ".db file, removing $f"
gvfs-trash -f "$f"
else echo "not any .jpg, .ini or .db file, skipping $f"
fi
}
function iterate_dir {
for d in *; do
if [ -d "$d" ]; then
echo "entering sub-directory: $d" && cd "$d"
pwd
for f in *; do
if [ -f "$f" ]; then #check if .jpg, .ini or .db, if so delete
delete_jpg_ini_db
elif [ -d "$f" ]; then #enter sub-dir and iterate again
if [ "$(ls -A "$f")" ]; then
iterate_dir
else
echo "sub-directory $f is empty!"
fi
fi
done
cd ..
fi
done
}
pwd
iterate_dir
Mistakes
You did not have support for file names with spaces in them.
You did not navigate back (cd ..) after your inner for loop.
Try it...
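If you would rather not cd at all, a different approach is to pass full paths down the recursion and keep the loop variable local. The following is only a sketch under my own assumptions: clean_tree is a made-up name, and it just prints what it would remove instead of calling gvfs-trash, so you can check the output first.

```bash
#!/usr/bin/env bash
# clean_tree DIR: walk DIR recursively and report unwanted files.
# Hypothetical helper; it only prints, it does not remove anything.
clean_tree() {
    local entry                      # local, so recursion cannot clobber it
    for entry in "$1"/*; do
        if [ -d "$entry" ]; then
            clean_tree "$entry"      # descend using the full path, no cd
        elif [ -f "$entry" ]; then
            case "$entry" in
                *.jpg|*.ini|*.db) echo "would remove: $entry" ;;
            esac
        fi
    done
}

# Try it on a throwaway tree (names are illustrative).
tmp=$(mktemp -d)
mkdir -p "$tmp/sub1/sub1sub1" "$tmp/sub2"
touch "$tmp/sub1/Folder.jpg" "$tmp/sub1/sub1sub1/Desktop.ini" \
      "$tmp/sub2/song.mp3" "$tmp/sub2/Thumbs.db"
clean_tree "$tmp"
matches=$(clean_tree "$tmp" | wc -l | tr -d ' ')
rm -rf "$tmp"
```

Since the working directory never changes, there is nothing to navigate back from, and an empty directory is simply skipped because the unmatched glob is neither a file nor a directory.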

Recursively count directories and files with a shell script

I'm trying to write a shell script that will recursively count all the files and sub-directories in a directory, including all the hidden ones. My script can count them, but it can't detect hidden files and directories that are inside a sub-directory. How can I change it so that it is able to do this? Also, I cannot use find, du or ls -R.
#!/bin/bash
cd $1
dir=0
hiddendir=0
hiddenfiles=0
x=0
items=( $(ls -A) )
amount=( $(ls -1A | wc -l) )
counter() {
if [ -d "$i" ]; then
let dir+=1
if [[ "$i" == .* ]]; then
let hiddendir+=1
let dir-=1
fi
search "$i"
elif [ -f "$i" ]; then
let files+=1
if [[ "$i" == .* ]]; then
let files-=1
let hiddenfiles+=1
fi
fi
}
search() {
for i in $1/*; do
counter "$i"
done
}
while [ $x -lt $amount ]; do
i=${items[$x]}
counter "$i"
let x+=1
done
#!/bin/bash -e
shopt -s globstar dotglob # now ** lists all entries recursively
cd "$1"
dir=0 files=0 hiddendir=0 hiddenfiles=0
counter() {
if [ -f "$1" ]; then local typ=files
elif [ -d "$1" ]; then local typ=dir
else return 0 # neither file nor directory: skip it without tripping -e
fi
[[ "$(basename "$1")" == .* ]] && local hid=hidden || local hid=""
((++$hid$typ))
}
for i in **; do
counter "$i"
done
echo $dir $files $hiddendir $hiddenfiles
Consider using this:
find . | wc -l

Bash script: Copies image files from source dir to destination dir and adds an extra suffix on files with same file name

I have this script that copies image files from source directory to destination directory. There are some image files in the source directory that have the same name but different file size. This script also compares the two files with the same name using a stat command. Now, I want to add a string suffix e.g. IMG0897.DUP.JPG before the file extension to the files with the same file name that are going to be copied over to the destination folder. At the moment, my script adds the file size of the file to the file name.
I need help on how to add a string of text of my own rather than the size of the file.
Here's my script:
#!/bin/sh
SEARCH="IMG_*.JPG"
SOURCE= $1
DEST=$2
test $# -ne 2 && echo Usage : phar image_path archive_path
if [ ! -e $1 ]
then echo Source folder does not exist
fi
if [ ! -e $2 ]
then mkdir $2/
fi
# Execute the script.
if [ "${SEARCH%% *}" = "$SEARCH" ]; then
command="find \"$1\" -name \"$SEARCH\""
else
command="find \"$1\" -name \"${SEARCH%% *}\""$(for i in ${SEARCH#* }; do echo -n " -o -name \"$i\""; done)
fi
# Run the main loop.
eval "$command" | while read file; do
bn=$(basename "$file")
bc=$(stat -c%s "$file")
if [ -f "${2}/$bn" ] && [ "$bc" -ne $(stat -c%s "${2}/$bn") ]; then
bn="$bn.$bc"
fi
if [ -f "${2}/$bn" ]; then
echo "File ${2}/$bn already exists."
else
echo "Copying $file to $2/$bn"
cp -a "$file" "$2/$bn"
fi
done
exit 0
else
echo "Error : Can't find $1 or $2"
exit 1
fi
I modified your script slightly.
#!/bin/sh
SEARCH="IMG_*.JPG"
SOURCE=$1
DEST=$2
SUFFIX=DUP
test $# -ne 2 && echo Usage : phar image_path archive_path
if [ ! -e $1 ]
then echo Source folder does not exist
fi
if [ ! -e $2 ]
then mkdir $2/
fi
# Execute the script.
if [ "${SEARCH%% *}" = "$SEARCH" ]; then
command="find \"$1\" -name \"$SEARCH\""
else
command="find \"$1\" -name \"${SEARCH%% *}\""$(for i in ${SEARCH#* }; do echo -n " -o -name \"$i\""; done)
fi
# Run the main loop.
eval "$command" | while read file; do
bn=$(basename "$file")
bc=$(stat -c%s "$file")
if [ -f "${2}/$bn" ] && [ "$bc" -ne $(stat -c%s "${2}/$bn") ]; then
bc=$(echo ${bn}|cut -d. -f2)
bn=$(echo ${bn}|cut -d. -f1)
bn=$bn.$SUFFIX.$bc
fi
if [ -f "${2}/$bn" ]; then
echo "File ${2}/$bn already exists."
else
echo "Copying $file to $2/$bn"
cp -a "$file" "$2/$bn"
fi
done
exit 0
My execution result is:
root@precise32:/vagrant# sh JPG_moves.sh /root/dir1/ /root/destination/
Copying /root/dir1/IMG_0897.JPG to /root/destination//IMG_0897.JPG
root@precise32:/vagrant# sh JPG_moves.sh /root/dir2/ /root/destination/
Copying /root/dir2/IMG_0897.JPG to /root/destination//IMG_0897.DUP.JPG
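Note that splitting on the first dot with cut only works while the name contains exactly one dot. As an alternative sketch (add_suffix is a made-up helper, not part of the script above), parameter expansion can split on the last dot instead:

```bash
#!/bin/sh
# add_suffix NAME SUFFIX: insert SUFFIX before the final extension.
# Hypothetical helper for illustration only.
add_suffix() {
    name=$1
    suffix=$2
    case "$name" in
        # ${name%.*} drops the last ".ext", ${name##*.} keeps only "ext".
        *.*) printf '%s.%s.%s\n' "${name%.*}" "$suffix" "${name##*.}" ;;
        # No extension at all: just append the suffix.
        *)   printf '%s.%s\n' "$name" "$suffix" ;;
    esac
}

add_suffix IMG_0897.JPG DUP
add_suffix my.holiday.JPG DUP
```

In the script this could stand in for the two cut calls, e.g. bn=$(add_suffix "$bn" "$SUFFIX").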

loop over directories to echo its content

The directories are variables set to the full-path
for e in "$DIR_0" "$DIR_1" "$DIR_2"
do
for i in "$e"/*
do
echo "$i"
done
done
The output for each line is the full path. I want only the name of each file
You are looking for basename.
This is the Bash equivalent of basename:
echo "${i##*/}"
It strips off everything before and including the last slash.
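A short sketch of that expansion next to its counterpart for the directory part, using a made-up path:

```bash
#!/bin/sh
path="/var/log/app/server.log"   # illustrative path only

name=${path##*/}   # remove the longest prefix matching "*/"
dir=${path%/*}     # remove the shortest suffix matching "/*"

echo "$name"
echo "$dir"
```

Both are plain parameter expansions, so no external process is started for each file the way it is with basename.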
If you truly do not wish to recurse, you can achieve that more succinctly with this find command:
find "$DIR_0" "$DIR_1" "$DIR_2" -maxdepth 1 -type f -exec basename {} \;
If you wish to recurse into subdirectories, simply leave out -maxdepth:
find "$DIR_0" "$DIR_1" "$DIR_2" -type f -exec basename {} \;
To traverse a directory recursively with bash, try this:
#! /bin/bash
indent_print()
{
for((i=0; i < $1; i++)); do
echo -ne "\t"
done
echo "$2"
}
walk_tree()
{
local oldifs bn lev pr pmat
if [[ $# -lt 3 ]]; then
if [[ $# -lt 2 ]]; then
pmat=".*"
else
pmat="$2"
fi
walk_tree "$1" "$pmat" 0
return
fi
lev=$3
[ -d "$1" ] || return
oldifs=$IFS
IFS=""
for el in "$1"/*; do
bn=$(basename "$el")
if [[ -d "$el" ]]; then
indent_print $lev "$bn/"
pr=$( walk_tree "$el" "$2" $(( lev + 1)) )
echo "$pr"
else
if [[ "$bn" =~ $2 ]]; then
indent_print $lev "$bn"
fi
fi
done
IFS=$oldifs
}
walk_tree "$1" "\.sh$"
See also the POSIX compliant Bash functions to replace basename & dirname here:
http://cfaj.freeshell.org/src/scripts/
