Passing parameters to find in bash script

I use this bash command often
find ~ -type f -name \*.smt -exec grep something {} /dev/null \;
so I am trying to turn it into a simple bash script that I would invoke like this
findgrep ~ something --mtime -12 --name \*.smt
Thanks to this answer I managed to make it work like this:
if ! options=$(getopt -o abc: -l name:,blong,mtime: -- "$@")
then
exit 1
fi
eval "set -- $options"
while [ $# -gt 0 ]
do
case $1 in
-t|--mtime) mtime=${2} ; shift;;
-n|--name|--iname) name="$2" ; shift;;
(--) shift; break;;
(-*) echo "$0: error - unrecognized option $1" 1>&2; exit 1;;
(*) break;;
esac
shift
done
if [ $# -eq 2 ]
then
dir="$1"
str="$2"
elif [ $# -eq 1 ]
then
dir="."
str="$1"
else
echo "Need a search string"
exit
fi
echo "find $dir -type f -mtime $mtime -name $name -exec grep -iln \"$str\" {} /dev/null \;"
echo "find $dir -type f -mtime $mtime -name $name -exec grep -iln \"$str\" {} /dev/null \;" | bash
but the last line - echoing a command into bash - seems outright barbaric, even though it works.
Is there a better way to do that? Somehow, trying to execute the find command directly gives no output, while running the one echoed out through bash works OK.

-name $name
It's still not quoted. Check your script with shellcheck.
find "$dir" -type f -mtype "$mtime" -name "$name" -exec grep -iln "$str" {} ';'
You might want to take a few steps back and do some research about quoting and expansions in the shell, find, and globs. The find program expects a literal glob pattern, but an unquoted variable expansion undergoes word splitting and filename expansion, turning *.smt into a list of matching filenames; find wants the pattern itself, not the result of expanding it (see the short example below).
I can suggest: man find, man 7 glob, https://www.gnu.org/software/bash/manual/html_node/Quoting.html https://mywiki.wooledge.org/BashFAQ/050
https://mywiki.wooledge.org/BashGuide/Parameters#Parameter_Expansion
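For instance, a minimal illustration of the difference, assuming the current directory contains a.smt and b.smt:
name='*.smt'
echo find . -name $name      # unquoted: the shell expands the glob -> find . -name a.smt b.smt
echo find . -name "$name"    # quoted: find would get the literal pattern -> find . -name *.smt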
Before you start deciding how to pass a variable number of arguments to find, I encourage you to research Bash arrays. I would do:
#!/bin/bash
fatal() {
echo "$0: ERROR: $*" >&2
exit 1
}
args=$(getopt -o abc: -l name:,iname:,mtime: -- "$@") || exit 1
eval "set -- $args"
findargs=() # bash array
while (($#)); do
case $1 in
-t|--mtime) findargs+=(-mtime "$2"); shift; ;;
-n|--name) findargs+=(-name "$2"); shift; ;;
--iname) findargs+=(-iname "$2"); shift; ;;
--) shift; break; ;;
-*) fatal "unrecognized option $1"; ;;
*) break; ;;
esac
shift
done
if (($# == 2)); then
dir="$1"
str="$2"
elif (($# == 1)); then
dir="."
str="$1"
else
fatal "Need a search string"
fi
set -x
find "$dir" -type f "${findargs[#]}" -exec grep -iln "$str" /dev/null {} +

Related

Processing globs in getopt

I use this bash command often
find ~ -type f -name \*.smt -exec grep something {} /dev/null \;
so I am trying to turn it into a simple bash script that I would invoke like this
findgrep ~ something --mtime -12 --name \*.smt
However I get stuck with processing the command line options with GNU getopt like this:
if ! options=$(getopt -o abc: -l name:,blong,mtime: -- "$@")
then
# something went wrong, getopt will put out an error message for us
exit 1
fi
set -- $options
while [ $# -gt 0 ]
do
case $1 in
-t|--mtime) mtime=${2} ; shift;;
-n|--name|--iname) name="$2" ; shift;;
(--) shift; break;;
(-*) echo "$0: error - unrecognized option $1" 1>&2; exit 1;;
(*) break;;
esac
shift
done
echo "done"
echo $#
if [ $# -eq 2 ]
then
echo "2 args"
dir="$1"
str="$1"
elif [ $# -eq 1 ]
then
dir="."
str="$1"
echo "1 arg"
else
echo "need a search string"
fi
echo $dir
echo $str
echo $mtime
echo "${mtime%\'}"
echo "${mtime%\'}"
echo '--------------------'
mtime="${mtime%\'}"
mtime="${mtime#\'}"
dir="${dir%\'}"
dir="${dir#\'}"
echo $dir $mtime $name
# grep part not in yet
find $dir -type f -mtime $mtime -name $name
which does not seem to work - I suspect because the $name variable gets passed in quotes to find.
How do I fix that?
set -- $options
That is invalid (and it's not quoted). It should be eval "set -- $options". The Linux (util-linux) getopt outputs a properly quoted string that is meant to be eval-ed.
mtime="${mtime%\'}"
mtime="${mtime#\'}"
dir="${dir%\'}"
dir="${dir#\'}"
Remove it. That's not how expansions work.
-name $name
It's not quoted. You have to quote it upon use.
-name "$name"
Check your scripts with shellcheck.
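Putting those points together, the relevant lines would look something like this (a sketch, not a full script; the quote-stripping block is simply dropped):
eval "set -- $options"        # getopt's output is quoted for eval, not for a plain set --
# ... option parsing loop unchanged; no ${mtime%\'} or ${dir#\'} stripping needed ...
find "$dir" -type f -mtime "$mtime" -name "$name"    # quote every expansion at the point of use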

How to refactor a find | xargs one liner to a human readable code

I've written an OCR wrapper batch & service script for tesseract and abbyyocr11 found here: https://github.com/deajan/pmOCR
The main function is a find command that passes its results to xargs with -print0 in order to deal with special filenames.
The find command became more and more complex and ended up as a VERY long one-liner that is difficult to maintain:
find "$DIRECTORY_TO_PROCESS" -type f -iregex ".*\.$FILES_TO_PROCES" ! -name "$find_excludes" -print0 | xargs -0 -I {} bash -c 'export file="{}"; function proceed { eval "\"'"$OCR_ENGINE_EXEC"'\" '"$OCR_ENGINE_INPUT_ARG"' \"$file\" '"$OCR_ENGINE_ARGS"' '"$OCR_ENGINE_OUTPUT_ARG"' \"${file%.*}'"$FILENAME_ADDITION""$FILENAME_SUFFIX$FILE_EXTENSION"'\" && if [ '"$_BATCH_RUN"' -eq 1 ] && [ '"$_SILENT"' -ne 1 ];then echo \"Processed $file\"; fi && echo -e \"$(date) - Processed $file\" >> '"$LOG_FILE"' && if [ '"$DELETE_ORIGINAL"' == \"yes\" ]; then rm -f \"$file\"; fi"; }; if [ "'$CHECK_PDF'" == "yes" ]; then if ! pdffonts "$file" 2>&1 | grep "yes" > /dev/null; then proceed; else echo "$(date) - Skipping file $file already containing text." >> '"$LOG_FILE"'; fi; else proceed; fi'
Is there a nicer way to pass the find results to a human readable function (without impacting too much speed) ?
Thanks.
Don't use bash -c. You are already committed to starting a new bash process for each file from the find command, so just save the code to a file and run that with
find "$DIRECTORY_TO_PROCESS" -type f -iregex ".*\.$FILES_TO_PROCES" \
! -name "$find_excludes" -print0 |
xargs -0 -I {} bash script.bash {}
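Inside script.bash, the file name then arrives as the first positional parameter; a minimal skeleton might look like this (the echo is only a placeholder for the real OCR work):
#!/bin/bash
file="$1"                      # xargs passes one file name per invocation
echo "would process: $file"    # placeholder for the proceed logic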
You can replace find altogether. It's easier in bash 4 (which I'll show here), but doable in bash 3.
proceed () {
...
}
shopt -s globstar
extensions=(pdf tif tiff jpg jpeg bmp pcx dcx)
for ext in "${extensions[@]}"; do
for file in /some/path/**/*."$ext"; do
[[ ! -f $file || $file = *_ocr.pdf ]] && continue
# Rest of script here
done
done
Prior to bash 4, you can write your own recursive function to descend through a directory hierarchy.
descend () {
for fd in "$1"/*; do
if [[ -d $fd ]]; then
descend "$fd"
elif [[ ! -f $fd || $fd != *."$ext" || $fd = *_ocr.pdf ]]; then
continue
else
# Rest of script here
fi
done
}
for ext in "${extensions[@]}"; do
descend /some/path "$ext"
done
OK, create the script, then run find.
#!/bin/bash
trap cleanup EXIT
cleanup() { rm "$script"; }
script=$(mktemp)
cat <<'END' > "$script"
########################################################################
file="$1"
function proceed {
"$OCR_ENGINE_EXEC" "$OCR_ENGINE_INPUT_ARG" "$file" "$OCR_ENGINE_ARGS" "$OCR_ENGINE_OUTPUT_ARG" "${file%.*}$FILENAME_ADDITION$FILENAME_SUFFIX$FILE_EXTENSION"
if [ "$_BATCH_RUN" -eq 1 ] && [ "$_SILENT" -ne 1 ]; then
echo "Processed $file"
fi
echo -e "$(date) - Processed $file" >> "$LOG_FILE"
if [ "$DELETE_ORIGINAL" == "yes" ]; then
rm -f "$file"
fi
}
if [ "$CHECK_PDF" == "yes" ]; then
if ! pdffonts "$file" 2>&1 | grep "yes" > /dev/null; then
proceed
else
echo "$(date) - Skipping file $file already containing text." >> '"$LOG_FILE"';
fi
else
proceed
fi
########################################################################
END
find "$DIRECTORY_TO_PROCESS" -type f \
-iregex ".*\.$FILES_TO_PROCES" \
! -name "$find_excludes" \
-exec bash "$script" '{}' \;
The 'END' of the heredoc is quoted, so the variables are not expanded until the script is actually executed.
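A small illustration of what quoting the delimiter changes (msg, quoted.sh and unquoted.sh are hypothetical, purely for demonstration):
msg=hello
cat <<'END1' > quoted.sh       # quoted delimiter: the file gets the literal line: echo "$msg"
echo "$msg"
END1
cat <<END2 > unquoted.sh       # unquoted delimiter: $msg expands now, so the file gets: echo "hello"
echo "$msg"
END2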
I ended up using a while loop fed by process substitution of the find command, i.e.:
while IFS= read -r -d $'\0' file; do
if ! lsof -f -- "$file" > /dev/null 2>&1; then
if [ "$_BATCH_RUN" == true ]; then
Logger "Preparing to process [$file]." "NOTICE"
fi
OCR "$file" "$fileExtension" "$ocrEngineArgs" "$csvHack"
else
if [ "$_BATCH_RUN" == true ]; then
Logger "Cannot process file [$file] currently in use." "ALWAYS"
else
Logger "Deferring file [$file] currently being written to." "ALWAYS"
kill -USR1 $SCRIPT_PID
fi
fi
done < <(find "$directoryToProcess" -type f -iregex ".*\.$FILES_TO_PROCES" ! -name "$findExcludes" -and ! -wholename "$moveSuccessExclude" -and ! -wholename "$moveFailureExclude" -and ! -name "$failedFindExcludes" -print0)
The while loop reads every file name from the find command into the file variable.
Using -d $'\0' with read and -print0 with find handles special filenames safely.
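As a generic skeleton, the pattern looks like this (process substitution keeps the loop in the current shell, so variables set inside it survive; process_one is a placeholder for the per-file work):
while IFS= read -r -d '' file; do
    process_one "$file"
done < <(find "$directoryToProcess" -type f -print0)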

Search for file formats + options

I was fiddling around with bash last month and am trying to create a script.
I want the script to search through the folders for files with some kind of extension defined by the argument -e. The folders are given as plain arguments without an option flag. The output is two columns: the first prints the found files, and the second the respective folders.
Is this the most efficient and/or flexible way to go?
I also can't manage to get the -l option to work. Any idea what's wrong? When I enter -name \${CHAR}*, it simply doesn't work. Also, how can I make it recognize a range being used? With an if statement looking for the "-" character or something?
I think I managed to mount a block device, but how can I add the path as a parameter so it can be used as a folder? Setting a number as a variable name doesn't work; it tells me it doesn't recognize the command.
For some reason the 'no recursion' tag works, but the 'no numbers' one doesn't. I have no idea why this would be different.
For the 'no numbers' (nn) and 'no recursion' (nr) tags I use a long option --tag for the arguments. Is it possible to use only one -tag? This is possible with getopts, but then I can't manage to use the other tags after getopts has been used. Does someone have a solution?
Finally, when two files have the same file name, is it possible to show the file name only once instead of printing it twice, but keep a blank entry for every additional file with the same name, so all the folders still show in the second column?
#!/bin/bash
#FUNCTIONS
#Error
#Also written to stderr
err() {
echo 1>&2;
echo "Error, not enough arguments" 1>&2;
echo "Usage: $0 [-e <file extension>] [<folder>]";
echo "Please enter the argument -e and at least 1 folder.";
echo "More: Please chek Help by using -h or --help.";
echo 1>&2;
exit
}
#Help
help() {
echo
echo "--- Help ---"
echo
echo "This script will look for file extentions in 1 or more directories. The output shows the found files with the according folder where it's located."
echo
echo "Argument -e <ext> is required."
echo "Other arguments the to-look-trough folders."
echo
echo "These are also usable options:"
echo "-h or --help shows this."
echo "-l <character> looks for files starting with the character."
echo "-l <character1>-<character2> does the same, but looks trough a range of characters."
echo "-b <block-device> mounts a partition to /mnt and let it search through."
echo "--nn (no numbers) makes sure there are no numbers in the file name."
echo "--nr (no recursion) doesn't look trough subdirectories."
echo "-r of –-err <file> writes the errors (f.e. corrupted directory) to <file>."
echo "-s <word> searches the word through the files and only shows the files having that word."
echo
exit
}
#VARS
#execute getopt
OPTS=$(getopt -o e:hl:b:r:s: -l "help,nn,nr,err" -n "FileExtensionScript" -- "$@");
#Bad arguments
if [ $? -ne 0 ];
then
err;
exit
fi
#Rearrange arguments
eval set -- "$OPTS";
#echo "AFTER SET -- \$OPTS: $#";
while true; do
case "$1" in
-e)
shift;
if [ -n "$1" ]; then
EXT=$1;
shift;
fi
;;
-h|--help)
shift;
help;
;;
-l)
shift;
if [ -n "$1" ]; then
CHAR=$1;
shift;
fi
;;
-b)
shift;
if [ -n "$1" ]; then
sudo mkdir /mnt/$1;
sudo echo -e "/dev/$1 /mnt/$1 vfat defaults 0 0 " >> /etc/fstab;
sudo mount -a;
999=/mnt/$1;
shift;
fi
;;
--nn)
shift;
NONUM=" ! -name '*[0-9]*'";
;;
--nr)
shift;
NOREC="-maxdepth 1";
;;
-f|--err)
shift;
if [ -n "$1" ]; then
ERROR="| 2>filename | tee -a $1";
shift;
fi
;;
-s)
shift;
if [ -n "$1" ]; then
SEARCH="-name '*$1*'";
shift;
fi
;;
--)
shift;
break;
;;
esac
done
#No folder or arguments given
if [ $# -lt 1 ];
then
err;
exit
fi
#Debug
echo "Folder argumenten: $#" >2;
echo \# $# >2;
#Create arrays with found files and according folders
FILES=( $(find "$@" $NOREC $SEARCH $NONUM -name \*.${EXT} $ERROR | rev | cut -d/ -f 1 | rev) )
FOLDERS=( $(find "$@" $NOREC $SEARCH $NONUM -name \*.${EXT} $ERROR | rev | cut -d/ -f 1 --complement | rev) )
#Show arrays in 2 columns
for ((i = 0; i <= ${#FILES[@]}; i++));
do
printf '%s %s\n' "${FILES[i]}" "${FOLDERS[i]}"
done | column -t | sort -k1 #Make columns cleaner + sort on filename
I am not a native English speaker and am hoping to get some tips to finish my script :) Thanks in advance!
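One tip that applies directly here is the array technique from the first answer above: collect the variable parts of the find command in an array instead of in strings like $NONUM and $SEARCH, which never survive quoting. A rough sketch under that assumption (the flag variables norec/nonum and a SEARCH variable holding just the word are illustrative, not the script's exact variables):
findargs=()
[[ $norec == yes ]] && findargs+=(-maxdepth 1)          # --nr
[[ $nonum == yes ]] && findargs+=(! -name '*[0-9]*')    # --nn
[[ -n $SEARCH ]] && findargs+=(-name "*$SEARCH*")       # -s <word>
find "$@" "${findargs[@]}" -type f -name "*.$EXT"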

Parameters or arguments in bash

I saw a question on Stack Overflow about parsing arguments. I tried to write this, but it's not working and now it's getting on my nerves.
The usual way of running a script on the terminal is ./scriptname, but I later introduced the argument -d. So, if I put ./scriptname it will not run. If I put ./scriptname -d it will.
Now I want to put another argument for the path (where the files are moving, in this case "/home/elg19/documents") such that when I do not include the path, it won't run. But, if I put ./scriptname -d path I want to replace $To in the existing script with the command argument after -d.
#!/bin/bash
From="/home/mark/doc"
To=$2
if [ $1 = -d ]; then
cd "$From"
for i in pdf txt doc; do
find . -type f -name "*.${i}" -exec mv "{}" "$To" \;
done
fi
Your desired usage isn't completely clear, but it seems to be:
scriptname -d path
So, you can do it the extensible way, or the brute force way. Since you're changing directories willy-nilly, you also need to ensure that the paths are absolute, not relative.
Brute force
#!/bin/bash
From="/home/mark/doc"
if [ $# = 2 ] && [ "$1" = '-d' ] && [ -d $2 ]
then
case "$2" in
(/*) cd "$From" &&
for extn in pdf txt doc
do find . -type f -name "*.$extn" -exec mv {} "$2" \;
done;;
(*) echo "$0: path name must be absolute ($2 is not)" 1>&2; exit 1;;
esac
else
echo "Usage: $0 -d /absolute/dirname" 1>&2; exit 1
fi
Extensible
#!/bin/bash
From="/home/mark/doc"
To=""
usage()
{
echo "Usage: $(basename $0 .sh) -d /absolute/dirname" 1>&2
exit 1
}
while getopts d: opt
do
case "$opt" in
(d) if [ ! -d "$OPTARG" ]
then echo "$0: $OPTARG is not a directory" 1>&2; exit 1
else
case "$OPTARG" in
(/*) To="$OPTARG";;
(*) echo "$0: path name must be absolute ($2 is not)" 1>&2; exit 1;;
esac
fi;;
(*) usage;;
esac
done
shift $(($OPTIND - 1))
if [ $# != 0 ] || [ -z "$To" ]
then usage
fi
cd "$From" &&
for extn in pdf txt doc
do find . -type f -name "*.$extn" -exec mv {} "$To" \;
done
For example, it will be very easy to add a -f from option to deal with changing the source of the files.
Note that you could also use:
for extn in pdf txt doc
do find "$From" -type f -name "*.$extn" -exec mv {} "$To" \;
done
This would allow you to permit relative names for the 'from' and 'to' directories because it does not change directory.
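The script would then be invoked along these lines (the target path is taken from the question, purely illustrative):
./scriptname -d /home/elg19/documents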
I assume you want to do some input validation to your command line arguments. I guess the following would be somewhat useful:
#!/bin/bash
usage() {
echo "USAGE :"
echo "./move -d <to-directory>"
}
if [ $# -ne 2 ] ; then
usage
exit
fi
case $1 in
-d ) shift
To=$1
;;
* ) usage
exit
esac
From="/tmp/From/"
cd "$From"
for i in pdf txt doc; do
find . -type f -name "*.${i}" -exec mv "{}" "$To" \;
done
Moreover to debug your script, you may use the following command:
bash -x ./move.sh -d /tmp/To/
You may add more error checking (and informative echos) for the following cases:
The source/destination directory does not exist
N files have been copied from the source to the destination
No files are available at the source
You can take the type of files as arguments, e.g. -t doc xls pdf
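On the last point, a rough sketch of taking the file types from the command line instead of hard-coding them, using whatever positional parameters remain after option parsing (an assumption, not part of the answer above):
types=("$@")                              # extensions given after the options, if any
((${#types[@]})) || types=(pdf txt doc)   # fall back to the original list
for i in "${types[@]}"; do
    find "$From" -type f -name "*.${i}" -exec mv {} "$To" \;
done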

loop over directories to echo its content

The directories are variables set to the full path.
for e in "$DIR_0" "$DIR_1" "$DIR_2"
do
    for i in "$e"/*
    do
        echo "$i"
    done
done
The output for each line is the full path. I want only the name of each file.
You are looking for basename.
This is the Bash equivalent of basename:
echo "${i##*/}"
It strips off everything before and including the last slash.
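For example (an illustrative path, not taken from the question):
i=/home/mark/doc/report.txt
echo "${i##*/}"    # prints: report.txt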
If you truly do not wish to recurse you can achieve that more succinctly with this find command:
find "$DIR_0" "$DIR_1" "$DIR_2" -maxdepth 1 -type f -exec basename {} \;
If you wish to recurse over subdirectories, simply leave out -maxdepth:
find "$DIR_0" "$DIR_1" "$DIR_2" -type f -exec basename {} \;
To traverse a directory recursively with bash, try this (you can find it here):
#! /bin/bash
indent_print()
{
for((i=0; i < $1; i++)); do
echo -ne "\t"
done
echo "$2"
}
walk_tree()
{
local oldifs bn lev pr pmat
if [[ $# -lt 3 ]]; then
if [[ $# -lt 2 ]]; then
pmat=".*"
else
pmat="$2"
fi
walk_tree "$1" "$pmat" 0
return
fi
lev=$3
[ -d "$1" ] || return
oldifs=$IFS
IFS=""
for el in "$1"/*; do
bn=$(basename "$el")
if [[ -d "$el" ]]; then
indent_print $lev "$bn/"
pr=$( walk_tree "$el" "$2" $(( lev + 1)) )
echo "$pr"
else
if [[ "$bn" =~ $2 ]]; then
indent_print $lev "$bn"
fi
fi
done
IFS=$oldifs
}
walk_tree "$1" "\.sh$"
See also the POSIX compliant Bash functions to replace basename & dirname here:
http://cfaj.freeshell.org/src/scripts/

Resources