Using bash scripting, I am looking to search for a file based upon a path, but I would like to search from the bottom of the path up. Given something like /path/to/directory/here, I would first search "here" for a file ".important", then go up to "directory" and search for ".important", and so forth up the tree. I don't want to recurse downward at any point in the path.
Thanks
Easy enough once you understand string manipulation in bash.
dest=/path/to/directory/here
curr=
# quote right-hand side to prevent interpretation as glob-style pattern
while [[ $curr != "$dest" ]]; do
if [[ -e $curr/.important ]]; then
printf 'Found ' >&2
printf '%s\n' "$curr/.important"
else
printf '%s\n' "Not found at $curr" >&2
fi
rest=${dest#$curr/} # strip $curr/ from $dest to get $rest
next=${rest%%/*} # strip anything after the first / from next
[[ $next ]] || break # break if next is empty
curr=$curr/$next # otherwise, add next to curr and recur
done
See http://wiki.bash-hackers.org/syntax/pe for more on the string expansion syntax used here.
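To see what the two expansions do at each step, here is a standalone trace (the paths are invented for illustration):

```shell
dest=/path/to/directory/here
curr=/path
rest=${dest#"$curr/"}   # strip the leading "$curr/" -> to/directory/here
next=${rest%%/*}        # keep everything before the first / -> to
printf '%s\n' "$rest" "$next"
```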
Alternately:
( set -f; cd /; IFS=/; for dir in $dest; do
cd "$dir" || break
if [ -e .important ]; then
pwd
break
fi
done )
Key points:
set -f disables globbing; otherwise, this will behave very badly for a directory named *.
IFS=/ sets string-splitting on expansion to operate on /.
for dir in $dest is only safe after the two above operations have been done.
breaking if cd fails is essential to ensure that your script is actually in the directory that it thinks it's in.
Note that this is done in a subshell (per the parentheses) to prevent its changes to shell settings (the set -f and IFS=) from impacting the larger script. This means you can use it in $() and read its output via stdout into a shell variable, but you can't set a variable inside it and expect that variable to still be set in the parent script.
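Because the result is written to stdout, the subshell composes naturally with command substitution. A quick sketch (the directory layout is fabricated in a temp dir just to exercise the loop; note that this variant walks from / downward, so it reports the shallowest match):

```shell
# Build a throwaway tree with a marker file partway up.
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b/c"
touch "$tmp/a/.important"

dest=$tmp/a/b/c
found=$( set -f; cd /; IFS=/
  for dir in $dest; do
    cd "$dir" || break
    if [ -e .important ]; then pwd; break; fi
  done )
echo "first .important found walking down from /: $found"
rm -rf "$tmp"
```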
Related
Removing non-existing directories from the PATH environment variable is a neat way to manage your PATH: add every location that could ever exist, then remove those that don't. It's a lot DRYer than checking for a directory's existence each time you add one.
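The add-then-prune idea can be sketched as a small helper; this is an illustrative version of mine (not the path_checkdir function shared below) that keeps only entries which exist as directories:

```shell
prune_path() {
  # Rebuild PATH, keeping only entries that are existing directories.
  # Note: entries containing colons or glob characters are not handled.
  local newpath= dir oIFS=$IFS
  IFS=:
  for dir in $PATH; do
    [ -d "$dir" ] && newpath=$newpath:$dir
  done
  IFS=$oIFS
  PATH=${newpath#:}
}

PATH=$PATH:/does/not/exist
prune_path   # the bogus entry is gone again
```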
I recently wrote a dash/bash function to do this, so I thought I'd share it since apparently this hasn't been addressed anywhere else.
path_checkdir
This code aims to be dash-compatible (though read -d, used below, is not available in every POSIX shell).
path_checkdir() {
keep_="="
remove_="_"
help='
Usage: path_checkdir [-v] [-K =] [-R _] [-i ignored_path]
-i ignore_this_path
Accept the specified path without checking the existence of the directory.
/!\ Beware, specifying it more than once will overwrite the preceding value.
I use it to keep single newlines in my $PATH.
-v
Tell which directories are kept and which are removed.
-K marker_keep_path
-R marker_remove_path
Replace the default values (= for -K and _ for -R) used by -v to tell what is
kept and what is removed.
'
while [ $# -gt 0 ]
do
case "$1" in
"-v") verbose=t;;
"-i") shift; ignore="i$1";;
"-K") shift; keep_="$1";;
"-R") shift; remove_="$1";;
"-h"|"--help") echo "$help"
esac
shift
done
# /!\ IFS characters are stripped when using `read`
local oIFS="$IFS"
IFS=''
# /!\ Beware pipes. They imply subshells
# The usual alternative is to use process substitution, but it
# won't work with dash, so I used file descriptor redirections
# instead.
{
PATH="$(echo "$PATH:" | { # append : so read also captures the last entry
P=""
while read -rd: dir
do
if [ "i$dir" = "$ignore" ] || [ -d "$dir" ]
then
# If -v is provided, be verbose about what is kept (=) and
# what is removed (_).
if [ $verbose ]
then echo "$keep_$dir" >&3
fi
P="$P:$dir"
else
if [ $verbose ]
then echo "$remove_$dir" >&3
fi
fi
done
echo "${P:1}"; })"
} 3>&1
IFS="$oIFS"
}
Now, there is still a lot to improve. It accepts only one path exception, while it would be great to accept any number, and probably to support wildcard patterns too. More importantly, if some paths in $PATH contain a ~, they won't be correctly interpreted and will be removed. I'm not sure what shell expansions are applied to $PATH, nor how to re-create them. I'll probably add support for that in the future.
I am trying to pass a list of files, including wildcard patterns and directories, that I want to delete, but I want to check whether they exist before deleting them. If they are deleted, notify that the directory was deleted, not each individual file within the directory. I.e. if I remove /root/*.tst, just say "I removed *.tst".
#!/bin/bash
touch /boot/{1,2,3,4,5,6,7,8,9,10}.tst
touch /root/{1,2,3,4,5,6,7,8,9,10}.tst
mkdir /root/tmpdir
#setup files and directories we want to delete
file=/boot/*.tst /root/*.tst /root/tmpdir
for i in $file; do
if [[ -d "$file" || -f "$file" ]]; then #do I exist
rm -fr $i
echo removed $i #should tell me that I removed /boot/*.tst or /root/*.tst or /root/tmpdir
else
echo $i does not exist # should tell me if /boot/*.tst or /root/*.tst or /root/tmpdir DNE
fi
done
I can't seem to make any combination of single or double quotes or escaping * make the above do what I want it to do.
Before explaining why your code fails, here is what you should use:
for i in /boot/*.tst /root/*.tst /root/tmpdir; do
# i will be a single file if a pattern expanded, or the literal
# pattern if it did not. You can check using this line:
[[ -e $i ]] || continue
# or use shopt -s nullglob before the loop to cause non-matching
# patterns to be silently ignored
rm -fr "$i"
echo "removed $i"
done
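To see what the shopt -s nullglob suggestion changes, here is a self-contained demonstration in a throwaway directory (nullglob makes non-matching patterns expand to nothing instead of to themselves):

```shell
tmp=$(mktemp -d)
touch "$tmp/a.tst"

shopt -s nullglob
for i in "$tmp"/*.tst "$tmp"/*.doc; do   # *.doc matches nothing and is silently dropped
    rm -f "$i"
    echo "removed $i"
done
shopt -u nullglob

rmdir "$tmp"   # succeeds because the loop removed the only file
```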
It appears that you would want i to be set to each of three patterns, which is a little tricky and should probably be avoided, since most of the operators you are using expect single file or directory names, not patterns that match multiple names.
The attempt you show
file=/boot/*.tst /root/*.tst /root/tmpdir
would expand /root/*.tst and try to use the first name in the expansion as a command name, executed in an environment where the variable file had the literal value /boot/*.tst. To include all the patterns in the string, you would need to escape the spaces between them, with either
file=/boot/*.tst\ /root/*.tst\ /root/tmpdir
or more naturally
file="/boot/*.tst /root/*.tst /root/tmpdir"
Either way, the patterns are not yet expanded; the literal * is stored in the value of file. You would then expand it using
for i in $file # no quotes!
and after $file expands to its literal value, the stored patterns would be expanded into the set of matching file names. However, this loop would only work for file names that didn't contain whitespace; a single file named foo bar would be seen as two separate values to assign to i, namely foo and bar. The correct way to deal with such file names in bash is to use an array:
files=( /boot/*.tst /root/*.tst /root/tmpdir )
# Quotes are necessary this time to protect space-containing filenames.
# Unlike a regular parameter assignment, the patterns are expanded to the matching
# set of file names first, then the resulting list of files is assigned to the array,
# one file name per element.
for i in "${files[@]}"
You can replace
file=/boot/*.tst /root/*.tst /root/tmpdir
by
printf -v file "%s " /boot/*.tst /root/*.tst /root/tmpdir
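With printf -v, the expanded names are written into the named variable instead of to stdout, joined by the format's trailing space. A quick illustration with made-up names (note this still breaks later for names containing whitespace):

```shell
printf -v file '%s ' one.tst two.tst tmpdir
echo "$file"   # -> one.tst two.tst tmpdir (with a trailing space)
```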
The shell expands globs automatically. If you want to be able to print the literal globs in an error message then you'll need to quote them.
rmglob() {
local glob
for glob in "$@"; do
local matched=false
for path in $glob; do
[[ -e $path ]] && rm -rf "$path" && matched=true
done
$matched && echo "removed $glob" || echo "$glob does not exist" >&2
done
}
rmglob '/boot/*.tst' '/root/*.tst' '/root/tmpdir'
Notice the careful use of quoting. The arguments to rmglob are quoted. The $glob variable inside the function is not quoted (for path in $glob), which triggers shell expansion at that point.
Many thanks to everyone's posts, including John Kugelman's.
This is the code I finally went with; it provides two types of deleting. The first is a bit more forceful, deleting everything. The second preserves directory structures, removing only the files. As per above, note that whitespace in file names is not handled by this method.
rmfunc() {
local glob
for glob in "$@"; do
local matched=false
local checked=true
for path in $glob; do
$checked && echo -e "\nAttempting to clean $glob" && checked=false
[[ -e $path ]] && rm -fr "$path" && matched=true
done
$matched && echo -e "\n\e[1;33m[\e[0m\e[1;32mPASS\e[1;33m]\e[0m Cleaned $glob" || echo -e "\n\e[1;33m[\e[0m\e[1;31mERROR\e[1;33m]\e[0m Can't find $glob (non fatal)."
done
}
# Type 2 removal
xargfunc() {
local glob
for glob in "$@"; do
local matched=false
local checked=true
for path in $glob; do
$checked && echo -e "\nAttempting to clean $glob" && checked=false
[[ -n $(find "$path" -type f) ]] && find "$path" -type f | xargs rm -f && matched=true
done
$matched && echo -e "\n\e[1;33m[\e[0m\e[1;32mPASS\e[1;33m]\e[0m Cleaned $glob" || echo -e "\n\e[1;33m[\e[0m\e[1;31mERROR\e[1;33m]\e[0m Can't find $glob (non fatal)."
done
}
I am looking to search for strings within a file using variables.
I have a script that will accept 3 or 4 parameters: 3 are required; the 4th isn't mandatory.
I would like to search the text file for the 3 parameters matching within the same line, and if they do match then I want to remove that line and replace it with my new one - basically it would update the 4th parameter if set, and avoid duplicate entries.
Currently this is what I have:
input=$(egrep -e '$domain\s+$type\s+$item' ~/etc/security/limits.conf)
if [ "$input" == "" ]; then
echo $domain $type $item $value >>~/etc/security/limits.conf
echo \"$domain\" \"$type\" \"$item\" \"$value\" has been successfully added to your limits.conf file.
else
cat ~/etc/security/limits.conf | egrep -v "$domain|$type|$item" >~/etc/security/limits.conf1
rm -rf ~/etc/security/limits.conf
mv ~/etc/security/limits.conf1 ~/etc/security/limits.conf
echo $domain $type $item $value >>~/etc/security/limits.conf
echo \"$domain\" \"$type\" \"$item\" \"$value\" has been successfully added to your limits.conf file.
exit 0
fi
Now I already know that the input=egrep etc.. will not work; it works if I hard code some values, but it won't accept those variables. Basically I have domain=$1, type=$2 and so on.
I would like it so that if all 3 variables are not matched within one line, then it will just append the parameters to the end of the file; but if the parameters do match, then I want that line to be deleted and the new one appended to the file. I know I can use other things like sed and awk, but I have yet to learn them.
This is for a school assignment, and all help is very much appreciated, but I'd also like to learn why and how it works/doesn't, so if you can provide answers to that as well that would be great!
Three things:
To assign the output of a command, use var=$(cmd).
Don't put spaces around the = in assignments.
Expressions don't expand in single quotes: use double quotes.
To summarize:
input=$(egrep -e "$domain\s+$type\s+$item" ~/etc/security/limits.conf)
Also note that ~ is your home directory, so if you meant /etc/security/limits.conf and not /home/youruser/etc/security/limits.conf, leave off the ~
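The quoting rule is easy to verify directly:

```shell
domain=example.com
echo '$domain'   # single quotes: prints the literal text $domain
echo "$domain"   # double quotes: prints example.com
```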
You have several bugs in your script. Here's your script with some comments added
input=$(egrep -e '$domain\s+$type\s+$item' ~/etc/security/limits.conf)
# use " not ' in the string above or the shell can't expand your variables.
# some versions of egrep won't understand '\s'. The safer, POSIX character class is [[:blank:]].
if [ "$input" == "" ]; then
# the shell equality test operator is =, not ==. Some shells will also take == but don't count on it.
# the normal way to check for a variable being empty in shell is with `-z`
# you can have problems with tests in some shells if $input is empty, in which case you'd use [ "X$input" = "X" ].
echo $domain $type $item $value >>~/etc/security/limits.conf
# echo is unsafe and non-portable, you should use printf instead.
# the above calls echo with 4 args, one for each variable - you probably don't want that and should have double-quoted the whole thing.
# always double-quote your shell variables to avoid word splitting and file name expansion (google those - you don't want them happening here!)
echo \"$domain\" \"$type\" \"$item\" \"$value\" has been successfully added to your limits.conf file.
# the correct form would be:
# printf '"%s" "%s" "%s" "%s" has been successfully added to your limits.conf file.\n' "$domain" "$type" "$item" "$value"
else
cat ~/etc/security/limits.conf | egrep -v "$domain|$type|$item" >~/etc/security/limits.conf1
# Useless Use Of Cat (UUOC - google it). [e]grep can open files just as easily as cat can.
rm -rf ~/etc/security/limits.conf
# -r is for recursively removing files in a directory - inappropriate and misleading when used on a single file.
mv ~/etc/security/limits.conf1 ~/etc/security/limits.conf
# pointless to remove the file above when you're overwriting it here anyway
# If your egrep above failed to create your temp file (e.g. due to memory or permissions issues) then the "mv" above would zap your real file. the correct way to do this is:
# egrep regexp file > tmp && mv tmp file
# i.e. use && to only do the mv if creating the tmp file succeeded.
echo $domain $type $item $value >>~/etc/security/limits.conf
# see previous echo comments.
echo \"$domain\" \"$type\" \"$item\" \"$value\" has been successfully added to your limits.conf file.
# ditto
exit 0
# pointless and misleading having an explicit "exit <success>" when that's what the script will do by default anyway.
fi
This line:
input=$(egrep -e '$domain\s+$type\s+$item' ~/etc/security/limits.conf)
requires double quotes around the regex to allow the shell to interpolate the variable values.
input=$(egrep -e "$domain\s+$type\s+$item" ~/etc/security/limits.conf)
You need to be careful with backslashes; you probably don't have to double them up in this context, but you should be sure you know why.
You should be aware that your first egrep command is much more restrictive in what it selects than the second egrep, which is used to delete data from the file. The first requires an entry with the three fields on a single line; the second only requires a match with any one of the words (which could even be part of a larger word) to delete a line.
Since ~/etc/security/limits.conf is a file, there is no need to use the -r option of rm; it is advisable not to use the -r unless you intend to remove directories.
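Putting all of the review comments together, a corrected sketch might look like this. The demo values at the top are mine; in the real script domain, type, item and value come from $1..$4 and the file is ~/etc/security/limits.conf as in the question. Treat it as a starting point rather than a drop-in replacement:

```shell
#!/bin/bash
# Demo values, standing in for $1..$4 and the real limits.conf path.
conf=$(mktemp)
printf 'alice soft nofile 1024\nbob hard nproc 50\n' > "$conf"
domain=alice type=soft item=nofile value=4096

pattern="^$domain[[:blank:]]+$type[[:blank:]]+$item"
if grep -Eq "$pattern" "$conf"; then
    # Remove only lines matching all three fields; && guards the mv so a
    # failed redirection can't clobber the real file.
    grep -Ev "$pattern" "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
fi
printf '%s %s %s %s\n' "$domain" "$type" "$item" "$value" >> "$conf"
printf '"%s" "%s" "%s" "%s" has been successfully added to your limits.conf file.\n' \
    "$domain" "$type" "$item" "$value"
```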
I'm writing a bash shell script that uses a case with three options:
If the user enters "change -r txt doc *", a file extension gets changed in a subdirectory.
If a user enters "change -n -r doc ", it should rename files that end with .-r or .-n (this will rename all files in the current directory called *.-r as *.doc)
If the user enters nothing, as in "change txt doc *", it just changes a file extension in the current directory.
Here's the code I produced for it (I'm not sure how to implement the last two options):
#!/bin/bash
case $1 in
-r)
export currectFolder=`pwd`
for i in $(find . -iname "*.$2"); do
export path=$(readlink -f $i)
export folder=`dirname $path`
export name=`basename $path .$2`
cd $folder
mv $name.$2 $name.$3
cd $currectFolder
done
;;
-n)
echo "-n"
;;
*)
echo "all"
esac
Can anyone fix this for me? Or at least tell me where I'm going wrong?
What you should brush up on are string substitutions. All kinds of them actually. Bash is very good with those. Page 105 (recipe 5.18) of the Bash Cookbook is excellent reading for that.
#!/bin/bash
# Make it more flexible for improving command line parsing later
SWITCH=$1
EXTENSIONSRC=$2
EXTENSIONTGT=$3
# Match different cases for the only allowed switch (other than file extensions)
case $SWITCH in
-r|--)
# If it's not -r we limit the find to the current directory
[[ "x$SWITCH" == "x-r" ]] || DONTRECURSE="-maxdepth 1"
# Files in current folder with particular pattern (and subfolders when -r)
find . $DONTRECURSE -iname "*.$EXTENSIONSRC" | while IFS= read -r fname; do
# We use a while to allow for file names with embedded blank spaces
# Get canonical name of the item into CFNAME
CFNAME=$(readlink -f "$fname")
# Strip extension through string substitution
NOEXT_CFNAME="${CFNAME%.$EXTENSIONSRC}"
# Skip renaming if target exists. This can happen due to collisions
# with case-insensitive matching ...
if [[ -f "$NOEXT_CFNAME.$EXTENSIONTGT" ]]; then
echo "WARNING: Skipping $CFNAME"
else
echo "Renaming $CFNAME"
# Do the renaming ...
mv "$CFNAME" "$NOEXT_CFNAME.$EXTENSIONTGT"
fi
done
;;
*)
# The -e for echo means that escape sequences like \n and \t get evaluated ...
echo -e "ERROR: unknown command line switch\n\tSyntax: change <-r|--> <source-ext> <target-ext>"
# Exit with non-zero (i.e. failure) status
exit 1
esac
The syntax is obviously given in the script. I took the liberty of using the convention of -- separating command line switches from file names. This way it looks cleaner and is easier to implement, actually.
NB: it is possible to condense this further. But here I was trying to get a point across, rather than win the obfuscated Bash contest ;)
PS: also handles the case-insensitive stuff now in the renaming part. However, I decided to make it skip if the target file already exists. Can perhaps be rewritten to be a command line option.
With /bin/bash, how would I detect if a user has a specific directory in their $PATH variable?
For example
if [ -p "$HOME/bin" ]; then
echo "Your path is missing ~/bin, you might want to add it."
else
echo "Your path is correctly set"
fi
Using grep is overkill, and can cause trouble if you're searching for anything that happens to include RE metacharacters. This problem can be solved perfectly well with bash's builtin [[ command:
if [[ ":$PATH:" == *":$HOME/bin:"* ]]; then
echo "Your path is correctly set"
else
echo "Your path is missing ~/bin, you might want to add it."
fi
Note that adding colons before both the expansion of $PATH and the path to search for solves the substring match issue; double-quoting the path avoids trouble with metacharacters.
There is absolutely no need to use external utilities like grep for this. Here is what I have been using, which should be portable back to even legacy versions of the Bourne shell.
case :$PATH: # notice colons around the value
in *:$HOME/bin:*) ;; # do nothing, it's there
*) echo "$HOME/bin not in $PATH" >&2;;
esac
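The same case test drops straight into a reusable guard; a minimal sketch (the function name is mine):

```shell
add_to_path() {
    # Append $1 to PATH only if it is not already present as a full entry.
    case :$PATH: in
        *:"$1":*) ;;            # already there; do nothing
        *) PATH=$PATH:$1 ;;
    esac
}

PATH=/usr/bin:/bin
add_to_path /opt/tools   # appended
add_to_path /usr/bin     # ignored, already present
echo "$PATH"
```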
Here's how to do it without grep:
if [[ $PATH == ?(*:)$HOME/bin?(:*) ]]
The key here is to make the colons and wildcards optional using the ?() construct (extended globbing; older bash versions need shopt -s extglob for this, while newer ones allow extended patterns inside [[ ]] automatically). There shouldn't be any problem with metacharacters in this form, but if you want to include quotes this is where they go:
if [[ "$PATH" == ?(*:)"$HOME/bin"?(:*) ]]
This is another way to do it using the match operator (=~) so the syntax is more like grep's:
if [[ "$PATH" =~ (^|:)"${HOME}/bin"(:|$) ]]
Something really simple and naive:
echo "$PATH"|grep -q whatever && echo "found it"
Where whatever is what you are searching for. Instead of && you can put $? into a variable or use a proper if statement.
Limitations include:
The above will match substrings of larger paths (try matching on "bin" and it will probably find it, despite the fact that "bin" isn't in your path, /bin and /usr/bin are)
The above won't automatically expand shortcuts like ~
Or using a perl one-liner:
perl -e 'exit(!(grep(m{^/usr/bin$},split(":", $ENV{PATH}))) > 0)' && echo "found it"
That still has the limitation that it won't do any shell expansions, but it doesn't fail if a substring matches. (The above matches "/usr/bin", in case that wasn't clear).
Here's a pure-bash implementation that will not pick up false-positives due to partial matching.
if [[ $PATH =~ ^/usr/sbin:|:/usr/sbin:|:/usr/sbin$ ]] ; then
do stuff
fi
What's going on here? The =~ operator uses regex pattern support present in bash starting with version 3.0. Three patterns are being checked, separated by regex's OR operator |.
All three sub-patterns are relatively similar, but their differences are important for avoiding partial-matches.
In regex, ^ matches the beginning of a line and $ matches the end. As written, the first pattern will only evaluate to true if the path it's looking for is the first value within $PATH. The third pattern will only evaluate to true if the path it's looking for is the last value within $PATH. The second pattern will evaluate to true when it finds the path it's looking for in between other values, since it looks for the delimiter the $PATH variable uses, :, on either side of the path being searched for.
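The anchoring is easy to exercise against both a hit and a near-miss (the paths are invented for the demo):

```shell
p="/usr/sbin:/usr/bin"            # first entry: matched by the ^...: pattern
[[ $p =~ ^/usr/sbin:|:/usr/sbin:|:/usr/sbin$ ]] && echo "match"

p="/opt/usr/sbin-extra:/bin"      # only a partial match: none of the patterns fire
[[ $p =~ ^/usr/sbin:|:/usr/sbin:|:/usr/sbin$ ]] || echo "no match"
```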
I wrote the following shell function to report if a directory is listed in the current PATH. This function is POSIX-compatible and will run in compatible shells such as Dash and Bash (without relying on Bash-specific features).
It includes functionality to convert a relative path to an absolute path. It uses the readlink or realpath utilities for this but these tools are not needed if the supplied directory does not have .. or other links as components of its path. Other than this, the function doesn’t require any programs external to the shell.
# Check that the specified directory exists – and is in the PATH.
is_dir_in_path()
{
if [ -z "${1:-}" ]; then
printf "The path to a directory must be provided as an argument.\n" >&2
return 1
fi
# Check that the specified path is a directory that exists.
if ! [ -d "$1" ]; then
printf "Error: ‘%s’ is not a directory.\n" "$1" >&2
return 1
fi
# Use absolute path for the directory if a relative path was specified.
if command -v readlink >/dev/null ; then
dir="$(readlink -f "$1")"
elif command -v realpath >/dev/null ; then
dir="$(realpath "$1")"
else
case "$1" in
/*)
# The path of the provided directory is already absolute.
dir="$1"
;;
*)
# Prepend the path of the current directory.
dir="$PWD/$1"
;;
esac
printf "Warning: neither ‘readlink’ nor ‘realpath’ are available.\n"
printf "Ensure that the specified directory does not contain ‘..’ in its path.\n"
fi
# Check that dir is in the user’s PATH.
case ":$PATH:" in
*:"$dir":*)
printf "‘%s’ is in the PATH.\n" "$dir"
return 0
;;
*)
printf "‘%s’ is not in the PATH.\n" "$dir"
return 1
;;
esac
}
The part using :$PATH: ensures that the pattern also matches if the desired path is the first or last entry in the PATH. This clever trick is based upon this answer by Glenn Jackman on Unix & Linux.
This is a brute force approach but it works in all cases except when a path entry contains a colon. And no programs other than the shell are used.
previous_IFS=$IFS
dir_in_path='no'
export IFS=":"
for p in $PATH
do
[ "$p" = "/path/to/check" ] && dir_in_path='yes'
done
[ "$dir_in_path" = "no" ] && export PATH="$PATH:/path/to/check"
export IFS=$previous_IFS
$PATH is a list of strings separated by : that describe a list of directories. A directory is a list of strings separated by /. Two different strings may point to the same directory (like $HOME and ~, or /usr/local/bin and /usr/local/bin/). So we must fix the rules of what we want to compare/check. I suggest to compare/check the whole strings, and not physical directories, but remove duplicate and trailing /.
First remove duplicate and trailing / from $PATH:
echo "$PATH" | tr -s / | sed 's/\/:/:/g;s/:/\n/g'
Now suppose $d contains the directory you want to check. Then pipe the previous command into grep to check for $d in $PATH.
echo "$PATH" | tr -s / | sed 's/\/:/:/g;s/:/\n/g' | grep -q "^$d$" || echo "missing $d"
A better and faster solution is this:
DIR=/usr/bin
[[ " ${PATH//:/ } " =~ " $DIR " ]] && echo Found it || echo Not found
I personally use this in my bash prompt to add icons when I go to directories that are in $PATH.