Bash Compound Conditional, With Wildcards and File Existence Check - bash

I've mastered the basics of Bash compound conditionals and have read a few different ways to check for file existence of a wildcard file, but this one is eluding me, so I figured I'd ask for help...
I need to:
1.) Check if some file matching a pattern exists
AND
2.) Check that text in a different file exists.
I know there's lots of ways to do this, but I don't really have the knowledge to prioritize them (if you have that knowledge I'd be interested in reading about that as well).
First things that came to mind is to use find for #1 and grep for #2
So something like
if [ `grep -q "OUTPUT FILE AT STEP 1000" ../log/minimize.log` ] \
&& [ `find -name "jobscript_minim\*cmd\*o\*"` ]; then
echo "Both passed! (1)"
fi
That fails, though curiously:
if `grep -q "OUTPUT FILE AT STEP 1000" ../log/minimize.log` ;then
echo "Text passed!"
fi
if `find -name "jobscript_minim\*cmd\*o\*"` ;then
echo "File passed!"
fi
both pass...
I've done a bit of reading and have seen people talking about the problem of multiple filenames matching wildcards within an if statement. What's the best solution to this? (In answering my question, I'd assume you'd take a crack at that one as well.)
Any ideas/solutions/suggestions?

Let's tackle why your attempt failed first:
if [ `grep -q …` ];
This runs the grep command between backticks, and interpolates the output inside the conditional command. Since grep -q doesn't produce any output, it's as if you wrote if [ ];
The conditional is supposed to test the return code of grep, not anything about its output. Therefore it should be simply written as
if grep -q …;
The find command returns 0 (i.e. true) even if it finds nothing, so this technique won't work. What will work is testing whether its output is empty, by collecting its output and comparing it to the empty string:
if [ "$(find …)" != "" ];
(An equivalent test is if [ -n "$(find …)" ].)
Notice two things here:
I used $(…) rather than backticks. They're equivalent, except that backticks require strange quoting inside them (especially if you try to nest them), whereas $(…) is simple and reliable. Just use $(…) and forget about backticks (except that you need to write \` inside double quotes).
There are double quotes around $(…). This is really important. Without the quotes, the shell would break the output of the find command into words. If find prints, say, two lines dir/file and dir/otherfile, we want if [ "dir/file dir/otherfile" = "" ]; to be executed, not if [ dir/file dir/otherfile = "" ]; which is a syntax error. This is a general rule of shell programming: always put double quotes around a variable or command substitution. (A variable substitution is $foo or ${foo}; a command substitution is $(command).)
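As a rough sketch of the two tests discussed so far, the following compares grep's exit status with find's output. The scratch directory and the file names in it are made up for illustration.

```shell
# Hypothetical scratch directory and file names, for illustration only.
tmp=$(mktemp -d)
printf 'OUTPUT FILE AT STEP 1000\n' > "$tmp/minimize.log"
touch "$tmp/jobscript_minim_a.cmd.o1" "$tmp/jobscript_minim_b.cmd.o2"

# grep -q prints nothing; only its exit status matters.
if grep -q "OUTPUT FILE AT STEP 1000" "$tmp/minimize.log"; then
    text_found=yes
else
    text_found=no
fi

# find exits 0 whether or not anything matched, so test its output instead.
if [ -n "$(find "$tmp" -name 'jobscript_minim*cmd*o*')" ]; then
    file_found=yes
else
    file_found=no
fi

# A pattern that matches nothing still gives exit status 0, but empty output.
find "$tmp" -name 'no_such_file*' > /dev/null
find_status=$?
if [ -n "$(find "$tmp" -name 'no_such_file*')" ]; then
    missing_found=yes
else
    missing_found=no
fi

echo "text=$text_found file=$file_found missing=$missing_found find_status=$find_status"
rm -rf "$tmp"
```

The quoted command substitution is what keeps the two matching file names from being split into separate words inside [ ].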
Now let's see your requirements.
Check if some file matching a pattern exists
If you're looking for files in the current directory or in any directory below it recursively, then find -name "PATTERN" is right. However, if the directory tree can get large, it's inefficient, because it can spend a lot of time printing all the matches when we only care about one. An easy optimization is to only retain the first line by piping into head -n 1; find will stop searching once it realizes that head is no longer interested in what it has to say.
if [ "$(find -name "jobscript_minim*cmd*o*" | head -n 1)" != "" ];
(Note that the double quotes already protect the wildcards from expansion.)
If you're only looking for files in the current directory, assuming you have GNU find (which is the case on Linux, Cygwin and Gnuwin32), a simple solution is to tell it not to recurse deeper than the current directory.
if [ "$(find -maxdepth 1 -name "jobscript_minim*cmd*o*")" != "" ];
There are other solutions that are more portable, but they're more complicated to write.
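One of those more portable approaches, sketched here as an assumption rather than as the answerer's own method, is to let the shell expand the glob and test whether the first resulting word exists. The helper name any_match and the file names are hypothetical.

```shell
# any_match succeeds if its first argument names an existing file.
# When it is called with an unquoted glob, the shell expands the glob
# first, so "$1" is the first match, or the literal pattern if nothing
# matched (the default behaviour in sh and bash).
any_match() {
    [ -e "$1" ] || [ -L "$1" ]
}

tmp=$(mktemp -d)
touch "$tmp/jobscript_minim1.cmd.o10" "$tmp/jobscript_minim2.cmd.o11"

# The subshell keeps the cd from affecting the rest of the script.
if (cd "$tmp" && any_match jobscript_minim*cmd*o*); then
    with_files=yes
else
    with_files=no
fi

if (cd "$tmp" && any_match nothing_here*); then
    without_files=yes
else
    without_files=no
fi

echo "with_files=$with_files without_files=$without_files"
rm -rf "$tmp"
```

Because only "$1" is inspected, it does not matter how many files match the pattern, which sidesteps the multiple-matches problem raised in the question.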
Check that text in a different file exists.
You've already got a correct grep command. Note that if you want to search for a literal string, you should use grep -F; if you're looking for a regexp, grep -E has a saner syntax than plain grep.
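To illustrate the difference that -F makes, here is a small sketch with made-up log lines: as a regular expression the dot matches any character, while with -F it matches only a literal dot.

```shell
# Two made-up log lines that differ only in the character after "1".
sample=$(printf '%s\n' 'step 1.0 done' 'step 1x0 done')

# As a regular expression, "." matches any character, so both lines match.
regex_hits=$(printf '%s\n' "$sample" | grep -c 'step 1.0')

# With -F the pattern is a fixed string, so only the literal "1.0" matches.
fixed_hits=$(printf '%s\n' "$sample" | grep -c -F 'step 1.0')

echo "regex_hits=$regex_hits fixed_hits=$fixed_hits"
```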
Putting it all together:
if grep -q -F "OUTPUT FILE AT STEP 1000" ../log/minimize.log &&
[ "$(find -name "jobscript_minim*cmd*o*")" != "" ]; then
echo "Both passed! (1)"
fi

bash 4:
shopt -s globstar nullglob
files=(**/jobscript_minim*cmd*o*)
if grep -q "pattern" file && (( ${#files[@]} > 0 )); then echo "passed"; fi

for i in filename*; do FOUND=$i; break; done
if [ "$FOUND" = 'filename*' ]; then
echo "No files found matching wildcard."
else
echo "Files found matching wildcard."
fi

Related

Constructing an If statement based on the return of the environment $PATH piped through grep (with bash)

I've answered my own question in writing this, but it might be helpful for others as I couldn't find a straightforward answer anywhere else. Please delete if inappropriate.
I'm trying to construct an if statement depending whether some <STRING> is found inside the environment $PATH.
When I pipe $PATH through grep I get a successful hit:
echo $PATH | grep -i "<STRING>"
But I was really struggling to find the syntax required to construct an if statement around this. It appears that the line below works. I know that the $(...) essentially passes the internal commands to the if statement, but I'm not sure why the [[...]] double brackets are needed:
if [[ $(echo $PATH | grep -i "<STRING>") ]]; then echo "HEY"; fi
Maybe someone could explain that for me to have a better understanding.
Thanks.
You could make better use of shell syntax. Something like this:
$ STRING="bin"
$ grep -i "$STRING" <<< "$PATH" && echo "HEY"
That is: first, save the search string in a variable (which I called STRING so it's easy to remember), and use that as the search pattern. Then, use the <<< redirection to input a "here string" - namely, the PATH variable.
Or, if you don't want a variable for the string:
$ grep -i "bin" <<< "$PATH" && echo "HEY"
Then, the construct && <some other command> means: IF the exit status of grep is 0 (meaning at least one successful match), THEN execute the "other command" (otherwise do nothing - just exit as soon as grep completes). This is the more common, more natural form of an "if... then..." statement, exactly of the kind you were trying to write.
Question for you though: why the -i flag? That means "case-insensitive matching", but in Unix and Linux file names, command names, etc. are case sensitive. Why do you care if the PATH matches the string BIN? It will, because bin is somewhere on the path, but if you then search for the BIN directory you won't find one. (The more interesting question is how to match complete words only, so that, for example, bin matches only when a directory named bin is present; sbin shouldn't match bin. But that's about writing regular expressions, not quite what you were asking about.)
The following version - which doesn't even use grep - is based on the same idea, but it won't do case insensitive matching:
$ [[ $PATH == *$STRING* ]] && echo "HEY"
[[ ... ]] evaluates a Boolean expression (here, an equality using the * wildcard on the right-hand side); if true, && causes the execution of the echo command.
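For the whole-entry matching mentioned above, one common idiom (not part of the original answer) is to wrap the PATH value in colons and look for ":DIR:" with a case pattern. The helper name path_contains and the sample value are hypothetical.

```shell
# Usage: path_contains DIR [PATH_STRING]
# Succeeds only if DIR is a complete entry of the colon-separated list.
path_contains() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

sample="/usr/local/sbin:/usr/sbin:/opt/tools/bin"   # made-up PATH value

if path_contains /usr/sbin "$sample"; then whole_entry=yes; else whole_entry=no; fi
if path_contains /usr/bin "$sample"; then other_dir=yes; else other_dir=no; fi
if path_contains bin "$sample"; then fragment=yes; else fragment=no; fi

echo "whole_entry=$whole_entry other_dir=$other_dir fragment=$fragment"
```

The quoted "$1" inside the pattern is matched literally, so wildcard characters in the directory name are not treated as a pattern.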
You don't need to use [[ ]]; just test the exit status of grep directly:
if echo $PATH | grep -qi "<STRING>"; then echo "HEY"; fi

MacOS shell script to move files based on tag

I am trying to write a shell script so that I can move school files from one destination to another based on the input. I download these files from a source like canvas and want to move them from my downloads based on the tag I assign, to the path for my course folder which is nested pretty deep thanks to how I stay organized. Unfortunately, since I store these files in my OneDrive school account, I am unable to eliminate some spacing issues but I believe I have accounted for these. Right now the script is the following:
if [ "$1" = "311" ];
then
course="'/path/to/311/folder/$2'"
elif [ "$1" = "411" ];
then
course="'/path/to/411/folder/$2'"
elif [ "$1" = "516" ];
then
course="'/path/to/516/folder/$2'"
elif [ "$1" = "530" ];
then
course="'/path/to/530/folder/$2'"
elif [ "$1" = "599" ];
then
course="'/path/to/599/folder/$2'"
fi
files=$(mdfind 'kMDItemUserTags='$1'' -onlyin /Users/user/Downloads)
#declare -a files=$(mdfind 'kMDItemUserTags='$1'' -onlyin /Users/user/Downloads)
#mv $files $course
#echo "mv $files $course"
#echo $course
for file in $files
#for file in "${files[@]}"
do
#echo $file
#echo $course
mv $file $course
done
Where $1 is the tag ID and first part of path selection, and $2 is what week number folder I want to move it to. The single quotation marks are there to take care of the spacing in the filepath. I could very easily do this in python but I'm trying to expand my capabilities some. Every time I run this script I get the following message:
usage: mv [-f | -i | -n] [-v] source target
mv [-f | -i | -n] [-v] source ... directory
I initially tried to just move them all at once (per the first mv command that's commented out) and got this error, then tried the for loop, and array but get the same error each time. However, when I uncomment the echo statements in the for loop and manually try to move each one by copying and pasting the paths to the command line, it works perfectly. My best guess is something to do with the formatting of the variable "files", since
echo "mv $files $course"
indicates the presence of a newline character or separator between each file it saves.
I'm sure it's something super simple that I'm missing since I just started trying to pick up shell scripting last week, but nothing I have been able to find online has helped me resolve this. Any help would be greatly appreciated. Thanks
You can replace the files variable assignment and the for loop with a single command, which makes the script:
if [ "$1" = "311" ];
then
course="'/path/to/311/folder/$2'"
elif [ "$1" = "411" ];
then
course="'/path/to/411/folder/$2'"
elif [ "$1" = "516" ];
then
course="'/path/to/516/folder/$2'"
elif [ "$1" = "530" ];
then
course="'/path/to/530/folder/$2'"
elif [ "$1" = "599" ];
then
course="'/path/to/599/folder/$2'"
fi
mv -t $course $(mdfind 'kMDItemUserTags='$1'' -onlyin /Users/user/Downloads | sed ':a;N;$!ba;s/\n/ /g')
The sed ':a;N;$!ba;s/\n/ /g' command replaces the newline characters with spaces, and the -t option makes mv take the destination as its first argument. Note that -t is a GNU mv option; the mv that ships with macOS does not accept it (see the usage message above), so this requires GNU coreutils.
You're getting rather confused about how quoting works in the shell. First rule: quotes go around data, not in data. For example, you use:
course="'/path/to/311/folder/$2'"
...
mv $file $course
When you set course this way, the double-quotes are treated as shell syntax (i.e. they change how what's between them is parsed), but the single-quotes are stored as part of the variable's value, and will thereafter be treated as data. When you use this variable in the mv command, it's actually looking for a directory literally named single-quote, and under that a directory named "path", etc. Instead, just put the appropriate quotes for how you want it parsed at that point, and then double-quotes around the variable when you use it (to prevent probably-unwanted word splitting and wildcard expansion). Like this:
course="/path/to/311/folder/$2"
...
mv "$file" "$course" # This needs more work -- see below
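A small sketch of the rule "quotes go around data, not in data". The helper count_args and the path are made up for illustration; the helper only reports how many arguments it received.

```shell
# count_args is a hypothetical helper that prints its argument count.
count_args() { echo "$#"; }

dir_with_space="/tmp/My Course/week 1"   # made-up path containing spaces
quoted_in_data="'$dir_with_space'"       # the single quotes become part of the value

# Unquoted expansion: the value is split on whitespace, and the stored
# single quotes are ordinary characters, so they protect nothing.
n_unquoted=$(count_args $quoted_in_data)

# Quoted expansion of the plain value: exactly one argument.
n_quoted=$(count_args "$dir_with_space")

echo "embedded quotes, unquoted expansion: $n_unquoted arguments"
echo "plain value, quoted expansion: $n_quoted argument"
```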
Also, where you have:
mdfind 'kMDItemUserTags='$1'' -onlyin /Users/user/Downloads
that doesn't really make any sense. You've got a single-quoted section, 'kMDItemUserTags=' where the quotes have no effect at all (single-quotes suppress all special meanings that characters have, like $ introducing variable substitution, but there aren't any characters there with special meanings, so no reason for the quotes), followed by $1 without double-quotes around it, meaning that some special characters (whitespace and wildcards) in its value will get special parsing (which you probably don't want), followed by a zero-length single-quoted string, '', which parses out to exactly nothing. You want the $1 part in double-quotes; some people also include the rest of the string in the double-quoted section, which has no effect at all. In fact, other than the $1 part (and the spaces between parameters), you can quote or not however you want. Thus, any of these would work equivalently:
mdfind kMDItemUserTags="$1" -onlyin /Users/user/Downloads
mdfind "kMDItemUserTags=$1" -onlyin /Users/user/Downloads
mdfind "kMDItemUserTags=$1" '-onlyin' '/Users/user/Downloads'
mdfind 'kMDItemUserTags'="$1" '-'"only"'in' /'Users'/'user'/'Down'loads
...etc
Ok, next problem: parsing the output from mdfind from a series of characters into separate filepaths. This is actually tricky. If you put double-quotes around the resulting string, it'll get treated as one long filepath that happens to contain some newlines in it (which is totally legal, but not what you want). If you don't double-quote it, it'll be split into separate filepaths based on whitespace (not just newlines, but also spaces and tabs -- and spaces are common within macOS filenames), and anything that looks like a wildcard will get expanded to a list of matching filenames. This tends to cause chaos.
The solution: there's one character that cannot occur in a filepath, the ASCII NUL (character code 0), and mdfind -0 will output its list delimited with NUL characters. You can't put the result in a shell variable (they can't hold NULs either), but you can pass it through a pipe to, say, xargs -0, which will (thanks to the -0 option) parse the NULs as delimiters, and build commands out of the results. There is one slightly tricky thing: you want xargs to put the filepaths it gets in the middle of the argument list to mv, not at the end like it usually does. The -J option lets you tell it where to add arguments. I'll also suggest two safety measures: the -p option to xargs makes it ask before actually executing the command (use this at least until you're sure it's doing the right thing), and the -n option to mv, which tells it not to overwrite existing files if there's a naming conflict. The result is something like this:
mdfind -0 kMDItemUserTags="$1" -onlyin /Users/user/Downloads | xargs -0 -p -J% mv -n % "$course"
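The same NUL-delimited pipeline can be tried without mdfind (which exists only on macOS). In this sketch, find -print0 stands in for mdfind -0, and a small sh -c wrapper stands in for the BSD-only -J option; both substitutions are assumptions made for illustration, as are the directory and file names.

```shell
# Made-up source and destination directories with spaces in file names.
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/lecture notes.pdf" "$src/week 1 slides.pdf"

# find -print0 emits NUL-terminated paths, like mdfind -0.
# xargs -0 splits on NUL only, so spaces in the names are harmless.
# Inside sh -c, "$0" is the destination and "$@" are the file paths,
# which places the paths before the destination as -J% would.
find "$src" -type f -print0 |
    xargs -0 sh -c 'mv -n -- "$@" "$0"' "$dest"

moved=$(ls "$dest" | wc -l | tr -d ' ')
left=$(ls "$src" | wc -l | tr -d ' ')
echo "moved=$moved left=$left"
rm -rf "$src" "$dest"
```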
You are right to worry about filenames that contain whitespace.
However, the real problem is that you are not quoting the filename in the mv command. Take a look at the simple example below:
filename="with space.txt"
=> assign a filename containing a space to a variable
touch "$filename"
=> create a file "with space.txt"
str="'$filename'"
=> wrap with single quotes (as you do)
echo $str
=> yields 'with space.txt' and may look good, which is a pitfall
mv $str "newname.txt"
=> causes an error
The mv command above fails because, after word splitting, it is invoked with
three arguments: 'with, space.txt' and newname.txt. The single quotes stored in
the variable are literal characters, not shell quoting, so pre-quoting the value is meaningless.
Instead, please try something like:
if [ "$1" = "311" ]; then
course="/path/to/311/folder/$2"
elif [ "$1" = "411" ]; then
course="/path/to/411/folder/$2"
elif [ "$1" = "516" ]; then
course="/path/to/516/folder/$2"
elif [ "$1" = "530" ]; then
course="/path/to/530/folder/$2"
elif [ "$1" = "599" ]; then
course="/path/to/599/folder/$2"
else
# illegal value in $1: report it and stop (an else branch may not be empty)
echo "Unknown course tag: $1" >&2
exit 1
fi
# the lines above may be simplified if /path/to/*folder/ have some regularity
mdfind "kMDItemUserTags=$1" -onlyin /Users/user/Downloads | while IFS= read -r file; do
mv "$file" "$course"
done
# the syntax above works as long as the filenames do not contain newline characters
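If the course folders do follow a regular layout, the if/elif chain might be collapsed into a case statement. This is only a sketch: the helper name course_dir is hypothetical, and the /path/to/... layout simply mirrors the placeholder paths used above.

```shell
# Usage: course_dir TAG WEEK
# Prints the destination directory, or fails for an unknown tag.
course_dir() {
    case "$1" in
        311|411|516|530|599)
            printf '%s\n' "/path/to/$1/folder/$2" ;;
        *)
            echo "unknown course tag: $1" >&2
            return 1 ;;
    esac
}

if course=$(course_dir 411 "week 3"); then
    known=$course
else
    known=""
fi

if course=$(course_dir 999 "week 3" 2>/dev/null); then
    unknown_ok=yes
else
    unknown_ok=no
fi

echo "known=$known unknown_ok=$unknown_ok"
```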

How to call a command over every .json file in a directory with different file extensions? [duplicate]

for i in $(ls);do
if [ $i = '*.java' ];then
echo "I do something with the file $i"
fi
done
I want to loop through each file in the current folder and check if it matches a specific extension. The code above doesn't work, do you know why?
No fancy tricks needed:
for i in *.java; do
[ -f "$i" ] || break
...
done
The guard ensures that if there are no matching files, the loop will exit without trying to process a non-existent file name *.java.
In bash (or shells supporting something similar), you can use the nullglob option
to simply ignore a failed match and not enter the body of the loop.
shopt -s nullglob
for i in *.java; do
...
done
Some more detail on the break-vs-continue discussion in the comments. I consider it somewhat out of scope whether you use break or continue, because what the first loop is trying to do is distinguish between two cases:
1. *.java had no matches, and so is treated as literal text.
2. *.java had at least one match, and that match might have included an entry named *.java.
In case #1, break is fine, because there are no other values of $i forthcoming, and break and continue would be equivalent (though I find break more explicit; you're exiting the loop, not just waiting for the loop to exit passively).
In case #2, you still have to do whatever filtering is necessary on any possible matches. As such, the choice of break or continue is less relevant than which test (-f, -d, -e, etc) you apply to $i, which IMO is the wrong way to determine if you entered the loop "incorrectly" in the first place.
That is, I don't want to be in the position of examining the value of $i at all in case #1, and in case #2 what you do with the value has more to do with your business logic for each file, rather than the logic of selecting files to process in the first place. I would prefer to leave that logic to the individual user, rather than express one choice or the other in the question.
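To make the two cases concrete, here is a small sketch that counts regular files ending in .java. It uses continue rather than break so that a non-file match (here a directory named odd.java) is skipped without ending the loop; that choice, the helper name count_java, and the file names are assumptions for illustration.

```shell
# count_java DIR: print the number of regular files matching DIR/*.java.
# The body runs in a subshell, so n and f do not leak to the caller.
count_java() (
    n=0
    for f in "$1"/*.java; do
        # Case 1: no match, so "$f" is the literal pattern and -f fails.
        # Case 2: a match that is not a regular file is skipped.
        [ -f "$f" ] || continue
        n=$((n + 1))
    done
    echo "$n"
)

tmp=$(mktemp -d)
mkdir "$tmp/src" "$tmp/empty" "$tmp/src/odd.java"   # odd.java is a directory
touch "$tmp/src/A.java" "$tmp/src/B.java" "$tmp/src/notes.txt"

in_src=$(count_java "$tmp/src")
in_empty=$(count_java "$tmp/empty")
echo "src=$in_src empty=$in_empty"
rm -rf "$tmp"
```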
As an aside, zsh provides a way to do this kind of filtering in the glob itself. You can match only regular files ending with .java (and disable the default behavior of treating unmatched patterns as an error, rather than as literal text) with
for f in *.java(.N); do
...
done
With the above, you are guaranteed that if you reach the body of the loop, then $f expands to the name of a regular file. The . makes *.java match only regular files, and the N causes a failed match to expand to nothing instead of producing an error.
There are also other such glob qualifiers for doing all sorts of filtering on filename expansions. (I like to joke that zsh's glob expansion replaces the need to use find at all.)
Recursively add subfolders,
for i in `find . -name "*.java" -type f`; do
echo "$i"
done
Loop through all files ending with: .img, .bin, .txt suffix, and print the file name:
for i in *.img *.bin *.txt;
do
echo "$i"
done
Or in a recursive manner (find also in all subdirectories):
for i in `find . -type f \( -name "*.img" -o -name "*.bin" -o -name "*.txt" \)`;
do
echo "$i"
done
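Looping over the output of find in backticks splits file names on whitespace. A possible alternative, sketched here with made-up file names, is to let find hand the names to a small inline script via -exec ... {} +, which keeps each name as one argument.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/sub dir"
touch "$tmp/disk one.img" "$tmp/sub dir/boot.bin" \
      "$tmp/sub dir/readme.txt" "$tmp/skip.log"

# The parentheses group the -name tests so that -type f applies to all of them.
# -exec ... {} + passes the matches as separate arguments to the inline script,
# which prints each base name on its own line.
found=$(
    find "$tmp" -type f \( -name '*.img' -o -name '*.bin' -o -name '*.txt' \) \
        -exec sh -c 'for f in "$@"; do printf "%s\n" "${f##*/}"; done' sh {} + |
        sort
)

printf '%s\n' "$found"
count=$(printf '%s\n' "$found" | wc -l | tr -d ' ')
case "$found" in
    *"disk one.img"*) spaced=yes ;;
    *)                spaced=no ;;
esac
rm -rf "$tmp"
```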
The correct answer is @chepner's:
EXT=java
for i in *.${EXT}; do
...
done
However, here's a small trick to check whether a filename has a given extension:
EXT=java
for i in *; do
if [ "${i}" != "${i%.${EXT}}" ];then
echo "I do something with the file $i"
fi
done
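The suffix-stripping trick can be wrapped in a function and tried on plain strings. The helper name has_ext and the sample names are hypothetical.

```shell
# has_ext NAME EXT: succeeds if NAME ends in ".EXT".
# ${1%."$2"} removes a trailing ".EXT" if present; if the result differs
# from the original, the suffix was there.
has_ext() {
    [ "$1" != "${1%."$2"}" ]
}

matches=""
for name in Main.java notes.txt java archive.java.bak; do
    if has_ext "$name" java; then
        matches="$matches$name "
    fi
done

echo "names ending in .java: $matches"
```

Quoting "$2" inside the expansion makes the extension literal, so characters such as * in it are not treated as a pattern.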
As @chepner says in his comment, you are comparing $i to a fixed string.
To rectify this, you can use [[ ]] with the regex operator =~,
e.g.:
for i in $(ls);do
if [[ $i =~ .*\.java$ ]];then
echo "I want to do something with the file $i"
fi
done
The regex to the right of =~ is tested against the value of the left-hand operand and should not be quoted (a quoted regex will not cause an error, but it will be compared as a fixed string and so will most likely fail to match).
But @chepner's answer above, using a glob, is a much more efficient mechanism.
I agree with the other answers regarding the correct way to loop through the files. However the OP asked:
The code above doesn't work, do you know why?
Yes!
An excellent article, "What is the difference between test, [ and [[ ?", explains in detail that, among other differences, you cannot use regular-expression matching or pattern matching within the test command (for which [ is a synonym).
Feature                       new test [[   old test [        Example
Pattern matching              = (or ==)     (not available)   [[ $name = a* ]] || echo "name does not start with an 'a': $name"
Regular expression matching   =~            (not available)   [[ $(date) =~ ^Fri\ ...\ 13 ]] && echo "It's Friday the 13th!"
So this is the reason your script fails. If the OP is interested in an answer with the [[ syntax (which has the disadvantage of not being supported on as many platforms as the [ command), I would be happy to edit my answer to include it.
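For shells without [[ ]], a case statement offers pattern matching portably. The sketch below, with a made-up file name, contrasts it with the literal comparison that [ performs.

```shell
name="Main.java"   # made-up file name

# [ ... = ... ] compares strings literally: the name is not the
# seven-character string "*.java", so this test fails.
if [ "$name" = '*.java' ]; then
    bracket_result=match
else
    bracket_result=no-match
fi

# case treats *.java as a pattern, so the same name matches here.
case "$name" in
    *.java) case_result=match ;;
    *)      case_result=no-match ;;
esac

echo "[ ]: $bracket_result, case: $case_result"
```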
I found this solution to be quite handy. It uses the -or option in find:
find . -name \*.tex -or -name "*.png" -or -name "*.pdf"
It will find the files with extension tex, png, and pdf.

How to prevent code/option injection in a bash script

I have written a small bash script called "isinFile.sh" for checking if the first term given to the script can be found in the file "file.txt":
#!/bin/bash
FILE="file.txt"
if [ `grep -w "$1" $FILE` ]; then
echo "true"
else
echo "false"
fi
However, running the script like
> ./isinFile.sh -x
breaks the script, since -x is interpreted by grep as an option.
So I improved my script
#!/bin/bash
FILE="file.txt"
if [ `grep -w -- "$1" $FILE` ]; then
echo "true"
else
echo "false"
fi
using -- as an argument to grep. Now running
> ./isinFile.sh -x
false
works. But is using -- the correct and only way to prevent code/option injection in bash scripts? I have not seen it in the wild, only found it mentioned in ABASH: Finding Bugs in Bash Scripts.
grep -w -- ...
prevents anything that follows -- from being interpreted as an option.
EDIT
(I did not read the last part, sorry.) Yes, -- is the standard way. The alternative is to make sure the pattern does not start with a hyphen; e.g. the extended regular expression ".{0}-x" works too, but it is odd. So, e.g.,
grep -E -w ".{0}$1" ...
should work too.
There's actually another code injection (or whatever you want to call it) bug in this script: it simply hands the output of grep to the [ (aka test) command, and assumes that'll return true if it's not empty. But if the output is more than one "word" long, [ will treat it as an expression and try to evaluate it. For example, suppose the file contains the line 0 -eq 2 and you search for "0" -- [ will decide that 0 is not equal to 2, and the script will print false despite the fact that it found a match.
The best way to fix this is to use Ignacio Vazquez-Abrams' suggestion (as clarified by Dennis Williamson) -- this completely avoids the parsing problem, and is also faster (since -q makes grep stop searching at the first match). If that option weren't available, another method would be to protect the output with double-quotes: if [ "$(grep -w -- "$1" "$FILE")" ]; then (note that I also used $() instead of backquotes 'cause I find them much easier to read, and quotes around $FILE just in case it contains anything funny, like whitespace).
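Since the suggestion referred to above is not reproduced here, the following is only a sketch of what a grep -q based check could look like, with -- guarding against option injection. The helper name is_in_file and the file contents are made up; the file deliberately contains the troublesome line 0 -eq 2.

```shell
# is_in_file TERM FILE: succeeds if TERM occurs as a whole word in FILE.
# -q suppresses output, so only the exit status is used.
# -- ends option parsing, so a TERM such as -x is treated as a pattern.
is_in_file() {
    grep -q -w -- "$1" "$2"
}

tmp=$(mktemp)
printf '%s\n' '0 -eq 2' 'some -x flag' > "$tmp"

results=""
for term in -x 0 missing; do
    if is_in_file "$term" "$tmp"; then
        result=true
    else
        result=false
    fi
    printf '%s: %s\n' "$term" "$result"
    results="$results$term=$result "
done
rm -f "$tmp"
```

Because nothing from the file is handed to [, the line 0 -eq 2 cannot be misread as an expression.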
Though not applicable in this particular case, another technique can be used to prevent filenames that start with hyphens from being interpreted as options:
rm ./-x
or
rm /path/to/-x

Resources