Test file existence with exit status from gnu-find - bash

When test -e file is not flexible enough, I tend to use the following Bash idiom to check the existence of a file:
if [ -n "$(find ${FIND_ARGS} -print -quit)" ] ; then
echo "pass"
else
echo "fail"
fi
But since I am only interested in a boolean value, are there any ${FIND_ARGS} that will let me do instead:
if find ${FIND_ARGS} ; ...

I'd say no. man find...
find exits with status 0 if all files are processed successfully, greater than 0 if errors occur. This is deliberately a very broad description, but if the return value is non-zero, you should not rely on the correctness of the results of find.
Testing for output is probably fine for find. That isn't a "Bash idiom". If that's not good enough and you have Bash available, then you can use extglobs and possibly globstar for file-matching tests with [[. find should only be used for complex recursive file matching, actual searching for files, and other things that can't easily be done with Bash features.
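For instance, a minimal sketch of such a pure-Bash test (assuming bash with the nullglob and globstar options available; the **/*.log pattern is purely illustrative, standing in for whatever ${FIND_ARGS} would have matched):
shopt -s nullglob globstar
matches=( **/*.log )    # hypothetical pattern; nullglob makes this empty on no match
if (( ${#matches[@]} > 0 )); then
    echo "pass"
else
    echo "fail"
fi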

Related

How to call a command over every .json file in a directory with different file extensions? [duplicate]

for i in $(ls);do
if [ $i = '*.java' ];then
echo "I do something with the file $i"
fi
done
I want to loop through each file in the current folder and check if it matches a specific extension. The code above doesn't work, do you know why?
No fancy tricks needed:
for i in *.java; do
[ -f "$i" ] || break
...
done
The guard ensures that if there are no matching files, the loop will exit without trying to process a non-existent file name *.java.
In bash (or shells supporting something similar), you can use the nullglob option
to simply ignore a failed match and not enter the body of the loop.
shopt -s nullglob
for i in *.java; do
...
done
Some more detail on the break-vs-continue discussion in the comments. I consider it somewhat out of scope whether you use break or continue, because what the first loop is trying to do is distinguish between two cases:
*.java had no matches, and so is treated as literal text.
*.java had at least one match, and that match might have included an entry named *.java.
In case #1, break is fine, because there are no other values of $i forthcoming, and break and continue would be equivalent (though I find break more explicit; you're exiting the loop, not just waiting for the loop to exit passively).
In case #2, you still have to do whatever filtering is necessary on any possible matches. As such, the choice of break or continue is less relevant than which test (-f, -d, -e, etc) you apply to $i, which IMO is the wrong way to determine if you entered the loop "incorrectly" in the first place.
That is, I don't want to be in the position of examining the value of $i at all in case #1, and in case #2 what you do with the value has more to do with your business logic for each file, rather than the logic of selecting files to process in the first place. I would prefer to leave that logic to the individual user, rather than express one choice or the other in the question.
As an aside, zsh provides a way to do this kind of filtering in the glob itself. You can match only regular files ending with .java (and disable the default behavior of treating unmatched patterns as an error, rather than as literal text) with
for f in *.java(.N); do
...
done
With the above, you are guaranteed that if you reach the body of the loop, then $f expands to the name of a regular file. The . makes *.java match only regular files, and the N causes a failed match to expand to nothing instead of producing an error.
There are also other such glob qualifiers for doing all sorts of filtering on filename expansions. (I like to joke that zsh's glob expansion replaces the need to use find at all.)
To recurse into subfolders, use find, reading its output with a while loop so that filenames containing spaces survive word splitting:
find . -name "*.java" -type f | while IFS= read -r i; do
    echo "$i"
done
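An even more robust sketch (assuming GNU find for -print0 and bash for process substitution) that also survives newlines in filenames:
while IFS= read -r -d '' i; do
    echo "$i"
done < <(find . -name "*.java" -type f -print0)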
Loop through all files ending with: .img, .bin, .txt suffix, and print the file name:
for i in *.img *.bin *.txt;
do
echo "$i"
done
Or in a recursive manner (find also in all subdirectories); note the \( \) grouping, without which -type f would apply only to the first -name test:
find . -type f \( -name "*.img" -o -name "*.bin" -o -name "*.txt" \) | while IFS= read -r i; do
    echo "$i"
done
The correct answer is @chepner's:
EXT=java
for i in *.${EXT}; do
...
done
However, here's a small trick to check whether a filename has a given extension:
EXT=java
for i in *; do
if [ "${i}" != "${i%.${EXT}}" ];then
echo "I do something with the file $i"
fi
done
As @chepner says in his comment, you are comparing $i to a fixed string.
To expand and rectify the situation you should use [[ ]] with the regex operator =~
e.g. (also iterating over * rather than $(ls), which mangles names containing whitespace):
for i in *; do
    if [[ $i =~ .*\.java$ ]]; then
        echo "I want to do something with the file $i"
    fi
done
The regex to the right of =~ is matched against the value of the left-hand operand and should not be quoted (quoting will not raise an error, but the pattern will then be compared as a fixed string and so will most likely fail).
But @chepner's answer above, using a glob, is a much more efficient mechanism.
I agree with the other answers regarding the correct way to loop through the files. However, the OP asked:
The code above doesn't work, do you know why?
Yes!
An excellent article, What is the difference between test, [ and [[ ?, explains in detail that, among other differences, you cannot use pattern matching or regular-expression matching within the test command ([ is a synonym for test).
Feature                       new test [[       old test [        Example
Pattern matching              = (or ==)         (not available)   [[ $name = a* ]] || echo "name does not start with an 'a': $name"
Regular expression matching   =~                (not available)   [[ $(date) =~ ^Fri\ ...\ 13 ]] && echo "It's Friday the 13th!"
So this is the reason your script fails. If the OP is interested in an answer with the [[ syntax (which has the disadvantage of not being supported on as many platforms as the [ command), I would be happy to edit my answer to include it.
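For reference, a minimal sketch of that [[ pattern-matching form (assuming bash; the unquoted right-hand side is treated as a glob pattern, not a regex):
for i in *; do
    if [[ $i == *.java ]]; then
        echo "I do something with the file $i"
    fi
done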
I found this solution to be quite handy. It uses the -or option in find:
find . -name \*.tex -or -name "*.png" -or -name "*.pdf"
It will find the files with extension tex, png, and pdf.

Bash command to see if any files in dir - test if a directory is empty [duplicate]

I have the following bash script:
if ls /Users/david/Desktop/empty > /dev/null
then
echo 'yes -- files'
else
echo 'no -- files'
fi
How would I modify the top line such that it evaluates true if there are one or more files in the /Users/david/Desktop/empty dir?
This is covered in detail in BashFAQ #004. Notably, use of ls for this purpose is an antipattern and should be avoided.
shopt -s dotglob # if including hidden files is desired
files=( "$dir"/* )
[[ -e $files || -L $files ]] && echo "Directory is not empty"
[[ -e $files ]] doesn't actually check if the entire array's contents exist; rather, it checks the first name returned -- which handles the case when no files match, wherein the glob expression itself is returned as the sole result.
Notably:
This is far faster than invoking ls, which requires using fork() to spawn a subshell, execve() to replace that subshell with /bin/ls, the operating system's dynamic linker to load shared libraries used by the ls binary, etc, etc. [An exception to this is extremely large directories, of tens of thousands of files -- a case in which ls will also be slow; see the find-based solution below for those].
This is more correct than invoking ls: the list of files returned by globbing is guaranteed to exactly match the literal names of files, whereas ls can munge names with hidden characters. If the first entry is a valid filename, "${files[@]}" can be safely iterated over with assurance that each returned value will be a name, and there's no need to worry about filenames containing literal newlines inflating the count if the local ls implementation does not escape them.
That said, an alternative approach is to use find, if you have one with the -empty extension (available both from GNU find and from modern BSDs including Mac OS):
[[ $(find -H "$dir" -maxdepth 0 -type d -empty) ]] || echo "Directory is not empty"
...if any result is given, the directory is nonempty. While slower than globbing on directories which are not unusually large, this is faster than either ls or globbing for extremely large directories not present in the direntry cache, as it can return results without a full scan.
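On find implementations lacking -empty, a sketch of the same early-exit idea (assumes GNU find for -mindepth, -maxdepth and -quit): print the first entry inside the directory, if any, and stop immediately.
[[ $(find "$dir" -mindepth 1 -maxdepth 1 -print -quit) ]] && echo "Directory is not empty"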
Robust pure Bash solutions:
For background on why a pure Bash solution with globbing is superior to using ls, see Charles Duffy's helpful answer, which also contains a find-based alternative, which is much faster and less memory-intensive with large directories.[1]
Also consider anubhava's equally fast and memory-efficient stat-based answer, which, however, requires distinct syntax forms on Linux and BSD/OSX.
Updated to a simpler solution, gratefully adapted from this answer.
# EXCLUDING hidden files and folders - note the *quoted* use of glob '*'
if compgen -G '*' >/dev/null; then
echo 'not empty'
else
echo 'empty, but may have hidden files/dirs.'
fi
compgen -G is normally used for tab completion, but it is useful in this case as well:
Note that compgen -G does its own globbing, so you must pass it the glob (filename pattern) in quotes for it to output all matches. In this particular case, even passing an unquoted pattern up front would work, but the difference is worth noting.
If nothing matches, compgen -G always produces no output (irrespective of the state of the nullglob option), and it indicates via its exit code whether at least 1 match was found, which is what the conditional takes advantage of (while suppressing any stdout output with >/dev/null).
# INCLUDING hidden files and folders - note the *unquoted* use of glob *
if (shopt -s dotglob; compgen -G * >/dev/null); then
echo 'not empty'
else
echo 'completely empty'
fi
compgen -G never matches hidden items (irrespective of the state of the dotglob option), so a workaround is needed to find hidden items too:
(...) creates a subshell for the conditional; that is, the commands executed in the subshell don't affect the current shell's environment, which allows us to set the dotglob option in a localized way.
shopt -s dotglob causes * to match hidden items too (except for . and ..).
compgen -G * with unquoted *, thanks to up-front expansion by the shell, is either passed at least one filename, whether hidden or not (additional filenames are ignored), or, if the directory is truly empty, the literal pattern * itself (since nullglob is unset), which compgen -G then fails to glob-match. In the former case the exit code is 0 (signaling success and therefore a nonempty directory), in the latter 1 (signaling a truly empty directory).
[1]
This answer originally falsely claimed to offer a Bash-only solution that is efficient with large directories, based on the following approach: (shopt -s nullglob dotglob; for f in "$dir"/*; do exit 0; done; exit 1).
This is NOT more efficient, because, internally, Bash still collects all matches in an array first before entering the loop - in other words: for * is not evaluated lazily.
Here is a solution based on the stat command, which can report the number of hard links when run against a directory (or a link to a directory). On traditional filesystems a directory's link count starts at 2 (its own . entry plus its entry in the parent) and grows by one for each immediate subdirectory, whose .. entry links back to it. Subtracting 2 from this number therefore gives the number of subdirectories; note that plain files and symlinks do not add to the count, so this check detects subdirectories rather than entries of every kind.
So putting it all together:
(( ($(stat -Lc '%h' "$dir") - 2) > 0)) && echo 'not empty' || echo 'empty'
As per man stat, the options used are:
%h    number of hard links
-L, --dereference    follow links
EDIT: To make it BSD/OSX compatible use:
(( ($(stat -Lf '%l' "$dir") - 2) > 0)) && echo 'not empty' || echo 'empty'
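Where one script must serve both platforms, a sketch that probes which syntax the local stat accepts (assumes either GNU coreutils stat or BSD stat is installed):
if stat -c '%h' . >/dev/null 2>&1; then
    links=$(stat -Lc '%h' "$dir")    # GNU stat
else
    links=$(stat -Lf '%l' "$dir")    # BSD/OSX stat
fi
(( (links - 2) > 0 )) && echo 'not empty' || echo 'empty'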

Bash conditional on command exit code

In bash, I want to say "if a file doesn't contain XYZ, then" do a bunch of things. The most natural way to transpose this into code is something like:
if [ ! grep --quiet XYZ "$MyFile" ] ; then
... do things ...
fi
But of course, that's not valid Bash syntax. I could use backticks, but then I'd be testing the command's output rather than its exit status. The two alternatives I can think of are:
grep --quiet XYZ "$MyFile"
if [ $? -ne 0 ]; then
... do things ...
fi
And
grep --quiet XYZ "$MyFile" ||
( ... do things ...
)
I kind of prefer the second one; it's more Lispy, and the || for control flow isn't that uncommon in scripting languages. I can see arguments for the first one too, although when someone reads the first line, they don't know why you're executing grep; it looks like you're running it for its main effect, rather than just to control a branch in the script.
Is there a third, more direct way which uses an if statement and has the grep in the condition?
Yes there is:
if grep --quiet .....
then
# If grep finds something
fi
or if the grep fails
if ! grep --quiet .....
then
# If grep doesn't find something
fi
You don't need the [ ] (test) to check the return value of a command. Just try:
if ! grep --quiet XYZ "$MyFile" ; then
This is a matter of taste, since there obviously are multiple working solutions. When I deal with a problem like this, I usually apply wc -l after grep in order to count the matching lines. Then you have a single integer that you can evaluate within a test condition. If the question is only whether there is a match at all (the number of matching lines does not matter), applying wc is probably overkill, and evaluating grep's return code seems to be the best solution:
Normally, the exit status is 0 if selected lines are found and 1 otherwise. But the exit status is 2 if an error occurred, unless the -q or --quiet or --silent option is used and a selected line is found. Note, however, that POSIX only mandates, for programs such as grep, cmp, and diff, that the exit status in case of error be greater than 1; it is therefore advisable, for the sake of portability, to use logic that tests for this general condition instead of strict equality with 2.
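Following that portability advice, a sketch that distinguishes all three outcomes (match, no match, error) rather than testing for 2 exactly:
grep -q XYZ "$MyFile"
case $? in
    0) echo "match found" ;;
    1) echo "no match" ;;
    *) echo "grep itself failed" >&2 ;;   # any status > 1 signals an error
esac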

Bash Compound Conditional, With Wildcards and File Existence Check

I've mastered the basics of Bash compound conditionals and have read a few different ways to check for file existence of a wildcard file, but this one is eluding me, so I figured I'd ask for help...
I need to:
1.) Check if some file matching a pattern exists
AND
2.) Check that text in a different file exists.
I know there's lots of ways to do this, but I don't really have the knowledge to prioritize them (if you have that knowledge I'd be interested in reading about that as well).
First things that came to mind is to use find for #1 and grep for #2
So something like
if [ `grep -q "OUTPUT FILE AT STEP 1000" ../log/minimize.log` ] \
&& [ `find -name "jobscript_minim\*cmd\*o\*"` ]; then
echo "Both passed! (1)"
fi
That fails, though curiously:
if `grep -q "OUTPUT FILE AT STEP 1000" ../log/minimize.log` ;then
echo "Text passed!"
fi
if `find -name "jobscript_minim\*cmd\*o\*"` ;then
echo "File passed!"
fi
both pass...
I've done a bit of reading and have seen people talking about the problem of multiple filenames matching wildcards within an if statement. What's the best solution to this? (in answer my question, I'd assumed you take a crack at that question, as well, in the process)
Any ideas/solutions/suggestions?
Let's tackle why your attempt failed first:
if [ `grep -q …` ];
This runs the grep command between backticks, and interpolates the output inside the conditional command. Since grep -q doesn't produce any output, it's as if you wrote if [ ];
The conditional is supposed to test the return code of grep, not anything about its output. Therefore it should be simply written as
if grep -q …;
The find command returns 0 (i.e. true) even if it finds nothing, so this technique won't work. What will work is testing whether its output is empty, by collecting its output and comparing it to the empty string:
if [ "$(find …)" != "" ];
(An equivalent test is if [ -n "$(find …)" ].)
Notice two things here:
I used $(…) rather than backticks. They're equivalent, except that backticks require strange quoting inside them (especially if you try to nest them), whereas $(…) is simple and reliable. Just use $(…) and forget about backticks (except that you need to write \` inside double quotes).
There are double quotes around $(…). This is really important. Without the quotes, the shell would break the output of the find command into words. If find prints, say, two lines dir/file and dir/otherfile, we want if [ "dir/file dir/otherfile" = "" ]; to be executed, not if [ dir/file dir/otherfile = "" ]; which is a syntax error. This is a general rule of shell programming: always put double quotes around a variable or command substitution. (A variable substitution is $foo or ${foo}; a command substitution is $(command).)
Now let's see your requirements.
Check if some file matching a pattern exists
If you're looking for files in the current directory or in any directory below it recursively, then find -name "PATTERN" is right. However, if the directory tree can get large, it's inefficient, because it can spend a lot of time printing all the matches when we only care about one. An easy optimization is to only retain the first line by piping into head -n 1; find will stop searching once it realizes that head is no longer interested in what it has to say.
if [ "$(find -name "jobscript_minimcmdo" | head -n 1)" != "" ];
(Note that the double quotes already protect the wildcards from expansion.)
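With GNU find, the -print -quit combination from the first question above achieves the same early exit without a pipeline (a sketch; -quit is a GNU extension):
if [ "$(find -name "jobscript_minim*cmd*o*" -print -quit)" != "" ];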
If you're only looking for files in the current directory, assuming you have GNU find (which is the case on Linux, Cygwin and Gnuwin32), a simple solution is to tell it not to recurse deeper than the current directory.
if [ "$(find -maxdepth 1 -name "jobscript_minim*cmd*o*")" != "" ];
There are other solutions that are more portable, but they're more complicated to write.
Check that text in a different file exists.
You've already got a correct grep command. Note that if you want to search for a literal string, you should use grep -F; if you're looking for a regexp, grep -E has a saner syntax than plain grep.
Putting it all together:
if grep -q -F "OUTPUT FILE AT STEP 1000" ../log/minimize.log &&
[ "$(find -name "jobscript_minim*cmd*o*")" != "" ]; then
echo "Both passed! (1)"
fi
bash 4
shopt -s globstar nullglob   # nullglob: an unmatched pattern expands to nothing, not to itself
files=$(echo **/jobscript_minim*cmd*o*)
if grep -q "pattern" file && [[ ! -z $files ]];then echo "passed"; fi
for i in filename*; do FOUND=$i; break; done
if [ "$FOUND" = 'filename*' ]; then
    echo "No files found matching wildcard."
else
    echo "Files found matching wildcard."
fi
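A sketch of the same check using the nullglob option instead of comparing against the literal pattern (bash only):
shopt -s nullglob
found=( filename* )
if (( ${#found[@]} == 0 )); then
    echo "No files found matching wildcard."
else
    echo "Files found matching wildcard."
fi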

How to prevent code/option injection in a bash script

I have written a small bash script called "isinFile.sh" for checking if the first term given to the script can be found in the file "file.txt":
#!/bin/bash
FILE="file.txt"
if [ `grep -w "$1" $FILE` ]; then
echo "true"
else
echo "false"
fi
However, running the script like
> ./isinFile.sh -x
breaks the script, since -x is interpreted by grep as an option.
So I improved my script
#!/bin/bash
FILE="file.txt"
if [ `grep -w -- "$1" $FILE` ]; then
echo "true"
else
echo "false"
fi
using -- as an argument to grep. Now running
> ./isinFile.sh -x
false
works. But is using -- the correct and only way to prevent code/option injection in bash scripts? I have not seen it in the wild, only found it mentioned in ABASH: Finding Bugs in Bash Scripts.
grep -w -- ...
prevents that interpretation in what follows --
EDIT
EDIT (I did not read the last part, sorry): yes, -- is the standard way. The other approach is to keep the pattern from starting with a hyphen, e.g. by prefixing it with ".{0}" (which matches the empty string), odd as that looks. So, e.g.
grep -wE ".{0}$1" ...
should work too (note the -E: the {0} interval requires extended regex syntax).
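As an aside, POSIX grep also accepts -e, which marks its argument explicitly as a pattern and so protects a leading hyphen much like -- does:
grep -w -e "$1" "$FILE"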
There's actually another code injection (or whatever you want to call it) bug in this script: it simply hands the output of grep to the [ (aka test) command, and assumes that'll return true if it's not empty. But if the output is more than one "word" long, [ will treat it as an expression and try to evaluate it. For example, suppose the file contains the line 0 -eq 2 and you search for "0" -- [ will decide that 0 is not equal to 2, and the script will print false despite the fact that it found a match.
The best way to fix this is to use Ignacio Vazquez-Abrams' suggestion (as clarified by Dennis Williamson) -- this completely avoids the parsing problem, and is also faster (since -q makes grep stop searching at the first match). If that option weren't available, another method would be to protect the output with double-quotes: if [ "$(grep -w -- "$1" "$FILE")" ]; then (note that I also used $() instead of backquotes 'cause I find them much easier to read, and quotes around $FILE just in case it contains anything funny, like whitespace).
Though not applicable in this particular case, another technique can be used to prevent filenames that start with hyphens from being interpreted as options:
rm ./-x
or
rm /path/to/-x
