How to use [[ == ]] properly to match a glob? - bash

Bash's manpage teaches that [[ == ]] matches patterns. In Bash therefore, why does the following not print matched?
Z=abc; [[ "$Z" == 'a*' ]] && echo 'matched'
The following however does indeed print matched:
Z=abc; [[ "$Z" == a* ]] && echo 'matched'
Isn't this exactly backward? Why does the a*, without the quotes, not immediately expand to list whatever filenames happen to begin with the letter a in the current directory? And besides, why doesn't the quoted 'a*' work in any case?

The glob pattern must not be quoted for it to work.
It also works if only the glob metacharacter is left outside the quotes while the static text is still quoted:
[[ "$Z" == "a"* ]] && echo 'matched'
matched
[[ "$Z" == "ab"* ]] && echo 'matched'
matched
Explanation snippet from man page:
When the == and != operators are used, the string to the right of
the operator is considered a pattern and matched according to the
rules described below under Pattern Matching. If the shell option
nocasematch is enabled, the match is performed without regard to
the case of alphabetic characters. The return value is 0 if the
string matches (==) or does not match (!=) the pattern, and 1
otherwise. Any part of the pattern may be quoted to force it to be
matched as a string.
Additionally, one of the reasons to use [[ over [ is that [[ is a shell keyword rather than an ordinary command, and thus can have its own syntax and doesn't need to follow the normal expansion rules (which is why the arguments to [[ aren't subject to word-splitting, for example).
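A minimal sketch of the rule quoted above, reusing the question's Z=abc. The result variables (quoted, unquoted, mixed) are names made up for illustration:

```shell
Z=abc

# Quoted pattern: the * is literal, so only the exact string "a*" would match.
if [[ $Z == 'a*' ]]; then quoted=matched; else quoted='no match'; fi

# Unquoted pattern: the * is a wildcard. No filename expansion happens here.
if [[ $Z == a* ]]; then unquoted=matched; else unquoted='no match'; fi

# Mixed: the literal text is quoted, the metacharacter is left unquoted.
if [[ $Z == "ab"* ]]; then mixed=matched; else mixed='no match'; fi

printf '%s\n' "quoted: $quoted" "unquoted: $unquoted" "mixed: $mixed"
```

Only the first test fails, because quoting turns the * into an ordinary character.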

While the existing answer is correct, I don't believe that it tells the full story.
Globs have two uses. Globs inside a [[ ]] construct test the contents of a variable against a pattern, whereas globs elsewhere expand to a list of matching files. In either case, if you put quotes around a character, it will be interpreted literally and not expanded.
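The two uses can be seen side by side in a rough sketch like the following. It assumes a scratch directory created with mktemp; the file names apple, avocado and banana are invented for the example:

```shell
# Create a scratch directory so the set of file names is known.
tmpdir=$(mktemp -d)
touch "$tmpdir/apple" "$tmpdir/avocado" "$tmpdir/banana"

# Use 1: pathname expansion. The unquoted a* is replaced by matching file names.
expanded=$(cd "$tmpdir" && echo a*)

# Use 2: pattern matching inside [[ ]]. a* is tested against the string abc,
# and the files in the directory play no part.
if [[ abc == a* ]]; then pattern_result=matched; else pattern_result='no match'; fi

# Quoting makes the * literal, so nothing is expanded.
literal=$(cd "$tmpdir" && echo 'a*')

echo "$expanded"        # apple avocado
echo "$pattern_result"  # matched
echo "$literal"         # a*

rm -rf "$tmpdir"
```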
It is also worth mentioning that the variable on the left hand side doesn't need to be quoted after the [[, so you could write your code like this:
Z=abc; [[ $Z == a* ]] && echo 'matched'
It is also possible to use a single =, but == looks more familiar to those coming from other coding backgrounds, so personally I prefer to use it in bash as well. As mentioned in the comments, the single = is more widely compatible, as it is used to test string equality in all POSIX-compliant shells, e.g. [ "$a" = "abc" ]. For this reason you may prefer to use it in bash as well.
As always, Greg's wiki contains some good information on the subject of pattern matching in bash.

Related

Bash check shows file exists for non-existent files?

Run the following in bash:
stuff=`rpm -ql <some package> | grep dasdasdfd`
(non existent file in package, exit code = 1, stdout is empty)
if [ -f $stuff ]; then echo "whaaat"; fi
Above command checks if file exists... but:
file $stuff
Just prints usage info for file... and
stat $stuff
Missing operand...
Can someone please explain why? Is this a bug? Am I doing something wrong? I just want to make sure that a file that's in the package is present on fs
You probably need to surround $stuff in quotes
if [ -f "$stuff" ]; then
As a general rule, you almost always want to add quotes around pathnames everywhere you use them.
I find it more useful to think of variables in shell scripting as "macros", which are expanded to their value wherever they are used. This is different from variables in almost every other programming language.
So if $stuff contains hello world (notice the space), it would be the same as if you've typed:
[ -f hello world ]
which is obviously an error.
In this case, you mentioned that you're dealing with a non-existent file, so $stuff is actually empty, which would be like typing:
[ -f ]
This is actually valid, and it succeeds. It is a somewhat obscure bit of test behaviour: according to the POSIX spec, test with a single argument succeeds whenever that argument is not the empty string (in this case, the argument is -f):
1 argument:
Exit true (0) if $1 is not null; otherwise, exit false.
This is probably to facilitate the writing of:
[ $variable_that_may_or_may_not_be_defined ]
If you add quotes, you're passing 2 arguments, and more sane things happen:
if [ -f "" ]; then
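The difference can be reproduced without rpm. This sketch assumes an empty variable stands in for the empty output of the failed grep; the result variables are illustrative names:

```shell
stuff=''   # simulates the empty output of the failed pipeline

# Unquoted: $stuff vanishes, leaving [ -f ], a one-argument test that is
# true because the string "-f" is not empty.
if [ -f $stuff ]; then unquoted=true; else unquoted=false; fi

# Quoted: [ -f "" ] is a real two-argument file test, and "" is not a file.
if [ -f "$stuff" ]; then quoted=true; else quoted=false; fi

echo "unquoted: $unquoted, quoted: $quoted"   # unquoted: true, quoted: false
```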
Martin Tournoij's answer and DevSolar's answer both provide correct solutions and helpful background info: with respect to [ ... ] in one case, and [[ ... ]] in the other.
Since it may not be obvious if and when to choose [[ ... ]] over [ ... ] (and its (virtual) alias, test ...), let me attempt a summary:
If your code must be portable (POSIX-compliant), you MUST use [ ... ] (or test ...).
Tokens inside [ ... ] are parsed just like arguments passed to an executable, so you must double-quote your variable references, unless you explicitly want all shell expansions - notably word splitting (automatic splitting into multiple tokens by whitespace) and globbing - applied to them.
[ -f "$stuff" ] # double-quoting required, if $stuff has embedded whitespace
If you know that your code will be run with bash, you can use [[ ... ]] for more features and fewer surprises.
Tokens inside [[ ... ]] are parsed in a special context in which neither word splitting nor pathname expansion (globbing) are applied (though other expansions, such as parameter expansion, do occur), so there is typically no need to double-quote variable references.
[[ -f $stuff ]] # double-quoting optional
Note that ksh and zsh also support [[ ... ]] (presumably with subtle variations in behavior).
For more background info, such as the additional features that [[ ... ]] offers, read on.
[[ ... ]] improves on [ ... ] / test ... as follows:
"RHS" below means "right-hand side", i.e., the right operand of a binary operator.
(typically) requires NO quoting of variable references (except on the RHS of == and =~ to specify a literal string or substring(s))
f='some file'; [[ -f $f ]] # ok, double quotes optional
v='*'; [[ $v == '*' ]] # ok, double quotes optional
Neither word splitting nor pathname expansion is applied inside [[ ... ]], so it's safe to use unquoted references to variables whose values have embedded whitespace and/or values such as * that would normally lead to globbing.
offers string pattern matching with = / ==, with an unquoted pattern on the RHS (or at least unquoted pattern metachars.)
[[ abc == a* ]] && echo yes # matches; use of = instead of == works too
Caveat: Thus, on the RHS of = / == you must double-quote variable references (or single-quote literals) if you want their values to be treated as literals.
v='a*'; [[ abc == "$v" ]] # does NOT match
offers regex matching with =~, with an unquoted extended regular expression on the RHS (or at least unquoted regex metachars.)
[[ abc =~ ^a.+$ ]] && echo yes # matches
Caveat: Thus, on the RHS of =~ you must double-quote variable references (or single-quote literals) if you want their values to be treated as literals.
v='a.+'; [[ abc =~ ^"$v"$ ]] # does NOT match
Also note that the unquoted / quoted distinction was only introduced in bash 3.2 - you can still use shopt -s compat31 to have single- and double-quoted strings treated as regexes, too.
Caveat: The regex dialect understood by =~ is platform-specific, so a regex that works on one platform may not work on another (this is one of the few cases where bash's behavior is platform-dependent). For instance, on Linux you can use \b and \< / \> for word-boundary assertions, whereas BSD/macOS only supports [[:<]] and [[:>]], which, in turn, Linux doesn't support - see this answer of mine.
offers grouping and negation with unescaped (, ), and ! chars.
offers use of && and || (Boolean AND and OR)
[[ (3 -gt 2) && ! -f / ]] && echo yes
Note that, inside [[ ... ]], && has higher precedence than || - unlike OUTSIDE (as so-called [command-]list operators, where they combine entire commands / command lists), where they have equal precedence.
(while [ and test have -a and -o, even the POSIX spec. for test cautions against their use)
within [[ ... ]], you may spread your conditional across multiple lines for readability without the need for the line-continuation char. (\), assuming the line breaks come after && or ||, as codeforester points out.
[[ ... ]] is faster than [ ... ], though that will typically not matter.
If you are interested in relative performance, see this answer of mine.
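Several of the points in the list above can be checked together in a short sketch. All variable names here are made up for the example, and it assumes bash 3.2 or later without compat31 set:

```shell
# Unquoted variable on the left: no word splitting and no globbing.
v='two words *'
[[ $v == 'two words *' ]] && lhs=ok || lhs=broken

# Right-hand side of ==: unquoted means pattern, quoted means literal.
p='a*'
[[ abc == $p ]]   && as_pattern=match || as_pattern='no match'
[[ abc == "$p" ]] && as_literal=match || as_literal='no match'

# Right-hand side of =~: the same distinction applies to regexes.
r='a.+'
[[ abc =~ ^$r$ ]]   && as_regex=match     || as_regex='no match'
[[ abc =~ ^"$r"$ ]] && as_regex_lit=match || as_regex_lit='no match'

# Inside [[ ]], && binds tighter than ||: true || (true && false) is true.
[[ 1 -eq 1 || 1 -eq 1 && 1 -eq 2 ]] && inside=true || inside=false

# Outside, the list operators have equal precedence and run left to right:
# (true || true) && false is false.
{ true || true && false; } && outside=true || outside=false

echo "$lhs / $as_pattern / $as_literal / $as_regex / $as_regex_lit / $inside / $outside"
```

The two precedence lines use the same three operands and still disagree, which is the easiest way to see that the operators inside [[ ]] are not the shell's list operators.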
Implementation notes re [ and test:
[[ is a shell keyword (supported in bash, ksh, and zsh), which allows for different parsing rules, as described above.
By contrast, [ and test are builtins in all major POSIX-like shells (bash, ksh, zsh, dash).
In addition, both [ and test exist as external utilities (executable files that require a separate process to invoke), as mandated by POSIX.
In fact, you need external utility versions so as to be able to use [ or test in "shell-less" invocation scenarios such as when passing a test to find -exec or xargs.
While the [ utility could conceivably be implemented as a symlink to the test utility (as long as test knows how it was invoked and enforces the closing ] when invoked as [), in practice they are often (always?) separate executables (true on Linux and macOS / BSD, for instance; on Linux, their content differs, whereas on macOS / BSD their content is identical (they are copies of the same file)).
One option would be to put $stuff in quotes, as Carpetsmoker said.
But since this is tagged bash, and because catering for whitespace in filenames is a pain, you could go for:
if [[ -f $stuff ]]
As opposed to [ which is an alias for test, the [[ construct "knows" how to handle the contents of $stuff correctly.
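As a quick check of that claim, here is a sketch that assumes a scratch directory from mktemp and an invented file name containing a space:

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/hello world"

# A value with embedded whitespace: [[ ]] sees it as a single word.
stuff="$tmpdir/hello world"
[[ -f $stuff ]] && with_space=found || with_space=missing

# An empty value: [[ ]] still passes an (empty) operand to -f.
stuff=''
[[ -f $stuff ]] && empty=found || empty=missing

echo "with space: $with_space, empty: $empty"   # with space: found, empty: missing

rm -rf "$tmpdir"
```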

Trying to use wildcards in bash conditional statement/case mixed with exact alpha char and failing

Essentially, I'm testing a variable to ensure its contents match a specific time format: 4 digits, am/pm/AM/PM, no spaces (e.g. 1204pm). I've gotten this much to work:
tm0=1204pm; [[ $tm0 == [0-2###aApP]* ]] && echo PASS
or
tm0=1203pm; case $tm0 in [0-2###apAP]*) echo PASS; esac
But when I try to specify the last character as "m" (Originally I was trying for [Mm] but that didn't work either) it fails.
tm0=1204pm; [[ $tm0 == [0-2###aApP]m ]] && echo PASS
Any help, please and thanks.
Using globs:
[[ $tm0 == [01][0-9][0-5][0-9][aApP][mM] ]]
Note that this will validate, e.g., 1900pm. If you don't want that:
[[ $tm0 == @(0[0-9]|1[0-2])[0-5][0-9][aApP][mM] ]]
This uses extended globs. Note that you don't need shopt -s extglob to use extended globs inside [[ ... ]]: in the section Conditional Constructs, the documentation for [[ ... ]] says:
When the == and != operators are used, the string to the right of the operator is considered a pattern and matched according to the rules described below in Pattern Matching, as if the extglob shell option were enabled.
To use this feature in a case statement, you need to enable extglob.
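For the case variant, a sketch along these lines should work. The function name is_time is hypothetical, and the pattern is the extended glob @(0[0-9]|1[0-2]) followed by the minute and am/pm brackets:

```shell
# extglob must be switched on before the case statement is parsed,
# so keep this on its own line ahead of the function definition.
shopt -s extglob

# Hypothetical helper: succeeds if $1 looks like hhmm followed by am/pm.
is_time() {
  case $1 in
    @(0[0-9]|1[0-2])[0-5][0-9][aApP][mM]) return 0 ;;
    *) return 1 ;;
  esac
}

for t in 1204pm 0959AM 1900pm 1260pm; do
  if is_time "$t"; then echo "$t PASS"; else echo "$t FAIL"; fi
done
```

The first two values pass; 1900pm fails on the hour and 1260pm on the minutes.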
Using regex:
[[ $tm0 =~ ^(0[0-9]|1[0-2])([0-5][0-9])([aApP][mM])$ ]]
With these groupings, you get the whole match in BASH_REMATCH[0], the hour in BASH_REMATCH[1], the minutes in BASH_REMATCH[2] and the am/pm in BASH_REMATCH[3].
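A short sketch of reading the captured groups after a successful match. The variable names whole, hour, minutes and meridiem are illustrative only:

```shell
tm0=1204pm

if [[ $tm0 =~ ^(0[0-9]|1[0-2])([0-5][0-9])([aApP][mM])$ ]]; then
  whole=${BASH_REMATCH[0]}     # the entire matched string
  hour=${BASH_REMATCH[1]}      # first parenthesised group
  minutes=${BASH_REMATCH[2]}   # second group
  meridiem=${BASH_REMATCH[3]}  # third group
  echo "$whole -> $hour:$minutes $meridiem"   # 1204pm -> 12:04 pm
fi
```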
bash patterns are not regular expressions. They are also not Java patterns, which I think is what you're trying to use (although it's not at all clear).
You can (and should) read the bash manual chapter on pattern matching, which is a complete list of pattern features. In that, you will see that:
[...] matches a single character which is one of the characters in the enclosed character class description
* matches any number of arbitrary characters
So [0-2###apAP]* matches one of the characters 0, 1, 2, #, a, p, A, or P, followed by any number of characters (including none).
What I think you are looking for is:
[01][0-9][0-5][0-9][aApP][mM]
although that is slightly generous since it will match, for example, 1300pm ("It was a bright cold day in April, and the clocks were striking thirteen.")
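To see why the original pattern passed and the variant with a trailing m failed, here is a small sketch (r1 to r5 are throwaway names):

```shell
# The bracket expression matches exactly one character, so this pattern
# accepts anything that merely starts with 0, 1, 2, #, a, A, p or P.
[[ 1204pm == [0-2###aApP]* ]] && r1=match || r1='no match'
[[ 2xyz   == [0-2###aApP]* ]] && r2=match || r2='no match'

# With a trailing m instead of *, only two-character strings can match.
[[ 1204pm == [0-2###aApP]m ]] && r3=match || r3='no match'
[[ pm     == [0-2###aApP]m ]] && r4=match || r4='no match'

# One bracket expression per position gives the intended check.
[[ 1204pm == [01][0-9][0-5][0-9][aApP][mM] ]] && r5=match || r5='no match'

echo "$r1 / $r2 / $r3 / $r4 / $r5"   # match / match / no match / match / match
```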

Bash glob in string comparison

The other day I was struggling with an if statement. Turns out my variable had a white space at the beginning. So I tried to conquer this with the following code, but I am having no luck.
if [ "$COMMAND_WAIT" == "*REBOOT" ]; then
sudo /etc/kca/scripts/reboot.sh
echo "REBOOTING"
fi
Should I be able to wildcard this statement or is there another way around this?
The following should work. It uses [[ instead of [, and no quotes around the pattern.
if [[ "$COMMAND_WAIT" == *REBOOT ]]; then
sudo /etc/kca/scripts/reboot.sh
echo "REBOOTING"
fi
[[ expression ]] is a compound command, with special rules regarding expansions and quoting. In contrast, [ is a builtin command whose arguments undergo the normal expansions, i.e. an unquoted *REBOOT would be expanded as a pathname, while a quoted one is compared literally. In most cases, it's easier to use [[ instead of [.
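A sketch of the two behaviours, assuming the variable holds REBOOT with one leading space as described in the question (the reboot script itself is left out):

```shell
COMMAND_WAIT=' REBOOT'   # note the leading space

# [ with a quoted pattern: a plain string comparison, so the * is literal.
if [ "$COMMAND_WAIT" = "*REBOOT" ]; then single=match; else single='no match'; fi

# [[ with an unquoted pattern: the * absorbs the leading space.
if [[ $COMMAND_WAIT == *REBOOT ]]; then double=match; else double='no match'; fi

echo "single brackets: $single, double brackets: $double"
```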

Bash [[ tests, quoting variables

I want to decide whether to always omit quotes for variables appearing within a Bash [[ test. I interpret the man page to say that it is permissible to do so without any loss of correctness.
I devised this simplistic "test" to verify my thinking and the "expected behaviour" but it may prove absolutely nothing, take a look at it:
x='1 == 2 &&'
if [[ $x == '1 == 2 &&' ]]; then
echo yes
else
echo no
fi
Note I am not writing this as such:
x='1 == 2 &&'
if [[ "$x" == '1 == 2 &&' ]]; then
echo yes
else
echo no
fi
which so far has always been my style, for consistency if nothing else.
Is it safe to switch my coding convention to always omit quotes for variables appearing within [[ tests?
I am trying to learn Bash and I am trying to do so picking up good habits, good style and correctness..
The key thing to remember is that quotes and escaping within pattern matching contexts always cause their contents to become literal. Quoting on the left hand side of an == within [[ is never necessary; only the right side is interpreted as a pattern. Quoting on the right hand side is necessary if you want a literal match and to avoid interpretation of pattern metacharacters within the variable.
In other words, [ "$x" = "$x" ] and [[ $x == "$x" ]] are mostly equivalent, and of course in Bash the latter should be preferred.
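A sketch of where the quotes matter and where they don't. It uses the question's x plus a second, made-up variable y whose value happens to look like a bracket expression:

```shell
# Left-hand side: quoted or not, the result is the same.
x='1 == 2 &&'
[[ $x == '1 == 2 &&' ]]   && unquoted_lhs=yes || unquoted_lhs=no
[[ "$x" == '1 == 2 &&' ]] && quoted_lhs=yes   || quoted_lhs=no

# Right-hand side: an unquoted variable is interpreted as a pattern.
y='[abc]'
[[ b == $y ]]    && as_pattern=match    || as_pattern='no match'
[[ b == "$y" ]]  && as_literal=match    || as_literal='no match'

# Consequence: a variable may fail to "equal itself" unless the RHS is quoted.
[[ $y == "$y" ]] && self_quoted=match   || self_quoted='no match'
[[ $y == $y ]]   && self_unquoted=match || self_unquoted='no match'

echo "$unquoted_lhs $quoted_lhs / $as_pattern, $as_literal / $self_quoted, $self_unquoted"
```

The last test fails because the five-character string [abc] is compared against a pattern that matches a single a, b or c.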
One quick tip: think of the operators of the [[ ]] compound command as being the same grammar-wise as other control operators such as elif, do, ;;, and ;;& (though technically in the manual they're in their own category). They're really delimiters of sections of a compound command, which is how they achieve seemingly magical properties like the ability to short-circuit expansions. This should help to clarify a lot of the behavior of [[, and why it's distinct from e.g. the arithmetic operators, which are not like that.
More examples: http://mywiki.wooledge.org/BashFAQ/031#Theory
No. You should not get in the habit of always omitting quotes, even if they appear within [[ tests. Bash is famous for burning people for leaving off quotes :-)
In bash the [[ ]] should always evaluate as an expression, so the script will continue to function. The risk is that a logic error may pop up unnoticed. In all cases that I can think of off the top of my head it would be fine. However, quotes allow you to be specific about what you want, and are also self-documenting in addition to being safer.
Consider this expression:
if [[ "$INT" =~ ^-?[0-9]+$ ]]; then
It would still work without the quotes because it is in between [[ ]], but the quotes are clarifying and do not cause any issues.
At any rate, this is my opinion as a guy who has received a royal hosing at the hand of Bash because I failed to put " " around something that needed them :'(
My bash hacking friend once said, "use quotes liberally in Bash." That advice has served me well.

What's the difference between "[]" and "[[]]" [duplicate]

I looked at bash man page and the [[ says it uses Conditional Expressions. Then I looked at Conditional Expressions section and it lists the same operators as test (and [).
So I wonder, what is the difference between [ and [[ in Bash?
[[ is bash's improvement to the [ command. It has several enhancements that make it a better choice if you write scripts that target bash. My favorites are:
It is a syntactical feature of the shell, so it has some special behavior that [ doesn't have. You no longer have to quote variables like mad because [[ handles empty strings and strings with whitespace more intuitively. For example, with [ you have to write
if [ -f "$file" ]
to correctly handle empty strings or file names with spaces in them. With [[ the quotes are unnecessary:
if [[ -f $file ]]
Because it is a syntactical feature, it lets you use && and || operators for boolean tests and < and > for string comparisons. [ cannot do this because it is a regular command and &&, ||, <, and > are not passed to regular commands as command-line arguments.
It has a wonderful =~ operator for doing regular expression matches. With [ you might write
if [ "$answer" = y -o "$answer" = yes ]
With [[ you can write this as
if [[ $answer =~ ^y(es)?$ ]]
It even lets you access the captured groups which it stores in BASH_REMATCH. For instance, ${BASH_REMATCH[1]} would be "es" if you typed a full "yes" above.
You get pattern matching aka globbing for free. Maybe you're less strict about how to type yes. Maybe you're okay if the user types y-anything. Got you covered:
if [[ $answer = y* ]]
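The features in this list can be tried out in one go. This is only a sketch; the variable names captured, order and loose are invented for the example:

```shell
answer=yes

# Regex match: the optional group (es) is captured in BASH_REMATCH[1].
if [[ $answer =~ ^y(es)?$ ]]; then
  captured=${BASH_REMATCH[1]}
fi

# String comparison with <, which [ would treat as a redirection
# unless it were escaped.
[[ apple < banana ]] && order=before || order=after

# Glob match: anything starting with y is accepted.
[[ yep = y* ]] && loose=accepted || loose=rejected

echo "captured: $captured, order: $order, loose: $loose"
```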
Keep in mind that it is a bash extension, so if you are writing sh-compatible scripts then you need to stick with [. Make sure you have the #!/bin/bash shebang line for your script if you use double brackets.
See also
Bash FAQ - "What is the difference between test, [ and [[ ?"
Bash Practices - Bash Tests
Server Fault - What is the difference between double and single brackets in bash?
[ is the same as the test builtin, and works like the test binary (man test)
works about the same as [ in all the other sh-based shells in many UNIX-like environments
reliably supports only a single condition. The -a and -o operators exist, but the POSIX spec discourages them, so multiple tests are best combined with the shell's && and || operators between separate brackets.
supports negation with ! inside the brackets (e.g. [ ! -f "$file" ]). A ! outside the first bracket also works, using the shell's facility for inverting command return values.
== and != are literal string comparisons
[[ is bash-specific, though other shells may have implemented similar constructs. Don't expect it in an old-school UNIX sh.
== and != apply bash pattern matching rules, see "Pattern Matching" in man bash
has a =~ regex match operator
allows use of parentheses and the !, &&, and || logical operators within the brackets to combine subexpressions
Aside from that, they're pretty similar -- most individual tests work identically between them, things only get interesting when you need to combine different tests with logical AND/OR/NOT operations.
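A sketch of combining tests both ways. The value of n is arbitrary, and the path in file is hypothetical and assumed not to exist:

```shell
n=5
file=/nonexistent/path   # hypothetical path, assumed not to exist

# [ : one condition per bracket, joined by the shell's own && operator.
if [ "$n" -gt 1 ] && [ "$n" -lt 10 ]; then single=in_range; else single=out_of_range; fi

# [[ : the && lives inside a single pair of brackets.
if [[ $n -gt 1 && $n -lt 10 ]]; then double=in_range; else double=out_of_range; fi

# [[ : negation and grouping with ! and parentheses, no escaping needed.
if [[ ! ( $n -lt 1 || -e $file ) ]]; then grouped=true; else grouped=false; fi

echo "single: $single, double: $double, grouped: $grouped"
```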
The most important difference will be the clarity of your code. Yes, yes, what's been said above is true, but [[ ]] brings your code in line with what you would expect in high level languages, especially in regards to AND (&&), OR (||), and NOT (!) operators. Thus, when you move between systems and languages you will be able to interpret script faster which makes your life easier. Get the nitty gritty from a good UNIX/Linux reference. You may find some of the nitty gritty to be useful in certain circumstances, but you will always appreciate clear code! Which script fragment would you rather read? Even out of context, the first choice is easier to read and understand.
if [[ -d $newDir && -n $(echo $newDir | grep "^${webRootParent}") && -n $(echo $newDir | grep '/$') ]]; then ...
or
if [ -d "$newDir" -a -n "$(echo "$newDir" | grep "^${webRootParent}")" -a -n "$(echo "$newDir" | grep '/$')" ]; then ...
In bash, contrary to [, [[ prevents word splitting of variable values.

Resources