I would like someone to clarify this because I don't understand it.
Here is some sample code that tests whether an argument is an integer:
#!/usr/bin/env bash
pattern='^[+-]?[0-9]+$'
[[ "$1" =~ "$pattern" ]] && echo "1:number" || echo "1:NOT number"
[[ "$1" =~ $pattern ]] && echo "2:number" || echo "2:NOT number"
The usual advice is to always quote variables, but if you run this script with various inputs you will see that, when the argument is a number, the test with the quoted pattern variable (the first test) gives the wrong result.
Why is that?
Thanks in advance to anyone who takes the trouble to explain this.
Apologies if this has already been answered; I couldn't find this particular case.
It's normally advised to quote all variables, but [[ ]] is special shell syntax that parses its contents differently from an ordinary command.
You don't need to quote variables inside double square brackets, because no word splitting or filename expansion is performed there. There's no harm in quoting most variables anyway.
However, the pattern operand of =~ is treated very specially: any part of it that is quoted is matched literally rather than as a regular expression. So when you write "$pattern", bash no longer performs a regular expression match; it just searches $1 for the literal characters stored in $pattern.
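A minimal sketch of the difference, assuming bash 3.2 or later with default settings. The function names matches_as_regex and matches_as_literal are made up for illustration; they are not part of the original script.

```bash
#!/usr/bin/env bash
pattern='^[+-]?[0-9]+$'

# Unquoted: the value of $pattern is used as a regular expression.
matches_as_regex() {
  [[ $1 =~ $pattern ]]
}

# Quoted: the value of $pattern is searched for as a literal substring.
matches_as_literal() {
  [[ $1 =~ "$pattern" ]]
}

# The last input contains the pattern text itself, character for character.
for arg in 42 -7 abc 'x^[+-]?[0-9]+$y'; do
  if matches_as_regex "$arg"; then r=yes; else r=no; fi
  if matches_as_literal "$arg"; then l=yes; else l=no; fi
  printf '%-18s regex=%s literal=%s\n' "$arg" "$r" "$l"
done
```

Only the last input should report literal=yes, because it is the only one that contains the characters of the pattern verbatim.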
Suppose you have a variable whose value is subject to word splitting, globbing, and pattern matching:
var='*
.?'
While I'm pretty sure that everyone agrees that "$var" is the best way to expand the variable as a string literal, I've identified a few cases where you don't need to use the double quotes:
Simple assignment: x=$var
Case statement: case $var in ...
Leftmost part of bash test construct: [[ $var .... ]]
UPDATE1: Bash here-string: <<< $var, which works starting from bash-4.4 (thank you @GordonDavidson)
UPDATE2: Exported assignment (in bash): export x=$var
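Some of the cases listed above can be checked with a short sketch. It assumes bash; the helper variable names (x, y, assign_ok and so on) are only for illustration, and the here-string case is left out because its behaviour depends on the bash version.

```bash
#!/usr/bin/env bash
var='*
.?'

x=$var            # simple assignment, no quotes
export y=$var     # exported assignment (bash), no quotes

# Unquoted word in a case statement; the pattern is quoted to force a literal comparison.
case $var in
  "$x") case_ok=yes ;;
  *)    case_ok=no ;;
esac

# Unquoted left-hand side inside [[ ]]; the right-hand side is quoted to compare literally.
if [[ $var == "$x" ]]; then test_ok=yes; else test_ok=no; fi

if [ "$x" = "$var" ]; then assign_ok=yes; else assign_ok=no; fi
if [ "$y" = "$var" ]; then export_ok=yes; else export_ok=no; fi

echo "assignment=$assign_ok export=$export_ok case=$case_ok brackets=$test_ok"
```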
Is this correct? Are there any other shell/bash constructs where a variable is not subject to glob expansion or word splitting even without double quotes, that is, where expanding it with or without double quotes is 100% equivalent?
I ask because knowing these edge cases helps when reading other people's code.
For example, one bug that I found in a script that I was debugging is something like:
out_exists="-f a.out"
[[ $out_exists ]] && mv a.out prog.exe
mv: cannot stat ‘a.out’: No such file or directory
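A small sketch of why that test always succeeds, assuming bash and a scratch directory created with mktemp (the directory and the variable names string_test and file_test are illustrative only). Because the value is not word-split inside [[ ]], the -f never becomes an operator; the whole string is tested for being non-empty.

```bash
#!/usr/bin/env bash
tmpdir=$(mktemp -d)                 # empty scratch directory, so a.out does not exist
out_exists="-f $tmpdir/a.out"

# A lone word inside [[ ]] is a "string is non-empty" test, so this is true.
if [[ $out_exists ]]; then string_test=true; else string_test=false; fi

# Writing the operator in the source text performs the real file test.
if [[ -f $tmpdir/a.out ]]; then file_test=true; else file_test=false; fi

echo "string test: $string_test, file test: $file_test"
rmdir "$tmpdir"
```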
This question is a duplicate of What are the contexts where Bash doesn't perform word splitting and globbing?, but that was closed before it was answered.
For a thorough answer to the question see the answer by Stéphane Chazelas to What are the contexts where Bash doesn't perform word splitting and globbing? - Unix & Linux Stack Exchange. Another good answer is in the "Where you can omit the double quotes" section in the answer by Gilles to When is double-quoting necessary? - Unix & Linux Stack Exchange.
There seem to be a small number of cases that aren't covered by the links above:
With the for (( expr1 ; expr2 ; expr3 )) ... loop, variable expansions in any of the expressions inside the (( ... )) don't need to be quoted.
Several of the expansions described in the Shell Parameter Expansion section of the Bash Reference Manual are described with a word argument that isn't subject to word splitting or pathname expansion (globbing). Examples include ${parameter:-word}, ${parameter#word}, and ${parameter%word}.
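Both cases can be sketched as follows, assuming bash. The variable names and values are invented for the example. Note that the word in ${parameter%word} is still treated as a pattern, so glob characters in it remain special unless that inner expansion is quoted.

```bash
#!/usr/bin/env bash
limit=4
sum=0
# Inside (( ... )) variables are read as numbers; neither quotes nor $ are needed.
for (( i = 1; i <= limit; i++ )); do
  (( sum += i ))
done
echo "sum=$sum"

file='report final.txt'
suffix=' final.txt'
# $suffix contains a space but is not word-split inside the expansion.
base="${file%$suffix}"
# unset_name is deliberately never assigned, so the default word is used as a whole.
default_name="${unset_name:-$file}"
echo "base=$base"
echo "default=$default_name"
```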
Great question! One case where the quotes must be left off is when you actually want the variable to be word-split:
$ var='abc xyz'
$ set -- "$var"
$ echo "$1"
abc xyz
$ set -- $var
$ echo "$1"
abc
If I think of other cases, I'll add to this.
I have the following bash script.
while IFS= read -r filename; do
    [[ $(md5 path/to/"$filename-orig") = $(md5 path/to/"$filename") ]] || echo "$filename differs"
done < path/to/list-of-files-to-compare.txt
It's supposed to compare two files (by computing their MD5 hash digest) then report if they are different. It gets the files to compare from a list.
The problem is that if the file I am trying to read is at, say,
path/to/foo-orig.js
the script will look for the file at
path/to/foo.js-orig
and, obviously, this throws an error and fails.
How do I correct this bug in my script so that I handle the .js extension correctly?
Edit
TL;DR:
Given a string foo.bar how can I get foo-orig.bar?
If I understand you correctly, you are really only asking: given a string foo.bar, how can I get the string foo-orig.bar? For a known extension, this can be done with:
$ f="path/to/foo.js"
$ echo "${f%.js}-orig.js"
path/to/foo-orig.js
It is documented under Parameter Expansion in man bash:
${parameter%word}
${parameter%%word}
Remove matching suffix pattern. The word is expanded to produce a pattern just as in pathname expansion. If the pattern matches a trailing portion of the expanded value of parameter, then the result of the expansion is the expanded value of parameter with the shortest matching pattern (the "%" case) or the longest matching pattern (the "%%" case) deleted. If parameter is @ or *, the pattern removal operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with @ or *, the pattern removal operation is applied to each member of the array in turn, and the expansion is the resultant list.
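When the extension is not known in advance, the same expansions can be combined. The following is only a sketch: add_orig_suffix is a hypothetical helper name, and it assumes that everything after the last dot of the file name counts as the extension.

```bash
#!/usr/bin/env bash
# Print the given path with "-orig" inserted before the last extension.
add_orig_suffix() {
  local path=$1
  local name=${path##*/}          # file name without the directory part
  case $name in
    ?*.*) printf '%s-orig.%s\n' "${path%.*}" "${path##*.}" ;;   # has an extension
    *)    printf '%s-orig\n' "$path" ;;                         # no extension or a dotfile
  esac
}

add_orig_suffix path/to/foo.js
add_orig_suffix path/to/README
```

The case pattern ?*.* requires at least one character before a dot, so a name such as .bashrc is treated as having no extension.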
This example handles any extension, including the case where $filename has no extension at all:
while IFS= read -r filename; do
    # Split at the last dot: prefix is everything before it, suffix is the dot plus the extension.
    if [[ $filename =~ ^(.*)([.].*)$ ]]; then
        prefix=${BASH_REMATCH[1]}
        suffix=${BASH_REMATCH[2]}
    else
        # No dot in the name, so there is no extension.
        prefix=$filename
        suffix=''
    fi
    md5a=$(md5 -q "path/to/${prefix}-orig${suffix}")
    md5b=$(md5 -q "path/to/${filename}")
    [[ $md5a = "$md5b" ]] || echo "$filename differs"
    unset md5a md5b prefix suffix
done < path/to/list-of-files-to-compare.txt
I haven't tested this, though; I wrote it off the top of my head.
If I pass a word as an argument:
$ ./file.sh hello
the script prints Even, when it should print "Argument should be a number".
#!/bin/bash
set -e
if [[ -z $1 ]]
then
    echo "Argument expected"
else
    if [[ $1 =~ "\D" ]] #This does not work as expected
    then
        echo "Argument should be a number"
    else
        a=$1%2
        if [[ a -eq 0 ]]
        then
            echo "Even"
        elif [[ a -eq 1 ]]
        then
            echo "Odd"
        fi
    fi
fi
#End of program
When I change "\D" to "[^0-9]" in the if statement, it works as expected and prints "Argument should be a number" to the console.
Don't they both have the same meaning? If not, in what way are the two different from each other?
Bash uses POSIX Extended Regular Expressions, not PCRE. Instead of escape sequences like \D, POSIX provides character classes that are written inside a bracket expression. The character class for digits is
[:digit:]
and to match non-digits, you use it inside a bracket expression with the negation operator:
[^[:digit:]]
As you can see, this is longer than just writing [^0-9], so it's not really a shorthand. It can be useful for portability, since it does not depend on how the current locale interprets a range such as 0-9.
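Applied to the script from the question, the check might look like the sketch below. The function name parity and the variable non_digit are made up for this example. The pattern is kept in an unquoted variable so that bash treats it as a regular expression, and the 10# prefix is an added assumption so that input with leading zeros is read as decimal rather than octal.

```bash
#!/usr/bin/env bash
# Classify a single argument as Even or Odd, rejecting non-numeric input.
parity() {
  local arg=$1
  local non_digit='[^[:digit:]]'
  if [[ -z $arg ]]; then
    echo "Argument expected"
  elif [[ $arg =~ $non_digit ]]; then
    echo "Argument should be a number"
  elif (( 10#$arg % 2 == 0 )); then
    echo "Even"
  else
    echo "Odd"
  fi
}

parity hello
parity 7
parity 10
```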
Bash regex simply does not support PCRE regex syntax.
You might want to read up on different regex dialects and their history.
See e.g. Why are there so many different regular expression dialects?
The POSIX equivalent of \D is [^[:digit:]].
I am trying to do a regular expression match with a variable that contains an asterisk.
The following commands in Bash appear to perform filename expansion on the asterisk in the variable on the left-hand side of the operator:
test='part1 * part2'
[[ "$test" =~ ^(.+)\ .\ (.+)$ ]] && echo $BASH_REMATCH
Results in: part1 FILE1 FILE2 part2
But it should result in: part1 * part2
I have searched and searched but cannot figure out why this is happening.
I realized while writing the question that the regular expression matching works fine. No expansion happens inside the double brackets. The expansion occurs after the match, when the result is echoed: $BASH_REMATCH contains an asterisk, so it needs to be double-quoted.
The correct set of commands is:
test='part1 * part2'
regex='^(.+) . (.+)$'
[[ "$test" =~ $regex ]] && echo "$BASH_REMATCH"
UPDATE: The regular expression is now stored in a variable outside the test.
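The capture groups are available as further elements of the BASH_REMATCH array, and they need the same quoting. A minimal sketch, assuming bash (the variable names whole, left and right are illustrative):

```bash
#!/usr/bin/env bash
test_str='part1 * part2'
regex='^(.+) . (.+)$'

if [[ $test_str =~ $regex ]]; then
  whole=${BASH_REMATCH[0]}   # the entire match
  left=${BASH_REMATCH[1]}    # first parenthesised group
  right=${BASH_REMATCH[2]}   # second parenthesised group
  # Quoting keeps the asterisk from being expanded to file names.
  printf 'whole: %s\n' "$whole"
  printf 'left:  %s\n' "$left"
  printf 'right: %s\n' "$right"
fi
```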
I was recently confused by the following situation.
What is the difference between these two uses of if?
Case 1
amount=10
if [[ $amount -eq 10 ]]
then
echo "something"
fi
script output:
$ ./1.sh
something
Case 2
if [[ amount -eq 10 ]]
This works too (note that the variable name is not prefixed with $).
So the question is: why does it work without the dollar sign in front of the variable name?
P.S. I'm using a POSIX shell on HP-UX.
From man bash:
ARITHMETIC EVALUATION
...
Shell variables are allowed as operands; parameter expansion is performed before the expression is evaluated. Within an expression, shell variables may also be referenced by name without using the parameter expansion syntax.
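In this context the shell expects only numeric values: the operands of -eq inside [[ ]] are evaluated as arithmetic expressions, so a bare word such as amount is treated as a variable name and replaced by its value. That makes sense to me.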
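The behaviour can be observed with a short sketch. It assumes bash rather than the HP-UX POSIX shell from the question, and the result variable names are invented for the example.

```bash
#!/usr/bin/env bash
amount=10

if [[ $amount -eq 10 ]]; then with_dollar=true; else with_dollar=false; fi
if [[ amount -eq 10 ]]; then bare_name=true; else bare_name=false; fi
if (( amount == 10 )); then arithmetic=true; else arithmetic=false; fi

# The value of a variable may itself be an expression; it is evaluated as well.
expr_var='amount % 3'
if [[ expr_var -eq 1 ]]; then nested=true; else nested=false; fi

# The classic [ command does no arithmetic evaluation and rejects the bare name.
if [ amount -eq 10 ] 2>/dev/null; then single=true; else single=false; fi

echo "with_dollar=$with_dollar bare_name=$bare_name arithmetic=$arithmetic nested=$nested single=$single"
```

The nested case is also why a=$1%2 followed by [[ a -eq 0 ]] computes a remainder even though a holds a string.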