Bash Regex if else - bash

In Bash I'm trying to check if a string is in the appropriate format.
#!/bin/bash
COMMIT_MSG="release/patch/JIRA-123"
[[ $COMMIT_MSG =~ 'release\/(major|minor|patch)\/[A-Z\d]+-\d+' ]] && echo "yes" || echo "no"
This is the regex I've used to match the string, where patch could also be major or minor, and JIRA-123 is a Jira ticket ID. But when I try it in Bash, it always returns no.

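Bash uses a simpler regex flavor called POSIX "Extended Regular Expressions" (ERE). \d doesn't exist in it, so use [0-9] instead. The slashes don't need escaping either, because they are not delimiters here.
Additionally, you shouldn't quote the regex in the condition: a quoted right-hand side of =~ is matched as a literal string, not as a regex.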
[[ $COMMIT_MSG =~ release/(major|minor|patch)/[A-Z0-9]+-[0-9]+ ]] && echo "yes" || echo "no"
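A minimal sketch of the same check, assuming you also want the whole string to match and not just a substring. The ^ and $ anchors and the function name is_release_ref are additions for illustration, not part of the question. Storing the regex in a variable sidesteps the quoting issue:

```bash
#!/bin/bash
# Hypothetical helper: succeeds if the argument looks like release/<level>/<TICKET-ID>.
is_release_ref() {
    local re='^release/(major|minor|patch)/[A-Z0-9]+-[0-9]+$'
    [[ $1 =~ $re ]]   # $re is deliberately unquoted so it is treated as a regex
}

if is_release_ref "release/patch/JIRA-123"; then
    # BASH_REMATCH[1] holds the text matched by the first parenthesized group.
    echo "yes (${BASH_REMATCH[1]})"
else
    echo "no"
fi
```

With the sample string this prints "yes (patch)".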

Related

Why is empty string changed into -n expression in bash

Take this snippet:
$ [[ ""=="foo" ]] && echo yes || echo no
+ [[ -n ==foo ]]
+ echo yes
yes
How does [[ ""=="foo" ]] turn into [[ -n ==foo ]] ?
The root cause was of course the missing spaces around ==. After adding them, it works as expected:
$ [[ "" == "foo" ]] && echo yes || echo no
+ [[ '' == \f\o\o ]]
+ echo no
no
But I still cannot understand why it behaved like this.
It's not changing the empty string into -n.
The string ""=="foo" is equivalent to the string ==foo. The trace output always shows strings in their simplest format, without unnecessary quotes.
A conditional expression that just contains a single string with no operators is true if the string is not empty. That's what the -n operator tests, so the -x expansion shows it that way.
Any operand that isn't preceded or followed by an operator is treated as equivalent to -n <operand>. Operators also need to be separated by spaces to be recognized as operators. For a list of operators, run help test. Also run help [[ to see how the keyword differs from the [ and test builtins.
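The following sketch makes the word splitting visible. The variable name word is only for illustration:

```bash
#!/bin/bash
# Without spaces, the shell sees ONE word: the empty string "" glued to == and "foo".
word=""=="foo"
echo "$word"    # shows the single word that [[ ]] receives

# A lone non-empty word is true, exactly like an explicit -n test.
[[ ""=="foo" ]]  && echo "one word, non-empty: true"
[[ -n "==foo" ]] && echo "same test written with -n: true"

# With spaces, == is an operator and the comparison is real.
[[ "" == "foo" ]] || echo "real comparison: false"
```

The first echo prints ==foo, which is the operand shown in the trace output.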

bash test - match forward slashes

I have a git branch name:
current_branch='oleg/feature/1535693040'
I want to test if the branch name includes /feature/, so I use:
if [ "$current_branch" != */feature/* ] ; then
echo "Current branch does not seem to be a feature branch by name, please check, and use --force to override.";
exit 1;
fi
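But that branch name doesn't match the regex, so I am exiting with 1. Does anyone know why?
[ ] is the single-bracket test(1) command, which does not handle patterns the same way bash does: its = and != operators compare strings literally, and an unquoted */feature/* is subject to pathname expansion before [ even runs. Instead, use the double-bracket bash conditional expression [[ ]]. Example: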
$ current_branch='oleg/feature/1535693040'
$ [ "$current_branch" = '*/feature/*' ] && echo yes
$ [[ $current_branch = */feature/* ]] && echo yes
yes
Edit with regexes:
$ [[ $current_branch =~ /feature/ ]] && echo yes
yes
The regex can match anywhere, so you don't need the leading and trailing * (which would be .* in a regex).
CAUTION: the slashes here are not delimiters for the regex, but literals to be matched somewhere in the string. For example, [[ foo/bar =~ / ]] returns true. This is different from regex notation in many languages.
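As a rough sketch, the guard from the question could be rewritten like this. The function name is_feature_branch is hypothetical, and the real script would exit 1 instead of only printing:

```bash
#!/bin/bash
# Hypothetical helper: true if the branch name contains /feature/.
is_feature_branch() {
    # The right-hand side is unquoted so that it acts as a glob pattern.
    [[ $1 == */feature/* ]]
}

current_branch='oleg/feature/1535693040'

if ! is_feature_branch "$current_branch"; then
    echo "Current branch does not seem to be a feature branch by name."
else
    echo "feature branch"
fi

# Quoting the pattern turns the test into a literal string comparison.
[[ $current_branch == "*/feature/*" ]] || echo "quoted pattern: no match"
```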

How to check if a file name matches regex in shell script

I have a shell script that needs to check if a file name matches a certain regex, but it always shows "not match". Can anyone let me know what's wrong with my code?
fileNamePattern=abcd_????_def_*.txt
realFilePath=/data/file/abcd_12bd_def_ghijk.txt
if [[ $realFilePath =~ $fileNamePattern ]]
then
echo $realFilePath match $fileNamePattern
else
echo $realFilePath not match $fileNamePattern
fi
There is a confusion between regexes and the simpler "glob"/"wildcard"/"normal" patterns – whatever you want to call them. You're using the latter, but call it a regex.
If you want to use a pattern, you should
Quote it when assigning [1]:
fileNamePattern="abcd_????_def_*.txt"
You don't want anything to expand quite yet.
Make it match the complete path. This doesn't match:
$ mypath="/mydir/myfile1.txt"
$ mypattern="myfile?.txt"
$ [[ $mypath == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
Doesn't match!
But after extending the pattern to start with *:
$ mypattern="*myfile?.txt"
$ [[ $mypath == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
The first one doesn't match because it matches only the filename, but not the complete path. Alternatively, you could use the first pattern, but remove the rest of the path with parameter expansion:
$ mypattern="myfile?.txt"
$ mypath="/mydir/myfile1.txt"
$ echo "${mypath##*/}"
myfile1.txt
$ [[ ${mypath##*/} == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
Use == and not =~, as shown in the above examples. You could also use the more portable = instead, but since we're already using the non-POSIX [[ ]] instead of [ ], we may as well use ==.
If you want to use a regex, you should:
Write your pattern as one: ? and * have a different meaning in regexes; they quantify the element that precedes them, whereas in glob patterns, they can stand on their own (see the manual). The corresponding pattern would become:
fileNameRegex='abcd_.{4}_def_.*\.txt'
and could be used like this:
$ mypath="/data/file/abcd_12bd_def_ghijk.txt"
$ [[ $mypath =~ $fileNameRegex ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
Keep your habit of writing the regex into a separate parameter and then use it unquoted in the conditional operator [[ ]], or escaping gets very messy – it's also more portable across Bash versions.
The BashGuide has a great article about the different types of patterns in Bash.
Notice that quoting your parameters is almost always a good habit. It's not required in conditional expressions in [[ ]], and actually suppresses interpretation of the right-hand side as a pattern or regex. If you were using [ ] (which doesn't support regexes and patterns anyway), quoting would be required to avoid unexpected side effects of special characters and empty strings.
[1] Not exactly true in this case, actually. When assigning to a variable, the manual says that the following happens:
[...] tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, and quote removal [...]
i.e., no pathname (glob) expansion. While in this very case using
fileNamePattern=abcd_????_def_*.txt
would work just as well as the quoted version, using quotes prevents surprises in many other cases and is required as soon as you have a blank in the pattern.
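To compare the two approaches side by side, here is a small sketch using the file from the question. The variable names fileName, globPattern and regexPattern, and the ^ and $ anchors in the regex, are assumptions added for illustration:

```bash
#!/bin/bash
realFilePath=/data/file/abcd_12bd_def_ghijk.txt
fileName=${realFilePath##*/}            # strip everything up to the last slash

globPattern='abcd_????_def_*.txt'       # ? = any one character, * = any run of characters
regexPattern='^abcd_.{4}_def_.*\.txt$'  # . = any one character, .* = any run, \. = literal dot

# Both right-hand sides are unquoted so they are interpreted as patterns.
[[ $fileName == $globPattern ]]  && echo "glob matches"
[[ $fileName =~ $regexPattern ]] && echo "regex matches"
```

A glob always has to match the whole string, while a regex matches anywhere unless it is anchored, which is why the anchors are needed to make the two tests equivalent.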
Use regexes instead of wildcards:
fileNamePattern="abcd_...._def_.*\.txt"
realFilePath=/data/file/abcd_12bd_def_ghijk.txt
if [[ $realFilePath =~ $fileNamePattern ]]
then
    echo "$realFilePath match $fileNamePattern"
else
    echo "$realFilePath not match $fileNamePattern"
fi
Output:
/data/file/abcd_12bd_def_ghijk.txt match abcd_...._def_.*\.txt

check for string format in bash script

I am attempting to check for proper formatting at the start of a string in a bash script.
The expected format is like the below where the string must always begin with "ABCDEFG-" (exact letters and order) and the numbers would vary but be at least 3 digits. Everything after the 3rd digit is a do not care.
Expected start of string: "ABCDEFG-1234"
I am using the below code snippet.
[ $(echo "$str" | grep -E "ABCDEFG-[0-9][0-9][0-9]") ] && echo "yes"
str1 = "ABCDEFG-1234"
str2 = "ABCDEFG-1234 - Some more text"
When I use str1 in place of str everything works ok and yes is printed.
When I use str2 in place of str i get the below error
[: ABCDEFG-1234: unary operator expected
I am pretty new to working with bash scripts so any help would be appreciated.
If this is bash, you have no reason to use grep for this at all; the shell has built-in regular expression support.
re="ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, you might want your regex to be anchored if you want a match in the beginning rather than anywhere in the content:
re="^ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, this doesn't need to be an ERE at all -- a glob-style pattern match would also be adequate:
if [[ $str = ABCDEFG-[0-9][0-9][0-9]* ]]; then echo "yes"; fi
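As for the error itself: the unquoted $(...) is split into words, so with str2 the [ command receives several arguments (ABCDEFG-1234, -, Some, ...) and cannot parse them. If you do want to keep grep, a sketch like the following avoids the problem by relying on the exit status from grep -q instead of its output. The function names and the {3} quantifier are illustrative choices, not from the question:

```bash
#!/bin/bash
# Hypothetical helper using grep: -q suppresses output, so nothing gets word-split.
starts_with_ticket_grep() {
    printf '%s\n' "$1" | grep -Eq '^ABCDEFG-[0-9]{3}'
}

# Hypothetical helper using only the shell's built-in regex support.
starts_with_ticket() {
    local re='^ABCDEFG-[0-9]{3}'
    [[ $1 =~ $re ]]
}

str="ABCDEFG-1234 - Some more text"
starts_with_ticket_grep "$str" && echo "yes (grep)"
starts_with_ticket "$str"      && echo "yes (builtin)"
```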
Try grep -E "ABCDEFG-[0-9][0-9][0-9].*"

What does the "=~" operator do in shell scripts?

It seems to be some sort of comparison operator, but what exactly does it do in, e.g., the following code (taken from https://github.com/lvv/git-prompt/blob/master/git-prompt.sh#L154)?
if [[ $LC_CTYPE =~ "UTF" && $TERM != "linux" ]]; then
elipses_marker="…"
else
elipses_marker="..."
fi
I'm currently trying to make git-prompt work under MinGW, and the shell supplied with MinGW doesn't seem to support this operator:
conditional binary operator expected
syntax error near `=~'
` if [[ $LC_CTYPE =~ "UTF" && $TERM != "linux" ]]; then'
In this specific case I can just replace the entire block with elipses_marker="…" (as I know my terminal supports Unicode), but what exactly does this =~ do?
It's a bash-only addition to the built-in [[ command, performing regexp matching. Since it doesn't have to be an exact match of the full string, the symbol contains a wavy tilde, to indicate an "inexact" match.
In this case, it tests whether $LC_CTYPE contains the string "UTF".
More portable version:
if test `echo $LC_CTYPE | grep -c UTF` -ne 0 -a "$TERM" != "linux"
then
...
else
...
fi
It's regular expression matching. I guess your bash version doesn't support that yet.
In this particular case, I'd suggest replacing it with simpler (and faster) pattern matching:
[[ $LC_CTYPE == *UTF* && $TERM != "linux" ]]
(note that * must not be quoted here)
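If the shell supports neither =~ nor [[ ]] at all, a case statement gives the same substring test using only POSIX features. This is a sketch, and the function name pick_marker is hypothetical:

```bash
#!/bin/bash
# Hypothetical helper. $1: value of LC_CTYPE, $2: value of TERM.
pick_marker() {
    case $1 in
        *UTF*)
            # The nested test replaces the && of the original condition.
            if [ "$2" != "linux" ]; then
                echo "…"
                return
            fi
            ;;
    esac
    echo "..."
}

# ${VAR-} expands to an empty string if the variable is unset.
elipses_marker=$(pick_marker "${LC_CTYPE-}" "${TERM-}")
echo "$elipses_marker"
```

The body of pick_marker uses only POSIX constructs, so it should also run in shells other than bash.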
Like Ruby, it matches where the RHS operand is a regular expression.
It matches regular expressions
Refer to the following example from http://tldp.org/LDP/abs/html/bashver3.html#REGEXMATCHREF
#!/bin/bash
input=$1
if [[ "$input" =~ [0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9] ]]
# NOTE: As of version 3.2 of Bash the regex must be left unquoted; a quoted regex is matched as a literal string.
# NNN-NN-NNNN (where each N is a digit).
then
echo "Social Security number."
# Process SSN.
else
echo "Not a Social Security number!"
# Or, ask for corrected input.
fi
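The effect of quoting can be checked directly. The sketch below assumes Bash 3.2 or later with default compatibility settings; the sample input, the {n} quantifiers and the anchors are illustrative additions:

```bash
#!/bin/bash
input="123-45-6789"
re='^[0-9]{3}-[0-9]{2}-[0-9]{4}$'

# Unquoted: $re is interpreted as an extended regular expression.
[[ $input =~ $re ]] && echo "unquoted: match"

# Quoted: the characters of $re are searched for literally, so nothing matches.
[[ $input =~ "$re" ]] || echo "quoted: no match"
```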
