Search in string for multiple array values - bash

I'm looking at a simple for loop with the following logic:
variable=`some piped string`
array_value=(1.1 2.9)
for i in ${array_value[@]}; do
if [[ "$variable" == *some_text*"$array_value" ]]; then
echo -e "Info: Found a matching string"
fi
The problem is that I cannot get this to show me when it finds either the string ending in 1.1 or 2.9 as sample data.
If I echo $array_value inside the for loop, I can see that the array values are being picked up, but the if statement never prints the echo message even though the string is present.
Later edit:
Based on the comments received I've abstracted the code to something like this, which still doesn't work if I want to use wildcards inside the comparison quote
versions=(1.1 2.9)
string="system is running version:2.9"
for i in ${versions[@]}; do
if [[ "$string" == "system*${i}" ]]; then
echo "match found"
fi
done
Any construction similar to "system* ${i}" or "* ${i}" will not work, though if I specify the full string pattern it will work.

The problem with the test construct has to do with your if statement. To construct the if statement in a form that will evaluate, use:
if [[ "$variable" == "*some_text*${i}" ]]; then
Note: *some_text* will need to be replaced with actual text without * wildcards. If the * is needed in the text, then you will need to turn globbing off to prevent expansion by the shell. If expansion is your goal, then protect the variable i by braces.
There is nothing wrong with putting *some_text* up against the variable i, but it is cleaner, depending on the length of some_text, to assign it to a variable itself. The easiest way to accommodate this would be to define a variable to hold the some_text you are needing. E.g.:
prefix="some_text"
if [[ "$variable" == "${prefix}${i}" ]]; then
If you have additional questions, just ask.

Change "system*${i}" to system*$i.
Wrapping with quotes inside [[ ... ]] nullifies the wildcard * by treating it as a literal character.
Or if you want the match to be assigned to a variable:
match="system*"
you can then do:
[[ $string == $match$i ]]
You actually don't need quotes around $string either as word splitting is not performed inside [[ ... ]].
From man bash:
[[ expression ]]
...
Word splitting and pathname expansion are not
performed on the words between the [[ and ]]
...
Any part of the pattern may be quoted to force
the quoted portion to be matched as a string.
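Putting the quoting rule together with the loop from the question, here is a minimal sketch. The function name find_version is made up for illustration, and the sample string is the one from the question. The * stays outside the quotes so it remains a wildcard, while the loop variable is quoted so it is matched literally.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the first version from the list that the
# string ends with, after the literal prefix "system".
find_version() {
    local string=$1; shift
    local v
    for v in "$@"; do
        # The * is unquoted so it acts as a wildcard; "$v" is quoted so
        # any special characters inside it are matched literally.
        if [[ $string == system*"$v" ]]; then
            printf '%s\n' "$v"
            return 0
        fi
    done
    return 1
}

versions=(1.1 2.9)
find_version "system is running version:2.9" "${versions[@]}"   # prints 2.9

# Quoting the whole pattern turns * into a literal character, so this fails:
[[ "system is running version:2.9" == "system*2.9" ]] || echo "no match when quoted"
```

Note that the loop uses the loop variable, not the array name, inside the comparison; $array_value on its own only expands to the first element.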

Related

Regular expression in bash not working in conditional construct in Bash with operator '=~'

The regular expression I have put into the conditional construct (with the =~ operator) would not return the value as I had expected, but when I assign them into two variables it worked. Wondering if I had done something wrong.
Version 1 (this one worked)
a=30
b='^[0-9]+$' #pattern looking for a number
[[ $a =~ $b ]]
echo $?
#result is 0, as expected
Version 2 (this one doesn't work but I thought it is identical)
[[ 30 =~ '^[0-9]+$' ]]
echo $?
#result is 1
Don't quote the regular expression:
[[ 30 =~ ^[0-9]+$ ]]
echo $?
From the manual:
Any part of the pattern may be quoted to force the quoted portion to be matched as a string.
So if you quote the entire pattern, it's treated as a fixed string match rather than a regular expression.
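A small sketch of the difference, assuming bash 3.2 or later (where quoting the right-hand side of =~ forces a literal match). The function names are hypothetical and only there to make the two cases easy to compare.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the argument is a run of digits.
is_number() {
    local re='^[0-9]+$'      # single quotes protect the regex in the assignment
    [[ $1 =~ $re ]]          # unquoted expansion: treated as a regex
}

# Same test with the expansion quoted: the text ^[0-9]+$ is matched literally.
is_number_quoted() {
    local re='^[0-9]+$'
    [[ $1 =~ "$re" ]]
}

is_number 30                  && echo "30: regex match"
is_number_quoted 30           || echo "30: no match when the pattern is quoted"
is_number_quoted 'x^[0-9]+$y' && echo "literal text matched"
```

Storing the regex in a variable and expanding it unquoted sidesteps most quoting questions, which is why Version 1 in the question worked.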

Matching a string against contents of an array with regex operator not working

I made a simple bash script to change the version number based on the source branch of a merge request. I need to increment a different value depending on whether the branch name is a feature or a hotfix/bugfix/fix branch:
#!/bin/bash
if [ $# -eq 0 ]
then
echo -e "\nUsage: $0 MERGE_REQUEST_SOURCE\n"
exit 1
fi
if [ ! -f version ]; then
echo "0.0.0" > version
fi
VERSION=$(cat version)
MERGE_REQUEST_SOURCE=$1
declare -a FEATURE_LIST=("feature")
declare -a HOTFIX_LIST=("fix" "hotfix" "bugfix")
IFS="."
read -a num <<< ${VERSION}
MAJOR=${num[0]}
FEATURE=${num[1]}
HOTFIX=${num[2]}
if [[ ${MERGE_REQUEST_SOURCE} =~ .*${FEATURE_LIST[*]}.* ]]; then
FEATURE=$((${FEATURE}+1))
echo "${MAJOR}.${FEATURE}.${HOTFIX}" > version
elif [[ ${MERGE_REQUEST_SOURCE} =~ .*${HOTFIX_LIST[*]}.* ]]; then
HOTFIX=$((${HOTFIX}+1))
echo "${MAJOR}.${FEATURE}.${HOTFIX}" > version
else
echo -e "Nothing change, exit."
exit 0
fi
I've declared two arrays. FEATURE_LIST contains only feature and works: if I run ./script.sh feature or ./script.sh feature/foobar, the value is incremented. But if I run ./script.sh hotfix, or any other value from HOTFIX_LIST, nothing happens. Where is the error?
Using .*${HOTFIX_LIST[*]}.* is quite a tedious way of representing a string for an alternate match for the regex operator in bash. You can use the | character to represent alternations (because Extended Regular Expressions library is supported) in bash regex operator.
First generate the alternation string from the array into a string
hotfixList=$(IFS="|"; printf '^(%s)$' "${HOTFIX_LIST[*]}")
echo "$hotfixList"
^(fix|hotfix|bugfix)$
The string now represents a regex pattern comprising three words that will match exactly as is because of the anchors ^ and $.
You can now use this variable in your regex match
[[ ${MERGE_REQUEST_SOURCE} =~ $hotfixList ]]
Also, for the feature check, just putting the whole array expansion with [*] on the RHS is sufficient. You also don't need the greedy matches: since the longer string is on the LHS, the comparison still holds.
[[ ${MERGE_REQUEST_SOURCE} =~ ${FEATURE_LIST[*]} ]]
As a side note, always use lower case variable names for user variables. The uppercase names are reserved only for the variables maintained by the shell which are persistent and have special meaning.
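For reference, here is one way the alternation idea could be wrapped up. This is only a sketch: the helper name branch_regex is made up, and the trailing (/|$) is my own assumption so that names such as hotfix/foo also match (the fully anchored ^(...)$ form accepts only the bare word). It also assumes the list entries contain no regex metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join the arguments with | into a regex that matches
# a branch name beginning with one of them, e.g. "hotfix" or "hotfix/foo".
# Assumes the words contain no regex metacharacters.
branch_regex() {
    local IFS='|'            # "$*" joins on the first character of IFS
    printf '^(%s)(/|$)' "$*"
}

hotfix_list=(fix hotfix bugfix)
hotfix_re=$(branch_regex "${hotfix_list[@]}")
echo "$hotfix_re"        # ^(fix|hotfix|bugfix)(/|$)

for branch in hotfix bugfix/login feature/foo prefix; do
    if [[ $branch =~ $hotfix_re ]]; then
        echo "$branch: hotfix bump"
    else
        echo "$branch: no bump"
    fi
done
```

Because IFS is declared local inside the function, the join character does not leak into the rest of the script, unlike a global IFS="." assignment.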

Check in shellscript if a variable is partially the same as a parameter [duplicate]

This question already has answers here:
How to check if a string contains a substring in Bash
So this is admittedly for university, but I can't find the answer anywhere, neither online nor in the lecture notes.
I basically take a parameter, and have to search, if that is part of a longer string I have already stored:
if [ *$param* = $var ]
then
...
is the part in question. Now what is really weird for me, is that no matter if it says = or !=, the code nested under then never gets executed. I checked every other part of the code very thoroughly, and it all looks to be working fine.
Do you have any ideas what might cause this?
You just need to reverse the arguments. Inside [[ ... ]], =, ==, and != can perform pattern matching if the right-hand operand contains unescaped meta characters like * and ?, or a bracket expression [...].
if [[ $var == *"$param"* ]]; # check if $param is a substring of $var
Your code may or may not have performed pattern matching (depending on the contents of $var), but you were seeing if the string with the value of $param embedded in literal *s matched the value of $var.
For example, [[ foobar == *oba* ]] would succeed, as oba is a substring of foobar. [[ *oba* == foobar ]] would not, since *oba* and foobar are two distinct strings. [[ *oba* == *oba ]] would also fail, since *oba* is not a string that ends with oba.
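As a minimal sketch of the same check wrapped in a function (the name contains is hypothetical), with the needle quoted so that wildcard characters inside it are taken literally:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when $2 occurs somewhere inside $1.
contains() {
    local haystack=$1 needle=$2
    # The string under test goes on the left; the pattern goes on the right.
    # Quoting "$needle" keeps any * ? or [ in it from acting as wildcards.
    [[ $haystack == *"$needle"* ]]
}

contains foobar oba   && echo "foobar contains oba"
contains oba foobar   || echo "oba does not contain foobar"
contains 'a*c' '*'    && echo "a literal * is found"
contains abc '*'      || echo "abc has no literal *"
```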
I would use grep
if echo "$var" | grep -qF -- "$param"; then
echo "found it!"
fi
Try this
if [[ $var =~ $param ]]; then
echo "matches"
fi
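One caveat with both the grep and the =~ suggestions: an unquoted $param is interpreted as a regular expression, not as plain text. The sketch below uses made-up sample values to show the difference; for a pure substring test, the quoted form (or grep -F) is probably what is wanted.

```shell
#!/usr/bin/env bash
var="version abc"
param="a.c"          # sample value; the dot is a regex metacharacter

# Unquoted: $param is a regex, so the dot matches any character.
[[ $var =~ $param ]]   && echo "regex: match"

# Quoted: "$param" is matched as literal text, so a real dot is required.
[[ $var =~ "$param" ]] || echo "literal: no match"

# grep behaves the same way: -F asks for fixed-string matching, and
# printf with quoted arguments avoids word splitting of $var.
printf '%s\n' "$var" | grep -q  -- "$param" && echo "grep regex: match"
printf '%s\n' "$var" | grep -qF -- "$param" || echo "grep -F: no match"
```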

Handling Spaces In Substring Searches In Bash Shell Scripts

I'm using the following to determine if either substring is present in a $mainString in a Bash (ver 3.2.25) shell script:
if [[ $mainString = *cat* || $mainSubstring = *blue cheese* ]]; then
echo "FOUND"
else
echo "NOT FOUND"
fi
But I keep getting the following error because of the space in "blue cheese". How do you handle spaces in the substring?
You can escape the space:
$mainSubString = *blue\ cheese*
or quote the non-wildcard portions, one example of which is
$mainSubString = *'blue cheese'*
Often, it is better to store the pattern in a variable, both to simplify the quoting and to make the [[...]] expression more concise. Note that you must not quote the parameter expansion, as glenn jackman points out in his comment.
pattern="*blue cheese*"
if [[ $mainString = *cat* || $mainSubstring = $pattern ]]; then
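A runnable sketch of the pattern-in-a-variable approach, assuming both tests are meant to look at the same string (the function name has_keyword is made up for illustration):

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when $1 contains "cat" or "blue cheese".
has_keyword() {
    local s=$1
    local pattern='*blue cheese*'   # quotes here only protect the assignment
    # $pattern is left unquoted on the right-hand side so that its *
    # characters stay wildcards; no word splitting happens inside [[ ]].
    [[ $s == *cat* || $s == $pattern ]]
}

has_keyword "a wedge of blue cheese" && echo "FOUND"
has_keyword "concatenate"            && echo "FOUND"
has_keyword "green eggs"             || echo "NOT FOUND"
```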

Multiple matches in a string using regex in bash

I've been looking for more advanced information on regex in bash and have not found much.
Here's the concept, with a simple string:
myString="DO-BATCH BATCH-DO"
if [[ $myString =~ ([[:alpha:]]*)-([[:alpha:]]*) ]]; then
echo ${BASH_REMATCH[1]} #first parens
echo ${BASH_REMATCH[2]} #second parens
echo ${BASH_REMATCH[0]} #full match
fi
outputs:
BATCH
DO
DO-BATCH
So fine it does the first match (BATCH-DO) but how do I pull a second match (DO-BATCH)? I'm just drawing a blank here and can not find much info on bash regex.
OK so one way I did this is to put it in a for loop:
myString="DO-BATCH BATCH-DO"
for aString in ${myString[@]}; do
if [[ ${aString} =~ ([[:alpha:]]*)-([[:alpha:]]*) ]]; then
echo ${BASH_REMATCH[1]} #first parens
echo ${BASH_REMATCH[2]} #second parens
echo ${BASH_REMATCH[0]} #full match
fi
done
which outputs:
DO
BATCH
DO-BATCH
BATCH
DO
BATCH-DO
Which works but I kind of was hoping to pull it all from one regex if possible.
In your answer, myString is not an array, but you use an array reference to access it. This works in Bash because the 0th element of an array can be referred to by just the variable name and vice versa. What that means is that you could use:
for aString in $myString; do
to get the same result in this case.
In your question, you say the output includes "BATCH-DO". I get "DO-BATCH" so I presume this was a typo.
The only way to get the extra strings without using a for loop is to use a longer regex. By the way, I recommend putting Bash regexes in a variable. It makes certain types much easier to use (those that contain whitespace or special characters, for example).
pattern='(([[:alpha:]]*)-([[:alpha:]]*)) +(([[:alpha:]]*)-([[:alpha:]]*))'
[[ $myString =~ $pattern ]]
declare -p BASH_REMATCH #dump the array
Outputs:
declare -ar BASH_REMATCH='([0]="DO-BATCH BATCH-DO" [1]="DO-BATCH" [2]="DO" [3]="BATCH" [4]="BATCH-DO" [5]="BATCH" [6]="DO")'
The extra set of parentheses is needed if you want to capture the individual substrings as well as the hyphenated phrases. If you don't need the individual words, you can eliminate the inner sets of parentheses.
Notice that you don't need to use if if you only need to extract substrings. You only need if to take conditional action based on a match.
Also notice that ${BASH_REMATCH[0]} will be quite different with the longer regex since it contains the whole match.
Per @Dennis Williamson's post I messed around and ended up with the following:
myString="DO-BATCH BATCH-DO"
pattern='(([[:alpha:]]*)-([[:alpha:]]*)) +(([[:alpha:]]*)-([[:alpha:]]*))'
[[ $myString =~ $pattern ]] && { read -a myREMatch <<< ${BASH_REMATCH[@]}; }
echo "\${myString} -> ${myString}"
echo "\${#myREMatch[@]} -> ${#myREMatch[@]}"
for (( i = 0; i < ${#myREMatch[@]}; i++ )); do
echo "\${myREMatch[$i]} -> ${myREMatch[$i]}"
done
This works fine except that myString must contain exactly the 2 values. I'm posting it because it is kind of interesting and I had fun messing with it. But to make this more generic and handle any number of paired groups (i.e. DO-BATCH), I'm going to go with a modified version of my original answer:
myString="DO-BATCH BATCH-DO"
myRE="([[:alpha:]]*)-([[:alpha:]]*)"
read -a myString <<< $myString
for aString in ${myString[@]}; do
echo "\${aString} -> ${aString}"
if [[ ${aString} =~ ${myRE} ]]; then
echo "\${BASH_REMATCH[@]} -> ${BASH_REMATCH[@]}"
echo "\${#BASH_REMATCH[@]} -> ${#BASH_REMATCH[@]}"
for (( i = 0; i < ${#BASH_REMATCH[@]}; i++ )); do
echo "\${BASH_REMATCH[$i]} -> ${BASH_REMATCH[$i]}"
done
fi
done
I would have liked a perlre like multiple match but this works fine.
Although this is a year old question (without accepted answer), could the regex pattern be simplified to:
myRE="([[:alpha:]]*-[[:alpha:]]*)"
by removing the inner parenthesis to find a smaller (more concise) set of the words DO-BATCH and BATCH-DO?
It works for me in your 18:10 answer. ${BASH_REMATCH[0]} and ${BASH_REMATCH[1]} result in the 2 words being found.
In case you don't actually know how many matches there will be ahead of time, you can use this:
#!/bin/bash
function handle_value {
local one=$1
local two=$2
echo "i found ${one}-${two}"
}
function match_all {
local current=$1
local regex=$2
local handler=$3
while [[ ${current} =~ ${regex} ]]; do
"${handler}" "${BASH_REMATCH[@]:1}"
# trim off the portion already matched
current="${current#${BASH_REMATCH[0]}}"
done
}
match_all \
"DO-BATCH BATCH-DO" \
'([[:alpha:]]*)-([[:alpha:]]*)[[:space:]]*' \
'handle_value'
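The loop above trims the match from the front of the string, so it relies on each match starting at the beginning of what is left. A variant that also copes with unmatched text before a match could look like the sketch below. The function name all_matches is hypothetical, and it assumes an unanchored regex that never matches the empty string (otherwise the loop would not advance).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every non-overlapping match of regex $2 in $1,
# one per line. Assumes the regex is unanchored and never matches the
# empty string.
all_matches() {
    local rest=$1 regex=$2
    while [[ $rest =~ $regex ]]; do
        printf '%s\n' "${BASH_REMATCH[0]}"
        # Drop everything up to and including this match. Quoting the
        # match keeps its characters from being read as glob wildcards.
        rest=${rest#*"${BASH_REMATCH[0]}"}
    done
}

all_matches "run DO-BATCH then BATCH-DO" '[[:alpha:]]+-[[:alpha:]]+'
# prints:
# DO-BATCH
# BATCH-DO
```

Using + instead of * in the character classes is what guarantees a non-empty match here.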
