How do I compare strings in bash? I only want to compare the words, not the word order.
For example, I have a variable
VAR1='eu-endpoint-2021.09.20 prod-store-2021.09.20 service-trace-2021.09.20'
and another variable that stores the same words in a different order:
VAR2='prod-store-2021.09.20 eu-endpoint-2021.09.20 service-trace-2021.09.20'
How can I compare these by words only, ignoring the word order?
for example
if $VAR1 == $VAR2
then
do smth;
else
do smth;
fi
Since the parts of both input strings contain no spaces, we can:
1. Convert the strings into arrays, e.g. ARR1=($VAR1)
2. Loop over the first array (see: Loop through an array of strings in Bash?)
3. Check whether the current element exists in the second array (see: Check if a Bash array contains a value)
4. If it does not, set the result to false and break out of the loop
#!/bin/bash
VAR1='eu-endpoint-2021.09.20 prod-store-2021.09.20 service-trace-2021.09.20'
VAR2='prod-store-2021.09.20 eu-endpoint-2021.09.20 service-trace-2021.09.20'
ARR1=($VAR1)
ARR2=($VAR2)
RES=1
for i in "${ARR1[@]}"; do
[[ ! " ${ARR2[*]} " =~ " ${i} " ]] && RES=0 && break
done
[ $RES -eq 1 ] && echo 'Equal' || echo 'Not equal'
This prints Equal for the provided example strings.
If you change a word in either string, it prints Not equal.
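One caveat with the loop above: it only verifies that every word of VAR1 occurs in VAR2, so a VAR2 that carries extra words would still be reported as Equal. A minimal sketch of a two-way check follows. It assumes the words are separated by whitespace and treats both strings as sets (repeated words are ignored); the function names contains_all and same_words are made up for this illustration.

```bash
#!/bin/bash
# contains_all A B: succeed if every word of A also occurs as a whole word in B.
contains_all() {
    local word
    local -a words haystack
    read -ra words <<< "$1"
    read -ra haystack <<< "$2"
    for word in "${words[@]}"; do
        # pad with spaces so that only whole words can match
        [[ " ${haystack[*]} " == *" $word "* ]] || return 1
    done
}

# same_words A B: succeed if A and B contain the same set of words.
same_words() {
    contains_all "$1" "$2" && contains_all "$2" "$1"
}

VAR1='eu-endpoint-2021.09.20 prod-store-2021.09.20 service-trace-2021.09.20'
VAR2='prod-store-2021.09.20 eu-endpoint-2021.09.20 service-trace-2021.09.20'
if same_words "$VAR1" "$VAR2"; then echo 'Equal'; else echo 'Not equal'; fi
```

Running the check in both directions is what makes a missing or an extra word on either side count as a difference.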
I'd just sort them then compare the result, e.g.:
$ VAR1='eu-endpoint-2021.09.20 prod-store-2021.09.20 service-trace-2021.09.20'
$ VAR2='prod-store-2021.09.20 eu-endpoint-2021.09.20 service-trace-2021.09.20'
$ if [[ $(tr ' ' '\n' <<<"$VAR1" | sort) = $(tr ' ' '\n' <<<"$VAR2" | sort) ]]; then echo same; else echo diff; fi
same
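The one-liner above can be wrapped in a function for reuse. The sketch below is a variation under a few assumptions: it splits with read -ra so that repeated spaces are tolerated, and it quotes the right-hand side of the comparison so that it is compared literally rather than as a glob pattern. The names sorted_words and same_word_list are hypothetical.

```bash
#!/bin/bash
# sorted_words STRING: print the words of STRING one per line, sorted.
sorted_words() {
    local -a words
    read -ra words <<< "$1"
    printf '%s\n' "${words[@]}" | sort
}

# same_word_list A B: succeed if A and B hold the same words, counting repeats.
same_word_list() {
    [[ $(sorted_words "$1") == "$(sorted_words "$2")" ]]
}

VAR1='eu-endpoint-2021.09.20 prod-store-2021.09.20 service-trace-2021.09.20'
VAR2='prod-store-2021.09.20 eu-endpoint-2021.09.20 service-trace-2021.09.20'
if same_word_list "$VAR1" "$VAR2"; then echo same; else echo diff; fi
```

Because the sorted lists are compared as they are, a word that appears twice on one side and once on the other counts as a difference; using sort -u instead would give set semantics.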
Suppose I have a string,
a="This is a string"
and an array,
b=("This is my" "sstring")
I want to execute an if condition if any substring of a lies in b which is true because "This is" is a substring of the first element of b.
In case of two strings I know how to check if $x is a substring of $y using,
if [[ $y == *$x* ]]; then
#Something
fi
but since $x is an array of strings I don't know how to do it without having to explicitly loop through the array.
This might be all you need:
$ printf '%s\n' "${b[@]}" | grep -wFf <(tr ' ' $'\n' <<<"$a")
This is my
Otherwise - a shell is a tool to manipulate files/processes and sequence calls to tools. The guys who invented shell also invented awk for shell to call to manipulate text. What you're trying to do is manipulate text so there's a good chance you should be using awk instead of shell for whatever it is you're doing that this task is a part of.
$ printf '%s\n' "${b[@]}" |
awk -v a="$a" '
BEGIN { split(a,words) }
{ for (i in words) if (index($0,words[i])) { print; f=1; exit} }
END { exit !f }
'
This is my
The above assumes a doesn't contain any backslashes; if it can, then use this instead:
printf '%s\n' "${b[@]}" | a="$a" awk 'BEGIN{split(ENVIRON["a"],words)} ...'
If any element in b can contain newlines, then:
printf '%s\0' "${b[@]}" | a="$a" awk -v RS='\0' 'BEGIN{split(ENVIRON["a"],words)} ...'
Here is how to match the maximum number of words from string a to entries of array b:
#!/usr/bin/env bash
a="this is a string"
b=("this is my" "string")
# tokenize the words of a into an array
read -ra a_words <<<"$a"
match()
{
  # iterate over the entries of array b
  for e in "${b[@]}"; do
    # tokenize the entry's words into an array
    read -ra e_words <<<"$e"
    # initialize the counter to the smaller of the two word counts
    i=$(( ${#a_words[@]} < ${#e_words[@]} ? ${#a_words[@]} : ${#e_words[@]} ))
    # try to match a decreasing number of leading words
    while [ 0 -lt "$i" ]; do
      # return true if it matches
      [ "${e_words[*]::$i}" = "${a_words[*]::$i}" ] && return
      # decrease the number of words to match
      i=$(( i - 1 ))
    done
  done
  # reaching here means no match was found, return false
  return 1
}
if match; then
  printf %s\\n 'It matches!'
fi
You can split the $a into an array, then loop both arrays to find matches:
a="this is a string"
b=( "this is my" "string")
# Make an array by splitting $a on spaces
IFS=' ' read -ra aarr <<< "$a"
for i in "${aarr[@]}"
do
  for j in "${b[@]}"
  do
    if [[ $j == *"$i"* ]]; then
      echo "Match: $i : $j"
      break
    fi
  done
done
# Match: this : this is my
# Match: is : this is my
# Match: string : string
If you need to handle substrings in $a (e.g. this is, is my etc) then you will need to loop over the array, generating all possible substrings:
for (( length=1; length <= ${#aarr[@]}; ++length )); do
  for (( start=0; start + length <= ${#aarr[@]}; ++start )); do
    substr="${aarr[@]:start:length}"
    for j in "${b[@]}"; do
      if [[ $j == *"${substr}"* ]]; then
        echo "Match: $substr : $j"
        break
      fi
    done
  done
done
# Match: this : this is my
# Match: is : this is my
# Match: string : string
# Match: this is : this is my
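Since the question asks for a condition to use in an if, the word-by-word loop can also be wrapped in a function that returns a status instead of printing every match. A minimal sketch, assuming single words are the unit of comparison; any_word_in is a name invented for this example.

```bash
#!/bin/bash
# any_word_in STRING ELEMENT...: succeed if any whitespace-separated word of
# STRING occurs as a substring of at least one ELEMENT.
any_word_in() {
    local -a words
    local word element
    read -ra words <<< "$1"
    shift
    for word in "${words[@]}"; do
        for element in "$@"; do
            # quoting $word makes it match literally, not as a glob
            [[ $element == *"$word"* ]] && return 0
        done
    done
    return 1
}

a="this is a string"
b=("this is my" "string")
if any_word_in "$a" "${b[@]}"; then
    echo "at least one word of a occurs in b"
fi
```

The function stops at the first hit, so it does no more work than the condition needs.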
I'm working in bash and I want to remove a substring from a string. I use grep to detect the string, and that works as I want: my if conditions are true, and when I test them in other tools they select exactly the string element I want.
When it comes to removing the element from the string, I'm having difficulty.
I want to remove something like ": Series 1", where there could be different numbers (including zero-padded ones), a lower case s or extra spaces.
temp='Testing: This is a test: Series 1'
echo "A. "$temp
if echo "$temp" | grep -q -i ":[ ]*[S|s]eries[ ]*[0-9]*" && [ "$temp" != "" ]; then
title=$temp
echo "B. "$title
temp=${title//:[ ]*[S|s]eries[ ]*[0-9]*/ }
echo "C. "$temp
fi
# I trim temp for spaces here
series_title=${temp// /_}
echo "D. "$series_title
The problem is that points C and D give me:
C. Testing
D. Testing_
You can perform regex matching in bash alone, without using external tools.
Your requirement isn't entirely clear, but judging from your code, I guess the following will help.
temp='Testing: This is a test: Series 1'
# Following will do a regex match and extract necessary parts
# i.e. extract everything before `:` if the entire pattern is matched
[[ $temp =~ (.*):\ *[Ss]eries\ *[0-9]* ]] || { echo "regex match failed"; exit; }
# now you can use the extracted groups as follows
echo "${BASH_REMATCH[1]}" # Output = Testing: This is a test
As mentioned in the comments, if you need to extract parts both before and after the removed section,
temp='Testing: This is a test: Series 1 <keep this>'
[[ $temp =~ (.*):\ *[Ss]eries\ *[0-9]*\ *(.*) ]] || { echo "invalid"; exit; }
echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}" # Output = Testing: This is a test <keep this>
Keep in mind that [0-9]* also matches zero digits. If you need at least one digit, use [0-9]+ instead. The same goes for <space here>* (i.e. zero or more spaces) and the others.
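To tie this back to the script in the question, here is a rough sketch that packages the match into a function and uses [0-9]+ so that at least one digit is required. The function name strip_series and the choice to leave non-matching titles unchanged are assumptions of this example, not part of the original code.

```bash
#!/bin/bash
# strip_series TITLE: print TITLE with a ": Series <digits>" part removed.
strip_series() {
    local title=$1
    # [0-9]+ requires at least one digit, unlike [0-9]*
    local re='^(.*):[[:space:]]*[Ss]eries[[:space:]]*[0-9]+[[:space:]]*(.*)$'
    if [[ $title =~ $re ]]; then
        local before="${BASH_REMATCH[1]}" after="${BASH_REMATCH[2]}"
        # join the two parts with a space only if something follows
        printf '%s\n' "${before}${after:+ $after}"
    else
        printf '%s\n' "$title"   # no series marker: leave the title alone
    fi
}

temp='Testing: This is a test: Series 1'
temp=$(strip_series "$temp")
series_title=${temp// /_}
echo "D. $series_title"
```

The original ${title//pattern/ } attempt removed too much because patterns in parameter expansion are globs, not regular expressions: there, * matches any run of characters rather than repeating the preceding item.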
I have a variable like below:
Variable1="PanicA0 PanicA1"
Variable2="PanicA0"
I have to compare Variable1 and Variable2 and echo PanicA1, i.e. the word that is not in Variable2. How can I achieve this using a shell script?
Step 1: split the first variable into an array of words
Step 2: iterate over the array
Step 3: use pattern matching inside double brackets
Variable1="PanicA0 PanicA1"
variable2="PanicA0"
varArr=($Variable1)
for word in "${varArr[@]}"
do
  [[ $variable2 == *${word}* ]] || echo "$word is not in variable2"
done
This is the script:
varone="PanicA0 PanicA1"
vartwo="PanicA0"
for i in $varone; do
  # remember whether $i was seen anywhere in vartwo
  found=0
  for j in $vartwo; do
    if [[ $i = "$j" ]]; then
      found=1
      break
    fi
  done
  if (( found )); then
    echo "Matched: $i"
  else
    echo "$i is not in vartwo"
  fi
done
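The comparison can also be wrapped in a function that matches whole words only, so that PanicA would not be reported as present merely because PanicA0 is. A sketch, assuming the words are separated by single spaces and contain no glob characters; words_missing_from is a hypothetical name.

```bash
#!/bin/bash
# words_missing_from A B: print each word of A that is not a whole word of B.
words_missing_from() {
    local haystack=" $2 " word
    # $1 is left unquoted on purpose so that it splits into words
    for word in $1; do
        [[ $haystack == *" $word "* ]] || printf '%s\n' "$word"
    done
}

Variable1="PanicA0 PanicA1"
Variable2="PanicA0"
words_missing_from "$Variable1" "$Variable2"
```

Padding both the haystack and the word with spaces is what turns the substring test into a whole-word test.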
Is there a way of checking if a string exists in an array of strings - without iterating through the array?
For example, given the script below, how can I correctly implement it to test whether the value stored in the variable $test exists in $array?
array=('hello' 'world' 'my' 'name' 'is' 'perseus')
#pseudo code
$test='henry'
if [$array[$test]]
then
do something
else
something else
fi
Note
I am using bash 4.1.5
With bash 4, the closest thing you can do is use associative arrays.
declare -A map
for name in hello world my name is perseus; do
map["$name"]=1
done
...which does the exact same thing as:
declare -A map=( [hello]=1 [world]=1 [my]=1 [name]=1 [is]=1 [perseus]=1 )
...followed by:
tgt=henry
if [[ ${map["$tgt"]} ]] ; then
: found
fi
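If the values already live in an indexed array, as in the question, the map can be filled from it once and queried afterwards. A minimal sketch, assuming bash 4.0 or later and non-empty elements; the names seen and in_array are invented for this example.

```bash
#!/bin/bash
array=('hello' 'world' 'my' 'name' 'is' 'perseus')

# Build a lookup table once; this loop is the only explicit iteration.
declare -A seen
for element in "${array[@]}"; do
    seen["$element"]=1
done

# in_array VALUE: succeed if VALUE is one of the elements of array.
in_array() {
    # an empty key is rejected up front, since it is not a valid subscript
    [[ -n $1 && -n ${seen["$1"]:-} ]]
}

test='henry'
if in_array "$test"; then
    echo "found"
else
    echo "not found"
fi
```

After the table is built, each lookup is a single expansion, which pays off when many values have to be checked against the same array.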
There will always technically be iteration, but it can be relegated to the shell's underlying array code. Shell expansions offer an abstraction that hides the implementation details and avoids the need for an explicit loop within the shell script.
Handling word boundaries for this use case is easier with fgrep, which has a built-in facility for handling whole-word fixed strings. The regular expression match is harder to get right, but the example below works with the provided corpus.
External Grep Process
array=('hello' 'world' 'my' 'name' 'is' 'perseus')
word="world"
if echo "${array[@]}" | fgrep --word-regexp "$word"; then
: # do something
fi
Bash Regular Expression Test
array=('hello' 'world' 'my' 'name' 'is' 'perseus')
word="world"
if [[ "${array[*]}" =~ (^|[^[:alpha:]])$word([^[:alpha:]]|$) ]]; then
: # do something
fi
You can use an associative array since you're using Bash 4.
declare -A array=([hello]= [world]= [my]= [name]= [is]= [perseus]=)
test='henry'
if [[ ${array[$test]-X} == ${array[$test]} ]]
then
do something
else
something else
fi
The parameter expansion substitutes an "X" if the array element is unset (but doesn't if it's null). By doing that and checking to see if the result is different from the original value, we can tell if the key exists regardless of its value.
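The same existence test can be written with the + form of parameter expansion, which yields its word only when the element is set, even if it is set to an empty value. A small sketch of that variant; has_key is a hypothetical helper name, and bash 4.0 or later is assumed for the associative array.

```bash
#!/bin/bash
declare -A array=([hello]= [world]= [my]= [name]= [is]= [perseus]=)

# has_key KEY: succeed if KEY exists in array, whatever its value.
has_key() {
    # ${array[KEY]+isset} expands to "isset" only if the element is set
    [[ -n ${array[$1]+isset} ]]
}

for test in henry perseus; do
    if has_key "$test"; then
        echo "$test: found"
    else
        echo "$test: not found"
    fi
done
```

This avoids comparing two expansions against each other and reads as a direct question: is the key set or not.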
array=('hello' 'world' 'my' 'name' 'is' 'perseus')
regex="^($(IFS=\|; echo "${array[*]}"))$"
test='henry'
[[ $test =~ $regex ]] && echo "found" || echo "not found"
Reading your post, I take it that you don't just want to know if a string exists in an array (as the title would suggest) but whether that string actually corresponds to an element of that array. If this is the case, please read on.
I found a way that seems to work fine.
It is useful if you're stuck with bash 3.2 like I am (but it is also tested and working in bash 4.2):
array=('hello' 'world' 'my' 'name' 'is' 'perseus')
IFS=: # We set IFS to a character we are confident our
# elements won't contain (colon in this case)
test=:henry: # We wrap the pattern in the same character
# Then we test it:
# Note the array in the test is double quoted, * is used (@ is not good here) AND
# it's wrapped in the boundary character I set IFS to earlier:
[[ ":${array[*]}:" =~ $test ]] && echo "found! :)" || echo "not found :("
not found :( # Great! this is the expected result
test=:perseus: # We do the same for an element that exists
[[ ":${array[*]}:" =~ $test ]] && echo "found! :)" || echo "not found :("
found! :) # Great! this is the expected result
array[5]="perseus smith" # For another test we change the element to an
# element with spaces, containing the original pattern.
test=:perseus:
[[ ":${array[*]}:" =~ $test ]] && echo "found!" || echo "not found :("
not found :( # Great! this is the expected result
unset IFS # Remember to unset IFS to revert it to its default value
Let me explain this:
This workaround is based on the principle that "${array[*]}" (note the double quotes and the asterisk) expands to the list of elements of array separated by the first character of IFS.
Therefore we have to set IFS to whatever we want to use as boundary (a colon in my case):
IFS=:
Then we wrap the element we are looking for in the same character:
test=:henry:
And finally we look for it in the array. Take note of the rules I followed to do the test (they are all mandatory): the array is double quoted, * is used (@ is not good) AND it's wrapped in the boundary character I set IFS to earlier:
[[ ":${array[*]}:" =~ $test ]] && echo found || echo "not found :("
not found :(
If we look for an element that exists:
test=:perseus:
[[ ":${array[*]}:" =~ $test ]] && echo "found! :)" || echo "not found :("
found! :)
For another test we can change the last element 'perseus' for 'perseus smith' (element with spaces), just to check if it's a match (which shouldn't be):
array[5]="perseus smith"
test=:perseus:
[[ ":${array[*]}:" =~ $test ]] && echo "found!" || echo "not found :("
not found :(
Great! This is the expected result since "perseus" by itself is not an element anymore.
Important!: Remember to unset IFS to revert it to its default value (unset) once you're done with the tests:
unset IFS
So far this method seems to work; you just have to be careful to choose a character for IFS that you are sure your elements won't contain.
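The need to set and later unset IFS can be avoided by doing the test inside a function with a local IFS. The following is a sketch of that idea, not the exact method above: it swaps the =~ regex match for a == glob match with a quoted needle, so the needle is always compared literally. The name contains_element is made up, and a colon is still assumed never to occur in the elements.

```bash
#!/bin/bash
# contains_element NEEDLE ELEMENT...: succeed if NEEDLE equals one ELEMENT.
# Assumes no element contains the separator (a colon here).
contains_element() {
    local needle=$1
    shift
    local IFS=:          # local, so the caller's IFS is left untouched
    # "$*" joins the remaining arguments with the first character of IFS
    [[ ":$*:" == *":$needle:"* ]]
}

array=('hello' 'world' 'my' 'name' 'is' 'perseus')
contains_element henry "${array[@]}" && echo "found! :)" || echo "not found :("
contains_element perseus "${array[@]}" && echo "found! :)" || echo "not found :("
```

Because IFS is declared local, it reverts automatically when the function returns.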
Hope it helps anyone!
Regards,
Fred
In most cases, the following would work. It certainly has restrictions and limitations, but it is easy to read and understand.
if [ "$(echo " ${array[@]} " | grep " $test ")" == "" ]; then
echo notFound
else
echo found
fi
Instead of iterating over the array elements it is possible to use parameter expansion to delete the specified string as an array item (for further information and examples see Messing with arrays in bash and Modify every element of a Bash array without looping).
(
set -f
export IFS=""
#test='henry'
test='perseus'
array1=('hello' 'world' 'my' 'name' 'is' 'perseus')
#array1=('hello' 'world' 'my' 'name' 'is' 'perseusXXX' 'XXXperseus')
# removes empty string as array item due to IFS=""
array2=( ${array1[@]/#${test}/} )
n1=${#array1[@]}
n2=${#array2[@]}
echo "number of array1 items: ${n1}"
echo "number of array2 items: ${n2}"
echo "indices of array1: ${!array1[*]}"
echo "indices of array2: ${!array2[*]}"
echo 'array2:'
for ((i=0; i < ${#array2[@]}; i++)); do
echo "${i}: '${array2[${i}]}'"
done
if [[ $n1 -ne $n2 ]]; then
echo "${test} is in array at least once! "
else
echo "${test} is NOT in array! "
fi
)
q=( 1 2 3 )
[ "${q[*]/1/}" = "${q[*]}" ] && echo not in array || echo in array
#in array
[ "${q[*]/7/}" = "${q[*]}" ] && echo not in array || echo in array
#not in array
#!/bin/bash
test="name"
array=('hello' 'world' 'my' 'yourname' 'name' 'is' 'perseus')
nelem=${#array[@]}
[[ "${array[0]} " =~ "$test " ]] ||
[[ "${array[@]:1:$((nelem-1))}" =~ " $test " ]] ||
[[ " ${array[$((nelem-1))]}" =~ " $test" ]] &&
echo "found $test" || echo "$test not found"
Just treat the expanded array as a string and check for a substring, but to isolate the first and last element to ensure they are not matched as part of a lesser-included substring, they must be tested separately.
if ! grep -q "$item" <<< "$itemlist" ; then .....
Should work fine.
For simple use cases I use something like this:
array=( 'hello' 'world' 'I' 'am' 'Joe' )
word=$1
[[ " ${array[*]} " =~ " $word " ]] && echo "$word is in array!"
Note the spaces around ". This works as long as there are no spaces in the array values and the input doesn't match more values at once, like word='hello world'. If there are, you'd have to play with $IFS on top of that.
What I want is pretty simple. Given a string 'this is my test string,' I want to return the substring from the 2nd position to the 2nd to last position. Something like:
substring 'this is my test string' 1,-1. I know I can get stuff from the beginning of the string using cut, but I'm not sure how I can calculate easily from the end of the string. Help?
Turns out I can do this with awk pretty easily as follows:
echo 'this is my test string' | awk '{ print substr( $0, 2, length($0)-2 ) }'
This would be cleaner in awk, Python, Perl, etc., but here's one way to do it in bash:
#!/usr/bin/bash
msg="this is my test string"
start=2
len=$((${#msg} - ${start} - 2))
echo $len
echo ${msg:2:$len}
This prints the computed length (18) and then: is is my test stri
You can do this with just pure bash
$ string="i.am.a.stupid.fool.are.you?"
$ echo ${string: 2:$((${#string}-4))}
am.a.stupid.fool.are.yo
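Applied to the string from the question, the same idea can drop just the first and the last character. A minimal sketch; strip_ends is a hypothetical name, and strings of two characters or fewer are assumed to yield no output.

```bash
#!/bin/bash
# strip_ends STRING: print STRING without its first and last character.
strip_ends() {
    local s=$1
    (( ${#s} > 2 )) || return 0     # nothing would be left to print
    # offset 1 skips the first character; the length leaves out the last one
    printf '%s\n' "${s:1:${#s}-2}"
}

strip_ends 'this is my test string'
```

The length guard keeps the computed length from going negative, which older bash versions reject and newer ones interpret as an offset from the end of the string.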
Look ma, no global variables or forks (except for the obvious printf) and thoroughly tested:
substring()
{
# Extract substring with positive or negative indexes
# @param $1: String
# @param $2: Start (default start of string)
# @param $3: Length (default until end of string)
local -i strlen="${#1}"
local -i start="${2-0}"
local -i length="${3-${#1}}"
if [[ "$start" -lt 0 ]]
then
let start+=$strlen
fi
if [[ "$length" -lt 0 ]]
then
let length+=$strlen
let length-=$start
fi
if [[ "$length" -lt 0 ]]
then
return
fi
printf %s "${1:$start:$length}"
}