Comparison of 2 string variables in shell script - bash

Consider there is a variable line and variable word:
line = 1234 abc xyz 5678
word = 1234
The values of these variables are read from 2 different files.
I want to print the line if it contains the word. How do I do this using shell script? I tried all the suggested solutions given in previous questions. For example, the following code always passed even if the word was not in the line.
if [ "$line"==*"$word"*]; then
echo $line
fi

No need for an if statement; just use grep:
echo "$line" | grep "\b$word\b"

You can use if [[ "$line" == *"$word"* ]]
Also, you need to assign the variables like this (no spaces around the equal sign):
line="1234 abc xyz 5678"
word="1234"
Working example -- http://ideone.com/drLidd
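As a minimal sketch of the [[ ... ]] form above, the check can be wrapped in a small function. The name contains is hypothetical and not part of the original answer. Quoting "$2" inside the pattern makes the word match literally, even if it holds glob characters such as *.

```bash
#!/usr/bin/env bash
# contains LINE WORD: succeed if WORD occurs anywhere in LINE.
# (contains is a hypothetical helper name used only for illustration.)
contains() {
    # The unquoted * are wildcards; the quoted "$2" is matched literally.
    [[ $1 == *"$2"* ]]
}

line="1234 abc xyz 5678"
word="1234"
if contains "$line" "$word"; then
    echo "$line"
fi
```

Because the function only returns an exit status, it can be used directly in if, while, or with && and ||.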

Watch the white spaces!
When you set a variable to a value, don't put white space around the equal sign. Also use quotes when your value has spaces in it:
line="1234 abc xyz 5678" # Must have quotation marks
word=1234 # Quotation marks are optional
When you use comparisons, you must leave white space around the brackets and the comparison sign:
if [[ $line == *$word* ]]; then
echo $line
fi
Note the double square brackets. If you are doing pattern matching, you must use double square brackets, not single square brackets. Inside double square brackets, == or = performs a pattern match against the right-hand side. If you use single square brackets:
if [ "$line" = *"$word"* ]
you're doing a plain equality test. Note that variables inside double square brackets don't need quotation marks, while inside single brackets they are required in most situations.

echo "$line" | grep "$word"
would be the typical way to do this in a script, although it does cost a new process.
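A hedged sketch of the grep approach that uses only the exit status and treats the word as a fixed string rather than a regular expression. The function name has_word is hypothetical. The -w (whole word) option is assumed to be available; it is supported by GNU and BSD grep but is not required by POSIX.

```bash
#!/usr/bin/env bash
# has_word LINE WORD: succeed if WORD appears in LINE as a whole word.
# (has_word is a hypothetical helper name.)
has_word() {
    # -q: print nothing, -w: match whole words only, -F: WORD is a fixed string
    printf '%s\n' "$1" | grep -qwF -- "$2"
}

line="1234 abc xyz 5678"
word="1234"
if has_word "$line" "$word"; then
    echo "$line"
fi
```

With -w, the word 1234 does not match a line that only contains 12345; drop -w if plain substring matching is wanted.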

You can use the bash match operator =~:
[[ "$line" =~ "$word" ]] && echo "$line"
Don't forget the quotes, as stated in previous answers (especially the one from @Bill).

The reason that if [ "$line"==*"$word"* ] does not work as you expect is perhaps a bit obscure. Assuming that no files exist that cause the glob to expand, then you are merely testing that the string 1234 abc xyz 5678==*1234* is non empty. Clearly, that is not an empty string, so the condition is always true. You need to put whitespace around your == operator, but that will not work because you are now testing if the string 1234 abc xyz 5678 is the same as the string to which the glob *1234* expands, so it will be true only if a file named 1234 abc xyz 5678 exists in the current working directory of the process executing the shell script. There are shell extensions that allow this sort of comparison, but grep works well, or you can use a case statement:
case "$line" in
*"$word"*) echo "$line";;
esac
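The explanation above can be sketched as follows. The first test shows that, without spaces around ==, [ receives one non-empty argument and is therefore true even when the two values differ. The second part wraps the case statement in a function; the name contains_posix is hypothetical.

```bash
#!/usr/bin/env bash
line="1234 abc xyz 5678"
word="nomatch"

# No spaces around ==, so [ sees the single string "1234 abc xyz 5678==nomatch".
# A one-argument test is true whenever that argument is non-empty.
if [ "$line"=="$word" ]; then
    echo "one-argument test: true"
fi

# case performs pattern matching and needs no bash extensions.
# (contains_posix is a hypothetical helper name.)
contains_posix() {
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

contains_posix "$line" "$word" || echo "case: no match"
contains_posix "$line" "1234" && echo "case: match"
```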

An alternative solution would be to use a loop:
for w in $line
do
if [ "$w" = "$word" ]; then
echo "$line"
break
fi
done

Code Snippet:
a='text'   # no $ on the left-hand side of an assignment
b='text'
if [ "$a" = "$b" ]   # = compares strings; -eq is for integers only
then
msg='equal'
fi
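As a sketch of the distinction between string and integer comparison inside [ ]: the = operator compares strings, while -eq compares integers. The values 07 and 7 below are made up for illustration.

```bash
#!/usr/bin/env bash
a='text'
b='text'
msg='different'

# String comparison: use = and quote both sides.
if [ "$a" = "$b" ]; then
    msg='equal'
fi
echo "$msg"

# Integer comparison: -eq treats both operands as decimal numbers.
x=07
y=7
if [ "$x" -eq "$y" ]; then
    echo "numerically equal"
fi
if [ "$x" != "$y" ]; then
    echo "different as strings"
fi
```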

Related

substring extraction in bash

iamnewbie: This code is inefficient, but it should extract the substring. The problem is with the last echo statement; I need some insight.
function regex {
#this function gives the regular expression needed
echo -n \'
for (( i = 1 ; i <= $1 ; i++ ))
do
echo -n .
done
echo -n '\('
for (( i = 1 ; i <= $2 ; i++ ))
do
echo -n .
done
echo -n '\)'
echo -n \'
}
# regex function ends
echo "Enter the string:"
read stg
#variable stg holds the string entered
if [ -z "$stg" ] ; then
echo "Null string"
exit
else
echo "Length of the $stg is:"
z=`expr "$stg" : '.*' `
#variable z holds the length of given string
echo $z
fi
echo "Enter the number of trailing characters to be extracted from $stg:"
read n
m=`expr $z - $n `
#variable m holds an integer value which is equal to total length - length of characters to be extracted
x=$(regex $m $n)
echo ` expr "$stg" : "$x" `
#the echo statement(above) is just printing a newline!! But not the result
What I intend to do with this code is, if I enter "racecar" and give "3", it should display "car", which are the last three characters. Instead of displaying "car" it's just printing a newline. Please correct this code rather than giving a better one.
Although you didn't ask for a better solution, it's worth mentioning:
$ n=3
$ stg=racecar
$ echo "${stg: -n}"
car
Note that the space after the : in ${stg: -n} is required. Without the space, the parameter expansion is a default-value expansion rather than a substring expansion. With the space, it's a substring expansion; -n is interpreted as an arithmetic expression (which means that n is interpreted as $n) and since the result is a negative number, it specifies the number of characters from the end to start the substring. See the Bash manual for details.
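A minimal sketch that wraps the substring expansion in a function. The name last_chars is hypothetical, and the function assumes N is a non-negative integer. Returning the whole string when N exceeds the length is a choice made for this sketch; a bare ${stg: -n} expands to an empty string in that case.

```bash
#!/usr/bin/env bash
# last_chars STRING N: print the last N characters of STRING.
# (last_chars is a hypothetical helper; N is assumed to be a non-negative integer.)
last_chars() {
    local stg=$1 n=$2
    if (( n >= ${#stg} )); then
        printf '%s\n' "$stg"         # asked for at least the full length
    elif (( n == 0 )); then
        printf '\n'                  # ${stg: -0} would give the whole string
    else
        printf '%s\n' "${stg: -n}"   # the space before -n is required
    fi
}

last_chars racecar 3
```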
Your solution is based on evaluating the equivalent of:
expr "$stg" : '......\(...\)'
with an appropriate number of dots. It's important to understand what the above bash syntax actually means. It invokes the command expr, passing it three arguments:
arg 1: the contents of the variable stg
arg 2: :
arg 3: ......\(...\)
Note that there are no quotes visible. That's because the quotes are part of bash syntax, not part of the argument values.
If the value of stg had enough characters, the result of the above expr invocation would be to print out the 7th, 8th and 9th characters of the value of stg. Otherwise, it would print a blank line, and fail.
But that's not what you are doing. You're creating the regular expression:
'......\(...\)'
which has single quotes in it. Since single-quotes are not special characters in a regex, they match themselves; in other words, that pattern will match a string which starts with a single quote, followed by nine arbitrary characters, followed by another single quote. And if the string does match, it will print the three characters prior to the second single-quote.
Of course, since the regular expression you make has a . for every character in the target string, it won't match the target even if the target started and ended with a single-quote, since there would be too many dots in the regex to match that.
If you don't put single quotes into the regex, then your program will work, but I have to say that few times have I seen such an intensely circuitous implementation of the substring function. If you're not trying to win an obfuscated bash competition (a difficult challenge since most production bash code is obfuscated by nature), I'd suggest you use normal bash features instead of trying to do everything with regexen.
One of those is the syntax to determine the length of a string:
$ stg=racecar
$ echo ${#stg}
7
(although, as shown at the beginning, you don't actually even need that.)
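The fix described above, dropping the single quotes from the generated pattern, can be sketched like this. It keeps the question's regex function and variable names (stg, n, m) but builds the pattern in a local variable instead of echoing it piece by piece; that restructuring is an assumption of this sketch, not part of the original code.

```bash
#!/usr/bin/env bash
# regex SKIP KEEP: print a pattern of SKIP dots followed by a
# captured group of KEEP dots, with no literal quote characters.
regex() {
    local skip=$1 keep=$2 pattern="" i
    for (( i = 0; i < skip; i++ )); do pattern+=.; done
    pattern+='\('
    for (( i = 0; i < keep; i++ )); do pattern+=.; done
    pattern+='\)'
    printf '%s' "$pattern"
}

stg=racecar
n=3
m=$(( ${#stg} - n ))              # number of leading characters to skip
expr "$stg" : "$(regex "$m" "$n")"
```

The double quotes around $(regex ...) are shell syntax and never reach expr, which is why the pattern itself must not contain quote characters.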
What about:
$ n=3
$ string="racecar"
$ [[ "$string" =~ (.{$n})$ ]]
$ echo ${BASH_REMATCH[1]}
car
This looks for the last n characters at the end of the line. In a script:
#!/bin/bash
read -p "Enter a string: " string
read -p "Enter the number of characters you want from the end: " n
[[ "$string" =~ (.{$n})$ ]]
echo "These are the last $n characters: ${BASH_REMATCH[1]}"
You may want to add some more error handling, but this'll do it.
I'm not sure you need loops for this task. I wrote an example that reads two parameters from the user and cuts the word accordingly.
#!/bin/bash
read -p "Enter some word? " -e stg
#variable stg holds the string entered
if [ -z "$stg" ] ; then
echo "Null string"
exit 1
fi
read -p "Enter some number to set word length? " -e cutNumber
# check that cutNumber is a number (-eq fails on non-integers; hide its error message)
if ! [ "$cutNumber" -eq "$cutNumber" ] 2>/dev/null; then
echo "Not a number!"
exit 1
fi
echo "Cut first n characters:"
echo ${stg:$cutNumber}
echo
echo "Show first n characters:"
echo ${stg:0:$cutNumber}
echo "Alternative get last n characters:"
echo -n "$stg" | tail -c $cutNumber
echo
Example:
Enter some word? TheRaceCar
Enter some number to set word length? 7
Cut first n characters:
Car
Show first n characters:
TheRace
Alternative get last n characters:
RaceCar
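An alternative sketch for the number check in the script above, using a bash regular expression instead of the -eq comparison. The function name is_uint is hypothetical. Unlike the -eq check, it rejects negative numbers, which is usually what a character count needs.

```bash
#!/usr/bin/env bash
# is_uint VALUE: succeed only if VALUE consists of one or more digits.
# (is_uint is a hypothetical helper name.)
is_uint() {
    [[ $1 =~ ^[0-9]+$ ]]
}

stg="TheRaceCar"
cutNumber=7
if is_uint "$cutNumber"; then
    echo "${stg:0:cutNumber}"   # first cutNumber characters
    echo "${stg:cutNumber}"     # everything after them
else
    echo "Not a number!" >&2
    exit 1
fi
```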

check for string format in bash script

I am attempting to check for proper formatting at the start of a string in a bash script.
The expected format is like the below where the string must always begin with "ABCDEFG-" (exact letters and order) and the numbers would vary but be at least 3 digits. Everything after the 3rd digit is a do not care.
Expected start of string: "ABCDEFG-1234"
I am using the below code snippet.
[ $(echo "$str" | grep -E "ABCDEFG-[0-9][0-9][0-9]") ] && echo "yes"
str1 = "ABCDEFG-1234"
str2 = "ABCDEFG-1234 - Some more text"
When I use str1 in place of str everything works ok and yes is printed.
When I use str2 in place of str I get the below error
[: ABCDEFG-1234: unary operator expected
I am pretty new to working with bash scripts so any help would be appreciated.
If this is bash, you have no reason to use grep for this at all; the shell has built-in regular expression support.
re="ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, you might want your regex to be anchored if you want a match in the beginning rather than anywhere in the content:
re="^ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, this doesn't need to be an ERE at all -- a glob-style pattern match would also be adequate:
if [[ $str = ABCDEFG-[0-9][0-9][0-9]* ]]; then echo "yes"; fi
Try grep -E "ABCDEFG-[0-9][0-9][0-9].*"
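The error in the question comes from the unquoted command substitution: grep prints the whole matching line, and for str2 that line splits into several words inside [ ], which test cannot parse. A sketch that avoids capturing output at all and relies on grep's exit status instead (the function name starts_ok is hypothetical):

```bash
#!/usr/bin/env bash
# starts_ok STRING: succeed if STRING begins with ABCDEFG- and three digits.
# (starts_ok is a hypothetical helper name.)
starts_ok() {
    # ^ anchors at the start; -q prints nothing, so only the exit status matters
    printf '%s\n' "$1" | grep -Eq '^ABCDEFG-[0-9]{3}'
}

str1="ABCDEFG-1234"
str2="ABCDEFG-1234 - Some more text"
for s in "$str1" "$str2" "ABCDEFG-12"; do
    if starts_ok "$s"; then
        echo "yes: $s"
    else
        echo "no: $s"
    fi
done
```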

Search in string for multiple array values

I'm looking at a simple for loop with the following logic:
variable=`some piped string`
array_value=(1.1 2.9)
for i in ${array_value[@]}; do
if [[ "$variable" == *some_text*"$array_value" ]]; then
echo -e "Info: Found a matching string"
fi
The problem is that I cannot get this to show me when it finds either the string ending in 1.1 or 2.9 as sample data.
If I do an echo $array_value in the for loop I can see that the array values are being taken so its values are being parsed, though the if loop doesn't return that echo message although the string is present.
Later edit:
Based on the comments received I've abstracted the code to something like this, which still doesn't work if I want to use wildcards inside the comparison quote
versions=(1.1 2.9)
string="system is running version:2.9"
for i in ${versions[@]}; do
if [[ "$string" == "system*${i}" ]]; then
echo "match found"
fi
done
Any construction similar to "system* ${i}" or "* ${i}" will not work, though if I specify the full string pattern it will work.
The problem with the test construct has to do with your if statement. To construct the if statement in a form that will evaluate, use:
if [[ "$variable" == "*some_text*${i}" ]]; then
Note: *some_text* will need to be replaced with actual text without * wildcards. If the * is needed in the text, then you will need to turn globbing off to prevent expansion by the shell. If expansion is your goal, then protect the variable i by braces.
There is nothing wrong with putting *some_text* up against the variable i, but it is cleaner, depending on the length of some_text, to assign it to a variable itself. The easiest way to accommodate this would be to define a variable to hold the some_text you are needing. E.g.:
prefix="some_text"
if [[ "$variable" == "${prefix}${i}" ]]; then
If you have additional questions, just ask.
Change "system*${i}" to system*$i.
Wrapping with quotes inside [[ ... ]] nullifies the wildcard * by treating it as a literal character.
Or if you want the match to be assigned to a variable:
match="system*"
you can then do:
[[ $string == $match$i ]]
You actually don't need quotes around $string either as word splitting is not performed inside [[ ... ]].
From man bash:
[[ expression ]]
...
Word splitting and pathname expansion are not
performed on the words between the [[ and ]]
...
Any part of the pattern may be quoted to force
the quoted portion to be matched as a string.
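The quoting rule quoted from the manual can be sketched with the question's own data. The function name matching_versions is hypothetical. The system* part stays unquoted so that * acts as a wildcard, while "$v" is quoted so the version is matched as a literal string.

```bash
#!/usr/bin/env bash
# matching_versions STRING VERSION...: print each VERSION that STRING ends with,
# provided STRING also starts with "system".
# (matching_versions is a hypothetical helper name.)
matching_versions() {
    local string=$1 v
    shift
    for v in "$@"; do
        # unquoted * is a wildcard; quoted "$v" is literal
        if [[ $string == system*"$v" ]]; then
            printf '%s\n' "$v"
        fi
    done
}

versions=(1.1 2.9)
string="system is running version:2.9"
matching_versions "$string" "${versions[@]}"
```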

BASH: Everything but not slash? IF STATEMENT (STRING COMPARISON)

I'm trying to match any strings that start with /John/ but do not contain a / after /John/
if
[ $string == /John/[!/]+ ]; then ....
fi
This is what I got and it doesn't seem to be working.
So I tried
if
[[ $string =~ ^/John/[!/]+$ ]]; then ....
fi
It still didn't work, and so I changed it to
if
[[ $string =~ /John/[^/] ]]; then ....
fi
It worked, but it also matches all the strings that have a / after /John/.
For bash you want [[ $string =~ ^/John/[^/]*$ ]] -- the ^ anchors the match at the start of the string, and the end-of-line anchor ensures there are no slashes after /John/.
How about "the string starts with '/John/' and doesn't contain any slashes after '/John/'"?
[[ $string = /John/* && $string != /John/*/* ]]
Or you could compare against a parameter expansion that only expands if the conditions are met. This says "after stripping off everything including and after the last slash, the string is /John":
[[ ${string%/*} = /John ]]
In fact, this last solution is the only entirely POSIXLY_STRICT one I can come up with without multiple test expressions.
[ "${string%/*}" = /John ]
By the way, your problem is probably simply the use of double-equals inside a single-bracket test expression. bash actually does accept them inside double-bracket test expressions, but a single equals is a better idea.
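A short sketch comparing the two approaches from this answer on a few made-up paths. The function names under_john and under_john_alt are hypothetical; the sample paths are not from the original question.

```bash
#!/usr/bin/env bash
# Pattern version: starts with /John/ and has no further slash.
# (Both function names are hypothetical.)
under_john() {
    [[ $1 == /John/* && $1 != /John/*/* ]]
}

# Parameter-expansion version: strip the last slash and what follows it.
under_john_alt() {
    [ "${1%/*}" = /John ]
}

for s in /John/notes.txt /John/docs/notes.txt /Johnny/notes.txt; do
    if under_john "$s"; then a="match"; else a="no match"; fi
    if under_john_alt "$s"; then b="match"; else b="no match"; fi
    echo "$s: $a / $b"
done
```

Both versions also accept the bare string /John/, since nothing after the final slash counts as containing no further slash.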
You can also use plain old grep:
string='/John Lennon/Yoko Ono'
if echo "$string" | grep -q "/John[^/]" ; then
echo "matched"
else
echo "no match found"
fi
This only fails if /John is at the very end of the string... if that's a possibility then you can tweak to handle that case, for instance:
string='/John Lennon/Yoko Ono'
if echo "$string" | grep -qP "(/John[^/])|(/John$)" ; then
echo "matched"
else
echo "no match found"
fi
Not sure what language you're using, but normal negative character classes are prefixed with a ^
e.g.
[^/]
You can also put in start/end qualifiers (clojure example, so Java's regex engine). Usually ^ at beginning and $ at end.
user => (re-matches #"^/[a-zA-Z]+[^/]$" "/John/")
nil

case insensitive string comparison in bash

The following line removes the leading text before the variable $PRECEDING
temp2=${content#$PRECEDING}
But now I want $PRECEDING to be matched case-insensitively. This works with sed's I flag, but I can't figure out the whole command.
No need to call out to sed or use shopt. The easiest and quickest way to do this (as long as you have Bash 4):
if [ "${var1,,}" = "${var2,,}" ]; then
echo "matched"
fi
All you're doing there is converting both strings to lowercase and comparing the results.
Here's a way to do it with sed:
temp2=$(sed -e "s/^.*$PRECEDING//I" <<< "$content")
Explanation:
^.*$PRECEDING: ^ means start of string, . means any character, .* means any character zero or more times. So together this means "match any pattern from start of string that is followed by (and including) the string stored in $PRECEDING".
The I part means case-insensitive, the g part (if you use it) means "match all occurrences" instead of just the 1st.
The <<< notation is for herestrings, so you save an echo.
The only bash way I can think of is to check if there's a match (case-insensitively) and if yes, exclude the appropriate number of characters from the beginning of $content:
content=foo_bar_baz
PRECEDING=FOO
shopt -s nocasematch
[[ $content == ${PRECEDING}* ]] && temp2=${content:${#PRECEDING}}
echo $temp2
Outputs: _bar_baz
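A sketch of the same idea wrapped in a function, so that nocasematch does not stay enabled for the rest of the script. The function name strip_prefix_ci is hypothetical. The body runs in a subshell (parentheses instead of braces), which is one simple way to keep the shopt change local; the prefix is quoted so it is compared literally rather than as a pattern.

```bash
#!/usr/bin/env bash
# strip_prefix_ci CONTENT PREFIX: print CONTENT with a leading PREFIX removed,
# comparing case-insensitively. Prints CONTENT unchanged if there is no match.
# (strip_prefix_ci is a hypothetical helper name.)
strip_prefix_ci() (
    content=$1
    prefix=$2
    shopt -s nocasematch            # only affects this subshell
    if [[ $content == "$prefix"* ]]; then
        printf '%s\n' "${content:${#prefix}}"
    else
        printf '%s\n' "$content"
    fi
)

content=foo_bar_baz
PRECEDING=FOO
temp2=$(strip_prefix_ci "$content" "$PRECEDING")
echo "$temp2"
```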
Your examples rely on context switching (calling out to external tools).
A better way (Bash v4):
VAR1="HELLoWORLD"
VAR2="hellOwOrld"
if [[ "${VAR1^^}" = "${VAR2^^}" ]]; then
echo MATCH
fi
link: Converting string from uppercase to lowercase in Bash
If you don't have Bash 4, I find the easiest way is to first convert your strings to lowercase using tr:
VAR1=HelloWorld
VAR2=helloworld
VAR1_LOWER=$(echo "$VAR1" | tr '[:upper:]' '[:lower:]')
VAR2_LOWER=$(echo "$VAR2" | tr '[:upper:]' '[:lower:]')
if [ "$VAR1_LOWER" = "$VAR2_LOWER" ]; then
echo "Match"
else
echo "Invalid"
fi
This also makes it really easy to assign your output to variables by changing your echo to OUTPUT="Match" & OUTPUT="Invalid"

Resources