check for string format in bash script - bash

I am attempting to check for proper formatting at the start of a string in a bash script.
The expected format is shown below: the string must always begin with "ABCDEFG-" (exact letters and order), followed by at least 3 digits, which may vary. Everything after the third digit is a don't-care.
Expected start of string: "ABCDEFG-1234"
I am using the below code snippet.
[ $(echo "$str" | grep -E "ABCDEFG-[0-9][0-9][0-9]") ] && echo "yes"
str1="ABCDEFG-1234"
str2="ABCDEFG-1234 - Some more text"
When I use str1 in place of str, everything works and yes is printed.
When I use str2 in place of str, I get the error below:
[: ABCDEFG-1234: unary operator expected
I am pretty new to working with bash scripts so any help would be appreciated.

If this is bash, you have no reason to use grep for this at all; the shell has built-in regular expression support.
re="ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, you might want your regex to be anchored if you want a match in the beginning rather than anywhere in the content:
re="^ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, this doesn't need to be an ERE at all -- a glob-style pattern match would also be adequate:
if [[ $str = ABCDEFG-[0-9][0-9][0-9]* ]]; then echo "yes"; fi
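For context, the original test fails because the unquoted $(...) is split into several words when the string contains spaces, so [ receives more arguments than it can parse. A minimal sketch of the glob check wrapped in a function follows; the name has_valid_prefix and the two extra sample strings are made up for illustration and are not part of the question.

```shell
#!/usr/bin/env bash
# has_valid_prefix is a hypothetical helper name, not from the original post.
# It succeeds when its argument starts with "ABCDEFG-" followed by three digits.
has_valid_prefix() {
  # Inside [[ ]] no word splitting happens, so $1 may contain spaces.
  [[ $1 == ABCDEFG-[0-9][0-9][0-9]* ]]
}

for s in "ABCDEFG-1234" "ABCDEFG-1234 - Some more text" "ABCDEFG-12" "xABCDEFG-1234"; do
  if has_valid_prefix "$s"; then
    echo "yes: $s"
  else
    echo "no: $s"
  fi
done
```

Because the glob has no leading *, the match is implicitly anchored at the start of the string.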

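Try grep -q, so the match is reported through the exit status rather than as output that [ has to parse:
echo "$str" | grep -qE "ABCDEFG-[0-9][0-9][0-9]" && echo "yes"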

Related

In Bash, is it possible to match a string variable containing wildcards to another string

I am trying to compare strings against a list of other strings read from a file.
However some of the strings in the file contain wildcard characters (both ? and *) which need to be taken into account when matching.
I am probably missing something but I am unable to see how to do it
Eg.
I have strings from the file in an array, which could be anything alphanumeric (and include commas and full stops) with wildcards: (a?cd, *x*y, q?h*z, j*,h-??)
and I have another string I wish to compare with each item in the list in turn. Any of the strings may contain spaces.
so what I want is something like
teststring="abcdx.rubb ish,y"
matchstrings=("a?cd" "*x*y" "q?h*z" "j*,h-??")
for i in "${matchstrings[@]}" ; do
if [[ "$i" == "$teststring" ]]; then # this test here is the problem
<do something>
else
<do something else>
fi
done
This should match on the second "matchstring" but not any others
Any help appreciated
Yes; you just have the two operands to == reversed; the glob goes on the right (and must not be quoted):
if [[ $teststring == $i ]]; then
Example:
$ i=f*
$ [[ foo == $i ]] && echo pattern match
pattern match
If you quote the parameter expansion, the operation is treated as a literal string comparison, not a pattern match.
$ [[ foo == "$i" ]] || echo "foo != f*"
foo != f*
Spaces in the pattern are not a problem:
$ i="foo b*"
$ [[ "foo bar" == $i ]] && echo pattern match
pattern match
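Putting this together with the loop from the question, a sketch might look like the following. The array name matched and the loop variable pattern are introduced here for illustration only.

```shell
#!/usr/bin/env bash
teststring="abcdx.rubb ish,y"
matchstrings=("a?cd" "*x*y" "q?h*z" "j*,h-??")

# Collect every pattern that matches the test string.
matched=()
for pattern in "${matchstrings[@]}"; do
  # The right-hand side is deliberately unquoted so it is treated as a glob.
  if [[ $teststring == $pattern ]]; then
    matched+=("$pattern")
  fi
done

printf '%s\n' "${matched[@]}"
```

Only the second pattern should end up in matched, since the test string contains an x and ends in y.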
You can do this even completely within POSIX, since case patterns undergo parameter expansion:
#!/bin/sh
teststring="abcdx.rubbish,y"
while IFS= read -r matchstring; do
case $teststring in
($matchstring) echo "$matchstring";;
esac
done << "EOF"
a?cd
*x*y
q?h*z
j*,h-??
EOF
This outputs only *x*y as desired.

how to check whether a string starts with xx and ends with yy in shellscript?

In the below example I want to find whether the sentence starts with 'ap' and ends with 'e'.
example: a="apple"
if [[ "$a" == ^"ap"+$ ]]
This is not giving proper output.
You don't mention which shell you're using, but the [[ in your attempt suggests you're using one that expands upon the base POSIX sh language. The following works with at least bash, zsh and ksh93:
$ a=apple
$ [[ $a == ap*e ]] && echo matches # Wildcard pattern
matches
$ [[ $a =~ ^ap.*e$ ]] && echo matches # Regular expression - note the =~
matches
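If the script has to run under a plain POSIX sh, where [[ is not available, a case statement can perform the same wildcard match. This is only a sketch; the function name starts_and_ends and its argument order are assumptions made for the example.

```shell
#!/bin/sh
# starts_and_ends is a hypothetical helper: $1 = string, $2 = prefix, $3 = suffix.
starts_and_ends() {
  case $1 in
    # Quoted parts are matched literally; the unquoted * matches anything.
    "$2"*"$3") return 0 ;;
    *) return 1 ;;
  esac
}

a="apple"
if starts_and_ends "$a" "ap" "e"; then
  echo "matches"
fi
```

Note that the prefix and suffix may not overlap in this form, so a string must be at least as long as both together.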

Comparing strings case-insensitively

I want to compare Hello World to hello world. The result should be true, as if they were equal. I'm doing:
while read line; do
newLine="$newLine$line"
done < $1
newp="Hello World"
if (( ${newp,,}==${newLine,,} )); then
echo "true"
else
echo "false"
fi
when I pass a text file consisting of:
#filename: file.txt
hello world
The output seems to be:
./testScript.txt: line 20: 0à»: hello world==hello world : syntax error in expression (error token is "world==hello world ")
+ echo false
What am I doing wrong here? Also, a bit unrelated, is there any way to pass the line that is in file.txt to a string(newLine) without doing that while I have done?
You should add quotes and change the double parentheses to single brackets, since (( )) evaluates arithmetic expressions rather than comparing strings. The if statement should be something like:
if [ "${newp,,}" = "${newLine,,}" ]; then
And in relation to that while loop... It depends on what you want to do. If, like in this case, you want to get the entire file and save it as a single string, you could simply do:
newLine=$(cat "$1")
I would suggest you only use that loop you wrote if you are trying to parse the file line by line, i.e. adding if statements, using different variables and so on. But for a simple case like this one, cat will do just fine.
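As a rough end-to-end sketch of the two steps above: the temporary file stands in for the file.txt passed as $1, and $(< file) is bash shorthand for $(cat file). The ${var,,} expansion assumes bash 4 or later; the variable result is introduced here only for the example.

```shell
#!/usr/bin/env bash
# A temporary file plays the role of file.txt from the question.
tmpfile=$(mktemp)
printf 'hello world\n' > "$tmpfile"

# Read the whole file; command substitution strips the trailing newline.
newLine=$(< "$tmpfile")
newp="Hello World"

# Lowercase both sides, then compare them as strings.
if [[ ${newp,,} == "${newLine,,}" ]]; then
  result=true
else
  result=false
fi
echo "$result"

rm -f "$tmpfile"
```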
There is a shell option, nocasematch, that enables case insensitive pattern matching for use with [[ and case.
Comparing strings that differ by casing only:
$ var1=lowercase
$ var2=LOWERCASE
$ [[ $var1 == $var2 ]] && echo "Matches!" || echo "Doesn't match!"
Doesn't match!
Now enabling the shell option and trying again:
$ shopt -s nocasematch
$ [[ $var1 == $var2 ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
Just make sure to turn it off again with shopt -u nocasematch if you don't want to do all comparisons case insensitive.
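One way to avoid forgetting the shopt -u step is to set the option inside a subshell, so it never leaks into the rest of the script. The sketch below assumes bash; equals_ignore_case is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# The function body is wrapped in ( ) so it runs in a subshell:
# nocasematch is enabled there and disappears when the subshell exits.
equals_ignore_case() (
  shopt -s nocasematch
  # The right-hand side is quoted so it is compared literally, not as a glob.
  [[ $1 == "$2" ]]
)

if equals_ignore_case "Hello World" "hello world"; then
  echo "Matches!"
else
  echo "Doesn't match!"
fi
```

The trade-off is the cost of a subshell per comparison, which only matters in tight loops.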

BASH: Everything but not slash? IF STATEMENT (STRING COMPARISON)

I'm trying to match any string that starts with /John/ but does not contain a / after /John/
if
[ $string == /John/[!/]+ ]; then ....
fi
This is what I got and it doesn't seem to be working.
So I tried
if
[[ $string =~ ^/John/[!/]+$ ]]; then ....
fi
It still didn't work, and so I changed it to
if
[[ $string =~ /John/[^/] ]]; then ....
fi
It worked, but it also matches strings that have a / after /John/.
For bash you want [[ $string =~ ^/John/[^/]*$ ]] -- the ^ anchors the match at the start of the string, and the $ anchor ensures there are no slashes after the last acceptable slash.
How about "the string starts with '/John/' and doesn't contain any slashes after '/John/'"?
[[ $string = /John/* && $string != /John/*/* ]]
Or you could compare against a parameter expansion that only expands if the conditions are met. This says "after stripping off everything including and after the last slash, the string is /John":
[[ ${string%/*} = /John ]]
In fact, this last solution is the only entirely POSIXLY_STRICT one I can come up with without multiple test expressions.
[ "${string%/*}" = /John ]
By the way, your problem is probably simply that you are using double-equals with a pattern inside a single-bracket test expression, where no pattern matching is done. bash does accept == inside double-bracket test expressions, but a single equals is a better idea.
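To see how the two-pattern test behaves on a few inputs, here is a small sketch. The function name is_directly_under_john and the sample paths are invented for the example.

```shell
#!/usr/bin/env bash
# is_directly_under_john is a hypothetical helper name.
# It succeeds when $1 starts with /John/ and has no further slash after it.
is_directly_under_john() {
  [[ $1 == /John/* && $1 != /John/*/* ]]
}

for s in "/John/file" "/John/dir/file" "/Johnny/file" "/Paul/John/file"; do
  if is_directly_under_john "$s"; then
    echo "match: $s"
  else
    echo "no match: $s"
  fi
done
```

Note that a bare "/John/" also passes this test, because * can match the empty string; add a check such as $1 != /John/ if at least one character is required after the slash.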
You can also use plain old grep:
string='/John Lennon/Yoko Ono'
if echo "$string" | grep -q "/John[^/]" ; then
echo "matched"
else
echo "no match found"
fi
This only fails if /John is at the very end of the string... if that's a possibility then you can tweak to handle that case, for instance:
string='/John Lennon/Yoko Ono'
if echo "$string" | grep -qP "(/John[^/])|(/John$)" ; then
echo "matched"
else
echo "no match found"
fi
Not sure what language you're using, but normal negative character classes are prefixed with a ^
e.g.
[^/]
You can also put in start/end qualifiers (clojure example, so Java's regex engine). Usually ^ at beginning and $ at end.
user => (re-matches #"^/[a-zA-Z]+[^/]$" "/John/")
nil

multiline regexp matching in bash

I would like to do some multiline matching with bash's =~
#!/bin/bash
str='foo = 1 2 3
bar = what about 42?
boo = more words
'
re='bar = (.*)'
if [[ "$str" =~ $re ]]; then
echo "${BASH_REMATCH[1]}"
else
echo no match
fi
Almost there, but if I use ^ or $, it will not match, and if I don't use them, . eats newlines too.
EDIT:
sorry, values after = could be multi-word values.
I could be wrong, but after a quick read from here, especially Note 2 at the end of the page, bash can sometimes include the newline character when matching with the dot operator. Therefore, a quick solution would be:
#!/bin/bash
str='foo = 1
bar = 2
boo = 3
'
re=$'bar = ([^\n]*)'
if [[ "$str" =~ $re ]]; then
echo "${BASH_REMATCH[1]}"
else
echo no match
fi
Notice that I now ask it to match anything except newlines. Hope this helps =)
Edit: Also, if I understood correctly, ^ and $ match the start and end (respectively) of the whole string, not of each line. It would be better if someone else could confirm this, but if it is the case and you do want to match by line, you'll need to write a while loop to read each line individually.
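The while loop mentioned above could be sketched as follows. It uses the sample data from the question; the variable value is introduced here for illustration. Because each line is matched on its own, ^ and $ behave as line anchors and . can no longer run across a newline.

```shell
#!/usr/bin/env bash
str='foo = 1 2 3
bar = what about 42?
boo = more words
'
re='^bar = (.*)$'

value=
# Feed the string to the loop one line at a time via a here-string.
while IFS= read -r line; do
  if [[ $line =~ $re ]]; then
    value=${BASH_REMATCH[1]}
    break
  fi
done <<< "$str"

echo "$value"
```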
