String manipulation, optional character - bash

With grep, you can use a question mark ? to signify an optional character, that is, a character that is to be matched 0 or 1 times.
$ foo=qwerasdf
$ grep -Eo fx? <<< $foo
f
The question is does Bash String Manipulation have a similar feature? Something like
$ echo ${foo%fx?}

You're probably talking about parameter expansion. It uses shell patterns, not regular expressions, so the answer is no.
Upon further reading, I noticed that if you
shopt -s extglob
you can use extended pattern matching which can achieve something similar to regex, albeit with slightly different syntax.
Check this out:
word="mre"
# this returns true
if [[ $word == m?(o)re ]]; then echo true; else echo false; fi
word="more"
# this also returns true
if [[ $word == m?(o)re ]]; then echo true; else echo false; fi
word="mooooooooooore"
# again, true
if [[ $word == m+(o)re ]]; then echo true; else echo false; fi
It works with parameter expansion too:
word="noooooooooooo"
# outputs 'nay'
echo ${word/+(o)/ay}
# outputs 'nayooooooooooo'
echo ${word/o/ay}
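Applied to the original question, this might look like the following minimal sketch. It assumes bash with extglob enabled; foo is the variable from the question, and foo2 is a made-up second value added only for illustration.

```shell
#!/usr/bin/env bash
shopt -s extglob   # must be enabled before the patterns below are parsed

foo=qwerasdf
foo2=qwerasdfx     # hypothetical second value, to show the optional x

# ?(x) matches zero or one x, so the suffix pattern matches "f" or "fx"
stripped1=${foo%f?(x)}
stripped2=${foo2%f?(x)}
echo "$stripped1"
echo "$stripped2"
```

Both lines print qwerasd, since the trailing f is removed with or without an x after it.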

Related

Bash script with multiline variable

Here is my code
vmname="$1"
EXCEPTLIST="desktop-01|desktop-02|desktop-03|desktop-04"
if [[ $vmname != @(${EXCEPTLIST}) ]]; then
echo "${vmname}"
else
echo "Its in the exceptlist"
fi
The above code works perfectly, but my question is this: EXCEPTLIST can be a long line, say 100 server names, and it is hard to put all those names on one line. Is there any way to make the variable EXCEPTLIST a multiline variable? Something like the following:
EXCEPTLIST="desktop-01|desktop-02|desktop-03| \n
desktop-04|desktop-05|desktop-06| \n
desktop-07|desktop-08"
I am not sure, but I was thinking of possibilities.
I would also like to know the terminology for @(${}). Is this called variable expansion or something else? Can anyone point me to the documentation or explain how this works in bash?
One can declare an array if the data/string is long/large. Use IFS and printf for the format string, something like:
#!/usr/bin/env bash
exceptlist=(
desktop-01
desktop-02
desktop-03
desktop-04
desktop-05
desktop-06
)
pattern=$(IFS='|'; printf '@(%s)' "${exceptlist[*]}")
[[ "$vmname" != $pattern ]] && echo good
In that situation is there any way to make the variable EXCEPTLIST to be a multiline variable ?
With your given input/data an array is also the best option, something like:
exceptlist=(
'desktop-01|desktop-02|desktop-03'
'desktop-04|desktop-05|desktop-06'
'desktop-07|desktop-08'
)
One way to check the value of the $pattern variable is:
declare -p pattern
Output:
declare -- pattern="@(desktop-01|desktop-02|desktop-03|desktop-04|desktop-05|desktop-06)"
You also need to check whether $vmname is an empty string, since an empty string never matches the pattern and the != test is therefore always true for it.
On a side note, don't use all upper case variables for purely internal purposes.
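The empty-string check mentioned above could be sketched as follows. The three-entry list and the helper name is_allowed are made up for illustration; they are not from the question.

```shell
#!/usr/bin/env bash
shopt -s extglob

# Illustrative list; lowercase names follow the side note above.
exceptlist=(desktop-01 desktop-02 desktop-03)
pattern=$(IFS='|'; printf '@(%s)' "${exceptlist[*]}")

# Hypothetical helper: succeeds only for a non-empty name that is not in the list.
is_allowed() {
  local name=$1
  [[ -n $name && $name != $pattern ]]
}

is_allowed "desktop-09" && echo "desktop-09: good"
is_allowed "desktop-02" || echo "desktop-02: in the except list"
is_allowed ""           || echo "empty name rejected"
```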
The $(...) is called Command Substitution.
See LESS=+'/\ *Command Substitution' man bash
In addition to what was mentioned in the comments about pattern matching
See LESS=+/'(pattern-list)' man bash
See LESS=+/' *\[\[ expression' man bash
Is there any way to make the variable EXCEPTLIST to be a multiline variable ?
I see no reason to use matching. Use a bash array and just compare.
exceptlist=(
desktop-01
desktop-02
desktop-03
desktop-04
desktop-05
desktop-06
)
is_in_list() {
local i
for i in "${@:2}"; do
if [[ "$1" = "$i" ]]; then
return 0
fi
done
return 1
}
if is_in_list "$vmname" "${exceptlist[@]}"; then
echo "is in exception list ${vmname}"
fi
@(${}) - Is this called variable expansion or what? Does anyone know the documentation/explain to me about how this works in bash?
${var} is a variable expansion.
@(...) are just the characters @ ( ).
From man bash in Compound Commands:
[[ expression ]]
When the == and != operators are used, the string to the right of the operator is considered a pattern and matched according to the rules
described below under Pattern Matching, as if the extglob shell option were enabled. ...
From Pattern Matching in man bash:
@(pattern-list)
Matches one of the given patterns
The [[ command receives the @(a|b|c) string and then matches the arguments against it.
There is absolutely no need for Bash-specific patterns, arrays, or loops if you use grep to match the raw string on word boundaries.
The exception list can be multi-line; it works just as well:
#!/usr/bin/sh
exceptlist='
desktop-01|desktop-02|desktop-03|
desktop-04|desktop-05|desktop-06|
desktop-07|desktop-08'
if printf %s "$exceptlist" | grep -qwF "$1"; then
printf '%s is in the exceptlist\n' "$1"
fi
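One caveat with -w: fragments such as 01 or desktop also count as whole words, because - and | are not word characters, so they would be reported as being in the list. A sketch that avoids this, assuming the list holds one name per line instead of |-separated names, uses -x to match whole lines. The helper name in_exceptlist and the shortened list are made up for illustration.

```shell
#!/bin/sh
# Assumption: one name per line rather than a |-separated string.
exceptlist='desktop-01
desktop-02
desktop-03'

# Hypothetical helper: -F fixed string, -x whole-line match, -q quiet.
# The -n test rejects an empty argument before grep is consulted.
in_exceptlist() {
  [ -n "$1" ] && printf '%s\n' "$exceptlist" | grep -qxF -e "$1"
}

if in_exceptlist "desktop-02"; then echo "desktop-02 is in the exceptlist"; fi
if ! in_exceptlist "01"; then echo "01 is not in the exceptlist"; fi
```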
I wouldn't bother with multiple lines of text. This would be just fine:
EXCEPTLIST='desktop-01|desktop-02|desktop-03|'
EXCEPTLIST+='desktop-04|desktop-05|desktop-06|'
EXCEPTLIST+='desktop-07|desktop-08'
The @(...) construct is called an extended globbing pattern, and it is an extension of something you probably already know, wildcards:
VAR='foobar'
if [[ "$VAR" == fo?b* ]]; then
echo "Yes!"
else
echo "No!"
fi
A quick walkthrough on extended globbing examples: https://www.linuxjournal.com/content/bash-extended-globbing
#!/bin/bash
set +o posix
shopt -s extglob
vmname=$1
EXCEPTLIST=(
desktop-01 desktop-02 desktop-03
...
)
if IFS='|' eval '[[ ${vmname} == @(${EXCEPTLIST[*]}) ]]'; then
...
Here's one way to load a multiline string into a variable:
fn() {
cat <<EOF
desktop-01|desktop-02|desktop-03|
desktop-04|desktop-05|desktop-06|
desktop-07|desktop-08
EOF
}
exceptlist="$(fn)"
echo $exceptlist
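Note that the unquoted echo above prints the lines joined by spaces; quoting the variable keeps the newlines. To use such a multiline value in the original test, one option (a sketch, not the only way) is to delete the newlines with a parameter expansion before building the pattern. The helper name check is made up for illustration.

```shell
#!/usr/bin/env bash
shopt -s extglob

exceptlist='desktop-01|desktop-02|desktop-03|
desktop-04|desktop-05|desktop-06|
desktop-07|desktop-08'

# Delete every newline so the alternatives form a single line.
pattern="@(${exceptlist//$'\n'/})"

# Hypothetical helper wrapping the test from the question.
check() {
  if [[ $1 != $pattern ]]; then
    echo "$1"
  else
    echo "It's in the exceptlist"
  fi
}

check desktop-05
check desktop-42
```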
As to solving your specific problem, I can think of a variety of approaches.
Solution 1: since all the desktops share the desktop-0 prefix and differ only in the last digit, we can use brace expansion ({,} or {..}) as follows:
vmname="$1"
found=0
for d in desktop-{01..08}
do
if [[ "$vmname" == $d ]]; then
echo "It's in the exceptlist"
found=1
break
fi
done
if (( !found )); then
echo "Not found"
fi
Solution 2: sometimes it is good to keep the list as plain text that is easy to maintain. We can use a while loop and iterate through the list:
vmname="$1"
found=0
while IFS= read -r d
do
if [[ "$vmname" == $d ]]; then
echo "It's in the exceptlist"
found=1
break
fi
done <<EOF
desktop-01
desktop-02
desktop-03
desktop-04
desktop-05
desktop-06
desktop-07
desktop-08
EOF
if (( !found )); then
echo "Not found"
fi
Solution 3: we can match the servers using a regular expression:
vmname="$1"
if [[ "$vmname" =~ ^desktop-0[1-8]$ ]]; then
echo "It's in the exceptlist"
else
echo "Not found"
fi
Solution 4: we populate an array, then iterate through it:
vmname="$1"
exceptlist=()
exceptlist+=(desktop-01 desktop-02 desktop-03 desktop-04)
exceptlist+=(desktop-05 desktop-06 desktop-07 desktop-08)
found=0
for d in "${exceptlist[@]}"
do
if [[ "$vmname" == "$d" ]]; then
echo "It's in the exceptlist"
found=1
break;
fi
done
if (( !found )); then
echo "Not found"
fi

In Bash, is it possible to match a string variable containing wildcards to another string

I am trying to compare strings against a list of other strings read from a file.
However some of the strings in the file contain wildcard characters (both ? and *) which need to be taken into account when matching.
I am probably missing something but I am unable to see how to do it
Eg.
I have strings from a file in an array, which could be anything alphanumeric (and include commas and full stops) with wildcards: (a?cd, *x*y, q?h*z, j*,h-??)
and I have another string I wish to compare with each item in the list in turn. Any of the strings may contain spaces.
so what I want is something like
teststring="abcdx.rubb ish,y"
matchstrings=("a?cd" "*x*y" "q?h*z" "j*,h-??")
for i in "${matchstrings[@]}" ; do
if [[ "$i" == "$teststring" ]]; then # this test here is the problem
<do something>
else
<do something else>
fi
done
This should match on the second "matchstring" but not any others
Any help appreciated
Yes; you just have the two operands to == reversed; the glob goes on the right (and must not be quoted):
if [[ $teststring == $i ]]; then
Example:
$ i=f*
$ [[ foo == $i ]] && echo pattern match
pattern match
If you quote the parameter expansion, the operation is treated as a literal string comparison, not a pattern match.
$ [[ foo == "$i" ]] || echo "foo != f*"
foo != f*
Spaces in the pattern are not a problem:
$ i="foo b*"
$ [[ "foo bar" == $i ]] && echo pattern match
pattern match
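Put together with the loop from the question, a minimal sketch might look like this. The matched array is an illustrative addition used to collect the patterns that matched.

```shell
#!/usr/bin/env bash
teststring="abcdx.rubb ish,y"
matchstrings=("a?cd" "*x*y" "q?h*z" "j*,h-??")

matched=()   # illustrative: collects every pattern that matches
for i in "${matchstrings[@]}"; do
  # Unquoted $i on the right-hand side is treated as a glob pattern.
  if [[ $teststring == $i ]]; then
    matched+=("$i")
  fi
done
printf 'matched: %s\n' "${matched[@]}"
```

Only the second pattern matches, so this prints matched: *x*y.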
You can do this even completely within POSIX, since case alternatives undergo parameter substitution:
#!/bin/sh
teststring="abcdx.rubbish,y"
while IFS= read -r matchstring; do
case $teststring in
($matchstring) echo "$matchstring";;
esac
done << "EOF"
a?cd
*x*y
q?h*z
j*,h-??
EOF
This outputs only *x*y as desired.

check for string format in bash script

I am attempting to check for proper formatting at the start of a string in a bash script.
The expected format is like the below where the string must always begin with "ABCDEFG-" (exact letters and order) and the numbers would vary but be at least 3 digits. Everything after the 3rd digit is a do not care.
Expected start of string: "ABCDEFG-1234"
I am using the below code snippet.
[ $(echo "$str" | grep -E "ABCDEFG-[0-9][0-9][0-9]") ] && echo "yes"
str1="ABCDEFG-1234"
str2="ABCDEFG-1234 - Some more text"
When I use str1 in place of str everything works ok and yes is printed.
When I use str2 in place of str I get the below error
[: ABCDEFG-1234: unary operator expected
I am pretty new to working with bash scripts so any help would be appreciated.
If this is bash, you have no reason to use grep for this at all; the shell has built-in regular expression support.
re="ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, you might want your regex to be anchored if you want a match in the beginning rather than anywhere in the content:
re="^ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, this doesn't need to be an ERE at all -- a glob-style pattern match would also be adequate:
if [[ $str = ABCDEFG-[0-9][0-9][0-9]* ]]; then echo "yes"; fi
Try grep -E "ABCDEFG-[0-9][0-9][0-9].*"
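If you prefer to stay with grep, here is a sketch that tests grep's exit status with -q instead of capturing its output. The unquoted $(...) in the question splits into several words when the string contains spaces, which is most likely what confuses [. The helper name has_prefix is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 starts with ABCDEFG- followed by at least 3 digits.
has_prefix() {
  printf '%s\n' "$1" | grep -qE '^ABCDEFG-[0-9][0-9][0-9]'
}

str1="ABCDEFG-1234"
str2="ABCDEFG-1234 - Some more text"
has_prefix "$str1" && echo "yes"
has_prefix "$str2" && echo "yes"
```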

How do I compare two strings in if condition in bash

s="STP=20"
if [[ "$x" == *"$s"* ]]
The if condition is always false; why?
Try this: http://tldp.org/LDP/abs/html/comparison-ops.html
string comparison
=
is equal to
if [ "$a" = "$b" ]
There is a difference in testing for equality between [ ... ] and [[ ... ]].
The [ ... ] is an alias to the test command:
STRING1 = STRING2 the strings are equal
However, when using [[ ... ]]
When the == and != operators are used, the string to the right of the operator is considered a pattern and matched according to the rules described below under Pattern Matching. If the shell option nocasematch is enabled, the match is performed without regard to the case of alphabetic characters. The return value is 0 if the string matches (==) or does not match (!=) the pattern, and 1 otherwise. Any part of the pattern may be quoted to force it to be matched as a string.
The same seems to be true with just the = sign:
$ foo=bar
$ if [[ $foo = *ar ]]
> then
> echo "These patterns match"
> else
> echo "These two strings aren't equal"
> fi
These patterns match
Note the difference:
$ foo=bar
$ if [ $foo = *ar ]
> then
> echo "These patterns match"
> else
> echo "These two strings aren't equal"
> fi
These two strings aren't equal
However, there are a few traps with the [ $foo = *ar ] syntax. This is the same as:
test $foo = *ar
Which means the shell will interpolate glob expressions and variables before executing the statement. If $foo is empty, the command will become equivalent to:
test = *ar # or [ = *ar ]
Since = isn't a valid unary operator in test (its left-hand operand has vanished), you'll get an error like:
bash: [: =: unary operator expected
That is, [ saw only two arguments and expected the first to be one of the unary operators listed in the test manpage (such as -z or -n).
And, if I happen to have a file bar in my directory, the shell will replace *ar with all files that match that pattern (in this case bar), so the command will become:
[ $foo = bar ]
which IS true.
To get around the various issues with [ ... ], you should always put quotes around the parameters. This will prevent the shell from interpolating globs and will help with variables that have no values:
[ "$foo" = "*ar" ]
This will test whether the variable $foo is equal to the string *ar. It will work even if $foo is empty because the quotation marks will force an empty string comparison. The quotes around *ar will prevent the shell from interpolating the glob. This is a true equality.
Of course, it just so happens that if you use quotation marks when using [[ ... ]], you'll force a string match too:
foo=bar
if [[ $foo == "*ar" ]]
then
echo "This is a pattern match"
else
echo "These strings don't match"
fi
So, in the end, if you want to test for string equality, you can use either [ ... ] or [[ ... ]], but you must quote your parameters. If you want to do glob pattern matching, you must leave off the quotes, and use [[ ... ]].
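The three cases can be compared side by side in a short sketch. The variable names pat, r1, r2 and r3 are made up for illustration.

```shell
#!/usr/bin/env bash
foo=bar
pat='*ar'

[[ $foo == $pat ]]   && r1=match || r1="no match"   # unquoted: glob match
[[ $foo == "$pat" ]] && r2=match || r2="no match"   # quoted: literal comparison
[ "$foo" = "$pat" ]  && r3=match || r3="no match"   # [ ] with quotes: literal comparison

echo "unquoted in [[ ]]: $r1"
echo "quoted in [[ ]]:   $r2"
echo "quoted in [ ]:     $r3"
```

Only the first test reports a match; the other two compare bar against the literal string *ar.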
To compare two strings in variables x and y for equality, use
if test "$x" = "$y"; then
printf '%s\n' "equal"
else
printf '%s\n' "not equal"
fi
To test whether x appears somewhere in y, use
case $y in
(*"$x"*)
printf '%s\n' "$y contains $x"
;;
(*)
printf '%s\n' "$y does not contain $x"
;;
esac
Note that these constructs are portable to any POSIX shell, not just bash. The [[ ]] construct for tests is not (yet) a standard shell feature.
I do not know where you came up with the *, but you were real close:
s="STP=20"
if [[ "STP=20" == "$s" ]]; then
echo "It worked!"
fi
You need to escape = using \ in the string s="STP=20"
s="STP\=20"
if [[ "STP\=20" == "$s" ]]; then echo Hi; else echo Bye; fi

BASH: Everything but not slash? IF STATEMENT (STRING COMPARISON)

I'm trying to match any strings that start with /John/ but do not contain / after /John/
if
[ $string == /John/[!/]+ ]; then ....
fi
This is what I got and it doesn't seem to be working.
So I tried
if
[[ $string =~ ^/John/[!/]+$ ]]; then ....
fi
It still didn't work, and so I changed it to
if
[[ $string =~ /John/[^/] ]]; then ....
fi
It worked, but it also matches all the strings that have a / after /John/.
For bash you want [[ $string =~ /John/[^/]*$ ]] -- the end-of-line anchor ensures there are no slashes after the last acceptable slash.
How about "the string starts with '/John/' and doesn't contain any slashes after '/John/'"?
[[ $string = /John/* && $string != /John/*/* ]]
Or you could compare against a parameter expansion that only expands if the conditions are met. This says "after stripping off everything including and after the last slash, the string is /John":
[[ ${string%/*} = /John ]]
In fact, this last solution is the only entirely POSIXLY_STRICT one I can come up with without multiple test expressions.
[ "${string%/*}" = /John ]
By the way, your problem is probably simply the use of double-equals inside a single-bracket test expression. bash actually does accept them inside double-bracket test expressions, but a single equals is a better idea.
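For the regex route, here is a sketch that anchors both ends and keeps the expression in a variable. Unlike the ${string%/*} test, it also rejects a bare /John/ with nothing after the slash. The helper name under_john and the sample paths are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: true if $1 is /John/ followed by one or more non-slash characters.
under_john() {
  local re='^/John/[^/]+$'
  [[ $1 =~ $re ]]
}

for s in "/John/notes.txt" "/John/a/b" "/John/" "/Johnny/x"; do
  if under_john "$s"; then
    echo "$s: match"
  else
    echo "$s: no match"
  fi
done
```

Only /John/notes.txt is reported as a match.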
You can also use plain old grep:
string='/John Lennon/Yoko Ono'
if echo "$string" | grep -q "/John[^/]" ; then
echo "matched"
else
echo "no match found"
fi
This only fails if /John is at the very end of the string... if that's a possibility then you can tweak to handle that case, for instance:
string='/John Lennon/Yoko Ono'
if echo "$string" | grep -qP "(/John[^/])|(/John$)" ; then
echo "matched"
else
echo "no match found"
fi
Not sure what language you're using, but normal negative character classes are prefixed with a ^
e.g.
[^/]
You can also put in start/end qualifiers (clojure example, so Java's regex engine). Usually ^ at beginning and $ at end.
user => (re-matches #"^/[a-zA-Z]+[^/]$" "/John/")
nil
