Matching one of several possible characters in a string - bash

In a bash (version 3.2.48) script I get a string that can be something like:
'XY'
' Y'
'YY'
etc
So, I have either an alphabetic character OR a space (first slot), then the relevant character (second slot). I tried some variation (without grep, sed, ...) like:
if [[ $string =~ ([[:space]]{1}|[[:alpha:]]{1})M ]]; then
and
if [[ $string =~ (\s{1}|.{1})M ]]; then
but my solutions did not always work correctly (matching correctly every combination).

This should work for you:
if [[ $string =~ [[:space:][:alpha:]]M ]]; then

if [[ ${string:1:1} == "M" ]]; then
echo Heureka
fi
or (if you want to do it with patterns)
if [[ $string =~ ([[:space:]]|[[:alpha:]])M ]]; then
echo Heureka
fi
or (even simpler)
if [[ $string == ?M ]]; then
echo Heureka
fi

Without using regular expressions, simply pattern matching is sufficient:
if [[ $string == [[:upper:]\ ]M ]]; then
echo match
fi
Given your example, you want [[:upper:]] rather than merely [[:alpha:]]
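To see how the two character classes differ in practice, here is a minimal sketch. The function names and sample strings are made up for illustration, the patterns are stored in variables only to avoid escaping the space, and LC_ALL=C is an assumption that keeps the classes limited to ASCII.

```bash
#!/usr/bin/env bash
LC_ALL=C   # assumption: ASCII-only character classes

# Hypothetical helpers: both require exactly two characters ending in M.
upper_pat='[[:upper:] ]M'   # uppercase letter or space, then M
alpha_pat='[[:alpha:] ]M'   # any letter or space, then M

upper_then_M() { [[ $1 == $upper_pat ]]; }   # unquoted: treated as a pattern
alpha_then_M() { [[ $1 == $alpha_pat ]]; }

for s in 'XM' ' M' 'xM' '1M' 'XMM'; do
    if upper_then_M "$s"; then u=yes; else u=no; fi
    if alpha_then_M "$s"; then a=yes; else a=no; fi
    echo "'$s': upper=$u alpha=$a"
done
```

Only 'xM' is treated differently by the two patterns: [[:alpha:]] accepts the lowercase x, [[:upper:]] does not.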

Related

how to check strings first char in bash

I want to check if a string's first char is uppercase, lowercase or anything else. I tried this code but I can't get to the last else although the first two conditions are false.
#!/bin/bash
echo "enter var: "
read var
if [[ {$var::1 =~ [A-Z] ]]
then
echo "UpperCase"
elif [[ {$var::1} =~ [a-z] ]]
then
echo "LowerCase"
else
echo "Digit or a symbol"
fi
exit
When I enter 1hello I get: "LowerCase"
What am I missing here?!
You don't necessarily need to extract the first character, you can compare the whole string to a pattern.
Here, I'm using the POSIX character classes [:upper:] and [:lower:] which I find more descriptive. They also handle non-ASCII letters.
Regex matching:
if [[ $var =~ ^[[:upper:]] ]]; then echo starts with an upper
elif [[ $var =~ ^[[:lower:]] ]]; then echo starts with a lower
else echo does not start with a letter
fi
With shell glob patterns -- within [[...]] the == operator does pattern matching not just string equality
if [[ $var == [[:upper:]]* ]]; then echo starts with an upper
elif [[ $var == [[:lower:]]* ]]; then echo starts with a lower
else echo does not start with a letter
fi
A case statement would work here as well
case "$var" in
[[:upper:]]*) echo starts with an upper ;;
[[:lower:]]*) echo starts with a lower ;;
*) echo does not start with a letter ;;
esac
Neither of your parameter expansions is correct. {$var::1 evaluates to {1hello::1, not 1, and {$var::1} likewise evaluates to {1hello::1}.
The expansion you want is ${var::1}, which does expand to 1 as intended.
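A small sketch of the difference, using the 1hello input from the question (the variable names broken, first and same are only for illustration). It also shows why the script printed "LowerCase": the unanchored [a-z] finds the h inside {1hello::1}.

```bash
#!/usr/bin/env bash
var='1hello'

broken="{$var::1}"   # braces are literal text around the expanded $var
first=${var::1}      # substring of length 1 starting at offset 0
same=${var:0:1}      # the same thing with the offset spelled out

printf '%s\n' "$broken" "$first" "$same"
```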
You don't need a fancy parameter expansion anyway; you can match against the first character using regular expressions alone
[[ $var =~ ^[a-z] ]]
or pattern-matching
[[ $var = [a-z]* ]]
Regular expressions are not implicitly anchored, so you can use ^ to explicitly match the beginning of the string; the remainder of the string can be ignored.
Pattern matches are implicitly anchored to the start and end of the string, so you need * to match everything (if anything) that follows the initial character of the string.
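The anchoring rules can be checked with a short sketch like the one below. The function names are hypothetical, and LC_ALL=C is an assumption so that [a-z] covers ASCII lowercase only.

```bash
#!/usr/bin/env bash
LC_ALL=C   # assumption: make [a-z] mean ASCII lowercase only

# Hypothetical helpers illustrating anchoring.
contains_lower_re() { [[ $1 =~ [a-z] ]]; }     # regex, unanchored
starts_lower_re()   { [[ $1 =~ ^[a-z] ]]; }    # regex, anchored with ^
is_one_lower_glob() { [[ $1 == [a-z] ]]; }     # glob, whole string is one letter
starts_lower_glob() { [[ $1 == [a-z]* ]]; }    # glob, * absorbs the rest

for s in 'hello' '1hello' 'h'; do
    for f in contains_lower_re starts_lower_re is_one_lower_glob starts_lower_glob; do
        if "$f" "$s"; then r=yes; else r=no; fi
        echo "$f '$s': $r"
    done
done
```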

Filter only three characters in shell

I am trying to parse a command in shell, and in one of the options I want to save in a variable if a string has "r", "w", "x", one of those, all of them, or a mix, but only these three. No other characters should be allowed!
I tried a case where:
case $2 in *r*) ;; *w*) ;; *x*) ;; * ) echo no ;;
esac
But in this case if there is written zr it will pass, as it has an "r". I only want to make it pass as long as it has one of these three, the three of them, or two of them (any kind of combination), but no other characters.
In BASH you can use regex for this check like this:
re='^[rwx]+$'
s='rw'
[[ $s =~ $re ]] && echo "yes" || echo "no"
yes
s='zr'
[[ $s =~ $re ]] && echo "yes" || echo "no"
no
Regex ^[rwx]+$ will allow 1 or more of r or w or x letters.
With extended pattern matching in Bash (shopt -s extglob):
if [[ $var == +([rwx]) ]]; then
echo "Matches!"
else
echo "Does not match!"
fi
The +([rwx]) pattern is "one or more of r, w or x".
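If the case statement from the question is preferred, one possible sketch is to reject any string that contains a character outside the set, rather than accepting strings that contain r, w or x. The function name only_rwx is made up for illustration; this form needs neither regex support nor extglob.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only for a non-empty string made of r, w, x.
only_rwx() {
    case $1 in
        '' | *[!rwx]*) return 1 ;;   # empty, or contains some other character
        *)             return 0 ;;
    esac
}

for s in r rw xwr zr ''; do
    if only_rwx "$s"; then
        echo "'$s': ok"
    else
        echo "'$s': no"
    fi
done
```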

exact match using if statement ? does partial match as well need to do exact match

I have the following command in my script. It works fine with one exception; it also matches partial entries. I want the match to be exact.
a=mary jane uger dodo baba
b=mary
c=ma
if [[ "$a" =~ "$b" ]] && [ -n "$1" ]; then
echo it matches
else
echo it does not match
fi
So no matter whether I use $b or $c in the if statement, both match.
I want to ensure that the entry is matched fully and not partially. So
this should work and give an exact match:
if [[ "$a" =~ "$b" ]]
and this should not work partial match
if [[ "$a" =~ "$c" ]]
Can someone help please?
here is my exact code
if [[ "$a" =~ "$b" ]]; then
echo something
fi
Put a space or an end anchor after the word in the regex so that the match cannot stop partway through a word:
a='mary jane uger dodo baba'
b='mary'
c='ma'
# will match
[[ "$a" =~ "$b"( |$) ]]
# won't match
[[ "$a" =~ "$c"( |$) ]]
z='\>'
[[ $a =~ $b$z ]] # true
[[ $a =~ $c$z ]] # false
See also: does bash support word boundary regular expressions?
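Note that a trailing ( |$) only guards the end of the word, so a value such as ane would still match inside jane. A sketch that guards both sides without any regex is to pad both strings with spaces and use a glob match. The function name has_word is hypothetical, and the approach assumes the words in the list are separated by spaces, not tabs.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when $2 occurs in $1 as a complete
# space-separated word. Assumes spaces (not tabs) separate the words.
has_word() {
    [[ " $1 " == *" $2 "* ]]
}

a='mary jane uger dodo baba'
for w in mary ma ane baba; do
    if has_word "$a" "$w"; then
        echo "$w: match"
    else
        echo "$w: no match"
    fi
done
```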

How to check for strings between certain strings in unix scripting?

String:
name#gmail.com
Checking for:
#
.com
My code
if [[ $word =~ "#" ]]
then
if [[ $word =~ ".com" || $word =~ ".ca" ]]
My problem
name#.com
The above example gets passed, which is not what I want. How do I check for characters (1 or more) between "#" and ".com"?
You can use a very very basic regex:
[[ $var =~ ^[a-z]+#[a-z]+\.[a-z]+$ ]]
It looks for a string being exactly like this:
at least one a-z char
#
at least one a-z char
.
at least one a-z char
It can get as complicated as you want, see for example Email check regular expression with bash script.
See in action
$ var="a#b.com"
$ [[ $var =~ ^[a-z]+#[a-z]+\.[a-z]+$ ]] && echo "kind of valid email"
kind of valid email
$ var="a#.com"
$ [[ $var =~ ^[a-z]+#[a-z]+\.[a-z]+$ ]] && echo "kind of valid email"
$
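If the parts between the separators are needed afterwards, capture groups and BASH_REMATCH can be used. The following is only a sketch: the function name check_addr is made up, it keeps the # separator from the question, and it assumes lowercase ASCII names and only the .com and .ca endings mentioned there.

```bash
#!/usr/bin/env bash
LC_ALL=C   # assumption: [a-z] should mean ASCII lowercase only

# Assumption: "#" separator as in the question; only .com or .ca accepted.
re='^([a-z]+)#([a-z]+)\.(com|ca)$'

# Hypothetical helper: prints the captured parts, or "invalid".
check_addr() {
    if [[ $1 =~ $re ]]; then
        echo "user=${BASH_REMATCH[1]} host=${BASH_REMATCH[2]} tld=${BASH_REMATCH[3]}"
    else
        echo "invalid"
    fi
}

check_addr 'name#gmail.com'
check_addr 'name#.com'
```

Keeping the regex in a variable and expanding it unquoted after =~ avoids quoting differences between bash versions.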
why not go for other tools like perl:
> echo "x#gmail.com" | perl -lne 'print $1 if(/#(.*?)\.com/)'
gmail
The glob pattern would be: [[ $word == ?*#?*.@(com|ca) ]]
? matches any single character and * matches zero or more characters
@(p1|p2|p3|...) is an extended globbing pattern that matches one of the given patterns. This requires:
shopt -s extglob
testing:
$ for word in '#.com' '#a.ca' 'a#.com' 'a#b.ca' 'a#b.org'; do
echo -ne "$word\t"
[[ $word == ?*#?*.@(com|ca) ]] && echo matches || echo does not match
done
#.com does not match
#a.ca does not match
a#.com does not match
a#b.ca matches
a#b.org does not match

Checking if char is within set

I'm trying to check whether a string has length 1 and contains only the following chars: [RGWBO].
I'm trying the following but it doesn't work, what am I missing?
if [[ !(${line[4]} =~ [RGWBO]) ]];
This is what you want:
if [[ ${line[4]} =~ ^[RGWBO]+$ ]];
This means that the whole string, from start to end, must consist of one or more of the characters R, G, W, B, O.
If you want to negate the expression just use ! in front of [[ ]]:
if ! [[ ${line[4]} =~ ^[RGWBO]+$ ]];
Or
if [[ ! ${line[4]} =~ ^[RGWBO]+$ ]];
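Since the question asks for a string of length 1, a variant without the + may be closer to what is wanted. A minimal sketch, with a made-up function name and sample values:

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only when the argument is exactly one
# character taken from R, G, W, B, O.
one_color() {
    [[ $1 =~ ^[RGWBO]$ ]]
}

for s in R O RG V ''; do
    if one_color "$s"; then
        echo "'$s': valid"
    else
        echo "'$s': invalid"
    fi
done
```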
This one would work with any usable version of Bash:
[[ -n ${LINE[0]} && ${LINE[0]} != *[^RGWBO]* ]]
Even though I prefer the simplicity of extended globs:
shopt -s extglob
[[ ${LINE[0]} == +([RGWBO]) ]]
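Use expr (expression evaluator) to do the matching. expr anchors the pattern at the start of the string and prints the number of characters matched.
#!/bin/bash
# No | inside the brackets: there it would be a literal character, not "or".
# The trailing $ rejects strings longer than one character.
pattern='[RGWBO]$'
string=${line[4]}
# The "string : pattern" form is the portable spelling of "expr match".
res=$(expr "$string" : "$pattern")
if [ "$res" -eq 1 ]; then
echo 'match'
else
echo 'doesnt match'
fi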
Approach
Test the string length with ${#myString}; if it is equal to 1, proceed to step 2.
Check whether it matches your pattern.
Code
re='[RGWBO]';
while read -r line; do
if (( ${#line} == 1 )) && [[ $line == $re ]]; then
echo "yes: $line"
else
echo "no: $line"
fi
done < test.txt
Resources
You may want to look at the following links:
Bash: Split string into character array's answer ;
Length of a string, use ${#myString} ;
Extracting parts of strings, use ${myString:0:8} ;
Data
The test.txt file contains this
RGWBO
RGWB
RGW
RG
R
G
W
B
O
V
