I am reading a shell script in Unix.
I have come upon the below line:
if [[ ! $1 =~ ^# ]];
I understand the part which is on left side of equal sign but what does ~^# means.
According to http://wiki.bash-hackers.org/syntax/ccmd/conditional_expression the =~ is:
<STRING> =~ <ERE> <STRING> is checked against the extended regular expression <ERE> - TRUE on a match
So the ^# is a extended regular expression. Since # is not a special character in extended regex. The meaning of the if checks that the string in $1 does not start with #. So on the command line
$ if [[ ! '#' =~ ^# ]]; then echo matches; else echo no match; fi
no match
$ if [[ ! 'b' =~ ^# ]]; then echo matches; else echo no match; fi
matches
The ~ allows to use POSIX regular expression matching (regex).
The ^ is a special character that evaluates the beginning of a line
In your case, ^# means a line beginning with #. Then your condition only takes care of lines that do not begin with #.
In shell scripting, lines beginning with # are comments, that are not evaluated as commands by the shell.
Related
I want to check if a string's first char is uppercase, lowercase or anything else. I tried this code but I can't get to the last else although the first two conditions are false.
#!/bin/bash
echo "enter var: "
read var
if [[ {$var::1 =~ [A-Z] ]]
then
echo "UpperCase"
elif [[ {$var::1} =~ [a-z] ]]
then
echo "LowerCase"
else
echo "Digit or a symbol"
fi
exit
When I enter 1hello I get: "LowerCase"
What am I missing here?!
You don't necessarily need to extract the first character, you can compare the whole string to a pattern.
Here, I'm using the POSIX character classes [:upper:] and [:lower:] which I find more descriptive. They also handle non-ASCII letters.
Regex matching:
if [[ $var =~ ^[[:upper:]] ]]; then echo starts with an upper
elif [[ $var =~ ^[[:lower:]] ]]; then echo starts with a lower
else echo does not start with a letter
fi
With shell glob patterns -- within [[...]] the == operator does pattern matching not just string equality
if [[ $var == [[:upper:]]* ]]; then echo starts with an upper
elif [[ $var == [[:lower:]]* ]]; then echo starts with a lower
else echo does not start with a letter
fi
A case statement would work here as well
case "$var" in
[[:upper:]]*) echo starts with an upper ;;
[[:lower:]]*) echo starts with a lower ;;
*) echo does not start with a letter ;;
esac
Neither of your parameter expansions are correct. {$var::1 evaluates to {1hello::1, not 1, and {$var::1} likewise evaluates to {1hello::1}.
The expansion you want is ${var::1}, which does expand to 1 as intended.
You don't need a fancy parameter expansion anyway; you can match against the first character using regular expressions alone
[[ $var =~ ^[a-z] ]]
or pattern-matching
[[ $var = [a-z]* ]]
Regular expressions are not implicitly anchored, so you can use ^ to explicitly match the beginning of the string; the remainder of the string can be ignored.
Pattern matches are implicitly anchored to the start and end of the string, so you need * to match everything (if anything) that follows the initial character of the string.
I tried to check the following case:
#!/bin/bash
line="abc"
if [[ "${line}" != [a-z] ]]; then
echo INVALID
fi
And I get INVALID as output. But why?
It's no check if $line contains only a characters in the range [a-z] ?
Use the regular expression matching operator =~:
#!/bin/bash
line="abc"
if [[ "${line}" =~ [^a-zA-Z] ]]; then
echo INVALID
fi
Works in any Bourne shell and wastes no pipes/forks:
case $var in
("") echo "empty";;
(*[!a-z]*) echo "contains a non-alphabetic";;
(*) echo "just alphabetics";;
esac
Use [!a-zA-Z] if you want to allow upper case as well.
Could you please try following and let me know if this helps you.
line="abc"
if echo "$line" | grep -i -q '^[a-z]*$'
then
echo "MATCHED."
else
echo "NOT-MATCHED."
fi
Pattern matches are anchored to the beginning and end of the string, so your code checks if $line is not a single lowercase character. You want to match an arbitrary sequence of lowercase characters, which you can do using extended patterns:
if [[ $line != #([a-z]) ]]; then
or using the regular-expression operator:
if ! [[ $line =~ ^[a-z]+$ ]]; then # there is no negative regex operator like Perl's !~
Why? Because != means "not equal", thats why. You tell bash to compare abc with [a-z]. They are not equal.
Try echo $line | grep -i -q -x '[a-z]*'.
The flag -i makes grep case insensitive.
The flag -x means match the whole line.
The flag -q means print nothing to stdout, just return 1 or 0.
when I compare two strings with single equal “=” ?
and when I compare two strings with double equal “==” ?
for example:
[[ $STR = $STR1 ]]
OR
[[ $STR == $STR1 ]]
Or maybe they are both do exactly the same thing?
[[ $a == z* ]] # True if $a starts with an "z" (pattern matching).
[[ $a == "z*" ]] # True if $a is equal to z* (literal matching).
[ $a == z* ] # File globbing and word splitting take place.
[ "$a" == "z*" ] # True if $a is equal to z* (literal matching).
Everything about string comparison you can find here.
You right, both way does exactly the same thing, there is no difference at the execution.
In your conditions, take care to add quotes to your variable's name, bash could throws an error if one of them is null. Add double quotes will pass throught this error by setting variable to empty string "".
[[ "$STR1" = "$STR2" ]]
EDIT : (Thanks to comments below)
Prefer use this syntaxe [ "$STR1" == "$STR2" ] for test and shell convenience. Doubles quotes are better to use and make your condition usable with regular expression as "*.txt" but not even required.
String:
name#gmail.com
Checking for:
#
.com
My code
if [[ $word =~ "#" ]]
then
if [[ $word =~ ".com" || $word =~ ".ca" ]]
My problem
name#.com
The above example gets passed, which is not what I want. How do I check for characters (1 or more) between "#" and ".com"?
You can use a very very basic regex:
[[ $var =~ ^[a-z]+#[a-z]+\.[a-z]+$ ]]
It looks for a string being exactly like this:
at least one a-z char
#
at least one a-z char
.
at least one a-z char
It can get as complicated as you want, see for example Email check regular expression with bash script.
See in action
$ var="a#b.com"
$ [[ $var =~ ^[a-z]+#[a-z]+\.[a-z]+$ ]] && echo "kind of valid email"
kind of valid email
$ var="a#.com"
$ [[ $var =~ ^[a-z]+#[a-z]+\.[a-z]+$ ]] && echo "kind of valid email"
$
why not go for other tools like perl:
> echo "x#gmail.com" | perl -lne 'print $1 if(/#(.*?)\.com/)'
gmail
The glob pattern would be: [[ $word == ?*#?*.#(com|ca) ]]
? matches any single character and * matches zero or more characters
#(p1|p2|p3|...) is an extended globbing pattern that matches one of the given patterns. This requires:
shopt -s extglob
testing:
$ for word in #.com #a.ca a#.com a#b.ca a#b.org; do
echo -ne "$word\t"
[[ $word == ?*#?*.#(com|ca) ]] && echo matches || echo does not match
done
#.com does not match
#a.ca does not match
a#.com does not match
a#b.ca matches
a#b.org does not match
In a bash (version 3.2.48) script I get a string that can be something like:
'XY'
' Y'
'YY'
etc
So, I have either an alphabetic character OR a space (first slot), then the relevant character (second slot). I tried some variation (without grep, sed, ...) like:
if [[ $string =~ ([[:space]]{1}|[[:alpha:]]{1})M ]]; then
and
if [[ $string =~ (\s{1}|.{1})M ]]; then
but my solutions did not always work correctly (matching correctly every combination).
This should work for you:
if [[ $string =~ [[:space:][:alpha:]]M ]]; then
if [[ ${string:1:1} == "M" ]]; then
echo Heureka
fi
or (if you want to do it with patterns)
if [[ $string =~ ([[:space:]]|[[:alpha:]])M ]]; then
echo Heureka
fi
or (even simpler)
if [[ $string == ?M ]]; then
echo Heureka
fi
Without using regular expressions, simply pattern matching is sufficient:
if [[ $string == [[::upper:]\ ]M ]]; then
echo match
fi
Given your example, you want [[:upper:]] rather than merely [[:alpha:]]