Comparing strings case-insensitively - bash

I want to compare Hello World to hello world. The result should be true, as if they were equal. I'm doing:
while read line; do
newLine="$newLine$line"
done < $1
newp="Hello World"
if (( ${newp,,}==${newLine,,} )); then
echo "true"
else
echo "false"
fi
when I pass a text file consisting of:
#filename: file.txt
hello world
The output seems to be:
./testScript.txt: line 20: 0à»: hello world==hello world : syntax error in expression (error token is "world==hello world ")
+ echo false
What am I doing wrong here? Also, a bit unrelated, is there any way to pass the line that is in file.txt to a string(newLine) without doing that while I have done?

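You should add quotes and change the double parentheses to single brackets. (( )) evaluates its contents as an arithmetic expression, which is why the strings trigger a syntax error. The if statement should be something like: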
if [ "${newp,,}" = "${newLine,,}" ]; then
And in relation to that while loop... It depends on what you want to do. If, like in this case, you want to get the entire file and save it as a single string, you could simply do:
newLine=$(cat "$1")
I would suggest you only use that loop you wrote if you are trying to parse the file line by line, i.e. adding if statements, using different variables and so on. But for a simple case like this one, cat will do just fine.
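As a minimal sketch of both suggestions combined: the helper name same_ignoring_case is hypothetical, a temporary file stands in for file.txt, and bash 4.0 or later is assumed because of the ${var,,} expansion.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when both arguments are equal ignoring case.
# Requires bash 4.0+ for the ${var,,} lowercase expansion.
same_ignoring_case() {
    [ "${1,,}" = "${2,,}" ]
}

# A temporary file stands in for file.txt from the question.
tmpfile=$(mktemp)
printf 'hello world\n' > "$tmpfile"

# $(< file) is a bash shorthand for $(cat file); trailing newlines are stripped.
newLine=$(< "$tmpfile")
rm -f "$tmpfile"

if same_ignoring_case "Hello World" "$newLine"; then
    echo "true"
else
    echo "false"
fi
```

The quotes inside [ ] keep "hello world" as a single word, which is what the original (( )) test was missing.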

There is a shell option, nocasematch, that enables case insensitive pattern matching for use with [[ and case.
Comparing strings that differ by casing only:
$ var1=lowercase
$ var2=LOWERCASE
$ [[ $var1 == $var2 ]] && echo "Matches!" || echo "Doesn't match!"
Doesn't match!
Now enabling the shell option and trying again:
$ shopt -s nocasematch
$ [[ $var1 == $var2 ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
Just make sure to turn it off again with shopt -u nocasematch if you don't want all subsequent comparisons to be case-insensitive.
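One way to avoid forgetting the shopt -u step is to confine the option to a subshell. The following is only a sketch, and the function name equals_nocase is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: case-insensitive equality test.
# The function body is a subshell ( ... ), so the shopt change
# does not leak into the calling shell.
equals_nocase() (
    shopt -s nocasematch
    # The right-hand side is quoted, so it is compared literally
    # (no glob characters are interpreted), but still ignoring case.
    [[ $1 == "$2" ]]
)

var1=lowercase
var2=LOWERCASE
equals_nocase "$var1" "$var2" && echo "Matches!" || echo "Doesn't match!"
```

After the call, shopt -q nocasematch still reports the option as off in the calling shell.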

Related

Bash script with multiline variable

Here is my code
vmname="$1"
EXCEPTLIST="desktop-01|desktop-02|desktop-03|desktop-04"
if [[ $vmname != @(${EXCEPTLIST}) ]]; then
echo "${vmname}"
else
echo "Its in the exceptlist"
fi
The above code works perfectly, but my question is: EXCEPTLIST can be a long line, say 100 server names, and in that case it's hard to put all the names on one line. Is there any way to make the variable EXCEPTLIST a multiline variable? Something like the following:
EXCEPTLIST="desktop-01|desktop-02|desktop-03| \n
desktop-04|desktop-05|desktop-06| \n
desktop-07|desktop-08"
I am not sure but was thinking of possibilities.
I would also like to know the terminology for @(${}). Is this called variable expansion or something else? Can anyone point me to the documentation or explain how this works in bash?
One can declare an array if the data/string is long/large. Use IFS and printf for the format string, something like:
#!/usr/bin/env bash
exceptlist=(
desktop-01
desktop-02
desktop-03
desktop-04
desktop-05
desktop-06
)
pattern=$(IFS='|'; printf '@(%s)' "${exceptlist[*]}")
[[ "$vmname" != $pattern ]] && echo good
In that situation is there any way to make the variable EXCEPTLIST to be a multiline variable ?
With your given input/data, an array is also a good option, something like:
exceptlist=(
'desktop-01|desktop-02|desktop-03'
'desktop-04|desktop-05|desktop-06'
'desktop-07|desktop-08'
)
One way to check the value of the $pattern variable is:
declare -p pattern
Output:
declare -- pattern="@(desktop-01|desktop-02|desktop-03|desktop-04|desktop-05|desktop-06)"
You also need to check whether $vmname is an empty string, since an empty value never matches the pattern and the != test is then always true.
On a side note, don't use all upper case variables for purely internal purposes.
The $(...) is called Command Substitution.
See LESS=+'/\ *Command Substitution' man bash
In addition to what was mentioned in the comments about pattern matching
See LESS=+/'(pattern-list)' man bash
See LESS=+/' *\[\[ expression' man bash
Is there any way to make the variable EXCEPTLIST to be a multiline variable ?
I see no reason to use matching. Use a bash array and just compare.
exceptlist=(
desktop-01
desktop-02
desktop-03
desktop-04
desktop-05
desktop-06
)
is_in_list() {
local i
for i in "${@:2}"; do
if [[ "$1" = "$i" ]]; then
return 0
fi
done
return 1
}
if is_in_list "$vmname" "${exceptlist[@]}"; then
echo "is in exception list ${vmname}"
fi
@(${})- Is this called variable expansion or what ? Does anyone know the documentation/explain to me about how this works in bash. ?
${var} is a variable expansion.
@(...) are just the characters @ ( ).
From man bash in Compound Commands:
[[ expression ]]
When the == and != operators are used, the string to the right of the operator is considered a pattern and matched according to the rules
described below under Pattern Matching, as if the extglob shell option were enabled. ...
From Pattern Matching in man bash:
@(pattern-list)
Matches one of the given patterns
The [[ command receives the @(a|b|c) string as a pattern and matches the left-hand operand against it.
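To make the mechanics concrete, here is a small sketch that builds the @(...) pattern from its arguments and relies on the [[ behavior quoted from man bash above. The function name matches_any is hypothetical, and the names in the list are assumed to contain no glob characters of their own.

```shell
#!/usr/bin/env bash
exceptlist=(
    desktop-01
    desktop-02
    desktop-03
)

# Hypothetical helper: succeeds if $1 equals one of the remaining arguments.
matches_any() {
    local name=$1
    shift
    local IFS='|'            # "$*" joins arguments with the first character of IFS
    local pattern="@($*)"    # e.g. @(desktop-01|desktop-02|desktop-03)
    # Reject the empty string explicitly, then match against the unquoted
    # pattern so that it is treated as an extended glob.
    [[ -n $name && $name == $pattern ]]
}

if matches_any "desktop-02" "${exceptlist[@]}"; then
    echo "It's in the exceptlist"
fi
```

Because the pattern comes from a variable, the list itself can be spread over as many lines as needed.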
There is no need for Bash-specific patterns, arrays, or loops if you use grep to search for the fixed string on word boundaries.
The exception list can be multi-line, it will work as well:
#!/usr/bin/sh
exceptlist='
desktop-01|desktop-02|desktop-03|
desktop-04|desktop-05|desktop-06|
desktop-07|desktop-08'
if printf %s "$exceptlist" | grep -qwF "$1"; then
printf '%s is in the exceptlist\n' "$1"
fi
I wouldn't bother with multiple lines of text. This would be just fine:
EXCEPTLIST='desktop-01|desktop-02|desktop-03|'
EXCEPTLIST+='desktop-04|desktop-05|desktop-06|'
EXCEPTLIST+='desktop-07|desktop-08'
The @(...) construct is called an extended globbing pattern, and it extends something you probably already know: wildcards.
VAR='foobar'
if [[ "$VAR" == fo?b* ]]; then
echo "Yes!"
else
echo "No!"
fi
A quick walkthrough on extended globbing examples: https://www.linuxjournal.com/content/bash-extended-globbing
#!/bin/bash
set +o posix
shopt -s extglob
vmname=$1
EXCEPTLIST=(
desktop-01 desktop-02 desktop-03
...
)
if IFS='|' eval '[[ ${vmname} == @(${EXCEPTLIST[*]}) ]]'; then
...
Here's one way to load a multiline string into a variable:
fn() {
cat <<EOF
desktop-01|desktop-02|desktop-03|
desktop-04|desktop-05|desktop-06|
desktop-07|desktop-08
EOF
}
exceptlist="$(fn)"
echo $exceptlist
As to solving your specific problem, I can think of a variety of approaches.
Solution 1: since all the desktops share the desktop-0 prefix and differ only in the last digit, we can use {,} or {..} brace expansion as follows:
vmname="$1"
found=0
for d in desktop-{01..08}
do
if [[ "$vmname" == $d ]]; then
echo "It's in the exceptlist"
found=1
break
fi
done
if (( !found )); then
echo "Not found"
fi
Solution 2: sometimes it is good to keep the list as maintainable plain text. We can use a while loop to iterate through the list:
vmname="$1"
found=0
while IFS= read -r d
do
if [[ "$vmname" == $d ]]; then
echo "It's in the exceptlist"
found=1
break
fi
done <<EOF
desktop-01
desktop-02
desktop-03
desktop-04
desktop-05
desktop-06
desktop-07
desktop-08
EOF
if (( !found )); then
echo "Not found"
fi
Solution 3: we can match the servers using a regular expression:
vmname="$1"
if [[ "$vmname" =~ ^desktop-0[1-8]$ ]]; then
echo "It's in the exceptlist"
else
echo "Not found"
fi
Solution 4, we populate an array, then iterate through an array:
vmname="$1"
exceptlist=()
exceptlist+=(desktop-01 desktop-02 desktop-03 desktop-04)
exceptlist+=(desktop-05 desktop-06 desktop-07 desktop-08)
found=0
for d in "${exceptlist[@]}"
do
if [[ "$vmname" == "$d" ]]; then
echo "It's in the exceptlist"
found=1
break;
fi
done
if (( !found )); then
echo "Not found"
fi
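An alternative that none of the solutions above uses, offered only as a sketch: assuming bash 4.0 or later, an associative array turns the membership test into a single lookup without a loop at query time. The names except and in_exceptlist are made up for this example.

```shell
#!/usr/bin/env bash
# Requires bash 4.0+ for associative arrays (declare -A).
declare -A except=()

# The list can span as many lines as needed.
for name in \
    desktop-01 desktop-02 desktop-03 desktop-04 \
    desktop-05 desktop-06 desktop-07 desktop-08
do
    except[$name]=1
done

# Hypothetical helper: succeeds if $1 is a key in the except array.
in_exceptlist() {
    # The -n check on $1 runs first, so an empty key is never looked up.
    [[ -n $1 && -n ${except[$1]:-} ]]
}

if in_exceptlist "desktop-07"; then
    echo "It's in the exceptlist"
else
    echo "Not found"
fi
```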

In Bash, is it possible to match a string variable containing wildcards to another string

I am trying to compare strings against a list of other strings read from a file.
However some of the strings in the file contain wildcard characters (both ? and *) which need to be taken into account when matching.
I am probably missing something but I am unable to see how to do it
Eg.
I have strings from file in an array which could be anything alphanumeric (and include commas and full stops) with wildcards: (a?cd, *x*y, q?h*z, j*,h-??)
and I have another string I wish to compare with each item in the list in turn. Any of the strings may contain spaces.
so what I want is something like
teststring="abcdx.rubb ish,y"
matchstrings=("a?cd" "*x*y" "q?h*z" "j*,h-??")
for i in "${matchstrings[@]}" ; do
if [[ "$i" == "$teststring" ]]; then # this test here is the problem
<do something>
else
<do something else>
fi
done
This should match on the second "matchstring" but not any others
Any help appreciated
Yes; you just have the two operands to == reversed; the glob goes on the right (and must not be quoted):
if [[ $teststring == $i ]]; then
Example:
$ i=f*
$ [[ foo == $i ]] && echo pattern match
pattern match
If you quote the parameter expansion, the operation is treated as a literal string comparison, not a pattern match.
$ [[ foo == "$i" ]] || echo "foo != f*"
foo != f*
Spaces in the pattern are not a problem:
$ i="foo b*"
$ [[ "foo bar" == $i ]] && echo pattern match
pattern match
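Applied to the data from the question, the fix can be sketched like this. The function name matching_patterns is hypothetical; the test string and patterns are the ones from the question.

```shell
#!/usr/bin/env bash
teststring="abcdx.rubb ish,y"
matchstrings=("a?cd" "*x*y" "q?h*z" "j*,h-??")

# Hypothetical helper: print every pattern among the remaining
# arguments that matches the text given as the first argument.
matching_patterns() {
    local text=$1 pattern
    shift
    for pattern in "$@"; do
        # The right-hand side is unquoted, so it is treated as a glob pattern.
        if [[ $text == $pattern ]]; then
            printf '%s\n' "$pattern"
        fi
    done
    return 0
}

matching_patterns "$teststring" "${matchstrings[@]}"
```

This prints only *x*y, the second pattern, as the question expects.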
You can do this even completely within POSIX, since case alternatives undergo parameter substitution:
#!/bin/sh
teststring="abcdx.rubbish,y"
while IFS= read -r matchstring; do
case $teststring in
($matchstring) echo "$matchstring";;
esac
done << "EOF"
a?cd
*x*y
q?h*z
j*,h-??
EOF
This outputs only *x*y as desired.

check for string format in bash script

I am attempting to check for proper formatting at the start of a string in a bash script.
The expected format is like the below where the string must always begin with "ABCDEFG-" (exact letters and order) and the numbers would vary but be at least 3 digits. Everything after the 3rd digit is a do not care.
Expected start of string: "ABCDEFG-1234"
I am using the below code snippet.
[ $(echo "$str" | grep -E "ABCDEFG-[0-9][0-9][0-9]") ] && echo "yes"
str1 = "ABCDEFG-1234"
str2 = "ABCDEFG-1234 - Some more text"
When I use str1 in place of str everything works ok and yes is printed.
When I use str2 in place of str i get the below error
[: ABCDEFG-1234: unary operator expected
I am pretty new to working with bash scripts so any help would be appreciated.
If this is bash, you have no reason to use grep for this at all; the shell has built-in regular expression support.
re="ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, you might want your regex to be anchored if you want a match in the beginning rather than anywhere in the content:
re="^ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, this doesn't need to be an ERE at all -- a glob-style pattern match would also be adequate:
if [[ $str = ABCDEFG-[0-9][0-9][0-9]* ]]; then echo "yes"; fi
Try grep -E "ABCDEFG-[0-9][0-9][0-9].*"
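If you do want to stay with grep, the "unary operator expected" error goes away once the test relies on grep's exit status instead of placing its unquoted output inside [ ]. A minimal sketch, with a hypothetical function name has_valid_prefix:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when $1 starts with ABCDEFG- followed
# by at least three digits; anything after that is ignored.
has_valid_prefix() {
    # -q suppresses output, so only the exit status reports the match.
    # ^ anchors the match at the start of the string.
    printf '%s\n' "$1" | grep -Eq '^ABCDEFG-[0-9]{3}'
}

str2="ABCDEFG-1234 - Some more text"
has_valid_prefix "$str2" && echo "yes"
```

In the original snippet, the output of grep for str2 expanded to several words inside [ ], which is what [ complained about.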

How to get first character of variable

I'm trying to get the first character of a variable, but I'm getting a Bad substitution error. Can anyone help me fix it?
code is:
while IFS=$'\n' read line
do
if [ ! ${line:0:1} == "#"] # Error on this line
then
eval echo "$line"
eval createSymlink $line
fi
done < /some/file.txt
Am I doing something wrong or is there a better way of doing this?
-- EDIT --
As requested - here's some sample input which is stored in /some/file.txt
$MOZ_HOME/mobile/android/chrome/content/browser.js
$MOZ_HOME/mobile/android/locales/en-US/chrome/browser.properties
$MOZ_HOME/mobile/android/components/ContentPermissionPrompt.js
To get the first character of a variable you need to say:
$ v="hello"
$ echo "${v:0:1}"
h
However, your code has a syntax error:
[ ! ${line:0:1} == "#"]
# ^-- missing space
So this can do the trick:
$ a="123456"
$ [ ! "${a:0:1}" == "#" ] && echo "doesnt start with #"
doesnt start with #
$ a="#123456"
$ [ ! "${a:0:1}" == "#" ] && echo "doesnt start with #"
$
Also it can be done like this:
$ a="#123456"
$ [ "$(expr substr $a 1 1)" != "#" ] && echo "does not start with #"
$
$ a="123456"
$ [ "$(expr substr $a 1 1)" != "#" ] && echo "does not start with #"
does not start with #
Update
Based on your update, this works for me:
while IFS=$'\n' read line
do
echo $line
if [ ! "${line:0:1}" == "#" ] # space added before the closing ]
then
eval echo "$line"
eval createSymlink $line
fi
done < file
Adding the missing space (as suggested in fedorqui's answer ;) ) works for me.
An alternative method/syntax
Here's what I would do in Bash if I want to check the first character of a string
if [[ $line != "#"* ]]
On the right hand side of ==, the quoted part is treated literally whereas * is a wildcard for any sequence of characters.
For more information, see the last part of Conditional Constructs of Bash reference manual:
When the ‘==’ and ‘!=’ operators are used, the string to the right of the operator is considered a pattern and matched according to the rules described below in Pattern Matching
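Put into the loop from the question, the pattern test might look like the sketch below. The function name is_comment is made up, the here-document stands in for /some/file.txt, and the script only prints the non-comment lines instead of calling createSymlink.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the first character of $1 is '#'.
is_comment() {
    # "#" is quoted and therefore literal; the unquoted * matches the rest.
    [[ $1 == "#"* ]]
}

# The here-document stands in for /some/file.txt; the quoted delimiter
# keeps $MOZ_HOME from being expanded.
while IFS= read -r line; do
    if ! is_comment "$line"; then
        printf '%s\n' "$line"
    fi
done <<'EOF'
# this line is skipped
$MOZ_HOME/mobile/android/chrome/content/browser.js
EOF
```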
Checking that you're using the right shell
If you are getting errors such as "Bad substitution error" and "[[: not found" (see comment) even though your syntax is fine (and works fine for others), it might indicate that you are using the wrong shell (i.e. not Bash).
So to make sure you are using Bash to run the script, either
make the script executable and use an appropriate shebang e.g. #!/bin/bash
or execute it via bash my_script
Also note that sh is not necessarily bash, sometimes it can be dash (e.g. in Ubuntu) or just plain ol' Bourne shell.
Try this:
while IFS=$'\n' read line
do
if ! [ "${line:0:1}" = "#" ]; then
eval echo "$line"
eval createSymlink $line
fi
done < /some/file.txt
or you can use the following for your if syntax:
if [[ ! ${line:0:1} == "#" ]]; then
TIMTOWTDI ^^
while IFS='' read -r line
do
case "${line}" in
"#"*) echo "${line}"
;;
*) createSymlink ${line}
;;
esac
done < /some/file.txt
Note: I dropped the eval, which may be needed in some (rare!) cases but is usually dangerous.
Note2: I added a "safer" IFS & read (-r, raw) but you can revert to your own if it is better suited. Note that it still reads line by line.
Note3: I took the habit of using always ${var} instead of $var ... works for me (easy to find out vars in complex text, and easy to see where they begin and end at all times) but not necessary here.
Note4: you can also change the test to : *"#"*) if some of the (comments?) lines can have spaces or tabs before the '#' (and none of the symlink lines does contain a '#')

number of tokens in bash variable

How can I find out the number of whitespace-separated tokens in a bash variable, or at least whether there is one or more than one?
The ${#name[@]} expansion will tell you the number of elements in an array. If you're working with a bash version greater than 2.05 or so you can:
VAR='some string with words'
VAR=( $VAR )
echo ${#VAR[@]}
This effectively splits the string into an array along whitespace (which is the default delimiter), and then counts the members of the array.
EDIT:
Of course, this recasts the variable as an array. If you don't want that, use a different variable name or recast the variable back into a string:
VAR="${VAR[*]}"
I can't understand why people are using those overcomplicated bashisms all the time. There's almost always a straight-forward, no-bashism solution.
howmany() { echo $#; }
myvar="I am your var"
howmany $myvar
This uses the tokenizer built-in to the shell, so there's no discrepancy.
Here's one related gotcha:
myvar='*'
echo $myvar
echo "$myvar"
set -f
echo $myvar
echo "$myvar"
Note that the solution from @guns using bash array has the same gotcha.
The following is a (supposedly) super-robust version to work around the gotcha:
howmany() ( set -f; set -- $1; echo $# )
If we want to avoid the subshell, things start to get ugly
howmany() {
case $- in *f*) set -- $1;; *) set -f; set -- $1; set +f;; esac
echo $#
}
These two must be used WITH quotes, e.g. howmany "one two three" returns 3
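A short usage sketch for the subshell version, which is repeated here so that the snippet is self-contained:

```shell
#!/bin/sh
# Subshell version from above: set -f disables globbing inside the
# subshell only, so the caller's settings are left untouched.
howmany() ( set -f; set -- $1; echo $# )

howmany "one two three"        # three tokens
howmany "*"                    # a lone glob character counts as one token
howmany "  padded   words  "   # leading and trailing whitespace is ignored
howmany ""                     # no tokens at all
```

The four calls print 3, 1, 2 and 0 respectively.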
VAR='hello world'
echo $VAR | wc -w
here is how you can check.
if [ `echo $VAR | wc -w` -gt 1 ]
then
echo "Hello"
fi
Simple method:
$ VAR="a b c d"
$ set $VAR
$ echo $#
4
To count:
sentence="This is a sentence, please count the words in me."
words="${sentence//[^\ ]} "
echo ${#words}
To check:
sentence1="Two words"
sentence2="One"
[[ "$sentence1" =~ [\ ] ]] && echo "sentence1 has more than one word"
[[ "$sentence2" =~ [\ ] ]] && echo "sentence2 has more than one word"
For a robust, portable sh solution, see @JoSo's functions using set -f.
(Simple bash-only solution for answering (only) the "Is there at least 1 whitespace?" question; note: will also match leading and trailing whitespace, unlike the awk solution below:
[[ $v =~ [[:space:]] ]] && echo "\$v has at least 1 whitespace char."
)
Here's a robust awk-based bash solution (less efficient due to invocation of an external utility, but probably won't matter in many real-world scenarios):
# Functions - pass in a quoted variable reference as the only argument.
# Takes advantage of `awk` splitting each input line into individual tokens by
# whitespace; `NF` represents the number of tokens.
# `-v RS=$'\3'` ensures that even multiline input is treated as a single input
# string.
countTokens() { awk -v RS=$'\3' '{print NF}' <<<"$1"; }
hasMultipleTokens() { awk -v RS=$'\3' '{if(NF>1) ec=0; else ec=1; exit ec}' <<<"$1"; }
# Example: Note the use of glob `*` to demonstrate that it is not
# accidentally expanded.
v='I am *'
echo "\$v has $(countTokens "$v") token(s)."
if hasMultipleTokens "$v"; then
echo "\$v has multiple tokens."
else
echo "\$v has just 1 token."
fi
Not sure if this is exactly what you meant but:
$# = Number of arguments passed to the bash script
Otherwise you might be looking for something like man wc

Resources