Counting filenames matching a regex in bash

I have the following script
setup=`ls ./test | egrep 'm-ha-.........js'`
regex="m-ha-(........)\.js"
if [[ "$setup" =~ $regex ]]
then
checksum=${BASH_REMATCH[1]}
fi
I noticed that if [[ "$setup" =~ $regex ]] only puts the first file that matches the regex into BASH_REMATCH.
Is there a way to test how many files match the regex? I want to return an error if multiple files match it.

You don't need a regex, or ls, for this.
matches=(./test/m-ha-????????.js)
[[ ${#matches[*]} -gt 1 ]] && echo "More than one."
We expand the wildcard into an array and examine the number of elements in the array.
If you want to strip the prefix, ${matches[0]#./test/m-ha-} returns the first element with the prefix removed. The % operator similarly strips a suffix, e.g. ${matches[0]%.js}. You can't strip from both ends in a single expansion, but you can loop over the matches:
for match in "${matches[@]%.js}"; do
echo "${match#./test/m-ha-}"
done
Notice that the array won't be empty if there are no matches unless you explicitly set the nullglob option.
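The array approach and the nullglob caveat can be combined to get the behaviour the question asks for: an error unless exactly one file matches. This is only a sketch; the function name find_single_checksum, its messages, and its exit codes are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the 8-character checksum if exactly one file
# in directory $1 matches m-ha-????????.js; otherwise fail with a message.
find_single_checksum() (
    # The function body is a subshell, so nullglob does not leak to the caller.
    shopt -s nullglob                  # an unmatched glob expands to nothing
    matches=("$1"/m-ha-????????.js)
    case ${#matches[@]} in
        0)  echo "no matching file" >&2
            exit 1 ;;
        1)  name=${matches[0]##*/m-ha-}   # strip directory and prefix
            echo "${name%.js}" ;;         # strip suffix
        *)  echo "more than one matching file" >&2
            exit 2 ;;
    esac
)
```

Called as checksum=$(find_single_checksum ./test) || exit 1, the script stops when there are zero or several candidates.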

Related

In Bash, is it possible to match a string variable containing wildcards to another string

I am trying to compare strings against a list of other strings read from a file.
However some of the strings in the file contain wildcard characters (both ? and *) which need to be taken into account when matching.
I am probably missing something but I am unable to see how to do it
Eg.
I have strings from file in an array which could be anything alphanumeric (and include commas and full stops) with wildcards : (a?cd, xy, q?hz, j,h-??)
and I have another string I wish to compare with each item in the list in turn. Any of the strings may contain spaces.
so what I want is something like
teststring="abcdx.rubb ish,y"
matchstrings=("a?cd" "*x*y" "q?h*z" "j*,h-??")
for i in "${matchstrings[@]}" ; do
if [[ "$i" == "$teststring" ]]; then # this test here is the problem
<do something>
else
<do something else>
fi
done
This should match on the second "matchstring" but not any others
Any help appreciated
Yes; you just have the two operands to == reversed; the glob goes on the right (and must not be quoted):
if [[ $teststring == $i ]]; then
Example:
$ i=f*
$ [[ foo == $i ]] && echo pattern match
pattern match
If you quote the parameter expansion, the operation is treated as a literal string comparison, not a pattern match.
$ [[ foo == "$i" ]] || echo "foo != f*"
foo != f*
Spaces in the pattern are not a problem:
$ i="foo b*"
$ [[ "foo bar" == $i ]] && echo pattern match
pattern match
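Applied to the loop from the question, the fix looks roughly like the sketch below. The matched array is an addition for illustration; it simply collects the patterns that matched.

```shell
#!/bin/bash
teststring="abcdx.rubb ish,y"
matchstrings=("a?cd" "*x*y" "q?h*z" "j*,h-??")
matched=()   # illustrative: collects every pattern that matches

for pattern in "${matchstrings[@]}"; do
    # Pattern on the right and unquoted, so ? and * act as wildcards.
    if [[ $teststring == $pattern ]]; then
        matched+=("$pattern")
    fi
done

printf '%s\n' "${matched[@]}"
```

Only the second pattern, *x*y, matches the test string.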
You can do this even completely within POSIX, since case alternatives undergo parameter substitution:
#!/bin/sh
teststring="abcdx.rubbish,y"
while IFS= read -r matchstring; do
case $teststring in
($matchstring) echo "$matchstring";;
esac
done << "EOF"
a?cd
*x*y
q?h*z
j*,h-??
EOF
This outputs only *x*y as desired.

Regular expression in bash not working in conditional construct in Bash with operator '=~'

The regular expression I have put into the conditional construct (with the =~ operator) would not return the value as I had expected, but when I assign them into two variables it worked. Wondering if I had done something wrong.
Version 1 (this one worked)
a=30
b='^[0-9]+$' #pattern looking for a number
[[ $a =~ $b ]]
echo $?
#result is 0, as expected
Version 2 (this one doesn't work but I thought it is identical)
[[ 30 =~ '^[0-9]+$' ]]
echo $?
#result is 1
Don't quote the regular expression:
[[ 30 =~ ^[0-9]+$ ]]
echo $?
From the manual:
Any part of the pattern may be quoted to force the quoted portion to be matched as a string.
So if you quote the entire pattern, it's treated as a fixed string match rather than a regular expression.
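The two behaviours can be put side by side in a small sketch. The function names is_number and is_literal are hypothetical; both store the same regex in a variable and differ only in whether the expansion is quoted.

```shell
#!/bin/bash
# Hypothetical helpers contrasting unquoted and quoted regex operands.
is_number() {
    local re='^[0-9]+$'   # quotes here only protect the assignment
    [[ $1 =~ $re ]]       # unquoted expansion: used as a regular expression
}

is_literal() {
    local re='^[0-9]+$'
    [[ $1 =~ "$re" ]]     # quoted expansion: matched as the fixed string ^[0-9]+$
}

is_number 30 && echo "30 is a number"
is_literal 30 || echo "30 does not contain the literal text ^[0-9]+\$"
```

Keeping the pattern in a variable, as in Version 1 of the question, sidesteps most quoting surprises.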

need to remove the last zeros in line

I need to read a file and check whether each line ends in a zero. If the last digit is zero, I want to delete it. Please help me with this.
input="temp.txt"
while IFS= read -r line
do
echo "output :$line"
if [[ $line == 0$ ]]; then
echo " blash "
else
echo "anotherblash"
fi
done < "$input"
You can do this type of substitution with sed:
sed 's/0*$//' temp.txt
This removes all the trailing zeros from each line. 0* matches "zero or more" 0s, and $ matches the end of the line.
If you only ever want to remove one 0, then remove the *.
If you prefer to do the same thing in the shell (I assume you use bash, since your attempt includes [[), you could do this:
#!/bin/bash
# match any line ending in one or more zeros
# capture everything up to the trailing 0s
re='(.*[^0])0+$'
while read -r line; do
# use =~ for regex match
if [[ $line =~ $re ]]; then
# assign first capture group, discarding trailing 0s
line=${BASH_REMATCH[1]}
fi
echo "$line"
done < temp.txt
But this approach has the disadvantages of being more complicated and less portable, so I would go with the sed option.
In the expression command [[ $line == 0$ ]] you use the regular expression 0$, but, as man sh tells:
When the == and != operators are used, the string to the right of the operator is considered a pattern and matched according to the rules described below under Pattern Matching. …
An additional binary operator, =~, is available, with the same precedence as == and !=. When it is used, the string to the right of the operator is considered an extended regular expression and matched accordingly (as in regex(3)).
So, since you use the == operator, you have to specify a pattern as with filename matching, i. e. [[ $line == *0 ]].
While the solution given by John1024 in the comment is the right way to go, if you prefer to follow your original approach, it does not make sense to compare [[ $line == 0$ ]], because this would just check whether the line consists of the digit zero followed by a dollar sign. Instead, you would have to do a regular expression match, i.e.
if [[ $line =~ 0$ ]]
This would yield true, if the line ends in a zero.
Another possibility is to stick with globbing and write the condition as
if [[ $line == *0 ]]
Note that within [[ ... ]], a =~ does regexp matching and a == does wildcard matching (i.e. via globbing).
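A short sketch of both tests, plus a pure-shell way to drop the zeros once one is found. The function names are hypothetical, and strip_trailing_zeros is a variation that uses the glob test with the % suffix-removal expansion rather than the regex capture shown earlier.

```shell
#!/bin/bash
# Hypothetical helpers: the same "ends in 0" test written both ways.
ends_in_zero_glob()  { [[ $1 == *0 ]]; }   # == uses glob patterns
ends_in_zero_regex() { [[ $1 =~ 0$ ]]; }   # =~ uses extended regexes

# Remove every trailing 0, one character at a time.
strip_trailing_zeros() {
    local line=$1
    while [[ $line == *0 ]]; do
        line=${line%0}    # % strips the shortest matching suffix
    done
    printf '%s\n' "$line"
}

strip_trailing_zeros 1200
```

For whole files the sed one-liner is still the simpler choice; the loop is only useful when the value is already in a shell variable.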

Case insensitive comparison in If condition

I have a CSV file and I need to count the rows whose year falls within a given range and whose artist_name matches the name argument. The string matching should be case insensitive. How do I achieve that in the if condition?
I am a beginner, so please bear with me.
#!/bin/bash
file="$1"
artist="$2"
from_year="$(($3-1))"
to_year="$(($4+1))"
count=0
while IFS="," read arr1 arr2 arr3 arr4 arr5 arr6 arr7 arr8 arr9 arr10 arr11 ; do
if [[ $arr11 -gt $from_year ]] && [[ $arr11 -lt $to_year ]] && [[ $arr7 =~ $artist ]]; then
count=$((count+1))
fi
done < "$file"
echo $count
The $arr7 =~ $artist part is where i need to make the modification
Bash has a builtin method for converting strings to lower case. Once they are both lower case, you can compare them for equality. For example:
$ arr7="Rolling Stones"
$ artist="rolling stoneS"
$ [ "${arr7,,}" = "${artist,,}" ] && echo "Matches!"
Matches!
$ [[ ${arr7,,} =~ ${artist,,} ]] && echo "Matches!"
Matches!
Details
${parameter,,} converts all characters in a string to lower case. If you wanted to convert to upper case, use ${parameter^^}. If you want to convert just some of the characters, use ${parameter,,pattern}, where only those characters matching pattern are changed. More details are documented in man bash:
${parameter^pattern}
${parameter^^pattern}
${parameter,pattern}
${parameter,,pattern}
Case modification. This expansion modifies the case of alphabetic characters in parameter. The pattern is expanded to produce a pattern just as in pathname expansion. The ^ operator converts lowercase letters matching pattern to uppercase; the , operator converts matching uppercase letters to lowercase. The ^^ and ,, expansions convert each matched character in the expanded value; the ^ and , expansions match and convert only the first character in the expanded value. If pattern is omitted, it is treated like a ?, which matches every character. If parameter is @ or *, the case modification operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with @ or *, the case modification operation is applied to each member of the array in turn, and the expansion is the resultant list.
Compatibility
These case modification methods require bash version 4 (released on 2009-Feb-20) or better.
The bash case-transformations (${var,,} and ${var^^}) were introduced (some time ago) in bash version 4. However, if you are using Mac OS X, you most likely have bash v3.2 which doesn't implement case-transformation natively.
In that case, you can do lower-cased comparison less efficiently and with a lot more typing using tr:
if [[ $(tr "[:upper:]" "[:lower:]" <<<"$arr7") = $(tr "[:upper:]" "[:lower:]" <<<"$artist") ]]; then
# ...
fi
By the way, =~ does a regular expression comparison, not a string comparison. You almost certainly wanted =. Also, instead of [[ $x -lt $y ]] you can use an arithmetic compound command: (( x < y )). (In arithmetic expansions, it is not necessary to use $ to indicate variables.)
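Putting these suggestions together, the counting loop could be sketched as follows. The function name count_rows and the numeric-year guard are additions for illustration; the column positions (7 for the artist, 11 for the year) follow the read statement in the question, and ${var,,} assumes bash 4 or later.

```shell
#!/bin/bash
# Hypothetical helper: count rows of a CSV file whose artist (column 7)
# equals $2 ignoring case and whose year (column 11) lies in [$3, $4].
count_rows() {
    local file=$1 artist=$2 from=$3 to=$4 count=0
    local c1 c2 c3 c4 c5 c6 name c8 c9 c10 year
    while IFS=, read -r c1 c2 c3 c4 c5 c6 name c8 c9 c10 year; do
        # Skip header lines or rows whose last field is not a plain number.
        [[ $year =~ ^[0-9]+$ ]] || continue
        # (( )) needs no $ on variables; == with a quoted right-hand side
        # is a literal string comparison, not a pattern or regex match.
        if (( year >= from && year <= to )) && [[ ${name,,} == "${artist,,}" ]]; then
            count=$((count + 1))
        fi
    done < "$file"
    echo "$count"
}
```

Using inclusive bounds in the arithmetic test removes the need to adjust the year arguments by one, as the original script did.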
Use shopt -s nocasematch
demo
#!/bin/bash
words=(Cat dog mouse cattle scatter)
#Print words from list that match pat
print_matches()
{
pat=$1
echo "Pattern to match is '$pat'"
for w in "${words[@]}"
do
[[ $w =~ $pat ]] && echo "$w"
done
echo
}
echo -e "Wordlist: (${words[@]})\n"
echo "Normal matching"
print_matches 'cat'
print_matches 'Cat'
echo -e "-------------------\n"
echo "Case-insensitive matching"
shopt -s nocasematch
print_matches 'cat'
print_matches 'CAT'
echo -e "-------------------\n"
echo "Back to normal matching"
shopt -u nocasematch
print_matches 'cat'
output
Wordlist: (Cat dog mouse cattle scatter)
Normal matching
Pattern to match is 'cat'
cattle
scatter
Pattern to match is 'Cat'
Cat
-------------------
Case-insensitive matching
Pattern to match is 'cat'
Cat
cattle
scatter
Pattern to match is 'CAT'
Cat
cattle
scatter
-------------------
Back to normal matching
Pattern to match is 'cat'
cattle
scatter
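Because shopt -s nocasematch stays in effect until it is unset, it can be convenient to confine it to a single test. A minimal sketch, using a hypothetical function matches_nocase whose body runs in a subshell:

```shell
#!/bin/bash
# Hypothetical helper: case-insensitive regex test that leaves the
# caller's nocasematch setting untouched.
matches_nocase() (
    shopt -s nocasematch   # only affects this subshell
    [[ $1 =~ $2 ]]         # exit status of the test becomes the return value
)

matches_nocase "scatter" "CAT" && echo "scatter matches CAT"
```

The subshell costs a fork per call, so for tight loops the explicit shopt -s / shopt -u pair from the demo is the cheaper option.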

Search in string for multiple array values

I'm looking at a simple for loop with the following logic:
variable=`some piped string`
array_value=(1.1 2.9)
for i in ${array_value[@]}; do
if [[ "$variable" == *some_text*"$array_value" ]]; then
echo -e "Info: Found a matching string"
fi
The problem is that I cannot get this to show me when it finds either the string ending in 1.1 or 2.9 as sample data.
If I do an echo $array_value in the for loop I can see that the array values are being taken so its values are being parsed, though the if loop doesn't return that echo message although the string is present.
Later edit:
Based on the comments received I've abstracted the code to something like this, which still doesn't work if I want to use wildcards inside the comparison quote
versions=(1.1 2.9)
string="system is running version:2.9"
for i in ${versions[@]}; do
if [[ "$string" == "system*${i}" ]]; then
echo "match found"
fi
done
Any construction similar to "system* ${i}" or "* ${i}" will not work, though if I specify the full string pattern it will work.
The problem with the test construct has to do with your if statement. To construct the if statement in a form that will evaluate, use:
if [[ "$variable" == "*some_text*${i}" ]]; then
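Note: because the right-hand side is quoted, it is compared literally, so *some_text* must be replaced with the actual text without * wildcards. Globbing does not need to be turned off, since pathname expansion is not performed inside [[ ... ]]. If you want * to act as a wildcard, leave that part of the pattern unquoted. Braces around the variable i keep its name separate from any adjacent text.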
There is nothing wrong with putting *some_text* up against the variable i, but it is cleaner, depending on the length of some_text, to assign it to a variable itself. The easiest way to accommodate this would be to define a variable to hold the some_text you are needing. E.g.:
prefix="some_text"
if [[ "$variable" == "${prefix}${i}" ]]; then
If you have additional questions, just ask.
Change "system*${i}" to system*$i.
Wrapping with quotes inside [[ ... ]] nullifies the wildcard * by treating it as a literal character.
Or if you want the match to be assigned to a variable:
match="system*"
you can then do:
[[ $string == $match$i ]]
You actually don't need quotes around $string either as word splitting is not performed inside [[ ... ]].
From man bash:
[[ expression ]]
...
Word splitting and pathname expansion are not
performed on the words between the [[ and ]]
...
Any part of the pattern may be quoted to force
the quoted portion to be matched as a string.
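With that change, the abstracted loop from the question can be sketched as below. The found array is an addition for illustration, and quoting "$v" is a precaution so that any glob characters in a version string are matched literally.

```shell
#!/bin/bash
versions=(1.1 2.9)
string="system is running version:2.9"
found=()   # illustrative: collects the versions that matched

for v in "${versions[@]}"; do
    # system* is unquoted, so * is a wildcard; "$v" is quoted, so it is
    # matched as a plain string.
    if [[ $string == system*"$v" ]]; then
        found+=("$v")
    fi
done

echo "match found for: ${found[*]}"
```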
