Match string to multiple globs in Bash - bash

I'm trying to simplify my Bash 4 script. I'm reading lines from a file and I want to exclude lines matching certain substrings:
while read -r p; do
if [[ $p != *-ext ]]; then
if [[ $p != *-backend ]]; then
if [[ $p != *-vip ]]; then
echo "$p"
fi
fi
fi
done < "$hostsfile"
So if the line does NOT end with -ext or -backend or -vip then print it. What's an easier way (one-liner that doesn't chain &&'s) to accomplish this?

Use an extended pattern. This requires the extglob option to be enabled, but that is (temporarily) enabled by default for the RHS of the != operator inside [[ in bash 4.1 or later.
shopt -s extglob # if necessary
if [[ $p != *-@(backend|vip|ext) ]]; then
An equivalent using regular expressions would be
if ! [[ $p =~ -(backend|vip|ext)$ ]]; then
No additional options need to be set. A regular expression isn't implicitly anchored, so you don't need to match the beginning of the string as the leading * does in the pattern, but you do need to explicitly match the end of the string with $. Also, there is no doesn't-match regular-expression operator (like !~ in Perl), so you need to use =~ and negate the exit status of the command.
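As a minimal sketch of how either test slots into the loop from the question, the snippet below filters a handful of made-up host names. The helper names and the sample data are illustrative assumptions, not part of the original script.

```bash
shopt -s extglob   # harmless if already on; needed wherever [[ ]] does not enable it implicitly

# Hypothetical helper: succeeds when the name ends in -ext, -backend or -vip (glob version).
has_excluded_suffix() {
  [[ $1 == *-@(backend|vip|ext) ]]
}

# The same check written as a regular expression, for comparison.
has_excluded_suffix_re() {
  [[ $1 =~ -(backend|vip|ext)$ ]]
}

# Reads names on stdin and prints only those without an excluded suffix.
filter_hosts() {
  local p
  while IFS= read -r p; do
    has_excluded_suffix "$p" || printf '%s\n' "$p"
  done
}

# Made-up sample data standing in for the hosts file.
printf '%s\n' web01 web01-ext db01-backend lb-vip context | filter_hosts
```

With this sample input only web01 and context are printed; context survives because the hyphen is part of the suffix being tested.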

Adding a POSIX-compliant option (portable to /bin/sh) to the otherwise very good answer already present:
case $p in *-backend|*-vip|*-ext) : ;; *) echo "$p" ;; esac
Split out onto multiple lines, that would look like:
case $p in
*-backend|*-vip|*-ext)
:
;;
*)
echo "$p"
;;
esac
...that is to say, we're running the : command (shorthand for true) if any of the patterns is matched, and our echo otherwise.
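For completeness, here is a rough sketch of the case approach wrapped in the read loop, using only POSIX sh constructs. The function name print_unsuffixed and the sample names are assumptions made for illustration. It also shows that the excluded branch can simply be left empty instead of running the : command.

```bash
# Hypothetical helper: copy stdin to stdout, dropping lines that end in an excluded suffix.
print_unsuffixed() {
  # The extra test keeps a final line that lacks a trailing newline.
  while IFS= read -r p || [ -n "$p" ]; do
    case $p in
      *-backend|*-vip|*-ext) ;;       # excluded: empty branch, nothing to do
      *) printf '%s\n' "$p" ;;
    esac
  done
}

# Made-up sample data.
printf '%s\n' app01 app01-vip app02-backend app03 | print_unsuffixed
```

Against a real file this would be called as print_unsuffixed < "$hostsfile".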

Related

Bash - Comparing a string to an array that contains wildcards?

I have an array of possible file extensions, which contains some wild cards e.g.:
FILETYPES=("DBG" "MSG" "OUT" "output*.txt")
I also have a list of files, which I am grabbing the file extension from. I then need to compare the extension with the array of file extensions.
I have tried:
if [[ ${EXTENSION} =~ "${FILETYPES[*]}" ]]; then
echo "file found"
fi
if [[ ${EXTENSION} == "${FILETYPES[*]}" ]]; then
echo "file found"
fi
and
if [[ ${EXTENSION} =~ "${FILETYPES[*]}" ]]; then
echo "file found"
fi
But to no avail
I tried:
if [[ "${FILETYPES[*]}" =~ ${EXTENSION} ]]; then
echo "file found"
fi
However, it ended up comparing "txt" to "output*.txt" and concluding it was a match.
FILETYPES=("DBG" "MSG" "OUT" "output*.txt"): first of all, avoid ALL_CAPS variable names unless they are meant as global environment variables.
"output*.txt": this is fine as a globbing pattern, for example in the Bash test [[ $variable == output*.txt ]]. Regex matching needs a different syntax, such as [[ $variable =~ output.*\.txt ]].
"${FILETYPES[*]}": expanding the array into a single string is a reasonable approach, but it needs the IFS variable set so that the result expands into a regex. Something like IFS='|' regex_fragment="(${array[*]})" expands each array entry, separated by a pipe |, and encloses the result in parentheses as (entry1|entry2|...).
Here is an implementation you could use:
testscript.sh
#!/usr/bin/env bash
extensions_regexes=("DBG" "MSG" "OUT" "output.*\.txt")
# Expands the extensions regexes into a proper regex string
IFS='|' regex=".*\.(${extensions_regexes[*]})"
# Prints the regex for debug purposes
printf %s\\n "$regex"
# Iterate all filenames passed as argument to the script
for filename; do
# Compare the filename with the regex
if [[ $filename =~ $regex ]]; then
printf 'file found: %s \n' "$filename"
fi
done
Sample usage:
$ touch foobar.MSG foobar.output.txt
$ bash testscript.sh *
.*\.(DBG|MSG|OUT|output.*\.txt)
file found: foobar.MSG
file found: foobar.output.txt
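One thing to be aware of with the IFS trick is that a bare IFS='|' assignment stays in effect for the rest of the script. A possible variation, sketched here under the assumption that a small helper function is acceptable, keeps the change local. The name join_alternation and the sample file names are made up for this example.

```bash
# Hypothetical helper: join its arguments with "|" and wrap them in parentheses.
join_alternation() {
  local IFS='|'            # local, so the caller's IFS is left untouched
  printf '(%s)' "$*"
}

extension_regexes=('DBG' 'MSG' 'OUT' 'output.*\.txt')
# Anchor at the end so the extension has to be the last thing in the name.
regex="\.$(join_alternation "${extension_regexes[@]}")\$"

for name in foobar.MSG foobar.output_1.txt foobar.txtx; do
  if [[ $name =~ $regex ]]; then
    printf 'match: %s\n' "$name"
  else
    printf 'no match: %s\n' "$name"
  fi
done
```

The first two names match and foobar.txtx does not, since the regex ends in $.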
You cannot directly compare a string with an array. Would you please try something like:
filetypes=("DBG" "MSG" "OUT" "output*.txt")
extension="MSG" # example
match=0
for type in "${filetypes[@]}"; do
if [[ $extension = $type ]]; then
match=1
break
fi
done
echo "$match"
You can save looping with regex:
pat="^(DBG|MSG|OUT|output.*\.txt)$"
extension="output_foo.txt" # example
match=0
if [[ $extension =~ $pat ]]; then
match=1
fi
echo "$match"
Please note that regex syntax differs from the wildcard syntax used for globbing.
As a side note, by convention we avoid uppercase names for user variables, to prevent conflicts with system variables.
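To make the glob side of that distinction concrete, here is a small sketch of the loop packaged as a function. The name matches_any_glob is an assumption for illustration. The detail that matters is quoting: an unquoted right-hand side of == is treated as a pattern, while a quoted one is compared literally.

```bash
# Hypothetical helper: succeed if $1 matches any of the remaining arguments, taken as globs.
matches_any_glob() {
  local candidate=$1 pattern
  shift
  for pattern in "$@"; do
    # Unquoted right-hand side: $pattern is interpreted as a glob.
    [[ $candidate == $pattern ]] && return 0
  done
  return 1
}

filetypes=('DBG' 'MSG' 'OUT' 'output*.txt')

matches_any_glob 'output_foo.txt' "${filetypes[@]}" && echo "glob match"
# Quoted right-hand side: the * is an ordinary character here, so this does not match.
[[ 'output_foo.txt' == "${filetypes[3]}" ]] || echo "no literal match"
```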

How can I check if a variable contains only letters

I tried to check the following case:
#!/bin/bash
line="abc"
if [[ "${line}" != [a-z] ]]; then
echo INVALID
fi
And I get INVALID as output. But why?
Doesn't it check whether $line contains only characters in the range [a-z]?
Use the regular expression matching operator =~:
#!/bin/bash
line="abc"
if [[ "${line}" =~ [^a-zA-Z] ]]; then
echo INVALID
fi
Works in any Bourne shell and wastes no pipes/forks:
case $var in
("") echo "empty";;
(*[!a-z]*) echo "contains a non-alphabetic";;
(*) echo "just alphabetics";;
esac
Use [!a-zA-Z] if you want to allow upper case as well.
Could you please try the following and let me know if this helps you.
line="abc"
if echo "$line" | grep -i -q '^[a-z]*$'
then
echo "MATCHED."
else
echo "NOT-MATCHED."
fi
Pattern matches are anchored to the beginning and end of the string, so your code checks if $line is not a single lowercase character. You want to match an arbitrary sequence of lowercase characters, which you can do using extended patterns:
if [[ $line != +([a-z]) ]]; then
or using the regular-expression operator:
if ! [[ $line =~ ^[a-z]+$ ]]; then # there is no negative regex operator like Perl's !~
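The difference in anchoring is easy to see side by side. The following is only an illustrative sketch: the helper is_lowercase_word is a made-up name, and the ranges assume a locale where [a-z] behaves as in the C locale.

```bash
shopt -s extglob   # only needed where [[ ]] does not enable it implicitly

# Hypothetical helper: succeeds only for a non-empty run of lowercase letters.
is_lowercase_word() {
  [[ $1 == +([a-z]) ]]
}

line="abc"
# A glob must match the whole string, and [a-z] is exactly one character.
[[ $line == [a-z] ]]      || echo "[a-z] alone does not match abc"
is_lowercase_word "$line" && echo "+([a-z]) matches abc"
# A regex matches anywhere unless anchored with ^ and $.
[[ abc1 =~ [a-z] ]]       && echo "unanchored regex finds a letter inside abc1"
[[ abc1 =~ ^[a-z]+$ ]]    || echo "anchored regex rejects abc1"
```

All four messages are printed.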
Why? Because != means "not equal", that's why. You tell bash to compare abc with [a-z]. They are not equal.
Try echo "$line" | grep -i -q -x '[a-z]*'.
The flag -i makes grep case insensitive.
The flag -x means match the whole line.
The flag -q means print nothing to stdout, just return an exit status: 0 on a match, 1 otherwise.

Pass script arguments as a pattern

I have a bash script that requires a glob expression as a parameter. However I am having trouble using inputs as globs i.e say my input is
Shell_script '*.c'
and my code is iterating through an array of files and filtering them through pattern matching. In this case files which do not have the .c extension. (In this example, the first input could be any pattern otherwise)
count=${#array[@]}
for (( q = 0; q < count; q++ ));
do
if [[ ${array[q]} == $1 ]]; then
:
else unset array[q]
fi
done
.....
Any ideas?
Matching array contents against a glob is entirely possible:
#!/bin/bash
# this array has noncontiguous indexes to demonstrate a potential bug in the original code
array=( [0]="hello.c" [3]="cruel.txt" [5]="world.c" )
glob=$1
for idx in "${!array[@]}"; do
val=${array[$idx]}
if [[ $val = $glob ]]; then
echo "File $val matches glob expression $glob" >&2
else
echo "File $val does not match glob expression $glob; removing" >&2
unset "array[$idx]"
fi
done
Similarly, you can expand a glob against filesystem contents, though you'll want to clear IFS first to avoid string-splitting:
# here, the expectation is that your script would be invoked as: ./yourscript '*.c'
IFS=
for f in $1; do
[[ -e $f || -L $f ]] || { echo "No file matching $f found" >&2; continue; }
echo "Iterating over file $f"
done
That said, in general, this is extremely unidiomatic, as opposed to letting the calling shell expand the glob before your script is started, and reading the list of matched files off your argument vector. Thus:
# written this way, your script can just be called ./yourscript *.c
for f; do
[[ -e $f || -L $f ]] || { echo "No file matching $f found" >&2; continue; }
echo "Iterating over file $f"
done
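If the glob really does have to arrive as a quoted argument, a self-contained sketch of the filesystem expansion might look like the following. The function name list_matching and the temporary directory with its file names are assumptions made purely for demonstration.

```bash
# Hypothetical helper: print the names of entries in directory $1 that match glob $2.
list_matching() {
  local dir=$1 glob=$2 f
  local IFS=                            # empty IFS: the unquoted $glob is not word-split
  for f in "$dir"/$glob; do
    [[ -e $f || -L $f ]] || continue    # an unmatched glob stays literal; skip it
    printf '%s\n' "${f##*/}"
  done
}

# Demonstration in a throwaway directory.
tmpdir=$(mktemp -d)
touch "$tmpdir/hello.c" "$tmpdir/cruel.txt" "$tmpdir/my world.c"
list_matching "$tmpdir" '*.c'
rm -rf "$tmpdir"
```

This lists hello.c and "my world.c" but not cruel.txt. The file name containing a space comes through intact because the glob results are never word-split.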
You can loop over your list of files like this. If you run your script as
./test.sh "*.c", then inside your script you can do:
for file in $1
do
echo "$file" # use your file here
done

case insensitive check in if loop using regex in shell

I want to check the following condition, but it should be case insensitive.
if [ "SPP" == $1 ]
Is there any way I can do it using regex?
You can also do the following:
#!/bin/bash
myParam=`echo "$1" | tr 'a-z' 'A-Z'`
if [ "SPP" == "$myParam" ]; then
echo "Is the same"
else
echo "It is not the same"
fi
This script automatically converts user input to uppercase before making any string comparison. By doing so, you will not have to use regex for case insensitive string comparison.
Hope it helps.
Better late than never...
If that's ksh93, use the ~(i:...) case-insensitive globbing sub-pattern:
if [[ $1 == *~(i:spp)* ]]; then
: matched.
fi
For ksh88 (also the ksh clones), use an intermediary variable typeset -u'd to force upper-case:
typeset -u tocheck=$1
if [[ $tocheck == *SPP* ]]; then
: matched
fi
You can use:
shopt -s nocasematch
For case insensitive matching in BASH.
Alternatively this should also work:
[[ "$1" == [sS][pP][pP] ]]
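Since shopt -s nocasematch affects every later [[ and case in the script, it can be worth limiting its scope. Below is one possible sketch that saves and restores the setting. The function name equals_ignore_case is hypothetical.

```bash
# Hypothetical helper: case-insensitive string equality that leaves nocasematch as it found it.
equals_ignore_case() {
  local restore status
  # shopt -p prints the command that recreates the current setting.
  # It returns non-zero when the option is off, hence the || true.
  restore=$(shopt -p nocasematch || true)
  shopt -s nocasematch
  [[ $1 == "$2" ]]         # quoted right-hand side: compared as a string, not a pattern
  status=$?
  eval "$restore"
  return "$status"
}

equals_ignore_case "sPp" "SPP" && echo "same"
equals_ignore_case "SPQ" "SPP" || echo "different"
```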

Matching one of several possible characters in a string

In a bash (version 3.2.48) script I get a string that can be something like:
'XY'
' Y'
'YY'
etc
So, I have either an alphabetic character OR a space (first slot), then the relevant character (second slot). I tried some variation (without grep, sed, ...) like:
if [[ $string =~ ([[:space]]{1}|[[:alpha:]]{1})M ]]; then
and
if [[ $string =~ (\s{1}|.{1})M ]]; then
but my solutions did not always work correctly (matching correctly every combination).
This should work for you:
if [[ $string =~ [[:space:][:alpha:]]M ]]; then
if [[ ${string:1:1} == "M" ]]; then
echo Heureka
fi
or (if you want to do it with patterns)
if [[ $string =~ ([[:space:]]|[[:alpha:]])M ]]; then
echo Heureka
fi
or (even simpler)
if [[ $string == ?M ]]; then
echo Heureka
fi
Without using regular expressions, simply pattern matching is sufficient:
if [[ $string == [[:upper:]\ ]M ]]; then
echo match
fi
Given your example, you want [[:upper:]] rather than merely [[:alpha:]]
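A quick way to convince yourself that the bracket expression covers every combination is to run it over a few test strings. This is just a sketch: the helper name and the sample strings are made up, and it assumes nocasematch is off.

```bash
# Hypothetical helper: exactly two characters, an uppercase letter or a space, followed by M.
has_m_in_second_slot() {
  [[ $1 == [[:upper:]\ ]M ]]
}

for s in 'XM' ' M' 'xM' 'XY' 'XXM'; do
  if has_m_in_second_slot "$s"; then
    printf '"%s" matches\n' "$s"
  else
    printf '"%s" does not match\n' "$s"
  fi
done
```

Only 'XM' and ' M' match. 'XXM' is rejected because a glob has to match the whole string, not just its tail.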
