Handling Spaces In Substring Searches In Bash Shell Scripts - bash

I'm using the following to determine if either substring is present in a $mainString in a Bash (ver 3.2.25) shell script:
if [[ $mainString = *cat* || $mainString = *blue cheese* ]]; then
echo "FOUND"
else
echo "NOT FOUND"
fi
But I keep getting an error because of the space in "blue cheese". How do you handle spaces in the substring?

You can escape the space:
$mainString = *blue\ cheese*
or quote the non-wildcard portions, one example of which is
$mainString = *'blue cheese'*
Often, it is better to store the pattern in a variable, both to simplify the quoting and to make the [[...]] expression more concise. Note that you must not quote the parameter expansion, as glenn jackman points out in his comment.
pattern="*blue cheese*"
if [[ $mainString = *cat* || $mainString = $pattern ]]; then
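As a minimal sketch of the variants above, the following wraps the test in a helper function. The function name `contains_food` and the sample strings are made up for illustration and are not part of the original post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports whether its argument contains "cat" or "blue cheese".
contains_food() {
    local mainString=$1
    # The quotes here only protect the assignment; they are not part of the pattern.
    local pattern="*blue cheese*"
    # $pattern is deliberately left unquoted so [[ ]] treats it as a glob pattern.
    if [[ $mainString = *cat* || $mainString = $pattern ]]; then
        echo "FOUND"
    else
        echo "NOT FOUND"
    fi
}

contains_food "I like blue cheese"
contains_food "I like bluecheese"

# The inline alternatives: escape the space, or quote the literal part only.
[[ "some blue cheese here" = *blue\ cheese* ]] && echo "escaped space matches"
[[ "some blue cheese here" = *'blue cheese'* ]] && echo "quoted portion matches"
```

Keeping the wildcards outside the quotes is what matters in the inline forms: anything inside quotes is matched literally.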

Related

Shell script Bash, Check if string starts and ends with single quotes

I need to check if a string starts and ends with a single quote, for example
'My name is Mozart'
What I have is this, which doesn't work
if [[ $TEXT == '*' ]] ;
This does not work either
if [[ $TEXT == /'*/' ]] ;
But if I change it to
if [[ $TEXT == a*a ]] ;
it works for a sentence like 'an amazing apa'. So I believe it has to do with the single quote sign.
Any ideas on how I can solve it?
With a regex:
if [[ $TEXT =~ ^\'.*\'$ ]]
With globbing:
if [[ $TEXT == \'*\' ]]
I am writing the complete bash script so you won't have any confusion:
#! /bin/bash
text1="'helo there"
if [[ $text1 =~ ^\'.*\'$ ]]; then
echo "text1 match"
else
echo "text1 not match"
fi
text2="'hello babe'"
if [[ $text2 =~ ^\'.*\'$ ]]; then
echo "text2 match"
else
echo "text2 not match"
fi
Save the above script as
matchCode.sh
Now run it as:
./matchCode.sh
output:
text1 not match
text2 match
Ask if you have any confusion.
Cyrus' helpful answer solves your problem as posted.
However, I suspect you may be confused over quotes that are part of the shell syntax vs. quotes that are actually part of the string:
In a POSIX-like shell such as Bash, 'My name is Mozart' is a single-quoted string whose content is the literal My name is Mozart - without the enclosing '. That is, the enclosing ' characters are syntactic elements that tell the shell that everything between them is the literal contents of the string.
By contrast, to create a string whose content is actually enclosed in ' - i.e., has embedded ' instances, you'd have to use something like: "'My name is Mozart'". Now it is the enclosing " instances that are the syntactic elements that bookend the string content.
Note, however, that using a "..." string (double quotes) makes the contents subject to string interpolation (expansion of embedded variable references, arithmetic and command substitutions; none in the case at hand, however), so it's important to know when to use '...' (literal strings) vs. "..." (interpolated strings).
Embedding ' instances in '...' strings is actually not supported at all in POSIX-like shells, but in Bash, Ksh, and Zsh there's another string type that allows you to do that: ANSI C-quoted strings, $'...', in which you can embed ' escaped as \': $'\'My name is Mozart\''
Another option is to use string concatenation: In POSIX-like shells, you can place substrings employing different quoting styles (including unquoted tokens) directly next to one another in order to form a single string: "'"'My Name is Mozart'"'" would also give you a string with contents 'My Name is Mozart'.
POSIX-like shells also allow you to escape individual, unquoted characters (meaning: neither part of a single- nor a double-quoted string) with \; therefore, \''My name is Mozart'\' yields the same result.
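The quoting styles described above can be compared side by side. This is only a sketch; the variable names `s1` to `s4` are illustrative, and the `$'...'` form assumes Bash, Ksh, or Zsh.

```shell
#!/usr/bin/env bash
# Four ways to write a string whose content includes the enclosing single quotes.
s1="'My name is Mozart'"        # double quotes around embedded single quotes
s2=$'\'My name is Mozart\''     # ANSI C quoting (not POSIX)
s3="'"'My name is Mozart'"'"    # concatenation of differently quoted pieces
s4=\''My name is Mozart'\'      # individually escaped, unquoted ' characters

# Print each variant; all four lines should look identical.
for s in "$s1" "$s2" "$s3" "$s4"; do
    printf '%s\n' "$s"
done
```

Each variable ends up holding the same 19 characters, including the leading and trailing single quote.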
The behavior of Bash's == operator inside [[ ... ]] (conditionals) may have added to the confusion:
If the RHS (right-hand side - the operand to the right of operator ==) is quoted, Bash treats it like a literal; only unquoted strings (or variable references) are treated as (glob-like) patterns:
'*' matches literal *, whereas * (unquoted!) matches any sequence of characters, including none.
Thus:
[[ $TEXT == '*' ]] would only ever match the single, literal character *.
[[ $TEXT == /'*/' ]], because it mistakes / for the escape character - which in reality is \ - would only match literal /*/ (/'*/' is effectively a concatenation of unquoted / and single-quoted literal */).
[[ $TEXT == a*a ]], due to using an unquoted RHS, is the only variant that actually performs pattern matching: any string that starts with a and ends with a is matched, including aa (because unquoted * represents any sequence of characters).
To verify that Cyrus' commands do work with strings whose content is enclosed in (embedded) single quotes, try these commands, which - on Bash, Ksh, and Zsh - should both output yes.
[[ "'ab'" == \'*\' ]] && echo yes # pattern matching, indiv. escaped ' chars.
[[ "'ab'" =~ ^\'.*\'$ ]] && echo yes # regex operator =~

How to check if a file name matches regex in shell script

I have a shell script that needs to check if a file name matches a certain regex, but it always shows "not match". Can anyone let me know what's wrong with my code?
fileNamePattern=abcd_????_def_*.txt
realFilePath=/data/file/abcd_12bd_def_ghijk.txt
if [[ $realFilePath =~ $fileNamePattern ]]
then
echo $realFilePath match $fileNamePattern
else
echo $realFilePath not match $fileNamePattern
fi
There is a confusion between regexes and the simpler "glob"/"wildcard"/"normal" patterns – whatever you want to call them. You're using the latter, but call it a regex.
If you want to use a pattern, you should
Quote it when assigning¹:
fileNamePattern="abcd_????_def_*.txt"
You don't want anything to expand quite yet.
Make it match the complete path. This doesn't match:
$ mypath="/mydir/myfile1.txt"
$ mypattern="myfile?.txt"
$ [[ $mypath == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
Doesn't match!
But after extending the pattern to start with *:
$ mypattern="*myfile?.txt"
$ [[ $mypath == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
The first one doesn't match because it matches only the filename, but not the complete path. Alternatively, you could use the first pattern, but remove the rest of the path with parameter expansion:
$ mypattern="myfile?.txt"
$ mypath="/mydir/myfile1.txt"
$ echo "${mypath##*/}"
myfile1.txt
$ [[ ${mypath##*/} == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
Use == and not =~, as shown in the above examples. You could also use the more portable = instead, but since we're already using the non-POSIX [[ ]] instead of [ ], we might as well use ==.
If you want to use a regex, you should:
Write your pattern as one: ? and * have a different meaning in regexes; they modify what they stand after, whereas in glob patterns, they can stand on their own (see the manual). The corresponding pattern would become:
fileNameRegex='abcd_.{4}_def_.*\.txt'
and could be used like this:
$ mypath="/data/file/abcd_12bd_def_ghijk.txt"
$ [[ $mypath =~ $fileNameRegex ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
Keep your habit of writing the regex into a separate parameter and then use it unquoted in the conditional operator [[ ]], or escaping gets very messy – it's also more portable across Bash versions.
The BashGuide has a great article about the different types of patterns in Bash.
Notice that quoting your parameters is almost always a good habit. It's not required in conditional expressions in [[ ]], and actually suppresses interpretation of the right-hand side as a pattern or regex. If you were using [ ] (which doesn't support regexes and patterns anyway), quoting would be required to avoid unexpected side effects of special characters and empty strings.
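To see how quoting the right-hand side changes the result, here is a small sketch. It assumes Bash 3.2 or later with default compatibility settings (older versions treated a quoted regex differently); the variable names are illustrative.

```shell
#!/usr/bin/env bash
path="/data/file/abcd_12bd_def_ghijk.txt"
glob='*abcd_????_def_*.txt'       # glob pattern, leading * covers the directory part
regex='abcd_.{4}_def_.*\.txt'     # equivalent extended regular expression

# Unquoted right-hand side: interpreted as a pattern or regex.
[[ $path == $glob ]]    && glob_unquoted=yes  || glob_unquoted=no
[[ $path =~ $regex ]]   && regex_unquoted=yes || regex_unquoted=no

# Quoted right-hand side: compared as a literal string.
[[ $path == "$glob" ]]  && glob_quoted=yes    || glob_quoted=no
[[ $path =~ "$regex" ]] && regex_quoted=yes   || regex_quoted=no

echo "glob:  unquoted=$glob_unquoted quoted=$glob_quoted"
echo "regex: unquoted=$regex_unquoted quoted=$regex_quoted"
```

Under those assumptions both unquoted tests should report yes and both quoted tests no, because the path does not literally contain the pattern text.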
¹ Not exactly true in this case, actually. When assigning to a variable, the manual says that the following happens:
[...] tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, and quote removal [...]
i.e., no pathname (glob) expansion. While in this very case using
fileNamePattern=abcd_????_def_*.txt
would work just as well as the quoted version, using quotes prevents surprises in many other cases and is required as soon as you have a blank in the pattern.
Use RegExs instead of wildcards:
fileNamePattern="abcd_...._def_.*\.txt"
realFilePath=/data/file/abcd_12bd_def_ghijk.txt
if [[ $realFilePath =~ $fileNamePattern ]]
then
echo $realFilePath match $fileNamePattern
else
echo $realFilePath not match $fileNamePattern
fi
Output:
/data/file/abcd_12bd_def_ghijk.txt match abcd_...._def_.*\.txt

Search in string for multiple array values

I'm looking at a simple for loop with the following logic:
variable=`some piped string`
array_value=(1.1 2.9)
for i in ${array_value[@]}; do
if [[ "$variable" == *some_text*"$array_value" ]]; then
echo -e "Info: Found a matching string"
fi
The problem is that I cannot get this to show me when it finds either the string ending in 1.1 or 2.9 as sample data.
If I do an echo $array_value in the for loop I can see that the array values are being taken so its values are being parsed, though the if loop doesn't return that echo message although the string is present.
Later edit:
Based on the comments received I've abstracted the code to something like this, which still doesn't work if I want to use wildcards inside the comparison quote
versions=(1.1 2.9)
string="system is running version:2.9"
for i in ${versions[@]}; do
if [[ "$string" == "system*${i}" ]]; then
echo "match found"
fi
done
Any construction similar to "system* ${i}" or "* ${i}" will not work, though if I specify the full string pattern it will work.
The problem with the test construct has to do with your if statement. To construct the if statement in a form that will evaluate, use:
if [[ "$variable" == "*some_text*${i}" ]]; then
Note: *some_text* will need to be replaced with actual text without * wildcards. If the * is needed in the text, then you will need to turn globbing off to prevent expansion by the shell. If expansion is your goal, then protect the variable i by braces.
There is nothing wrong with putting *some_text* up against the variable i, but it is cleaner, depending on the length of some_text, to assign it to a variable itself. The easiest way to accommodate this would be to define a variable to hold the some_text you are needing. E.g.:
prefix="some_text"
if [[ "$variable" == "${prefix}${i}" ]]; then
If you have additional questions, just ask.
Change "system*${i}" to system*$i.
Wrapping with quotes inside [[ ... ]] nullifies the wildcard * by treating it as a literal character.
Or if you want the match to be assigned to a variable:
match="system*"
you can then do:
[[ $string == $match$i ]]
You actually don't need quotes around $string either as word splitting is not performed inside [[ ... ]].
From man bash:
[[ expression ]]
...
Word splitting and pathname expansion are not
performed on the words between the [[ and ]]
...
Any part of the pattern may be quoted to force
the quoted portion to be matched as a string.
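Putting the advice from this answer together, a corrected version of the loop might look like the sketch below. The `matched` variable is added only so the result can be inspected afterwards and is not part of the original question.

```shell
#!/usr/bin/env bash
versions=(1.1 2.9)
string="system is running version:2.9"
matched=""

for i in "${versions[@]}"; do
    # "system" is literal, the unquoted * is the wildcard,
    # and "$i" is quoted so its value is always matched literally.
    if [[ $string == system*"$i" ]]; then
        matched=$i
        echo "match found: $i"
    fi
done
```

Quoting only `$i` keeps the wildcard active while protecting against array values that happen to contain pattern characters such as * or ?.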

Comparing variables in shell scripts

I have got a project that involves shell scripts and comparing values/variables within them. I have looked here and elsewhere on comparing variables and I have tried all the various examples given but I am running into something that is not as advertised. OS is Solaris10
I have created the following script as a learning experience-
#!/bin/ksh
stest()
{
if $X = $Y
then echo they're the same
else echo they're not the same
fi
}
X=a
Y=a
stest
echo completed
I keep getting some variation of the following-
using shell sh or ksh-
#./test.sh
./test.sh[2]: a: not found
completed
using shell bash-
#./test.sh
./test.sh: line 5: a: command not found
completed
I have tried enclosing the if $X = $Y line in brackets and double brackets and I get back
[a: not found
or
[[a: not found
If I change the variables X and Y to the numeral "1" I get the same thing-
./test.sh[2]: 1: not found
I have tried enclosing things in single quotes, double quotes & backwards quotes.
Any help is appreciated.
After if, you need a shell command, like anywhere else. $X = $Y is parsed as a shell command, meaning $X is interpreted as a command name (provided that the value of the variable is a single word).
You can use the [ command (also available as test) or the [[ … ]] special syntax to compare two variables. Note that you need spaces on the inside of the brackets: the brackets are a separate token in the shell syntax.
if [ "$X" = "$Y" ]; then …
or
if [[ "$X" = "$Y" ]]; then …
[ … ] works in any shell, [[ … ]] only in ksh, bash and zsh.
Note that you need double quotes around the variables¹. If you leave off the quotes, then the variable is split into multiple words and each word is interpreted as a wildcard pattern. This doesn't happen inside [[ … ]], but the right-hand side of = is interpreted as a wildcard pattern there too. Always put double quotes around variable substitutions (unless you want the value of the variable to be used as a list of filename matching patterns, rather than as a string).
¹ Except on $X in the [[ … ]] syntax.
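A minimal sketch of the corrected function follows. It is written for Bash to match the other examples on this page, although the [ ... ] part should behave the same in ksh. Passing the values as arguments instead of using the globals X and Y is a choice made here for illustration, not something from the original question.

```shell
#!/usr/bin/env bash
# Compares its two arguments as strings using the portable [ command.
stest() {
    if [ "$1" = "$2" ]; then
        echo "they're the same"
    else
        echo "they're not the same"
    fi
}

stest a a
stest a b
stest "two words" "two words"

# Inside [[ ]], an unquoted right-hand side is a pattern,
# so quote it when a literal comparison is intended.
X="abc"; Y="a*"
[[ $X = $Y ]]   && echo "pattern match"
[[ $X = "$Y" ]] || echo "not literally equal"
```

The double quotes inside [ ... ] keep values containing spaces or wildcard characters intact as single words.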
This KornShell (ksh) script should work:
soExample.ksh
#!/bin/ksh
#Initialize Variables
X="a"
Y="a"
#Function to compare the global variables X and Y
#Params: none
stest(){
if [ "${X}" == "${Y}" ]; then
echo "they're the same"
else
echo "they're not the same"
fi
}
#-----------
#---Main----
#-----------
echo "Starting: ${PWD}/${0} with Input Parameters: {1: ${1} {2: ${2} {3: ${3}"
stest #function call#
echo "completed"
echo "Exiting: ${PWD}/${0}"
Output :
user@foo:/tmp $ ksh soExample.ksh
Starting: /tmp/soExample.ksh with Input Parameters: {1: {2: {3:
they're the same
completed
Exiting: /tmp/soExample.ksh
ksh version:
user@foo:/tmp $ echo $KSH_VERSION
@(#)MIRBSD KSH R48 2013/08/16

Comparison of 2 string variables in shell script

Consider there is a variable line and variable word:
line = 1234 abc xyz 5678
word = 1234
The value of these variables are read from 2 different files.
I want to print the line if it contains the word. How do I do this using shell script? I tried all the suggested solutions given in previous questions. For example, the following code always passed even if the word was not in the line.
if [ "$line"==*"$word"*]; then
echo $line
fi
No need for an if statement; just use grep:
echo "$line" | grep "\b$word\b"
You can use if [[ "$line" == *"$word"* ]]
Also you need to use the following to assign variables
line="1234 abc xyz 5678"
word="1234"
Working example -- http://ideone.com/drLidd
Watch the white spaces!
When you set a variable to a value, don't put white spaces around the equal sign. Also use quotes when your value has spaces in it:
line="1234 abc xyz 5678" # Must have quotation marks
word=1234 # Quotation marks are optional
When you use comparisons, you must leave white space around the brackets and the comparison sign:
if [[ $line == *$word* ]]; then
echo $line
fi
Note the double square brackets. If you are doing pattern matching, you must use double square brackets and not single square brackets. Inside double square brackets, == and = perform a pattern match. If you use single square brackets:
if [ "$line" = *"$word"* ]
you're doing an equality test. Note that variables inside double square brackets don't need quotation marks, while inside single brackets they are required in most situations.
echo $line | grep "$word"
would be the typical way to do this in a script, although it does cost a new process.
You can use the bash match operator =~:
[[ "$line" =~ "$word" ]] && echo "$line"
Don't forget quotes, as stated in previous answers (especially the one by @Bill).
The reason that if [ "$line"==*"$word"* ] does not work as you expect is perhaps a bit obscure. Assuming that no files exist that cause the glob to expand, then you are merely testing that the string 1234 abc xyz 5678==*1234* is non empty. Clearly, that is not an empty string, so the condition is always true. You need to put whitespace around your == operator, but that will not work because you are now testing if the string 1234 abc xyz 5678 is the same as the string to which the glob *1234* expands, so it will be true only if a file named 1234 abc xyz 5678 exists in the current working directory of the process executing the shell script. There are shell extensions that allow this sort of comparison, but grep works well, or you can use a case statement:
case "$line" in
*"$word"*) echo "$line";;
esac
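The case approach can be wrapped in a small reusable function. This is only a sketch, and the name `contains` is hypothetical. It relies on nothing beyond POSIX case patterns, so it should also work in shells that lack [[ ]].

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit status 0) if $2 occurs anywhere in $1.
contains() {
    case "$1" in
        *"$2"*) return 0 ;;   # quoted "$2" is matched literally between the wildcards
        *)      return 1 ;;
    esac
}

line="1234 abc xyz 5678"
word="1234"

if contains "$line" "$word"; then
    printf '%s\n' "$line"
fi
```

Because `$2` is quoted inside the pattern, a word that itself contains * or ? is searched for literally rather than being treated as a wildcard.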
An alternative solution would be using loop:
for w in $line
do
if [ "$w" == "$word" ]; then
echo $line
break
fi
done
Code Snippet:
a='text'
b='text'
# = compares strings; -eq is for integers only
if [ "$a" = "$b" ]
then
msg='equal'
fi
