BASH if then else with quoted text

I'm trying to write a short script that checks for Verizon FiOS availability by zip code, from a list of 5-digit US zip codes.
The basics are working, but comparing the output received from curl to the expected output in the if statements isn't working.
I know there is a better and cleaner way to do this; however, I'm really interested in what is wrong with this method. I think it has something to do with the quotes getting jumbled up.
Let me know what you guys think. I originally thought this would be a quick little script, ha. Thanks for the help.
Here is what I have so far:
#!/bin/bash
Avail='<link rel="canonical" href="http://fios.verizon.com/fios-plans.html#availability-msg" />'
NotAvail='<link rel="canonical" href="http://fios.verizon.com/order-now.html#availability-msg" />'
while read zip; do
chk=`curl -s -d "ref=GIa6uiuwP81j047HjKMHOwEyW4QJTYjG&PageID=page9765&AvailabilityZipCode=$zip" http://fios.verizon.com/availability_post4.php --Location | grep "availability-msg"`
#echo $chk
if [ "$chk" = "$Avail" ]
then
fios=1
elif [ "$chk" = "$NotAvail" ]
then
fios=0
else
fios=err
fi
echo "$zip | $fios"
done < zipcodes.txt

Most likely, the line read from curl ends in CR/LF. grep will take the LF as the line end and leave the CR in the line, where it will not match either of your patterns. (Other whitespace issues could also cause a similarly invisible mismatch, but stray CRs are very common since HTTP insists on them.)
The easiest solution is to use a less specific match, like a glob or regex; these are both available with bash's [[ (rather than [) command.
E.g.:
if [[ $chk =~ /fios-plans\.html ]]; then
The regex is not anchored, so this effectively does a substring comparison.
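As a rough sketch of the two ideas above (dropping a trailing CR and matching less specifically), the comparison could be wrapped in a small function. The name classify_fios is made up for illustration, and the sketch assumes the only difference between the two responses that matters is the page name in the canonical link.

```shell
#!/bin/bash
# classify_fios: hypothetical helper, not part of the original script.
# Takes the line grep extracted from the curl output and prints 1, 0 or err.
classify_fios() {
  # Drop one trailing carriage return, if present (assumed cause of the mismatch).
  local chk=${1%$'\r'}
  if [[ $chk =~ /fios-plans\.html ]]; then
    echo 1
  elif [[ $chk =~ /order-now\.html ]]; then
    echo 0
  else
    echo err
  fi
}
```

Inside the loop this would be used as fios=$(classify_fios "$chk"). To check whether a CR really is present, printf '%q\n' "$chk" prints the value with invisible characters escaped.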

Related

How could I change this egrep script to a zgrep script and still have it work?

I'm trying to look for phone numbers in any of the following formats: +1.570.555.1212, 570.555.1212, (570)555-1212, and 570-555-1212. We also need to look in compressed folders using zgrep; however, I would have my code come back with "No matches found". The code below works as it is for finding phone numbers in txt files. It is very bad, but here it is:
Code:
#!/bin/bash
egrep '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}.[0-9]{3}.[0-9]{4}|([0-9]{3})[0-9]{3}-[0-9]{4}|+(1).[0-9]{3}.[0-9]{3}.[0-9]{4}' *
if [ $? -eq 0 ] ; then echo $1 ; else echo "No matches found" ; fi 2>/dev/null

zgrep without any options is equivalent in its regex capabilities to grep; you need to say zgrep -E if you want to use grep -E (aka egrep) regex syntax when searching compressed files.
#!/bin/bash
if zgrep -E -q '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}.[0-9]{3}.[0-9]{4}|([0-9]{3})[0-9]{3}-[0-9]{4}|+(1).[0-9]{3}.[0-9]{3}.[0-9]{4}' *
then
echo "$1"
else
echo "No matches found" >&2
fi
Notice also "Why is testing "$?" to see if a command succeeded or not, an anti-pattern?" and "When to wrap quotes around a shell variable", as well as the preference for -q over redirecting to /dev/null, and the displaying of error messages on standard error (the >&2 redirection).
Your regex could also use some refactoring; maybe try
(\+\(1\).)?[0-9]{3}.[0-9]{3}.[0-9]{4}
Notice how round brackets and the plus sign need to be backslash-escaped to match literally, and how after refactoring out the +(1) prefix as optional the rest of the regex subsumes all the other variants you had enumerated, because . matches - and ( and . and many other characters. (The optional prefix could also be dropped completely and this would still match the same strings, but I had to guess some things so I am leaving it in with this remark.)
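To see the effect of the refactored pattern, here is a small sketch that runs it with grep -E over the four formats from the question plus one line without a number. The helper name count_phone_lines is invented for this example; zgrep -E should accept the same pattern for compressed files.

```shell
#!/bin/bash
# Refactored pattern from the answer above, with the optional prefix kept as written.
phone_re='(\+\(1\).)?[0-9]{3}.[0-9]{3}.[0-9]{4}'

# count_phone_lines: hypothetical helper, counts the stdin lines that contain a match.
count_phone_lines() {
  grep -E -c -- "$phone_re"
}

# The four formats from the question, plus one line that should not match.
printf '%s\n' '+1.570.555.1212' '570.555.1212' '(570)555-1212' \
  '570-555-1212' 'no number here' | count_phone_lines
```

All four phone formats contain three digits, any character, three digits, any character, four digits, so the count printed is 4.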

Bash version 3.2 doesn't read in lines/compare lines to a regular expression

I wrote a script in bash at home to solve a problem and "brought it with me" to work (More or less I just found it useful enough to put on my work laptop). It is a bash script and essentially, the only difference between my home version and the work version is where the files are located.
The bash script has this in it:
if [ ! -e $finalfile ]; then
echo >> $finalfile
else
echo > $finalfile
fi
while IFS='' read -r line || [[ -n "$line" ]]; do
if [[ "$line" =~ [A-Z]{1,}\( ]]; then
varName=$(echo "$line" |cut -d "(" -f1 | tr -d "[:punct:]" | tr -d "[:blank:]")
findComments
fi
done < "$getVarNames"
When I run this script at home, it reads in every line and prints correctly to final file (which is done in a different function). All the program does is read in each line from a file with the variable names, strip all the extraneous punctuation and blanks, and then checks a different file for comments with that variable name in it in the function findComments. The comments follow a specific form and depending on the form, the program will place some text into the finalfile. So I use read and regular expressions a LOT throughout this script.
The work version of the script does not run. When I use the -x option when I run it, this is what I get:
+ IFS=
+ read -r line
=~ [A-Z]{1,}( ]])=c(581,583)
+ IFS=
+ read -r line
=~ [A-Z]{1,}( ]])=c(584,585)
+ IFS=
+ read -r line
=~ [A-Z]{1,}( ]]2)=c(586,587)
+ IFS=
+ read -r line
=~ [A-Z]{1,}( ]]1)=c(588)
I'm not sure why. But it is reading in the line, because the garbage printed after my regular expression in the output are pieces of the lines it's supposed to be reading in. I checked the version of bash and work is running 3.2.25 while I'm running 4.2.25 at home (I think? I know I updated in 2011 or so and I think that's 4.2 or 4.3? But I know I'm at Bash 4.0, also I'm not home atm so I can't check... All I know is for sure I don't have bash v3). I'm also not 100% sure that we're going to update bash anytime soon at work. If things go like they do for most software, then we might not update until a program we run has an incompatibility with bash 3.2. So... like never.
I know bash 3.0 added some regular expression matching, and in bash 3.2 there was a change to the regular expression operator =~ so that you don't need to put " " around the regular expression. I read that somewhere, but here's the link that I'm looking at right now. What I'm not seeing is any information on read changing in bash v3. There were some changes in v4, but I haven't found any that affect how I'm using read.
The long and short of this is that I don't know why this code doesn't work in version 3.2 of bash. I've been looking for an answer wherever I can find information but I legitimately can't figure out what is the problem with my code. I'm pretty sure it's read because I think my regular expression is OK, but is it possibly the regular expression? How can I fix my code so it works on a system with 3.2 on it?
Also on another note, I don't know if there's anywhere that has like... release notes or a changelog or something that I can look at for bash. It would be helpful if I could look at exactly what 3.2 was fixing or like, exactly what was added in my current (home) version of bash.
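No answer is recorded here, so the following is only a sketch of two defensive habits, not a confirmed diagnosis. It keeps the regular expression in a variable, which sidesteps the quoting differences of =~ between bash versions, and it strips a trailing carriage return from each line, on the assumption (suggested by the overwritten trace output) that the input file has CR/LF line endings. The function name extract_var_names is hypothetical, and it prints the names instead of calling findComments.

```shell
#!/bin/bash
# extract_var_names: hypothetical stand-in for the loop in the question.
# Reads lines from stdin and prints the cleaned-up name before the first "(".
extract_var_names() {
  # Regex kept in a variable and expanded unquoted on the right of =~.
  local re='[A-Z]{1,}\('
  local line name
  while IFS='' read -r line || [[ -n "$line" ]]; do
    # Assumption: the file may have CR/LF endings; drop one trailing CR.
    line=${line%$'\r'}
    if [[ $line =~ $re ]]; then
      name=${line%%\(*}
      name=$(printf '%s' "$name" | tr -d '[:punct:][:blank:]')
      printf '%s\n' "$name"
    fi
  done
}
```

It would be called as extract_var_names < "$getVarNames".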

filename comparison with wildcard

I am working on a script in which I need to compare one filename to another and look for a specific change (in this case a "(x)" added to the filename, which OS X does when it adds a file to a directory where that filename already exists). Here is an excerpt of the script, modified so it can be tested without the rest of it.
#!/bin/bash
p2_s2="/Path/to file (name)/containing - many.special chars.docx.gdoc"
next_line="/Path/to file (name)/containing - many.special chars.docx (1).gdoc"
file_ext=$(echo "${p2_s2}" | rev | cut -d '.' -f 1 | rev)
file_name=$(basename "${p2_s2}" ".${file_ext}")
file_dir=$(dirname "${p2_s2}")
esc_file_name=$(printf '%q' "${file_name}")
esc_file_dir=$(printf '%q' "${file_dir}")
esc_next_line=$(printf '%q' "${next_line}")
if [[ ${esc_next_line} =~ ${esc_file_dir}/${esc_file_name}\ \(?\).${file_ext} ]]
then
echo "It's a duplicate!"
fi
What I'm trying to do here is detect if the file next_line is a duplicate of p2_s2. As I am expecting multiple duplicates, next_line can have a (1) appended at the end of a filename or any other number in brackets (Although I am sure no double digits). As I can't do a simple string compare with a wildcard in the middle, I tried using the "=~" operator and escaping all the special chars. Any idea what I'm doing wrong?

You can trim p2_s2's extension, trim next_line's extension including the number inside the parentheses, and see if you get the same file name. If you do, it's a duplicate. In order to do so, [[ allows us to perform a comparison between a string and a glob.
I used extglob's +( ... ) pattern, so I can use +([0-9]) to match the number inside the parenthesis. Notice that extglob is enabled by shopt -s extglob.
#!/bin/bash
p2_s2="/Path/to/ps2.docx.gdoc"
next_line="/Path/to/ps2(1).docx.gdoc"
shopt -s extglob
if [[ "${p2_s2%%.*}" = "${next_line%%\(+([0-9])\).*}" ]]; then
printf '%s is a duplicate of %s\n' "$next_line" "$p2_s2"
fi
EDIT:
I now see that you've edited your question, so in case this solution is not enough, I'm positive that it'll be a good template to work with.
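For the naming shown in the edited question (a space and a single digit in parentheses inserted before the final extension), a glob comparison might look like the sketch below. The function name is_duplicate is invented here, and the sketch assumes the marker always has exactly that form.

```shell
#!/bin/bash
# is_duplicate: hypothetical helper. Succeeds if "$2" is "$1" with " (N)"
# inserted before the final extension, where N is a single digit.
is_duplicate() {
  local original=$1 candidate=$2
  local stem=${original%.*}   # everything before the last dot
  local ext=${original##*.}   # text after the last dot
  # Quoted parts are matched literally; only [0-9] acts as a glob.
  [[ $candidate == "$stem"\ \([0-9]\)."$ext" ]]
}
```

With the values from the question, is_duplicate "$p2_s2" "$next_line" succeeds. Because the variable parts are quoted, no printf '%q' escaping is needed.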

The (1) in next_line doesn't come before the final "."; it comes before the second-to-last "." in the original filename, but you only strip off a single "." as the extension.
So when you generate the comparison filename you end up with /Path/to\ file\ \(name\)/containing\ -\ many.special\ chars.docx\ \(?\).gdoc which doesn't match what you expect.
If you had added set -x to the top of your script, you would have seen what the shell was actually doing and spotted this.
What does OS X actually do in this situation? Does it add (#) before .gdoc? Does it add it before .docx? Does it depend on whether OS X recognizes the file (i.e. it is some type it can open natively)?

String contains in Bash that is a directory path

I am writing an SVN script that will export only changed files. In doing so I only want to export the files if they don't contain a specific file.
So, to start out I am modifying the script found here.
I found a way to check if a string contains using the functionality found here.
Now, when I try to run the following:
filename=`echo "$line" |sed "s|$repository||g"`
if [ ! -d $target_directory$filename ] && [[!"$filename" =~ *myfile* ]] ; then
fi
However I keep getting errors stating:
/home/home/myfile: "no such file or directory"
It appears that BASH is treating $filename as a literal. How do I get it so that it reads it as a string and not a path?
Thanks for your help!

You have some syntax issues (a shell script linter can weed those out):
You need a space after "[[", otherwise it'll be interpreted as a command (giving an error similar to what you posted).
You need a space after the "!", otherwise it'll be considered part of the operand.
You also need something in the then clause, but since you managed to run it, I'll assume you just left it out.
You combined two different answers from the substring thing you posted, [[ $foo == *bar* ]] and [[ $foo =~ .*bar.* ]]. The first uses a glob, the second uses a regex. Just use [[ ! $filename == *myfile* ]]
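Putting those fixes together, the condition could be sketched as a small function. The name should_export is made up for illustration; target_directory and filename are the variables from the question, passed here as arguments.

```shell
#!/bin/bash
# should_export: hypothetical helper wrapping the corrected test.
# Succeeds if the target path is not a directory and the name lacks "myfile".
should_export() {
  local target_directory=$1 filename=$2
  # Spaces after "[[" and "!" are required; the glob on the right stays unquoted.
  [ ! -d "$target_directory$filename" ] && [[ $filename != *myfile* ]]
}
```

In the script this would read: if should_export "$target_directory" "$filename"; then followed by the export command and fi.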

How to prevent code/option injection in a bash script

I have written a small bash script called "isinFile.sh" for checking if the first term given to the script can be found in the file "file.txt":
#!/bin/bash
FILE="file.txt"
if [ `grep -w "$1" $FILE` ]; then
echo "true"
else
echo "false"
fi
However, running the script like
> ./isinFile.sh -x
breaks the script, since -x is interpreted by grep as an option.
So I improved my script
#!/bin/bash
FILE="file.txt"
if [ `grep -w -- "$1" $FILE` ]; then
echo "true"
else
echo "false"
fi
using -- as an argument to grep. Now running
> ./isinFile.sh -x
false
works. But is using -- the correct and only way to prevent code/option injection in bash scripts? I have not seen it in the wild, only found it mentioned in ABASH: Finding Bugs in Bash Scripts.

grep -w -- ...
prevents anything that follows the -- from being interpreted as an option.
EDIT
(I did not read the last part, sorry.) Yes, it is the way to do it. An alternative is to keep the hyphen from being the first character of the pattern; e.g. ".{0}-x" works too with grep -E (in plain grep the braces are literal), but it is odd. So e.g.
grep -E -w ".{0}$1" ...
should work too.

There's actually another code injection (or whatever you want to call it) bug in this script: it simply hands the output of grep to the [ (aka test) command, and assumes that'll return true if it's not empty. But if the output is more than one "word" long, [ will treat it as an expression and try to evaluate it. For example, suppose the file contains the line 0 -eq 2 and you search for "0" -- [ will decide that 0 is not equal to 2, and the script will print false despite the fact that it found a match.
The best way to fix this is to use Ignacio Vazquez-Abrams' suggestion (as clarified by Dennis Williamson) -- this completely avoids the parsing problem, and is also faster (since -q makes grep stop searching at the first match). If that option weren't available, another method would be to protect the output with double-quotes: if [ "$(grep -w -- "$1" "$FILE")" ]; then (note that I also used $() instead of backquotes 'cause I find them much easier to read, and quotes around $FILE just in case it contains anything funny, like whitespace).
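A minimal sketch of the -q approach described above, with grep's exit status driving the if directly. The helper name in_file is an assumption made for this example.

```shell
#!/bin/bash
# in_file: hypothetical helper. Succeeds if the word "$1" occurs in file "$2".
in_file() {
  # -q prints nothing and stops at the first match; -- ends option parsing,
  # so a search term such as "-x" is treated as a pattern, not an option.
  grep -q -w -- "$1" "$2"
}
```

The script body then becomes: if in_file "$1" "$FILE"; then echo "true"; else echo "false"; fi. Since no grep output is passed to [, a line such as 0 -eq 2 in the file can no longer be evaluated as an expression.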

Though not applicable in this particular case, another technique can be used to prevent filenames that start with hyphens from being interpreted as options:
rm ./-x
or
rm /path/to/-x
