getopts & preventing accidentally interpreting short options as arguments - bash

Here's a toy example that shows what I mean:
while getopts "sf:e:" opt; foundOpts="${foundOpts}${opt}" ; done
echo $foundOpts
problem is that getopts isn't particularly smart when it comes to arguments. for example, ./myscript.sh -ssssf will output option requires an argument -- f. That's good.
But if someone does ./myscript.sh -ssfss by mistake, it parses that like -ss -f ss, which is NOT desirable behavior!
How can I detect this? Ideally, I'd like to force that f parameter to be defined separately, and allow -f foo or -f=foo, but not -sf=foo or -sf foo.
Is this possible without doing something hairy? I imagine I could do something with a RegEx, along the lines of match $# against something like -[^ef[:space:]]*[ef]{1}[ =](.*), but I'm worried about false positives (that regex might not be exactly right)
Thanks!

Here's what I've come up with. It seems to work, but I imagine the RegEx could be simplified. And I imagine there are edge cases I've overlooked.
validOpts="sf:e:"
optsWithArgs=$(echo "$validOpts" | grep -o .: | tr -d ':\n')
# ensure that options that require (or have optional) arguments are not combined with other short options
invalidOptArgFormats=( "[^$optsWithArgs[:space:]]+[$optsWithArgs]" "[$optsWithArgs]([^ =]|[ =]$|$)" )
IFS="|" && invalidOptArgRegEx="-(${invalidOptArgFormats[*]})"
[[ "$#" =~ $invalidOptArgRegEx ]] && echo -e "An option you provided takes an argument; it is required to have its option\n on a separate - to prevent accidentally passing an opt as an argument.\n (You provided ${BASH_REMATCH[0]})" && exit 1

Related

Constructing an If statement based on the return of the environment $PATH piped through grep (with bash)

I've answered my own question in writing this, but it might be helpful for others as I couldn't find a straightforward answer anywhere else. Please delete if inappropriate.
I'm trying to construct an if statement depending whether some <STRING> is found inside the environment $PATH.
When I pipe $PATH through grep I get a successful hit:
echo $PATH | grep -i "<STRING>"
But I was really struggling to find the syntax required to construct an if statement around this. It appears that the line below works. I know that the $(...) essentially passes the internal commands to the if statement, but I'm not sure why the [[...]] double brackets are needed:
if [[ $(echo $PATH | grep -i "<STRING>") ]]; then echo "HEY"; fi
Maybe someone could explain that for me to have a better understanding.
Thanks.
You could make better use of shell syntax. Something like this:
$ STRING="bin"
$ grep -i $STRING <<< $PATH && echo "HEY"
That is: first, save the search string in a variable (which I called STRING so it's easy to remember), and use that as the search pattern. Then, use the <<< redirection to input a "here string" - namely, the PATH variable.
Or, if you don't want a variable for the string:
$ grep -i "bin" <<< $PATH && echo "HEY"
Then, the construct && <some other command> means: IF the exit status of grep is 0 (meaning at least one successful match), THEN execute the "other command" (otherwise do nothing - just exit as soon as grep completes). This is the more common, more natural form of an "if... then..." statement, exactly of the kind you were trying to write.
Question for you though. Why the -i flag? That means "case independent matching". But in Unix and Linux file names, command names, etc. are case sensitive. Why do you care if the PATH matches the string BIN? It will, because bin is somewhere on the path, but if you then search for the BIN directory you won't find one. (The more interesting question is - how to match complete words only, so that for example to match bin, a directory name bin should be present; sbin shouldn't match bin. But that's about writing regular expressions - not quite what you were asking about.)
The following version - which doesn't even use grep - is based on the same idea, but it won't do case insensitive matching:
$ [[ $PATH == *$STRING* ]] && echo "HEY"
[[ ... ]] evaluates a Boolean expression (here, an equality using the * wildcard on the right-hand side); if true, && causes the execution of the echo command.
you don't need to use [[ ]], just:
if echo $PATH | grep -qi "<STRING>"; then echo "HEY"; fi

Why can't I double-quote a variable with several parameters in it?

I'm writing a bash script that uses rsync to synchronize directories. According to the Google shell style guide:
Always quote strings containing variables, command substitutions, spaces or shell meta characters, unless careful unquoted expansion is required.
Use "$#" unless you have a specific reason to use $*.
I wrote the following test case scenario:
#!/bin/bash
__test1(){
echo stdbuf -i0 -o0 -e0 $#
stdbuf -i0 -o0 -e0 $#
}
__test2(){
echo stdbuf -i0 -o0 -e0 "$#"
stdbuf -i0 -o0 -e0 "$#"
}
PARAM+=" --dry-run "
PARAM+=" mirror.leaseweb.net::archlinux/"
PARAM+=" /tmp/test"
echo "test A: ok"
__test1 nice -n 19 rsync $PARAM
echo "test B: ok"
__test2 nice -n 19 rsync $PARAM
echo "test C: ok"
__test1 nice -n 19 rsync "$PARAM"
echo "test D: fails"
__test2 nice -n 19 rsync "$PARAM"
(I need stdbuf to immediately observe output in my longer script that i'm running)
So, my question is: why does test D fail with the below message?
rsync: getaddrinfo: --dry-run mirror.leaseweb.net 873: Name or service not known
The echo in every test looks the same. If I'm suppose to quote all variables, why does it fail in this specific scenario?
It fails because "$PARAM" expands as a single string, and no word splitting is performed, although it contains what should be interpreted by the command as several arguments.
One very useful technique is to use an array instead of a string. Build the array like this :
declare -a PARAM
PARAM+=(--dry-run)
PARAM+=(mirror.leaseweb.net::archlinux/)
PARAM+=(/tmp/test)
Then, use an array expansion to perform your call :
__test2 nice -n 19 rsync "${PARAM[#]}"
The "${PARAM[#]}" expansion has the same property as the "$#" expansion : it expands to a list of items (one word per item in the array/argument list), no word splitting occurs, just as if each item was quoted.
I agree with #Fred — using arrays is best. Here's a bit of explanation, and some debugging tips.
Before running the tests, I added
echo "$PARAM"
set|grep '^PARAM='
to actually show what PARAM is.** In your original test, it is:
PARAM=' --dry-run mirror.leaseweb.net::archlinux/ /tmp/test'
That is, it is a single string that contains multiple space-separated pieces.
As a rule of thumb (with exceptions!*), bash will split words unless you tell it not to. In tests A and C, the unquoted $# in __test1 gives bash an opportunity to split $PARAM. In test B, the unquoted $PARAM in the call to __test2has the same effect. Therefore,rsync` sees each space-separated item as a separate parameter in tests A-C.
In test D, the "$PARAM" passed to __test2 is not split when __test2 is called, because of the quotes. Therefore, __test2 sees only one parameter in $#. Then, inside __test2, the quoted "$#" keeps that parameter together, so it is not split at the spaces. As a result, rsync thinks the entirety of PARAM is the hostname, so fails.
If you use Fred's solution, the output from sed|grep '^PARAM=' is
PARAM=([0]="--dry-run" [1]="mirror.leaseweb.net::archlinux/" [2]="/tmp/test")
That is bash's internal notation for an array: PARAM[0] is "--dry-run", etc. You can see each word individually. echo $PARAM is not very helpful for an array, since it only outputs the first word (here, --dry-run).
Edits
* As Fred points out, one exception is that, in the assignment A=$B, B will not be expanded. That is, A=$B and A="$B" are the same.
** As ghoti points out, instead of set|grep '^PARAM=', you can use declare -p PARAM. The declare builtin with the -p switch will print out a line that you could paste back into the shell to recreate the variable. In this case, that output is:
declare -a PARAM='([0]="--dry-run" [1]="mirror.leaseweb.net::archlinux/" [2]="/tmp/test")'
This is a good option. I personally prefer the set|grep approach because declare -p gives you an extra level of quoting, but both work fine. Edit As #rici points out, use declare -p if an element of your array might include a newline.
As an example of the extra quoting, consider unset PARAM ; declare -a PARAM ; PARAM+=("Jim's") (a new array with one element). Then you get:
set|grep: PARAM=([0]="Jim's")
# just an apostrophe ^
declare -p: declare -a PARAM='([0]="Jim'\''s")'
# a bit uglier, in my opinion ^^^^

Bash conditional on command exit code

In bash, I want to say "if a file doesn't contain XYZ, then" do a bunch of things. The most natural way to transpose this into code is something like:
if [ ! grep --quiet XYZ "$MyFile" ] ; then
... do things ...
fi
But of course, that's not valid Bash syntax. I could use backticks, but then I'll be testing the output of the file. The two alternatives I can think of are:
grep --quiet XYZ "$MyFile"
if [ $? -ne 0 ]; then
... do things ...
fi
And
grep --quiet XYZ "$MyFile" ||
( ... do things ...
)
I kind of prefer the second one, it's more Lispy and the || for control flow isn't that uncommon in scripting languages. I can see arguments for the first one too, although when the person reads the first line, they don't know why you're executing grep, it looks like you're executing it for it's main effect, rather than just to control a branch in script.
Is there a third, more direct way which uses an if statement and has the grep in the condition?
Yes there is:
if grep --quiet .....
then
# If grep finds something
fi
or if the grep fails
if ! grep --quiet .....
then
# If grep doesn't find something
fi
You don't need the [ ] (test) to check the return value of a command. Just try:
if ! grep --quiet XYZ "$MyFile" ; then
This is a matter of taste since there obviously are multiple working solutions. When I deal with a problem like this, I usually apply wc -l after grep in order to count the lines that match. Then you have a single integer number that you can evaluate within a test condition. If the question only is whether there is a match at all (the number of matching lines does not matter), then applying wc probably is OTT and evaluation of grep's return code seems to be the best solution:
Normally, the exit status is 0 if selected lines are found and 1
otherwise. But the exit status is 2 if an error occurred, unless the
-q or --quiet or --silent option is used and a selected line is found. Note, however, that POSIX only mandates, for programs such as grep,
cmp, and diff, that the exit status in case of error be greater than
1; it is therefore advisable, for the sake of portability, to use
logic that tests for this general condition instead of strict equality
with 2.

Bash parameter expansion

I have a script which uses the following logic:
if [ ! -z "$1" ]; then # if any parameter is supplied
ACTION= # clear $ACTION
else
ACTION=echo # otherwise, set it to 'echo'
fi
This works fine, as-is. However, in reading the Shell Parameter Expansion section of the bash manual, it seems this should be able to be done in a single step. However, I can't quite wrap my head around how to do it.
I've tried:
ACTION=${1:-echo} # ends up with $1 in $ACTION
ACTION=${1:+}
ACTION=${ACTION:-echo} # ends up always 'echo'
and a few ways of nesting them, but nesting seems to be disallowed as far as I can tell.
I realize I've already got a working solution, but now I'm genuinely curious if this is possible. It's something that would be straightforward with a ternary operator, but I don't think bash has one.
If this is possible, I'd like to see the logic to do this seeming two-step process, with no if/else constructs, but using only any combination of the Shell Parameter Expansion features.
Thank you.
EDIT for elderarthis:
The remainder of the script is just:
find . -name "*\?[NMSD]=[AD]" -exec ${ACTION} rm -f "{}" +
I just want ACTION=echo as a sanity check against myself, hence, passing any argument will actually do the deletion (by nullifying ${ACTION}, whereas passing no args leaves echo in there.
And I know TIMTOWTDI; I'm looking to see if it can be done with just the stuff in the Shell Parameter Expansion section :-)
EDIT for Mikel:
$ cat honk.sh
#!/bin/bash
ACTION=${1-echo}
echo $ACTION
$ ./honk.sh
echo
$ ./honk.sh foo
foo
The last needs to have ACTION='', and thus return a blank line/null value.
If I insisted on doing it in fewer than 4 lines and no sub-shell, then I think I'd use:
ACTION=${1:+' '}
: ${ACTION:=echo}
This cheats slightly - it creates a blank action rather than an empty action if there is an argument to the script. If there is no argument, then ACTION is empty before the second line. On the second line, if action is empty, set it to 'echo'. In the expansion, since you (correctly) do not quote $ACTION, no argument will be passed for the blank.
Tester (xx.sh):
ACTION=${1:+' '}
: ${ACTION:=echo}
echo $ACTION rm -f a b c
Tests:
$ sh xx.sh 1
rm -f a b c
$ sh xx.sh
echo rm -f a b c
$ sh xx.sh ''
echo rm -f a b c
$
If the last line is incorrect, then remove the colon from before the plus.
If a sub-shell is acceptable, then one of these two single lines works:
ACTION=$([ -z "$1" ] && echo echo)
ACTION=$([ -z "${1+X}" ] && echo echo)
The first corresponds to the first version shown above (empty first arguments are treated as absent); the second deals with empty arguments as present. You could write:
ACTION=$([ -z "${1:+X}" ] && echo echo)
to make the relation with the second clearer - except you're only going to use one or the other, not both.
Since the markdown notation in my comment confused the system (or I got it wrong but didn't get to fix it quickly enough), my last comment (slightly amended) should read:
The notation ${var:+' '} means 'if $var is set and is not empty, then use what follows the +' (which, in this case, is a single blank). The notation ${var+' '} means 'if $var is set - regardless of whether it is empty or not - then use what follows the +'. These other expansions are similar:
${var:=X} - set $var to X unless it already has a non-empty value.
${var:-X} - expands to $var if it has a non-empty value and expands to X if $var is unset or is empty
Dropping the colon removes the 'empty' part of the test.
ACTION=${1:-echo}
is correct.
Make sure it's near the top of your script before anything modifies $1 (e.g. before any set command). Also, it wouldn't work inside a function, because $1 would be the first parameter to the function.
Also check if $1 is set but null, in which case fix how you're calling it, or use ACTION=${1-echo} (note there is no :).
Update
Ah, I assumed you must have meant the opposite, because it didn't really make sense otherwise.
It still seems odd, but I guess as a mental exercise, maybe you want something like this:
#!/bin/bash
shopt -s extglob
ACTION=$1
ACTION=${ACTION:-echo}
ACTION=${ACTION/!(echo)/} # or maybe ACTION=${ACTION#!(echo)}
echo ACTION=$ACTION
It's not quite right: it gives ACTION=o, but I think something along those lines should work.
Further, if you pass echo as $1, it will stay as echo, but I don't think that's a bad thing.
It's also terribly ugly, but you knew that when asking the question. :-)

How to prevent code/option injection in a bash script

I have written a small bash script called "isinFile.sh" for checking if the first term given to the script can be found in the file "file.txt":
#!/bin/bash
FILE="file.txt"
if [ `grep -w "$1" $FILE` ]; then
echo "true"
else
echo "false"
fi
However, running the script like
> ./isinFile.sh -x
breaks the script, since -x is interpreted by grep as an option.
So I improved my script
#!/bin/bash
FILE="file.txt"
if [ `grep -w -- "$1" $FILE` ]; then
echo "true"
else
echo "false"
fi
using -- as an argument to grep. Now running
> ./isinFile.sh -x
false
works. But is using -- the correct and only way to prevent code/option injection in bash scripts? I have not seen it in the wild, only found it mentioned in ABASH: Finding Bugs in Bash Scripts.
grep -w -- ...
prevents that interpretation in what follows --
EDIT
(I did not read the last part sorry). Yes, it is the only way. The other way is to avoid it as first part of the search; e.g. ".{0}-x" works too but it is odd., so e.g.
grep -w ".{0}$1" ...
should work too.
There's actually another code injection (or whatever you want to call it) bug in this script: it simply hands the output of grep to the [ (aka test) command, and assumes that'll return true if it's not empty. But if the output is more than one "word" long, [ will treat it as an expression and try to evaluate it. For example, suppose the file contains the line 0 -eq 2 and you search for "0" -- [ will decide that 0 is not equal to 2, and the script will print false despite the fact that it found a match.
The best way to fix this is to use Ignacio Vazquez-Abrams' suggestion (as clarified by Dennis Williamson) -- this completely avoids the parsing problem, and is also faster (since -q makes grep stop searching at the first match). If that option weren't available, another method would be to protect the output with double-quotes: if [ "$(grep -w -- "$1" "$FILE")" ]; then (note that I also used $() instead of backquotes 'cause I find them much easier to read, and quotes around $FILE just in case it contains anything funny, like whitespace).
Though not applicable in this particular case, another technique can be used to prevent filenames that start with hyphens from being interpreted as options:
rm ./-x
or
rm /path/to/-x

Resources