Looping over shell script arguments and passing quoted arguments to function - bash

I have a script below that sources a directory of bash scripts and then parses the flags of the command to run a specific function from the sourced files.
Given this function within the scripts dir:
function reggiEcho () {
echo $1
}
Here are some examples of current output
$ reggi --echo hello
hello
$ reggi --echo hello world
hello
$ reggi --echo "hello world"
hello
$ reggi --echo "hello" --echo "world"
hello
world
As you can see, quoted parameters are not honored as they should be: "hello world" should be echoed in full.
This is the script, the issue is within the while loop.
How do I parse these flags, and maintain passing in quoted parameters into the function?
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
STR="$(find $DIR/scripts -type f -name '*.sh' -print)"
ARR=( $STR )
TUSAGE="\n"
for f in "${ARR[@]}"; do
    if [ -f $f ]
    then
        . $f --source-only
        if [ -z "$USAGE" ]
        then
            :
        else
            TUSAGE="$TUSAGE \t$USAGE\n"
        fi
        USAGE=""
    else
        echo "$f not found"
    fi
done
TUSAGE="$TUSAGE \t--help (shows this help output)\n"
function usage() {
    echo "Usage: --function <args> [--function <args>]"
    echo $TUSAGE
    exit 1
}
HELP=false
cmd=()
while [ $# -gt 0 ]; do # loop until no args left
    if [[ $1 = '--help' ]] || [[ $1 = '-h' ]] || [[ $1 = '--h' ]] || [[ $1 = '-help' ]]; then
        HELP=true
    fi
    if [[ $1 = --* ]] || [[ $1 = -* ]]; then # arg starts with --
        if [[ ${#cmd[@]} -gt 0 ]]; then
            "${cmd[@]}"
        fi
        top=`echo $1 | tr -d -` # remove all flags
        top=`echo ${top:0:1} | tr '[a-z]' '[A-Z]'`${top:1} # make sure first letter is uppercase
        top=reggi$top # prepend reggi
        cmd=( "$top" ) # start new array
    else
        echo $1
        cmd+=( "$1" )
    fi
    shift
done
if [[ "$HELP" = true ]]; then
    usage
elif [[ ${#cmd[@]} -gt 0 ]]; then
    ${cmd[@]}
else
    usage
fi

There are many places in this script where you have variable references without double-quotes around them. This means the variables' values will be subject to word splitting and wildcard expansion, which can have various weird effects.
The specific problem you're seeing is due to an unquoted variable reference on the fourth-from-last line, ${cmd[@]}. With cmd=( echo "hello world" ), word splitting makes this equivalent to echo hello world rather than echo "hello world".
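The difference is easy to observe with a helper that reports how many arguments it received. This is a minimal sketch; count_args is a hypothetical name used only for illustration and is not part of the script in the question.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments it was given.
count_args() { echo "$#"; }

cmd=( count_args "hello world" )

# Unquoted: the element "hello world" is split into two words.
unquoted=$(${cmd[@]})
# Quoted: every array element is passed on as exactly one word.
quoted=$("${cmd[@]}")

echo "unquoted: $unquoted, quoted: $quoted"
```

The unquoted expansion hands the helper two arguments, the quoted one hands it a single argument.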
Fixing that one line will fix your current problem, but there are a number of other unquoted variable references that may cause other problems later. I recommend fixing all of them. Cyrus' recommendation of shellcheck.net is good at pointing them out, and will also note some other issues I won't cover here. One thing it won't mention is that you should avoid all-caps variable names (DIR, TUSAGE, etc) -- there are a bunch of all-caps variables with special meanings, and it's easy to accidentally reuse one of them and wind up with weird effects. Lowercase and mixed-case variables are safer.
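As a rough illustration of the loop with every expansion quoted, here is a cut-down sketch. It assumes bash, leaves out the --help handling, and strips only leading dashes instead of all dashes; run_flags is a hypothetical wrapper name, and reggiEcho is defined inline as a stand-in for a sourced function.

```shell
#!/usr/bin/env bash
# Stand-in for a function that would normally come from the sourced scripts directory.
reggiEcho() { echo "$1"; }

# run_flags is a hypothetical wrapper around the flag-parsing loop.
run_flags() {
    local name first
    local -a cmd=()
    while [ "$#" -gt 0 ]; do
        if [[ $1 == -* ]]; then
            # A new flag starts: run whatever was collected for the previous one.
            if [ "${#cmd[@]}" -gt 0 ]; then "${cmd[@]}"; fi
            name=${1#-}; name=${name#-}      # strip up to two leading dashes
            first=$(printf '%s' "${name:0:1}" | tr '[:lower:]' '[:upper:]')
            cmd=( "reggi${first}${name:1}" ) # e.g. --echo becomes reggiEcho
        else
            cmd+=( "$1" )                    # keep the argument as a single word
        fi
        shift
    done
    # Run the last collected command, quoted so its arguments stay intact.
    if [ "${#cmd[@]}" -gt 0 ]; then "${cmd[@]}"; fi
}

out=$(run_flags --echo "hello world" --echo hi)
printf '%s\n' "$out"
```

Because both the append and the final call use "${cmd[@]}", the argument "hello world" reaches reggiEcho as one word.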
I also recommend against using \t and \n in strings, and counting on echo to translate them into tabs and newlines, respectively. Some versions of echo do this automatically, some require the -e option to tell them to do it, some will print "-e" as part of their output... it's a mess. In bash, you can use $'...' to translate those escape sequences directly, e.g.:
tusage="$tusage"$' \t--help (shows this help output)\n' # Note mixed quoting modes
echo "$tusage" # Note that double-quoting is *required* for this to work right
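To check what $'...' actually stores, the lengths of the two forms can be compared. This is a small sketch assuming bash; the variable names literal and ansi are made up for the example.

```shell
#!/usr/bin/env bash
literal=' \t--help\n'    # ordinary quotes: backslash-t and backslash-n stay two characters each
ansi=$' \t--help\n'      # $'...' quoting: they become a real tab and a real newline

echo "literal length: ${#literal}"
echo "ansi length:    ${#ansi}"

# printf '%s' prints the value as stored, regardless of how echo treats escapes.
printf '%s' "$ansi"
```

The second string is shorter because each escape sequence has already been converted to a single character at assignment time.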
You should also fix the file listing so it doesn't depend on being unquoted (see chepner's comment). If you don't need to scan subdirectories of $DIR/scripts, you can do this with a simple wildcard (note lowercase vars and that the var is double-quoted, but the wildcard isn't):
arr=( "$dir/scripts"/*.sh )
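A quick way to convince yourself that the glob keeps awkward filenames intact is to try it on a throwaway directory. This sketch assumes mktemp is available; the file names are invented for the demonstration.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)                 # throwaway directory for the demo
mkdir "$dir/scripts"
touch "$dir/scripts/plain.sh" "$dir/scripts/with space.sh"

arr=( "$dir/scripts"/*.sh )      # one array element per matching file
count=${#arr[@]}

echo "matched $count files"
for f in "${arr[@]}"; do
    printf '<%s>\n' "${f##*/}"   # print only the file name part
done

rm -rf "$dir"
```

The file with a space in its name ends up as a single array element, which is what the find-based listing in the question could not guarantee.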
If you need to look in subdirectories, it's more complicated. If you have bash v4 you can use a globstar wildcard, like this:
shopt -s globstar
arr=( "$dir/scripts"/**/*.sh )
If your script might have to run under bash v3, see BashFAQ #20: "How can I find and safely handle file names containing newlines, spaces or both?", or just use this:
while IFS= read -r -d '' f <&3; do
    if [ -f "$f" ]
    # ... etc
done 3< <(find "$dir/scripts" -type f -name '*.sh' -print0)
(That's my favorite it-just-works idiom for iterating over find's matches. Although it does require bash, not some generic POSIX shell.)
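For illustration, the same idiom can collect the matches into an array. This is a sketch under the assumption that find supports -print0 (GNU and BSD find do); the directory layout and the found array are invented for the example.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)                              # throwaway directory for the demo
mkdir -p "$dir/scripts/sub dir"
touch "$dir/scripts/top.sh" \
      "$dir/scripts/sub dir/nested one.sh" \
      "$dir/scripts/notes.txt"

found=()
# find writes NUL-terminated paths to file descriptor 3; read consumes them one by one.
while IFS= read -r -d '' f <&3; do
    found+=( "$f" )
done 3< <(find "$dir/scripts" -type f -name '*.sh' -print0)

echo "found ${#found[@]} scripts"
rm -rf "$dir"
```

Using descriptor 3 keeps standard input free for whatever runs inside the loop body.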

Related

Bash variable expansion that includes single or double quotes [duplicate]

I'm encountering an issue passing an argument to a command in a Bash script.
poc.sh:
#!/bin/bash
ARGS='"hi there" test'
./swap ${ARGS}
swap:
#!/bin/sh
echo "${2}" "${1}"
The current output is:
there" "hi
Changing only poc.sh (as I believe swap does what I want it to correctly), how do I get poc.sh to pass "hi there" and test as two arguments, with "hi there" having no quotes around it?
A Few Introductory Words
If at all possible, don't use shell-quoted strings as an input format.
It's hard to parse consistently: Different shells have different extensions, and different non-shell implementations implement different subsets (see the deltas between shlex and xargs below).
It's hard to programmatically generate. ksh and bash have printf '%q', which will generate a shell-quoted string with contents of an arbitrary variable, but no equivalent exists to this in the POSIX sh standard.
It's easy to parse badly. Many folks consuming this format use eval, which has substantial security concerns.
NUL-delimited streams are a far better practice, as they can accurately represent any possible shell array or argument list with no ambiguity whatsoever.
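As a sketch of why NUL delimiting is unambiguous, the following round-trips an array through a NUL-delimited stream. It assumes bash; the names original and restored are made up for the example.

```shell
#!/usr/bin/env bash
# Values chosen to be awkward: a space, an embedded newline, and a glob character.
original=( "hi there" test $'two\nlines' '*' )

restored=()
# printf emits each element followed by a NUL byte; read -d '' splits on NUL only.
while IFS= read -r -d '' item; do
    restored+=( "$item" )
done < <(printf '%s\0' "${original[@]}")

echo "restored ${#restored[@]} elements"
```

Since a NUL byte cannot occur inside a shell string or argument, no element can be confused with the delimiter.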
xargs, with bashisms
If you're getting your argument list from a human-generated input source using shell quoting, you might consider using xargs to parse it. Consider:
array=( )
while IFS= read -r -d ''; do
    array+=( "$REPLY" )
done < <(xargs printf '%s\0' <<<"$ARGS")
./swap "${array[@]}"
...will put the parsed content of $ARGS into the array array. If you wanted to read from a file instead, substitute <filename for <<<"$ARGS".
xargs, POSIX-compliant
If you're trying to write code compliant with POSIX sh, this gets trickier. (I'm going to assume file input here for reduced complexity):
# This does not work with entries containing literal newlines; you need bash for that.
run_with_args() {
    while IFS= read -r entry; do
        set -- "$@" "$entry"
    done
    "$@"
}
xargs printf '%s\n' <argfile | run_with_args ./swap
These approaches are safer than running xargs ./swap <argfile inasmuch as they will throw an error if there are more or longer arguments than can be accommodated, rather than running the excess arguments as separate commands.
Python shlex -- rather than xargs -- with bashisms
If you need more accurate POSIX sh parsing than xargs implements, consider using the Python shlex module instead:
shlex_split() {
    python -c '
import shlex, sys
for item in shlex.split(sys.stdin.read()):
    sys.stdout.write(item + "\0")
'
}
array=( )
while IFS= read -r -d ''; do
    array+=( "$REPLY" )
done < <(shlex_split <<<"$ARGS")
Embedded quotes do not protect whitespace; they are treated literally. Use an array in bash:
args=( "hi there" test )
./swap "${args[@]}"
In POSIX shell, you are stuck using eval (which is why most shells support arrays).
args='"hi there" test'
eval "./swap $args"
As usual, be very sure you know the contents of $args and understand how the resulting string will be parsed before using eval.
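To make the risk concrete, here is a small sketch showing that eval both honors the embedded quotes and runs anything else the string contains. The helper count and the variable hostile are hypothetical, and the injected command is a harmless echo.

```shell
#!/usr/bin/env bash
# count is a hypothetical helper that prints how many arguments it received.
count() { echo "$#"; }

args='"hi there" test'
n=$(eval "count $args")          # the embedded quotes are parsed: two arguments

# A string from an untrusted source can carry a command substitution.
hostile='x $(echo injected)'
seen=$(eval "printf '%s\n' $hostile")

echo "arguments: $n"
printf '%s\n' "$seen"
```

The second eval runs the embedded command substitution, which is exactly why the contents of the string must be fully trusted.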
Ugly Idea Alert: Pure Bash Function
Here's a quoted-string parser written in pure bash (what terrible fun)!
Caveat: just like the xargs example above, this errors in the case of an escaped quote. This could be fixed... but much better to do in an actual programming language.
Example Usage
MY_ARGS="foo 'bar baz' qux * "'$(dangerous)'" sudo ls -lah"
# Create array from multi-line string
IFS=$'\r\n' GLOBIGNORE='*' args=($(parseargs "$MY_ARGS"))
# Show each of the arguments array
for arg in "${args[@]}"; do
    echo "$arg"
done
Example Output
foo
bar baz
qux
*
Parse Argument Function
This literally goes character-by-character and either adds to the current string or the current array.
set -u
set -e
# ParseArgs will parse a string that contains quoted strings the same as bash does
# (same as most other *nix shells do). This is secure in the sense that it doesn't do any
# executing or interpreting. However, it also doesn't do any escaping, so you shouldn't pass
# these strings to shells without escaping them.
parseargs() {
    # Sentinel meaning "not inside a quote". It is empty so that it can never equal
    # a real character (a "-" sentinel would silently drop dashes such as in -lah).
    notquote=""
    str=$1
    declare -a args=()
    s=""
    # Strip leading space, then trailing space, then end with space.
    str="${str## }"
    str="${str%% }"
    str+=" "
    last_quote="${notquote}"
    is_space=""
    n=$(( ${#str} - 1 ))
    for ((i=0;i<=$n;i+=1)); do
        c="${str:$i:1}"
        # If we're ending a quote, break out and skip this character
        if [ "$c" == "$last_quote" ]; then
            last_quote=$notquote
            continue
        fi
        # If we're in a quote, count this character
        if [ "$last_quote" != "$notquote" ]; then
            s+=$c
            continue
        fi
        # If we encounter a quote, enter it and skip this character
        if [ "$c" == "'" ] || [ "$c" == '"' ]; then
            is_space=""
            last_quote=$c
            continue
        fi
        # If it's a space, store the string
        re="[[:space:]]+" # must be used as a var, not a literal
        if [[ $c =~ $re ]]; then
            # Skip leading or repeated whitespace without emitting anything
            if [ "0" == "$i" ] || [ -n "$is_space" ]; then
                continue
            fi
            is_space="true"
            args+=("$s")
            s=""
            continue
        fi
        is_space=""
        s+="$c"
    done
    if [ "$last_quote" != "$notquote" ]; then
        >&2 echo "error: quote not terminated"
        return 1
    fi
    for arg in "${args[@]}"; do
        echo "$arg"
    done
    return 0
}
I may or may not keep this updated at:
https://git.coolaj86.com/coolaj86/git-scripts/src/branch/master/git-proxy
Seems like a rather stupid thing to do... but I had the itch... oh well.
This might not be the most robust approach, but it is simple, and seems to work for your case:
## demonstration matching the question
$ ( ARGS='"hi there" test' ; ./swap ${ARGS} )
there" "hi
## simple solution, using 'xargs'
$ ( ARGS='"hi there" test' ; echo ${ARGS} |xargs ./swap )
test hi there

Equivalent of shlex.split in bash without python [duplicate]

Script is not glob-expanding, but works fine when running the culprit as a minimalistic example

I've been trying for hours on this problem, and cannot set it straight.
This minimal script works as it should:
#!/bin/bash
wipe_thumbs=1
if (( wipe_thumbs )); then
    src_dir=$1
    thumbs="$src_dir"/*/t1*.jpg
    echo $thumbs
fi
Invoke with ./script workdir and a lot of filenames starting with t1* in all the sub-dirs of workdir are shown.
When putting the above if-case in the bigger script, the globbing is not executed:
SRC: -- workdir/ --
THUMBS: -- workdir//*/t1*.jpg --
ls: cannot access workdir//*/t1*.jpg: No such file or directory
The only difference with the big script and the minimal script is that the big script has a path-validator and getopts-extractor. This code is immediately above the if-case:
#!/bin/bash
OPTIONS=":ts:d:"
src_dir=""
dest_dir=""
wipe_thumbs=0
while getopts $OPTIONS opt ; do
    case "$opt" in
        t) wipe_thumbs=1
            ;;
    esac
done
shift $((OPTIND - 1))
src_dir="$1"
dest_dir="${2:-${src_dir%/*}.WORK}"
# Validate source
echo -n "Validating source..."
if [[ -z "$src_dir" ]]; then
    echo "Can't do anything without a source-dir."
    exit
else
    if [[ ! -d "$src_dir" ]]; then
        echo "\"$src_dir\" is really not a directory."
        exit
    fi
fi
echo "done"
# Validate dest
echo -n "Validating destination..."
if [[ ! -d "$dest_dir" ]]; then
    mkdir "$dest_dir"
    (( $? > 0 )) && exit
else
    if [[ ! -w "$dest_dir" ]]; then
        echo "Can't write into the specified destination-dir."
        exit
    fi
fi
echo "done"
# Move out the files into extension-named directories
echo -n "Moving files..."
if (( wipe_thumbs )); then
    thumbs="$src_dir"/*/t1*.jpg # not expanded
    echo DEBUG THUMBS: -- "$thumbs" --
    n_thumbs=$(ls "$thumbs" | wc -l)
    rm "$thumbs"
fi
...rest of script, never reached due to error...
Can anyone shed some light on this? Why is the glob not expanded in the big script, but working fine in the minimalistic test script?
EDIT: Added the complete if-case.
The problem is that wildcards aren't expanded in assignment statements (e.g. thumbs="$src_dir"/*/t1*.jpg), but are expanded when variables are used without double-quotes. Here's an interactive example:
$ src_dir=workdir
$ thumbs="$src_dir"/*/t1*.jpg
$ echo $thumbs # No double-quotes, wildcards will be expanded
workdir/sub1/t1-1.jpg workdir/sub1/t1-2.jpg workdir/sub2/t1-1.jpg workdir/sub2/t1-2.jpg
$ echo "$thumbs" # Double-quotes, wildcards printed literally
workdir/*/t1*.jpg
$ ls $thumbs # No double-quotes, wildcards will be expanded
workdir/sub1/t1-1.jpg workdir/sub2/t1-1.jpg
workdir/sub1/t1-2.jpg workdir/sub2/t1-2.jpg
$ ls "$thumbs" # Double-quotes, wildcards treated as literal parts of filename
ls: workdir/*/t1*.jpg: No such file or directory
...so the quick-n-easy fix is to remove the double-quotes from the ls and rm commands. But this isn't safe, as it'll also cause parsing problems if $src_dir contains any whitespace or wildcard characters (this may not be an issue for you, but I'm used to OS X where spaces in filenames are everywhere, and I've learned to be careful about these things). The best way to do this is to store the list of thumb files as an array:
$ src_dir="work dir"
$ thumbs=("$src_dir"/*/t1*.jpg) # Double-quotes protect $src_dir, but not the wildcard portions
$ echo "${thumbs[@]}" # The "${array[@]}" idiom expands each array element as a separate word
work dir/sub1/t1-1.jpg work dir/sub1/t1-2.jpg work dir/sub2/t1-1.jpg work dir/sub2/t1-2.jpg
$ ls "${thumbs[@]}"
work dir/sub1/t1-1.jpg work dir/sub2/t1-1.jpg
work dir/sub1/t1-2.jpg work dir/sub2/t1-2.jpg
You might also want to set nullglob in case there aren't any matches (so it'll expand to a zero-length array).
In your script, this'd come out something like this:
if (( wipe_thumbs )); then
    shopt -s nullglob
    thumbs=("$src_dir"/*/t1*.jpg) # expanded as array elements
    shopt -u nullglob # back to "normal" to avoid unexpected behavior later
    printf 'DEBUG THUMBS: --'
    printf ' "%s"' "${thumbs[@]}"
    printf ' --\n'
    # n_thumbs=$(ls "${thumbs[@]}" | wc -l) # wrong way to do this...
    n_thumbs=${#thumbs[@]} # better...
    if (( n_thumbs == 0 )); then
        echo "No thumb files found" >&2
        exit
    fi
    rm "${thumbs[@]}"
fi
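The effect of nullglob on an array assignment can be seen in isolation with an empty directory. This is a sketch assuming bash with its default options (neither nullglob nor failglob set beforehand); the array names plain and empty are invented for the example.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)             # empty throwaway directory: nothing matches *.jpg

# Without nullglob, an unmatched pattern is kept as one literal element.
plain=( "$tmp"/*.jpg )

shopt -s nullglob
# With nullglob, an unmatched pattern expands to nothing at all.
empty=( "$tmp"/*.jpg )
shopt -u nullglob

echo "without nullglob: ${#plain[@]} element(s)"
echo "with nullglob:    ${#empty[@]} element(s)"
rm -rf "$tmp"
```

This is why the element count is only a reliable "no files found" check when nullglob is enabled around the assignment.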

Assembling a command in shell script [duplicate]

Reading quoted/escaped arguments correctly from a string
