Iterate over a list of quoted strings - bash

I'm trying to run a for loop over a list of strings where some of them are quoted and others are not like so:
STRING='foo "bar_no_space" "baz with space"'
for item in $STRING; do
echo "$item"
done
Expected result:
foo
bar_no_space
baz with space
Actual result:
foo
"bar_no_space"
"baz
with
space"
I can achieve the expected result by running the following command:
bash -c 'for item in '"$STRING"'; do echo "$item"; done;'
I would like to do this without spawning a new bash process or using eval because I do not want to take the risk of having random commands executed.
Please note that I do not control the definition of the STRING variable, I receive it through an environment variable. So I can't write something like:
array=(foo "bar_no_space" "baz with space")
for item in "${array[#]}"; do
echo "$item"
done
If it helps, what I am actually trying to do is split the string as a list of arguments that I can pass to another command.
I have:
STRING='foo "bar_no_space" "baz with space"'
And I want to run:
my-command --arg foo --arg "bar_no_space" --arg "baz with space"

Use an array instead of a normal variable.
arr=(foo "bar_no_space" "baz with space")
To print the values:
print '%s\n' "${arr[#]}"
And to call your command:
my-command --arg "${arr[0]}" --arg "${arr[1]}" --arg "{$arr[2]}"

Can you try something like this:
sh-4.4$ echo $string
foo "bar_no_space" "baz with space"
sh-4.4$ echo $string|awk 'BEGIN{FS="\""}{for(i=1;i<NF;i++)print $i}'|sed '/^ $/d'
foo
bar_no_space
baz with space

Solved: xargs + subshell
A few years late to the party, but...
Malicious Input:
SSH_ORIGINAL_COMMAND='echo "hello world" foo '"'"'bar'"'"'; sudo ls -lah /; say -v Ting-Ting "evil cackle"'
Note: I originally had an rm -rf in there, but then I realized that would be a recipe for disaster when testing variations of the script.
Converted perfectly into safe args:
# DO NOT put IFS= on its own line
IFS=$'\r\n' GLOBIGNORE='*' args=($(echo "$SSH_ORIGINAL_COMMAND" \
| xargs bash -c 'for arg in "$#"; do echo "$arg"; done'))
echo "${args[#]}"
See that you can indeed pass these arguments just like $#:
for arg in "${args[#]}"
do
echo "$arg"
done
Output:
hello world
foo
bar;
sudo
rm
-rf
/;
say
-v
Ting-Ting
evil cackle
I'm too embarrassed to say how much time I spent researching this to figure it out, but once you get the itch... y'know?
Defeating xargs
It is possible to fool xargs by providing escaped quotes:
SSH_ORIGINAL_COMMAND='\"hello world\"'
This can make a literal quote part of the output:
"hello
world"
Or it can cause an error:
SSH_ORIGINAL_COMMAND='\"hello world"'
xargs: unmatched double quote; by default quotes are special to xargs unless you use the -0 option
In either case, it doesn't enable arbitrary execution of code - the parameters are still escaped.

Pure bash parser
Here's a quoted-string parser written in pure bash (what terrible fun)!
Caveat: just like the xargs example above, this errors in the case of an escaped quoted.
Usage
MY_ARGS="foo 'bar baz' qux * "'$(dangerous)'" sudo ls -lah"
# Create array from multi-line string
IFS=$'\r\n' GLOBIGNORE='*' args=($(parseargs "$MY_ARGS"))
# Show each of the arguments array
for arg in "${args[#]}"; do
echo "$arg"
done
Output:
$#: foo bar baz qux *
foo
bar baz
qux
*
Parse Argument Function
Literally going character-by-character and adding to the current string, or adding to the array.
set -u
set -e
# ParseArgs will parse a string that contains quoted strings the same as bash does
# (same as most other *nix shells do). This is secure in the sense that it doesn't do any
# executing or interpreting. However, it also doesn't do any escaping, so you shouldn't pass
# these strings to shells without escaping them.
parseargs() {
notquote="-"
str=$1
declare -a args=()
s=""
# Strip leading space, then trailing space, then end with space.
str="${str## }"
str="${str%% }"
str+=" "
last_quote="${notquote}"
is_space=""
n=$(( ${#str} - 1 ))
for ((i=0;i<=$n;i+=1)); do
c="${str:$i:1}"
# If we're ending a quote, break out and skip this character
if [ "$c" == "$last_quote" ]; then
last_quote=$notquote
continue
fi
# If we're in a quote, count this character
if [ "$last_quote" != "$notquote" ]; then
s+=$c
continue
fi
# If we encounter a quote, enter it and skip this character
if [ "$c" == "'" ] || [ "$c" == '"' ]; then
is_space=""
last_quote=$c
continue
fi
# If it's a space, store the string
re="[[:space:]]+" # must be used as a var, not a literal
if [[ $c =~ $re ]]; then
if [ "0" == "$i" ] || [ -n "$is_space" ]; then
echo continue $i $is_space
continue
fi
is_space="true"
args+=("$s")
s=""
continue
fi
is_space=""
s+="$c"
done
if [ "$last_quote" != "$notquote" ]; then
>&2 echo "error: quote not terminated"
return 1
fi
for arg in "${args[#]}"; do
echo "$arg"
done
return 0
}
I may or may not keep this updated at:
https://git.coolaj86.com/coolaj86/git-scripts/src/branch/master/git-proxy
Seems like a rather stupid thing to do... but I had the itch... oh well.

Here is a way without an array of strings or other difficulties (but with bash calling and eval):
STRING='foo "bar_no_space" "baz with space"'
eval "bash -c 'while [ -n \"\$1\" ]; do echo \$1; shift; done' -- $STRING"
Output:
foo
bar_no_space
baz with space
If You want to do with the strings something more difficult then just echo You can split Your script:
split_qstrings.sh
#!/bin/bash
while [ -n "$1" ]
do
echo "$1"
shift
done
Another part with more difficult processing (capitalizing of a characters for example):
STRING='foo "bar_no_space" "baz with space"'
eval "split_qstrings.sh $STRING" | while read line
do
echo "$line" | sed 's/a/A/g'
done
Output:
foo
bAr_no_spAce
bAz with spAce

Related

Bash variable expansion that includes single or double quotes [duplicate]

I'm encountering an issue passing an argument to a command in a Bash script.
poc.sh:
#!/bin/bash
ARGS='"hi there" test'
./swap ${ARGS}
swap:
#!/bin/sh
echo "${2}" "${1}"
The current output is:
there" "hi
Changing only poc.sh (as I believe swap does what I want it to correctly), how do I get poc.sh to pass "hi there" and test as two arguments, with "hi there" having no quotes around it?
A Few Introductory Words
If at all possible, don't use shell-quoted strings as an input format.
It's hard to parse consistently: Different shells have different extensions, and different non-shell implementations implement different subsets (see the deltas between shlex and xargs below).
It's hard to programmatically generate. ksh and bash have printf '%q', which will generate a shell-quoted string with contents of an arbitrary variable, but no equivalent exists to this in the POSIX sh standard.
It's easy to parse badly. Many folks consuming this format use eval, which has substantial security concerns.
NUL-delimited streams are a far better practice, as they can accurately represent any possible shell array or argument list with no ambiguity whatsoever.
xargs, with bashisms
If you're getting your argument list from a human-generated input source using shell quoting, you might consider using xargs to parse it. Consider:
array=( )
while IFS= read -r -d ''; do
array+=( "$REPLY" )
done < <(xargs printf '%s\0' <<<"$ARGS")
swap "${array[#]}"
...will put the parsed content of $ARGS into the array array. If you wanted to read from a file instead, substitute <filename for <<<"$ARGS".
xargs, POSIX-compliant
If you're trying to write code compliant with POSIX sh, this gets trickier. (I'm going to assume file input here for reduced complexity):
# This does not work with entries containing literal newlines; you need bash for that.
run_with_args() {
while IFS= read -r entry; do
set -- "$#" "$entry"
done
"$#"
}
xargs printf '%s\n' <argfile | run_with_args ./swap
These approaches are safer than running xargs ./swap <argfile inasmuch as it will throw an error if there are more or longer arguments than can be accommodated, rather than running excess arguments as separate commands.
Python shlex -- rather than xargs -- with bashisms
If you need more accurate POSIX sh parsing than xargs implements, consider using the Python shlex module instead:
shlex_split() {
python -c '
import shlex, sys
for item in shlex.split(sys.stdin.read()):
sys.stdout.write(item + "\0")
'
}
while IFS= read -r -d ''; do
array+=( "$REPLY" )
done < <(shlex_split <<<"$ARGS")
Embedded quotes do not protect whitespace; they are treated literally. Use an array in bash:
args=( "hi there" test)
./swap "${args[#]}"
In POSIX shell, you are stuck using eval (which is why most shells support arrays).
args='"hi there" test'
eval "./swap $args"
As usual, be very sure you know the contents of $args and understand how the resulting string will be parsed before using eval.
Ugly Idea Alert: Pure Bash Function
Here's a quoted-string parser written in pure bash (what terrible fun)!
Caveat: just like the xargs example above, this errors in the case of an escaped quote. This could be fixed... but much better to do in an actual programming language.
Example Usage
MY_ARGS="foo 'bar baz' qux * "'$(dangerous)'" sudo ls -lah"
# Create array from multi-line string
IFS=$'\r\n' GLOBIGNORE='*' args=($(parseargs "$MY_ARGS"))
# Show each of the arguments array
for arg in "${args[#]}"; do
echo "$arg"
done
Example Output
foo
bar baz
qux
*
Parse Argument Function
This literally goes character-by-character and either adds to the current string or the current array.
set -u
set -e
# ParseArgs will parse a string that contains quoted strings the same as bash does
# (same as most other *nix shells do). This is secure in the sense that it doesn't do any
# executing or interpreting. However, it also doesn't do any escaping, so you shouldn't pass
# these strings to shells without escaping them.
parseargs() {
notquote="-"
str=$1
declare -a args=()
s=""
# Strip leading space, then trailing space, then end with space.
str="${str## }"
str="${str%% }"
str+=" "
last_quote="${notquote}"
is_space=""
n=$(( ${#str} - 1 ))
for ((i=0;i<=$n;i+=1)); do
c="${str:$i:1}"
# If we're ending a quote, break out and skip this character
if [ "$c" == "$last_quote" ]; then
last_quote=$notquote
continue
fi
# If we're in a quote, count this character
if [ "$last_quote" != "$notquote" ]; then
s+=$c
continue
fi
# If we encounter a quote, enter it and skip this character
if [ "$c" == "'" ] || [ "$c" == '"' ]; then
is_space=""
last_quote=$c
continue
fi
# If it's a space, store the string
re="[[:space:]]+" # must be used as a var, not a literal
if [[ $c =~ $re ]]; then
if [ "0" == "$i" ] || [ -n "$is_space" ]; then
echo continue $i $is_space
continue
fi
is_space="true"
args+=("$s")
s=""
continue
fi
is_space=""
s+="$c"
done
if [ "$last_quote" != "$notquote" ]; then
>&2 echo "error: quote not terminated"
return 1
fi
for arg in "${args[#]}"; do
echo "$arg"
done
return 0
}
I may or may not keep this updated at:
https://git.coolaj86.com/coolaj86/git-scripts/src/branch/master/git-proxy
Seems like a rather stupid thing to do... but I had the itch... oh well.
This might not be the most robust approach, but it is simple, and seems to work for your case:
## demonstration matching the question
$ ( ARGS='"hi there" test' ; ./swap ${ARGS} )
there" "hi
## simple solution, using 'xargs'
$ ( ARGS='"hi there" test' ; echo ${ARGS} |xargs ./swap )
test hi there

Equivalent of shlex.split in bash without python [duplicate]

I'm encountering an issue passing an argument to a command in a Bash script.
poc.sh:
#!/bin/bash
ARGS='"hi there" test'
./swap ${ARGS}
swap:
#!/bin/sh
echo "${2}" "${1}"
The current output is:
there" "hi
Changing only poc.sh (as I believe swap does what I want it to correctly), how do I get poc.sh to pass "hi there" and test as two arguments, with "hi there" having no quotes around it?
A Few Introductory Words
If at all possible, don't use shell-quoted strings as an input format.
It's hard to parse consistently: Different shells have different extensions, and different non-shell implementations implement different subsets (see the deltas between shlex and xargs below).
It's hard to programmatically generate. ksh and bash have printf '%q', which will generate a shell-quoted string with contents of an arbitrary variable, but no equivalent exists to this in the POSIX sh standard.
It's easy to parse badly. Many folks consuming this format use eval, which has substantial security concerns.
NUL-delimited streams are a far better practice, as they can accurately represent any possible shell array or argument list with no ambiguity whatsoever.
xargs, with bashisms
If you're getting your argument list from a human-generated input source using shell quoting, you might consider using xargs to parse it. Consider:
array=( )
while IFS= read -r -d ''; do
array+=( "$REPLY" )
done < <(xargs printf '%s\0' <<<"$ARGS")
swap "${array[#]}"
...will put the parsed content of $ARGS into the array array. If you wanted to read from a file instead, substitute <filename for <<<"$ARGS".
xargs, POSIX-compliant
If you're trying to write code compliant with POSIX sh, this gets trickier. (I'm going to assume file input here for reduced complexity):
# This does not work with entries containing literal newlines; you need bash for that.
run_with_args() {
while IFS= read -r entry; do
set -- "$#" "$entry"
done
"$#"
}
xargs printf '%s\n' <argfile | run_with_args ./swap
These approaches are safer than running xargs ./swap <argfile inasmuch as it will throw an error if there are more or longer arguments than can be accommodated, rather than running excess arguments as separate commands.
Python shlex -- rather than xargs -- with bashisms
If you need more accurate POSIX sh parsing than xargs implements, consider using the Python shlex module instead:
shlex_split() {
python -c '
import shlex, sys
for item in shlex.split(sys.stdin.read()):
sys.stdout.write(item + "\0")
'
}
while IFS= read -r -d ''; do
array+=( "$REPLY" )
done < <(shlex_split <<<"$ARGS")
Embedded quotes do not protect whitespace; they are treated literally. Use an array in bash:
args=( "hi there" test)
./swap "${args[#]}"
In POSIX shell, you are stuck using eval (which is why most shells support arrays).
args='"hi there" test'
eval "./swap $args"
As usual, be very sure you know the contents of $args and understand how the resulting string will be parsed before using eval.
Ugly Idea Alert: Pure Bash Function
Here's a quoted-string parser written in pure bash (what terrible fun)!
Caveat: just like the xargs example above, this errors in the case of an escaped quote. This could be fixed... but much better to do in an actual programming language.
Example Usage
MY_ARGS="foo 'bar baz' qux * "'$(dangerous)'" sudo ls -lah"
# Create array from multi-line string
IFS=$'\r\n' GLOBIGNORE='*' args=($(parseargs "$MY_ARGS"))
# Show each of the arguments array
for arg in "${args[#]}"; do
echo "$arg"
done
Example Output
foo
bar baz
qux
*
Parse Argument Function
This literally goes character-by-character and either adds to the current string or the current array.
set -u
set -e
# ParseArgs will parse a string that contains quoted strings the same as bash does
# (same as most other *nix shells do). This is secure in the sense that it doesn't do any
# executing or interpreting. However, it also doesn't do any escaping, so you shouldn't pass
# these strings to shells without escaping them.
parseargs() {
notquote="-"
str=$1
declare -a args=()
s=""
# Strip leading space, then trailing space, then end with space.
str="${str## }"
str="${str%% }"
str+=" "
last_quote="${notquote}"
is_space=""
n=$(( ${#str} - 1 ))
for ((i=0;i<=$n;i+=1)); do
c="${str:$i:1}"
# If we're ending a quote, break out and skip this character
if [ "$c" == "$last_quote" ]; then
last_quote=$notquote
continue
fi
# If we're in a quote, count this character
if [ "$last_quote" != "$notquote" ]; then
s+=$c
continue
fi
# If we encounter a quote, enter it and skip this character
if [ "$c" == "'" ] || [ "$c" == '"' ]; then
is_space=""
last_quote=$c
continue
fi
# If it's a space, store the string
re="[[:space:]]+" # must be used as a var, not a literal
if [[ $c =~ $re ]]; then
if [ "0" == "$i" ] || [ -n "$is_space" ]; then
echo continue $i $is_space
continue
fi
is_space="true"
args+=("$s")
s=""
continue
fi
is_space=""
s+="$c"
done
if [ "$last_quote" != "$notquote" ]; then
>&2 echo "error: quote not terminated"
return 1
fi
for arg in "${args[#]}"; do
echo "$arg"
done
return 0
}
I may or may not keep this updated at:
https://git.coolaj86.com/coolaj86/git-scripts/src/branch/master/git-proxy
Seems like a rather stupid thing to do... but I had the itch... oh well.
This might not be the most robust approach, but it is simple, and seems to work for your case:
## demonstration matching the question
$ ( ARGS='"hi there" test' ; ./swap ${ARGS} )
there" "hi
## simple solution, using 'xargs'
$ ( ARGS='"hi there" test' ; echo ${ARGS} |xargs ./swap )
test hi there

Shell argument expansion when passing with backticks [duplicate]

I'm encountering an issue passing an argument to a command in a Bash script.
poc.sh:
#!/bin/bash
ARGS='"hi there" test'
./swap ${ARGS}
swap:
#!/bin/sh
echo "${2}" "${1}"
The current output is:
there" "hi
Changing only poc.sh (as I believe swap does what I want it to correctly), how do I get poc.sh to pass "hi there" and test as two arguments, with "hi there" having no quotes around it?
A Few Introductory Words
If at all possible, don't use shell-quoted strings as an input format.
It's hard to parse consistently: Different shells have different extensions, and different non-shell implementations implement different subsets (see the deltas between shlex and xargs below).
It's hard to programmatically generate. ksh and bash have printf '%q', which will generate a shell-quoted string with contents of an arbitrary variable, but no equivalent exists to this in the POSIX sh standard.
It's easy to parse badly. Many folks consuming this format use eval, which has substantial security concerns.
NUL-delimited streams are a far better practice, as they can accurately represent any possible shell array or argument list with no ambiguity whatsoever.
xargs, with bashisms
If you're getting your argument list from a human-generated input source using shell quoting, you might consider using xargs to parse it. Consider:
array=( )
while IFS= read -r -d ''; do
array+=( "$REPLY" )
done < <(xargs printf '%s\0' <<<"$ARGS")
swap "${array[#]}"
...will put the parsed content of $ARGS into the array array. If you wanted to read from a file instead, substitute <filename for <<<"$ARGS".
xargs, POSIX-compliant
If you're trying to write code compliant with POSIX sh, this gets trickier. (I'm going to assume file input here for reduced complexity):
# This does not work with entries containing literal newlines; you need bash for that.
run_with_args() {
while IFS= read -r entry; do
set -- "$#" "$entry"
done
"$#"
}
xargs printf '%s\n' <argfile | run_with_args ./swap
These approaches are safer than running xargs ./swap <argfile inasmuch as it will throw an error if there are more or longer arguments than can be accommodated, rather than running excess arguments as separate commands.
Python shlex -- rather than xargs -- with bashisms
If you need more accurate POSIX sh parsing than xargs implements, consider using the Python shlex module instead:
shlex_split() {
python -c '
import shlex, sys
for item in shlex.split(sys.stdin.read()):
sys.stdout.write(item + "\0")
'
}
while IFS= read -r -d ''; do
array+=( "$REPLY" )
done < <(shlex_split <<<"$ARGS")
Embedded quotes do not protect whitespace; they are treated literally. Use an array in bash:
args=( "hi there" test)
./swap "${args[#]}"
In POSIX shell, you are stuck using eval (which is why most shells support arrays).
args='"hi there" test'
eval "./swap $args"
As usual, be very sure you know the contents of $args and understand how the resulting string will be parsed before using eval.
Ugly Idea Alert: Pure Bash Function
Here's a quoted-string parser written in pure bash (what terrible fun)!
Caveat: just like the xargs example above, this errors in the case of an escaped quote. This could be fixed... but much better to do in an actual programming language.
Example Usage
MY_ARGS="foo 'bar baz' qux * "'$(dangerous)'" sudo ls -lah"
# Create array from multi-line string
IFS=$'\r\n' GLOBIGNORE='*' args=($(parseargs "$MY_ARGS"))
# Show each of the arguments array
for arg in "${args[#]}"; do
echo "$arg"
done
Example Output
foo
bar baz
qux
*
Parse Argument Function
This literally goes character-by-character and either adds to the current string or the current array.
set -u
set -e
# ParseArgs will parse a string that contains quoted strings the same as bash does
# (same as most other *nix shells do). This is secure in the sense that it doesn't do any
# executing or interpreting. However, it also doesn't do any escaping, so you shouldn't pass
# these strings to shells without escaping them.
parseargs() {
notquote="-"
str=$1
declare -a args=()
s=""
# Strip leading space, then trailing space, then end with space.
str="${str## }"
str="${str%% }"
str+=" "
last_quote="${notquote}"
is_space=""
n=$(( ${#str} - 1 ))
for ((i=0;i<=$n;i+=1)); do
c="${str:$i:1}"
# If we're ending a quote, break out and skip this character
if [ "$c" == "$last_quote" ]; then
last_quote=$notquote
continue
fi
# If we're in a quote, count this character
if [ "$last_quote" != "$notquote" ]; then
s+=$c
continue
fi
# If we encounter a quote, enter it and skip this character
if [ "$c" == "'" ] || [ "$c" == '"' ]; then
is_space=""
last_quote=$c
continue
fi
# If it's a space, store the string
re="[[:space:]]+" # must be used as a var, not a literal
if [[ $c =~ $re ]]; then
if [ "0" == "$i" ] || [ -n "$is_space" ]; then
echo continue $i $is_space
continue
fi
is_space="true"
args+=("$s")
s=""
continue
fi
is_space=""
s+="$c"
done
if [ "$last_quote" != "$notquote" ]; then
>&2 echo "error: quote not terminated"
return 1
fi
for arg in "${args[#]}"; do
echo "$arg"
done
return 0
}
I may or may not keep this updated at:
https://git.coolaj86.com/coolaj86/git-scripts/src/branch/master/git-proxy
Seems like a rather stupid thing to do... but I had the itch... oh well.
This might not be the most robust approach, but it is simple, and seems to work for your case:
## demonstration matching the question
$ ( ARGS='"hi there" test' ; ./swap ${ARGS} )
there" "hi
## simple solution, using 'xargs'
$ ( ARGS='"hi there" test' ; echo ${ARGS} |xargs ./swap )
test hi there

Assembling a command in shell script [duplicate]

I'm encountering an issue passing an argument to a command in a Bash script.
poc.sh:
#!/bin/bash
ARGS='"hi there" test'
./swap ${ARGS}
swap:
#!/bin/sh
echo "${2}" "${1}"
The current output is:
there" "hi
Changing only poc.sh (as I believe swap does what I want it to correctly), how do I get poc.sh to pass "hi there" and test as two arguments, with "hi there" having no quotes around it?
A Few Introductory Words
If at all possible, don't use shell-quoted strings as an input format.
It's hard to parse consistently: Different shells have different extensions, and different non-shell implementations implement different subsets (see the deltas between shlex and xargs below).
It's hard to programmatically generate. ksh and bash have printf '%q', which will generate a shell-quoted string with contents of an arbitrary variable, but no equivalent exists to this in the POSIX sh standard.
It's easy to parse badly. Many folks consuming this format use eval, which has substantial security concerns.
NUL-delimited streams are a far better practice, as they can accurately represent any possible shell array or argument list with no ambiguity whatsoever.
xargs, with bashisms
If you're getting your argument list from a human-generated input source using shell quoting, you might consider using xargs to parse it. Consider:
array=( )
while IFS= read -r -d ''; do
array+=( "$REPLY" )
done < <(xargs printf '%s\0' <<<"$ARGS")
swap "${array[#]}"
...will put the parsed content of $ARGS into the array array. If you wanted to read from a file instead, substitute <filename for <<<"$ARGS".
xargs, POSIX-compliant
If you're trying to write code compliant with POSIX sh, this gets trickier. (I'm going to assume file input here for reduced complexity):
# This does not work with entries containing literal newlines; you need bash for that.
run_with_args() {
while IFS= read -r entry; do
set -- "$#" "$entry"
done
"$#"
}
xargs printf '%s\n' <argfile | run_with_args ./swap
These approaches are safer than running xargs ./swap <argfile inasmuch as it will throw an error if there are more or longer arguments than can be accommodated, rather than running excess arguments as separate commands.
Python shlex -- rather than xargs -- with bashisms
If you need more accurate POSIX sh parsing than xargs implements, consider using the Python shlex module instead:
shlex_split() {
python -c '
import shlex, sys
for item in shlex.split(sys.stdin.read()):
sys.stdout.write(item + "\0")
'
}
while IFS= read -r -d ''; do
array+=( "$REPLY" )
done < <(shlex_split <<<"$ARGS")
Embedded quotes do not protect whitespace; they are treated literally. Use an array in bash:
args=( "hi there" test)
./swap "${args[#]}"
In POSIX shell, you are stuck using eval (which is why most shells support arrays).
args='"hi there" test'
eval "./swap $args"
As usual, be very sure you know the contents of $args and understand how the resulting string will be parsed before using eval.
Ugly Idea Alert: Pure Bash Function
Here's a quoted-string parser written in pure bash (what terrible fun)!
Caveat: just like the xargs example above, this errors in the case of an escaped quote. This could be fixed... but much better to do in an actual programming language.
Example Usage
MY_ARGS="foo 'bar baz' qux * "'$(dangerous)'" sudo ls -lah"
# Create array from multi-line string
IFS=$'\r\n' GLOBIGNORE='*' args=($(parseargs "$MY_ARGS"))
# Show each of the arguments array
for arg in "${args[#]}"; do
echo "$arg"
done
Example Output
foo
bar baz
qux
*
Parse Argument Function
This literally goes character-by-character and either adds to the current string or the current array.
set -u
set -e
# ParseArgs will parse a string that contains quoted strings the same as bash does
# (same as most other *nix shells do). This is secure in the sense that it doesn't do any
# executing or interpreting. However, it also doesn't do any escaping, so you shouldn't pass
# these strings to shells without escaping them.
parseargs() {
notquote="-"
str=$1
declare -a args=()
s=""
# Strip leading space, then trailing space, then end with space.
str="${str## }"
str="${str%% }"
str+=" "
last_quote="${notquote}"
is_space=""
n=$(( ${#str} - 1 ))
for ((i=0;i<=$n;i+=1)); do
c="${str:$i:1}"
# If we're ending a quote, break out and skip this character
if [ "$c" == "$last_quote" ]; then
last_quote=$notquote
continue
fi
# If we're in a quote, count this character
if [ "$last_quote" != "$notquote" ]; then
s+=$c
continue
fi
# If we encounter a quote, enter it and skip this character
if [ "$c" == "'" ] || [ "$c" == '"' ]; then
is_space=""
last_quote=$c
continue
fi
# If it's a space, store the string
re="[[:space:]]+" # must be used as a var, not a literal
if [[ $c =~ $re ]]; then
if [ "0" == "$i" ] || [ -n "$is_space" ]; then
echo continue $i $is_space
continue
fi
is_space="true"
args+=("$s")
s=""
continue
fi
is_space=""
s+="$c"
done
if [ "$last_quote" != "$notquote" ]; then
>&2 echo "error: quote not terminated"
return 1
fi
for arg in "${args[#]}"; do
echo "$arg"
done
return 0
}
I may or may not keep this updated at:
https://git.coolaj86.com/coolaj86/git-scripts/src/branch/master/git-proxy
Seems like a rather stupid thing to do... but I had the itch... oh well.
This might not be the most robust approach, but it is simple, and seems to work for your case:
## demonstration matching the question
$ ( ARGS='"hi there" test' ; ./swap ${ARGS} )
there" "hi
## simple solution, using 'xargs'
$ ( ARGS='"hi there" test' ; echo ${ARGS} |xargs ./swap )
test hi there

Reading quoted/escaped arguments correctly from a string

I'm encountering an issue passing an argument to a command in a Bash script.
poc.sh:
#!/bin/bash
ARGS='"hi there" test'
./swap ${ARGS}
swap:
#!/bin/sh
echo "${2}" "${1}"
The current output is:
there" "hi
Changing only poc.sh (as I believe swap does what I want it to correctly), how do I get poc.sh to pass "hi there" and test as two arguments, with "hi there" having no quotes around it?
A Few Introductory Words
If at all possible, don't use shell-quoted strings as an input format.
It's hard to parse consistently: Different shells have different extensions, and different non-shell implementations implement different subsets (see the deltas between shlex and xargs below).
It's hard to programmatically generate. ksh and bash have printf '%q', which will generate a shell-quoted string with contents of an arbitrary variable, but no equivalent exists to this in the POSIX sh standard.
It's easy to parse badly. Many folks consuming this format use eval, which has substantial security concerns.
NUL-delimited streams are a far better practice, as they can accurately represent any possible shell array or argument list with no ambiguity whatsoever.
xargs, with bashisms
If you're getting your argument list from a human-generated input source using shell quoting, you might consider using xargs to parse it. Consider:
array=( )
while IFS= read -r -d ''; do
array+=( "$REPLY" )
done < <(xargs printf '%s\0' <<<"$ARGS")
swap "${array[#]}"
...will put the parsed content of $ARGS into the array array. If you wanted to read from a file instead, substitute <filename for <<<"$ARGS".
xargs, POSIX-compliant
If you're trying to write code compliant with POSIX sh, this gets trickier. (I'm going to assume file input here for reduced complexity):
# This does not work with entries containing literal newlines; you need bash for that.
run_with_args() {
while IFS= read -r entry; do
set -- "$#" "$entry"
done
"$#"
}
xargs printf '%s\n' <argfile | run_with_args ./swap
These approaches are safer than running xargs ./swap <argfile inasmuch as it will throw an error if there are more or longer arguments than can be accommodated, rather than running excess arguments as separate commands.
Python shlex -- rather than xargs -- with bashisms
If you need more accurate POSIX sh parsing than xargs implements, consider using the Python shlex module instead:
shlex_split() {
python -c '
import shlex, sys
for item in shlex.split(sys.stdin.read()):
sys.stdout.write(item + "\0")
'
}
while IFS= read -r -d ''; do
array+=( "$REPLY" )
done < <(shlex_split <<<"$ARGS")
Embedded quotes do not protect whitespace; they are treated literally. Use an array in bash:
args=( "hi there" test)
./swap "${args[#]}"
In POSIX shell, you are stuck using eval (which is why most shells support arrays).
args='"hi there" test'
eval "./swap $args"
As usual, be very sure you know the contents of $args and understand how the resulting string will be parsed before using eval.
Ugly Idea Alert: Pure Bash Function
Here's a quoted-string parser written in pure bash (what terrible fun)!
Caveat: just like the xargs example above, this errors in the case of an escaped quote. This could be fixed... but much better to do in an actual programming language.
Example Usage
MY_ARGS="foo 'bar baz' qux * "'$(dangerous)'" sudo ls -lah"
# Create array from multi-line string
IFS=$'\r\n' GLOBIGNORE='*' args=($(parseargs "$MY_ARGS"))
# Show each of the arguments array
for arg in "${args[#]}"; do
echo "$arg"
done
Example Output
foo
bar baz
qux
*
Parse Argument Function
This literally goes character-by-character and either adds to the current string or the current array.
set -u
set -e
# ParseArgs will parse a string that contains quoted strings the same as bash does
# (same as most other *nix shells do). This is secure in the sense that it doesn't do any
# executing or interpreting. However, it also doesn't do any escaping, so you shouldn't pass
# these strings to shells without escaping them.
parseargs() {
notquote="-"
str=$1
declare -a args=()
s=""
# Strip leading space, then trailing space, then end with space.
str="${str## }"
str="${str%% }"
str+=" "
last_quote="${notquote}"
is_space=""
n=$(( ${#str} - 1 ))
for ((i=0;i<=$n;i+=1)); do
c="${str:$i:1}"
# If we're ending a quote, break out and skip this character
if [ "$c" == "$last_quote" ]; then
last_quote=$notquote
continue
fi
# If we're in a quote, count this character
if [ "$last_quote" != "$notquote" ]; then
s+=$c
continue
fi
# If we encounter a quote, enter it and skip this character
if [ "$c" == "'" ] || [ "$c" == '"' ]; then
is_space=""
last_quote=$c
continue
fi
# If it's a space, store the string
re="[[:space:]]+" # must be used as a var, not a literal
if [[ $c =~ $re ]]; then
if [ "0" == "$i" ] || [ -n "$is_space" ]; then
echo continue $i $is_space
continue
fi
is_space="true"
args+=("$s")
s=""
continue
fi
is_space=""
s+="$c"
done
if [ "$last_quote" != "$notquote" ]; then
>&2 echo "error: quote not terminated"
return 1
fi
for arg in "${args[#]}"; do
echo "$arg"
done
return 0
}
I may or may not keep this updated at:
https://git.coolaj86.com/coolaj86/git-scripts/src/branch/master/git-proxy
Seems like a rather stupid thing to do... but I had the itch... oh well.
This might not be the most robust approach, but it is simple, and seems to work for your case:
## demonstration matching the question
$ ( ARGS='"hi there" test' ; ./swap ${ARGS} )
there" "hi
## simple solution, using 'xargs'
$ ( ARGS='"hi there" test' ; echo ${ARGS} |xargs ./swap )
test hi there

Resources