Reading quoted/escaped arguments correctly from a string - bash

I'm encountering an issue passing an argument to a command in a Bash script.
poc.sh:
#!/bin/bash
ARGS='"hi there" test'
./swap ${ARGS}
swap:
#!/bin/sh
echo "${2}" "${1}"
The current output is:
there" "hi
Changing only poc.sh (as I believe swap does what I want it to correctly), how do I get poc.sh to pass "hi there" and test as two arguments, with "hi there" having no quotes around it?

A Few Introductory Words
If at all possible, don't use shell-quoted strings as an input format.
It's hard to parse consistently: Different shells have different extensions, and different non-shell implementations implement different subsets (see the deltas between shlex and xargs below).
It's hard to programmatically generate. ksh and bash have printf '%q', which generates a shell-quoted string from the contents of an arbitrary variable, but the POSIX sh standard has no equivalent.
It's easy to parse badly. Many folks consuming this format use eval, which has substantial security concerns.
NUL-delimited streams are a far better practice, as they can accurately represent any possible shell array or argument list with no ambiguity whatsoever.
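As a rough sketch of what a NUL-delimited hand-off can look like in bash (the names `write_args`, `read_args`, and `parsed` are made up for this illustration and do not come from any library):

```bash
#!/usr/bin/env bash
# Hypothetical helpers: serialize an argument list as a NUL-delimited stream
# and read it back into an array, with no quoting rules involved.

write_args() {
    # Emit each argument followed by a NUL byte.
    # With zero arguments printf would still emit one NUL, so guard that case.
    if [ "$#" -gt 0 ]; then
        printf '%s\0' "$@"
    fi
}

read_args() {
    # Fill the global array "parsed" from NUL-delimited stdin.
    parsed=( )
    local item
    while IFS= read -r -d '' item; do
        parsed+=( "$item" )
    done
}

# Spaces, newlines and glob characters all survive the round trip unchanged.
read_args < <(write_args "hi there" test $'two\nlines' '*')
printf '<%s>\n' "${parsed[@]}"
```

Because NUL cannot occur inside a shell argument, no escaping layer is needed on either side; that is the whole appeal of the format.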
xargs, with bashisms
If you're getting your argument list from a human-generated input source using shell quoting, you might consider using xargs to parse it. Consider:
array=( )
while IFS= read -r -d ''; do
  array+=( "$REPLY" )
done < <(xargs printf '%s\0' <<<"$ARGS")
./swap "${array[@]}"
...will put the parsed content of $ARGS into the array array. If you wanted to read from a file instead, substitute <filename for <<<"$ARGS".
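On bash 4.4 or newer, the read loop can probably be collapsed into a single `mapfile -d ''` call. A minimal sketch, assuming that bash version and an external `printf` that understands `\0` in its format string (GNU coreutils does):

```bash
#!/usr/bin/env bash
# Assumes bash >= 4.4 (for mapfile -d) and an xargs/printf pair that emits NULs.
ARGS='"hi there" test'

# -d '' splits the input on NUL; -t drops the delimiter from each element.
mapfile -d '' -t array < <(xargs printf '%s\0' <<<"$ARGS")

# Show one parsed argument per line, wrapped in <> so boundaries are visible.
printf '<%s>\n' "${array[@]}"
```

With the ARGS value from the question this should yield two elements, `hi there` and `test`.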
xargs, POSIX-compliant
If you're trying to write code compliant with POSIX sh, this gets trickier. (I'm going to assume file input here for reduced complexity):
# This does not work with entries containing literal newlines; you need bash for that.
run_with_args() {
  while IFS= read -r entry; do
    set -- "$@" "$entry"
  done
  "$@"
}
xargs printf '%s\n' <argfile | run_with_args ./swap
These approaches are safer than running xargs ./swap <argfile inasmuch as they will throw an error if there are more or longer arguments than can be accommodated, rather than running the excess arguments as separate commands.
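To make the "separate commands" behavior concrete, here is a small illustration. It uses `-n 2` to force an artificially tiny per-invocation limit; in practice the limit comes from the system's maximum command-line size, so the `-n 2` setting is only an assumption chosen for the demo:

```bash
# xargs runs the command once per batch. With -n 2 each batch holds at most
# two arguments, so three arguments produce two separate echo invocations.
batched=$(printf '%s\n' one two three | xargs -n 2 echo)
printf '%s\n' "$batched"
# Prints:
#   one two
#   three
```

With `./swap` in place of `echo`, the same splitting would mean the program runs more than once, each time seeing only part of the argument list.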
Python shlex -- rather than xargs -- with bashisms
If you need more accurate POSIX sh parsing than xargs implements, consider using the Python shlex module instead:
shlex_split() {
  python -c '
import shlex, sys
for item in shlex.split(sys.stdin.read()):
    sys.stdout.write(item + "\0")
'
}
array=( )
while IFS= read -r -d ''; do
  array+=( "$REPLY" )
done < <(shlex_split <<<"$ARGS")

Embedded quotes do not protect whitespace; they are treated literally. Use an array in bash:
args=( "hi there" test)
./swap "${args[@]}"
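A quick way to see the difference is to print each resulting argument on its own line. The helper name `show_args` below is made up for this illustration:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every argument it receives, wrapped in <>.
show_args() { printf '<%s>\n' "$@"; }

ARGS='"hi there" test'
# Unquoted expansion: the value is split on whitespace only, and the quote
# characters stay in the words as literal data.
show_args ${ARGS}
# <"hi>
# <there">
# <test>

args=( "hi there" test )
# Quoted array expansion: each element becomes exactly one argument.
show_args "${args[@]}"
# <hi there>
# <test>
```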
In POSIX shell, you are stuck using eval (which is why most shells support arrays).
args='"hi there" test'
eval "./swap $args"
As usual, be very sure you know the contents of $args and understand how the resulting string will be parsed before using eval.
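When the string handed to eval is produced by your own bash code rather than typed by a human, one way to reduce the risk is to build it with `printf '%q'`. A minimal sketch, assuming bash (since `%q` is not POSIX) and a made-up `show_args` helper standing in for `./swap`:

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for ./swap: print each argument wrapped in <>.
show_args() { printf '<%s>\n' "$@"; }

original=( "hi there" 'semi;colon' '$(not run)' )

# %q quotes each element so that the shell parses it back to the same value;
# the format is reused for every element, leaving a space after each one.
printf -v quoted '%q ' "${original[@]}"

# eval re-parses the quoted string; the contents come back as data, not code.
eval "show_args $quoted"
# <hi there>
# <semi;colon>
# <$(not run)>
```

This only helps for strings you generate yourself. A hand-written string such as the ARGS value in the question still has to be trusted before it goes anywhere near eval.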

Ugly Idea Alert: Pure Bash Function
Here's a quoted-string parser written in pure bash (what terrible fun)!
Caveat: just like the xargs example above, this errors in the case of an escaped quote. This could be fixed... but much better to do in an actual programming language.
Example Usage
MY_ARGS="foo 'bar baz' qux * "'$(dangerous)'" sudo ls -lah"
# Create array from multi-line string
IFS=$'\r\n' GLOBIGNORE='*' args=($(parseargs "$MY_ARGS"))
# Show each of the arguments array
for arg in "${args[@]}"; do
  echo "$arg"
done
Example Output
foo
bar baz
qux
*
Parse Argument Function
This literally goes character-by-character and either appends the character to the current string or appends the finished string to the array.
set -u
set -e
# ParseArgs will parse a string that contains quoted strings the same as bash does
# (same as most other *nix shells do). This is secure in the sense that it doesn't do any
# executing or interpreting. However, it also doesn't do any escaping, so you shouldn't pass
# these strings to shells without escaping them.
parseargs() {
  # Empty sentinel meaning "not inside a quote". A non-empty value such as "-"
  # would make the closing-quote check below swallow that character.
  notquote=""
  str=$1
  declare -a args=()
  s=""
  # Strip leading space, then trailing space, then end with space.
  str="${str## }"
  str="${str%% }"
  str+=" "
  last_quote="${notquote}"
  is_space=""
  n=$(( ${#str} - 1 ))
  for ((i=0;i<=$n;i+=1)); do
    c="${str:$i:1}"
    # If we're ending a quote, break out and skip this character
    if [ "$c" == "$last_quote" ]; then
      last_quote=$notquote
      continue
    fi
    # If we're in a quote, count this character
    if [ "$last_quote" != "$notquote" ]; then
      s+=$c
      continue
    fi
    # If we encounter a quote, enter it and skip this character
    if [ "$c" == "'" ] || [ "$c" == '"' ]; then
      is_space=""
      last_quote=$c
      continue
    fi
    # If it's a space, store the string
    re="[[:space:]]+" # must be used as a var, not a literal
    if [[ $c =~ $re ]]; then
      # Skip a leading space and any run of consecutive spaces
      if [ "0" == "$i" ] || [ -n "$is_space" ]; then
        continue
      fi
      is_space="true"
      args+=("$s")
      s=""
      continue
    fi
    is_space=""
    s+="$c"
  done
  if [ "$last_quote" != "$notquote" ]; then
    >&2 echo "error: quote not terminated"
    return 1
  fi
  # Print one parsed argument per line
  for arg in "${args[@]}"; do
    echo "$arg"
  done
  return 0
}
I may or may not keep this updated at:
https://git.coolaj86.com/coolaj86/git-scripts/src/branch/master/git-proxy
Seems like a rather stupid thing to do... but I had the itch... oh well.

This might not be the most robust approach, but it is simple, and seems to work for your case:
## demonstration matching the question
$ ( ARGS='"hi there" test' ; ./swap ${ARGS} )
there" "hi
## simple solution, using 'xargs'
$ ( ARGS='"hi there" test' ; echo "${ARGS}" | xargs ./swap )
test hi there
