POSIX sh equivalent for Bash’s printf %q - bash

Suppose I have a #!/bin/sh script which can take a variety of positional parameters, some of which may include spaces, either/both kinds of quotes, etc. I want to iterate over "$@" and for each argument either process it immediately somehow, or save it for later. At the end of the script I want to launch (perhaps exec) another process, passing in some of these parameters with all special characters intact.
If I were doing no processing on the parameters, othercmd "$@" would work fine, but I need to pull out some parameters and process them a bit.
If I could assume Bash, then I could use printf %q to compute quoted versions of args that I could eval later, but this would not work on e.g. Ubuntu's Dash (/bin/sh).
Is there any equivalent to printf %q that can be written in a plain Bourne shell script, using only built-ins and POSIX-defined utilities, say as a function I could copy into a script?
For example, a script trying to ls its arguments in reverse order:
#!/bin/sh
args=
for arg in "$@"
do
    args="'$arg' $args"
done
eval "ls $args"
works for many cases:
$ ./handle goodbye "cruel world"
ls: cannot access cruel world: No such file or directory
ls: cannot access goodbye: No such file or directory
but not when ' is used:
$ ./handle goodbye "cruel'st world"
./handle: 1: eval: Syntax error: Unterminated quoted string
and the following works fine but relies on Bash:
#!/bin/bash
args=
for arg in "$@"
do
    printf -v argq '%q' "$arg"
    args="$argq $args"
done
eval "ls $args"

This is absolutely doable.
The answer you see by Jesse Glick is approximately there, but it has a couple of bugs, and I have a few more alternatives for your consideration, since this is a problem I ran into more than once.
First, and you might already know this, echo is a bad idea; one should use printf instead if the goal is portability. "echo" has undefined behavior in POSIX if the argument it receives is "-n", and in practice some implementations of echo treat -n as a special option, while others just treat it as a normal argument to print. So that becomes this:
esceval()
{
    printf %s "$1" | sed "s/'/'\"'\"'/g"
}
Alternatively, instead of escaping embedded single quotes by making them into:
'"'"'
..instead you could turn them into:
'\''
..stylistic differences I guess (I imagine performance difference is negligible either way, though I've never tested). The resulting sed string looks like this:
esceval()
{
    printf %s "$1" | sed "s/'/'\\\\''/g"
}
(It's four backslashes because double quotes swallow two of them, leaving two, and then sed swallows one, leaving just the one. Personally, I find this way more readable so that's what I'll use in the rest of the examples that involve it, but both should be equivalent.)
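One way to check the backslash counting above is to print the sed script after the shell has processed the double quotes. This is only a sketch; the variable name sed_script is made up for illustration:

```shell
#!/bin/sh
# Hypothetical variable name; holds the sed script exactly as sed will see it.
sed_script="s/'/'\\\\''/g"

# The shell's double quotes have already turned four backslashes into two.
printf '%s\n' "$sed_script"

# sed then reduces the remaining pair to a single literal backslash.
printf '%s\n' "it's" | sed "$sed_script"
```

The first line printed should show two backslashes in the replacement, and the second should show the embedded quote rewritten as '\''.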
BUT, we still have a bug: command substitution will delete at least one (but in many shells ALL) of the trailing newlines from the command output (not all whitespace, just newlines specifically). So the above solution works unless you have newline(s) at the very end of an argument. Then you'll lose that/those newline(s). The fix is obviously simple: Add another character after the actual command value before outputting from your quote/esceval function. Incidentally, we already needed to do that anyway, because we needed to start and stop the escaped argument with single quotes. You have two alternatives:
esceval()
{
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1 s/^/'/; $ s/$/'/"
}
This will ensure the argument comes out already fully escaped, no need for adding more single quotes when building the final string. This is probably the closest thing you will get to a single, inline-able version. If you're okay with having a sed dependency, you can stop here.
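The trailing-newline problem, and the way the closing quote guards against it, can be checked with a small round trip. A sketch under the assumption that the result is later passed to eval; the test value and the variable names naive, quoted and restored are invented for the example:

```shell
#!/bin/sh
# Same function as above, repeated so the sketch is self-contained.
esceval()
{
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1 s/^/'/; $ s/$/'/"
}

nl='
'
original="it's two lines$nl"    # hypothetical test value ending in a newline

# Plain command substitution silently drops the trailing newline...
naive=$(printf %s "$original")

# ...while the quoted form keeps it inside the closing single quote.
quoted=$(esceval "$original")
eval "restored=$quoted"

[ "$naive" = "$original" ] || echo "naive copy lost the trailing newline"
[ "$restored" = "$original" ] && echo "round trip preserved the argument"
```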
If you're not okay with the sed dependency, but you're fine with assuming that your shell is actually POSIX-compliant (there are still some out there that are not, notably the /bin/sh on Solaris 10 and below, which won't be able to do this next variant - but almost all shells you need to care about will do this just fine):
esceval()
{
    printf \'
    unescaped=$1
    while :
    do
        case $unescaped in
        *\'*)
            printf %s "${unescaped%%\'*}""'\''"
            unescaped=${unescaped#*\'}
            ;;
        *)
            printf %s "$unescaped"
            break
        esac
    done
    printf \'
}
You might notice seemingly redundant quoting here:
printf %s "${unescaped%%\'*}""'\''"
..this could be replaced with:
printf %s "${unescaped%%\'*}'\''"
The only reason I do the former is that once upon a time there were Bourne shells which had bugs when substituting variables into quoted strings where the quote around the variable didn't exactly start and end where the variable substitution did. Hence it's a paranoid portability habit of mine. In practice, you can do the latter, and it won't be a problem.
If you don't want to clobber the variable unescaped in the rest of your shell environment, then you can wrap the entire contents of that function in a subshell, like so:
esceval()
{
    (
        printf \'
        unescaped=$1
        while :
        do
            case $unescaped in
            *\'*)
                printf %s "${unescaped%%\'*}""'\''"
                unescaped=${unescaped#*\'}
                ;;
            *)
                printf %s "$unescaped"
                break
            esac
        done
        printf \'
    )
}
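To see what the subshell buys you, call the function directly (not through command substitution, which would create a subshell of its own) and check that the caller's variable survives. A minimal sketch; the caller's value is hypothetical:

```shell
#!/bin/sh
# Subshell variant from above, repeated so the sketch is self-contained.
esceval()
{
    (
        printf \'
        unescaped=$1
        while :
        do
            case $unescaped in
            *\'*)
                printf %s "${unescaped%%\'*}""'\''"
                unescaped=${unescaped#*\'}
                ;;
            *)
                printf %s "$unescaped"
                break
            esac
        done
        printf \'
    )
}

unescaped="caller's own value"       # hypothetical variable in the calling script
esceval "don't panic" > /dev/null    # called directly, not via $(...)
printf '%s\n' "$unescaped"           # still the caller's value
```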
"But wait", you say: "What if I want to do this on MULTIPLE arguments in one command? And I want the output to still look kinda nice and legible for me as a user if I run it from the command line for whatever reason."
Never fear, I have you covered:
esceval()
{
    case $# in 0) return 0; esac
    while :
    do
        printf "'"
        printf %s "$1" | sed "s/'/'\\\\''/g"
        shift
        case $# in 0) break; esac
        printf "' "
    done
    printf "'\n"
}
..or the same thing, but with the shell-only version:
esceval()
{
    case $# in 0) return 0; esac
    (
        while :
        do
            printf "'"
            unescaped=$1
            while :
            do
                case $unescaped in
                *\'*)
                    printf %s "${unescaped%%\'*}""'\''"
                    unescaped=${unescaped#*\'}
                    ;;
                *)
                    printf %s "$unescaped"
                    break
                esac
            done
            shift
            case $# in 0) break; esac
            printf "' "
        done
        printf "'\n"
    )
}
In those last four, you could collapse some of the outer printf statements and roll their single quotes up into another printf - I kept them separate because I feel it makes the logic more clear when you can see the starting and ending single-quotes on separate print statements.
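As a usage sketch for the multi-argument form: quote the whole argument list into one string, then let eval rebuild it with set --. The sample arguments are invented; the function is the shell-only version from above, repeated so the block runs on its own:

```shell
#!/bin/sh
esceval()
{
    case $# in 0) return 0; esac
    (
        while :
        do
            printf "'"
            unescaped=$1
            while :
            do
                case $unescaped in
                *\'*)
                    printf %s "${unescaped%%\'*}""'\''"
                    unescaped=${unescaped#*\'}
                    ;;
                *)
                    printf %s "$unescaped"
                    break
                esac
            done
            shift
            case $# in 0) break; esac
            printf "' "
        done
        printf "'\n"
    )
}

# Hypothetical sample arguments: a space, an embedded quote, an empty string.
set -- "a b" "c'd" ""
quoted=$(esceval "$@")
printf '%s\n' "$quoted"

# Rebuild the positional parameters from the quoted string.
eval "set -- $quoted"
printf 'restored %s arguments\n' "$#"
```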
P.S. There's also this monstrosity I made, which is a polyfill which will select between the previous two versions depending on if your shell seems to be capable of supporting the necessary variable substitution syntax (it looks awful though, because the shell-only version has to be inside an eval-ed string to keep the incompatible shells from barfing when they see it): https://github.com/mentalisttraceur/esceval/blob/master/sh/esceval.sh

I think this is POSIX. It works by clearing $@ after expanding it for the for loop, but only once, so that we can iteratively build it back up (in reverse) using set.
flag=0
for i in "$@"; do
    [ "$flag" -eq 0 ] && shift $#
    set -- "$i" "$@"
    flag=1
done
echo "$@" # To see that "$@" has indeed been reversed
ls "$@"
I realize reversing the arguments was just an example, but you may be able to use this trick of set -- "$arg" "$@" or set -- "$@" "$arg" in other situations.
And yes, I realize I may have just reimplemented (poorly) ormaaj's Push.
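The same set -- idea can also drop or keep individual arguments without any quoting or eval, by rotating through the list once. A minimal sketch; the function name filter_args and the --verbose flag are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: removes --verbose from its arguments, keeps the rest intact.
filter_args()
{
    verbose=0
    count=$#
    while [ "$count" -gt 0 ]
    do
        arg=$1
        shift
        case $arg in
            --verbose) verbose=1 ;;          # consume the flag
            *) set -- "$@" "$arg" ;;         # re-append everything else
        esac
        count=$((count - 1))
    done
    if [ "$#" -gt 0 ]; then
        printf '%s\n' "$@"
    fi
}

filter_args goodbye --verbose "cruel'st world"
```

After one full rotation the remaining arguments are back in their original order, so they could be handed to another command as "$@".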

Push. See the readme for examples.

The following seems to work with everything I have thrown at it so far, including spaces, both kinds of quotes and a variety of other metacharacters, and embedded newlines:
#!/bin/sh
quote() {
    echo "$1" | sed "s/'/'\"'\"'/g"
}
args=
for arg in "$@"
do
    argq="'"`quote "$arg"`"'"
    args="$argq $args"
done
eval "ls $args"

If you're okay with calling out to an external executable (as in the sed solutions given in other answers), then you may as well call out to /usr/bin/printf. While it's true that the POSIX shell built-in printf doesn't support %q, the printf binary from Coreutils sure does (since release 8.25).
esceval() {
    /usr/bin/printf '%q ' "$@"
}

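We can use /usr/bin/printf when the version of GNU Coreutils is not less than 8.25:
#!/bin/sh
minversion="8.25"
gnuversion=$(ls --version | sed '1q' | awk 'NF{print $NF}')
printcmd="printf"
# Compare major.minor numerically; as plain strings, 8.3 would sort after 8.25.
if awk -v v="$gnuversion" -v m="$minversion" 'BEGIN { split(v, a, "."); split(m, b, "."); exit !(a[1] > b[1] || (a[1] == b[1] && a[2] >= b[2])) }'; then
    printcmd="/usr/bin/printf"
fi
params=$($printcmd "%q " "$@")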

Related

How to write the content of a unknown shell variable to stdout in a safe way

I only know what I have read so far, and I am confused about how to actually echo a variable as is.
echo "$var" might fail if var='-n'
printf '%s\n' "$var" might fail because of a shell not implementing printf
echo -- "$var" might fail because it is a GNU extension
So if I had to guess:
echo x"$var"|sed 's#^x##1' would be the only way, but I have never encountered that pattern. Why?
As a concrete question:
for source; do
target="$(echo "$source"|sed 's#[^a-z0-9]\+#.#')"
# do stuff with $source and $target
done
Does this work, or could someone "hack" / "break" my script by putting a file named '-n' somewhere, assuming my script is executed by some my_script * cron?
How do I write echo "$var" so it does not break?
Does this work, or could someone "hack" / "break" my script by putting
a file named '-n' somewhere?
There is nothing wrong with:
target="$(echo "$source"|sed 's#[^a-z0-9]\+#.#')"
What is happening:
"$(...)" is a command substitution which will substitute the results of the command within as the value -- in which case the result is assigned to target.
echo "$source"|sed 's#[^a-z0-9]\+#.#' simply pipes the output of echo (i.e. the contents of source) to sed, which replaces the first match of [^a-z0-9]\+ with a period (see footnote 1 on the \+). Note: the quotes ".." around $source ARE proper within the command substitution.
There is no inherent reason assigning -n to a variable will cause any mischief. What you do with the variable is another question, but suffice it to say it is hard to see any problem.
"POSIX-shell's out there not implementing printf" -- Huh? Any shell not implementing printf would be more an exception rather than the rule. See printf - The Open Group Library that is POSIX.
If you are attempting to printf output that begins with '-', simply precede the output with "--" to indicate End-of-Options before the string you want to print and things will go fine. With your example of "-n", printf is about the only way you will output a variable beginning with the single '-', for example:
$ t="-n"
$ printf -- "%s\n" "$t"
-n
(note: you don't have to include "--" in printf "%s\n" "$var"; the only time you must include it is when the format string itself begins with a dash, as in printf -- "-foo\n", or you will receive an "invalid option" error.)
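As a sketch of the printf approach applied to the loop in the question: the value is always passed as an argument to a fixed format, so it is never taken as an option. The function names put_line and make_target are invented here, and the sed expression spells "one or more" as [^a-z0-9][^a-z0-9]* on the assumption that the \+ in the question was meant as a repetition:

```shell
#!/bin/sh
# Hypothetical helper: prints its argument verbatim, followed by a newline.
put_line()
{
    printf '%s\n' "$1"
}

# Hypothetical helper: replaces the first run of characters outside a-z0-9 with a dot.
make_target()
{
    printf '%s\n' "$1" | sed 's#[^a-z0-9][^a-z0-9]*#.#'
}

put_line -n            # the value is printed, not treated as an option
put_line '-e a\tb'     # backslashes are not interpreted either
make_target -n
```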
For echo you can enable interpretation of backslash escapes with -e and include a backspace, e.g.
$ echo -e " \b$t"
-n
I think that has covered all issues. If not, let me know. Also, if you have any additional questions, drop a comment below or edit and add to your question.
footnotes:
1. + is not a special character in basic regular expressions, so a literal + does not need to be escaped there (GNU sed treats \+ as "one or more"). If there is any question, it is safer to put a literal + in a bracket expression of its own, e.g. [^a-z0-9][+].

Looping through arguments in Shell Scripting skipping the first argument

#!/bin/sh
BACKUPDIR=$1
for argnum in {2..$#};do
echo ${"$argnum"}
done
I have tried this but it gives me this error:
./backup.sh: 10: ./backup.sh: Bad substitution
Use the shift command to remove $1 from the argument list after you're done reading it (thus renumbering your old $2 to $1, your old $3 to $2, etc):
#!/bin/sh
backupdir=$1; shift
for arg; do
    echo "$arg"
done
To provide a literal (but not-particularly-good-practice) equivalent to the code in the question, indirect expansion (absent such security-impacting practices as eval) looks like the following:
#!/bin/bash
# ^^^^-- This IS NOT GUARANTEED TO WORK in /bin/sh
# not idiomatic, not portable to baseline POSIX shells; this is best avoided
backupdir=$1
for ((argnum=2; argnum<=$#; ++argnum)); do
    echo "${!argnum}"
done

Capturing verbatim command line (including quotes!) to call inside script

I'm trying to write a "phone home" script, which will log the exact command line (including any single or double quotes used) into a MySQL database. As a backend, I have a cgi script which wraps the database. The scripts themselves call curl on the cgi script and include as parameters various arguments, including the verbatim command line.
Obviously I have quite a variety of quote escaping to do here and I'm already stuck at the bash stage. At the moment, I can't even get bash to print verbatim the arguments provided:
Desired output:
$ ./caller.sh -f -hello -q "blah"
-f hello -q "blah"
Using echo:
caller.sh:
echo "$@"
gives:
$ ./caller.sh -f -hello -q "blah"
-f hello -q blah
(I also tried echo $@ and echo $*)
Using printf %q:
caller.sh:
printf %q $@
printf "\n"
gives:
$ ./caller.sh -f hello -q "blah"
-fhello-qblah
(I also tried print %q "$@")
I would welcome not only help to fix my bash problem, but any more general advice on implementing this "phone home" in a tidier way!
There is no possible way you can write caller.sh to distinguish between these two commands invoked on the shell:
./caller.sh -f -hello -q "blah"
./caller.sh -f -hello -q blah
They are exactly equivalent.
If you want to make sure the command receives special characters, surround the argument with single quotes:
./caller.sh -f -hello -q '"blah"'
Or if you want to pass just one argument to caller.sh:
./caller.sh '-f -hello -q "blah"'
You can get this info from the shell history:
function myhack {
line=$(history 1)
line=${line#* }
echo "You wrote: $line"
}
alias myhack='myhack #'
Which works as you describe:
$ myhack --args="stuff" * {1..10} $PATH
You wrote: myhack --args="stuff" * {1..10} $PATH
However, quoting is just the user's way of telling the shell how to construct the program's argument array. Asking to log how the user quotes their arguments is like asking to log how hard the user punched the keys and what they were wearing at the time.
To log a shell command line which unambiguously captures all of the arguments provided, you don't need any interactive shell hacks:
#!/bin/bash
line=$(printf "%q " "$@")
echo "What you wrote would have been indistinguishable from: $line"
I understand you want to capture the arguments given by the caller.
Firstly, the quotes used by the caller protect the arguments while the shell interprets the call, but they do not exist in the arguments themselves.
An example: if someone calls your script with the single argument "Hello  World!" (with two spaces between Hello and World), then you ALWAYS have to quote $1 in your script so as not to lose this information.
If you want to log all arguments correctly escaped (in the case where they contain, for example, consecutive spaces...) you HAVE to use "$@" with double quotes. "$@" is equivalent to "$1" "$2" "$3" "$4" etc.
So, to log arguments, I suggest the following at the start of the caller:
i=1
for arg in "$@"; do
    echo "arg$i=$arg"
    i=$((i + 1))
done
## Example of calls to the previous script
#caller.sh '1' "2" 3 "4 4" "5 5"
#arg1=1
#arg2=2
#arg3=3
#arg4=4 4
#arg5=5 5
@Flimm is correct: there is no way to distinguish between the arguments "foo" and foo, simply because the quotes are removed by the shell before the program receives them. What you need is "$@" (with the quotes).
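A small sketch that makes both points visible: the caller's quotes never reach the script, while "$@" keeps each argument's boundaries and inner spacing. The function name log_args and the bracket format are just for illustration:

```shell
#!/bin/sh
# Hypothetical helper: prints the argument count and each argument in brackets.
log_args()
{
    printf 'argc=%s\n' "$#"
    for arg in "$@"
    do
        printf '<%s>\n' "$arg"
    done
}

log_args -q "blah"         # same output as the next call
log_args -q blah
log_args "Hello  World!"   # one argument, both spaces kept
```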

countdown in shell without seq or jot

I'm trying to make a countdown timer script that takes a number of seconds as $1, then counts down to zero, showing the current remaining seconds as it goes.
The catch is, I'm doing this on an embedded box that doesn't have seq or jot, which are the two tools I know can generate my list of numbers.
Here's the script as I have it working on a normal (non-embedded) system:
#!/bin/sh
for i in $(/usr/bin/jot ${1:-10} ${1:-10} 1); do
printf "\r%s " "$i"
sleep 1
done
echo ""
This works in FreeBSD. If I'm on a Linux box, I can replace the for line with:
for i in $(/usr/bin/seq ${1:-10} -1 1); do
for the same effect.
But what do I do if I have no jot OR seq?
The problem with "vanilla" Bourne shell is that there's no such thing; the behavior depends on the particular implementation. Most modern /bin/shes have POSIX features, but differ in the details. I have a habit of falling back to really ancient Bourne features when I go into "sh-compatibility" mode, which was helpful 30 years ago but is usually overboard on modern systems, even embedded ones. :)
Anyway, here's a generic countdown loop that works even in very old shells, yet still works fine in modern bash/dash/ksh/zsh/etc. It does require the expr command, which is a pretty safe requirement.
i=10
while [ $i -gt 0 ]; do
...
i=`expr $i - 1`
done
So if your embedded system has printf, here's your script, with the same "default to 10 if no argument specified" behavior:
#!/bin/sh
n=$1
i=${n-10}
while [ $i -gt 0 ]; do
printf '\r%s ' "$i"
i=`expr $i - 1`
sleep 1
done
(The first two lines can probably be replaced with just i=${1-10}, but some - again, ancient - shells didn't like applying special parameter expansions to numeric parameters.)
If the system doesn't have printf, then it becomes problematic; with only the shell's built-in echo (or no builtin and some randomly selected implementation of /bin/echo), there's no guaranteed way to do either of those things (echo a special character or prevent the newline). You might be able to use \r or \015 to get the carriage return. You might need -e to get the shell to recognize those (but that might just wind up echoing a literal -e). Putting a literal carriage return inside the script will probably work but makes maintaining the script a pain.
Similarly, -n might squash the newline, but might just echo a literal -n. The earliest way to squash newlines with echo used the special sequence \c where the newline would naturally go; this still works with some versions of /bin/echo (including the one on Mac OS X) or in conjunction with bash's builtin echo's -e option.
So what seems to be the simplest part of the script might be the part that makes you reach for awk.
If your shell was bash, you could count down from a fixed number with something like this:
#!/bin/bash
for n in {10..1}; do
printf "\r%s " $n
sleep 1
done
But that won't work for you, because bash won't handle things like {${1:-10}..1}, and you've specified you want a command line option.
Of course, you've also said you're not using bash, so we'll assume a simpler shell.
If you have awk, you can use that to count.
#!/bin/sh
for n in $(awk -v m="${1:-10}" 'BEGIN{for(;m;m--){print m}}'); do
printf "\r%s " $n
sleep 1
done
printf "\r \r" # clean up
If you don't have awk, you should be able to do it in pure shell:
#!/bin/sh
n=${1:-10}
while [ $n -gt 0 ]; do
printf "\r%s " $n
sleep 1
n=$((n-1))
done
printf "\r \r" # clean up
I think the pure-shell version is probably simple enough that it should be preferred over the awk solution.
Of course, as Mark Reed pointed out in comments, if your system doesn't include a printf, then you'll need to perform some ugly echo magic that will depend on your OS or shell... and if your shell doesn't support $((..)), you can replace that line with n=$(expr $n - 1). If you want to add error handling in case a non-numeric $1 is provided, that wouldn't hurt.
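The error handling mentioned above could look something like this. It is only a sketch: the function names is_count and countdown, the usage message and the return code 2 are all assumptions, not part of the original script:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for a non-empty string of digits.
is_count()
{
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical wrapper around the pure-shell loop; defaults to 10 seconds.
countdown()
{
    n=${1:-10}
    if ! is_count "$n"; then
        echo "usage: countdown [seconds]" >&2
        return 2
    fi
    while [ "$n" -gt 0 ]; do
        printf '\r%s ' "$n"
        sleep 1
        n=$((n - 1))
    done
    printf '\r \r' # clean up
}
```

Calling countdown 3 counts down as before, while countdown abc prints the usage line and returns without looping.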
If you're using Bash, this is what you want:
for i in {1..10}
Or no bash? How about the stuff from this post? Bourne Shell For i in (seq)
try
for i in 1 10 15 20
do
echo "do something with $i"
done
Otherwise, if you have a recent Solaris, bash 3 at least is available. For example,
this gives the ranges 1 to 10 and 15 to 20:
for i in {1..10} {15..20}
do
echo "$i"
done
OR use tool like nawk
for i in `nawk 'BEGIN{ for(i=1;i<=10;i++) print i}'`
do
echo $i
done
OR even the while loop
s=0; while [ "$s" -lt 10 ]; do s=`echo $s+1|bc`; echo $s; done

Bash convention for if ; then

From this web page :
http://tldp.org/LDP/abs/html/abs-guide.html
It mentions the if [ ... ]; then convention, which supposedly needs a space after the semicolon:
;
Command separator [semicolon]. Permits putting two or more commands on the same line.
echo hello; echo there
if [ -x "$filename" ]; then # Note the space after the semicolon.
#+ ^^
echo "File $filename exists."; cp $filename $filename.bak
else # ^^
echo "File $filename not found."; touch $filename
fi; echo "File test complete."
Note that the ";" sometimes needs to be escaped.
Does anyone know where this comes from, and whether it is needed at all by certain shells?
This has become the style in the last few years:
if [ -x "$filename" ]; then
echo "hi"
fi
However, back when dinosaurs like Burroughs and Sperry Rand ruled the earth, I learned to write if statements like this:
if [ -x "$filename" ]
then
echo "hi"
fi
Then, you don't even need a semicolon.
The new style with then on the same line as the if started in order to emulate the way C and other programming languages did their if statements:
if (! strcmp("foo", "bar")) {
    printf("Strings equal\n");
}
These programming languages put the opening curly brace on the same line as the if.
The semicolon ; is an operator (not a keyword, like braces { } or a bang !) in the shell, so it doesn't need to be delimited with white space to be recognized in any POSIX-compliant shell.
However, doing so improves readability (for my taste).
Semicolon needs to be escaped if you mean a symbol "semicolon", not an operator.
The space after the semicolon is not required by the syntax for any shell I know of, but it's good style and makes the code easier to read.
I suppose the "sometimes needs to be escaped" wording refers to cases like echo foo\;bar, where you don't want the semicolon to be interpreted as a separator by the shell.
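Both cases can be tried side by side. A minimal sketch; the function names literal_semicolon and compact_if are invented for the example:

```shell
#!/bin/sh
# Escaped: the semicolon is part of the single argument foo;bar.
literal_semicolon()
{
    printf '%s\n' foo\;bar
}

# Unescaped: the semicolons act as command separators, with no spaces around them.
compact_if()
{
    if true;then printf '%s\n' hi;else printf '%s\n' bye;fi
}

literal_semicolon
compact_if
```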
I do not believe that the space should be necessary there. There's nothing about requiring spaces in the POSIX sh spec.
Empirically, the following works fine in both bash 4.1.5(1) and dash:
$ if true;then echo hi;else echo bye;fi
hi
$
I've never come across a shell that required a space in that context.
Just to make sure, I've asked on c.u.s., you can read the replies here.
