Why does printf behave differently when called from a Makefile? - shell

The printf program can be used to print binary data, e.g.:
$ printf '%b' '\xff\xff'
��
If I put this in a Makefile on its own, it works the same:
all:
	printf '%b' '\xff\xff'
$ make
printf '%b' '\xff\xff'
��
However, if I want to do anything else in the same shell invocation in the Makefile, for example redirecting the output to a file or printing something else afterwards, the output changes even though the command echoed by Make does not (suggesting it's not an escaping issue). It becomes a backslash followed by an "x" followed by "ff", twice:
all:
	printf '%b' '\xff\xff'; printf 'wtf?\n'
$ make
printf '%b' '\xff\xff'; printf 'wtf?\n'
\xff\xffwtf?
What is going on here? Why do the two printfs in one line behave differently than a single printf?

@chepner is on the right track in their comment, but the details are not quite right:
This is wild speculation, but I suspect there is some sort of
optimization being applied by make that causes the first example, as a
simple command, to be executing a third option, the actual binary
printf (found in /usr/bin, perhaps), rather than a shell. In your
second example, the ; forces make to use a shell to execute the shell
command line.
Make always uses /bin/sh as its shell, regardless of what the user is using as their shell. On some systems, /bin/sh is bash (which has a builtin printf) and on some systems /bin/sh is something different (typically dash, a lightweight, POSIX-conforming shell), which has its own built-in printf that behaves differently.
On your system, /bin/sh is evidently not bash, because bash's built-in printf does expand \xff under %b. When you have a "simple command" that doesn't require a shell (that is, make itself has enough trivial quoting smarts to understand your command), then to be more efficient make will invoke that command directly rather than running the shell.
That's what's happening here: when you run the simple command (no ;) make will invoke the command directly and run /usr/bin/printf. When you run the more complex command (including a ;) make will give up running the command directly and invoke your shell, which is not bash and whose built-in printf does not interpret \x escapes.
Basically, your script is not POSIX-conforming (POSIX specifies %b, but it defines only octal escapes, not \x hexadecimal ones) and so what it does is not well-defined. If you want the SAME behavior always, you should use /usr/bin/printf to force that one to be used every time. Forcing make to always run a shell and never use its fast path is much trickier; you'll need to include a special character like a trailing ; in each command.
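Another way to sidestep the difference is to avoid \x altogether. The following is a minimal sketch, not taken from the answer above: it relies on the octal escapes that POSIX does define, and the function names emit_bytes_format and emit_bytes_b are made up for illustration.

```shell
#!/bin/sh
# Octal escapes in the format string (\ddd) are defined by POSIX.
emit_bytes_format() { printf '\377\377'; }

# Inside a %b argument, POSIX spells octal escapes with a leading zero (\0ddd).
emit_bytes_b() { printf '%b' '\0377\0377'; }

# Dump both results as hex bytes to confirm they are the same two 0xff bytes.
emit_bytes_format | od -An -tx1
emit_bytes_b | od -An -tx1
```

Because both forms stay within what POSIX specifies, they should give the same bytes whether make runs /usr/bin/printf directly or hands the line to /bin/sh.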

Related

Converting a BASH script to run on SH (via BusyBox)

I have an Asus router running a recent version of FreshTomato - that comes with BusyBox.
I need to run a script that was made with BASH in mind - it is an adaptation of this script - but it fails to run with this error: line 41: syntax error: bad substitution
Checking the script with shellcheck.net yields these errors:
Line 41:
for optionvarname in ${!foreign_option_*} ; do
^-- SC3053: In POSIX sh, indirect expansion is undefined.
^-- SC3056: In POSIX sh, name matching prefixes are undefined.
Line 42:
option="${!optionvarname}"
^-- SC3053: In POSIX sh, indirect expansion is undefined.
These are the lines that are causing problems:
for optionvarname in ${!foreign_option_*} ; do # line 41
option="${!optionvarname}" # line 42
# do some stuff with $option...
done
If my understanding is correct, the original script simply does something with all variables that have a name starting with foreign_option_
However, as far as I could determine, both ${!foreign_option_*} and ${!optionvarname} constructs are BASH-specific and not POSIX compliant, so there is no direct "bash to sh" code conversion possible.
I have tried to create a /bin/bash symlink that points to busybox, but I got the Read-only file system error.
So, how can I get this script to run on my router? I see only two options, but I can't figure out how to implement either:
Make BusyBox interpret the script as BASH instead of SH - can I use a specific shebang for this?
Seems like the fastest option, but only if BusyBox has a "complete" implementation of BASH.
Alter the script code to not use BASH specifics.
This is safer, but since there is no "collect all variables starting with X" for SH, how can I do it?
how can I get this script to run on my router?
That's easy, either:
install bash on your router or
port the script to busybox/posix compatible shell.
Make BusyBox interpret the script as BASH instead of SH - can I use a specific shebang for this?
That doesn't make sense. Busybox comes with ash shell interpreter and bash is bash. Bash can interpret bash extensions, ash can't interpret them. You can't "make busybox interpret bash" - cars don't fly, planes are for that. If you want to make a car fly, you add wings to it and make it faster. The answer to Make BusyBox interpret the script as BASH instead of SH would be: patch busybox and implement all bash extensions in it.
Shebang is used to run a file under different interpreter. Using #!/bin/bash would invoke bash, which would be unrelated to anything busybox related and busybox wouldn't be involved in it.
how can I do it?
Decide on an unrealistically high maximum, iterate over variables named foreign_option_1 through foreign_option_<max>, and check whether each one is set; if it is not, skip it, otherwise continue with the script.
for i in $(seq 100); do
    optionvarname="foreign_option_${i}"
    # https://stackoverflow.com/questions/3601515/how-to-check-if-a-variable-is-set-in-bash
    # Skip this index if the variable is unset.
    if eval "[ -z \"\${${optionvarname}+x}\" ]"; then continue; fi
    # do some stuff with the variable...
done
With enough luck, you may be able to parse the output of set instead. The following breaks if any variable's value contains a newline followed by text that matches the regex:
for optionvarname in $(set | grep -o '^foreign_option_[0-9]\+=' | sed 's/=//'); do
Indirect expansion can be easily replaced by eval:
eval "option=\"\$${optionvarname}\""
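Putting the two pieces together, a POSIX-sh version of the original loop might look like the sketch below. The sample values and the function name list_foreign_options are assumptions made for illustration; on the router the variables would already be set by whatever calls the script.

```shell
#!/bin/sh
# Hypothetical sample data standing in for variables set by the caller.
foreign_option_1='dhcp-option DNS 10.0.0.1'
foreign_option_2='dhcp-option DOMAIN example.lan'

# Print the value of every set variable foreign_option_1 .. foreign_option_<max>.
list_foreign_options() {
    max=$1
    i=1
    while [ "$i" -le "$max" ]; do
        optionvarname="foreign_option_$i"
        # ${var+x} expands to "x" only when var is set, even if it is empty.
        if eval "[ -n \"\${${optionvarname}+x}\" ]"; then
            # Indirect expansion replaced by eval, as described above.
            eval "option=\"\$${optionvarname}\""
            printf '%s\n' "$option"
        fi
        i=$((i + 1))
    done
}

list_foreign_options 100
```

The eval is only safe here because optionvarname is built from a fixed prefix and a counter; it should not be fed names that come from untrusted input.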
If you really cannot install Bash on that router, here is one possible workaround, which seems to work for me in BusyBox on a Qnap NAS :
foreign_option_one=1
foreign_option_two=2
for x in one two; do
opt_var=foreign_option_${x}
eval "opt_value=\$$opt_var"
echo "$opt_var = $opt_value"
done
(But you will probably encounter more problems with moving a Bash script to busybox, so you might want to first consider alternatives like replacing the router)

prevent script injection when spawning command line with input arguments from external source

I've got a Python script that wraps a bash command-line tool, which gets its variables from an external source (environment variables). Is there any way to perform some sort of escaping to prevent a malicious user from executing bad code through one of those parameters?
for example if the script looks like this
/bin/sh
/usr/bin/tool ${VAR1} ${VAR2}
and someone set VAR2 as follows
export VAR2=123 && \rm -rf /
so it may not treat VAR2 as pure input, and perform the rm command.
Is there any way to make the variable non-executable and take the string as-is to the command line tool as input ?
The correct and safe way to pass the values of variables VAR1 and VAR2 as arguments to /usr/bin/tool is:
/usr/bin/tool -- "$VAR1" "$VAR2"
The quotes prevent any special treatment of separator or pattern matching characters in the strings.
The -- should prevent the variable values being treated as options if they begin with - characters. You might have to do something else if tool is badly written and doesn't accept -- to terminate command line options.
See Quotes - Greg's Wiki for excellent information about quoting in shell programming.
Shellcheck can detect many cases where quotes are missing. It's available as either an online tool or an installable program. Always use it if you want to eliminate many common bugs from your shell code.
The curly braces in the line of code in the question are completely redundant, as they usually are. Some people mistakenly think that they act as quotes. To understand their use, see When do we need curly braces around shell variables?.
I'm guessing that the /bin/sh in the question was intended to be a #! /bin/sh shebang. Since the question was tagged bash, note that #! /bin/sh should not be used with code that includes Bashisms. /bin/sh may not be Bash, and even if it is Bash it behaves differently when invoked as /bin/sh rather than /bin/bash.
Note that even if you forget the quotes the line of code in the question will not cause commands (like rm -rf /) embedded in the variable values to be run at that point. The danger is that badly-written code that uses the variables will create and run commands that include the variable values in unsafe ways. See should I avoid bash -c, sh -c, and other shells' equivalents in my shell scripts? for an explanation of (only) some of the dangers.
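The effect of the quotes and the -- can be seen with a small sketch. Here show_args is a hypothetical stand-in for /usr/bin/tool that just prints each argument it receives, and the variable values are invented for the demonstration.

```shell
#!/bin/sh
# show_args: hypothetical stand-in for the real tool; prints one argument per line.
show_args() {
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

VAR1='--verbose'
VAR2='123 && echo injected'

# Quoted: each variable arrives as exactly one argument and nothing in it is executed.
show_args -- "$VAR1" "$VAR2"
```

The second variable reaches the tool as the single literal string `123 && echo injected`; the `&&` is never seen by the shell as an operator.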
To best avoid injections, consider switching to [T]csh.
Unlike Bourne shells, the C shell is "limited", which pushes one toward different, safer ways of writing scripts. The "limitations" imposed by the C shell make it one of the most reliable shells to work with.
(E.g., nesting is minimal to impossible, thus preventing injections at all costs; there are better ways to achieve what one wants.)

Space between # and ! in shebang (# !/usr/bin/ksh)

I am writing a Korn shell script that involves process substitution using < <(), like this:
array=()
while IFS= read -r -d '' x;do
array+=( "$x" )
done < <(some command)
This is trying to insert into array all the strings returned by some command. The curious thing is that this works when my shebang looks like this:
# !/usr/bin/ksh
which is of course unusual (notice the space between # and !). On the other hand, when my shebang looks like #!/usr/bin/ksh (the right way, apparently), this script fails with the error syntax error: '< ' unexpected. Why is this? What difference does having a space in the shebang make? Google gave me several answers saying that a space between #! and /usr... is okay, but nothing regarding a space between # and !.
# ! is an invalid shebang, and entirely ignored. Behavior of a script with no shebang depends on how you invoke it.
If invoked from a shell: Some shells use /bin/sh to run such scripts; others use themselves for the purpose. Presumably the shell you're interactively using when testing this (and finding the script to work only with an invalid shebang) is in the latter set, so your script is actually being run with bash, or otherwise your active interactive shell at the time.
If invoked without a shell: Most operating systems will refuse to execute such a binary.
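As a rough illustration of why the space matters: on Linux and most Unix-like systems a file is treated as an interpreter script only when its first two bytes are exactly #!. The helper has_valid_shebang below is a hypothetical name used for this sketch, not a standard tool.

```shell
#!/bin/sh
# has_valid_shebang FILE: succeed only if FILE starts with the two bytes "#!".
# (Hypothetical helper for illustration.)
has_valid_shebang() {
    first_line=''
    # read fails on a file with no trailing newline, but still fills first_line.
    IFS= read -r first_line < "$1" || :
    case $first_line in
        '#!'*) return 0 ;;
        *)     return 1 ;;
    esac
}

tmp=$(mktemp) || exit 1
printf '%s\n' '#!/usr/bin/ksh' > "$tmp"
has_valid_shebang "$tmp" && echo 'valid shebang'
printf '%s\n' '# !/usr/bin/ksh' > "$tmp"
has_valid_shebang "$tmp" || echo 'ordinary comment, no shebang'
rm -f "$tmp"
```

To see which shell actually ends up running a shebang-less script, printing something like `ps -p $$ -o comm=` from inside it is one option, though the exact output depends on the system.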
Real David Korn ksh93 supports process substitution correctly, but some 3rd-party clones and ancient ksh implementations don't.
If you're going to use ksh, using genuine David Korn ksh93 (not mksh, pdksh, or another 3rd-party clone) is strongly preferred, and (to your immediate point) will ensure process substitution support.

how does this escaping work?

Here is what it finally took to get my code in my makefile to work
Line 5 is the question area
BASE=50
INCREMENT=1
FORMATTED_NUMBER=${BASE}+${INCREMENT}
all:
	echo $$((${FORMATTED_NUMBER}))
Why do I have to add two $ and two (( ))?
FORMATTED_NUMBER, if I echo it, looks like "50+1". What logic is make following to know that $$(("50+1")) is actually 51?
Sorry if this is a basic question; I'm new to make and don't fully understand it.
First, whenever asking questions please provide a complete example. You're missing the target and prerequisite here so this is not a valid makefile, and depending on where they are it could mean very different things. I'm assuming that your makefile is something like this:
BASE=50
INCREMENT=1
FORMATTED_NUMBER=${BASE}+${INCREMENT}
all:
	echo $$((${FORMATTED_NUMBER}))
Makefiles are interesting in that they're a combination of two different formats. The main format is makefile format (the first four lines above), but inside a make recipe line (that's the last line above, which is indented with a TAB character) is shell script format.
Make doesn't know anything about math. It doesn't interpret the + in the FORMATTED_NUMBER value. Make variables are all strings. If you want to do math, you have to do it in the shell, in a shell script, using the shell's math facilities.
In bash and other modern shells, the syntax $(( ...expression... )) will perform math. So in the shell if you type echo $((50+1)) (go ahead and try it yourself) it will print 51.
That's why you need the double parentheses ((...)): because that's what the shell wants and you're writing a shell script.
So why the double $? Because before make starts the shell to run your recipe, it first replaces all make variable references with their values. That's why the shell sees 50+1 here: before make started the shell it expanded ${FORMATTED_NUMBER} into its value, which is ${BASE}+${INCREMENT}, then it expanded those variables so it ends up with 50+1.
But what if you actually want to use a $ in your shell script (as you do here)? Then you have to tell make to not treat the $ as introducing a make variable. You do this by doubling it, so if make sees $$ then it does not think that's a make variable, and sends a single $ to the shell.
So for the recipe line echo $$((${FORMATTED_NUMBER})) make actually invokes a shell script echo $((50+1)).
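The shell's half of that hand-off can be reproduced without make at all. This is only a sketch: the shell variable FORMATTED_NUMBER below is a stand-in for the already-expanded make variable, and recipe is a made-up name for the text make would pass to the shell.

```shell
#!/bin/sh
# Stand-in for the fully expanded make variable ${FORMATTED_NUMBER}.
FORMATTED_NUMBER='50+1'

# Recipe text after make's pass: $$ has become $, the variable has become 50+1.
recipe="echo \$(($FORMATTED_NUMBER))"

# Show the command text, then let a shell evaluate the arithmetic expansion.
printf 'shell receives: %s\n' "$recipe"
sh -c "$recipe"
```

Make only performs the text substitution; the arithmetic happens entirely in the shell that runs the resulting line.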
You can use this in BASH:
FORMATTED_NUMBER=$((BASE+INCREMENT))
If using a non-BASH shell, use:
FORMATTED_NUMBER=`echo "$BASE + $INCREMENT" | bc`

Why the sh script cannot work

I wrote a sh script (test.sh) like this:
#!/bin/sh
echo $@
and then run it like this:
#./test.sh '["hello"]'
but the output is:
"
In fact I need
["hello"]
The bash version is:
#bash --version
GNU bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu)
And if I run
# echo '["hello"]'
["hello"]
I don't know why the script cannot work...
You probably mean "$@". Without the quotes, the expanded arguments go through word splitting and filename (glob) expansion, and ["hello"] is a valid glob pattern, so the quotes do matter here. It's also worth making sure that the script is executable (chmod +x test.sh).
EDIT: Since you asked,
Bash has various levels of string expansion/manipulation/whatever. Variable expansion, such as $@, is one of them. Read man bash for more, but from what I can tell, Bash treats each command as one long string until the final stage of "word splitting" and "quote removal" where it's broken up into arguments.
This means, with your original definition, that ./test.sh "a b c d" is equivalent to echo "a" "b" "c" "d", not echo "a b c d". I'm not sure what other stages of expansion happen.
The whole word-splitting thing is common in UNIXland (and pretty much any command-line-backed build system), where you might define something like CFLAGS="-Os -std=c99 -Wall"; it's useful that $CFLAGS expands to "-Os" "-std=c99" "-Wall".
In other "$scripting" languages (Perl, and possibly PHP), $foo really means $foo.
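The difference between the quoted and unquoted forms can be checked directly. The sketch below uses made-up function names, and the last part assumes (as a guess, since the question does not say so) that the directory the script ran in contained a file whose name is a single double-quote character; that would explain the lone " in the output.

```shell
#!/bin/sh
count_args() { echo "$#"; }

# Unquoted $@ is split on whitespace; "$@" keeps each argument intact.
unquoted_count() { count_args $@; }
quoted_count()   { count_args "$@"; }

unquoted_count 'a b c d'
quoted_count 'a b c d'

# Unquoted $@ is also subject to filename expansion: ["hello"] is a bracket
# pattern matching any one-character name among ", h, e, l and o.
demo_glob() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    : > '"'                  # create a file named with a single double-quote
    set -- '["hello"]'
    echo $@                  # pattern is replaced by the matching file name
    echo "$@"                # quoted: the argument is printed as typed
    cd / && rm -rf "$dir"
)
demo_glob
```

With the quotes in place, neither word splitting nor filename expansion is applied, so the script prints its argument unchanged regardless of what files exist.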
The manual test you show works because echo gets the argument ["hello"]. The outermost quotes are stripped by the shell. When you put this in a shell script, each shell strips one layer of quotes: the one you type at and the one interpreting the script. Adding an extra layer of quotes makes that work out right.
Just a guess, but maybe try changing the first line to
#!/bin/bash
so that it actually runs with bash?

Resources