Running IDL program from bash with variables - bash

I've written a program in IDL to generate scatter plots based on command line arguments. I can successfully call the program directly in the terminal like this:
idl -e "scatterplot_1_2d_file.pro" $infile $outfile $title $xtitle $ytitle $xmin $xmax $ymin $ymax $timescale
Here the $-prefixed names stand for string literals typed in by hand. The problem is, I thought I'd be able to put that very line, with variable names in place of the literals, into a bash script and generate a million scatter plots while I'm on break. Unfortunately, if I do it that way, I get the error:
idl: -e option cannot be specified with batch files
So my next attempt was to try writing those commands to an IDL batch file that I'd then run.
That attempt looks like this:
#!/bin/bash
indir=/path/to/indir/
outdir=/path/to/outdir/
files=`ls $indir`
batchfile=/path/to/tempbatchfile.pro
echo .r "/path/to/scatterplot_1_2d_file.pro" >> $batchfile
for file in $files
do
    name=${file%\.*}
    echo scatterplot_1_2d_file $indir$name.txt $outdir$name.jpg $name "Gauge Precipitation (mm)" "NMQ Precipitation (mm)" "*" "*" "*" "*" 2 >> $batchfile
done
echo exit >> $batchfile
idl <<EOF
#/path/to/scatterplot_1_2d_file
EOF
rm $batchfile
I don't know if the bulk of the errors that script generates is relevant, so I'll just post the start; I can post the rest later if you need it:
[foo]$ bash script_thing.sh
IDL Version 6.3 (linux x86 m32). (c) 2006, Research Systems, Inc.
Installation number: 91418.
Licensed for personal use by XXXXXXXXX only.
All other use is strictly prohibited.
PRO scatterplot_1_2d_file
^
% Programs can't be compiled from single statement mode.
At: /path/to/scatterplot_1_2d_file.pro, Line 1
% Attempt to subscript ARGS with <INT ( 1)> is out of range.
% Execution halted at: $MAIN$
% Attempt to subscript ARGS with <INT ( 2)> is out of range.
% Execution halted at: $MAIN$
% Attempt to subscript ARGS with <INT ( 3)> is out of range.
% Execution halted at: $MAIN$
% Attempt to subscript ARGS with <INT ( 4)> is out of range.
% Execution halted at: $MAIN$
% Attempt to subscript ARGS with <INT ( 5)> is out of range.
% Execution halted at: $MAIN$
% Attempt to subscript ARGS with <INT ( 6)> is out of range.
% Execution halted at: $MAIN$
% Attempt to subscript ARGS with <INT ( 7)> is out of range.
% Execution halted at: $MAIN$
% Attempt to subscript ARGS with <INT ( 8)> is out of range.
% Execution halted at: $MAIN$
% Attempt to subscript ARGS with <INT ( 9)> is out of range.
% Execution halted at: $MAIN$
I don't know if I'm just trying to do something that can't be done, but it doesn't SEEM like it. Any advice?

There are two ways to go about this: using COMMAND_LINE_ARGS or constructing a valid IDL routine call. This routine uses both:
pro test, other_args
  compile_opt strictarr

  args = command_line_args(count=nargs)
  help, nargs
  if (nargs gt 0L) then print, args

  help, other_args
  if (n_elements(other_args) gt 0L) then print, other_args
end
Call it from the command line in either of the following two ways:
Desktop$ idl -e "test" -args $MODE
IDL Version 8.2, Mac OS X (darwin x86_64 m64). (c) 2012, Exelis Visual Information Solutions, Inc.
Installation number: 216855.
Licensed for use by: Tech-X Corporation
% Compiled module: TEST.
NARGS LONG = 1
test
OTHER_ARGS UNDEFINED = <Undefined>
Desktop$ idl -e "test, '$MODE'"
IDL Version 8.2, Mac OS X (darwin x86_64 m64). (c) 2012, Exelis Visual Information Solutions, Inc.
Installation number: 216855.
Licensed for use by: Tech-X Corporation
% Compiled module: TEST.
NARGS LONG = 0
OTHER_ARGS STRING = 'test'
test
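Applying this to the original problem, here is a hedged sketch of what the plotting loop could look like with the -args style (the paths are placeholders, and it assumes scatterplot_1_2d_file reads its parameters via COMMAND_LINE_ARGS, as the ARGS subscript errors above suggest it already does):
for file in "$indir"/*.txt
do
    name=${file##*/}    # strip the directory
    name=${name%.txt}   # strip the extension
    idl -e "scatterplot_1_2d_file" -args "$file" "$outdir$name.jpg" "$name" \
        "Gauge Precipitation (mm)" "NMQ Precipitation (mm)" "*" "*" "*" "*" 2
done
Note this starts one IDL process per plot; for very large batches the batch-file approach below avoids the per-invocation startup cost.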

I don't know IDL, but here are fixes for your Bash script that are likely to help:
#!/bin/bash
indir=/path/to/indir/
outdir=/path/to/outdir/
# files=`ls $indir`   # no, just no; iterate the glob below instead
batchfile=/path/to/tempbatchfile.pro
echo ".r /path/to/scatterplot_1_2d_file.pro" > "$batchfile" # overwrite the file on the first write, put everything inside the quotes
for file in "$indir"/*
do
    name=${file##*/}   # strip the directory part, which the glob includes
    name=${name%.*}    # then strip the extension
    echo "scatterplot_1_2d_file $indir$name.txt $outdir$name.jpg $name Gauge Precipitation (mm) NMQ Precipitation (mm) * * * * 2" # quote the whole thing once, it's simpler and echo doesn't do anything differently
done >> "$batchfile" # do all the output from the loop in one redirection
echo exit >> "$batchfile"
# *** where does idl learn the location of "$batchfile"? ***
idl <<EOF
#/path/to/scatterplot_1_2d_file
EOF
rm "$batchfile"
To fix your command line version, use quoting:
idl -e "scatterplot_1_2d_file.pro" "$infile" "$outfile" "$title" "$xtitle" "$ytitle" "$xmin" "$xmax" "$ymin" "$ymax" "$timescale"
Always quote variables when they're expanded.
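A minimal illustration of why (hypothetical value):
title="Gauge Precipitation (mm)"
printf '<%s>\n' $title     # unquoted: word splitting yields three separate arguments
printf '<%s>\n' "$title"   # quoted: one argument, as intended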


Bash builtins also available as separate executables

I wanted to understand bash builtins. Hence the following questions:
When I think of the term builtin, I imagine that the bash executable has a function defined in its symbol table that other parts of the executable can call without having to fork. Is this what builtin means?
I also see that some builtins have a separate executable. For instance, type [ returns "[ is a shell builtin", but I also see an executable named /usr/bin/[. Is it correct to say that the same code is available through two routes: one inside the bash program and another as /usr/bin/[?
Loosely speaking, the program version of the built-ins is used when a shell interpreter is not available or not needed. Let's explain it in more detail...
When you run a shell script, the interpreter recognizes the built-ins and will not fork/exec but merely call the function corresponding to the built-in. Even if you call them from a C/C++ executable through system(), the latter launches a shell first and then makes the spawned shell run the built-in.
Here is an example program, which runs echo message thanks to system() library service:
#include <stdlib.h>

int main(void)
{
    system("echo message");
    return 0;
}
Compile it and run it:
$ gcc msg.c -o msg
$ ./msg
message
Running the latter under strace with the -f option shows the involved processes. The main program is executed:
$ strace -f ./msg
execve("./msg", ["./msg"], 0x7ffef5c99838 /* 58 vars */) = 0
Then, system() triggers a fork(), which is actually a clone() system call. The child process #5185 is launched:
clone(child_stack=0x7f7e6d6cbff0, flags=CLONE_VM|CLONE_VFORK|SIGCHLD
strace: Process 5185 attached
<unfinished ...>
The child process executes /bin/sh -c "echo message". The latter shell calls the echo built-in to display the message on the screen (write() system call):
[pid 5185] execve("/bin/sh", ["sh", "-c", "echo message"], 0x7ffdd0fafe28 /* 58 vars */ <unfinished ...>
[...]
[pid 5185] write(1, "message\n", 8message
) = 8
[...]
+++ exited with 0 +++
The program version of the built-ins is useful when you need them from a C/C++ executable without an intermediate shell, for the sake of performance; for instance, when you call them through the execv() function.
Here is an example program which does the same thing as the preceding example but with execv() instead of system():
#include <unistd.h>

int main(void)
{
    char *av[3];

    av[0] = "/bin/echo";
    av[1] = "message";
    av[2] = NULL;
    execv(av[0], av);
    return 0;
}
Compile and run it to see that we get the same result:
$ gcc msg1.c -o msg1
$ ./msg1
message
Let's run it under strace to get the details. The output is shorter because no sub-process is involved to execute an intermediate shell. The actual /bin/echo program is executed instead:
$ strace -f ./msg1
execve("./msg1", ["./msg1"], 0x7fffd5b22ec8 /* 58 vars */) = 0
[...]
execve("/bin/echo", ["/bin/echo", "message"], 0x7fff6562fbf8 /* 58 vars */) = 0
[...]
write(1, "message \1\n", 10message
) = 10
[...]
exit_group(0) = ?
+++ exited with 0 +++
Of course, if the program is supposed to do additional things, a simple call to execv() is not sufficient, as it replaces the calling program with /bin/echo. A more elaborate program would fork and execute the latter, still without the need to run an intermediate shell:
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    if (fork() == 0) {
        char *av[3];

        av[0] = "/bin/echo";
        av[1] = "message";
        av[2] = NULL;
        execv(av[0], av);
    }
    wait(NULL);
    // Do additional things before ending
    return 0;
}
Compile and run it under strace to see that the intermediate child process executes the /bin/echo program without the need for an intermediate shell:
$ gcc msg2.c -o msg2
$ ./msg2
message
$ strace -f ./msg2
execve("./msg2", ["./msg2"], 0x7ffc11a5e228 /* 58 vars */) = 0
[...]
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 5703 attached
, child_tidptr=0x7f8e0b6e0810) = 5703
[pid 5703] execve("/bin/echo", ["/bin/echo", "message"], 0x7ffe656a9d08 /* 58 vars */ <unfinished ...>
[...]
[pid 5703] write(1, "message\n", 8message
) = 8
[...]
[pid 5703] +++ exited with 0 +++
<... wait4 resumed>NULL, 0, NULL) = 5703
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5703, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
exit_group(0) = ?
+++ exited with 0 +++
the bash executable has a function defined in its symbol table
There are builtins that are compiled into the Bash executable itself, and you can also load builtins dynamically from a separate shared library at runtime.
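For example, a sketch of runtime loading with the enable builtin; the path is an assumption here (it is where Debian's bash-builtins package installs the example loadables):
enable -f /usr/lib/bash/head head   # load a 'head' builtin from a shared object
type head                           # now reports: head is a shell builtin
enable -d head                      # unload it again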
can access without actually having to fork
Yes.
Is it correct to say that the same code is available through two executables: one through bash executable and another through /usr/bin/[?
No, it is different source code: one is a Bash builtin and the other is a standalone program. There is also different behavior in grey areas:
$ printf "%q\n" '*'
\*
$ /bin/printf "%q\n" '*'
'*'
$ time echo 1
1
real 0m0.000s
user 0m0.000s
sys 0m0.000s
$ /bin/time echo 1
1
0.00user 0.00system 0:00.00elapsed 50%CPU (0avgtext+0avgdata 2392maxresident)k
64inputs+0outputs (1major+134minor)pagefaults 0swaps
$ [ -v a ]
$ /bin/[ -v a ]
/bin/[: ‘-v’: unary operator expected

Consistent syntax for obtaining output of a command efficiently in bash?

Bash has the command substitution syntax $(f), which allows capturing
the stdout of a command f. If the command is an executable, this is fine:
the creation of a new process is necessary anyway. But if the command is
a shell function, using this syntax creates an overhead of about 25 ms for
each subshell on my system. This is enough to add up to noticeable delays
when used in inner loops, especially in interactive contexts such as
command completions or $PS1.
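A rough way to observe that overhead on your own system (timings will vary):
f() { :; }
time for i in {1..1000}; do out=$(f); done   # one subshell per iteration
time for i in {1..1000}; do f; done          # no subshell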
A common optimization is to use global variables instead
[1] for returning values,
but it comes at a cost to readability: The intent becomes less clear, and
output capturing suddenly is inconsistent between shell functions and
executables. I am adding a comparison of options and their weaknesses below.
In order to get a consistent, reliable syntax, I was wondering if bash has
any feature that allows to capture shell-function and executable output
alike, while avoiding subshells for shell-functions.
Ideally, a solution would also contain a more efficient alternative to executing multiple commands in a subshell, which allows more cleanly isolating concerns, e.g.
person=$(
    db_handler=$(database_connect)   # avoids leaking the variable
    query $db_handler lastname       # outside its required scope
    echo ", "
    query $db_handler firstname
    database_close $db_handler
)
Such a construct allows the reader of the code to ignore everything inside $(), if the details of how $person is formatted aren't interesting to them.
Comparison of Options
1. With command substitution
person="$(get lastname), $(get firstname)"
Slow, but readable and consistent: It doesn't matter to the reader at first
glance whether get is a shell function or an executable.
2. With same global variable for all functions
get lastname
person="$R, "
get firstname
person+="$R"
Obscures what $person is supposed to contain. Alternatively,
get lastname
local lastname="$R"
get firstname
local firstname="$R"
person="$lastname, $firstname"
but that's very verbose.
3. With different global variable for each function
get_lastname
get_firstname
person="$lastname $firstname"
More readable assignment, but
If some function is invoked twice, we're back to (2).
The side-effect of setting the variable is not obvious.
It is easy to use the wrong variable by accident.
4. With global variable, whose name is passed as argument
get LN lastname
get FN firstname
person="$LN, $FN"
More readable, allows multiple return values easily.
Still inconsistent with capturing output from executables.
Note: Assignment to dynamic variable names should be done with declare
rather than eval:
$VARNAME="$LOCALVALUE"                      # doesn't work.
declare -g "$VARNAME=$LOCALVALUE"           # will work.
eval "$VARNAME='$LOCALVALUE'"               # doesn't work for *arbitrary* values.
eval "$VARNAME=$(printf %q "$LOCALVALUE")"  # works, but doesn't avoid a subshell after all.
[1] http://rus.har.mn/blog/2010-07-05/subshells/
If you want it to be efficient the shell functions can't return their result via stdout. If they did, there'd be no way to get it but by running the function in a subshell and capturing the output via an internal pipe, and these operations are kind of expensive (a few ms on a modern system).
When I was focusing on shell scripts and I needed to max their performance I used a convention where function foo would return its result via a variable foo. This you can do even in a POSIX shell and it has the nice property that it won't overwrite your locals because if foo is a function, you've already kind of reserved the name.
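A minimal sketch of that convention:
foo() { foo='result of foo'; }   # function foo returns via the variable foo
foo
echo "$foo"                      # prints: result of foo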
Then I had a bx_r getter function that runs a shell function and either saves its output into a variable whose name is given by the first argument, or writes the output to stdout if the first argument is a word that is not a legal variable name (without a trailing newline if that word is exactly the empty word, i.e., '').
I've modified it so it can be used uniformly with either commands or functions.
You can't use the type builtin to differentiate between the two here, because type returns its result via stdout: you'd need to capture that result, which would impose the forking penalty again.
So what I do when I'm about to run function foo is I check if there's a corresponding variable foo (this can catch a local variable but you'll avoid the chances of this if you limit yourself to properly namespaced shell function names). If there is, I assume that's where function foo returns its result, otherwise I run it in a $(), capturing its stdout.
Here's the code with some testing code:
bx_varlike_eh()
{
    case $1 in
        ([!A-Za-z_0-9]*) false;;
        (*) true;;
    esac
}
bx_r() #{{{ Varname=$1; shift; invoke "$@" and save its result to $Varname if a legal varname, else print it
{
    # `bx_r '' some_command` prints without a newline
    # `bx_r - some_command` (or any word starting with a non-variable character instead of -)
    # prints with a newline
    local bx_r__varname="$1"; shift 1
    local bx_r
    if ! bx_varlike_eh "$1" || eval "[ \"\${$1+set}\" != set ]"; then
        # https://unix.stackexchange.com/a/465715/23692
        bx_r=$( "$@" ) || return # $1 not varlike or unset => must be a regular command, so capture its stdout
    else
        # if $1 is the name of a set variable, assume $1 is a function that saves its output there
        "$@" || return
        eval "bx_r=\$$1" # put it in bx_r
    fi
    case "$bx_r__varname" in
        ('') printf '%s' "$bx_r";;
        ([!A-Za-z_0-9]*) printf '%s\n' "$bx_r";;
        (*) eval "$bx_r__varname=\$bx_r";;
    esac
} #}}}
# TEST
for sh in sh bash; do
    time $sh -c '
        . ./bx_r.sh
        bx_getnext=; bx_getnext() { bx_getnext=$((bx_getnext+1)); }
        bx_r - bx_getnext
        bx_r - bx_getnext
        i=0; while [ $i -lt 10000 ]; do
            bx_r ans bx_getnext
            i=$((i+1))
        done; echo ans=$ans
    '
    echo ====
    $sh -c '
        . ./bx_r.sh
        bx_r - date
        bx_r - /bin/date
        bx_r ans /bin/date
        echo ans=$ans
    '
    echo ====
    time $sh -c '
        . ./bx_r.sh
        bx_echoget() { echo 42; }
        i=0; while [ $i -lt 10000 ]; do
            ans=$(bx_echoget)
            i=$((i+1))
        done; echo ans=$ans
    '
done
exit
exit
#MY TEST OUTPUT
1
2
ans=10002
0.14user 0.00system 0:00.14elapsed 99%CPU (0avgtext+0avgdata 1644maxresident)k
0inputs+0outputs (0major+76minor)pagefaults 0swaps
====
Thu Sep 5 17:12:01 CEST 2019
Thu Sep 5 17:12:01 CEST 2019
ans=Thu Sep 5 17:12:01 CEST 2019
====
ans=42
1.95user 1.14system 0:02.81elapsed 110%CPU (0avgtext+0avgdata 1656maxresident)k
0inputs+1256outputs (0major+350075minor)pagefaults 0swaps
1
2
ans=10002
0.92user 0.03system 0:00.96elapsed 99%CPU (0avgtext+0avgdata 3284maxresident)k
0inputs+0outputs (0major+159minor)pagefaults 0swaps
====
Thu Sep 5 17:12:05 CEST 2019
Thu Sep 5 17:12:05 CEST 2019
ans=Thu Sep 5 17:12:05 CEST 2019
====
ans=42
5.20user 2.40system 0:06.96elapsed 109%CPU (0avgtext+0avgdata 3220maxresident)k
0inputs+1248outputs (0major+949297minor)pagefaults 0swaps
As you can see, you can get uniform call syntax with this, while speeding up
the execution of small shell functions by up to about 14 times due to eliminating the need for captures ($()).
Use a bash nameref.
With Bash 4.3 and newer you can use variable namerefs:
get() {
    declare -n _get__res="$1"   # nameref: _get__res now aliases the variable named in $1
    case "$2" in
        firstname) _get__res="Kamil" ;;
        lastname)  _get__res="Cuk" ;;
    esac
}
get LN lastname
get FN firstname
person="$LN, $FN"
Namerefs can still clash with variables from the outer scope. Use long names for the namerefs: here I used an underscore, the function name, two underscores, and then the variable name.
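For illustration, a hypothetical call that triggers the clash the long names guard against (the exact warning text may vary by bash version):
get _get__res lastname   # bash warns: _get__res: circular name reference
get LN lastname          # fine: LN does not collide with the nameref's own name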

kprobe_events fetch-args works for x86 but not arm64

I wanted to get the do_sys_open filename argument as a string. For this I added a kprobe following kprobetrace.txt. A simple probe which gives the filename as hex works for both x86 and arm64.
x86: echo 'p:myprobe do_sys_open filename_string=%si' > kprobe_events
arm64: echo 'p:myprobe do_sys_open filename_string=%x1' > kprobe_events
However, changing the probe to get a string for the filename works on x86 but not on arm64 (i.e. I cannot get the string representation; filename_string=(fault)).
x86:
echo 'p:myprobe do_sys_open filename_string=+0(%si):string' > kprobe_events
output:
adb-30551 [001] d... 4570187.407426: myprobe: (do_sys_open+0x0/0x270) filename_string="/dev/bus/usb/001/001"
arm64:
echo 'p:myprobe do_sys_open filename_string=+0(%x1):string' > kprobe_events
output:
netd-4621 [001] d... 8491.094187: myprobe: (do_sys_open+0x0/0x24c) filename_string=(fault)
To check whether I was using the arm ABI correctly, I tried setting the probe using perf.
The probe created by perf, as seen from /sys/kernel/debug/tracing/kprobe_events, was similar:
./perf4.14 probe 'do_sys_open filename:string'
/d/tracing # cat kprobe_events
p:kprobes/myprobe do_sys_open filename_string=+0(%x1):string
But the perf probe was also failing (i.e. filename_string="") in this case:
./perf4.14 record -e probe:do_sys_open -aR sleep 3
/data/local/tmp # ./perf4.14 script
perf4.14 4587 [007] 7490.809036: probe:do_sys_open: (ffffff8337060148) filename_string=""
sleep 4588 [003] 7490.817937: probe:do_sys_open: (ffffff8337060148) filename_string=""
What would be the correct way to set kprobe_events for arm64 to fetch args as strings?
Am I using the ABI incorrectly?
On kernel version >= 4.20, you can use $argN to fetch the Nth function argument. From kernel 4.20 kprobetrace.rst:
FETCHARGS : Arguments. Each probe can have up to 128 args.
.....
.....
$argN : Fetch the Nth function argument. (N >= 1) (\*1)
Since the filename is the second argument of do_sys_open(), you should give $arg2 in the kprobe event, like this:
echo 'p:myprobe do_sys_open filename_string=+0($arg2):string' > kprobe_events
This should work on both x86 and arm64.
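A hedged end-to-end sketch, assuming the standard tracefs layout under /sys/kernel/debug/tracing:
cd /sys/kernel/debug/tracing
echo 'p:myprobe do_sys_open filename_string=+0($arg2):string' > kprobe_events
echo 1 > events/kprobes/myprobe/enable
cat trace_pipe             # e.g.: myprobe: (do_sys_open+0x0/...) filename_string="/etc/hosts"
echo 0 > events/kprobes/myprobe/enable
echo > kprobe_events       # clear the probe when done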

How to create a command line pipe? (xcode mac os x)

Hello, I want to create a command line tool with Xcode (Mac OS X) that can be used in a pipe, after "|".
I know that by using xargs we can turn what arrives on stdin into arguments.
I would like to understand how to create a pipeable command line tool. Thank you for your answers.
For example, suppose we define the main function that receives the arguments to process. In a rudimentary way we would write (in C):
int main(int argc, char **argv)
{
    char buf[512] = "";
    char buf1[512] = "";
    int f;
The first element of the argument array always contains the name of the command line tool you are creating, e.g. argv[0] is "echo", and the argument following it is the one you wish to use, here argv[1], "the sun shines". Of course that only helps if the command is able to receive an argument after "|" (pipe), for example: echo "the sun is shining" | echo; but echo does not read from the pipe.
In our main function, we first check elementarily whether argv[1] contains an argument:
    if (argv[1] == NULL)
    {
Here we arbitrarily take the guess that argv[1] is empty. As a reminder, our command line tool sits after a pipe "|"; if argv[1] is empty, it needs to find an argument elsewhere in order to continue, and for this we accept the postulate that if the command line tool is placed after a "|" pipe, the output of the previous command is not empty, whence what follows:
        f = open("/dev/stdin", O_RDONLY);
        read(f, buf, sizeof(buf));
        memcpy(buf1, buf, sizeof(buf));
Now we have opened the stream stdin, which contains the output of the previous command. We use two buffers, buf[512] and buf1[512], because we cannot fill argv[1]. Now that we have our arguments in a buffer, we can carry on with the execution of the command as if it were argv[1].
In Objective-C, or simply in C, to give a command line tool the ability to be used in a pipeline after "|" following another command line, you must redirect stdin towards the input of the command line tool as if it were an argument (argv). From there, with a little C programming, you should be able to build a command line tool with Xcode that is usable after "|".
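Assuming the program above is built as a tool named mytool (a hypothetical name), both invocation styles then work:
./mytool "the sun is shining"          # the argument arrives in argv[1]
echo "the sun is shining" | ./mytool   # argv[1] is NULL, so the tool reads stdin instead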

Is the behavior behind the Shellshock vulnerability in Bash documented or at all intentional?

A recent vulnerability, CVE-2014-6271, in how Bash interprets environment variables was disclosed. The exploit relies on Bash parsing some environment variable declarations as function definitions, but then continuing to execute code following the definition:
$ x='() { echo i do nothing; }; echo vulnerable' bash -c ':'
vulnerable
But I don't get it. There's nothing I've been able to find in the Bash manual about interpreting environment variables as functions at all (except for inheriting functions, which is different). Indeed, a properly named function definition is just treated as a value:
$ x='y() { :; }' bash -c 'echo $x'
y() { :; }
But a corrupt one prints nothing:
$ x='() { :; }' bash -c 'echo $x'
$ # Nothing but newline
The corrupt function is unnamed, and so I can't just call it. Is this vulnerability a pure implementation bug, or is there an intended feature here, that I just can't see?
Update
Per Barmar's comment, I hypothesized the name of the function was the parameter name:
$ n='() { echo wat; }' bash -c 'n'
wat
Which I could swear I tried before, but I guess I didn't try hard enough. It's repeatable now. Here's a little more testing:
$ env n='() { echo wat; }; echo vuln' bash -c 'n'
vuln
wat
$ env n='() { echo wat; }; echo $1' bash -c 'n 2' 3 -- 4
wat
…so apparently the args are not set at the time the exploit executes.
Anyway, the basic answer to my question is, yes, this is how Bash implements inherited functions.
This seems like an implementation bug.
Apparently, the way exported functions work in bash is that they use specially-formatted environment variables. If you export a function:
f() { ... }
it defines an environment variable like:
f='() { ... }'
What's probably happening is that when the new shell sees an environment variable whose value begins with (), it prepends the variable name and executes the resulting string. The bug is that this includes executing anything after the function definition as well.
The fix described is apparently to parse the result to see if it's a valid function definition. If not, it prints the warning about the invalid function definition attempt.
This article confirms my explanation of the cause of the bug. It also goes into a little more detail about how the fix resolves it: not only do they parse the values more carefully, but variables that are used to pass exported functions follow a special naming convention. This naming convention is different from that used for the environment variables created for CGI scripts, so an HTTP client should never be able to get its foot into this door.
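On a bash that carries the full fix you can observe that convention directly. A quick check (the exact mangling shown, BASH_FUNC_name%%, is what upstream uses; some distribution patches differ slightly):
$ f() { echo hi; }
$ export -f f
$ env | grep -A1 BASH_FUNC
BASH_FUNC_f%%=() {  echo hi
}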
The following:
x='() { echo I do nothing; }; echo vulnerable' bash -c 'typeset -f'
prints
vulnerable
x ()
{
echo I do nothing
}
declare -fx x
It seems that Bash, after having parsed the x=..., recognized it as a function, exported it (note the declare -fx x), and allowed the execution of the command after the declaration:
echo vulnerable
x='() { x; }; echo vulnerable' bash -c 'typeset -f'
prints:
vulnerable
x ()
{
x
}
and running x:
x='() { x; }; echo Vulnerable' bash -c 'x'
prints
Vulnerable
Segmentation fault: 11
It segfaults due to infinite recursive calls.
It doesn't override an already defined function:
$ x() { echo Something; }
$ declare -fx x
$ x='() { x; }; echo Vulnerable' bash -c 'typeset -f'
prints:
x ()
{
echo Something
}
declare -fx x
i.e. x remains the previously (correctly) defined function.
For the Bash 4.3.25(1)-release the vulnerability is closed, so
x='() { echo I do nothing; }; echo Vulnerable' bash -c ':'
prints
bash: warning: x: ignoring function definition attempt
bash: error importing function definition for `x'
but, what is strange (at least for me),
x='() { x; };' bash -c 'typeset -f'
STILL PRINTS
x ()
{
x
}
declare -fx x
and the
x='() { x; };' bash -c 'x'
segmentation faults too, so it STILL accepts the strange function definition...
I think it's worth looking at the Bash code itself. The patch gives a bit of insight as to the problem. In particular,
*** ../bash-4.3-patched/variables.c 2014-05-15 08:26:50.000000000 -0400
--- variables.c 2014-09-14 14:23:35.000000000 -0400
***************
*** 359,369 ****
strcpy (temp_string + char_index + 1, string);
! if (posixly_correct == 0 || legal_identifier (name))
! parse_and_execute (temp_string, name, SEVAL_NONINT|SEVAL_NOHIST);
!
! /* Ancient backwards compatibility. Old versions of bash exported
! functions like name()=() {...} */
! if (name[char_index - 1] == ')' && name[char_index - 2] == '(')
! name[char_index - 2] = '\0';
if (temp_var = find_function (name))
--- 364,372 ----
strcpy (temp_string + char_index + 1, string);
! /* Don't import function names that are invalid identifiers from the
! environment, though we still allow them to be defined as shell
! variables. */
! if (legal_identifier (name))
! parse_and_execute (temp_string, name, SEVAL_NONINT|SEVAL_NOHIST|SEVAL_FUNCDEF|SEVAL_ONECMD);
if (temp_var = find_function (name))
When Bash exports a function, it shows up as an environment variable, for example:
$ foo() { echo 'hello world'; }
$ export -f foo
$ cat /proc/self/environ | tr '\0' '\n' | grep -A1 foo
foo=() { echo 'hello world'
}
When a new Bash process finds a function defined this way in its environment, it evaluates the code in the variable using parse_and_execute(). For normal, non-malicious code, executing it simply defines the function in Bash and moves on. However, because it's passed to a generic execution function, Bash will also parse and execute any additional code defined in that variable after the function definition.
You can see that in the new code, a flag called SEVAL_ONECMD has been added that tells Bash to only evaluate the first command (that is, the function definition), and SEVAL_FUNCDEF to only allow function definitions.
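As a quick self-test against your own bash (hedged: depending on patch level, a fixed bash either warns about the ignored definition or silently treats the value as an ordinary variable):
env 'x=() { :; }; echo pwned' bash -c 'echo done'
# vulnerable bash: prints "pwned" then "done"
# patched bash:    prints only "done", possibly preceded by a warning about x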
In regard to your question about documentation, notice in the command-line documentation for the env command that a study of the syntax shows env is working as documented.
There are, optionally, four possible parts:
An optional hyphen as a synonym for -i (for backward compatibility, I assume).
Zero or more NAME=VALUE pairs. These are the variable assignments, which could include function definitions.
Note that no semicolon (;) is required between or following the assignments.
The last argument(s) can be a single command followed by its argument(s). It will run with whatever permissions have been granted to the login being used. Security is controlled by restricting permissions on the login user and setting permissions on user-accessible executables such that users other than the executable's owner can only read and execute the program, not alter it.
[ spot@LX03:~ ] env --help
Usage: env [OPTION]... [-] [NAME=VALUE]... [COMMAND [ARG]...]
Set each NAME to VALUE in the environment and run COMMAND.
-i, --ignore-environment start with an empty environment
-u, --unset=NAME remove variable from the environment
--help display this help and exit
--version output version information and exit
A mere - implies -i. If no COMMAND, print the resulting environment.
Report env bugs to bug-coreutils@gnu.org
GNU coreutils home page: <http://www.gnu.org/software/coreutils/>
General help using GNU software: <http://www.gnu.org/gethelp/>
Report env translation bugs to <http://translationproject.org/team/>
