Using set -e / set +e in bash with functions - bash

I've been using a simple bash preamble like this in my scripts:
#!/bin/bash
set -e
In conjunction with modularity / using functions this has bitten me today.
So, say I have a function somewhere like
foo() {
#edit: some error happens that makes me want to exit the function and signal that to the caller
return 2
}
Ideally I'd like to be able to use multiple small files, include their functions in other files and then call these functions like
set +e
foo
rc=$?
set -e
This works for exactly two layers of routines. But if foo also calls subroutines like that, the last setting before the return will be set -e, which makes the script exit on the return; I cannot override this in the calling function. So, what I had to do is
foo() {
#calling bar() in a shielded way like above
#..
set +e
return 2
}
Which I find very counterintuitive (and also not what I want - what if in some contexts I'd like to use the function without shielding against failures, while in other contexts I want to handle the cleanup?) What's the best way to handle this? Btw. I'm doing this on OSX, I haven't tested whether this behaviour is different on Linux.

Shell functions don't really have "return values", just exit codes.
You could add && : to the call; this makes the command "tested", so its failure won't make the script exit:
foo() {
echo 'x'
return 42
}
out=$(foo && :)
echo $out
The : is the "null command" (i.e. it doesn't do anything). In this case it doesn't even get executed, since it only runs if foo returns 0 (which it doesn't).
This outputs:
x
It's arguably a bit ugly, but then again, all of shell scripting is arguably a bit ugly ;-)
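If the caller needs the exit status as well as the output, the same "tested command" idea can be used with || instead of && :. This is a minimal sketch, assuming the foo from the example above; the variable names out and rc are only illustrative.

```bash
#!/bin/bash
set -e

foo() {
    echo 'x'
    return 42
}

rc=0
# The assignment takes the exit status of the command substitution.
# Because it is the left operand of ||, errexit does not fire.
out=$(foo) || rc=$?

echo "out=$out rc=$rc"
```

With set -e still active, this prints out=x rc=42 and the script keeps running.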
Quoting sh(1) from FreeBSD, which explains this better than bash's man page:
-e errexit
Exit immediately if any untested command fails in non-interactive
mode. The exit status of a command is considered to be explicitly
tested if the command is part of the list used to control an if,
elif, while, or until; if the command is the left hand operand of
an “&&” or “||” operator; or if the command is a pipeline preceded
by the ! operator. If a shell function is executed and its exit
status is explicitly tested, all commands of the function are con‐
sidered to be tested as well.
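The last sentence of that quote has a consequence that is easy to miss: once a function call is tested, set -e is ignored for every command inside the function. A small sketch of that effect, with step and result as hypothetical names:

```bash
#!/bin/bash
set -e

step() {
    false                              # would abort the script if step were called untested
    echo "still running after false"
}

# step controls the if, so all commands inside it count as tested.
if step; then
    result="step returned success"
else
    result="step returned failure"
fi

echo "$result"
```

Here the function runs to its end and reports success, because the status of the final echo becomes the function's status. Functions that are meant to be called in tested contexts therefore need their own explicit checks (for example, some_command || return 1).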

Related

How does the behavior of a function change if it is within a subshell?

When making functions within a script in bash, it appears that people often have the function run within a subshell, i.e.
function(){(
)}
instead of
function(){
}
What are the benefit/downsides of using {()} rather than just {} if any?
Parentheses cause the function to run in a subshell, which is a child process isolated from the parent shell. They're useful when you want to make process-wide environmental changes without affecting the behavior of code outside of the function.
Examples include:
Changing the current directory with cd does not affect the parent shell. Running cd in a subshell is a cleaner alternative to pushd and popd.
Variable assignments are isolated to the subshell. You can temporarily change global settings like $PATH and $IFS without having to carefully save and restore their values before and after.
Shell options changed with set or shopt will be automatically restored when the subshell exits. I commonly write (set -x; some-commands) to temporarily enable command logging, for example.
Signal handlers installed with trap are only in effect in the subshell. You can install a custom INT (Ctrl-C) handler for the duration of a function, or a custom EXIT handler to run cleanup code when the function returns.
func() {(
echo 'entering func' >&2
trap 'echo exiting func >&2' EXIT
...
)}
If exit is called it won't cause the entire script to exit. This is useful if you want to call exit from several functions down the call stack as a sort of poor man's "exception".
Or if you want to source a script that might exit, wrapping it in a subshell will keep it from killing your script.
(
. ./script-that-might-exit
echo "script set \$foo to $foo"
echo "script changed dir to $PWD"
)
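The isolation described in the list above can be checked with a short script. This is only a sketch; the function name isolated, the variable greeting and the status 7 are made up for the demonstration.

```bash
#!/bin/bash
start_dir=$PWD
greeting=hello

# Hypothetical function whose body is a subshell: note ( ) instead of { }.
isolated() (
    cd / || exit 1          # the directory change stays inside the subshell
    greeting=changed        # so does this assignment
    exit 7                  # ends only the subshell, not the calling script
)

rc=0
isolated || rc=$?

echo "rc=$rc greeting=$greeting"
[ "$PWD" = "$start_dir" ] && echo "still in the starting directory"
```

The caller sees the subshell's exit status (7) as the function's return status, while its working directory and variables are untouched.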
Fun fact: Functions don't have to be delimited by curly braces. It's legal to omit the braces and use parentheses as the delimiters:
func() (
# runs in a subshell
)
If exit is called in a (..) subshell, it will only terminate that expression. Moreover, the code is free to change the values of variables as well as global options (via set); those changes are not seen in the surrounding code and are gone when the expression exits, which can simplify reasoning about the correctness of the code.
When you use (...) inside a function, watch out for this pitfall: a return command inside the (...) won't return from the function; it will just terminate the (...), just like exit. If you have commands after the (...), they will then execute.
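A short sketch of that pitfall, using a hypothetical function named check:

```bash
#!/bin/bash
check() {
    (
        echo "inside the subshell"
        return 5                 # ends only the ( ... ) group, much like exit 5
        echo "never printed"
    )
    # Execution continues here; $? holds the status of the subshell.
    echo "after the subshell, status $?"
}

out=$(check)
echo "$out"
```

If the function should stop at that point, the caller of the (...) group has to check the status itself, for example with ( ... ) || return.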

How can I change the environment of my bash function without affecting the environment it was called from?

I am working on a bash script that needs to operate on several directories. In each directory it needs to source a setup script unique to that directory and then run some commands. I need the environment set up when that script is sourced to only persist inside the function, as if it had been called as an external script with no persistent effects on the calling script.
As a simplified example if I have this script, sourced.sh:
export foo=old
And I have this as my driver.sh:
export foo=young
echo $foo
source_and_run(){
source ./sourced.sh
echo $foo
}
source_and_run
echo $foo
I want to know how to change the invocation of source_and_run in driver so it will print:
young
old
young
I need to be able to collect the return value from the function as well. I'm pretty sure there's a simple way to accomplish this but I haven't been able to track it down so far.
Creating another script like external.sh:
source ./sourced.sh; echo $foo
and defining source_and_run like
source_and_run(){ ./external.sh; return $?; }
would work but managing that extra layer of scripts seems like it shouldn't be necessary.
You said
Creating another script like external.sh:
source ./sourced.sh; echo $foo
and defining source_and_run like
source_and_run(){ ./external.sh; return $?; }
would work but managing that extra layer of scripts seems like it shouldn't be necessary.
You can get the same behavior by using a subshell. Note the () instead of {}.
But note that return $? is not necessary in your case. By default, a function returns the exit status of its last command. In your case, that command is echo. Therefore, the return value will always be 0.
source_and_run() (
source ./sourced.sh
echo "$foo"
)
By the way: A better solution would be to rewrite sourced.sh such that it prints $foo. That way you could call the script directly instead of having to source it and then using echo $foo.
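Since the question also asks for the return value, here is a self-contained sketch of the subshell version. It assumes sourced.sh is recreated in a temporary directory purely for the demonstration, and the exit 3 is a made-up status standing in for whatever the real commands would return.

```bash
#!/bin/bash
# Assumption: a throwaway copy of sourced.sh, created only for this demo.
demo_dir=$(mktemp -d)
cat > "$demo_dir/sourced.sh" <<'EOF'
export foo=old
EOF

export foo=young

source_and_run() (
    . "$demo_dir/sourced.sh"   # changes foo only inside the subshell
    echo "$foo"
    exit 3                     # hypothetical status; it becomes the function's status
)

echo "$foo"
rc=0
source_and_run || rc=$?
echo "$foo"
echo "source_and_run returned $rc"

rm -rf "$demo_dir"
```

This prints young, old, young and then reports the status 3, so the caller can both keep its own environment and react to the result.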
The very purpose of a bash function is that it runs in the same process as the invoker.
Since the environment is accessible (for instance, using the command printenv), you could save the environment at the entry of the function and restore it at the end. However, the easier and more natural approach is to not use a function at all, but to make it a separate shell script which is executed in its own process and hence has its own environment, which no longer affects the environment of the caller.

Purpose of #!/bin/false in bash script

While working on a project written in bash by my former colleague, I noticed that all .sh files that contain nothing but function definitions start with #!/bin/false, which is, as I understand, a safety mechanism for preventing execution of include-only files.
Example:
my_foo.sh
#!/bin/false
function foo(){
echo foontastic
}
my_script.sh
#!/bin/bash
./my_foo.sh # does nothing
foo # error, no command named "foo"
. ./my_foo.sh
foo # prints "foontastic"
However when I don't use #!/bin/false, effects of both proper and improper use are exactly the same:
Example:
my_bar.sh
function bar(){
echo barvelous
}
my_script.sh
#!/bin/bash
./my_bar.sh # spawn a subshell, defines bar and exit, effectively doing nothing
bar # error, no command named "bar"
. ./my_bar.sh
bar # prints "barvelous"
Since properly using those scripts by including them with source works as expected in both cases, and executing them in both cases does nothing from the perspective of the parent shell and generates no error message concerning invalid use, what exactly is the purpose of #!/bin/false in those scripts?
In general, let’s consider a file testcode with bash code in it
#!/bin/bash
if [ "$0" = "${BASH_SOURCE[0]}" ]; then
echo "You are executing ${BASH_SOURCE[0]}"
else
echo "You are sourcing ${BASH_SOURCE[0]}"
fi
you can do three different things with it:
$ ./testcode
You are executing ./testcode
This works if testcode has the right permissions and the right shebang. With a shebang of #!/bin/false, this outputs nothing and returns a code of 1 (false).
$ bash ./testcode
You are executing ./testcode
This completely disregards the shebang (which can even be missing) and it only requires read permission, not executable permission. This is the way to call bash scripts from a CMD command line in Windows (if you have bash.exe in your PATH...), since there the shebang mechanism doesn’t work.
$ . ./testcode
You are sourcing ./testcode
This also completely disregards the shebang, as above, but it is a completely different matter, because sourcing a script means having the current shell execute it, while executing a script means invoking a new shell to execute it. For instance, if you put an exit command in a sourced script, you exit from the current shell, which is rarely what you want. Therefore, sourcing is often used to load function definitions or constants, in a way somewhat resembling the import statement of other programming languages, and various programmers develop different habits to differentiate between scripts meant to be executed and include files to be sourced. I usually don’t use any extension for the former (others use .sh), but I use an extension of .shinc for the latter. Your former colleague used a shebang of #!/bin/false and one can only ask them why they preferred this to a zillion other possibilities. One reason that comes to my mind is that you can use file to tell these files apart:
$ file testcode testcode2
testcode: Bourne-Again shell script, ASCII text executable
testcode2: a /bin/false script, ASCII text executable
Of course, if these include files contain only function definitions, it’s harmless to execute them, so I don’t think your colleague did it to prevent execution.
Another habit of mine, inspired by the Python world, is to place some regression tests at the end of my .shinc files (at least while developing)
... function definitions here ...
[ "$0" != "${BASH_SOURCE[0]}" ] && return
... regression tests here ...
Since return generates an error in executed scripts but is OK in sourced scripts, a more cryptic way to get the same result is
... function definitions here ...
return 2>/dev/null || :
... regression tests here ...
The difference in using #!/bin/false or not from the point of view of the parent shell is in the return code.
/bin/false always returns a failing return code (in my case 1, but I'm not sure if that is standard).
Try this:
./my_foo.sh # does nothing
echo $? # shows "1", a.k.a. failing
./my_bar.sh # does nothing
echo $? # shows "0", a.k.a. everything went right
So, using #!/bin/false not only documents the fact that the script is not intended to be executed, but also produces an error return code when doing so.
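The two ideas from the answers above (a failing exit status and the $0 versus BASH_SOURCE comparison) can be combined into a guard that also prints a message. This is only a sketch: the file contents, the greet function and the message text are invented for illustration, and the include file is run through bash explicitly so the demo does not depend on where false lives on a given system.

```bash
#!/bin/bash
# Hypothetical include file, written to a temp file for the demo.
lib=$(mktemp)
cat > "$lib" <<'EOF'
#!/bin/false
if [ "$0" = "${BASH_SOURCE[0]}" ]; then
    echo "this file must be sourced, not executed" >&2
    exit 1
fi
greet() { echo "hello from greet"; }
EOF

# Executed as a script: the guard fires and the status is non-zero.
rc=0
bash "$lib" 2>/dev/null || rc=$?
echo "executed: exit status $rc"

# Sourced from another bash process: the guard is skipped, greet is defined.
sourced_out=$(bash -c '. "$1"; greet' _ "$lib")
echo "sourced: $sourced_out"

rm -f "$lib"
```

Unlike a bare #!/bin/false, the guard also catches the bash ./file form of invocation, which ignores the shebang.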

Does a Shell function run in a sub-shell?

I'm trying to get around a problem: it seems to me that you cannot pass an open db2 connection to a sub-shell.
My code organization is as follows:
Driver script (in my_driver.sh)
# foo.sh defines baz() bar(), which use a db2 connection
# Also the "$param_file" is set in foo.sh!
source foo.sh
db2 "connect to $dbName USER $dbUser using $dbPass"
function doit
{
cat $param_file | while read params
do
baz $params
bar $params
done
}
doit
I've simplified my code, but the above is enough to give the idea. I start the above:
my_driver.sh
Now, my real issue is that the db2 connection is not available in sub-shell:
I tried:
. my_driver.sh
Does not help
If I do it manually from the command line:
source foo.sh
And I set $params manually:
baz $params
bar $params
Then it does work! So it seems that doit or something else acts as if bar and baz are executed from a sub-shell.
It would be best if I could somehow figure out how to pass the open db2 connection to a sub-shell.
Otherwise, these shell functions seem to me that they run in a sub-shell. Is there a way around that?
The shell does not create a subshell to run a function.
Of course, it does create subshells for many other purposes, not all of which might be obvious. For example, it creates subshells in the implementation of |.
db2 requires that all db2 commands have the same parent as the db2 command which established the connection. You could log the PID using something like:
echo "Execute db2 from PID $$" >> /dev/stderr
db2 ...
(as long as the db2 command isn't executed inside a pipe or shell parentheses.)
One possible problem in the code shown (which would have quite a different symptom) is the use of the non-standard syntax
function f
to define a function. A standard shell expects
f()
Bash understands both, but if you don't have a shebang line or you execute the scriptfile using the sh command, you will end up using the system's default shell, which might not be bash.
Found solution, but can't yet fully explain the problem ....
if you change doit as follows it works!
function doit
{
while read params
do
baz $params
bar $params
done < $param_file
}
Only, I'm not sure why, or how I can prove it...
If I stick in debug code:
echo debug check with PID=$$ PPID=$PPID and SHLVL=$SHLVL
I get back same results with the | or not. I do understand that cat $param_file | while read params creates a subshell, however, my debug statements always show the same PID and PPID...
So my problem is solved, but I'm missing some explanations.
I also wonder if this question would not be more well suited in the unix.stackexchange community?
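On the missing explanation: in bash, $$ keeps expanding to the PID of the original shell even inside a subshell, so it cannot reveal the subshell that the pipe creates. The BASHPID variable (bash-specific) reports the PID of the process actually running the code. The sketch below shows this, and also uses a counter as a second, shell-independent way to see that the piped loop runs elsewhere; the file contents and variable names are made up.

```bash
#!/bin/bash
params_file=$(mktemp)
printf 'one\ntwo\n' > "$params_file"

piped=0
cat "$params_file" | while read -r line; do
    piped=$((piped + 1))            # modifies the subshell's copy only
    echo "pipe loop:     \$\$=$$ BASHPID=$BASHPID"
done

redirected=0
while read -r line; do
    redirected=$((redirected + 1))  # modifies the current shell's variable
    echo "redirect loop: \$\$=$$ BASHPID=$BASHPID"
done < "$params_file"

echo "piped=$piped redirected=$redirected"
rm -f "$params_file"
```

With bash's default settings the last line reads piped=0 redirected=2: the increments made in the piped loop are lost with its subshell, which is the same reason commands in that loop do not share a parent process with the rest of the script.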
A shell function in shells such as sh (i.e. Dash) or Bash may be considered a labelled command group or named "code block" which may be called multiple times by its name. A command group surrounded by {} does not create a subshell or "fork" a process, but executes in the same process and environment.
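A quick way to see the difference between a {} body and a () body is to let both modify the same variable. The function names here are hypothetical.

```bash
#!/bin/bash
counter=0

# Body in braces: runs in the current shell, so the change is visible afterwards.
bump_in_braces() {
    counter=$((counter + 1))
}

# Body in parentheses: runs in a subshell, so the change is discarded.
bump_in_parens() (
    counter=$((counter + 1))
)

bump_in_braces
bump_in_parens
echo "counter=$counter"
```

Only the call with the {} body is reflected in the final value, so this prints counter=1.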
Some might find it relatively similar to goto, where function names represent labels as in other programming languages, including C, Basic, or Assembler. However, the two differ considerably (e.g. a function returns to its caller, but a goto doesn't), and the Go To Statement may be Considered Harmful.
Shell Functions
Shell functions are a way to group commands for later
execution using a single name for the group. They are executed just
like a "regular" command. When the name of a shell function is used as
a simple command name, the list of commands associated with that
function name is executed. Shell functions are executed in the current
shell context; no new process is created to interpret them.
Functions are declared using this syntax:
fname () compound-command [ redirections ]
or
function fname [()] compound-command [ redirections ]
This defines a shell function named fname. The reserved word function is optional. If the function reserved word is supplied, the parentheses are optional.
Source: https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html or man bash.
Grouping Commands Together
Commands may be grouped by writing either
(list)
or
{ list; }
The first of these executes the commands in a subshell. Builtin commands grouped into a (list) will not affect the current shell. The
second form does not fork another shell so is slightly more efficient.
Grouping commands together this way allows you to redirect their
output as though they were one program:
{ printf " hello " ; printf " world\n" ; } > greeting
Note that "}" must follow a control operator (here, ";") so that it is recognized as a reserved word and not as another command
argument.
Functions
The syntax of a function definition is
name () command
A function definition is an executable statement; when executed it installs a function named name and returns an exit status of zero. The command is normally a list enclosed between "{" and "}".
Source: https://linux.die.net/man/1/dash or man sh.
Transfers control unconditionally.
Used when it is otherwise impossible to transfer control to the desired location using other statements... The goto statement transfers control to the location specified by label. The goto statement must be in the same function as the label it is referring, it may appear before or after the label.
Source: https://en.cppreference.com/w/cpp/language/goto
Goto
... It performs a one-way transfer of control to another line of code; in
contrast a function call normally returns control. The jumped-to
locations are usually identified using labels, though some languages
use line numbers. At the machine code level, a goto is a form of
branch or jump statement, in some cases combined with a stack
adjustment. Many languages support the goto statement, and many do not...
Source: https://en.wikipedia.org/wiki/Goto
Related:
https://mywiki.wooledge.org/BashProgramming#Functions
https://uomresearchit.github.io/shell-programming-course/04-subshells_and_functions/index.html (Subshells and Functions...)
Is there a "goto" statement in bash?
What's the difference between "call" and "invoke"?
https://en.wikipedia.org/wiki/Call_stack
https://mywiki.wooledge.org/BashPitfalls

"Exception handling" in shell scripts

I know that you can use the shortcutting boolean operators in shell scripts to do some sort of exception handling like so:
my_first_command && my_second_command && my_third_command
But this quickly becomes unreadable and unmaintainable as the number of commands you want to chain grows. If I'm writing a script (or a shell function), is there a good way to have execution of the script or function halt on the first nonzero return code, without writing on one big line?
(I use zsh, so if there are answers that only work in zsh that's fine by me.)
The -e option does this:
ERR_EXIT (-e, ksh: -e)
If a command has a non-zero exit status, execute the ZERR trap,
if set, and exit. This is disabled while running initialization
scripts.
You should be able to put this on the shebang line, like:
#!/usr/bin/zsh -e
Most shells have this option, and it's usually called -e.
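As a sketch of what this looks like in practice, the commands can simply be written one per line after set -e. The demo below uses sh rather than zsh and writes a throwaway script to a temp file so that the effect can be observed from outside; the step messages are invented.

```bash
#!/bin/bash
script=$(mktemp)
cat > "$script" <<'EOF'
set -e
echo "step 1"
false                          # first failing command: the script stops here
echo "step 3 (never reached)"
EOF

rc=0
out=$(sh "$script") || rc=$?

echo "output: $out"
echo "exit status: $rc"
rm -f "$script"
```

Only step 1 is printed, and the script's exit status is that of the failing command (1 here).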
