How to detect "Cannot fork" errors in bash? - bash

For currently unknown reasons, one of our bash-scripts produces "Cannot fork" errors when running a simple line like:
myvar=`mycmd || echo "error"; exit 2`
Obviously the problem is that no new process can be created (forked) so that command fails.
However bash just ignores the error and continues in the script which caused unexpected problems.
As you can see, we already check for errors in the command itself, but the "Cannot fork" error appears before the command is even run.
Is there a way to catch that error and stop bash from execution?

There are actually several problems with this error check that will prevent it from properly handling any error, not just "Cannot fork" errors.
The first problem is that || has higher precedence than ;, so mycmd || echo "error"; exit 2 will run echo "error" only if mycmd fails, but it will run exit 2 unconditionally, whether mycmd succeeds or fails.
In order to fix this, you should group the error handling commands with { }. Something like: mycmd || { echo "error"; exit 2; }. Note that a ; or line break before } is required, or the } will be treated as an argument to exit.
(BTW, I sometimes see this shorthanded as mycmd || echo "error" && exit 2. Don't do this. If the echo fails for some weird reason, it won't run the exit.)
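A minimal sketch of the precedence difference described above. It uses true as a stand-in for a successful mycmd and replaces exit 2 with a second echo so that the demo keeps running; both substitutions are assumptions made for illustration only.

```bash
#!/usr/bin/env bash
# Ungrouped: the ';' ends the '||' list, so the second echo always runs.
ungrouped() {
    true || echo "error"; echo "ran unconditionally"
}

# Grouped: both commands run only when the first command fails.
grouped() {
    true || { echo "error"; echo "ran only on failure"; }
}

ungrouped   # prints "ran unconditionally" even though 'true' succeeded
grouped     # prints nothing
```

With exit 2 in place of the second echo, the ungrouped form would therefore exit every time.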
Also all of this, the echo and the exit, is run in the subshell created by the backticks (or would be, if that subshell had forked successfully). That means the error message will get saved in myvar rather than printed (error messages should generally be sent to standard error, e.g. echo "error" >&2); and more importantly it'll be the subshell that exits, not the shell that's running the script. The main script will note that the subshell exited with an error... and blithely keep running. (Well, unless you have -e set, but that's a whole other ball of potential bugs.)
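To see the subshell behaviour for yourself, here is a small sketch. false stands in for a failing mycmd (an assumption for the demo), and the error handling is deliberately left inside the command substitution.

```bash
#!/usr/bin/env bash
demo() {
    # The echo and the exit both run inside the command-substitution subshell.
    myvar=$(false || { echo "error"; exit 2; })
    status=$?
    # We still get here: only the subshell exited, not the function or script.
    echo "status=$status myvar=$myvar"
}

demo   # prints: status=2 myvar=error
```

The error text ends up in myvar, and the calling shell only sees a non-zero status that it is free to ignore.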
The solution to that is to put the || stuff outside the backticks (or $( ), since that's generally preferred over backticks). That way it happens in the main shell: that's what prints the error, that's what exits if there's an error, etc. This should also solve the "Cannot fork" problem, although I haven't tested it.
So, with all these corrections, it should look something like this:
myvar=$(mycmd) || {
    echo "error" >&2
    exit 2
}
Oh, and as Charles Duffy pointed out in a comment, if you use something like local myvar=$(mycmd) or export myvar=$(mycmd), the local/export/whatever command will override the exit status from mycmd. If you need to do that, set the variable's properties separately from its value:
local myvar
myvar=$(mycmd) || {
...

Related

How to make exception for a bash script with set -ex

I have a bash script that has set -ex, which means the running script will exit once any command in it hits an error.
My use case is that there's a subcommand in this script for which I want to catch its error, instead of making the running script shutdown.
E.g., here's myscript.sh
#!/bin/bash
set -ex
# pseudocode here
error=$( some command )
if [[ -n $error ]] ; then
    # do something
fi
Any idea how I can achieve this?
You can override the exit status of a single command
this_will_fail || true
Or for an entire block of code
set +e
this_will_fail
set -e
Beware, however, that the final set -e turns exit-on-error back on unconditionally, so this block will no longer do what you want if you later decide to stop using set -e in the rest of the script.
If you want to handle a particular command's error status yourself, you can use it as the condition in an if statement:
if ! some command; then
echo 'An error occurred!' >&2
# handle error here
fi
Since the command is part of a condition, it won't exit on error. Note that other than the ! (which negates it, so the then clause will run if the command fails rather than when it succeeds), you just include the command directly in the if statement (no brackets, parentheses, etc).
BTW, in your pseudocode example, you seem to be treating it as an error if the command produces any output; usually that's not what you want, and I'm assuming you actually want to test the exit status to detect errors.
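If you need both the command's output and its exit status while set -e is active, one possible pattern is sketched below. The some_command function is a hypothetical stand-in that prints something and fails with status 3; it is not from the question.

```bash
#!/usr/bin/env bash
set -e

# Hypothetical stand-in for "some command": prints output, then fails.
some_command() {
    echo "partial output"
    return 3
}

status=0
# The '||' makes the assignment part of a list, so set -e does not exit here.
output=$(some_command) || status=$?

if [ "$status" -ne 0 ]; then
    echo "some_command failed with status $status (output: $output)" >&2
fi
echo "script continues"
```

The script reaches the last line even though some_command failed, and the status is still available for whatever handling you need.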

Unsure how to send error code if command within a shell script fails

I recently learned about an if statement that looks at the previous command and, if it failed, exits with code 1 and a message. I can't remember the entire statement but it starts with the below:
if [ $? != "0" ]
How does this statement end? Does it follow every command within a script?
Don't do that. Explicitly referencing $? is almost never necessary. If you want to exit with status 1 when a command fails, you can simply do:
cmd || exit 1
If you want to exit with the same non-zero value returned by the command, you can simply do:
cmd || exit
There are a lot of examples of bad code out there that instead do things like:
cmd
if [ "$?" -ne 0 ]; then echo cmd failed >&2; exit 1; fi
and this is bad practice for many reasons. There's no point in having the shell print a generic message about failure; the command itself ought to have written a detailed error message already, and the vague "cmd failed" is just line noise. Also, you will often see set -e, which basically slaps a || exit on the end of every simple command. It has a lot of unintended side effects and edge cases, and its implementation has changed over time, so different versions of the same shell handle those edge cases differently. It's really not a good idea to use it.
As to the question "how does this statement end?"; it ends with fi. The general form of if is if CMD; then ...; else ...; fi where CMD is some set of pipelines (eg, you can do if echo foo | grep bar; cmd2 | foo; then ....). If the CMD returns a 0 status the first set of commands (between "then" and "else") is executed. If CMD returns non-zero, the commands between "else" and "fi" are executed. The "else" clause is optional. Don't be fooled by [; it is simply a command. In my opinion, it would be clearer if you used its alternate spelling test, which does not require a final argument of ]. IOW, you could write if test "$?" -ne 0; then ...; fi.
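As a small illustration of the points above, the sketch below uses an ordinary pipeline as the if condition and then shows the two spellings of the test command side by side. The contains_bar function name is made up for the example.

```bash
#!/usr/bin/env bash
# 'if' runs any command list and branches on its exit status.
contains_bar() {
    if echo "$1" | grep -q bar; then
        echo "match"
    else
        echo "no match"
    fi
}

contains_bar "foobar"   # prints: match
contains_bar "foo"      # prints: no match

# '[' and 'test' are the same command under two spellings.
if test 1 -ne 0; then echo "test: no closing bracket needed"; fi
if [ 1 -ne 0 ]; then echo "[: closing bracket required"; fi
```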

How can I use an if statement in bash to check if no errors have occurred?

I have a bash script I want to self-destruct on execution. So far it works great but I'd like some final check that if no errors have occurred (any output to stderr), then go ahead and self-destruct. Otherwise, I'd like to leave the script intact. I have the code for everything except the error check. Not sure if I can just output err to a file and check if the file is empty. I'm sure it's a simple solution.
How could I do this?
Thanks for any help.
You can try this out. $? contains the exit status of the most recently executed command. On Unix-like systems, 0 means no error and 1 to 255 indicate some kind of error. Note that this will also report errors that do not produce any stderr output.
command
if [ "$?" -ne 0 ]; then
    echo "command failed"
    # your termination logic here
    exit 1
fi
Assuming that your script returns the value 0 on success and a value from 1 to 255 if an error occurs, you can use the following command:
if /path/to/myscript; then
    echo success
else
    echo failed
fi
You can also use the following (shorter) command. Note that the script is run directly, not wrapped in [[ ]], which would only test whether the path string is non-empty:
/path/to/myscript && echo success || echo failed
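The question also asks about treating any stderr output as an error. A rough sketch of the "write stderr to a file and check whether it is empty" idea follows. The ran_cleanly, noisy and quiet names are hypothetical helpers invented for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command and succeed only if it exited with
# status 0 AND wrote nothing to stderr.
ran_cleanly() {
    local errfile status=0
    errfile=$(mktemp) || return 1
    "$@" 2>"$errfile" || status=$?
    # -s is true when the file exists and is not empty.
    if [ -s "$errfile" ]; then
        status=1
    fi
    rm -f "$errfile"
    return "$status"
}

noisy() { echo "warning" >&2; }   # exits 0 but writes to stderr
quiet() { echo "all good"; }      # exits 0 and stderr stays empty

ran_cleanly quiet && echo "quiet: clean"
ran_cleanly noisy || echo "noisy: not clean"
```

Note that this version discards the captured stderr text; if you want to see it, print the file before removing it.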

Bash script does not quit on first "exit" call when calling the problematic function using $(func)

Sorry I cannot give a clear title for what's happening but here is the simplified problem code.
#!/bin/bash
# get the absolute path of .conf directory
get_conf_dir() {
    local path=$(some_command) || { echo "please install some_command first."; exit 100; }
    echo "$path"
}
# process the configuration
read_conf() {
    local conf_path="$(get_conf_dir)/foo.conf"
    [ -r "$conf_path" ] || { echo "conf file not found"; exit 200; }
    # more code ...
}
read_conf
So basically here what I am trying to do is, reading a simple configuration file in bash script, and I have some trouble in error handling.
some_command is a command that comes from a third-party library (i.e. greadlink from coreutils), and it is required to obtain the path.
When running the code above, I expect it to output "command not found" because that's where the FIRST error occurs, but actually it always prints "conf file not found".
I am very confused by this behavior. I think bash probably intends to handle things like this, but I don't know why. And most importantly, how do I fix it?
Any idea would be greatly appreciated.
Do you see your please install some_command first message anywhere? Is it in $conf_path from the local conf_path="$(get_conf_dir)/foo.conf" line? Do you have a $conf_path value of please install some_command first/foo.conf? Which then fails the -r test?
No, you don't. (But feel free to echo the value of $conf_path in that exit 200 block to confirm this fact.) (Also, error messages should, in general, get sent to standard error and not standard output anyway, so they should be echo "..." >&2. That way they won't be caught by the command substitution at all.)
The reason you don't is because that exit 100 block is never happening.
You can see this with set -x at the top of your script also. Go try it.
See what I mean?
The reason it isn't happening is that the failure return of some_command is being swallowed by the local path=$(some_command) assignment statement.
Try running this command:
f() { local a=$(false); echo "Returned: $?"; }; f
Do you expect to see Returned: 1? You might but you won't see that.
What you will see is Returned: 0.
Now try either of these versions:
f() { a=$(false); echo "Returned: $?"; }; f
f() { local a; a=$(false); echo "Returned: $?"; }; f
Get the output you expected in the first place?
Right. local, export, declare, and typeset are commands in their own right. They have their own return values. They ignore (and replace) the return value of the commands that execute in their contexts.
The solution to your problem is to split the local path and path=$(some_command) statements.
http://www.shellcheck.net/ catches this (and many other common errors). You should make it your friend.
In addition to the above (if you've managed to follow along this far) even with the changes mentioned so far your exit 100 won't exit the main script since it will only exit the sub-shell spawned by the command substitution in the assignment.
If you want that exit 100 to exit your script then you either need to notice and re-exit with it (check for get_conf_dir failure after the conf_path assignment and exit with the previous exit code) or drop the get_conf_dir function itself and just do that inline in read_conf.
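Putting both fixes together, one way the functions could look is sketched below. It uses return instead of exit so that the caller decides whether to terminate, which is a design choice of this sketch rather than something the question requires. some_command is the asker's placeholder and is assumed to be missing on the machine running the demo.

```bash
#!/usr/bin/env bash
get_conf_dir() {
    local path
    # Declared separately so 'local' cannot mask the status of some_command.
    path=$(some_command 2>/dev/null) || {
        echo "please install some_command first." >&2
        return 100
    }
    echo "$path"
}

read_conf() {
    local conf_dir
    # A bare 'return' passes the failed status (100) on to the caller.
    conf_dir=$(get_conf_dir) || return
    local conf_path="$conf_dir/foo.conf"
    [ -r "$conf_path" ] || { echo "conf file not found" >&2; return 200; }
    # more code ...
}

read_conf || echo "read_conf failed with status $?"
```

In a real script the last line would more likely be read_conf || exit, which ends the script with the same status.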

Trap syntax issue in bash

I intend to use trap to execute some cleanup code in case of a failure. I have the following code, but it seems to have some syntactical issues.
#!/bin/bash
set -e
function handle_error {
    umount /mnt/chroot
    losetup -d $LOOP_DEV1 $LOOP_DEV2
}
trap "{ echo \"$BASH_COMMAND failed with status code $?\"; handle_error; }" ERR
Does anyone see an issue with the way the trap has been written? In case of an error the trap does get executed fine, but it also throws another unwanted error message, shown below.
/root/myscript.sh: line 60: } ERR with status code 0: command not found
##line 60 is that line of code that exited with a non zero status
How do I write it correctly to avoid the error message? Also what if I had to send arguments $LOOP_DEV1 and $LOOP_DEV2 from the main script to the trap and then to the handle_error function? Right now they are exported as environment variables in the main script. I did some search for trap examples but I couldn't get something similar.
EDIT
I changed the shebang from /bin/sh to /bin/bash. As /bin/sh was already symlinked to bash I did not expect unicorns nor did I see any.
That trap call is creating an interesting recursion, because $BASH_COMMAND (and $?) are being expanded when the trap command executes, not when the trap fires. However, $BASH_COMMAND at that point is the trap command itself, textually including $BASH_COMMAND (and some quotes and semicolons). Figuring out exactly what command gets executed when the trap fires is an interesting study, but it's not necessary to fix the problem, which you can do like this:
trap '{ echo "$BASH_COMMAND failed with status code $?"; handle_error; }' ERR
Note that replacing " with ' not only avoids immediate parameter expansion, it also avoids having to escape the inner "s.
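The difference in expansion time can be seen with an ordinary variable. The sketch below is only an illustration: msg is a made-up variable, and each variant runs in a subshell so the traps do not leak into the calling shell.

```bash
#!/usr/bin/env bash
double_quoted() (
    msg="at definition"
    trap "echo \"msg was: $msg\"" ERR   # $msg is expanded right now
    msg="at failure"
    false   # fails, so the ERR trap fires
    true    # keep the function's own exit status at 0
)

single_quoted() (
    msg="at definition"
    trap 'echo "msg was: $msg"' ERR     # $msg is expanded when the trap fires
    msg="at failure"
    false   # fails, so the ERR trap fires
    true    # keep the function's own exit status at 0
)

double_quoted   # prints: msg was: at definition
single_quoted   # prints: msg was: at failure
```

The same reasoning applies to $BASH_COMMAND and $?: with single quotes they are evaluated when the failing command triggers the trap.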
