Why is `trap -` not working when put inside a function? - bash

Short version
In a Bash script, I activate a trap and later deactivate it by calling trap - EXIT ERR SIGHUP SIGINT SIGTERM. When I do the deactivation directly in the script, it works. However, when I put the exact same line of code in a Bash function, it is ignored, i.e. the trap still fires if a later command returns a non-zero exit code. Why?
Long version
I have a bunch of functions to work with traps:
trap_stop()
{
    echo "trap_stop"
    trap - EXIT ERR SIGHUP SIGINT SIGTERM
}
trap_terminate()
{
    local exitCode="$?"
    echo "trap_terminate"
    trap_stop
    local file="${BASH_SOURCE[1]}"
    local stack=$(caller)
    local line="${stack% *}"
    if [ $exitCode == 0 ]; then
        echo "Finished."
    else
        echo "The initialization failed with code $exitCode in $file:${line}."
    fi
    exit $exitCode
}
trap_start()
{
    echo "trap_start"
    trap "trap_terminate $LINENO" EXIT ERR SIGHUP SIGINT SIGTERM
}
When used like this:
trap_start # <- Trap started.
echo "Stopping traps."
trap_stop # <- Trap stopped before calling a command which exits with exit code 2.
echo "Performing a command which will fail."
ls /tmp/missing
exit_code="$?"
echo "The result of the check is $exit_code."
I get the following output:
trap_start
Stopping traps.
trap_stop
Performing a command which will fail.
ls: cannot access '/tmp/missing': No such file or directory
trap_terminate
trap_stop
The initialization failed with code 2 in ./init:41.
Although the function that deactivates the trap was called, the trap was still triggered when calling ls on a directory that doesn't exist.
On the other hand, when I replace the call to trap_stop by the actual trap - statement, like this:
trap_start
echo "Stopping traps."
trap - EXIT ERR SIGHUP SIGINT SIGTERM # <- This statement replaced the call to `trap_stop`.
echo "Performing a command which will fail."
ls /tmp/missing
exit_code="$?"
echo "The result of the check is $exit_code."
then the output is correct, i.e. the trap is not activated and I reach the end of the script.
trap_start
Stopping traps.
Performing a command which will fail.
ls: cannot access '/tmp/missing': No such file or directory
The result of the check is 2.
Why does moving trap - into a function make it stop working?

EDIT (courtesy of @KamilCuk): If your bash is older than 4.4, upgrade it; that could solve the problem.
I added some debugging to your code:
echo "Stopping traps."
trap -p
trap_stop # <- Trap stopped before calling a command which exits with exit code 2.
trap -p
And got:
Stopping traps.
trap -- 'trap_terminate 29' EXIT
trap -- 'trap_terminate 29' SIGHUP
trap -- 'trap_terminate 29' SIGINT
trap -- '' SIGFPE
trap -- 'trap_terminate 29' SIGTERM
trap -- '' SIGXFSZ
trap -- '' SIGPWR
trap -- 'trap_terminate 29' ERR
trap_stop
trap -- '' SIGFPE
trap -- '' SIGXFSZ
trap -- '' SIGPWR
trap -- 'trap_terminate 29' ERR
As you can see, the trap - part does work, except for the ERR condition.
After some man page time:
echo "Stopping traps."
set -E
trap_stop # <- Trap stopped before calling a command which exits with exit code 2.
yields:
trap_start
Stopping traps.
trap_stop
Performing a command which will fail.
ls: cannot access '/tmp/missing': No such file or directory
The result of the check is 2.
The relevant part of bash(1):
-E
If set, any trap on ERR is inherited by shell functions, command substitutions, and commands executed in a subshell environment. The ERR trap is normally not inherited in such cases.
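The inheritance rule quoted above can be observed with a small experiment. This is only a sketch: inner_failure is a hypothetical helper that is not part of the original script, and both runs happen inside command substitutions so that the traps do not leak into the surrounding shell.

```shell
#!/bin/bash
# Hypothetical helper: a command fails inside the function,
# but the function itself still returns 0.
inner_failure() {
    false
    return 0
}

# Without errtrace, the ERR trap is not inherited by the function,
# so the failing `false` goes unnoticed.
without_errtrace=$(
    trap 'echo "ERR trap fired"' ERR
    inner_failure
)

# With `set -E`, the same trap also applies inside the function.
with_errtrace=$(
    set -E
    trap 'echo "ERR trap fired"' ERR
    inner_failure
)

echo "without -E: '$without_errtrace'"
echo "with -E:    '$with_errtrace'"
```

The first line of output should show an empty string and the second should show ERR trap fired, which matches the man page wording about functions not inheriting the ERR trap by default.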
That said, this seems to be a bug in bash:
#!/bin/bash
t1()
{
    trap 'echo t1' ERR
}
t2()
{
    trap 'echo t2' ERR
}
t1
false
t2
false
yields:
t1
t1
whereas I'd expect at the very least:
t1
t2

Related

trap INT in bash script fails when a function outputs its stdout in a process substitution call

I have the below script example to handle trap on EXIT and INT signals during some job, and trigger a clean up function that cannot be interrupted
#!/bin/bash
# Our main function to handle some job:
some_job() {
    echo "Working hard on some stuff..."
    for i in $(seq 1 5); do
        #printf "."
        printf '%s' "$i."
        sleep 1
    done
    echo ""
    echo "Job done, but we found some errors !"
    return 2 # to simulate script exit code 2
}
# Our clean temp files function
# - should not be interrupted
# - should not be called twice if interrupted
clean_tempfiles() {
    echo ""
    echo "Cleaning temp files, do not interrupt..."
    for i in $(seq 1 5); do
        printf "> "
        sleep 1
    done
    echo ""
}
# Called on signal EXIT, or indirectly on INT QUIT TERM
clean_exit() {
    # save the return code of the script
    err=$?
    # reset trap for all signals to not interrupt clean_tempfiles() on any next signal
    trap '' EXIT INT QUIT TERM
    clean_tempfiles
    exit $err # exit the script with saved $?
}
# Called on signals INT QUIT TERM
sig_cleanup() {
    # save error code (130 for SIGINT, 143 for SIGTERM, 131 for SIGQUIT)
    err=$?
    # some shells will call EXIT after the INT signal
    # causing EXIT trap to be executed, so we trap EXIT after INT
    trap '' EXIT
    (exit $err) # execute in a subshell just to pass $? to clean_exit()
    clean_exit
}
trap clean_exit EXIT
trap sig_cleanup INT QUIT TERM
some_job
The trap works properly: clean_tempfiles() cannot be interrupted and the exit code from some_job() is preserved.
Now, if I want to redirect the output from some_job() using process substitution, in the last script line:
# - redirect stdout and stderr to stdout_logfile
# - redirect stderr to stderr_logfile
# - redirect both stdout and stderr to terminal
some_job > >(tee -a stdout_logfile) 2> >(tee -a stderr_logfile | tee -a stdout_logfile >&2)
the normal EXIT trap is properly triggered. However, the SIGINT trap is never triggered: Ctrl+C can no longer be trapped as soon as that logging line is added.
I could work around it using POSIX sh compliant redirects with temporary FIFOs for the logs, but I would really like to have it working with the simpler bash syntax using process substitution.
Is this even possible?

How to handle wrong exit code upon child failure with SIGCHLD

I have a shell script which starts a bunch of child processes in the background. If any of them fails, I want to terminate all child processes and the parent process with kill -- -$$.
I attempted to make a simpler signal handler function check_exit_code() in the parent process to see if it would work fine. It is called every time a child process terminates, using trap and SIGCHLD signal.
#!/bin/sh
set -o monitor
function check_exit_code {
    if [ $1 -eq "0" ]
    then
        echo "Success: $1"
    else
        echo "Fail: $1"
    fi
}
trap "check_exit_code $?" SIGCHLD
mycommand1 &
mycommand2 &
mycommand3 &
...
wait
Unfortunately, this only returns Success: 0 even when mycommand# failed and its exit code was 2, so I changed the function to the following.
#!/bin/sh
set -o monitor
function check_exit_code() {
    local EXIT_STATUS=$?
    if [ "$EXIT_STATUS" -eq "0" ]
    then
        echo "Success: $EXIT_STATUS"
    else
        echo "Fail: $EXIT_STATUS"
    fi
}
trap "check_exit_code" SIGCHLD
mycommand1 &
mycommand2 &
mycommand3 &
...
wait
This only returns Fail: 145, which mycommand# cannot return. I suspect that when I write $? I receive the exit status of another command. What is the problem? How would you fix it?
The problem with your first attempt is the double quotes around check_exit_code $?. The shell expands $? to zero before setting the trap, so a SIGCHLD triggers check_exit_code 0 no matter what, hence the constant Success: 0 output.
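The quoting issue described above can be reproduced without any child processes. The sketch below uses an EXIT trap instead of SIGCHLD purely to keep the demonstration deterministic; the variable names are illustrative.

```shell
#!/bin/bash
# Double quotes: $? is expanded when the trap is *set* (0 after `true`),
# so the trap action is literally `echo "status=0"`.
double_quoted=$(
    true
    trap "echo \"status=$?\"" EXIT
    exit 3
)

# Single quotes: $? is expanded when the trap *runs*,
# so it reflects the status the subshell exits with.
single_quoted=$(
    true
    trap 'echo "status=$?"' EXIT
    exit 3
)

echo "double quotes: $double_quoted"   # status=0
echo "single quotes: $single_quoted"   # status=3
```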
Regarding your second attempt, upon entry to the trap action, the special parameter ? holds the most recent pipeline's exit status as usual, which is in this case wait. Since a trap was set for SIGCHLD, the shell interrupts wait and assigns 128 + 17 (numeric equivalent of SIGCHLD) to ? upon receiving that signal.
With bash-5.1.4 or a newer version, you can achieve the desired result like so:
while true; do
    wait -n -p pid
    case $?,$pid in
    ( 0* ) # a job has exited with zero
        continue ;;
    ( *, ) # pid is empty, no jobs left
        break ;;
    ( * ) # a job has exited with non-zero
        trap "exit $?" TERM
        kill 0
    esac
done
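For a bash that predates wait -p, one possible alternative (a different technique from the loop above, not an equivalent) is to record each background PID and wait for them one by one. This sketch only collects the exit statuses; it does not terminate the remaining jobs as soon as one fails. The subshells with fixed exit codes are hypothetical stand-ins for mycommand1, mycommand2 and mycommand3.

```shell
#!/bin/bash
# Hypothetical stand-ins for mycommand1..3: subshells with fixed exit codes.
pids=()
(exit 0) & pids+=("$!")
(exit 2) & pids+=("$!")
(exit 0) & pids+=("$!")

failures=0
first_failure=0
for pid in "${pids[@]}"; do
    # `wait PID` returns the exit status of that specific child.
    if wait "$pid"; then
        echo "Success: job $pid"
    else
        status=$?
        echo "Fail: job $pid exited with $status"
        failures=$((failures + 1))
        if [ "$first_failure" -eq 0 ]; then
            first_failure=$status
        fi
    fi
done
echo "$failures job(s) failed; first failing status: $first_failure"
```

Because each wait names a specific PID, $? is the status of that child rather than the 128 + signal value seen inside the SIGCHLD trap.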

Bash: how to trap set -e, but not exit

My scripts have as first instruction:
set -e
So that whenever an error occurs, the script aborts. I would like to trap this situation to show an information message, but I do not want to show that message whenever the script exits; ONLY when set -e triggers the abortion. Is it possible to trap this situation?
This:
set -e
function mytrap {
echo "Abnormal termination!"
}
trap mytrap EXIT
error
echo "Normal termination"
The trap is called on every exit (whether an error happens or not), which is not what I want.
Instead of trapping EXIT, trap the ERR event:
trap mytrap ERR
Full Code:
set -e
function mytrap {
echo "Abnormal termination!"
}
trap mytrap ERR
(($#)) && error
echo "Normal termination"
Now run it for error generation:
bash sete.sh 123
sete.sh: line 9: error: command not found
Abnormal termination!
And here is normal exit:
bash sete.sh
Normal termination
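One caveat worth checking when relying on ERR together with set -e: according to the bash manual, the ERR trap is not run for a failing command that is tested by if or that appears in a || list before the final command. A minimal sketch of both cases follows; the variable names are illustrative, and each case runs in a command substitution so that set -e does not affect the calling shell.

```shell
#!/bin/bash
# Failures that are explicitly handled do not trigger ERR or set -e.
guarded=$(
    set -e
    trap 'echo "Abnormal termination!"' ERR
    false || echo "handled"    # failure is part of a || list
    if false; then :; fi       # failure is an if condition
    echo "Normal termination"
)

# An unhandled failure triggers the ERR trap, then set -e aborts.
unguarded=$(
    set -e
    trap 'echo "Abnormal termination!"' ERR
    false
    echo "Normal termination"
)

printf 'guarded:\n%s\n' "$guarded"
printf 'unguarded:\n%s\n' "$unguarded"
```

The guarded run should print handled and Normal termination, while the unguarded run should print only Abnormal termination!.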

bash trap '' vs trap function passing signals

I'm confused about forwarding signals to child processes with traps. Say I have two scripts:
a.sh
#!/bin/bash
# print the process id
echo $$
cleanup() {
rv=$?
echo "cleaning up $rv"
exit
}
sleep 5
trap '' SIGTERM # trap cleanup SIGTERM
echo 'cant stop wont stop'
./b.sh
echo 'can stop will stop'
trap - SIGTERM
sleep 4
echo 'done'
b.sh
#!/bin/bash
sleep 4;
echo 'b done'
If I execute a.sh and then from another window kill the process group with kill -- -PGID, the SIGTERM is ignored and not passed on to b.sh. But if I do trap cleanup SIGTERM, the SIGTERM passes through and terminates b.sh. Why is my trap passing the signal in one case and not the other?
This is interesting. Quoting man 7 signal:
A child created via fork(2) inherits a copy of its parent's signal dispositions. During an execve(2), the dispositions of handled signals are reset to the default; the dispositions of ignored signals are left unchanged.
In your case, the child always receives TERM by virtue of being in the same process group. The problem is, what does it do with it.
When the parent ignores TERM, by the rule above, so does the child, so the child survives. When the parent catches TERM, the child's disposition is reset to the default at exec time, so TERM terminates it.
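The rule from signal(7) can be checked with a child that sends TERM to itself. This is a sketch under the assumption that bash is on the PATH; the child script and the variable names are made up for the demonstration and stand in for b.sh.

```shell
#!/bin/bash
# Hypothetical child: sends TERM to itself, then reports if it is still alive.
# Single quotes keep $$ unexpanded so it refers to the child's own PID.
child='kill -TERM $$; echo "child survived"'

# Parent ignores TERM: the ignored disposition survives exec,
# so the child ignores the signal as well.
with_ignore=$(trap '' TERM; bash -c "$child")

# Parent handles TERM: handled signals are reset to the default on exec,
# so the child is terminated before it can print anything.
with_handler=$(trap 'echo "parent caught TERM"' TERM; bash -c "$child")

echo "parent ignores TERM: '$with_ignore'"
echo "parent handles TERM: '$with_handler'"
```

The first result should contain child survived and the second should be empty, mirroring the difference between trap '' SIGTERM and trap cleanup SIGTERM in a.sh.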
From the trap man page:
trap [action condition ...]
If action is null ( "" ), the shell shall ignore each specified condition if it arises.
So when you execute trap '' SIGTERM, the SIGTERM condition is ignored. Try using a space as the action instead and see if it works:
sleep 5
trap ' ' SIGTERM # Note the space (' ')!!
echo 'cant stop wont stop'
./b.sh
echo 'can stop will stop'
trap - SIGTERM
sleep 4
echo 'done'

trapping shell exit code

I am working on a shell script, and want to handle various exit codes that I might come across. To try things out, I am using this script:
#!/bin/sh
echo "Starting"
trap "echo \"first one\"; echo \"second one\"; " 1
exit 1;
I suppose I am missing something, but it seems I can't trap my own "exit 1". If I try to trap 0 everything works out:
#!/bin/sh
echo "Starting"
trap "echo \"first one\"; echo \"second one\"; " 0
exit
Is there anything I should know about trapping HUP (1) exit code?
trap dispatches on signals the process receives (e.g., from a kill), not on exit codes, with trap ... 0 being reserved for process exit. trap /blah/blah 0 will dispatch on either exit 0 or exit 1.
exit 1 just sets an exit code; it doesn't mean HUP. So your trap ... 1 is waiting for SIGHUP, but the exit is just an exit.
In addition to the system signals which you can list by doing trap -l, you can use some special Bash sigspecs: ERR, EXIT, RETURN and DEBUG. In all cases, you should use the name of the signal rather than the number for readability.
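To act on different exit codes, one option is a single EXIT trap that inspects $? when it runs. The sketch below assumes a handler named on_exit, which is a hypothetical name; each case runs in a command substitution so the calling shell itself does not exit.

```shell
#!/bin/bash
# Hypothetical handler: on entry, $? still holds the status the shell
# is exiting with, so save it before running anything else.
on_exit() {
    code=$?
    case $code in
        0) echo "finished cleanly" ;;
        1) echo "exited with 1" ;;
        *) echo "exited with $code" ;;
    esac
}

out0=$(trap on_exit EXIT; exit 0)
out1=$(trap on_exit EXIT; exit 1)
out7=$(trap on_exit EXIT; exit 7)

echo "$out0"   # finished cleanly
echo "$out1"   # exited with 1
echo "$out7"   # exited with 7
```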
You can also use the || operator: in a || b, b is executed only when a fails.
#!/bin/sh
# Report the failure and abort the script.
failed()
{
    echo "Failed $*"
    exit 1
}
dosomething arg1 || failed "some comments"

Resources