I'm confused about forwarding signals to child processes with traps. Say I have two scripts:
a.sh
#!/bin/bash
# print the process id
echo $$
cleanup() {
rv=$?
echo "cleaning up $rv"
exit
}
sleep 5
trap '' SIGTERM # trap cleanup SIGTERM
echo 'cant stop wont stop'
./b.sh
echo 'can stop will stop'
trap - SIGTERM
sleep 4
echo 'done'
b.sh
#!/bin/bash
sleep 4;
echo 'b done'
If I execute a.sh and then from another window kill the process group with kill -- -PGID, the SIGTERM is ignored and not passed on to b.sh. But if I do trap cleanup SIGTERM, the SIGTERM passes through and terminates b.sh. Why is my trap passing the signal in one case and not the other?
This is interesting. Quoting man 7 signal:
A child created via fork(2) inherits a copy of its parent's signal dispositions. During an execve(2), the dispositions of handled signals are reset to the default; the dispositions of ignored signals are left unchanged.
In your case, the child always receives TERM by virtue of being in the same process group. The question is what it does with it.
When the parent ignores TERM, by the rule above, so does the child, so the child survives. When the parent catches TERM, the child's disposition is reset to the default on exec, and the default action for TERM is to terminate, so the child dies.
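The inheritance rule can be checked directly. The following is a minimal sketch, assuming bash; `probe` is a hypothetical helper that is not part of the original scripts. It installs the given trap action in a subshell, starts a separate bash that sends SIGTERM to itself, and prints `survived` only if that child is still alive afterwards.

```shell
#!/bin/bash
# probe ACTION: hypothetical helper, not from the original scripts.
probe() {
    (
        trap "$1" TERM
        # bash -c is a separate program started via execve, so the
        # signal(7) rule about ignored vs. handled signals applies to it.
        bash -c 'kill -TERM $$; echo survived'
    ) 2>/dev/null   # quiet any "Terminated" notice the subshell prints
}

probe ''                                 # ignored in parent: child stays alive
probe ':' || echo "child was terminated" # handled in parent: child gets the default
```

With an empty action the child inherits "ignore"; with any non-empty action (here `:`) the child starts with the default disposition and is killed by its own SIGTERM.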
From the trap man page:
trap [action condition ...]
If action is null ( "" ), the shell shall ignore each specified condition if it arises.
So when you execute trap '' SIGTERM, the SIGTERM condition is ignored. Try a non-empty action, such as a single space, instead and see if it works:
sleep 5
trap ' ' SIGTERM # Note the space (' ')!!
echo 'cant stop wont stop'
./b.sh
echo 'can stop will stop'
trap - SIGTERM
sleep 4
echo 'done'
I have a shell script which starts a bunch of child processes in the background. If any of them fails, I want to terminate all child processes and the parent process with kill -- -$$.
I attempted to make a simpler signal handler function check_exit_code() in the parent process to see if it would work fine. It is called every time a child process terminates, using trap and SIGCHLD signal.
#!/bin/sh
set -o monitor
function check_exit_code {
if [ $1 -eq "0" ]
then
echo "Success: $1"
else
echo "Fail: $1"
fi
}
trap "check_exit_code $?" SIGCHLD
mycommand1 &
mycommand2 &
mycommand3 &
...
wait
Unfortunately, this only returns Success: 0 even when mycommand# failed and its exit code was 2, so I changed the function to the following.
#!/bin/sh
set -o monitor
function check_exit_code() {
local EXIT_STATUS=$?
if [ "$EXIT_STATUS" -eq "0" ]
then
echo "Success: $EXIT_STATUS"
else
echo "Fail: $EXIT_STATUS"
fi
}
trap "check_exit_code" SIGCHLD
mycommand1 &
mycommand2 &
mycommand3 &
...
wait
This only returns Fail: 145, which mycommand# cannot return. I suspect that when I write $? I receive the exit status of another command. What is the problem? How would you fix it?
The problem with your first attempt is the double quotes around check_exit_code $?. The shell expands $? to zero when it sets the trap, so a SIGCHLD always triggers check_exit_code 0, hence the constant Success: 0 output.
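The effect of the quoting can be seen by asking the shell what it actually stored. This is a small sketch, assuming bash; the function names and the use of USR1 are illustrative only. `trap -p` prints the action text registered for a signal.

```shell
#!/bin/bash
# Both functions run in a subshell so the demo trap does not leak out.
expanded_early() {
    (
        ! true                        # leaves $? set to 1
        trap "echo status=$?" USR1    # double quotes: $? is replaced by 1 right now
        trap -p USR1                  # show the stored action
    )
}

expanded_late() {
    (
        ! true
        trap 'echo status=$?' USR1    # single quotes: $? is stored literally
        trap -p USR1
    )
}

expanded_early
expanded_late
```

The first stored action contains the fixed text `status=1`, while the second still contains `$?`, which is only evaluated when the trap runs.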
Regarding your second attempt, upon entry to the trap action, the special parameter ? holds the most recent pipeline's exit status as usual, which in this case is that of wait. Since a trap was set for SIGCHLD, the shell interrupts wait and assigns 128 + 17 (17 being the signal number of SIGCHLD on Linux) to ? upon receiving that signal.
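A status above 128 can be mapped back to a signal name with `kill -l`. The helper below is a hypothetical sketch, assuming bash; it is only meaningful for statuses that really are 128 plus a valid signal number (a plain exit code such as 255 would make `kill -l` report an error).

```shell
#!/bin/bash
# describe_status N: hypothetical helper that decodes an exit status.
describe_status() {
    local status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l accepts an exit status and prints the matching signal name
        echo "signal $(kill -l "$status")"
    else
        echo "exit code $status"
    fi
}

describe_status 2                              # prints: exit code 2
describe_status $((128 + $(kill -l TERM)))     # prints: signal TERM
```

On a system where SIGCHLD is 17, `describe_status 145` would report CHLD in the same way.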
With bash-5.1.4 or a newer version, you can achieve the desired result like so:
while true; do
wait -n -p pid
case $?,$pid in
( 0* ) # a job has exited with zero
continue ;;
( *, ) # pid is empty, no jobs left
break ;;
( * ) # a job has exited with non-zero
trap "exit $?" TERM
kill 0
esac
done
Short version
In a Bash script, I activate a trap, and later deactivate it by calling trap - EXIT ERR SIGHUP SIGINT SIGTERM. When I do the deactivation directly in the script, it works. However, when I put the exact same line of code in a Bash function, it is ignored, i.e. the trap is still activated if, later, a command returns an exit code different from zero. Why?
Long version
I have a bunch of functions to work with traps:
trap_stop()
{
echo "trap_stop"
trap - EXIT ERR SIGHUP SIGINT SIGTERM
}
trap_terminate()
{
local exitCode="$?"
echo "trap_terminate"
trap_stop
local file="${BASH_SOURCE[1]}"
local stack=$(caller)
local line="${stack% *}"
if [ $exitCode == 0 ]; then
echo "Finished."
else
echo "The initialization failed with code $exitCode in $file:${line}."
fi
exit $exitCode
}
trap_start()
{
echo "trap_start"
trap "trap_terminate $LINENO" EXIT ERR SIGHUP SIGINT SIGTERM
}
When used like this:
trap_start # <- Trap started.
echo "Stopping traps."
trap_stop # <- Trap stopped before calling a command which exits with exit code 2.
echo "Performing a command which will fail."
ls /tmp/missing
exit_code="$?"
echo "The result of the check is $exit_code."
I get the following output:
trap_start
Stopping traps.
trap_stop
Performing a command which will fail.
ls: cannot access '/tmp/missing': No such file or directory
trap_terminate
trap_stop
The initialization failed with code 2 in ./init:41.
Despite the fact that the function deactivating the trap was called, the trap was still triggered when calling ls on a directory which doesn't exist.
On the other hand, when I replace the call to trap_stop by the actual trap - statement, like this:
trap_start
echo "Stopping traps."
trap - EXIT ERR SIGHUP SIGINT SIGTERM # <- This statement replaced the call to `trap_stop`.
echo "Performing a command which will fail."
ls /tmp/missing
exit_code="$?"
echo "The result of the check is $exit_code."
then the output is correct, i.e. the trap is not activated and I reach the end of the script.
trap_start
Stopping traps.
Performing a command which will fail.
ls: cannot access '/tmp/missing': No such file or directory
The result of the check is 2.
Why does moving trap - into a function make it stop working?
EDIT (courtesy of @KamilCuk): If your bash is older than 4.4, upgrade it; that could solve the problem.
I added some debugging to your code:
echo "Stopping traps."
trap -p
trap_stop # <- Trap stopped before calling a command which exits with exit code 2.
trap -p
And got:
Stopping traps.
trap -- 'trap_terminate 29' EXIT
trap -- 'trap_terminate 29' SIGHUP
trap -- 'trap_terminate 29' SIGINT
trap -- '' SIGFPE
trap -- 'trap_terminate 29' SIGTERM
trap -- '' SIGXFSZ
trap -- '' SIGPWR
trap -- 'trap_terminate 29' ERR
trap_stop
trap -- '' SIGFPE
trap -- '' SIGXFSZ
trap -- '' SIGPWR
trap -- 'trap_terminate 29' ERR
As you can see, the trap - part does work, except for the ERR condition.
After some man page time:
echo "Stopping traps."
set -E
trap_stop # <- Trap stopped before calling a command which exits with exit code 2.
yields:
trap_start
Stopping traps.
trap_stop
Performing a command which will fail.
ls: cannot access '/tmp/missing': No such file or directory
The result of the check is 2.
The relevant part of bash(1):
-E
If set, any trap on ERR is inherited by shell functions, command substitutions, and commands executed in a subshell environment. The ERR trap is normally not inherited in such cases.
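The documented effect of -E can be reproduced in isolation. This is a minimal sketch, assuming bash; `err_in_function` and its `with-E` argument are made up for the demonstration. A command fails inside a function while an ERR trap is set by the caller.

```shell
#!/bin/bash
# err_in_function MODE: hypothetical demo; MODE is "with-E" or anything else.
# Everything runs in a subshell so the option and the trap stay local.
err_in_function() {
    (
        if [ "$1" = with-E ]; then set -E; fi
        trap 'echo "ERR trap fired"' ERR
        f() { false; return 0; }   # fails inside f, but f itself returns 0
        f
    )
}

err_in_function without-E   # prints nothing: the trap is not inherited by f
err_in_function with-E      # prints: ERR trap fired
```

Because `f` returns 0, the caller never sees a failure; any output therefore comes from the trap running inside the function.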
That said, this seems to be a bug in bash:
#!/bin/bash
t1()
{
trap 'echo t1' ERR
}
t2()
{
trap 'echo t2' ERR
}
t1
false
t2
false
yields:
t1
t1
whereas I'd expect at the very least:
t1
t2
I have a function that waits for the given pids:
waitpid() {
wait "$@";
}
I can wait for jobs this way:
sleep 5 &
spid=$!
waitpid $spid
echo $?
0
It works. However, I want to capture the output of the function in a variable:
sleep 5 &
spid=$!
spid_status=$(waitpid $spid)
echo $?
255
[1]+ Done sleep 5
This does not work because $() starts a new subshell, and wait can only wait for processes that are children of the shell it runs in.
Is there a workaround to that? I would like to have a function that waits for subshells to finish and returns all the exit status like this:
waitpids() {
local pids_status
for pid in "$@"; do
wait "$pid";
pids_status+=($?)
done
echo "${pids_status[@]}"
}
To avoid creating a subshell you'd have to use a redirect instead. The easiest is probably to create a temporary file to hold the data:
trap 'rm -f "$stdout"' EXIT # Optional
stdout="$(mktemp)"
waitpid "$spid" > "$stdout"
spid_status=$?
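Applied to the multi-PID function from the question, the redirect approach could look like the sketch below. It assumes bash 4 or newer (for `mapfile`); `collect_statuses` and the global array `statuses` are hypothetical names introduced here.

```shell
#!/bin/bash
# waitpids PID...: print one exit status per line, in argument order.
waitpids() {
    local pid rc
    for pid in "$@"; do
        rc=0
        wait "$pid" || rc=$?
        echo "$rc"
    done
}

# collect_statuses PID...: hypothetical wrapper; fills the global array "statuses".
collect_statuses() {
    local tmp
    tmp=$(mktemp) || return 1
    waitpids "$@" > "$tmp"        # plain redirect: wait runs in the current shell
    mapfile -t statuses < "$tmp"  # read the lines back into the array
    rm -f "$tmp"
}

(exit 0) &
first=$!
(exit 3) &
second=$!
collect_statuses "$first" "$second"
echo "${statuses[*]}"             # prints: 0 3
```

The only command substitution left is the one around `mktemp`, which does not need to wait for anything.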
I have the following 2 scripts
Parent.sh
#!/usr/bin/ksh
echo "In Parent : Before"
Child.sh
echo "In Parent : After"
read
Child.sh
#!/usr/bin/ksh
function quit_handler
{
echo "Quit on Child"
stty $origtermconfig
exit
}
origtermconfig="$(stty -g)"
trap quit_handler INT
while true
do
echo "Child Says Hi"
echo "Child PID is" $PID
echo "Parent PID is " $PPID
sleep 2
done
Below is a session transcript
u0012734@l273pp039_pub[/home/u0012734] > Parent.sh
In Parent : Before
Child Says Hi
Child PID is 16618
Parent PID is 18640
Child Says Hi
Child PID is 16618
Parent PID is 18640
Child Says Hi
Child PID is 16618
Parent PID is 18640 <----- I pressed CTRL-C Here
Quit on Child
u0012734@l273pp039_pub[/home/u0012734] >
I was expecting the parent script to continue execution of the third and fourth line of Parent.sh but that did not happen. What could be the issue? Please guide.
The below answer helped. I am also posting a link that has some good details related to SIGINTs and handling it well
When you hit Control+C (or whatever character is configured to be the INTR character), SIGINT is sent to all processes in the foreground process group. This includes the parent process in your example. Your parent isn't configured to trap on SIGINT so it terminates.
Source: POSIX.1-2008 XBD section 11.1.9
I am working on a shell script, and want to handle various exit codes that I might come across. To try things out, I am using this script:
#!/bin/sh
echo "Starting"
trap "echo \"first one\"; echo \"second one\"; " 1
exit 1;
I suppose I am missing something, but it seems I can't trap my own "exit 1". If I try to trap 0 everything works out:
#!/bin/sh
echo "Starting"
trap "echo \"first one\"; echo \"second one\"; " 0
exit
Is there anything I should know about trapping HUP (1) exit code?
trap dispatches on signals the process receives (e.g., from a kill), not on exit codes, with trap ... 0 being reserved for the shell exiting. trap /blah/blah 0 will dispatch on either exit 0 or exit 1.
exit 1 just sets an exit code; it doesn't mean HUP. So your trap ... 1 is waiting for a HUP signal, but the exit is just an exit.
In addition to the system signals which you can list by doing trap -l, you can use some special Bash sigspecs: ERR, EXIT, RETURN and DEBUG. In all cases, you should use the name of the signal rather than the number for readability.
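The difference between a signal trap and the EXIT trap can be seen side by side. This is a small sketch, assuming bash; `exit_vs_hup` is a hypothetical name. Both traps are set and the shell then calls exit 1, as in the question.

```shell
#!/bin/bash
# exit_vs_hup: hypothetical demo, run in a subshell so the traps stay local.
exit_vs_hup() {
    (
        trap 'echo "HUP trap ran"' HUP                    # needs an actual SIGHUP
        trap 'echo "EXIT trap ran with status $?"' EXIT   # runs whenever the shell exits
        exit 1
    )
}

exit_vs_hup || echo "subshell exited with $?"
```

Only the EXIT trap prints anything; the HUP trap stays silent because no SIGHUP is ever delivered.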
You can also use the || operator: in a || b, b is executed only when a fails.
#!/bin/sh
failed()
{
echo "Failed $*"
exit 1
}
dosomething arg1 || failed "some comments"