bash: Why can't I set a trap for SIGINT in a background shell?

Here's a simple program that registers two trap handlers and then displays them with trap -p. Then it does the same thing, but in a child background process.
Why does the background process ignore the SIGINT trap?
#!/bin/bash
echo "Traps on startup:"
trap -p
echo ""
trap 'echo "Received INT"' INT
trap 'echo "Received TERM"' TERM
echo "Traps set on parent:"
trap -p
echo ""
(
echo "Child traps on startup:"
trap -p
echo ""
trap 'echo "Child received INT"' INT
trap 'echo "Child received TERM"' TERM
echo "Traps set on child:"
trap -p
echo ""
) &
child_pid=$!
wait $child_pid
Output:
$ ./show-traps.sh
Traps on startup:
Traps set on parent:
trap -- 'echo "Received INT"' SIGINT
trap -- 'echo "Received TERM"' SIGTERM
Child traps on startup:
Traps set on child:
trap -- 'echo "Child received TERM"' SIGTERM

SIGINT and SIGQUIT are ignored in backgrounded processes (unless they're backgrounded with set -m on). It's a (weird) POSIX requirement (see "2. Shell Command Language" or my SO question "Why do shells ignore SIGINT and SIGQUIT in backgrounded processes?" for more details).
Additionally, POSIX requires that:
When a subshell is entered, traps that are not being ignored shall be
set to the default actions, except in the case of a command
substitution containing only a single trap command ..
However, even if you set the INT handler in the subshell again after it was reset, the subshell won't be able to receive it because it's ignored (you can try it or you can inspect the signal ignore mask using ps, for example).
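One way to try this without ps is the rough sketch below. It assumes a non-interactive bash script (so job control is off); the sleep command, the 30-second duration and the variable names are only illustrative and not part of the original script. It checks that a backgrounded command survives SIGINT but is still terminated by SIGTERM:

```shell
# Sketch: job control is off in scripts, so '&' children start with SIGINT ignored.
sleep 30 &
pid=$!
sleep 1                      # give the child time to start

kill -INT "$pid"             # ignored by the backgrounded sleep
sleep 1
if kill -0 "$pid" 2>/dev/null; then
  survived_int=yes           # still alive after SIGINT
else
  survived_int=no
fi

kill -TERM "$pid"            # SIGTERM is not ignored
status=0
wait "$pid" || status=$?     # 128 + signal number when killed by a signal
echo "survived SIGINT: $survived_int; wait status after SIGTERM: $status"
```

If the same lines are run after set -m, the child is no longer started with SIGINT ignored, so the first kill should already end it.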

Background jobs are not supposed to be tied to the shell that started them. If you exit a shell, they will continue running. As such, they should not be interrupted by SIGINT, at least not by default. When job control is enabled, this happens automatically, since background jobs run in separate process groups. When job control is disabled (generally in non-interactive shells), bash makes asynchronous commands ignore SIGINT.
The relevant parts of the documentation:
Non-builtin commands started by Bash have signal handlers set to the values inherited by the shell from its parent. When job control is not in effect, asynchronous commands ignore SIGINT and SIGQUIT in addition to these inherited handlers. Commands run as a result of command substitution ignore the keyboard-generated job control signals SIGTTIN, SIGTTOU, and SIGTSTP.
https://www.gnu.org/software/bash/manual/html_node/Signals.html
To facilitate the implementation of the user interface to job control, the operating system maintains the notion of a current terminal process group ID. Members of this process group (processes whose process group ID is equal to the current terminal process group ID) receive keyboard-generated signals such as SIGINT. These processes are said to be in the foreground. Background processes are those whose process group ID differs from the terminal’s; such processes are immune to keyboard-generated signals. Only foreground processes are allowed to read from or, if the user so specifies with stty tostop, write to the terminal. Background processes which attempt to read from (write to when stty tostop is in effect) the terminal are sent a SIGTTIN (SIGTTOU) signal by the kernel’s terminal driver, which, unless caught, suspends the process.
https://www.gnu.org/software/bash/manual/html_node/Job-Control-Basics.html

Related

bash trap propagated to command with custom signal handler

In my script I'm trapping signals in the usual way.
function on_stop {
echo 'On Stop'
sleep 10
echo 'Signalling others to exit'
trap - TERM EXIT INT
kill -s INT "$$"
}
./executable_with_custom_signal_handling &
pid=$!
trap 'on_stop' TERM EXIT INT
wait
If sleep is used instead of ./executable_with_custom_signal_handling, everything works as expected. Otherwise, ./executable_with_custom_signal_handling receives the signal immediately, in parallel with on_stop.
I am wondering whether this has something to do with the custom signal handling in the executable:
signal(SIGINT, handler)
Are there any known workarounds?
By default, the shell runs backgrounded commands with SIGINT (and SIGQUIT) ignored.
Your backgrounded sleep is not interrupted by the Ctrl-C SIGINT to the process group, then, because it never sees the signal. When your custom executable installs a new signal action, replacing SIG_IGN, that executable will receive the SIGINT.

How can I catch SIGINT without closing xterm?

Here is my script:
#!/bin/bash
trap '' SIGINT
xterm &
wait
I run it and an xterm pops up. Then I focus my keyboard on the originating terminal window and hit ^C. I would like nothing to happen, but instead the child xterm goes away.
(Ideally, I want to install my own trap handler, but this is a baby step)
Using disown after forking xterm detaches the xterm from the parent and then ^C doesn't do anything to the xterm, but then wait doesn't work.
I just want to block SIGINT from getting to xterm.
When you send SIGINT to a bash script, the signal is propagated to the process currently running in the script, and then the command in the trap is executed. This interrupts wait, so you must run wait again.
You must also make sure that all jobs are launched in their own process groups (set -m). From the set man page:
set -m
Monitor mode. Job control is enabled. This option is on by default for interactive shells on systems that support it (see JOB CONTROL above). Background processes run in a separate process group and a line containing their exit status is printed upon their completion.
#!/bin/bash
set -m
trap 'R=true' SIGINT
xterm &
while : ; do
R=false
wait
[[ $R == true ]] || break
done
You can see the commands it runs by adding the -x option to the shebang line.
Pressing CTRL+C sends SIGINT to every process in the foreground process group, so xterm goes away too. You can use setsid to run xterm in a new session, and therefore in a different process group.
#!/bin/bash
trap 'echo "Caught SIGINT"' SIGINT
setsid xterm &
wait
wait will be interrupted by SIGINT too. So if you want to keep waiting after pressing CTRL+C, you need to call wait again, as @fbohorquez suggested.
#!/bin/bash
trap 'R=true;echo "Caught SIGINT"' SIGINT
setsid xterm &
while : ; do
R=false
wait
[ $R == false ] && break
done
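To see the re-wait loop work without xterm or a keyboard, here is a small self-test sketch. Assumptions that are not in the answers above: SIGUSR1 stands in for SIGINT, sleep 3 stands in for xterm, and a helper subshell plays the role of the user pressing CTRL+C after one second.

```shell
interrupted=0
trap 'interrupted=1' USR1      # stand-in for the SIGINT trap

sleep 3 &                      # stand-in for xterm
job=$!

# Helper: signal this script after one second, like a CTRL+C would.
( sleep 1; kill -USR1 $$ ) &

waits=0
while :; do
  interrupted=0
  status=0
  wait "$job" || status=$?     # > 128 if a trapped signal interrupted wait
  waits=$((waits + 1))
  [ "$interrupted" -eq 1 ] || break   # no signal seen: the job really ended
done

trap - USR1
echo "wait was called $waits times; final status $status"
```

The first wait returns early because of the signal, the trap sets the flag, and the second wait runs until the job has really finished.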

Forwarding signals in bash script which is submitted on the cluster

I have a launch.sh script which I submit on the cluster with
bsub $settings < launch.sh
Simplified, the launch.sh bash script looks like this:
function trap_with_arg() {
func="$1" ; shift
for sig ; do
echo "$ES Installing trap for signal $sig"
trap "$func $sig" "$sig"
done
}
function signalHandler() {
# do stuff depending on what stage the script is in
: # no-op placeholder; a function body containing only a comment is a syntax error
}
# Setup the Trap
trap_with_arg signalHandler SIGINT SIGTERM SIGUSR1 SIGUSR2
./start.sh
mpirun process.sh
./end.sh
Where process.sh calls two binaries (as an example) as
./binaryA
./binaryB
My question is the following:
The cluster already sends SIGUSR1 (approx. 10min before SIGTERM) to the process (I think this is the bash shell running my launch.sh script).
At the moment I catch this signal in the launch.sh script and call a signal handler. The problem is that, as far as I know, this signal handler only gets executed after the running command has finished (e.g. mpirun process.sh or ./start.sh).
How can I forward these signals so that the commands/binaries exit gracefully, for example to process.sh? (As I have experienced, mpirun already forwards these received signals somehow. How does it do that?)
What is the proper way of forwarding signals, e.g. also to the binaries binaryA and binaryB?
I have no really good idea how to do this. Should I run the commands in the background, or create a child process?
Thanks for some enlightenment :-)
From bash manual at http://www.gnu.org/software/bash/manual/html_node/Signals.html:
If Bash is waiting for a command to complete and receives a signal for which a trap has been set, the trap will not be executed until the command completes. When Bash is waiting for an asynchronous command via the wait builtin, the reception of a signal for which a trap has been set will cause the wait builtin to return immediately with an exit status greater than 128, immediately after which the trap is executed.
Thus, the solution seems to be to place the commands in the background and use wait:
something &
wait
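A slightly fuller sketch of that idea, with the trap forwarding the signal to the child. Everything named here is an assumption for illustration: the inline bash -c job is a hypothetical stand-in for something like mpirun process.sh, the helper subshell simulates the cluster sending SIGUSR1, and the temporary file only captures the child's output.

```shell
out=$(mktemp)

# Hypothetical stand-in for a long-running job that shuts down on SIGUSR1.
bash -c 'trap "echo child got USR1; exit 0" USR1
         while :; do sleep 1; done' >"$out" &
child=$!

# Forward the signal right away instead of after the child has finished.
trap 'kill -USR1 "$child" 2>/dev/null' USR1

# Simulate the cluster sending SIGUSR1 to this script after one second.
( sleep 1; kill -USR1 $$ ) &

# wait returns early (status > 128) when a trapped signal arrives,
# so keep waiting until the child has really exited.
while kill -0 "$child" 2>/dev/null; do
  wait "$child" || true
done

trap - USR1
result=$(cat "$out")
rm -f "$out"
echo "$result"
```

Because the script sits in wait rather than in a foreground command, its trap runs as soon as the signal arrives, and the child gets the chance to exit gracefully.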

How does trap / kill work in bash on Linux?

My sample file
traptest.sh:
#!/bin/bash
trap 'echo trapped' TERM
while :
do
sleep 1000
done
$ traptest.sh &
[1] 4280
$ kill %1 <-- kill by job number works
Terminated
trapped
$ traptest.sh &
[1] 4280
$ kill 4280 <-- kill by process id doesn't work?
(sound of crickets, process isn't killed)
If I remove the trap statement completely, kill process-id works again?
Running some RHEL 2.6.18-194.11.4.el5 at work. I am really confused by this behaviour, is it right?
kill [pid]
sends the TERM signal exclusively to the specified PID.
kill %1
sends the TERM signal to job #1's entire process group, in this case to the script's PID plus its children (sleep).
I've verified that with strace on the sleep process and on the script process.
Anyway, someone had a similar problem here (but with SIGINT instead of SIGTERM): http://www.vidarholen.net/contents/blog/?p=34.
Quoting the most important sentence:
kill -INT %1 sends the signal to the job’s process group, not the backgrounded pid!
This is expected behavior. The default signal sent by kill is SIGTERM, which your trap catches. Consider this:
#!/bin/bash
# traptest.sh
trap "echo Booh!" SIGINT SIGTERM
echo "pid is $$"
while : # This is the same as "while true".
do
a=1
done
(sleep really creates a new process and the behavior is clearer with my example I guess).
So if you run traptest.sh in one terminal and kill TRAPTEST_PROCESS_ID from another terminal, output in the terminal running traptest will be Booh! as expected (and the process will NOT be killed). If you try sending kill -s HUP TRAPTEST_PROCESS_ID, it will kill the traptest process.
This should clear up the %1 confusion.
Note: the code example is taken from tldp
Davide Berra explained the difference between kill %<jobspec> and kill <PID>, but not how that difference results in what you observed. After all, Unix signal handlers should be called pretty much instantaneously, so why does sending a SIGTERM to the script alone not trigger its trap handler?
The bash man page explains why, in the last paragraph of the SIGNALS section:
If bash is waiting for a command to complete and receives a signal for
which a trap has been set, the trap will not be executed until the
command completes.
So, the signal was delivered immediately, but the handler execution was deferred until sleep exited.
Hence, with kill %<jobspec>:
Both the script and sleep received SIGTERM
bash registered the signal, noticed that a trap was set for it, and queued the handler for future execution
sleep exited immediately
bash noted sleep's exit, and ran the trap handler
whereas with kill <script_PID>:
Only the script received SIGTERM
bash registered the signal, noticed that a trap was set for it, and queued the handler for future execution
sleep exited after 1000 seconds
bash noted sleep's exit, and ran the trap handler
Obviously, you didn't wait long enough to see that last bit. :)
If you're interested in the gory details, download the bash source code and look in trap.c, specifically the trap_handler() and run_pending_traps() functions.
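The deferral is easy to measure. A rough sketch, assuming the inline bash -c script below as a hypothetical stand-in for traptest.sh, with a 3-second sleep instead of 1000 so the result shows up quickly:

```shell
start=$(date +%s)

# Stand-in for traptest.sh: a TERM trap plus foreground sleeps.
bash -c 'trap "echo trapped; exit 0" TERM
         sleep 3
         sleep 3' &
pid=$!

sleep 1
kill -TERM "$pid"        # like "kill <script_PID>": only the script gets it

wait "$pid"              # returns once the trap has run and the script exits
elapsed=$(( $(date +%s) - start ))
echo "script exited after about ${elapsed}s"
```

The signal is sent after one second, but the script should only exit after roughly three, once the first sleep has finished and bash gets around to running the handler. The second sleep never starts.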

Unable to trap SIGINT signal in a background shell

I am unable to trap a signal when running in a child / background process.
Here is my simple bash script:
#!/bin/bash
echo "in child"
trap "got_signal" SIGINT
function got_signal {
echo "trapped"
exit 0
}
while [ true ]; do
sleep 2
done
When I run this and later do
kill -SIGINT (pid)
everything works as expected, it prints trapped and exits.
Now, if I start the same script from a parent script like this:
#!/bin/bash
echo "starting the child"
./child.sh &
Then the child does not trap the signal anymore. Why?
After changing to SIGTERM instead of SIGINT, it seems to work correctly. Why is that?
The bash manpage on OSX (but it should be the same in other versions) has this to say about signal handling:
Non-builtin commands run by bash have signal handlers set to the values
inherited by the shell from its parent. When job control is not in
effect, asynchronous commands ignore SIGINT and SIGQUIT in addition to
these inherited handlers.
and further on, under the trap command:
Signals ignored upon entry to the shell cannot
be trapped or reset.
Since scripts don't use job control by default, this is exactly the case you're describing: the backgrounded child starts with SIGINT ignored, so it cannot trap it.
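This can be checked with a short sketch. The inline bash -c script is a hypothetical stand-in for child.sh with an added TERM trap; the temporary file and the variable names are only for illustration.

```shell
out=$(mktemp)

# Stand-in for child.sh, started in the background by a parent script.
bash -c 'trap "echo got INT" INT
         trap "echo got TERM; exit 0" TERM
         while :; do sleep 1; done' >"$out" &
pid=$!

sleep 1
kill -INT "$pid"     # SIGINT was ignored on entry, so the INT trap never fires
sleep 2
kill -TERM "$pid"    # SIGTERM was not ignored, so this trap works
wait "$pid"

result=$(cat "$out")
rm -f "$out"
echo "$result"
```

Only the TERM handler produces output, which matches the observation that switching from SIGINT to SIGTERM makes the child behave as expected.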
Per your note:
Signals ignored upon entry to the shell cannot be trapped or reset.
I have noticed that ZSH does not ignore the signals sent back and forth between parent and child process, but bash does. Here's the question I posted myself:
Trapping CHLD signal - ZSH works but ksh/bash/sh don't?
