Unable to trap SIGINT signal in a background shell - bash

I am unable to trap a signal when running in a child / background process.
Here is my simple bash script:
#!/bin/bash
echo "in child"
trap "got_signal" SIGINT
function got_signal {
echo "trapped"
exit 0
}
while [ true ]; do
sleep 2
done
When I run this and later do
kill -SIGINT (pid)
everything works as expected, it prints trapped and exits.
Now, if I start the same script from a parent script like this:
#!/bin/bash
echo "starting the child"
./child.sh &
Then the child no longer traps the signal.
After switching from SIGINT to SIGTERM, it seems to work correctly. Why?

The bash manpage on OSX (but it should be the same in other versions) has this to say about signal handling:
Non-builtin commands run by bash have signal handlers set to the values
inherited by the shell from its parent. When job control is not in
effect, asynchronous commands ignore SIGINT and SIGQUIT in addition to
these inherited handlers.
and further on, under the trap command:
Signals ignored upon entry to the shell cannot
be trapped or reset.
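Since scripts don't use job control by default, this is exactly the case you're describing: the child started with & begins with SIGINT ignored, so its trap for SIGINT never takes effect. SIGTERM is not among the signals ignored for asynchronous commands, which is why trapping it works.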
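A minimal sketch of this behaviour, assuming a non-interactive bash (job control off). The inline child script and the variable names are made up for illustration. The child is started with &, so its INT trap never fires, while its TERM trap does.

```bash
#!/bin/bash
# Illustrative child script (hypothetical): tries to trap both INT and TERM.
child_script='
trap "echo got INT; exit 0" INT
trap "echo got TERM; exit 0" TERM
while :; do sleep 0.2; done
'

out=$(
  bash -c "$child_script" &   # asynchronous, job control off: SIGINT starts out ignored
  pid=$!
  sleep 0.5
  kill -INT "$pid"            # discarded by the child, the INT trap cannot take effect
  sleep 0.5
  kill -TERM "$pid"           # trapped normally
  wait "$pid"
)
echo "$out"                   # prints: got TERM
```

If the same child is started in the foreground instead, the INT trap works as it did in the original single-script test.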

Per your note:
Signals ignored upon entry to the shell cannot be trapped or reset.
I have noticed that ZSH does not ignore the signals sent back and forth between parent and child process, but bash does. Here's the question I posted myself:
Trapping CHLD signal - ZSH works but ksh/bash/sh don't?

Related

bash: Why can't I set a trap for SIGINT in a background shell?

Here's a simple program that registers two trap handlers and then displays them with trap -p. Then it does the same thing, but in a child background process.
Why does the background process ignore the SIGINT trap?
#!/bin/bash
echo "Traps on startup:"
trap -p
echo ""
trap 'echo "Received INT"' INT
trap 'echo "Received TERM"' TERM
echo "Traps set on parent:"
trap -p
echo ""
(
echo "Child traps on startup:"
trap -p
echo ""
trap 'echo "Child received INT"' INT
trap 'echo "Child received TERM"' TERM
echo "Traps set on child:"
trap -p
echo ""
) &
child_pid=$!
wait $child_pid
Output:
$ ./show-traps.sh
Traps on startup:
Traps set on parent:
trap -- 'echo "Received INT"' SIGINT
trap -- 'echo "Received TERM"' SIGTERM
Child traps on startup:
Traps set on child:
trap -- 'echo "Child received TERM"' SIGTERM
SIGINT and SIGQUIT are ignored in backgrounded processes (unless they're backgrounded with set -m on). It's a (weird) POSIX requirement (see 2. Shell Command Language or my SO question Why do shells ignore SIGINT and SIGQUIT in backgrounded processes? for more details).
Additionally, POSIX requires that:
When a subshell is entered, traps that are not being ignored shall be
set to the default actions, except in the case of a command
substitution containing only a single trap command ..
However, even if you set the INT handler in the subshell again after it was reset, the subshell won't be able to receive it because it's ignored (you can try it or you can inspect the signal ignore mask using ps, for example).
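One way to inspect the ignore mask, sketched under the assumption of a Linux system with /proc mounted (the helper name sigint_ignored is made up here). The SigIgn field of /proc/<pid>/status is a hexadecimal bitmask of ignored signals; SIGINT is signal 2, so it corresponds to bit 1.

```bash
#!/bin/bash
# Hypothetical helper, Linux-only: succeeds if process $1 currently ignores SIGINT.
sigint_ignored() {
  local mask
  mask=$(awk '/^SigIgn:/ { print $2 }' "/proc/$1/status") || return 2
  [ -n "$mask" ] || return 2
  # Signal N maps to bit N-1 of the mask; SIGINT is 2, so test bit 1.
  (( (0x$mask >> 1) & 1 ))
}

# Report the state inside a backgrounded subshell (job control is off in scripts).
(
  if sigint_ignored "$BASHPID"; then
    echo "background subshell: SIGINT ignored"
  else
    echo "background subshell: SIGINT not ignored"
  fi
) &
wait $!
```

On other systems the ps column that exposes this mask has a different name, so check the local ps manual rather than relying on this sketch.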
Background jobs are not supposed to be tied to the shell that started them. If you exit a shell, they will continue running. As such they shouldn't be interrupted by SIGINT, not by default. When job control is enabled, that is fulfilled automatically, since background jobs are running in separate process groups. When job control is disabled (generally in non-interactive shells), bash makes the asynchronous commands ignore SIGINT.
The relevant parts of the documentation:
Non-builtin commands started by Bash have signal handlers set to the values inherited by the shell from its parent. When job control is not in effect, asynchronous commands ignore SIGINT and SIGQUIT in addition to these inherited handlers. Commands run as a result of command substitution ignore the keyboard-generated job control signals SIGTTIN, SIGTTOU, and SIGTSTP.
https://www.gnu.org/software/bash/manual/html_node/Signals.html
To facilitate the implementation of the user interface to job control, the operating system maintains the notion of a current terminal process group ID. Members of this process group (processes whose process group ID is equal to the current terminal process group ID) receive keyboard-generated signals such as SIGINT. These processes are said to be in the foreground. Background processes are those whose process group ID differs from the terminal’s; such processes are immune to keyboard-generated signals. Only foreground processes are allowed to read from or, if the user so specifies with stty tostop, write to the terminal. Background processes which attempt to read from (write to when stty tostop is in effect) the terminal are sent a SIGTTIN (SIGTTOU) signal by the kernel’s terminal driver, which, unless caught, suspends the process.
https://www.gnu.org/software/bash/manual/html_node/Job-Control-Basics.html
More on it here.

Forwarding signals in bash script which is submitted on the cluster

I have a launch.sh script which I submit on the cluster with
bsub $settings < launch.sh
This launch.sh bash script looks simplified as the following:
function trap_with_arg() {
func="$1" ; shift
for sig ; do
echo "$ES Installing trap for signal $sig"
trap "$func $sig" "$sig"
done
}
function signalHandler() {
# do stuff depending on what stage the script is in
: # placeholder command; a function body containing only a comment is a syntax error
}
# Setup the Trap
trap_with_arg signalHandler SIGINT SIGTERM SIGUSR1 SIGUSR2
./start.sh
mpirun process.sh
./end.sh
Where process.sh calls two binaries (as an example) as
./binaryA
./binaryB
My question is the following:
The cluster already sends SIGUSR1 (approx. 10min before SIGTERM) to the process (I think this is the bash shell running my launch.sh script).
At the moment I catch this signal in the launch.sh script and call a signal handler. The problem is that, as far as I know, this signal handler only gets executed after the currently running command has finished (e.g. mpirun process.sh or ./start.sh).
How can I forward these signals so that the commands/binaries exit gracefully, for example to process.sh? (mpirun, in my experience, already forwards the signals it receives somehow; how does it do that?)
What is the proper way of forwarding signals, e.g. also to the binaries binaryA and binaryB?
I have no good idea how to do this. By running the commands in the background, or by creating a child process?
Thanks for some enlightenment :-)
From bash manual at http://www.gnu.org/software/bash/manual/html_node/Signals.html:
If Bash is waiting for a command to complete and receives a signal for which a trap has been set, the trap will not be executed until the command completes. When Bash is waiting for an asynchronous command via the wait builtin, the reception of a signal for which a trap has been set will cause the wait builtin to return immediately with an exit status greater than 128, immediately after which the trap is executed.
Thus, the solution seems to be to place the commands in the background and use "wait":
something &
wait
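A slightly fuller sketch of that idea, assuming bash. The names relay, run_forwarding and child_pid are hypothetical helpers, not part of the original launch.sh. The command runs in the background, the shell blocks in wait, and each trapped signal is relayed to the child as soon as it arrives.

```bash
#!/bin/bash
# Hypothetical helpers: run a command in the background and relay
# selected signals to it while waiting for it to finish.
child_pid=""

relay() {
  # Forward signal $1 to the child, if one is running.
  [ -n "$child_pid" ] && kill -s "$1" "$child_pid" 2>/dev/null
}

run_forwarding() {
  local status sig
  for sig in TERM USR1 USR2; do
    trap "relay $sig" "$sig"
  done
  "$@" &
  child_pid=$!
  while :; do
    # wait returns a status > 128 when a trapped signal interrupts it;
    # the trap (and therefore relay) runs right after wait returns.
    wait "$child_pid"
    status=$?
    # Keep waiting as long as the child still exists.
    kill -0 "$child_pid" 2>/dev/null || break
  done
  trap - TERM USR1 USR2
  child_pid=""
  return "$status"
}
```

In launch.sh this would be used as run_forwarding mpirun process.sh. Two caveats: if the child exits in the instant between the interrupted wait and the kill -0 check, the returned status may be that of the interrupted wait rather than the child's own; and the sketch only signals the direct child, so whether binaryA and binaryB see the signal depends on what that child does with it.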

Basic signal communication

I have a bash script, its contents are:
function foo {
echo "Foo!"
}
function clean {
echo "exiting"
}
trap clean EXIT
trap foo SIGTERM
echo "Starting process with PID: $$"
while :
do
sleep 60
done
I execute this on a terminal with:
./my_script
And then do this on another terminal
kill -SIGTERM my_script_pid # obviously the PID is the one echoed from my_script
I would expect to see the message "Foo!" from the other terminal, but it's not working. SIGKILL works and the EXIT code is also executed.
Using Ctrl-C on the terminal my_script is running on triggers foo normally, but somehow I can't send the signal SIGTERM from another terminal to this one.
Replacing SIGTERM with any other signal doesn't change a thing (besides Ctrl-C not triggering anything, it was actually mapped to SIGUSR1 in the beginning).
It may be worth mentioning that just the signal being trapped is not working, and any other signal is having the default behaviour.
So, what am I missing? Any clues?
EDIT: I also just checked it wasn't a privilege issue (that would be weird as I'm able to send SIGKILL anyway), but it doesn't seem to be that.
Bash runs the trap only after sleep returns.
To understand why, think in C / Unix internals: while the signal is dispatched instantly to bash, the corresponding signal handler that bash has set up only does something like received_sigterm = true.
Only when sleep returns, and the wait system call which bash issued after starting the sleep process returns also, bash resumes its normal work and executes your trap (after noticing received_sigterm).
This is done this way for good reasons: only async-signal-safe functions may be called from a signal handler, so doing general work there (buffered I/O, memory allocation, running arbitrary shell code) results in undefined behaviour.
Apart from this technical reason, there is another reason why bash doesn't run the trap instantly: This would actually undermine the fundamental semantics of the shell. Jobs (this includes pipelines) are executed strictly in a sequential manner unless you explicitly mess with background jobs.
The PID that you originally print is for the bash instance that executes your script, not for the sleep process that it is waiting on. While sleep is running, the signal is not lost, but its trap is deferred until sleep returns.
If you want to see the effect that you are looking for, replace sleep with a shorter-lived process like ps.
function foo {
echo "Foo!"
}
function clean {
echo "exiting"
}
trap clean EXIT
trap foo SIGTERM
echo "Starting process with PID: $$"
while :
do
ps > /dev/null
done
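An alternative to a busy loop, sketched here assuming bash (on_term, got_term and interruptible_sleep are hypothetical names): run sleep in the background and block in wait instead, because the wait builtin returns as soon as a trapped signal arrives.

```bash
#!/bin/bash
got_term=0
on_term() {
  got_term=1
  echo "Foo!"
}
trap on_term TERM

# Hypothetical helper: sleep in the background so that the shell blocks
# in wait, which a trapped signal can interrupt.
interruptible_sleep() {
  local pid status
  sleep "$1" &
  pid=$!
  wait "$pid"
  status=$?
  # A status above 128 means wait was interrupted; stop the leftover sleep
  # and reap it quietly.
  if (( status > 128 )); then
    kill "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null
  fi
  return "$status"
}
```

In the loop, sleep 60 would become interruptible_sleep 60, so that kill -SIGTERM on the script's PID runs the trap right away instead of up to a minute later.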

How do I stop a signal from killing my Bash script?

I want an infinite loop to keep on running, and only temporarily be interrupted by a kill signal. I've tried SIGINT, SIGUSR1, SIGUSR2. All of them seem to halt the loop. I even tried SIGINFO, but that wasn't supported by Linux.
#!/bin/bash
echo $$ > /tmp/pid # Save the pid
function do_something {
echo "I am doing stuff" #let's do this now, and go back to doing the thing that is to be done over and over again.
#exit
}
while :
do
echo "This should be done over and over again, but always wait for something else to be done in between"
trap do_something SIGINT
while `true`
do
sleep 1 #so we're waiting for that other thing.
done
done
My code runs the function once, after getting an INT signal from another script, but then never again. It halts.
EDIT: Although I accidentally put an exit at the end of the function here on Stack Overflow, I didn't in the actual code I used. Either way, it made no difference. The solution is SIGTERM as described by Tiago.
I believe you're looking for SIGTERM:
Example:
#! /bin/bash
trap -- '' SIGINT SIGTERM
while true; do
date +%F_%T
sleep 1
done
While this example is running, neither Ctrl+C nor kill <pid> will stop it; you can, however, kill it with kill -9 <pid>.
If you don't want Ctrl+Z to interrupt it either, use: trap -- '' SIGINT SIGTERM SIGTSTP
Trap the signal, then either react to it appropriately in the function associated with the trap, or ignore it, for example by associating : as the command to be executed when the signal occurs.
To trap signals, bash provides the trap command.
Reset a trap to its former action by executing trap with the signal name only.
Therefore you want to (I think that's what you mean by "only temporarily be interrupted by a kill signal"):
Trap the signal at the beginning of your script: trap custom_action signal
Just before the phase in which the signal should be allowed to interrupt your script, execute: trap signal
At the end of that phase, trap it again with: trap custom_action signal
To specify signals, you can also use their respective signal numbers. A list of signal names is printed with the command:
trap -l
The default signal sent by kill is SIGTERM (15), unless you specify a different signal after the kill command.
Don't exit in your do_something function. Simply let the function return to the point in your code where it was interrupted when the signal occurred.
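The arm / reset / re-arm steps above can be sketched as follows, assuming bash. Note that trap takes the action first and the signal name second. custom_action and count are illustrative names, and SIGUSR1 is used only so that the script can safely signal itself.

```bash
#!/bin/bash
count=0
custom_action() { count=$((count + 1)); }

trap custom_action USR1     # arm: the signal now runs custom_action
kill -USR1 "$BASHPID"       # signal this shell; the script keeps running
echo "handled so far: $count"

trap - USR1                 # reset USR1 to its default action, which terminates the script
# ... phase in which the signal is allowed to interrupt the script ...

trap custom_action USR1     # re-arm at the end of that phase
kill -USR1 "$BASHPID"
echo "handled so far: $count"
```

The form trap - USR1 is equivalent to trap USR1 with the signal name only; the explicit dash is just harder to misread.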
The mentioned ":" command has another potential use in your script, if you feel thusly inclined:
while :
do
sleep 1
done
can be an alternative to "while true" - no backticks needed for that, btw.
If you want your script to keep running and not exit, without worrying about handling traps, you just need to ignore the exit status:
(my_command) || true
The parentheses execute that command in a subshell. The true is for compatibility with set -e, if you use it. It simply overrides the status to always report a success.
See the source.
I found this question to be helpful:
How to run a command before a Bash script exits?

shell script process termination issue

/bin/sh -version
GNU sh, version 1.14.7(1)
exitfn () {
# Restore signal handling for SIGINT
echo "exiting with trap" >> /tmp/logfile
rm -f /var/run/lockfile.pid # Growl at user,
exit # then exit script.
}
trap 'exitfn; exit' SIGINT SIGQUIT SIGTERM SIGKILL SIGHUP
The above is my function in shell script.
I want to call it in some special conditions...like
when:
"kill -9" fires on pid of this script
"ctrl + z" press while it is running on -x mode
server reboots while script is executing ..
In short, on any kind of interruption the script should perform some action,
eg. rm -f /var/run/lockfile.pid
but my above function is not working properly; it works only for terminal close or "ctrl + c"
Kindly don't suggest to upgrade "bash / sh" version.
SIGKILL cannot be trapped by the trap command, or by any process. It is a guaranteed kill signal that, by its definition, cannot be trapped. Thus upgrading your sh/bash would not help anyway.
You can't trap kill -9; that's the whole point of it: to violently destroy processes that don't respond to other signals (there's a workaround for this, see below).
The server reboot should first deliver a signal to your script which should be caught with what you have.
As for Ctrl-Z, that also sends a signal: SIGTSTP (not SIGSTOP, which, like SIGKILL, cannot be caught), so you may want to add that to your trap. That wouldn't normally be a reason to shut down your process, though, since it may simply be put into the background and resumed (with bg).
As for what to do in those situations where your process dies without a catchable signal (like the -9 case), the program should check for that on startup.
By that, I mean lockfile.pid should store the actual PID of the process that created it (by using echo $$ >/var/run/myprog_lockfile.pid for example) and, if you try to start your program, it should check for the existence of that process.
If the process doesn't exist, or it exists but isn't the right one (based on name usually), your new process should delete the pidfile and carry on as if it was never there. If the old process both exists and is the right one, your new process should log a message and exit.
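A minimal sketch of that startup check, assuming bash. The function name acquire_lock is hypothetical, and this version only checks that the recorded PID is alive; it does not compare process names and is not race-free if two instances start at the same moment.

```bash
#!/bin/bash
# Hypothetical helper: take the lock in pidfile $1, clearing it first if stale.
acquire_lock() {
  local pidfile=$1 old
  if [ -f "$pidfile" ]; then
    old=$(cat "$pidfile" 2>/dev/null)
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    if [ -n "$old" ] && kill -0 "$old" 2>/dev/null; then
      echo "already running as PID $old" >&2
      return 1
    fi
    rm -f "$pidfile"          # stale pidfile left behind, e.g. after kill -9
  fi
  echo "$$" > "$pidfile"      # record our own PID
}
```

A script would call something like acquire_lock /var/run/myprog_lockfile.pid || exit 1 near the top, and still remove the file in its trap handler for the signals that can be caught. Note that kill -0 also fails for a live process owned by another user, which this sketch would treat as stale.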

Resources