Why does my bash script take so long to respond to kill when it runs in the background? - bash

(Question revised, now that I understand more about what's actually happening):
I have a script that runs in the background, periodically doing some work and then sleeping for 30 seconds:
echo "background script PID: $$"
trap 'echo "Exiting..."' INT EXIT
while true; do
# check for stuff to do, do it
sleep 30
done &
If I try to kill this script via kill or kill -INT, it takes up to 30 seconds to respond to the signal.
I will answer this question below, since I found a good explanation online.
(My original, embarrassingly un-researched question)
This question is for a bash script that includes the following trap:
trap 'echo "Exiting...">&2; kill $childPID 2>/dev/null; exit 0' \
SIGALRM SIGHUP SIGINT SIGKILL SIGPIPE SIGPROF SIGTERM \
SIGUSR1 SIGUSR2 SIGVTALRM SIGSTKFLT
If I run the script in the foreground and hit CTRL-C, it gets the signal immediately and exits (in under one second).
If I run the same script in the background (&) and kill it via kill or kill -INT, it takes 30 seconds before it gets the signal.
Why is that, and how can I fix it?

As explained in http://mywiki.wooledge.org/SignalTrap --
"When bash is executing an external command in the foreground, it does not handle any signals received until the foreground process terminates" - and since sleep is an external command, bash does not even see the signal until sleep finishes.
That page has a very good overview of signal processing in bash, and work-arounds to this issue. Briefly, one correct way of handling the situation is to send the signal to the process group instead of just the parent process:
kill -INT -123 # will kill the process group with the ID 123
Head over to the referenced page for a full explanation (no sense in my reproducing any more of it here).
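One of the workarounds described on that page is to run sleep in the background and block in the wait builtin instead, since wait returns as soon as a trapped signal arrives. The following is a minimal sketch of that idea, not the original script: the worker function, the sleep_pid variable, and the timing code are illustrative assumptions, and SIGTERM is used because a background job started from a script may have SIGINT ignored.

```shell
#!/usr/bin/env bash
# Sketch: a polling loop that reacts to signals promptly.
# "worker" and "sleep_pid" are hypothetical names used for illustration.
worker() {
    # On SIGTERM: report, stop the pending sleep, and leave.
    trap 'echo "Exiting..."; kill "$sleep_pid" 2>/dev/null; exit 0' TERM
    while true; do
        # check for stuff to do, do it
        sleep 30 &           # run the external command in the background
        sleep_pid=$!
        wait "$sleep_pid"    # wait is a builtin, so a trapped signal interrupts it
    done
}

worker &
worker_pid=$!
sleep 1                      # give the worker time to install its trap

start=$(date +%s)
kill -TERM "$worker_pid"
wait "$worker_pid"
elapsed=$(( $(date +%s) - start ))
echo "worker exited after ${elapsed}s"
```

With a plain foreground sleep 30 in the loop, the same kill would typically not take effect until the current sleep finished.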

Possible reason: signals issued while a process is sleeping are not delivered until wake-up of the process. When started via the command line, the process doesn't sleep, so the signal gets delivered immediately.

@RashaMatt, I was unable to get the read command to work as advertised on Greg's wiki. Sending a signal to the script simply did not interrupt the read. I needed to do this:
#!/bin/bash
bail() {
echo "exiting"
kill $readpid
rm -rf $TMPDIR
exit 0
}
sig2() {
echo "doing stuff"
}
echo Shell $$ started.
trap sig2 SIGUSR2
trap bail SIGUSR1 SIGHUP SIGINT SIGQUIT SIGTERM
trap -p
TMPDIR=$(mktemp -p /tmp -d .daemonXXXXXXX)
chmod 700 $TMPDIR
mkfifo $TMPDIR/fifo
chmod 400 $TMPDIR/fifo
while : ; do
read < $TMPDIR/fifo & readpid=$!
wait $readpid
done
Send the desired signal to the shell's PID, which is displayed on the "Shell $$ started." line, and watch the excitement.
Waiting on a sleep is simpler, true, but some OSes don't have sleep infinity, and I wanted to see how Greg's read example would work (it didn't).

Related

inconsistent signal behavior? Only works for the first signal?

I'm trying to write a script that can restart itself with exec (so it can pick up any "upgrade") when it receives a specific signal (I tried SIGHUP and SIGUSR1).
This seems to work the first time but not the second, even though the trap registration does recur in the exec'd instance (which still has the same PID).
#!/usr/bin/env bash
set -x
readonly PROGNAME="${0}"
function run_prog()
{
echo hi
sleep 2
echo ho
sleep 1000 &
wait $!
}
restart()
{
sleep 5
exec "${PROGNAME}"
}
trap restart USR1
echo -e "TRAPS:"
trap
echo
run_prog
This is how I run it:
./tst.sh & TSTPID=$! # Starts ok, see both "hi" & "ho" messages
sleep 10
kill -USR1 ${TSTPID} # Restarts ok, see both "hi" & "ho" messages
sleep 10
kill -USR1 ${TSTPID} # NOTHING HAPPENS
sleep 5
kill ${TSTPID}
Any idea why the second signal is ignored? (some code, like de-registering the trap in the cleanup may just be paranoia)
Maybe because you're execing from a signal handler, the signal code is continuing to run and continuing into oblivion, due to the exec, or preventing other cleanup code or daisy-chained handlers from executing.
Who knows what's going on in the blackbox of the OS signal handling code and bash's own layering over it that might be circumvented by exec. exec is a very draconian measure :-)
Also check out this cool bash site. I'm looking for the bash source code that handles signals. Just curious.
Your solution here is the right approach:
#!/usr/bin/env bash
set -x
readonly PROGNAME="${0}"
DO_RESTART=
function run_prog()
{
echo hi
sleep 2
echo ho
sleep 1000 &
SLEEPPID=$!
#builtin
wait ${SLEEPPID}
}
trap DO_RESTART=1 SIGUSR1
echo -e "TRAPS:"
trap -p
echo
run_prog
if [ -n "${DO_RESTART}" ]; then
sleep 5
kill ${SLEEPPID}
exec "${PROGNAME}"
fi

Sending SIGINT to foreground process works but not background

I have two scripts. script1 spawns script2 and then sends a SIGINT signal to it. However the trap in script2 doesn't seem to work?!
script1:
#!/bin/bash
./script2 &
sleep 1
kill -SIGINT $!
sleep 2
script2:
#!/bin/bash
echo "~~ENTRY"
trap 'echo you hit ctrl-c, waking up...' SIGINT
sleep infinity
echo "~~EXIT"
If I change ./script2 & to ./script2 and press CTRL+C, the whole thing works fine. So what am I doing wrong?
There are several issues in your examples; a solution for your problem follows at the end:
Your first script is missing a wait statement, so it exits after roughly 3 seconds. However, script2 will remain in memory and keep running.
How do you expect bash to automatically figure out which process it should send the SIGINT signal to?
Actually, when job control is not in effect (the default in a script), bash disables SIGINT (and SIGQUIT) for background processes, and they cannot be re-enabled (you can run the trap command on its own to check which traps are currently set). See How to send a signal SIGINT from script to script ? BASH
So your script2 is NOT setting a trap on SIGINT: because it is a background process, both SIGINT and SIGQUIT are ignored and can no longer be trapped or reset.
For reference, here is the bash documentation related to your issue:
Process group id effect on background process (in Job Control section of doc):
[...] processes whose process group ID is equal to the current terminal
process group ID [..] receive keyboard-generated signals such as
SIGINT. These processes are said to be in the foreground.
Background processes are those whose process group ID differs from
the terminal's; such processes are immune to keyboard-generated
signals.
Default handler for SIGINT and SIGQUIT (in Signals section of doc):
Non-builtin commands run by bash have signal handlers set to the values inherited by the shell from its parent. When job control is not in effect, asynchronous commands ignore SIGINT and SIGQUIT in addition to these inherited handlers.
and about modification of traps (in trap builtin doc):
Signals ignored upon entry to the shell cannot be trapped or reset.
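The effect described in the Signals section can be observed directly. The sketch below is an illustration under assumptions, not part of the quoted documentation: it starts a background sleep once with job control off and once with set -m, sends SIGINT each time, and compares the exit statuses reported by wait. The variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Case 1: job control off (the default in a script).
sleep 30 &
pid=$!
sleep 1                  # let the child finish starting up
kill -INT "$pid"         # the background sleep ignores this
sleep 1
kill -TERM "$pid"        # SIGTERM is not ignored
wait "$pid"
status_without_jc=$?

# Case 2: job control on, so the asynchronous command does not ignore SIGINT.
set -m
sleep 30 &
pid=$!
sleep 1
kill -INT "$pid"
wait "$pid"
status_with_jc=$?
set +m

echo "job control off: exit status $status_without_jc"
echo "job control on:  exit status $status_with_jc"
```

If the script itself was started with SIGINT at its default disposition, the first status should be 143 (128 + SIGTERM, meaning SIGINT had no effect) and the second 130 (128 + SIGINT).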
SOLUTION 1
modify your script1 to be:
#!/bin/bash
{ ./script2; } &
sleep 1
subshell_pid=$!
pid=$(ps -ax -o ppid,pid --no-headers | sed -r 's/^ +//g;s/ +/ /g' |
grep "^$subshell_pid " | cut -f 2 -d " ")
kill -SIGINT $pid
sleep 2
wait ## Don't forget this.
How does this work? Using { and } together with & creates a subshell, which is subject to the SIGINT limitation explained above because the subshell is a background process. However, the subshell's own subprocesses are foreground processes, NOT background processes (from the subshell's point of view), so they can trap or reset the SIGINT and SIGQUIT signals.
The trick is then to find the PID of this subprocess in the subshell; here I use ps to find the only process whose parent PID is the subshell's PID.
SOLUTION 2
Actually, only new processes started directly as a job get their SIGINT and SIGQUIT ignored. A simple bash function doesn't. So if the script2 code were in a function sourced in script1, this would be your new script1, and it needs nothing else:
#!/bin/bash
script2() {
## script2 code
echo "~~ENTRY"
trap 'echo you hit ctrl-c, waking up...' SIGINT
sleep infinity
echo "~~EXIT"
}
## script1 code
script2 &
sleep 1
kill -SIGINT $!
sleep 2
This works too. Behind the scenes, the same mechanism as in SOLUTION 1 is at work: a bash function is very close to the { } construct.
I guess what you are trying to achieve is that when script2 receives the SIGINT, it prints the message and continues. In that case, you need:
#!/bin/bash
echo "~~ENTRY"
trap 'echo you hit ctrl-c, waking up...; CONT=true' SIGINT
CONT=false
while ! $CONT
do
sleep 1
done
echo "~~EXIT"

Why can't a subshell catch a signal from the parent shell?

I've got 2 shell scripts:
# subshell.sh
trap "echo Caught SIGTERM" 15
echo $$
sleep 100000
# parent.sh
setsid sh subshell.sh &
pid=$!
echo "sid=$pid"
sleep 2
# This won't work!
kill -15 -$pid
The main purpose is to send SIGTERM to the subshell and all its children. After googling for a while (there is a tricky problem with how bash handles signals), I chose setsid to create a new session and send the signal using -pid. However, the message isn't printed even though the PID is correct. If I manually execute kill -15 -$pid, it works. So how can I send a signal to the subshell?
Well, I finally managed to make this work by creating another subshell and calling kill -15 -$pid inside it. I still don't know why the parent shell can't do this.

Set trap in bash for different process with PID known

I need to set a trap for a bash process I'm starting in the background. The background process may run very long and has its PID saved in a specific file.
Now I need to set a trap for that process, so if it terminates, the PID file will be deleted.
Is there a way I can do that?
EDIT #1
It looks like I was not precise enough with my description of the problem. I have full control over all the code, but the long running background process I have is this:
cat /dev/random >> myfile&
When I now add the trap at the beginning of the script this statement is in, $$ will be the PID of that bigger script not of this small background process I am starting here.
So how can I set traps for that background process specifically?
(./jobsworthy& echo $! > $pidfile; wait; rm -f $pidfile)&
disown
Add this to the beginning of your Bash script.
#!/bin/bash
trap 'rm "$pidfile"; exit' EXIT SIGQUIT SIGINT SIGTERM ERR
pidfile=$(tempfile -p foo -s $$)
echo $$ > "$pidfile"
# from here, do your long running process
You can run your long running background process in an explicit subshell, as already shown by Petesh's answer, and set a trap inside this specific subshell to handle the exiting of your long running background process. The parent shell remains unaffected by this subshell trap.
(
trap '
trap - EXIT ERR
kill -0 ${!} 1>/dev/null 2>&1 && kill ${!}
rm -f pidfile.pid
exit
' EXIT QUIT INT TERM ERR
# simulate background process
sleep 15 &
echo ${!} > pidfile.pid
wait
) &
disown
# remove background process by hand
# kill -TERM ${!}
You do not need trap just to run a command after a background process terminates. Instead, you can run the process through a shell command line, add the follow-up command after it, separated by a semicolon, and let that shell run in the background in place of the process itself.
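As a rough sketch of that approach, assuming a temporary file as the PID file and a short sleep standing in for the long-running process (both hypothetical), a wrapper subshell can record the PID, wait, and clean up without any trap:

```shell
#!/usr/bin/env bash
pidfile=$(mktemp)            # hypothetical PID file location

(
    sleep 2 &                # stand-in for the long-running process
    echo "$!" >"$pidfile"    # record the PID of the real process
    wait "$!"                # block until it terminates
    rm -f "$pidfile"         # follow-up command, no trap required
) &
wrapper_pid=$!

sleep 1
recorded_pid=$(cat "$pidfile")
echo "PID file contains: $recorded_pid"

wait "$wrapper_pid"
if [ -e "$pidfile" ]; then
    pidfile_state="still present"
else
    pidfile_state="removed"
fi
echo "PID file after exit: $pidfile_state"
```

Note that the cleanup only runs if the wrapper shell survives; if the wrapper itself is killed, the PID file is left behind.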
If you would still like some notification in your shell script, send and trap SIGUSR2, for instance:
#!/bin/sh
BACKGROUND_PROCESS=xterm # for my testing, replace with what you have
sh -c "$BACKGROUND_PROCESS; rm -f the_pid_file; kill -USR2 $$" &
trap "echo $BACKGROUND_PROCESS ended" USR2
while sleep 1
do
echo -n .
done

bash trap fires on keyboard Ctrl-C but not on kill -2

Say I have a script:
#!/bin/bash
# test_trap.sh
trap "echo SIGINT captured!" SIGINT
echo $$
sleep 1000
I know the trap command will only be executed after sleep 1000 finishes when the script receives SIGINT. But the trap command is executed right away when I press Ctrl-C on the keyboard:
> sh test_sh.sh
50138
^CSIGINT captured!
Using kill -s SIGINT does not have the same effect.
What am I missing here?
The bash version is GNU bash, 4.2.46(2)-release
With kill -s SIGINT 50138, you are only sending the signal to the shell's process, and that has to wait for sleep 1000 to finish, because sleep doesn't receive the signal.
Control-C, though, causes the terminal to send SIGINT to every process in the current process group, so both your shell script and sleep receive it. Your script still doesn't process the trap command until sleep completes, but sleep exits immediately in response to the SIGINT it just received from the terminal.
If your kill supports it, you can also use kill -s SIGINT -50138 (note the negative process id) to send SIGINT to the entire process group.
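The difference can be reproduced without a keyboard. The sketch below assumes bash with job control switched on via set -m, so that the background job gets its own process group whose ID equals the PID in $!; the inline bash -c command is a stand-in for test_trap.sh, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
set -m    # job control on: each background job becomes its own process group

# Stand-in for test_trap.sh: a shell with an INT trap blocked in a foreground sleep.
bash -c 'trap "echo SIGINT captured!" INT; sleep 30; exit 0' &
pgid=$!   # with job control, the job leader's PID is also its process group ID

sleep 1   # give the child shell time to install its trap
start=$(date +%s)
kill -s INT -- "-$pgid"   # negative ID: signal every process in the group
wait "$pgid"
elapsed=$(( $(date +%s) - start ))
echo "child shell finished after ${elapsed}s"
```

Because sleep receives the signal as well, the child shell should return almost immediately, just as it does after Ctrl-C. Sending the signal to the positive PID alone would leave it waiting for sleep to finish.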
