I'm trying to build a generic retry shell function to re-run the specified shell command a few times if it fails in the last time, here is my code:
retry() {
declare -i number=$1
declare -i interrupt=0
trap "echo Exited!; interrupt=1;" SIGINT SIGTERM SIGQUIT SIGKILL
shift
for i in `seq $number`; do
echo "\n-- Retry ${i}th time(s) --\n"
$#
if [[ $? -eq 0 || $interrupt -ne 0 ]]; then
break;
fi
done
}
It works great for wget, curl and other all kinds of common commands. However, if I run
retry 10 rsync local remote
, send a ctrl+c to interrupt it during transferring progress, it reports
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(700) [sender=3.1.3]
It seems that rsync suppresses the SIGINT and other some related signals inside, then returns a code 20 to the outside caller. This return code didn't make the loop break, then I send a few ctrl+c to interrupt the next rsync commands. It prints Exited! only for the last ctrl+c and trap catch it.
Questions:
Why does it first return code 20 didn't make the loop break?
How to let the trap catch the SIGINT signal but rsync, if not, what should I do?
Why does it first return code 20 didn't make the loop break?
You are correct that rsync does catch certain signals and exit with RERR_SIGNAL (20).
How to let the trap catch the SIGINT signal but rsync, if not, what should I do?
Since rsync has its own handlers, you can't do anything ( could use some hacks to override signal handlers within rsync with LD_PRELOAD for example. But it may be unnecessarily complicated). Since your traps are in the current shell, you wouldn't know whether the "command" was signaled or exit with non-zero.
I'd assume you want to your retry to be generic and you don't want special handling of rsync (e.g., a different command may exit with 75 on signals and you don't want to dealing with special cases).
The problem is your trap handlers isn't active as the the signal is received by the current process running process (rsync). You could instead run your command in the background and wait for it to complete. This would allow your catch signals from retry. On receiving a signal, it simply kills the child process.
#!/bin/bash
retry()
{
declare -i number=$1
declare -i i
declare -i pid
declare -i interrupted=0
trap "echo Exiting...; interrupted=1" SIGINT SIGTERM SIGQUIT
shift
# Turn off "monitor mode" so the shell doesn't report terminating background jobs.
set +m
for ((i = 0; i < number; ++i)); do
echo "\n-- Retry ${i}th time(s) --\n"
$# &
pid=$!
# If command succeeded, break
wait $pid && break
# If we receive one of the signals, break
[[ $interrupted == 1 ]] && kill $pid && break
done
# Switch back to default behaviour
set -m
trap - SIGINT SIGTERM SIGQUIT
}
Note that SIGKILL can't be caught. So there's no point in setting a trap for it. So I have removed it.
Related
Here is a simplified version of some code I am working on:
#!/bin/bash
term() {
echo ctrl c pressed!
# perform cleanup - don't exit immediately
}
trap term SIGINT
sleep 100 &
wait $!
As you can see, I would like to trap CTRL+C / SIGINT and handle these with a custom function to perform some cleanup operation, rather than exiting immediately.
However, upon pressing CTRL+C, what actually seems to happen is that, while I see ctrl c pressed! is echoed as expected, the wait command is also killed which I would not like to happen (part of my cleanup operation kills sleep a bit later but first does some other things). Is there a way I can prevent this, i.e. stop CTRL+C input being sent to the wait command?
You can prevent a process called from a Bash script from receiving sigint by first ignoring the signal with trap:
#!/bin/bash
# Cannot be interrupted
( trap '' INT; exec sleep 10; )
However, only a parent process can wait for its child, so wait is a shell builtin and not a new process. This therefore doesn't apply.
Instead, just restart the wait after it gets interrupted:
#!/bin/bash
n=0
term() {
echo "ctrl c pressed!"
n=$((n+1))
}
trap term INT
sleep 100 &
while
wait "$!"
[ "$?" -eq 130 ] # Sigint results in exit code 128+2
do
if [ "$n" -ge 3 ]
then
echo "Jeez, fine"
exit 1
fi
done
I ended up using a modified version of what #thatotherguy suggested:
#!/bin/bash
term() {
echo ctrl c pressed!
# perform cleanup - don't exit immediately
}
trap term SIGINT
sleep 100 &
pid=$!
while ps -p $pid > /dev/null; do
wait $pid
done
This checks if the process is still running and, if so, runs wait again.
I have two scripts. script1 spawns script2 and then sends a SIGINT signal to it. However the trap in script2 doesn't seem to work?!
script1:
#!/bin/bash
./script2 &
sleep 1
kill -SIGINT $!
sleep 2
script2:
#!/bin/bash
echo "~~ENTRY"
trap 'echo you hit ctrl-c, waking up...' SIGINT
sleep infinity
echo "~~EXIT"
If change ./script2 & to ./script2 and press CTRL+C the whole things works fine. So what am I doing wrong?
You have several issues in your examples, at the end I have a solution for your issue:
your first script seems to miss a wait statement, thus, it exits
after roughly 3 seconds. However script2 will remain in memory and
running.
How do you want bash to automatically figure which process it should
send the SIGINT signal ?
Actually bash will disable SIGINT (and SIGQUIT) on background processes and they can't be enabled (you can check by running trap command alone to check the current status of set traps). See How to send a signal SIGINT from script to script ? BASH
So your script2 is NOT setting a trap on SIGINT because it's a background process, both SIGINT and SIGQUIT are ignored and can't be anymore trapped nor resetted on background processes.
As a reference, here are the documentation from bash related to your issue:
Process group id effect on background process (in Job Control section of doc):
[...] processes whose process group ID is equal to the current terminal
process group ID [..] receive keyboard-generated signals such as
SIGINT. These processes are said to be in the foreground.
Background processes are those whose process group ID differs from
the terminal's; such processes are immune to keyboard-generated
signals.
Default handler for SIGINT and SIGQUIT (in Signals section of doc):
Non-builtin commands run by bash have signal handlers set to the values inherited by the shell from its parent. When job control is not in effect, asynchronous commands ignore SIGINT and SIGQUIT in addition to these inherited handlers.
and about modification of traps (in trap builtin doc):
Signals ignored upon entry to the shell cannot be trapped or reset.
SOLUTION 1
modify your script1 to be:
#!/bin/bash
{ ./script2; } &
sleep 1
subshell_pid=$!
pid=$(ps -ax -o ppid,pid --no-headers | sed -r 's/^ +//g;s/ +/ /g' |
grep "^$subshell_pid " | cut -f 2 -d " ")
kill -SIGINT $pid
sleep 2
wait ## Don't forget this.
How does this work ? Actually, the usage of { and } will create a subshell, that will be limited by the explained limitation on SIGINT, because this subshell is a background process. However, the subshell's own subprocess are foreground and NOT background processes (for our subshell scope)... as a consequence, they can trap or reset SIGINT and SIGQUIT signals.
The trick is then to find the pid of this subprocess in the subshell, here I use ps to find the only process having the subshell's pid as parent pid.
SOLUTION 2
Actually, only direct new process managed as job will get their SIGINT and SIGQUIT ignored. A simple bash function won't. So if script2 code was in a function sourced in script1, here would be your new script1 that doesn't need anything else:
#!/bin/bash
script2() {
## script2 code
echo "~~ENTRY"
trap 'echo you hit ctrl-c, waking up...' SIGINT
sleep infinity
echo "~~EXIT"
}
## script1 code
script2 &
sleep 1
kill -SIGINT $!
sleep 2
This will work also. Behind the scene, the same mecanism than SOLUTION 1 is working: a bash function is very close to the { } construct.
I guess what you are trying to achieve is that when script2 receives the SIGINT it continues and prints the message. Then, you need
#!/bin/bash
echo "~~ENTRY"
trap 'echo you hit ctrl-c, waking up...; CONT=true' SIGINT
CONT=false
while ! $CONT
do
sleep 1
done
echo "~~EXIT"
(Question revised, now that I understand more about what's actually happening):
I have a script that runs in the background, periodically doing some work and then sleeping for 30 seconds:
echo "background script PID: $$"
trap 'echo "Exiting..."' INT EXIT
while true; do
# check for stuff to do, do it
sleep 30
done &
If I try to kill this script via kill or kill INT, it takes 30 seconds to respond to the signal.
I will answer this question below, since I found a good explanation online.
(My original, embarrassingly un-researched question)
This question is for a bash script that includes the following trap:
trap 'echo "Exiting...">&2; kill $childPID 2>/dev/null; exit 0' \
SIGALRM SIGHUP SIGINT SIGKILL SIGPIPE SIGPROF SIGTERM \
SIGUSR1 SIGUSR2 SIGVTALRM SIGSTKFLT
If I run the script in the foreground, and hit
CTRL-C, it gets the signal immediately and exits
(under one sec).
If I run the same script in the background (&), and kill it via
kill or kill -INT, it takes 30 seconds before getting the signal.
Why is that, and how can I fix it?
As explained in http://mywiki.wooledge.org/SignalTrap --
"When bash is executing an external command in the foreground, it does not handle any signals received until the foreground process terminates" - and since sleep is an external command, bash does not even see the signal until sleep finishes.
That page has a very good overview of signal processing in bash, and work-arounds to this issue. Briefly, one correct way of handling the situation is to send the signal to the process group instead of just the parent process:
kill -INT -123 # will kill the process group with the ID 123
Head over to the referenced page for a full explanation (no sense in my reproducing any more of it here).
Possible reason: signals issued while a process is sleeping are not delivered until wake-up of the process. When started via the command line, the process doesn't sleep, so the signal gets delivered immediately.
#RashaMatt, I was unable to get the read command to work as advertised on Greg's wiki. Sending a signal to the script simply did not interrupt the read. I needed to do this:
#!/bin/bash
bail() {
echo "exiting"
kill $readpid
rm -rf $TMPDIR
exit 0
}
sig2() {
echo "doing stuff"
}
echo Shell $$ started.
trap sig2 SIGUSR2
trap bail SIGUSR1 SIGHUP SIGINT SIGQUIT SIGTERM
trap -p
TMPDIR=$(mktemp -p /tmp -d .daemonXXXXXXX)
chmod 700 $TMPDIR
mkfifo $TMPDIR/fifo
chmod 400 $TMPDIR/fifo
while : ; do
read < $TMPDIR/fifo & readpid=$!
wait $readpid
done
...send the desired signal to the shell's pid displayed from the Shell $$ started line, and watch the excitement.
waiting on a sleep is simpler, true, but some os' don't have sleep infinity, and I wanted to see how Greg's read example would work (which it didn't).
Is it possible in bash to intercept a SIGINT, do something, and then ignore it (keep bash running).
I know that I can ignore the SIGINT with
trap '' SIGINT
And I can also do something on the sigint with
trap handler SIGINT
But that will still stop the script after the handler executes. E.g.
#!/bin/bash
handler()
{
kill -s SIGINT $PID
}
program &
PID=$!
trap handler SIGINT
wait $PID
#do some other cleanup with results from program
When I press ctrl+c, the SIGINT to program will be sent, but bash will skip the wait BEFORE program was properly shut down and created its output in its signal handler.
Using #suspectus answer I can change the wait $PID to:
while kill -0 $PID > /dev/null 2>&1
do
wait $PID
done
This actually works for me I am just not 100% sure if this is 'clean' or a 'dirty workaround'.
trap will return from the handler, but after the command called when the handler was invoked.
So the solution is a little clumsy but I think it does what is required. trap handler INT also will work.
trap 'echo "Be patient"' INT
for ((n=20; n; n--))
do
sleep 1
done
The short answer:
SIGINT in bash can be caught, handled and then ignored, assumed that "ignored" here means that bash continues to run the script.
The wanted actions of the handler can even be postponed to build a kind of "transaction" so that SIGINT will be fired (or "ignored") AFTER a group of statements have done their work.
But since the above example touches many aspects of bash (foreground vs. background behavior, trap and wait) AND 8 years went away since then, the solution discussed here may not immediately work on all systems without further finetuning.
The solution discussed here was successfully tested on a "Linux mint-mate 5.4.0-73-generic x86_64" system with "GNU bash, Version 4.4.20(1)-release":
The wait shell builtin command IS DESIGNED to be interruptable. But one can examine the exit status of wait, which is 128 + signal number = 130 (in the case of SIGINT).
So if you want to trick around and wait til the background is process really finished, one can also do something like this:
wait ${programPID}
while [ $? -ge 128 ]; do
# 1st opportunity to place your **handler actions** is here
wait ${programPID}
done
But let it also said that we ran into a bug/feature while testing all of this. The problem was that wait kept on returning 130 even after the process in the background was no longer there. The documentation says that wait will return 127 in the case of a false process id, but this did not happen in our tests.
Keep in mind to check the existence of the background process before running the wait command in the while loop, if you also run into this problem.
Assumed that the following script is your program, which simply counts down from 5 to 0 and also tee's its output to a file named program.out. The while loop here is considered as a "transaction", which shall not be disturbed by SIGINT. And one last comment: This code does NOT ignore SIGINT after doing postponed actions, but instead restores the old SIGINT handler and raises a SIGINT:
#!/bin/bash
rm -f program.out
# Will be set to 1 by the SIGINT ignoring/postponing handler
declare -ig SIGINT_RECEIVED=0
# On <CTRL>+C or "kill -s SIGINT $$" set flag for [later|postponed] examination
function _set_SIGINT_RECEIVED {
SIGINT_RECEIVED=1
}
# Remember current SIGINT handler
old_SIGINT_handler=$(trap -p SIGINT)
# Prepare for later restoration via ${old_SIGINT_handler}
old_SIGINT_handler=${old_SIGINT_handler:-trap - SIGINT}
# Start your "transaction", which should NOT be disturbed by SIGINT
trap -- '_set_SIGINT_RECEIVED' SIGINT
count=5
echo $count | tee -a program.out
while (( count-- )); do
sleep 1
echo $count | tee -a program.out
done
# End of your "transaction"
# Look whether SIGINT was received
if [ ${SIGINT_RECEIVED} -eq 1 ]; then
# Your **handler actions** are here
echo "SIGINT was received during transaction..." | tee -a program.out
echo "... doing postponed work now..." | tee -a program.out
echo "... restoring old SIGINT handler and sending SIGINT" | tee -a program.out
echo "program finished after SIGINT postponed." | tee -a program.out
${old_SIGINT_handler}
kill -s SIGINT $$
fi
echo "program finished without having received SIGINT." | tee -a program.out
But let it also be said here that we ran into problems after sending program in the background. The problem was that program inherited a trap '' SIGINT which means that SIGINT was generally ignored and program was NOT able to set another handler via trap -- '_set_SIGINT_RECEIVED' SIGINT.
We solved this problem by putting program in a subshell and sending this subshell in the background, as you will see now in the MAIN script example, which runs in the foreground. And one last comment also: In this script you can decide via variable ignore_SIGINT_after_handling whether to finally ignore SIGINT and continue to run the script OR to execute the default SIGINT behavior after your handler action has finished its work:
#!/bin/bash
# Will be set to 1 by the SIGINT ignoring/postponing handler
declare -ig SIGINT_RECEIVED=0
# On <CTRL>+C or "kill -s SIGINT $$" set flag for later examination
function _set_SIGINT_RECEIVED {
SIGINT_RECEIVED=1
}
# Set to 1 if you want to keep bash running after handling SIGINT in a particular way
# or to 0 (or any other value) to run original SIGINT action after postponing SIGINT
ignore_SIGINT_after_handling=1
# Remember current SIGINT handler
old_SIGINT_handler=$(trap -p SIGINT)
# Prepare for later restoration via ${old_SIGINT_handler}
old_SIGINT_handler=${old_SIGINT_handler:-trap - SIGINT}
# Start your "transaction", which should NOT be disturbed by SIGINT
trap -- '_set_SIGINT_RECEIVED' SIGINT
# Do your work, for eample
(./program) &
programPID=$!
wait ${programPID}
while [ $? -ge 128 ]; do
# 1st opportunity to place a part of your **handler actions** is here
# i.e. send SIGINT to ${programPID} and make sure that it is only sent once
# even if MAIN receives more SIGINT's during this loop
wait ${programPID}
done
# End of your "transaction"
# Look whether SIGINT was received
if [ ${SIGINT_RECEIVED} -eq 1 ]; then
# Your postponed **handler actions** are here
echo -e "\nMAIN is doing postponed work now..."
if [ ${ignore_SIGINT_after_handling} -eq 1 ]; then
echo "... and continuing with normal program execution..."
else
echo "... and restoring old SIGINT handler and sending SIGINT via 'kill -s SIGINT \$\$'"
${old_SIGINT_handler}
kill -s SIGINT $$
fi
fi
# Restore "old" SIGINT behaviour
${old_SIGINT_handler}
# Prepare for next "transaction"
SIGINT_RECEIVED=0
echo ""
echo "This message has to be shown in the case of normal program execution"
echo "as well as after a caught and handled and then ignored SIGINT"
echo "End of MAIN script received"
Hope this helps a bit.
Shall everybody have a good time.
i had the same problem: my script was exiting after my sigint handler
i solved this by recursion
#! /bin/sh
# devloop.sh
# run command in infinite loop
# wait before restarting, to allow stopping the loop
# license: MIT, author: milahu
# https://stackoverflow.com/questions/15785522/catch-sigint-in-bash-handle-and-ignore
restart_delay=2
command="$1" # TODO use all args: $#
# example: drop cache, run vite
#command="rm -rf node_modules/.vite/ ; npx vite --clearScreen false"
if [ -z "$command" ]
then
command="( set -x; sleep 5 ); false # example command: sleep 5 seconds, set rc=1"
fi
loop_next() {
echo
echo "starting command. hit Ctrl+C to restart"
echo " $command"
(eval "$command") &
command_pid=$!
#echo "main pid: $$"; echo "cmd pid: $command_pid" # debug
restart_command() {
echo
echo "restarting command in $restart_delay seconds. hit Ctrl+C to stop"
sleep $restart_delay
loop_next # recursion
}
stop_command() {
echo
echo "got Ctrl+C -> stopping command"
kill $command_pid
trap exit SIGINT # handle second Ctrl+C
restart_command
}
trap stop_command SIGINT # handle first Ctrl+C
wait $command_pid # this is blocking
echo "command stopped. return code: $?"
restart_command
}
echo starting loop
loop_next
I wonder if anyone can help with this?
I have a bash script. It starts a sub-process which is another gui-based application. The bash script then goes into an interactive mode getting input from the user. This interactive mode continues indefinately. I would like it to terminate when the gui-application in the sub-process exits.
I have looked at SIGCHLD but this doesn't seem to be the answer. Here's what I've tried but I don't get a signal when the prog ends.
set -o monitor
"${prog}" &
prog_pid=$!
function check_pid {
kill -0 $1 2> /dev/null
}
function cleanup {
### does cleanup stuff here
exit
}
function sigchld {
check_pid $prog_pid
[[ $? == 1 ]] && cleanup
}
trap sigchld SIGCHLD
Updated following answers. I now have this working using the suggestion from 'nosid'. I have another, related, issue now which is that the interactive process that follows is a basic menu driven process that blocks waiting for key input from the user. If the child process ends the USR1 signal is not handled until after input is received. Is there any way to force the signal to be handled immediately?
The wait look looks like this:
stty raw # set the tty driver to raw mode
max=$1 # maximum valid choice
choice=$(expr $max + 1) # invalid choice
while [[ $choice -gt $max ]]; do
choice=`dd if=/dev/tty bs=1 count=1 2>/dev/null`
done
stty sane # restore tty
Updated with solution. I have solved this. The trick was to use nonblocking I/O for the read. Now, with the answer from 'nosid' and my modifications, I have exactly what I want. For completeness, here is what works for me:
#!/bin/bash -bm
{
"${1}"
kill -USR1 $$
} &
function cleanup {
# cleanup stuff
exit
}
trap cleanup SIGUSR1
while true ; do
stty raw # set the tty driver to raw mode
max=9 # maximum valid choice
while [[ $choice -gt $max || -z $choice ]]; do
choice=`dd iflag=nonblock if=/dev/tty bs=1 count=1 2>/dev/null`
done
stty sane # restore tty
# process choice
done
Here is a different approach. Instead of using SIGCHLD, you can execute an arbitrary command as soon as the GUI application terminates.
{
some_command args...
kill -USR1 $$
} &
function sigusr1() { ... }
trap sigusr1 SIGUSR1
Ok. I think I understand what you need. Have a look at my .xinitrc:
xrdb ~/.Xdefaults
source ~/.xinitrc.hw.settings
xcompmgr &
xscreensaver &
# after starting some arbitrary crap we want to start the main gui.
startfluxbox & PIDOFAPP=$! ## THIS IS THE IMPORTANT PART
setxkbmap genja
wmclockmon -bl &
sleep 1
wmctrl -s 3 && aterms sone &
sleep 1
wmctrl -s 0
wait $PIDOFAPP ## THIS IS THE SECOND PART OF THE IMPORTANT PART
xeyes -geometry 400x400+500+400 &
sleep 2
echo im out!
What happens is that after you send a process to the background, you can use wait to wait until the process dies. whatever is after wait will not be executed as long as the application is running. You can use this to exit after the GUI has been shut down.
PS: I run bash.
I think you need to do:
set -bm
or
set -o monitor notify
As per the bash manual:
-b
Cause the status of terminated background jobs to be reported immediately, rather than before printing the next primary prompt.
The shell's main job is executing child processes, and
it needs to catch SIGCHLD for its own purposes. This somehow restricts it to pass on the signal to the script itself.
Could you just check for the child pid and based on that send the alert. You can find the child pid as below-
bash_pid=$$
while true
do
children=`ps -eo ppid | grep -w $bash_pid`
if [ -z "$children" ]; then
cleanup
alert
exit
fi
done