I have a shell script that calls some data processing functions. These functions can be long-running.
I want to update the script to interrupt the running process in case a certain "status" is seen from an external source. Otherwise the program should complete normally.
I have written a monitor_status function and called it asynchronously. The function sends a kill command to the main process in case the status is found. I have the following questions about this:
If kill -15 $$ is invoked from the monitor_status function, the cleanup function gets called twice. How can I prevent that?
If the main process completes normally, how should I terminate the monitor_status function?
Also, please suggest whether there is a better way to handle such a scenario and any other improvements I can make to the script.
The bash script looks something like this:
#!/bin/bash
function cleanup() {
echo "cleanup invoked"
if [[ -n "${child}" ]]; then
echo "Stopping the child"
kill "${child}" >/dev/null 2>&1 || true
fi
echo "cleanup done"
}
function main() {
echo "Starting main process"
# do long running data processing
echo "Exiting Main"
}
function monitor_status() {
while true; do
echo "checking status"
Status=$(get_status)  # get_status queries the external source (not shown)
if [[ $Status = "Terminate" ]]; then
echo "Alert Alert!!!"
kill -15 $$
fi
sleep 5
done
}
monitor_status &
main &
trap cleanup SIGTERM EXIT
child=$!
wait "${child}"
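A minimal sketch of one way to approach both questions, under stated assumptions: cleanup runs twice because the SIGTERM trap fires first and the EXIT trap fires again when the script ends, so a guard variable can make the second call a no-op, and the same cleanup can stop the monitor when main finishes normally. The names script_pid, cleanup_runs and monitor_pid, the get_status body, and the sleep-based main are hypothetical stand-ins, not part of the original script.

```shell
#!/bin/bash
# Sketch only: get_status and the sleep-based main are hypothetical stand-ins.
script_pid=$BASHPID      # PID of the shell that installs the traps
cleanup_runs=0

cleanup() {
    (( cleanup_runs > 0 )) && return 0    # already ran via the other trap
    cleanup_runs=1
    echo "cleanup invoked"
    # Stop whatever is still running; errors for already-gone PIDs are ignored.
    for pid in "$monitor_pid" "$child"; do
        [[ -n $pid ]] && kill "$pid" 2>/dev/null
    done
    echo "cleanup done"
}

get_status() { echo "Terminate"; }        # stand-in for the external status source

main() { exec sleep 3; }                  # stand-in; exec so killing $child stops sleep itself

monitor_status() {
    while true; do
        if [[ $(get_status) == "Terminate" ]]; then
            kill -TERM "$script_pid"      # ask the main shell to clean up
            return
        fi
        sleep 5
    done
}

main & child=$!
trap cleanup TERM EXIT
monitor_status & monitor_pid=$!
wait "$child"
echo "wait returned $?"
```

If get_status never reports "Terminate", wait returns when main finishes and the EXIT trap calls cleanup once, which also kills the monitor.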
Related
What I am trying to do is create a generic, asynchronous command runner that will allow me to run a command in the background and get its output and code without blocking the shell I am working in (think serial). For most commands, I could do something like:
FUNCwaitForCommand() {
wait "$1"
echo $? > "code.txt"
}
ls > "output.txt" &
pid=$!
FUNCwaitForCommand $pid &
however, this does not work for composed commands, e.g.
(cat < somefifo)
I can make it run the commands with something like:
FUNCwaitForCommand() {
wait $1
echo $? > code
}
eval "ls > output.txt &"
pid=$!
FUNCwaitForCommand $pid &
but the wait does not wait. I can make it wait to finish until the process finishes by doing:
while kill -0 "$1"; do wait "$1"; done
instead of just wait, but the code it gives me is 127 instead of the code of the command that gets run. If I put a wait directly after the pid collection
eval "ls > output.txt &"
pid=$!
wait $pid
it waits for the process just fine, but obviously it doesn't background and release the shell back to me.
I'm not great at bash, but it looks like inside of the function is not in the same sub shell as the eval, so it doesn't recognize the background process, though I don't know why it only acts that way when using eval, and not when using the normal execution method.
Explanation
Just as you can't use:
sleep 5 & pid=$!
wait $pid &
you can't put that wait inside a backgrounded function either. That is to say, you can run:
sleep 5 & pid=$!
waitForCommand() { wait "$@"; }
waitForCommand "$pid"
but you can't run:
sleep 5 & pid=$!
waitForCommand() { wait "$@"; }
waitForCommand "$pid" &
This is because processes can only wait() for their own children. When you fork a new child off the shell with &, the code running in it is no longer the parent of the earlier background process -- it is a sibling. As such, this isn't shell-specific behavior but general-purpose UNIX semantics -- you'd get an equivalent error in any language.
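The parent/sibling distinction can be observed directly. The following is a small sketch, assuming bash; a synchronous ( ... ) subshell is used in place of a backgrounded function because it stands in the same sibling relationship to the sleep but keeps the demo deterministic. The variable names are illustrative.

```shell
#!/bin/bash
sleep 0.2 & pid=$!

# The ( ... ) subshell is a separate process; $pid is its sibling, not its child.
( wait "$pid" ) 2>/dev/null
sibling_status=$?

# The original shell is the parent, so this wait really blocks and reports the status.
wait "$pid"
parent_status=$?

echo "from sibling: $sibling_status, from parent: $parent_status"
```

The subshell's wait fails immediately with the same 127 code mentioned in the question, while the parent's wait reports the real exit status.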
Workaround
Ensure that exit-status recording is done by the direct parent of the process whose exit status is being recorded, even if that parent itself is in the background relative to the original shell.
tempdir_top=$(mktemp -t -d bgdir.XXXXXX)
declare -g -A tempdirs=( )
runBackgroundCommand() {
(( "$#" == 1 )) || { echo "Usage: runBackgroundCommand 'command'" >&2; return 1; }
local cmd tempdir
cmd=$1
tempdir=$(mktemp -d "$tempdir_top/proc.XXXXXX")
{
printf '%s\0' "$cmd" >"$tempdir/cmd"
eval "$cmd" >"$tempdir/stdout" 2>"$tempdir/stderr" & pid=$!
printf '%s\n' "$pid" >"$tempdir/pid"
wait "$pid"; retval=$?
printf '%s\n' "$retval" >"$tempdir/retval"
} &
tempdirs[$tempdir]=$!
}
# example usage
runBackgroundCommand "sleep 5"
runBackgroundCommand "sleep 10"
That way, in the parent process, you have a map of temporary directories to the top-level PID for each (easily used to check for completion), and can look inside that directory for more information on any of the processes involved.
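As a rough sketch of how the parent might consume that map (assuming bash 4+ for associative arrays), the loop below waits for each top-level wrapper PID and then reads the recorded command and exit status from its directory. The function is restated in compact form so the example is self-contained; the results array and the two sample commands are illustrative only.

```shell
#!/bin/bash
tempdir_top=$(mktemp -d "${TMPDIR:-/tmp}/bgdir.XXXXXX")
declare -A tempdirs=( )

runBackgroundCommand() {
    local cmd=$1 tempdir
    tempdir=$(mktemp -d "$tempdir_top/proc.XXXXXX")
    {
        printf '%s\0' "$cmd" >"$tempdir/cmd"
        eval "$cmd" >"$tempdir/stdout" 2>"$tempdir/stderr" & pid=$!
        # This brace group is the direct parent of $pid, so wait works here.
        wait "$pid"; printf '%s\n' "$?" >"$tempdir/retval"
    } &
    tempdirs[$tempdir]=$!
}

runBackgroundCommand "true"
runBackgroundCommand "sh -c 'exit 3'"

results=()
for dir in "${!tempdirs[@]}"; do
    wait "${tempdirs[$dir]}"      # each wrapper is our direct child
    results+=( "$(tr -d '\0' <"$dir/cmd") => $(<"$dir/retval")" )
done
printf '%s\n' "${results[@]}"
rm -rf "$tempdir_top"
```

In an interactive session you would normally poll with kill -0 "${tempdirs[$dir]}" rather than block in wait, and read stdout and stderr from the same directory.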
I'm running several background processes in my script
run_gui()
{
exec ... # the real commands here
}
The functions run_ai1(), run_ai2 are analogous.
Then I run the functions and do the needed piping
run_gui &
run_ai1 &
run_ai2 &
while true; do
while true; do
read -u $ai1_outfd line || echo "Nothing read"
if [[ $line ]]; then
: # processing
fi
done
sleep $turndelay
while true; do
read -u $ai2_outfd line || echo "nothing read"
if [[ $line ]]; then
: # processing
fi
done
sleep $turndelay
done
If any of those three processes exits, I want to check their exit codes and terminate the rest of the processes. For example, if run_ai2 exits with exit code 3, then I want to stop the processes run_ai1 and run_gui and exit the main script with exit code 1. The correct exitcodes for the different backgrounds processes may differ.
The problem is: how can I detect it? There's the command wait but I don't know in advance which script will finish first. I could run wait as a background process - but it's becoming even more clumsy.
Can you help me please?
The following script monitors test child processes (in the example, sleep+false and sleep+true) and reports their PID and exit code:
#!/bin/bash
set -m
trap myhandler CHLD
myhandler() {
echo sigchld received
cat /tmp/foo
}
( sleep 5; false; echo "exit p1=$?" ) > /tmp/foo &
p1=$!
echo "p1=$p1"
( sleep 3; true; echo "exit p2=$?" ) > /tmp/foo &
p2=$!
echo "p2=$p2"
pstree -p $$
wait
The result is:
p1=3197
p2=3198
prueba(3196)─┬─prueba(3197)───sleep(3199)
├─prueba(3198)───sleep(3201)
└─pstree(3200)
sigchld received
sigchld received
exit p2=0
sigchld received
exit p1=1
It could be interesting to use SIGUSR1 instead of SIGCHLD; see here for an example: https://stackoverflow.com/a/12751700/4886927.
Also, inside the trap handler, it is possible to verify which child is still alive. Something like:
myhandler() {
if kill -0 $p1; then
echo "child1 is alive"
fi
if kill -0 $p2; then
echo "child2 is alive"
fi
}
or kill both children when one of them dies:
myhandler() {
if kill -0 $p1 && kill -0 $p2; then
echo "all childs alive"
else
kill -9 $p1 $p2
fi
}
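As an alternative to the trap-based handlers, and not part of the answer above, bash 4.3 and later provide wait -n, which returns as soon as any one background job ends and reports that job's exit status. A minimal sketch, with two hypothetical stand-in jobs:

```shell
#!/bin/bash
# Requires bash 4.3+ for `wait -n`. The two jobs are hypothetical stand-ins.
( sleep 0.3; exit 3 ) & p1=$!    # finishes first, with exit code 3
sleep 30 & p2=$!                 # would keep running

wait -n                          # returns as soon as any one job ends
first_status=$?

kill "$p1" "$p2" 2>/dev/null     # stop whatever is left
wait                             # reap the remaining jobs
echo "first job to finish exited with $first_status"
```

A real script would then map first_status to its own exit code, for example exiting with 1 when the status is not one of the accepted values.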
I need to run several child processes in background and pipe data between them. When the script exits, I want to kill any remaining of them, so I added
trap cleanup EXIT
cleanup()
{
echo "Cleaning up!"
pkill -TERM -P $$
}
Since I need to react if one of the processes reports an error, I created wrapper functions. Anything that ends with fd is a previously opened file descriptor, connected to a FIFO pipe.
run_gui()
{
"$GAME_BIN" $args <&$gui_infd >&$gui_outfd # redirecting IO to some file descriptors
if [[ $? == 0 ]]; then
echo exiting ok
exit $OK_EXITCODE
else
exit $ERROR_EXITCODE
fi
}
The functions run_ai1(), run_ai2 are analogous.
run_ai1()
{
"$ai1" <&$ai1_infd >&$ai1_outfd
if [[ $? == 0 || $? == 1 || $? == 2 ]]; then
exit $OK_EXITCODE
else
exit $ERROR_EXITCODE
fi
}
run_ai2()
{
"$ai2" <&$ai2_infd >&$ai2_outfd
if [[ $? == 0 || $? == 1 || $? == 2 ]]; then
exit $OK_EXITCODE
else
exit $ERROR_EXITCODE
fi
}
Then I run the functions and do the needed piping
printinit 1 >&$ai1_infd
printinit 2 >&$ai2_infd
run_gui &
run_ai1 &
run_ai2 &
while true; do
echo "Started the loop"
while true; do
read -u $ai1_outfd line || echo "Nothing read"
echo $line
if [[ $line ]]; then
echo "$line" >&$gui_infd
echo "$line" >&$ai2_infd
if [[ "$line" == "END_TURN" ]]; then
break
fi
fi
done
sleep $turndelay
while true; do
read -u $ai2_outfd line || echo "nothing read"
echo $line
if [[ $line ]]; then
echo "$line" >&$gui_infd
echo "$line" >&$ai1_infd
if [[ "$line" == "END_TURN" ]]; then
break
fi
fi
done
sleep $turndelay
done
When $GAME_BIN exits, i.e. the GUI is closed by the close button, I can see the exiting ok message on the stdout, but the cleanup function is not called at all. When I add a manual call to cleanup before calling exit $OK_EXITCODE, although the processes are killed:
./game.sh: line 309: 9193 Terminated run_gui
./game.sh: line 309: 9194 Terminated run_ai1
./game.sh: line 309: 9195 Terminated run_ai2
./game.sh: line 309: 9203 Terminated sleep $turndelay
the loop runs anyway and the script doesn't exit, as it should (exit $OK_EXITCODE). The AI scripts are simple:
#!/bin/sh
while true; do
echo END_TURN
done
There is no wait call anywhere in my script. What am I doing wrong?
What's interesting: when I call jobs -p right after run_ai2 &, then I get 3 pids listed. On the other hand, when I invoke this command from the cleanup function - the output is empty.
Besides, why is the sleep $turndelay process terminated? It's not a child invoked process.
An EXIT trap fires when the trapping script exits. Your toplevel script isn't exiting here.
The trap isn't inherited by the sub-shells that your run_* functions run in (because they are started in the background), so it never triggers when those sub-shells exit.
What you want is most likely what you did manually (though, from the sound of it, slightly incorrectly).
You want the cleanup function called from run_gui when $GAME_BIN has exited. Something like this:
run_gui() {
"$GAME_BIN" $args <&$gui_infd >&$gui_outfd # redirecting IO to some file descriptors
ret=$?
cleanup
exit $ret
}
Then you just need to make sure that cleanup gets the right value of $$. In bash it will for your usage, even in a sub-shell, because $$ in a sub-shell still expands to the process ID of the parent shell. You might want to make that more explicit, though, by setting up a handler for a signal in your main script and signalling the main script when run_gui terminates.
I'd guess you are getting some child processes kicked off by a child process. Do this: in another window do a ps -ft pts/1 or whatever your tty is. Verify.
Also change the pkill to a kill $(jobs -p) and see if that works.
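The signal-based variant could look roughly like the sketch below. It is only an illustration: the sleeps stand in for "$GAME_BIN" and the AI programs, the demo wrapper exists so the example can exit without ending the calling shell, and main_pid, gui_pid and ai_pid are names introduced here.

```shell
#!/bin/bash
# Sketch only: the sleeps are hypothetical stand-ins for "$GAME_BIN" and the AI programs.
demo() (
    main_pid=$BASHPID      # explicit PID of the shell that owns the trap
    cleanup() { echo "Cleaning up!"; kill "$gui_pid" "$ai_pid" 2>/dev/null; }
    trap 'cleanup; exit 1' USR1

    run_gui() { sleep 0.3; kill -USR1 "$main_pid"; }   # GUI closes, then notifies main
    run_ai()  { exec sleep 30; }                       # would otherwise keep running

    run_gui & gui_pid=$!
    run_ai  & ai_pid=$!
    while true; do sleep 0.1; done    # stands in for the main read loop
)

demo
demo_status=$?
echo "main loop ended with status $demo_status"
```

Because the handler runs in the main shell, cleanup sees the real child PIDs and the exit inside the trap actually ends the main loop.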
I am having problems with finding how to exit my script when a keyphrase is entered: e.g. "Foo".
Essentially I wish to test every user input for this phrase and invoke the exit command. I could create a test function I call after every user entry but this seems inelegant.
I am using function:
function EXIT {
printf "\n\nSCRIPT IS NOW TERMINATING\n"
if [ -n $userLogged ]; then
local TIME="$username LOGGED OUT at: "$(date +%r)" on the "$(date +%d/%m/%Y)"\n"
printf "$TIME" >> usage.db
fi
exit
}
and:
trap EXIT SIGTERM
Can it be done using trap?
I'm not exactly sure but I guess you are after something like this:
#!/bin/bash
# Save this script as "my_exit"
function EXIT {
printf "\n\nSCRIPT IS NOW TERMINATING\n"
if [ -n "$userLogged" ]; then
local TIME="$username LOGGED OUT at: $(date +%r) on the $(date +%d/%m/%Y)"
printf '%s\n' "$TIME" >> usage.db
fi
exit
}
trap EXIT SIGUSR1
while :; do
read -p "Enter your test word: " word
if [ "$word" = "Foo" ];
then
pkill --signal SIGUSR1 my_exit
fi
done
I used SIGUSR1 instead of SIGTERM just to show the functionality better. It's also possible to change that into two separate scripts with minor modifications i.e. "EXIT+trap" block will be one, the eternal loop another and latter one would signal the first one via SIGUSR1 to do exit routines.
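To avoid repeating the check after every prompt, the comparison can live in a single wrapper around read that raises the signal. This is a sketch under assumptions: ask, terminate and session are hypothetical names, terminate plays the role of the EXIT function above, and the subshell wrapper plus piped input exist only so the example runs unattended.

```shell
#!/bin/bash
# Sketch only: `ask`, `terminate` and `session` are hypothetical names.
session() (
    terminate() { printf 'SCRIPT IS NOW TERMINATING\n'; exit 0; }
    trap terminate USR1

    # Read one line into the variable named by $1; signal ourselves on the keyphrase.
    ask() {
        local reply
        IFS= read -r reply || return 1
        if [[ $reply == "Foo" ]]; then
            kill -USR1 "$BASHPID"     # in a plain top-level script this could be $$
        fi
        printf -v "$1" '%s' "$reply"
    }

    while ask word; do
        printf 'got: %s\n' "$word"
    done
)

output=$(printf '%s\n' alpha Foo beta | session)
printf '%s\n' "$output"
```

Every prompt in the script then calls ask instead of read, so the keyphrase test is written once and the exit routine stays in the trap handler.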
I've asked "Bash trap - exit only at the end of loop" and the submitted solution works, but when I press CTRL-C the command currently running in the script (mp3convert with lame) is interrupted and then the rest of the for loop runs to the end. Let me show you the simple script:
#!/bin/bash
mp3convert () { lame -V0 file.wav file.mp3; }
PreTrap() { QUIT=1; }
CleanUp() {
if [ ! -z $QUIT ]; then
rm -f $TMPFILE1
rm -f $TMPFILE2
echo "... done!" && exit
fi }
trap PreTrap SIGINT SIGTERM SIGTSTP
trap CleanUp EXIT
case $1 in
write)
while [ -n "$line" ]
do
mp3convert
[SOMEMOREMAGIC]
CleanUp
done
;;
QUIT=1
If I press CTRL-C while the mp3convert function is running, the lame command is interrupted and then [SOMEMOREMAGIC] executes before CleanUp runs. I don't understand why the lame command is interrupted and how I can avoid that.
To simplify the discussion above, I have put together an easier-to-understand showcase script below. This script also handles the "double control-C problem":
(Double control-C problem: if you hit control-C twice, or three times, depending on how many wait $PID calls you use, the cleanup cannot be done properly.)
#!/bin/bash
mp3convert () {
echo "mp3convert..."; sleep 5; echo "mp3convert done..."
}
PreTrap() {
echo "in trap"
QUIT=1
echo "exiting trap..."
}
CleanUp() {
### Since 'wait $PID' can be interrupted by ^C, we need to protect it
### with the 'kill -0' loop (this is the double/triple control-C problem).
while kill -0 $PID >& /dev/null; do wait $PID; echo "check again"; done
### This won't work (A simple wait $PID is vulnerable to double control C)
# wait $PID
if [ ! -z $QUIT ]; then
echo "clean up..."
exit
fi
}
trap PreTrap SIGINT SIGTERM SIGTSTP
#trap CleanUp EXIT
for loop in 1 2 3; do
(
echo "loop #$loop"
mp3convert
echo magic 1
echo magic 2
echo magic 3
) &
PID=$!
CleanUp
echo "done loop #$loop"
done
The kill -0 trick can be found in a comment of this link
When you hit Ctrl-C in a terminal, SIGINT gets sent to all processes in the foreground process group of that terminal, as described in this Stack Exchange "Unix & Linux" answer: How Ctrl C works. (The other answers in that thread are well worth reading, too). And that's why your mp3convert function gets interrupted even though you have set a SIGINT trap.
But you can get around that by running the mp3convert function in the background, as mattias mentioned. Here's a variation of your script that demonstrates the technique.
#!/usr/bin/env bash
myfunc()
{
echo -n "Starting $1 :"
for i in {1..7}
do
echo -n " $i"
sleep 1
done
echo ". Finished $1"
}
PreTrap() { QUIT=1; echo -n " in trap "; }
CleanUp() {
#Don't start cleanup until current run of myfunc is completed.
wait $pid
[[ -n $QUIT ]] &&
{
QUIT=''
echo "Cleaning up"
sleep 1
echo "... done!" && exit
}
}
trap PreTrap SIGINT SIGTERM SIGTSTP
trap CleanUp EXIT
for i in {a..e}
do
#Run myfunc in background but wait until it completes.
myfunc "$i" &
pid=$!
wait $pid
CleanUp
done
QUIT=1
When you hit Ctrl-C while myfunc is in the middle of a run, PreTrap prints its message and sets the QUIT flag, but myfunc continues running and CleanUp doesn't commence until the current myfunc run has finished.
Note that my version of CleanUp resets the QUIT flag. This prevents CleanUp from running twice.
This version removes the CleanUp call from the main loop and puts it inside the PreTrap function. It uses wait with no ID argument in PreTrap, which means we don't need to bother saving the PID of each child process. This should be ok since if we're in the trap we do want to wait for all child processes to complete before proceeding.
#!/bin/bash
# Yet another Trap demo...
myfunc()
{
echo -n "Starting $1 :"
for i in {1..5}
do
echo -n " $i"
sleep 1
done
echo ". Finished $1"
}
PreTrap() { echo -n " in trap "; wait; CleanUp; }
CleanUp() {
[[ -n $CLEAN ]] && { echo bye; exit; }
echo "Cleaning up"
sleep 1
echo "... done!"
CLEAN=1
exit
}
trap PreTrap SIGINT SIGTERM SIGTSTP
trap "echo exittrap; CleanUp" EXIT
for i in {a..c}
do
#Run myfunc in background but wait until it completes.
myfunc "$i" & wait $!
done
We don't really need to do myfunc "$i" & wait $! in this script, it could be simplified even further to myfunc "$i" & wait. But generally it's better to wait for a specific PID just in case there's some other process running in the background that we don't want to wait for.
Note that pressing Ctrl-C while CleanUp itself is running will interrupt the current foreground process (probably sleep in this demo).
One way of doing this would be to simply disable the interrupt until your program is done.
Some pseudo code follows:
#!/bin/bash
# First, store your stty settings and disable the interrupt
STTY=$(stty -g)
stty intr undef
# run your program here (runMp3Convert stands for your conversion command)
runMp3Convert
# restore stty settings
stty "$STTY"
# eof
Another idea would be to run your bash script in the background (if possible).
mp3convert.sh &
or even,
nohup mp3convert.sh &