write to fifo/pipe from shell, with timeout - shell

I have a pair of shell programs that talk over a named pipe. The reader creates the pipe when it starts, and removes it when it exits.
Sometimes, the writer will attempt to write to the pipe between the time that the reader stops reading and the time that it removes the pipe.
reader: while condition; do read data <$PIPE; do_stuff; done
writer: echo $data >>$PIPE
reader: rm $PIPE
When this happens, the writer hangs forever trying to open the pipe for writing.
Is there a clean way to give it a timeout, so that it won't stay hung until killed manually? I know I can do
#!/bin/sh
# timed_write <timeout> <file> <args>
# like "echo <args> >> <file>" with a timeout
TIMEOUT=$1
shift;
FILENAME=$1
shift;
PID=$$
(X=0; # don't do "sleep $TIMEOUT", the "kill %1" doesn't kill the sleep
while [ "$X" -lt "$TIMEOUT" ];
do sleep 1; X=$(expr $X + 1);
done; kill $PID) &
echo "$@" >>$FILENAME
kill %1
but this is kind of icky. Is there a shell builtin or command to do this more cleanly (without breaking out the C compiler)?

The UNIX "standard" way of dealing with this is to use Expect, which comes with a timed-run example that runs a program for only a given amount of time.
Expect can do wonders for scripting and is well worth learning. If you don't like Tcl, there is a Python Expect module as well.

This pair of programs works much more nicely after being re-written in Perl using Unix domain sockets instead of named pipes. The particular problem in this question went away entirely, since if/when one end dies the connection disappears instead of hanging.

This question comes up periodically (though I couldn't find it with a search). I've written two shell scripts to use as timeout commands: one for things that read standard input and one for things that don't read standard input. This stinks, and I've been meaning to write a C program, but I haven't gotten around to it yet. I'd definitely recommend writing a timeout command in C once and for all. But meanwhile, here's the simpler of the two shell scripts, which hangs if the command reads standard input:
#!/bin/ksh
# our watchdog timeout in seconds
maxseconds="$1"
shift
case $# in
0) echo "Usage: `basename $0` <seconds> <command> [arg ...]" 1>&2 ;;
esac
"$@" &
waitforpid=$!
{
sleep $maxseconds
echo "TIMED OUT: $@" 1>&2
2>/dev/null kill -0 $waitforpid && kill -15 $waitforpid
} &
killerpid=$!
>>/dev/null 2>&1 wait $waitforpid
# this is the exit value we care about, so save it and use it when we exit
rc=$?
# zap our watchdog if it's still there, since we no longer need it
2>>/dev/null kill -0 $killerpid && kill -15 $killerpid
exit $rc
The other script is online at http://www.cs.tufts.edu/~nr/drop/timeout.

trap 'kill $(ps -L $! -o pid=); exit 30' 30
echo kill -30 $$ 2\>/dev/null | at $1 2>/dev/null
shift; eval $@ &
wait
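For the original fifo case, another option is to let a timeout command supervise only the blocking write. This is a minimal sketch, assuming the GNU coreutils timeout command is installed; timed_write is a hypothetical helper name that mirrors the script in the question.

```shell
#!/bin/sh
# timed_write <timeout-seconds> <file> <args...>
# Hypothetical helper: like 'echo <args> >> <file>', but gives up after the timeout.
# Assumes the GNU coreutils 'timeout' command is on PATH.
timed_write() {
    secs=$1
    file=$2
    shift 2
    # The redirection is opened inside the child sh, so the blocking open()
    # on a reader-less fifo happens in a process that timeout can kill.
    timeout "$secs" sh -c 'file=$1; shift; echo "$@" >>"$file"' sh "$file" "$@"
}
```

Used as timed_write 5 "$PIPE" "$data", it should return 124 (the exit status timeout uses when the time limit is hit) if no reader opens the pipe within 5 seconds.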

Related

Kill bash command when line is found

I want to kill a bash command when I found some string in the output.
To clarify, I want the solution to be similar to a timeout command:
timeout 10s looping_program.sh
which executes looping_program.sh and kills it after 10 seconds of execution.
Instead I want something like:
regexout "^Success$" looping_program.sh
which would run the script until a line that says just Success appears on the program's stdout.
Note that I'm assuming that this looping_program.sh does not exit at the same time it outputs Success for whatever reason, so simply waiting for the program to exit would waste time if I don't care about what happens after that.
So something like:
bash -e looping_program.sh > /tmp/output &
PID="$(ps aux | grep looping_program.sh | head -1 | tr -s ' ' | cut -f 2 -d ' ')"
echo $PID
while :; do
echo "$(tail -1 /tmp/output)"
if [[ "$(tail -1 /tmp/output)" == "Success" ]]; then
kill $PID
exit 0
fi
sleep 1
done
Where looping_program.sh is something like:
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
echo "Success"
sleep 1;
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
But that is not very robust (uses a single tmp file... might kill other programs...) and I want it to just be one command. Does something like this exist? I may just write a c program to do it if not.
P.S.: I provided my code as an example of what I wanted the program to do. It does not use good programming practices. Notes from other commenters:
@KamilCuk Do not use temporary file. Use a fifo.
@pjh Note that any approach that involves using kill with a PID in shell code runs the risk of killing the wrong process. Use kill in shell programs only when it is absolutely necessary.
There are more suggestions below from other users, I just wanted to make sure no one came across this and thought it would be good to model their code after.
looping_program() {
for i in 1 2 3; do echo $i; sleep 1; done
echo Success
yes
}
coproc looping_program
while IFS= read -r line; do
if [[ "$line" =~ Success ]]; then
break
fi
done <&${COPROC[0]}
exec {COPROC[0]}>&- {COPROC[1]}>&-
kill ${COPROC_PID}
wait ${COPROC_PID}
Notes:
Do not use temporary file. Use a fifo.
Do not use tail -n1 to read last line. Read from the stream in a loop.
Do not repeat tail -1 twice. Cache the result.
Wait for pid after killing to synchronize.
When you're using a coprocess, use COPROC_PID to get the PID
When you're not using a coprocess, use $! to get the PID of a background process started from the current shell.
When you can't use $! (because the process you're trying to get a PID of was not spawned in the background as a direct child of the current shell), do not use ps aux | grep to get the pid. Use pgrep.
Do not use echo $(stuff). Just run the stuff, no echo.
With expect
#!/usr/bin/env -S expect -f
set timeout -1
spawn ./looping_program.sh
expect "Success"
send -- "\x03"
expect eof
Call it looping_killer:
$ ./looping_killer
spawn ./looping_program.sh
Fail
Fail
Fail
Success
^C
To pass the program and pattern:
./looping_killer some_program "some pattern"
You'd change the expect script to
#!/usr/bin/env -S expect -f
set timeout -1
spawn [lindex $argv 0]
expect -- [lindex $argv 1]
send -- "\x03"
expect eof
Assuming that your looping program exits when it tries to write to a broken pipe, this will print all output up to and including the 'Success' line and then exit:
./looping_program | sed '/^Success$/q'
You may need to disable buffering of the looping program output. See Force line-buffering of stdout in a pipeline and How to make output of any shell command unbuffered? for ways to do it.
See Should I save my scripts with the .sh extension? and Erlkonig: Commandname Extensions Considered Harmful for reasons why I dropped the '.sh' suffix.
Note that any approach that involves using kill with a PID in shell code runs the risk of killing the wrong process. Use kill in shell programs only when it is absolutely necessary.
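Pulling the notes above together (a fifo instead of a temporary file, reading the stream in a loop, $! for the PID, wait after kill), a regexout-style wrapper could look roughly like this. It is only a sketch: it assumes bash and mktemp -d, and the function name and argument order are taken from the question's wish, not from any existing tool.

```shell
#!/bin/bash
# regexout <regex> <command> [args...]
# Hypothetical helper: echoes the command's stdout line by line and kills the
# command as soon as a line matches the regex.
# Returns 0 if a line matched, 1 if the output ended without a match.
regexout() {
    local pattern=$1 line status=1 dir pid
    shift
    dir=$(mktemp -d) || return
    mkfifo "$dir/out"
    "$@" >"$dir/out" &          # direct child, so $! is the right PID
    pid=$!
    while IFS= read -r line; do
        printf '%s\n' "$line"
        if [[ $line =~ $pattern ]]; then
            status=0
            break
        fi
    done <"$dir/out"
    kill "$pid" 2>/dev/null || :    # may already have exited
    wait "$pid" 2>/dev/null || :    # reap it before returning
    rm -rf "$dir"
    return "$status"
}
```

For example, regexout '^Success$' ./looping_program would print everything up to and including the Success line. If the command is itself a shell script, killing it does not necessarily kill a child it is currently waiting on.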

how do I watch for a process to have died in shell script?

I'm running a shell test program that shows a progress bar, but when I run it I keep getting a unary error. Is kill -0 a way to kill a subprocess in shell?
Or is there another method to test if my process has died?
Here's my code to run a progress bar until my command ends:
#!/bin/sh
# test my progress bar
spin[0]="-"
spin[1]="\\"
spin[2]="|"
spin[3]="/"
sleep 10 2>/dev/null & # run as background process
pid=$! # grab process id
echo -n "[sleeping] ${spin[0]}"
while [ kill -0 $pid ] # wait for process to end
do
for i in "${spin[@]}"
do
echo -ne "\b$i"
sleep 0.1
done
done
1. Is kill -0 a way to kill a subprocess in shell?
No. kill -0 sends no signal at all: signal 0 is the POSIX "null signal", for which only the error checking is performed. It is a way to test whether a process exists (and whether you are allowed to signal it) without affecting it.
If the process is running, kill returns 0; if not, it returns a non-zero status.
ps -p $pid >/dev/null 2>&1 could do the same job.
To actually kill a process, one generally uses SIGTERM/15 (the default signal sent by kill), which the process can trap to make a clean exit, or SIGKILL/9, which cannot be trapped or ignored, so the OS terminates the process 'quick and dirty'.
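As a rough illustration of how testing and killing fit together, here is a sketch of a helper that asks a process to exit and only forces it if it is still there after a grace period. The name stop_process and the 5-second default are my own choices, not anything standard.

```shell
#!/bin/sh
# stop_process <pid> [grace-seconds]
# Hypothetical helper: polite termination first, forced termination as a last resort.
stop_process() {
    pid=$1
    grace=${2:-5}
    kill -0 "$pid" 2>/dev/null || return 0      # nothing to do, already gone
    kill -TERM "$pid" 2>/dev/null || return 0   # ask it to exit (trappable)
    # kill -0 only tests for existence, it sends nothing
    while [ "$grace" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        grace=$((grace - 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -KILL "$pid" 2>/dev/null           # cannot be trapped or ignored
    fi
    return 0
}
```

Note that a child that has exited but has not yet been waited for may still pass the kill -0 test in some shells, so the caller should still wait for its own children afterwards.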
2. test and '['
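The square bracket '[' is a command (a shell builtin, also available as the utility /bin/[ ) that evaluates a test expression. kill -0 $pid is not a valid test expression, which is why it reports an error.
The syntax of while is while list; do list; done, where the loop keeps running as long as list returns a zero exit code, so you can use kill -0 directly as the condition without wrapping it in [ ].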
3. how do I watch for a process to have died in shell script?
With that fix applied to your code, the script below does the job:
#!/bin/bash
spin[0]="-"
spin[1]="\\"
spin[2]="|"
spin[3]="/"
sleep 10 2>/dev/null & # run as background process
pid=$! # grab process id
echo -n "[sleeping] ${spin[0]}"
#while ps -p $pid >/dev/null 2>&1 # using ps
while kill -0 $pid >/dev/null 2>&1 # using kill
do
for i in "${spin[@]}"
do
echo -ne "\b$i"
sleep 0.5
done
done
CAVEATS
I use /bin/bash as the interpreter, since some Bourne-style shells (sh) do not support arrays (i.e. spin[n]).
It's probably cleaner to run the spinner in the background and kill it when the process (running in the foreground) terminates. Or, you could open another file descriptor and write something into it after the background process terminates, and have the main process block on a read. eg:
#!/bin/bash
# test my progress bar
spin[0]='-'
spin[1]='\'
spin[2]='|'
spin[3]='/'
{ { { sleep 10 2>/dev/null; echo >&5; } & # run as background process
} 5>&1 1>&3 | { # wait for process to end
while ! read -t 1; do
printf '\r[sleeping] %s' "${spin[ $(( i = (i + 1) % 4 )) ]}"
done
}
} 3>&1
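The first suggestion above (spinner in the background, real command in the foreground) might look something like this. It is a sketch that assumes bash and a sleep that accepts fractional seconds; with_spinner is a made-up name.

```shell
#!/bin/bash
# with_spinner <command> [args...]
# Hypothetical helper: runs the command in the foreground while a background
# loop draws a spinner on stderr, then returns the command's exit status.
with_spinner() {
    local spin='-\|/' i=0 spinner_pid rc
    while :; do
        # pick one character of $spin per tick
        printf '\r[working] %s' "${spin:i++%4:1}" >&2
        sleep 0.1
    done &
    spinner_pid=$!
    "$@"
    rc=$?
    kill "$spinner_pid" 2>/dev/null || :
    wait "$spinner_pid" 2>/dev/null || :   # reap the spinner
    printf '\r' >&2
    return "$rc"
}
```

Because the command stays in the foreground, there is no need to poll for its PID at all, and its exit status is available directly, e.g. with_spinner sleep 10.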

Why is the second bash script not printing its iteration?

I have two bash scripts:
a.sh:
echo "running"
doit=true
if [ $doit = true ];then
./b.sh &
fi
some-long-operation-binary
echo "done"
b.sh:
for i in {0..50}; do
echo "counting";
sleep 1;
done
I get this output:
> ./a.sh
running
counting
Why do I only see the first "counting" from b.sh and then nothing more? (For this example, some-long-operation-binary is just sleep 5.) I first thought that because b.sh runs in the background its STDOUT is lost, but then why do I see the first output? More importantly: is b.sh still running and doing its thing (its iteration)?
For context:
b.sh is going to poll a service provided by some-long-operation-binary, which only becomes available some time after the latter has started, and when ready, would write its content to a file.
Apologies if this is just rubbish, it's a bit late...
You should add #!/bin/bash or the like to b.sh that uses a Bash-like expansion, to make sure Bash is actually running the script. Otherwise there may be (indeed) only one loop iteration happening.
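If you can't guarantee Bash, the loop can also be written without brace expansion. Under a plain POSIX sh, {0..50} stays a single literal word, so the for loop body runs exactly once, which matches the single "counting" you saw. A minimal sketch (count_loop is just an illustrative name):

```shell
#!/bin/sh
# count_loop <last> <delay-seconds>
# Portable counterpart of 'for i in {0..<last>}; do echo counting; sleep <delay>; done'
count_loop() {
    last=$1
    delay=$2
    i=0
    while [ "$i" -le "$last" ]; do
        echo "counting"
        sleep "$delay"
        i=$((i + 1))
    done
}
```

count_loop 50 1 then behaves the same way whichever shell ends up running the script.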
When you start a background process, it is usually a good practice to kill it and wait for it, no matter which way the script exits.
#!/bin/bash
set -e -o pipefail
declare -i show_counter=1
counter() {
local -i i
for ((i = 0;; ++i)); do
echo "counting $((i))"
sleep 1
done
}
echo starting
if ((show_counter)); then
counter &
declare -i counter_pid="${!}"
trap 'kill "${counter_pid}"
wait -n "${counter_pid}" || :
echo terminating' EXIT
fi
sleep 10 # long-running process

shell script - how to stop "watch" command in the shell script [duplicate]

I have a bash script that launches a child process that crashes (actually, hangs) from time to time and with no apparent reason (closed source, so there isn't much I can do about it). As a result, I would like to be able to launch this process for a given amount of time, and kill it if it did not return successfully after a given amount of time.
Is there a simple and robust way to achieve that using bash?
P.S.: tell me if this question is better suited to serverfault or superuser.
(As seen in:
BASH FAQ entry #68: "How do I run a command, and have it abort (timeout) after N seconds?")
If you don't mind downloading something, use timeout. Most systems already have it installed as part of coreutils; otherwise use sudo apt-get install coreutils (or sudo apt-get install timeout). Use it like:
timeout 10 ping www.goooooogle.com
If you don't want to download something, do what timeout does internally:
( cmdpid=$BASHPID; (sleep 10; kill $cmdpid) & exec ping www.goooooogle.com )
If you want a timeout for a longer piece of bash code, use the second option like this:
( cmdpid=$BASHPID;
(sleep 10; kill $cmdpid) \
& while ! ping -w 1 www.goooooogle.com
do
echo crap;
done )
# Spawn a child process:
(dosmth) & pid=$!
# in the background, sleep for 10 secs then kill that process
(sleep 10 && kill -9 $pid) &
or to get the exit codes as well:
# Spawn a child process:
(dosmth) & pid=$!
# in the background, sleep for 10 secs then kill that process
(sleep 10 && kill -9 $pid) & waiter=$!
# wait on our worker process and save its exit code
# (wait must run in this shell, not in a $(...) subshell, to see our child)
wait $pid
exitcode=$?
# kill the waiter subshell, if it still runs
kill -9 $waiter 2>/dev/null
# 0 if we killed the waiter, cause that means the process finished before the waiter
finished_gracefully=$?
sleep 999&
t=$!
sleep 10
kill $t
I also had this question and found two more things very useful:
The SECONDS variable in bash.
The command "pgrep".
So I use something like this on the command line (OSX 10.9):
ping www.goooooogle.com & PING_PID=$(pgrep 'ping'); SECONDS=0; while pgrep -q 'ping'; do sleep 0.2; if [ $SECONDS = 10 ]; then kill $PING_PID; fi; done
As this is a loop I included a "sleep 0.2" to keep the CPU cool. ;-)
(BTW: ping is a bad example anyway, you just would use the built-in "-t" (timeout) option.)
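The same SECONDS idea also works without pgrep if the command is started from the script itself, because $! then gives the right PID. A sketch, assuming bash and a sleep that accepts fractions; run_with_deadline is a hypothetical name:

```shell
#!/bin/bash
# run_with_deadline <seconds> <command> [args...]
# Hypothetical helper: polls the child with kill -0 and kills it once
# SECONDS reaches the limit. Returns the child's exit status.
run_with_deadline() {
    local limit=$1 pid
    shift
    "$@" &
    pid=$!
    SECONDS=0                        # note: resets the shell-wide SECONDS counter
    while kill -0 "$pid" 2>/dev/null; do
        if (( SECONDS >= limit )); then
            kill "$pid" 2>/dev/null || :
            break
        fi
        sleep 0.2                    # keep the CPU cool
    done
    wait "$pid"                      # 143 (128+15) if we had to kill it
}
```

Since SECONDS only has one-second resolution, the effective limit can be up to a second shorter than requested.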
Assuming you have (or can easily make) a pid file for tracking the child's pid, you could then create a script that checks the modtime of the pid file and kills/respawns the process as needed. Then just put the script in crontab to run at approximately the period you need.
Let me know if you need more details. If that doesn't sound like it'd suit your needs, what about upstart?
One way is to run the program in a subshell, and communicate with the subshell through a named pipe with the read command. This way you can check the exit status of the process being run and communicate this back through the pipe.
Here's an example of timing out the yes command after 3 seconds. It gets the PID of the process using pgrep (possibly only works on Linux). One problem with using a pipe is that a process opening it for reading hangs until another process opens it for writing, and vice versa. So, to prevent the read command from hanging, I've "wedged" the pipe open with a background subshell that holds it open for writing. (Another way to prevent the freeze is to open the pipe read-write, i.e. read -t 5 <>finished.pipe; however, that may also not work except on Linux.)
rm -f finished.pipe
mkfifo finished.pipe
{ yes >/dev/null; echo finished >finished.pipe ; } &
SUBSHELL=$!
# Get command PID
while : ; do
PID=$( pgrep -P $SUBSHELL yes )
test "$PID" = "" || break
sleep 1
done
# Open pipe for writing
{ exec 4>finished.pipe ; while : ; do sleep 1000; done } &
read -t 3 FINISHED <finished.pipe
if [ "$FINISHED" = finished ] ; then
echo 'Subprocess finished'
else
echo 'Subprocess timed out'
kill $PID
fi
rm finished.pipe
Here's an attempt which tries to avoid killing a process after it has already exited, which reduces the chance of killing another process with the same process ID (although it's probably impossible to avoid this kind of error completely).
run_with_timeout ()
{
t=$1
shift
echo "running \"$*\" with timeout $t"
(
# first, run process in background
(exec sh -c "$*") &
pid=$!
echo $pid
# the timeout shell
(sleep $t ; echo timeout) &
waiter=$!
echo $waiter
# finally, allow process to end naturally
wait $pid
echo $?
) \
| (read pid
read waiter
if test $waiter != timeout ; then
read status
else
status=timeout
fi
# if we timed out, kill the process
if test $status = timeout ; then
kill $pid
exit 99
else
# if the program exited normally, kill the waiting shell
kill $waiter
exit $status
fi
)
}
Use like run_with_timeout 3 sleep 10000, which runs sleep 10000 but ends it after 3 seconds.
This is like other answers which use a background timeout process to kill the child process after a delay. I think this is almost the same as Dan's extended answer (https://stackoverflow.com/a/5161274/1351983), except the timeout shell will not be killed if it has already ended.
After this program has ended, there will still be a few lingering "sleep" processes running, but they should be harmless.
This may be a better solution than my other answer because it does not use the non-portable shell feature read -t and does not use pgrep.
Here's the third answer I've submitted here. This one handles signal interrupts and cleans up background processes when SIGINT is received. It uses the $BASHPID and exec trick used in the top answer to get the PID of a process (in this case $$ in a sh invocation). It uses a FIFO to communicate with a subshell that is responsible for killing and cleanup. (This is like the pipe in my second answer, but having a named pipe means that the signal handler can write into it too.)
run_with_timeout ()
{
t=$1 ; shift
trap cleanup 2
F=$$.fifo ; rm -f $F ; mkfifo $F
# first, run main process in background
"$@" & pid=$!
# sleeper process to time out
( sh -c "echo \$\$ >$F ; exec sleep $t" ; echo timeout >$F ) &
read sleeper <$F
# control shell. read from fifo.
# final input is "finished". after that
# we clean up. we can get a timeout or a
# signal first.
( exec 0<$F
while : ; do
read input
case $input in
finished)
test $sleeper != 0 && kill $sleeper
rm -f $F
exit 0
;;
timeout)
test $pid != 0 && kill $pid
sleeper=0
;;
signal)
test $pid != 0 && kill $pid
;;
esac
done
) &
# wait for process to end
wait $pid
status=$?
echo finished >$F
return $status
}
cleanup ()
{
echo signal >$$.fifo
}
I've tried to avoid race conditions as far as I can. However, one source of error I couldn't remove is when the process ends near the same time as the timeout. For example, run_with_timeout 2 sleep 2 or run_with_timeout 0 sleep 0. For me, the latter gives an error:
timeout.sh: line 250: kill: (23248) - No such process
as it is trying to kill a process that has already exited by itself.
#Kill command after 10 seconds
timeout 10 command
#If you don't have timeout installed, this is almost the same:
sh -c '(sleep 10; kill "$$") & command'
#The same as above, with muted duplicate messages:
sh -c '(sleep 10; kill "$$" 2>/dev/null) & command'

How do I receive notification in a bash script when a specific child process terminates?

I wonder if anyone can help with this?
I have a bash script. It starts a sub-process which is another GUI-based application. The bash script then goes into an interactive mode getting input from the user. This interactive mode continues indefinitely. I would like it to terminate when the GUI application in the sub-process exits.
I have looked at SIGCHLD but this doesn't seem to be the answer. Here's what I've tried but I don't get a signal when the prog ends.
set -o monitor
"${prog}" &
prog_pid=$!
function check_pid {
kill -0 $1 2> /dev/null
}
function cleanup {
### does cleanup stuff here
exit
}
function sigchld {
check_pid $prog_pid
[[ $? == 1 ]] && cleanup
}
trap sigchld SIGCHLD
Updated following answers. I now have this working using the suggestion from 'nosid'. I have another, related, issue now which is that the interactive process that follows is a basic menu driven process that blocks waiting for key input from the user. If the child process ends the USR1 signal is not handled until after input is received. Is there any way to force the signal to be handled immediately?
The wait loop looks like this:
stty raw # set the tty driver to raw mode
max=$1 # maximum valid choice
choice=$(expr $max + 1) # invalid choice
while [[ $choice -gt $max ]]; do
choice=`dd if=/dev/tty bs=1 count=1 2>/dev/null`
done
stty sane # restore tty
Updated with solution. I have solved this. The trick was to use nonblocking I/O for the read. Now, with the answer from 'nosid' and my modifications, I have exactly what I want. For completeness, here is what works for me:
#!/bin/bash -bm
{
"${1}"
kill -USR1 $$
} &
function cleanup {
# cleanup stuff
exit
}
trap cleanup SIGUSR1
while true ; do
stty raw # set the tty driver to raw mode
max=9 # maximum valid choice
while [[ $choice -gt $max || -z $choice ]]; do
choice=`dd iflag=nonblock if=/dev/tty bs=1 count=1 2>/dev/null`
done
stty sane # restore tty
# process choice
done
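A possible alternative to dd with iflag=nonblock is bash's own read with a short timeout, so the loop wakes up regularly and the pending trap gets a chance to run. This is only a sketch under that assumption; menu_until_exit is a made-up name, and it uses $BASHPID rather than $$ so that it also works when called inside a subshell.

```shell
#!/bin/bash
# menu_until_exit <command> [args...]
# Hypothetical helper: polls for single keys until the background command exits.
menu_until_exit() {
    local me=$BASHPID key
    child_done=0                      # global flag set by the trap
    trap 'child_done=1' USR1
    { "$@"; kill -USR1 "$me"; } &
    while (( ! child_done )); do
        # read gives up after 0.2 s, so the trap is never delayed for long
        if IFS= read -r -s -n 1 -t 0.2 key; then
            echo "got key: $key"      # process the choice here
        fi
    done
    trap - USR1
    wait
    echo "child exited"
}
```

The 0.2 second timeout is an arbitrary trade-off between responsiveness and wake-ups.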
Here is a different approach. Instead of using SIGCHLD, you can execute an arbitrary command as soon as the GUI application terminates.
{
some_command args...
kill -USR1 $$
} &
function sigusr1() { ... }
trap sigusr1 SIGUSR1
Ok. I think I understand what you need. Have a look at my .xinitrc:
xrdb ~/.Xdefaults
source ~/.xinitrc.hw.settings
xcompmgr &
xscreensaver &
# after starting some arbitrary crap we want to start the main gui.
startfluxbox & PIDOFAPP=$! ## THIS IS THE IMPORTANT PART
setxkbmap genja
wmclockmon -bl &
sleep 1
wmctrl -s 3 && aterms sone &
sleep 1
wmctrl -s 0
wait $PIDOFAPP ## THIS IS THE SECOND PART OF THE IMPORTANT PART
xeyes -geometry 400x400+500+400 &
sleep 2
echo im out!
What happens is that after you send a process to the background, you can use wait to block until that process dies. Whatever comes after wait will not be executed as long as the application is running. You can use this to exit after the GUI has been shut down.
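Stripped of the X11 specifics, the pattern is roughly the following sketch (start_then_wait is a hypothetical name; the command you pass stands in for the GUI application):

```shell
#!/bin/sh
# start_then_wait <command> [args...]
# Hypothetical helper: start the command in the background, do other work,
# then block until it exits and report its exit status.
start_then_wait() {
    "$@" &
    app_pid=$!
    echo "started"                   # other setup work would go here
    status=0
    wait "$app_pid" || status=$?     # blocks until the background process dies
    echo "exited with status $status"
}
```

wait also hands back the exit status of the process it waited for, so the script can tell a clean shutdown from a crash.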
PS: I run bash.
I think you need to do:
set -bm
or
set -o monitor -o notify
As per the bash manual:
-b
Cause the status of terminated background jobs to be reported immediately, rather than before printing the next primary prompt.
The shell's main job is executing child processes, and it needs to catch SIGCHLD for its own purposes. This somewhat restricts its ability to pass the signal on to the script itself.
Could you just check for child processes and send the alert based on that? You can find the children as below:
bash_pid=$$
while true
do
children=`ps -eo ppid | grep -w $bash_pid`
if [ -z "$children" ]; then
cleanup
alert
exit
fi
done

Resources