nonblocking wait ${myPid} in bash [duplicate] - bash

Is there any builtin feature in Bash to wait for a process to finish?
The wait command only allows one to wait for child processes to finish.
I would like to know if there is any way to wait for any process to finish before proceeding in any script.
A mechanical way to do this is as follows but I would like to know if there is any builtin feature in Bash.
while ps -p `cat $PID_FILE` > /dev/null; do sleep 1; done

To wait for any process to finish
Linux (doesn't work on Alpine, where BusyBox tail doesn't support --pid):
tail --pid=$pid -f /dev/null
Darwin (requires that $pid has open files):
lsof -p $pid +r 1 &>/dev/null
With timeout (seconds)
Linux:
timeout $timeout tail --pid=$pid -f /dev/null
Darwin (requires that $pid has open files):
lsof -p $pid +r 1m%s -t | grep -qm1 $(date -v+${timeout}S +%s 2>/dev/null || echo INF)
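The Linux variant can be wrapped in a small function that also tells the caller whether the wait timed out. This is only a sketch: the name wait_pid_timeout is made up for illustration, and it assumes GNU coreutils (tail --pid and timeout), so it will not work with BusyBox or BSD userlands. GNU timeout exits with status 124 when the limit is reached.

```shell
# wait_pid_timeout PID SECONDS  (hypothetical helper; assumes GNU tail and timeout)
# Returns 0 once PID is gone, 124 if SECONDS elapse first.
wait_pid_timeout() {
    local pid=$1 secs=$2
    # tail exits when PID dies; timeout kills tail after $secs seconds
    timeout "$secs" tail --pid="$pid" -f /dev/null
}
```

Typical use would be something like wait_pid_timeout "$pid" 30 || echo "still running".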

There's no builtin. Use kill -0 in a loop for a workable solution:
anywait(){
    for pid in "$@"; do
        while kill -0 "$pid" 2>/dev/null; do
            sleep 0.5
        done
    done
}
Or as a simpler oneliner for easy one time usage:
while kill -0 PIDS 2> /dev/null; do sleep 1; done;
As noted by several commenters, if you want to wait for processes that you do not have the privilege to send signals to, you have to find some other way to detect whether the process is running, to replace the kill -0 $pid call. On Linux, test -d "/proc/$pid" works; on other systems you might have to use pgrep (if available) or something like ps | grep "^$pid ".
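The fallbacks mentioned above can be chained so that the loop still works for processes owned by other users. A minimal sketch, with pid_alive and wait_for_pid as hypothetical names; note that a zombie still counts as alive here.

```shell
# pid_alive PID  (hypothetical helper)
# Succeeds if a process with this PID exists, even if we may not signal it.
pid_alive() {
    kill -0 "$1" 2>/dev/null && return 0   # we can signal it, so it exists
    [ -d "/proc/$1" ] && return 0          # Linux procfs: works without privileges
    ps -p "$1" >/dev/null 2>&1             # POSIX ps as a last resort
}

# wait_for_pid PID [INTERVAL]: poll until the process is gone.
wait_for_pid() {
    while pid_alive "$1"; do
        sleep "${2:-1}"
    done
}
```

The order of the checks is a design choice: kill -0 is a builtin and therefore cheapest, while ps costs a fork on every poll.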

I found "kill -0" does not work if the process is owned by root (or other), so I used pgrep and came up with:
while pgrep -u root process_name > /dev/null; do sleep 1; done
This would have the disadvantage of probably matching zombie processes.

This bash script loop ends if the process does not exist, or it's a zombie.
PID=<pid to watch>
while s=`ps -p $PID -o s=` && [[ "$s" && "$s" != 'Z' ]]; do
sleep 1
done
EDIT: The above script was given below by Rockallite. Thanks!
My original answer below works on Linux, relying on procfs, i.e. /proc/. I don't know how portable it is:
while [[ ( -d /proc/$PID ) && ( -z `grep zombie /proc/$PID/status` ) ]]; do
sleep 1
done
It's not limited to the shell; operating systems themselves do not have system calls to watch the termination of non-child processes.
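On Linux, the zombie check can also be done without spawning ps or grep, by reading the State: line of /proc/PID/status. A minimal sketch, assuming procfs is mounted; proc_state and wait_until_gone are names invented here.

```shell
# proc_state PID: print the one-letter state (R, S, D, Z, T, ...) from procfs,
# or nothing if the process does not exist. Linux-only.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status" 2>/dev/null
}

# wait_until_gone PID [INTERVAL]: return once PID has exited or become a zombie.
wait_until_gone() {
    local s
    while s=$(proc_state "$1") && [ -n "$s" ] && [ "$s" != Z ]; do
        sleep "${2:-1}"
    done
}
```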

FreeBSD and Solaris have the handy pwait(1) utility, which does exactly what you want.
I believe other modern OSes also have the necessary system calls (macOS, for example, implements BSD's kqueue), but not all make them available from the command line.

From the bash manpage
wait [n ...]
Wait for each specified process and return its termination status
Each n may be a process ID or a job specification; if a
job spec is given, all processes in that job's pipeline are
waited for. If n is not given, all currently active child processes
are waited for, and the return status is zero. If n
specifies a non-existent process or job, the return status is
127. Otherwise, the return status is the exit status of the
last process or job waited for.
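A short demonstration of the two cases the man page describes: wait returns a child's exit status, and returns 127 for a PID that is not a child of the current shell. PID 1 is used here only as an example of a process that is certainly not our child; the variable names are illustrative.

```shell
# A child of this shell: wait returns its exit status.
( exit 3 ) &
child_status=0
wait $! || child_status=$?
echo "child exited with $child_status"

# PID 1 exists but is not our child, so wait cannot block on it.
nonchild_status=0
wait 1 2>/dev/null || nonchild_status=$?
echo "wait on non-child returned $nonchild_status"
```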

Okay, so it seems the answer is -- no, there is no built in tool.
After setting /proc/sys/kernel/yama/ptrace_scope to 0, it is possible to use the strace program. Further switches can be used to make it silent, so that it really waits passively:
strace -qqe '' -p <PID>

All these solutions are tested in Ubuntu 14.04:
Solution 1 (by using ps command):
Just to add to Pierz's answer, I would suggest:
while ps axg | grep -vw grep | grep -w process_name > /dev/null; do sleep 1; done
In this case, grep -vw grep ensures that grep matches only process_name and not grep itself. It has the advantage of supporting cases where process_name is not at the end of a line in the ps axg output.
Solution 2 (by using top command and process name):
while [[ $(awk '$12=="process_name" {print $0}' <(top -n 1 -b)) ]]; do sleep 1; done
Replace process_name with the process name that appears in top -n 1 -b. Please keep the quotation marks.
To see the list of processes you are waiting on, you can run:
while : ; do p=$(awk '$12=="process_name" {print $0}' <(top -n 1 -b)); [[ $p ]] || break; echo "$p"; sleep 1; done
Solution 3 (by using top command and process ID):
while [[ $(awk '$1=="process_id" {print $0}' <(top -n 1 -b)) ]]; do sleep 1; done
Replace process_id with the process ID of your program.

Blocking solution
Use wait in a loop to wait for all processes to terminate:
function anywait()
{
    for pid in "$@"
    do
        wait $pid
        echo "Process $pid terminated"
    done
    echo 'All processes terminated'
}
This function returns as soon as all processes have terminated. It is the most efficient solution, but wait only works for child processes of the current shell.
Non-blocking solution
Use kill -0 in a loop to wait for all processes to terminate, doing whatever you like between checks:
function anywait_w_status()
{
    for pid in "$@"
    do
        while kill -0 "$pid" 2>/dev/null
        do
            echo "Process $pid still running..."
            sleep 1
        done
    done
    echo 'All processes terminated'
}
The reaction time is limited by the sleep interval, which is needed to prevent high CPU usage.
A realistic usage:
Wait for all processes to terminate and inform the user about the PIDs that are still running.
function anywait_w_status2()
{
    while true
    do
        alive_pids=()
        for pid in "$@"
        do
            kill -0 "$pid" 2>/dev/null \
                && alive_pids+=("$pid")
        done
        if [ ${#alive_pids[@]} -eq 0 ]
        then
            break
        fi
        echo "Process(es) still running... ${alive_pids[*]}"
        sleep 1
    done
    echo 'All processes terminated'
}
Notes
These functions receive the PIDs as arguments, via "$@".

I had the same issue and solved it by killing the process and then waiting for it to finish, using the proc filesystem:
while [ -e /proc/${pid} ]; do sleep 0.1; done

There is no builtin feature to wait for any process to finish.
You could send kill -0 to any PID found, so you don't get puzzled by zombies and stuff that will still be visible in ps (while still retrieving the PID list using ps).

If you need to both kill a process and wait for it to finish, this can be achieved with killall(1) (based on process names), and start-stop-daemon(8) (based on a pidfile).
To kill all processes matching someproc and wait for them to die:
killall someproc --wait # wait forever until matching processes die
timeout 10s killall someproc --wait # timeout after 10 seconds
(Unfortunately, there's no direct equivalent of --wait with kill for a specific pid).
To kill a process based on a pidfile /var/run/someproc.pid using signal SIGINT, while waiting for it to finish, with SIGKILL being sent after 20 seconds of timeout, use:
start-stop-daemon --stop --signal INT --retry 20 --pidfile /var/run/someproc.pid
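For a single PID, the missing --wait behaviour can be approximated with a polling loop. The following is a sketch rather than a drop-in replacement: kill_and_wait is a hypothetical name, the grace period defaults to 10 seconds by assumption, and it has the usual kill -0 limitation that you need permission to signal the process.

```shell
# kill_and_wait PID [GRACE_SECONDS]  (hypothetical helper)
# Sends SIGTERM, polls once per second, and escalates to SIGKILL after the grace period.
# Returns 0 if the process went away after SIGTERM, 1 if SIGKILL was needed,
# 2 if the process could not be signalled at all.
kill_and_wait() {
    local pid=$1 grace=${2:-10} waited=0
    kill -TERM "$pid" 2>/dev/null || return 2   # no such process, or no permission
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```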

Use inotifywait to monitor some file that gets closed when your process terminates. Example (on Linux):
yourproc >logfile.log & disown
inotifywait -q -e close logfile.log
-e specifies the event to wait for, -q means minimal output only on termination. In this case it will be:
logfile.log CLOSE_WRITE,CLOSE
A single inotifywait command can be used to wait for multiple processes:
yourproc1 >logfile1.log & disown
yourproc2 >logfile2.log & disown
yourproc3 >logfile3.log & disown
inotifywait -q -e close logfile1.log logfile2.log logfile3.log
The output string of inotifywait will tell you which process terminated. This only works with 'real' files, not with something in /proc/.

Rauno Palosaari's solution for Timeout in Seconds Darwin, is an excellent workaround for a UNIX-like OS that does not have GNU tail (it is not specific to Darwin). But, depending on the age of the UNIX-like operating system, the command-line offered is more complex than necessary, and can fail:
lsof -p $pid +r 1m%s -t | grep -qm1 $(date -v+${timeout}S +%s 2>/dev/null || echo INF)
On at least one old UNIX, the lsof argument +r 1m%s fails (even for a superuser):
lsof: can't read kernel name list.
The m%s is an output format specification. A simpler post-processor does not require it. For example, the following command waits on PID 5959 for up to five seconds:
lsof -p 5959 +r 1 | awk '/^=/ { if (T++ >= 5) { exit 1 } }'
In this example, if PID 5959 exits of its own accord before the five seconds elapse, ${?} is 0. If not, ${?} is 1 after five seconds.
It may be worth expressly noting that in +r 1, the 1 is the poll interval (in seconds), so it may be changed to suit the situation.

On a system like OSX you might not have pgrep, so you can try this approach when looking for processes by name:
while ps axg | grep process_name$ > /dev/null; do sleep 1; done
The $ symbol at the end of the process name ensures that grep matches only process_name to the end of line in the ps output and not itself.

Related

Is there a command, or programmatic way I can use to extract PIDs, and put them in a space-separated list? [duplicate]

I'm writing a bash script, which does several things.
In the beginning it starts several monitor scripts, each of them runs some other tools.
At the end of my main script, I would like to kill all things that were spawned from my shell.
So, it might looks like this:
#!/bin/bash
some_monitor1.sh &
some_monitor2.sh &
some_monitor3.sh &
do_some_work
...
kill_subprocesses
The thing is that most of these monitors spawn their own subprocesses, so doing (for example): killall some_monitor1.sh will not always help.
Any other way to handle this situation?
pkill -P $$
will fit (it kills only the script's own child processes)
Here is the help of -P
-P, --parent ppid,...
Only match processes whose parent process ID is listed.
and $$ is the process id of the script itself
After starting each child process, you can get its id with
ID=$!
Then you can use the stored PIDs to find and kill all grandchild etc. processes as described here or here.
If you use a negative PID with kill it will kill a process group. Example:
kill -- -1234
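To make this concrete: a job started through setsid becomes the leader of a new session and process group, so its PID doubles as the group ID and a single negative-PID kill reaches everything it spawned. A sketch, assuming util-linux setsid and a non-interactive script (where setsid does not need to fork, so $! is the group leader); the sleep commands stand in for real work and the variable names are illustrative.

```shell
# Start a worker and a grandchild in a fresh process group.
setsid sh -c 'sleep 300 & sleep 300' &
pgid=$!        # assumption: setsid did not fork, so this PID is also the PGID

sleep 1        # crude: give setsid time to create the group before signalling

# The "--" stops option parsing; the leading minus sign means "process group".
kill -TERM -- "-$pgid"

# Reap the group leader; a status of 143 means it died from SIGTERM (128 + 15).
group_status=0
wait "$pgid" || group_status=$?
```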
Extending pihentagy's answer to recursively kill all descendants (not just children):
kill_descendant_processes() {
    local pid="$1"
    local and_self="${2:-false}"
    if children="$(pgrep -P "$pid")"; then
        for child in $children; do
            kill_descendant_processes "$child" true
        done
    fi
    if [[ "$and_self" == true ]]; then
        kill -9 "$pid"
    fi
}
Now
kill_descendant_processes $$
will kill descendants of the current script/shell.
(Tested on Mac OS 10.9.5. Only depends on pgrep and kill)
kill $(jobs -p)
Rhys Ulerich's suggestion:
Caveat a race condition, using [code below] accomplishes what Jürgen suggested without causing an error when no jobs exist
[[ -z "$(jobs -p)" ]] || kill $(jobs -p)
pkill with option "-P" should help:
pkill -P $(pgrep some_monitor1.sh)
from man page:
-P ppid,...
Only match processes whose parent process ID is listed.
There are some discussions on linuxquestions.org, please check:
http://www.linuxquestions.org/questions/programming-9/use-only-one-kill-to-kill-father-and-child-processes-665753/
I like the following straightforward approach: start the subprocesses with an environment variable with some name/value and use this to kill the subprocesses later. Most convenient is to use the process ID of the running bash script, i.e. $$. This also works when subprocesses start further subprocesses, as the environment is inherited.
So start the subprocesses like this:
MY_SCRIPT_TOKEN=$$ some_monitor1.sh &
MY_SCRIPT_TOKEN=$$ some_monitor2.sh &
And afterwards kill them like this:
ps -Eef | grep "MY_SCRIPT_TOKEN=$$" | awk '{print $2}' | xargs kill
Similar to above, just a minor tweak to kill all processes indicated by ps:
ps -o pid= | tail -n +2 | xargs kill -9
Perhaps sloppy / fragile, but seemed to work at first blush. Relies on fact that current process ($$) tends to be first line.
Description of commands, in order:
Print PIDs for processes in current terminal, excl. header column
Start from Line 2 (excl. current terminal's shell)
Kill those procs
I've incorporated a bunch of the suggestions from the answers here into a single function. It gives time for processes to exit, murders them if they take too long, and doesn't have to grep through output (eg, via ps)
#!/bin/bash
# This function will kill all sub jobs.
function KillJobs() {
[[ -z "$(jobs -p)" ]] && return # no jobs to kill
local SIG="INT" # default to a gentle goodbye
[[ ! -z "$1" ]] && SIG="$1" # optionally send a different signal
# my version of 'kill' doesn't seem to understand `kill -- -${PID}`
#jobs -p | xargs -I%% kill -s "$SIG" -- -%% # kill each job's processes group
jobs -p | xargs kill -s "$SIG" # kill each job's processes group
## give the processes a moment to die, before forcing them to.
[[ "$SIG" != "KILL" ]] && {
sleep 0.2
KillJobs "KILL"
}
}
I also tried to get a variation working with pkill, but on my system (xubuntu 21.10) it does absolutely nothing.
#!/bin/bash
# This function doesn't seem to work.
function KillChildren() {
local SIG="INT" # default to a gentle goodbye
[[ ! -z "$1" ]] && SIG="$1" # optionally send a different signal
pkill --signal "$SIG" -P $$ # kill descendent's and their processes groups
[[ "$SIG" != "KILL" ]] && {
# give them a moment to die before we force them to.
sleep 0.2
KillChildren "KILL" ;
}
}

How to kill a process group with kill in bash?

I have a script which is much more complicated but I managed to produce a short script that exhibits the same problem.
I create a process and make it a session leader and then send SIGINT to it. The kill builtin doesn't fail but the process doesn't get killed either (i.e. the default behaviour for SIGINT is to kill). I tried with kill -INT -pid (which should be equivalent to what I do currently) and the /bin/kill command but the behaviour is the same.
The script is as follows:
#!/bin/bash
# Run in a new session so that I don't have to kill the shell
setsid bash -c "sleep 50" &
procs=$(ps --ppid $$ -o pid,pgid,command | grep 'sleep' | head -1)
if [[ -z "$procs" ]]; then
echo "Couldn't find process group"
exit 1
fi
PID=$(echo $procs | cut -d ' ' -f 1)
pgid=$(echo $procs | cut -d ' ' -f 2)
if ! kill -n SIGINT $pgid; then
echo "kill failed"
fi
echo "done"
ps -P $pgid
My expectation is that the last ps command shouldn't report anything (as kill didn't report failure and hence the process should have died) but it does.
I am looking for an explanation of the above noted behaviour and how I can kill a process group (i.e. both the bash and the sleep it starts -- the setsid line above) running in a separate session.
I think you'll find that sleep ignores SIGINT. Take a look at the signals of your sleep command and see. On my Linux box I find:
SigIgn: 0000000000000006
The second bit from the right is set (6 = 4 + 2 + 0), and from the above link:
--> 2 = SIGINT
Try sending a HUP, and you'll find it does kill the sleep.
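The SigIgn value is a hexadecimal bitmask in which bit n-1 stands for signal number n, so 6 (binary 110) means signals 2 (SIGINT) and 3 (SIGQUIT) are ignored. That is also what POSIX shells set up for commands started with & when job control is off, which would explain the mask seen here. A sketch of a checker, Linux-only and with signal_ignored as a hypothetical name:

```shell
# signal_ignored PID SIGNUM  (hypothetical helper, Linux procfs only)
# Succeeds if SIGNUM is set in the process's SigIgn mask; returns 2 if the
# mask cannot be read (no such process, or no procfs).
signal_ignored() {
    local mask
    mask=$(awk '/^SigIgn:/ { print $2 }' "/proc/$1/status" 2>/dev/null) || return 2
    [ -n "$mask" ] || return 2
    # Bit (SIGNUM - 1) of the hex mask corresponds to signal SIGNUM.
    [ $(( (0x$mask >> ($2 - 1)) & 1 )) -eq 1 ]
}
```

For example, signal_ignored "$PID" 2 reports whether sending SIGINT to that process can have any effect.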

shell script - how to stop "watch" command in the shell script [duplicate]

I have a bash script that launches a child process that crashes (actually, hangs) from time to time and with no apparent reason (closed source, so there isn't much I can do about it). As a result, I would like to be able to launch this process for a given amount of time, and kill it if it did not return successfully after a given amount of time.
Is there a simple and robust way to achieve that using bash?
P.S.: tell me if this question is better suited to serverfault or superuser.
(As seen in:
BASH FAQ entry #68: "How do I run a command, and have it abort (timeout) after N seconds?")
If you don't mind installing something, use timeout (sudo apt-get install timeout; most systems already have it installed, otherwise use sudo apt-get install coreutils) like this:
timeout 10 ping www.goooooogle.com
If you don't want to download something, do what timeout does internally:
( cmdpid=$BASHPID; (sleep 10; kill $cmdpid) & exec ping www.goooooogle.com )
In case that you want to do a timeout for longer bash code, use the second option as such:
( cmdpid=$BASHPID;
(sleep 10; kill $cmdpid) \
& while ! ping -w 1 www.goooooogle.com
do
echo crap;
done )
# Spawn a child process:
(dosmth) & pid=$!
# in the background, sleep for 10 secs then kill that process
(sleep 10 && kill -9 $pid) &
or to get the exit codes as well:
# Spawn a child process:
(dosmth) & pid=$!
# in the background, sleep for 10 secs then kill that process
(sleep 10 && kill -9 $pid) & waiter=$!
# wait on our worker process and capture its exit code
wait $pid; exitcode=$?
# kill the waiter subshell, if it still runs
kill -9 $waiter 2>/dev/null
# 0 if we killed the waiter, cause that means the process finished before the waiter
finished_gracefully=$?
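The same watchdog idea can be packaged as a function that returns the command's own exit status. This is a sketch under assumptions, not a replacement for timeout(1): timeout_run is a hypothetical name, and a watchdog that is stopped early leaves its sleep running until it expires, which is harmless.

```shell
# timeout_run SECONDS COMMAND [ARGS...]  (hypothetical name)
# Runs COMMAND, sends it SIGTERM after SECONDS, and returns COMMAND's status
# (143 = 128 + SIGTERM if the watchdog had to kill it).
timeout_run() {
    local secs=$1 cmd_pid watchdog status=0
    shift
    "$@" &
    cmd_pid=$!
    # Watchdog: detached from our stdout/stderr so it never holds a pipe open.
    ( sleep "$secs"; kill -TERM "$cmd_pid" ) >/dev/null 2>&1 &
    watchdog=$!
    wait "$cmd_pid" || status=$?
    # Stop the watchdog if it is still sleeping, then reap it.
    kill "$watchdog" 2>/dev/null
    wait "$watchdog" 2>/dev/null
    return "$status"
}
```

Because the command is a direct child of the calling shell, wait can be used and no polling is needed.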
sleep 999&
t=$!
sleep 10
kill $t
I also had this question and found two more things very useful:
The SECONDS variable in bash.
The command "pgrep".
So I use something like this on the command line (OSX 10.9):
ping www.goooooogle.com & PING_PID=$(pgrep 'ping'); SECONDS=0; while pgrep -q 'ping'; do sleep 0.2; if [ $SECONDS = 10 ]; then kill $PING_PID; fi; done
As this is a loop I included a "sleep 0.2" to keep the CPU cool. ;-)
(BTW: ping is a bad example anyway, you just would use the built-in "-t" (timeout) option.)
Assuming you have (or can easily make) a pid file for tracking the child's pid, you could then create a script that checks the modtime of the pid file and kills/respawns the process as needed. Then just put the script in crontab to run at approximately the period you need.
Let me know if you need more details. If that doesn't sound like it'd suit your needs, what about upstart?
One way is to run the program in a subshell, and communicate with the subshell through a named pipe with the read command. This way you can check the exit status of the process being run and communicate this back through the pipe.
Here's an example of timing out the yes command after 3 seconds. It gets the PID of the process using pgrep (possibly only works on Linux). One problem with using a named pipe is that a process opening it for reading will hang until it is also opened for writing, and vice versa. So to prevent the read command from hanging, I've "wedged" the pipe open with a background subshell that holds it open for writing. (Another way to prevent a freeze is to open the pipe read-write, i.e. read -t 5 <>finished.pipe; however, that also may not work except on Linux.)
rm -f finished.pipe
mkfifo finished.pipe
{ yes >/dev/null; echo finished >finished.pipe ; } &
SUBSHELL=$!
# Get command PID
while : ; do
PID=$( pgrep -P $SUBSHELL yes )
test "$PID" = "" || break
sleep 1
done
# Open pipe for writing
{ exec 4>finished.pipe ; while : ; do sleep 1000; done } &
read -t 3 FINISHED <finished.pipe
if [ "$FINISHED" = finished ] ; then
echo 'Subprocess finished'
else
echo 'Subprocess timed out'
kill $PID
fi
rm finished.pipe
Here's an attempt which tries to avoid killing a process after it has already exited, which reduces the chance of killing another process with the same process ID (although it's probably impossible to avoid this kind of error completely).
run_with_timeout ()
{
t=$1
shift
echo "running \"$*\" with timeout $t"
(
# first, run process in background
(exec sh -c "$*") &
pid=$!
echo $pid
# the timeout shell
(sleep $t ; echo timeout) &
waiter=$!
echo $waiter
# finally, allow process to end naturally
wait $pid
echo $?
) \
| (read pid
read waiter
if test $waiter != timeout ; then
read status
else
status=timeout
fi
# if we timed out, kill the process
if test $status = timeout ; then
kill $pid
exit 99
else
# if the program exited normally, kill the waiting shell
kill $waiter
exit $status
fi
)
}
Use like run_with_timeout 3 sleep 10000, which runs sleep 10000 but ends it after 3 seconds.
This is like other answers which use a background timeout process to kill the child process after a delay. I think this is almost the same as Dan's extended answer (https://stackoverflow.com/a/5161274/1351983), except the timeout shell will not be killed if it has already ended.
After this program has ended, there will still be a few lingering "sleep" processes running, but they should be harmless.
This may be a better solution than my other answer because it does not use the non-portable shell feature read -t and does not use pgrep.
Here's the third answer I've submitted here. This one handles signal interrupts and cleans up background processes when SIGINT is received. It uses the $BASHPID and exec trick used in the top answer to get the PID of a process (in this case $$ in a sh invocation). It uses a FIFO to communicate with a subshell that is responsible for killing and cleanup. (This is like the pipe in my second answer, but having a named pipe means that the signal handler can write into it too.)
run_with_timeout ()
{
t=$1 ; shift
trap cleanup 2
F=$$.fifo ; rm -f $F ; mkfifo $F
# first, run main process in background
"$@" & pid=$!
# sleeper process to time out
( sh -c "echo \$\$ >$F ; exec sleep $t" ; echo timeout >$F ) &
read sleeper <$F
# control shell. read from fifo.
# final input is "finished". after that
# we clean up. we can get a timeout or a
# signal first.
( exec 0<$F
while : ; do
read input
case $input in
finished)
test $sleeper != 0 && kill $sleeper
rm -f $F
exit 0
;;
timeout)
test $pid != 0 && kill $pid
sleeper=0
;;
signal)
test $pid != 0 && kill $pid
;;
esac
done
) &
# wait for process to end
wait $pid
status=$?
echo finished >$F
return $status
}
cleanup ()
{
echo signal >$$.fifo
}
I've tried to avoid race conditions as far as I can. However, one source of error I couldn't remove is when the process ends near the same time as the timeout. For example, run_with_timeout 2 sleep 2 or run_with_timeout 0 sleep 0. For me, the latter gives an error:
timeout.sh: line 250: kill: (23248) - No such process
as it is trying to kill a process that has already exited by itself.
#Kill command after 10 seconds
timeout 10 command
#If you don't have timeout installed, this is almost the same:
sh -c '(sleep 10; kill "$$") & command'
#The same as above, with muted duplicate messages:
sh -c '(sleep 10; kill "$$" 2>/dev/null) & command'

