Related
I have a command CMD called from my main bourne shell script that takes forever.
I want to modify the script as follows:
Run the command CMD in parallel as a background process (CMD &).
In the main script, have a loop to monitor the spawned command every few seconds. The loop also echoes some messages to stdout indicating progress of the script.
Exit the loop when the spawned command terminates.
Capture and report the exit code of the spawned process.
Can someone give me pointers to accomplish this?
1: In bash, $! holds the PID of the last background process that was executed. That will tell you what process to monitor, anyway.
4: wait <n> waits until the process with PID <n> is complete (it will block until the process completes, so you might not want to call this until you are sure the process is done), and then returns the exit code of the completed process.
2, 3: ps or ps | grep " $! " can tell you whether the process is still running. It is up to you how to understand the output and decide how close it is to finishing. (ps | grep isn't idiot-proof. If you have time you can come up with a more robust way to tell whether the process is still running).
Here's a skeleton script:
# simulate a long process that will have an identifiable exit code
(sleep 15 ; /bin/false) &
my_pid=$!
while ps | grep " $my_pid " # might also need | grep -v grep here
do
echo $my_pid is still in the ps output. Must still be running.
sleep 3
done
echo Oh, it looks like the process is done.
wait $my_pid
# The variable $? always holds the exit code of the last command to finish.
# Here it holds the exit code of $my_pid, since wait exits with that code.
my_status=$?
echo The exit status of the process was $my_status
This is how I solved it when I had a similar need:
# Some function that takes a long time to process
longprocess() {
# Sleep up to 14 seconds
sleep $((RANDOM % 15))
# Randomly exit with 0 or 1
exit $((RANDOM % 2))
}
pids=""
# Run five concurrent processes
for i in {1..5}; do
( longprocess ) &
# store PID of process
pids+=" $!"
done
# Wait for all processes to finish, will take max 14s
# as it waits in order of launch, not order of finishing
for p in $pids; do
if wait $p; then
echo "Process $p success"
else
echo "Process $p fail"
fi
done
The pid of a backgrounded child process is stored in $!.
You can store all child processes' pids into an array, e.g. PIDS[].
wait [-n] [jobspec or pid …]
Wait until the child process specified by each process ID pid or job specification jobspec exits and return the exit status of the last command waited for. If a job spec is given, all processes in the job are waited for. If no arguments are given, all currently active child processes are waited for, and the return status is zero. If the -n option is supplied, wait waits for any job to terminate and returns its exit status. If neither jobspec nor pid specifies an active child process of the shell, the return status is 127.
Use wait command you can wait for all child processes finish, meanwhile you can get exit status of each child processes via $? and store status into STATUS[]. Then you can do something depending by status.
I have tried the following 2 solutions and they run well. solution01 is
more concise, while solution02 is a little complicated.
solution01
#!/bin/bash
# start 3 child processes concurrently, and store each pid into array PIDS[].
process=(a.sh b.sh c.sh)
for app in ${process[#]}; do
./${app} &
PIDS+=($!)
done
# wait for all processes to finish, and store each process's exit code into array STATUS[].
for pid in ${PIDS[#]}; do
echo "pid=${pid}"
wait ${pid}
STATUS+=($?)
done
# after all processed finish, check their exit codes in STATUS[].
i=0
for st in ${STATUS[#]}; do
if [[ ${st} -ne 0 ]]; then
echo "$i failed"
else
echo "$i finish"
fi
((i+=1))
done
solution02
#!/bin/bash
# start 3 child processes concurrently, and store each pid into array PIDS[].
i=0
process=(a.sh b.sh c.sh)
for app in ${process[#]}; do
./${app} &
pid=$!
PIDS[$i]=${pid}
((i+=1))
done
# wait for all processes to finish, and store each process's exit code into array STATUS[].
i=0
for pid in ${PIDS[#]}; do
echo "pid=${pid}"
wait ${pid}
STATUS[$i]=$?
((i+=1))
done
# after all processed finish, check their exit codes in STATUS[].
i=0
for st in ${STATUS[#]}; do
if [[ ${st} -ne 0 ]]; then
echo "$i failed"
else
echo "$i finish"
fi
((i+=1))
done
As I see almost all answers use external utilities (mostly ps) to poll the state of the background process. There is a more unixesh solution, catching the SIGCHLD signal. In the signal handler it has to be checked which child process was stopped. It can be done by kill -0 <PID> built-in (universal) or checking the existence of /proc/<PID> directory (Linux specific) or using the jobs built-in (bash specific. jobs -l also reports the pid. In this case the 3rd field of the output can be Stopped|Running|Done|Exit . ).
Here is my example.
The launched process is called loop.sh. It accepts -x or a number as an argument. For -x is exits with exit code 1. For a number it waits num*5 seconds. In every 5 seconds it prints its PID.
The launcher process is called launch.sh:
#!/bin/bash
handle_chld() {
local tmp=()
for((i=0;i<${#pids[#]};++i)); do
if [ ! -d /proc/${pids[i]} ]; then
wait ${pids[i]}
echo "Stopped ${pids[i]}; exit code: $?"
else tmp+=(${pids[i]})
fi
done
pids=(${tmp[#]})
}
set -o monitor
trap "handle_chld" CHLD
# Start background processes
./loop.sh 3 &
pids+=($!)
./loop.sh 2 &
pids+=($!)
./loop.sh -x &
pids+=($!)
# Wait until all background processes are stopped
while [ ${#pids[#]} -gt 0 ]; do echo "WAITING FOR: ${pids[#]}"; sleep 2; done
echo STOPPED
For more explanation see: Starting a process from bash script failed
#/bin/bash
#pgm to monitor
tail -f /var/log/messages >> /tmp/log&
# background cmd pid
pid=$!
# loop to monitor running background cmd
while :
do
ps ax | grep $pid | grep -v grep
ret=$?
if test "$ret" != "0"
then
echo "Monitored pid ended"
break
fi
sleep 5
done
wait $pid
echo $?
I would change your approach slightly. Rather than checking every few seconds if the command is still alive and reporting a message, have another process that reports every few seconds that the command is still running and then kill that process when the command finishes. For example:
#!/bin/sh
cmd() { sleep 5; exit 24; }
cmd & # Run the long running process
pid=$! # Record the pid
# Spawn a process that coninually reports that the command is still running
while echo "$(date): $pid is still running"; do sleep 1; done &
echoer=$!
# Set a trap to kill the reporter when the process finishes
trap 'kill $echoer' 0
# Wait for the process to finish
if wait $pid; then
echo "cmd succeeded"
else
echo "cmd FAILED!! (returned $?)"
fi
Our team had the same need with a remote SSH-executed script which was timing out after 25 minutes of inactivity. Here is a solution with the monitoring loop checking the background process every second, but printing only every 10 minutes to suppress an inactivity timeout.
long_running.sh &
pid=$!
# Wait on a background job completion. Query status every 10 minutes.
declare -i elapsed=0
# `ps -p ${pid}` works on macOS and CentOS. On both OSes `ps ${pid}` works as well.
while ps -p ${pid} >/dev/null; do
sleep 1
if ((++elapsed % 600 == 0)); then
echo "Waiting for the completion of the main script. $((elapsed / 60))m and counting ..."
fi
done
# Return the exit code of the terminated background process. This works in Bash 4.4 despite what Bash docs say:
# "If neither jobspec nor pid specifies an active child process of the shell, the return status is 127."
wait ${pid}
A simple example, similar to the solutions above. This doesn't require monitoring any process output. The next example uses tail to follow output.
$ echo '#!/bin/bash' > tmp.sh
$ echo 'sleep 30; exit 5' >> tmp.sh
$ chmod +x tmp.sh
$ ./tmp.sh &
[1] 7454
$ pid=$!
$ wait $pid
[1]+ Exit 5 ./tmp.sh
$ echo $?
5
Use tail to follow process output and quit when the process is complete.
$ echo '#!/bin/bash' > tmp.sh
$ echo 'i=0; while let "$i < 10"; do sleep 5; echo "$i"; let i=$i+1; done; exit 5;' >> tmp.sh
$ chmod +x tmp.sh
$ ./tmp.sh
0
1
2
^C
$ ./tmp.sh > /tmp/tmp.log 2>&1 &
[1] 7673
$ pid=$!
$ tail -f --pid $pid /tmp/tmp.log
0
1
2
3
4
5
6
7
8
9
[1]+ Exit 5 ./tmp.sh > /tmp/tmp.log 2>&1
$ wait $pid
$ echo $?
5
Another solution is to monitor processes via the proc filesystem (safer than ps/grep combo); when you start a process it has a corresponding folder in /proc/$pid, so the solution could be
#!/bin/bash
....
doSomething &
local pid=$!
while [ -d /proc/$pid ]; do # While directory exists, the process is running
doSomethingElse
....
else # when directory is removed from /proc, process has ended
wait $pid
local exit_status=$?
done
....
Now you can use the $exit_status variable however you like.
With this method, your script doesnt have to wait for the background process, you will only have to monitor a temporary file for the exit status.
FUNCmyCmd() { sleep 3;return 6; };
export retFile=$(mktemp);
FUNCexecAndWait() { FUNCmyCmd;echo $? >$retFile; };
FUNCexecAndWait&
now, your script can do anything else while you just have to keep monitoring the contents of retFile (it can also contain any other information you want like the exit time).
PS.: btw, I coded thinking in bash
My solution was to use an anonymous pipe to pass the status to a monitoring loop. There are no temporary files used to exchange status so nothing to cleanup. If you were uncertain about the number of background jobs the break condition could be [ -z "$(jobs -p)" ].
#!/bin/bash
exec 3<> <(:)
{ sleep 15 ; echo "sleep/exit $?" >&3 ; } &
while read -u 3 -t 1 -r STAT CODE || STAT="timeout" ; do
echo "stat: ${STAT}; code: ${CODE}"
if [ "${STAT}" = "sleep/exit" ] ; then
break
fi
done
how about ...
# run your stuff
unset PID
for process in one two three four
do
( sleep $((RANDOM%20)); echo hello from process $process; exit $((RANDOM%3)); ) & 2>&1
PID+=($!)
done
# (optional) report on the status of that stuff as it exits
for pid in "${PID[#]}"
do
( wait "$pid"; echo "process $pid complemted with exit status $?") &
done
# (optional) while we wait, monitor that stuff
while ps --pid "${PID[*]}" --ppid "${PID[*]}" --format pid,ppid,command,pcpu
do
sleep 5
done | xargs -i date '+%x %X {}'
# return non-zero if any are non zero
SUCCESS=0
for pid in "${PID[#]}"
do
wait "$pid" && ((SUCCESS++)) && echo "$pid OK" || echo "$pid returned $?"
done
echo "success for $SUCCESS out of ${#PID} jobs"
exit $(( ${#PID} - SUCCESS ))
This may be extending beyond your question, however if you're concerned about the length of time processes are running for, you may be interested in checking the status of running background processes after an interval of time. It's easy enough to check which child PIDs are still running using pgrep -P $$, however I came up with the following solution to check the exit status of those PIDs that have already expired:
cmd1() { sleep 5; exit 24; }
cmd2() { sleep 10; exit 0; }
pids=()
cmd1 & pids+=("$!")
cmd2 & pids+=("$!")
lasttimeout=0
for timeout in 2 7 11; do
echo -n "interval-$timeout: "
sleep $((timeout-lasttimeout))
# you can only wait on a pid once
remainingpids=()
for pid in ${pids[*]}; do
if ! ps -p $pid >/dev/null ; then
wait $pid
echo -n "pid-$pid:exited($?); "
else
echo -n "pid-$pid:running; "
remainingpids+=("$pid")
fi
done
pids=( ${remainingpids[*]} )
lasttimeout=$timeout
echo
done
which outputs:
interval-2: pid-28083:running; pid-28084:running;
interval-7: pid-28083:exited(24); pid-28084:running;
interval-11: pid-28084:exited(0);
Note: You could change $pids to a string variable rather than array to simplify things if you like.
I want to write 2 scripts that send a signal each other, like a ping-pong match, but with a signal, not a ball.
First script:
#!/bin/bash
PATH=${PATH}:"/home/cosimo/Università/Sistemi Operativi/scripts"
exec player2pp.sh $$ &
trap "kill -SIGUSR1 $pidp2" SIGUSR1
sleep 2
Second script (player2pp.sh):
#!/bin/bash
trap "kill -SIGUSR1 $1" SIGUSR1
sleep 2
kill -SIGUSR1 $1
sleep 2
I got this error in player2pp.sh:
kill: no corresponding process.
What am I doing wrong?
Some problems:
you're launching the player2 script in the background, so you don't need exec
you don't store the PID of the player2 process anywhere
the first script launches player2, sleeps, then exits. You need to start some kind of infinite loop to avoid exiting.
if you kill the first script with Ctrl-C, the second script is left running in the background
Additionally, I would write a function for the trap so it's easier to do some logging:
player1pp.sh
#!/bin/bash
cd "/home/cosimo/Università/Sistemi Operativi/scripts"
./player2pp.sh $$ &
pidp2=$!
_ping() {
echo "ping! killing $pidp2"
kill -SIGUSR1 $pidp2
}
trap _ping SIGUSR1
trap "kill $pidp2" EXIT
while true; do sleep 2; done
player2pp.sh
#!/bin/bash
pidp1=$1
_pong() {
echo "pong! killing $pidp1"
kill -SIGUSR1 $pidp1
}
trap _pong SIGUSR1
_pong # start the game
while true; do sleep 2; done
For fun, add some randomness:
Player 1
#!/bin/bash
cd "/home/cosimo/Università/Sistemi Operativi/scripts"
./player2pp.sh $$ &
opponent=$!
ping() {
sleep=$((RANDOM % 5))
echo "ping! killing $opponent in $sleep"
sleep $sleep
kill -USR1 $opponent
}
trap ping USR2
cleanup () {
kill -0 $opponent && kill $opponent
}
trap cleanup EXIT
ping
while :; do :; done
Player 2
#!/bin/bash
opponent=$1
pong() {
sleep=$((RANDOM % 5))
echo "pong! killing $opponent in $sleep"
sleep $sleep
kill -USR2 $opponent
}
trap pong USR1
cleanup () {
kill -0 $opponent && kill $opponent
}
trap cleanup EXIT
while :; do :; done
I have a command CMD called from my main bourne shell script that takes forever.
I want to modify the script as follows:
Run the command CMD in parallel as a background process (CMD &).
In the main script, have a loop to monitor the spawned command every few seconds. The loop also echoes some messages to stdout indicating progress of the script.
Exit the loop when the spawned command terminates.
Capture and report the exit code of the spawned process.
Can someone give me pointers to accomplish this?
1: In bash, $! holds the PID of the last background process that was executed. That will tell you what process to monitor, anyway.
4: wait <n> waits until the process with PID <n> is complete (it will block until the process completes, so you might not want to call this until you are sure the process is done), and then returns the exit code of the completed process.
2, 3: ps or ps | grep " $! " can tell you whether the process is still running. It is up to you how to understand the output and decide how close it is to finishing. (ps | grep isn't idiot-proof. If you have time you can come up with a more robust way to tell whether the process is still running).
Here's a skeleton script:
# simulate a long process that will have an identifiable exit code
(sleep 15 ; /bin/false) &
my_pid=$!
while ps | grep " $my_pid " # might also need | grep -v grep here
do
echo $my_pid is still in the ps output. Must still be running.
sleep 3
done
echo Oh, it looks like the process is done.
wait $my_pid
# The variable $? always holds the exit code of the last command to finish.
# Here it holds the exit code of $my_pid, since wait exits with that code.
my_status=$?
echo The exit status of the process was $my_status
This is how I solved it when I had a similar need:
# Some function that takes a long time to process
longprocess() {
# Sleep up to 14 seconds
sleep $((RANDOM % 15))
# Randomly exit with 0 or 1
exit $((RANDOM % 2))
}
pids=""
# Run five concurrent processes
for i in {1..5}; do
( longprocess ) &
# store PID of process
pids+=" $!"
done
# Wait for all processes to finish, will take max 14s
# as it waits in order of launch, not order of finishing
for p in $pids; do
if wait $p; then
echo "Process $p success"
else
echo "Process $p fail"
fi
done
The pid of a backgrounded child process is stored in $!.
You can store all child processes' pids into an array, e.g. PIDS[].
wait [-n] [jobspec or pid …]
Wait until the child process specified by each process ID pid or job specification jobspec exits and return the exit status of the last command waited for. If a job spec is given, all processes in the job are waited for. If no arguments are given, all currently active child processes are waited for, and the return status is zero. If the -n option is supplied, wait waits for any job to terminate and returns its exit status. If neither jobspec nor pid specifies an active child process of the shell, the return status is 127.
Use wait command you can wait for all child processes finish, meanwhile you can get exit status of each child processes via $? and store status into STATUS[]. Then you can do something depending by status.
I have tried the following 2 solutions and they run well. solution01 is
more concise, while solution02 is a little complicated.
solution01
#!/bin/bash
# start 3 child processes concurrently, and store each pid into array PIDS[].
process=(a.sh b.sh c.sh)
for app in ${process[#]}; do
./${app} &
PIDS+=($!)
done
# wait for all processes to finish, and store each process's exit code into array STATUS[].
for pid in ${PIDS[#]}; do
echo "pid=${pid}"
wait ${pid}
STATUS+=($?)
done
# after all processed finish, check their exit codes in STATUS[].
i=0
for st in ${STATUS[#]}; do
if [[ ${st} -ne 0 ]]; then
echo "$i failed"
else
echo "$i finish"
fi
((i+=1))
done
solution02
#!/bin/bash
# start 3 child processes concurrently, and store each pid into array PIDS[].
i=0
process=(a.sh b.sh c.sh)
for app in ${process[#]}; do
./${app} &
pid=$!
PIDS[$i]=${pid}
((i+=1))
done
# wait for all processes to finish, and store each process's exit code into array STATUS[].
i=0
for pid in ${PIDS[#]}; do
echo "pid=${pid}"
wait ${pid}
STATUS[$i]=$?
((i+=1))
done
# after all processed finish, check their exit codes in STATUS[].
i=0
for st in ${STATUS[#]}; do
if [[ ${st} -ne 0 ]]; then
echo "$i failed"
else
echo "$i finish"
fi
((i+=1))
done
As I see almost all answers use external utilities (mostly ps) to poll the state of the background process. There is a more unixesh solution, catching the SIGCHLD signal. In the signal handler it has to be checked which child process was stopped. It can be done by kill -0 <PID> built-in (universal) or checking the existence of /proc/<PID> directory (Linux specific) or using the jobs built-in (bash specific. jobs -l also reports the pid. In this case the 3rd field of the output can be Stopped|Running|Done|Exit . ).
Here is my example.
The launched process is called loop.sh. It accepts -x or a number as an argument. For -x is exits with exit code 1. For a number it waits num*5 seconds. In every 5 seconds it prints its PID.
The launcher process is called launch.sh:
#!/bin/bash
handle_chld() {
local tmp=()
for((i=0;i<${#pids[#]};++i)); do
if [ ! -d /proc/${pids[i]} ]; then
wait ${pids[i]}
echo "Stopped ${pids[i]}; exit code: $?"
else tmp+=(${pids[i]})
fi
done
pids=(${tmp[#]})
}
set -o monitor
trap "handle_chld" CHLD
# Start background processes
./loop.sh 3 &
pids+=($!)
./loop.sh 2 &
pids+=($!)
./loop.sh -x &
pids+=($!)
# Wait until all background processes are stopped
while [ ${#pids[#]} -gt 0 ]; do echo "WAITING FOR: ${pids[#]}"; sleep 2; done
echo STOPPED
For more explanation see: Starting a process from bash script failed
#/bin/bash
#pgm to monitor
tail -f /var/log/messages >> /tmp/log&
# background cmd pid
pid=$!
# loop to monitor running background cmd
while :
do
ps ax | grep $pid | grep -v grep
ret=$?
if test "$ret" != "0"
then
echo "Monitored pid ended"
break
fi
sleep 5
done
wait $pid
echo $?
I would change your approach slightly. Rather than checking every few seconds if the command is still alive and reporting a message, have another process that reports every few seconds that the command is still running and then kill that process when the command finishes. For example:
#!/bin/sh
cmd() { sleep 5; exit 24; }
cmd & # Run the long running process
pid=$! # Record the pid
# Spawn a process that coninually reports that the command is still running
while echo "$(date): $pid is still running"; do sleep 1; done &
echoer=$!
# Set a trap to kill the reporter when the process finishes
trap 'kill $echoer' 0
# Wait for the process to finish
if wait $pid; then
echo "cmd succeeded"
else
echo "cmd FAILED!! (returned $?)"
fi
Our team had the same need with a remote SSH-executed script which was timing out after 25 minutes of inactivity. Here is a solution with the monitoring loop checking the background process every second, but printing only every 10 minutes to suppress an inactivity timeout.
long_running.sh &
pid=$!
# Wait on a background job completion. Query status every 10 minutes.
declare -i elapsed=0
# `ps -p ${pid}` works on macOS and CentOS. On both OSes `ps ${pid}` works as well.
while ps -p ${pid} >/dev/null; do
sleep 1
if ((++elapsed % 600 == 0)); then
echo "Waiting for the completion of the main script. $((elapsed / 60))m and counting ..."
fi
done
# Return the exit code of the terminated background process. This works in Bash 4.4 despite what Bash docs say:
# "If neither jobspec nor pid specifies an active child process of the shell, the return status is 127."
wait ${pid}
A simple example, similar to the solutions above. This doesn't require monitoring any process output. The next example uses tail to follow output.
$ echo '#!/bin/bash' > tmp.sh
$ echo 'sleep 30; exit 5' >> tmp.sh
$ chmod +x tmp.sh
$ ./tmp.sh &
[1] 7454
$ pid=$!
$ wait $pid
[1]+ Exit 5 ./tmp.sh
$ echo $?
5
Use tail to follow process output and quit when the process is complete.
$ echo '#!/bin/bash' > tmp.sh
$ echo 'i=0; while let "$i < 10"; do sleep 5; echo "$i"; let i=$i+1; done; exit 5;' >> tmp.sh
$ chmod +x tmp.sh
$ ./tmp.sh
0
1
2
^C
$ ./tmp.sh > /tmp/tmp.log 2>&1 &
[1] 7673
$ pid=$!
$ tail -f --pid $pid /tmp/tmp.log
0
1
2
3
4
5
6
7
8
9
[1]+ Exit 5 ./tmp.sh > /tmp/tmp.log 2>&1
$ wait $pid
$ echo $?
5
Another solution is to monitor processes via the proc filesystem (safer than ps/grep combo); when you start a process it has a corresponding folder in /proc/$pid, so the solution could be
#!/bin/bash
....
doSomething &
local pid=$!
while [ -d /proc/$pid ]; do # While directory exists, the process is running
doSomethingElse
....
else # when directory is removed from /proc, process has ended
wait $pid
local exit_status=$?
done
....
Now you can use the $exit_status variable however you like.
With this method, your script doesnt have to wait for the background process, you will only have to monitor a temporary file for the exit status.
FUNCmyCmd() { sleep 3;return 6; };
export retFile=$(mktemp);
FUNCexecAndWait() { FUNCmyCmd;echo $? >$retFile; };
FUNCexecAndWait&
now, your script can do anything else while you just have to keep monitoring the contents of retFile (it can also contain any other information you want like the exit time).
PS.: btw, I coded thinking in bash
My solution was to use an anonymous pipe to pass the status to a monitoring loop. There are no temporary files used to exchange status so nothing to cleanup. If you were uncertain about the number of background jobs the break condition could be [ -z "$(jobs -p)" ].
#!/bin/bash
exec 3<> <(:)
{ sleep 15 ; echo "sleep/exit $?" >&3 ; } &
while read -u 3 -t 1 -r STAT CODE || STAT="timeout" ; do
echo "stat: ${STAT}; code: ${CODE}"
if [ "${STAT}" = "sleep/exit" ] ; then
break
fi
done
how about ...
# run your stuff
unset PID
for process in one two three four
do
( sleep $((RANDOM%20)); echo hello from process $process; exit $((RANDOM%3)); ) & 2>&1
PID+=($!)
done
# (optional) report on the status of that stuff as it exits
for pid in "${PID[#]}"
do
( wait "$pid"; echo "process $pid complemted with exit status $?") &
done
# (optional) while we wait, monitor that stuff
while ps --pid "${PID[*]}" --ppid "${PID[*]}" --format pid,ppid,command,pcpu
do
sleep 5
done | xargs -i date '+%x %X {}'
# return non-zero if any are non zero
SUCCESS=0
for pid in "${PID[#]}"
do
wait "$pid" && ((SUCCESS++)) && echo "$pid OK" || echo "$pid returned $?"
done
echo "success for $SUCCESS out of ${#PID} jobs"
exit $(( ${#PID} - SUCCESS ))
This may be extending beyond your question, however if you're concerned about the length of time processes are running for, you may be interested in checking the status of running background processes after an interval of time. It's easy enough to check which child PIDs are still running using pgrep -P $$, however I came up with the following solution to check the exit status of those PIDs that have already expired:
cmd1() { sleep 5; exit 24; }
cmd2() { sleep 10; exit 0; }
pids=()
cmd1 & pids+=("$!")
cmd2 & pids+=("$!")
lasttimeout=0
for timeout in 2 7 11; do
echo -n "interval-$timeout: "
sleep $((timeout-lasttimeout))
# you can only wait on a pid once
remainingpids=()
for pid in ${pids[*]}; do
if ! ps -p $pid >/dev/null ; then
wait $pid
echo -n "pid-$pid:exited($?); "
else
echo -n "pid-$pid:running; "
remainingpids+=("$pid")
fi
done
pids=( ${remainingpids[*]} )
lasttimeout=$timeout
echo
done
which outputs:
interval-2: pid-28083:running; pid-28084:running;
interval-7: pid-28083:exited(24); pid-28084:running;
interval-11: pid-28084:exited(0);
Note: You could change $pids to a string variable rather than array to simplify things if you like.
I´ve asked Bash trap - exit only at the end of loop and the submitted solution works but while pressing CTRL-C the running command in the script (mp3convert with lame) will be interrupt and than the complete for loop will running to the end. Let me show you the simple script:
#!/bin/bash
mp3convert () { lame -V0 file.wav file.mp3 }
PreTrap() { QUIT=1 }
CleanUp() {
if [ ! -z $QUIT ]; then
rm -f $TMPFILE1
rm -f $TMPFILE2
echo "... done!" && exit
fi }
trap PreTrap SIGINT SIGTERM SIGTSTP
trap CleanUp EXIT
case $1 in
write)
while [ -n "$line" ]
do
mp3convert
[SOMEMOREMAGIC]
CleanUp
done
;;
QUIT=1
If I press CTRL-C while function mp3convert is running the lame command will be interrupt and then [SOMEMOREMAGIC] will execute before CleanUp is running. I don´t understand why the lame command will be interrupt and how I could avoid them.
Try to simplify the discussion above, I wrap up an easier understandable version of show-case script below. This script also HANDLES the "double control-C problem":
(Double control-C problem: If you hit control C twice, or three times, depending on how many wait $PID you used, those clean up can not be done properly.)
#!/bin/bash
mp3convert () {
echo "mp3convert..."; sleep 5; echo "mp3convert done..."
}
PreTrap() {
echo "in trap"
QUIT=1
echo "exiting trap..."
}
CleanUp() {
### Since 'wait $PID' can be interrupted by ^C, we need to protected it
### by the 'kill' loop ==> double/triple control-C problem.
while kill -0 $PID >& /dev/null; do wait $PID; echo "check again"; done
### This won't work (A simple wait $PID is vulnerable to double control C)
# wait $PID
if [ ! -z $QUIT ]; then
echo "clean up..."
exit
fi
}
trap PreTrap SIGINT SIGTERM SIGTSTP
#trap CleanUp EXIT
for loop in 1 2 3; do
(
echo "loop #$loop"
mp3convert
echo magic 1
echo magic 2
echo magic 3
) &
PID=$!
CleanUp
echo "done loop #$loop"
done
The kill -0 trick can be found in a comment of this link
When you hit Ctrl-C in a terminal, SIGINT gets sent to all processes in the foreground process group of that terminal, as described in this Stack Exchange "Unix & Linux" answer: How Ctrl C works. (The other answers in that thread are well worth reading, too). And that's why your mp3convert function gets interrupted even though you have set a SIGINT trap.
But you can get around that by running the mp3convert function in the background, as mattias mentioned. Here's a variation of your script that demonstrates the technique.
#!/usr/bin/env bash
myfunc()
{
echo -n "Starting $1 :"
for i in {1..7}
do
echo -n " $i"
sleep 1
done
echo ". Finished $1"
}
PreTrap() { QUIT=1; echo -n " in trap "; }
CleanUp() {
#Don't start cleanup until current run of myfunc is completed.
wait $pid
[[ -n $QUIT ]] &&
{
QUIT=''
echo "Cleaning up"
sleep 1
echo "... done!" && exit
}
}
trap PreTrap SIGINT SIGTERM SIGTSTP
trap CleanUp EXIT
for i in {a..e}
do
#Run myfunc in background but wait until it completes.
myfunc "$i" &
pid=$!
wait $pid
CleanUp
done
QUIT=1
When you hit Ctrl-C while myfunc is in the middle of a run, PreTrap prints its message and sets the QUIT flag, but myfunc continues running and CleanUp doesn't commence until the current myfunc run has finished.
Note that my version of CleanUp resets the QUIT flag. This prevents CleanUp from running twice.
This version removes the CleanUp call from the main loop and puts it inside the PreTrap function. It uses wait with no ID argument in PreTrap, which means we don't need to bother saving the PID of each child process. This should be ok since if we're in the trap we do want to wait for all child processes to complete before proceeding.
#!/bin/bash
# Yet another Trap demo...
myfunc()
{
echo -n "Starting $1 :"
for i in {1..5}
do
echo -n " $i"
sleep 1
done
echo ". Finished $1"
}
PreTrap() { echo -n " in trap "; wait; CleanUp; }
CleanUp() {
[[ -n $CLEAN ]] && { echo bye; exit; }
echo "Cleaning up"
sleep 1
echo "... done!"
CLEAN=1
exit
}
trap PreTrap SIGINT SIGTERM SIGTSTP
trap "echo exittrap; CleanUp" EXIT
for i in {a..c}
do
#Run myfunc in background but wait until it completes.
myfunc "$i" & wait $!
done
We don't really need to do myfunc "$i" & wait $! in this script, it could be simplified even further to myfunc "$i" & wait. But generally it's better to wait for a specific PID just in case there's some other process running in the background that we don't want to wait for.
Note that pressing Ctrl-C while CleanUp itself is running will interrupt the current foreground process (probably sleep in this demo).
One way of doing this would be to simply disable the interrupt until your program is done.
Some pseudo code follows:
#!/bin/bash
# First, store your stty settings and disable the interrupt
STTY=$(stty -g)
stty intr undef
#run your program here
runMp3Convert()
#restore stty settings
stty ${STTY}
# eof
Another idea would be to run your bash script in the background (if possible).
mp3convert.sh &
or even,
nohup mp3convert.sh &
I am a bit confused about process control within a bash script.
What I want, is to run specific functions/routines and at 10:00 stop them and then run some other functions. When the "other functions" end, I want the first functions continue exactly from the spot they were when they were stopped.
For example, suppose I run main.sh at 09:50. I want the function_one to stop when time is 10:00, run function_two and then continue with function_one at the exact state it was. Needless to say, that function_one could be quite complex, and have background child processes itself.
$ cat main.sh
#!/bin/bash
source functions.sh
function_one &
echo $! > function_one_running_in_background_PID.txt
while true; do
sleep 10s
if [[ $(date +%H%M) = 1000 ]]; then
kill -SIGSTOP $(<function_one_running_in_background_PID.txt)
echo Time is 10
function_two
kill -SIGCONT $(<function_one_running_in_background_PID.txt)
fi
done
$ cat functions.sh
#!/bin/bash
function function_one {
for i in {1..100000}; do echo 1st function: $i; sleep 1; done
}
function function_two {
for i in {1..100}; do echo 2nd function: $i; sleep 1; done
}
Could you please tell me if the above is possible? "Maybe" I am missing something in bash. Maybe there is a better way to do it instead of what I have thought.
Thank you all!
I think this can work. Some notes:
Even if you stop a process, its child processes will continue to run happily. If you want to stop those as well, then you need to trap STOP and CONT signals in the parent to stop and resume the children.
Instead saving the PIDs in files, why not save in variables?
You can shorten kill -SIGSTOP as kill -STOP, same for CONT
Here's a sample demo code to play with:
#!/bin/sh -e
f1() { for i in {1..1000}; do echo f1 $i; sleep 1; done; }
f2() { for i in {1..1000}; do echo f2 $i; sleep 1; done; }
f1 &
PID1=$!
f2 &
PID2=$!
kill -STOP $PID2
turn=1
while :; do
if test $turn = 1; then
kill -STOP $PID1
kill -CONT $PID2
turn=2
elif test $turn = 2; then
kill -STOP $PID2
kill -CONT $PID1
turn=1
fi
sleep 5
done
Rather than run a busy loop that constantly checks if it is 10:00, just figure out how many seconds until 10:00 and sleep that long before stopping the first process. There's no need for a temporary file, as you can just store the process ID in another parameter until needed.
t=$(( $(date +%s --date 1000) - $(date +%s) ))
function_one & f1_pid=$!
sleep $t
kill -STOP $f1_pid
function_two
kill -CONT $f1_pid
If function_one forks a child of its own, you will have to make sure that it is written in such a way to stop them when it is stopped itself.
To catch all the children I suppose they will belong to the same process-group, initiated by the program that launches them. Killing the group will kill all children, grand children etc.
It could be as simple as this:
#!/bin/bash
#set the 'at' timer on 22:00 to give self a signal (see 'man at' )
at 22:00 <<<"/usr/bin/kill -SIGCONT $$"
#gentlemen...start your engines..ehrm..programs!
"/path/firstset/member1.sh" &
"/path/firstset/member2.sh" &
"/path/firstset/member3.sh" &
"/path/firstset/member4.sh" &
while :
do
kill -SIGSTOP "$$" #<--now we halt this script.
# wait for SIGCONT given by 'at'
# Very nicely scheduled on 22:00
#when waking up:
PLIST=$(pgrep -g "$$") # list PIDs of group (= self + subtree)
PLIST=${PLIST/"$$"/} # but not self, please
kill -SIGSTOP $PLIST #OK it is 22:00, halt the old bunch
"/path/secondset/member1.sh" & # start the new bunch
"/path/secondset/member2.sh" &
"/path/secondset/member3.sh" &
"/path/secondset/member4.sh" &
wait # wait till they are finished
kill -SIGCONT $PLIST #old bunch can continue now,
done #until next scheduled event.
Thank you all for your immediate answers and I apologize for the delay. I had a system disk drive failure (!!)
Janos, what happens if f1 has background processes in it? How could I stop all of them?
#!/bin/sh -e
f1() {
f3 &
f4 &
}
f2() { for i in {1..1000}; do echo f2 $i; sleep 1; done; }
f3() { for i in {1..1000}; do echo f3 $i; sleep 1; done; }
f4() { for i in {1..1000}; do echo f4 $i; sleep 1; done; }
f1 &
PID1=$!
f2 &
PID2=$!
kill -STOP $PID2
while :; do
if [[ $(date +%S) = *0 ]]; then
kill -STOP $PID1
kill -CONT $PID2
elif [[ $(date +%S) = *5 ]]; then
kill -STOP $PID2
kill -CONT $PID1
fi
sleep 1
done