I have a bash scripts in which I invoke other scripts to run in parallel. With the wait command I can wait until all parallel processes have finished. But I want to know if all the processes that executed in background in parallel executed successfully (with return code 0).
My code looks like:
--calling multiple processes to execute in backgroud
process-1 &
process-2 &
process-3 &
wait
--after parallel execution finishes I want to know if all of them were successful and returned '0'
You can use wait -n which returns the exit code of the next job that terminates. Call it once for each background process.
process-1 &
process-2 &
process-3 &
wait -n && wait -n && wait -n
wait -n seems the correct solution, but since it is not present in bash 4.2.37 you can try this trick:
#!/bin/bash
(
process-1 || echo $! failed &
process-2 || echo $! failed &
process-2 || echo $! failed &
wait
) | grep -q failed
if [ $? -eq 0 ]; then
echo at least one process failed
else
echo all processes finished successfully
fi
Just make sure the string "failed" is not returned by the processes themselves while having an actual success. Also you could run processes with stdin and stderr redirected do /dev/null with process-1 &>/dev/null
I've written a tool that simplifies the solutions a bit: https://github.com/wagoodman/bashful
You provide a file describing what you want to run...
# awesome.yaml
tasks:
- name: My awesome tasks
parallel-tasks:
- cmd: ./some-script-1.sh
- cmd: ./some-script-2.sh
- cmd: ./some-script-3.sh
- cmd: ./some-script-4.sh
...and run it like so:
bashful run awesome.yaml
Then it will run your tasks in parallel with a vertical progress bar showing the status of each task. Failures are indicated in red and the program exits with 1 if there were any errors found (exit occurs after the parallel block completes).
Related
I'm trying to write a program using bash script. I'd like to give an alert when this program is killed.
The desired action is like this:
#!/bin/bash
... # The original program
if killed ; do
echo "trying to kill the demo program ... "
sleep 5s
echo "demo program killed"
fi
If you expect the signal to be delivered only to the running program and not to the shell running your script, then the basic synopsis might be:
#!/bin/bash
set -euo pipefail
sleep 1 & # The original program
pid="$!"
kill -9 "$pid" # Pick your lethal signal
wait -n "$pid" && status=0 || status="$?"
((status > 128)) && echo "${pid} got signal $((status - 128))" 1>&2 || :
Presumably, here^^^ we run the program in the background, so that we can send it the kill signal from the same snippet. In practice you would probably run it in the foreground and then check its $? return status instead of the status from wait -n.
If the killing signal is delivered to your entire process group, including the shell running your script, that is a different story. For the signal KILL (9) in particular, there is no way to mask it or report it. When the shell gets it, it dies. For other signals you could set up a trap command (see man bash for its syntax) to handle the signal gracefully in the script while still being able to detect and report the child process’ death from the signal.
Context:
Users provide me their custom scripts to run. These scripts can be of any sort like scripts to start multiple GUI programs, backend services. I have no control over how the scripts are written. These scripts can be of blocking type i.e. execution waits till all the child processes (programs that are run sequentially) exit
#exaple of blocking script
echo "START"
first_program
second_program
echo "DONE"
or non blocking type i.e. ones that fork child process in the background and exit something like
#example of non-blocking script
echo "START"
first_program &
second_program &
echo "DONE"
What am I trying to achieve?
User provided scripts can be of any of the above two types or mix of both. My job is to run the script and wait till all the processes started by it exit and then shutdown the node. If its of blocking type, case is plain simple i.e. get the PID of script execution process and wait till ps -ef|grep -ef PID has no more entries. Non-blocking scripts are the ones giving me trouble
Is there a way I can get list of PIDs of all the child process spawned by execution of a script? Any pointers or hints will be highly appreciated
You can use wait to wait for all the background processes started by userscript to complete. Since wait only works on children of the current shell, you'll need to source their script instead of running it as a separate process.
( source userscript; wait )
Sourcing the script in an explicit subshell should simulate starting a new process closely enough. If not, you can also background the subshell, which forces a new process to be started, then wait for it to complete.
( source userscript; wait ) & wait
ps --ppid $PID will list all child processes of the process with $PID.
You can open a file descriptor that gets inherited by other processes, and then wait until it's no longer in use. This is a low overhead method that usually works fine, though it's possible for processes to work around it if they want:
foo=$(mktemp)
( flock -x 5000; theirscript; ) 5000> "$foo"
flock -x 0 < "$foo"
rm "$foo"
echo "The script and its subprocesses are done"
You can follow all invoked processes using ptrace, such as with strace. This is easier, but has some associated overhead and may not work when scripts invoke suid binaries:
strace -f -e none theirscript
You can use pgrep -P <parent_pid> to get a list of child processes. Example:
IFS=$'\n' read -ra CHILD_PROCS -d '' < <(exec pgrep -P "$1")
And to get the grand-children, simply do the same procedure on each child process.
Check out my blog Bash functions to list and kill or send signals to process trees.
You can use one of those function to properly list all processes spawned under one process. Each has their own method or order of sending signals to process.
The only limitation by those is that process still have to be connected and not orphaned. If you could somehow find a way to group your processes, then that might be your solution.
To simply answer the question that was asked. You could store the process ID of each script you're calling into the same variable:
echo "START"
first_program &
child_process_ids+="$! "
second_program &
child_process_ids+="$! "
echo $child_process_ids
echo "DONE"
$child_process_ids would just be a space delimited string of process Ids. Now, this answers the question asked, however, what I would do would be a bit different. I would call each script from a for loop, store its process ID, then wait on each one in another for loop to finish and inspect each exit code individually. Using the same example, here's what it would look like.
echo "START"
scripts="first_program second_program"
for script in $scripts; do
#Call script and send to background
./$script &
#Store the script's processID that was just sent to the background
child_process_ids+="$! "
done
for child_process_id in $child_process_ids; do
#Pass each processId into the wait command to retrieve its exit
#code and store it in $rc
wait $child_process_id
rc=$?
#Inspect each processes exit code
if [ $rc -ne 0 ]; then
echo "$child_process_id failed with an exit code of $rc"
else
echo "$child_process_id was successful"
fi
done
I am trying to make a bash script to start a jar file and do it in the background. For that reason I'm using nohup. Right now I can capture the pid of the java process but I also need to be able to execute a command when the process finishes.
This is how I started
nohup java -jar jarfile.jar & echo $! > conf/pid
I also know from this answer that using ; will make a command execute after the first one finishes.
nohup java -jar jarfile.jar; echo "done"
echo "done" is just an example. My problem now is that I don't know how to combine them both. If I run echo $! first then echo "done" executes immediately. While if echo "done" goes first then echo $! will capture the PID of echo "done" instead of the one of the jarfile.
I know that I could achieve the desire functionality by polling until I don't see the PID running anymore. But I would like to avoid that as much as possible.
You can use the bash util wait once you start the process using nohup
nohup java -jar jarfile.jar &
pid=$! # Getting the process id of the last command executed
wait $pid # Waits until the process mentioned by the pid is complete
echo "Done, execute the new command"
I don't think you're going to get around "polling until you don't see the pid running anymore." wait is a bash builtin; it's what you want and I'm certain that's exactly what it does behind the scenes. But since Inian beat me to it, here's a friendly function for you anyway (in case you want to get a few things running in parallel).
alert_when_finished () {
declare cmd="${#}";
${cmd} &
declare pid="${!}";
while [[ -d "/proc/${pid}/" ]]; do :; done; #equivalent to wait
echo "[${pid}] Finished running: ${cmd}";
}
Running a command like this will give the desired effect and suppress unneeded job output:
( alert_when_finished 'sleep 5' & )
How can a KornShell (ksh) script exit/kill all the processes started from another ksh script?
If scriptA.ksh calls scriptB.ksh then the following code works good enough, but is there a better solution for this?:
scriptA.ksh:
#call scriptBSnippet
scriptBSnippet.ksh ${a}
scriptB.ksh:
#if error: exit this script (scriptB) and calling script (scriptA)#
kill ${PPID}
exit 1
To add complexity what if scriptA calls scriptB which calls scriptC, then how to exit out of all three scripts if there is an error in scriptC?
scriptA.ksh:
#call scriptBSnippet
scriptBSnippet.ksh ${a}
scriptB.ksh:
#if error: exit this script (scriptB) and calling script (scriptA)#
kill ${PPID}
exit 1
scriptC.ksh:
#if error: exit this script (scriptC) and calling scripts (scriptA, scriptB)#
#kill ${PPID}
#exit 1
Thanks in advance.
Killing all processes started by the same script is a bit of a brute force method.
It would be best to have some method of communication between the processes that would allow them to gracefully shutdown.
However, if all processes are in the same process group, you can send a signal to the entire process group:
kill -${Signal:?} -${Pgid:?}
Note that two arguments are required in this case. A single argument starting with - is always interpreted as a signal.
Run some tests to see which processes get included in the process group.
parent.sh:
Shell=ksh
($Shell -c :) || exit
$Shell child1.sh & pid1=$!
$Shell child2.sh & pid2=$!
$Shell child3.sh & pid3=$!
ps -o pid,sid,pgid,tty,cmd $PPID $$ $pid1 $pid2 $pid3
exit
child.sh:
sleep 50
If you run parent.sh from a terminal - it will become the process leader.
granny.sh:
Shell=ksh
($Shell -c :) || exit
$Shell parent.sh &
wait
exit
If you run parent.sh from another script granny.sh, then that will be the process group leader, and will be included when you use the kill -SIG -PGID method.
See also this answer to:
What are “session leaders” in ps? for some background on sessions and process groups.
I have a simple shell script whose also is below:
#!/usr/bin/sh
echo "starting the process which is a c++ process which does some database action for around 30 minutes"
#this below process should be run in the background
<binary name> <arg1> <arg2>
exit
Now what I want is to monitor and display the status information of the process.
I don't want to go deep into its functionality. Since I know that the process will complete in 30 minutes, I want to show to the user that 3.3% is completed for every 1 min and also check whether the process is running in the background and finally if the process is completed I want to display that it is completed.
could anybody please help me?
The best thing you could do is to put some kind of instrumentation in your application,
and let it report the actual progress in terms of work items processed / total amount of work.
Failing that, you can indeed refer to the time that the thing has been running.
Here's a sample of what I've used in the past. Works in ksh93 and bash.
#! /bin/ksh
set -u
prog_under_test="sleep"
args_for_prog=30
max=30 interval=1 n=0
main() {
($prog_under_test $args_for_prog) & pid=$! t0=$SECONDS
while is_running $pid; do
sleep $interval
(( delta_t = SECONDS-t0 ))
(( percent=100*delta_t/max ))
report_progress $percent
done
echo
}
is_running() { (kill -0 ${1:?is_running: missing process ID}) 2>& -; }
function report_progress { typeset percent=$1
printf "\r%5.1f %% complete (est.) " $(( percent ))
}
main
If your process involves a pipe than http://www.ivarch.com/programs/quickref/pv.shtml would be an excellent solution or an alternative is http://clpbar.sourceforge.net/ . But these are essentially like "cat" with a progress bar and need something to pipe through them. There is a small program that you could compile and then execute as a background process then kill when things finish up, http://www.dreamincode.net/code/snippet3062.htm that would probablly work if you just want to dispaly something for 30 minutes and then print out almost done in the console if your process runs long and it exits, but you would have to modify it. Might be better just to create another shell script that displays a character every few seconds in a loop and checks if the pid of the previous process is still running, I believe you can get the parent pid by looking at the $$ variable then check if it is still running in /proc/pid .
You really should let the command output statistics, but for simplicity's sake you can do something like this to simply increment a counter while your process runs:
#!/bin/sh
cmd & # execute a command
pid=$! # Record the pid of the command
i=0
while sleep 60; do
: $(( i += 1 ))
e=$( echo $i 3.3 \* p | dc ) # compute percent completed
printf "$e percent complete\r" # report completion
done & # reporter is running in the background
pid2=$! # record reporter's pid
# Wait for the original command to finish
if wait $pid; then
echo cmd completed successfully
else
echo cmd failed
fi
kill $pid2 # Kill the status reporter