Barrier in bash, can it be done easily? - bash

Let's say I have a bash script that executes three scripts in parallel
./script1 &
./script2 &
./script3 &
Now, let us say that ./script4 depends on script1, script2 and script3. How can I force it to wait for those, while still executing the three scripts in parallel?

You can use wait, a built-in command available in Bash and in some other shells.
(see equivalent command WAITFOR on Windows)
wait documentation
Wait for each specified process to complete and return its termination
status.
Syntax
wait [n ...]
Key
n A process ID or a job specification
Each n can be a process ID or a job specification; if a job
specification is given, all processes in that job's pipeline are
waited for.
If n is not given, all currently active child processes are waited
for, and the return status is zero.
If n specifies a non-existent process or job, the return status is
127. Otherwise, the return status is the exit status of the last process or job waited for.
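The return-status rules quoted above can be checked with a small sketch. The subshells and the variable names are hypothetical stand-ins for real jobs, and PID 999999 is assumed not to be a child of the current shell:

```bash
#!/bin/bash
# Two hypothetical background jobs that exit with known statuses.
( sleep 0.1; exit 3 ) & pid_a=$!
( sleep 0.2; exit 0 ) & pid_b=$!

# wait with a PID returns that process's exit status.
wait "$pid_a"; status_a=$?
wait "$pid_b"; status_b=$?

# A PID that is not a child of this shell gives 127 (assumption: 999999 is not ours).
wait 999999 2>/dev/null; status_unknown=$?

# wait without arguments returns zero, whatever the children returned.
( exit 5 ) &
wait; status_all=$?

echo "a=$status_a b=$status_b unknown=$status_unknown all=$status_all"
```

Note that only the form with explicit PIDs lets the caller see whether a job failed.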
Simple solution
Below, wait blocks until all currently active child processes (in this case the three scripts) have ended.
./script1 &
./script2 &
./script3 &
wait # waits for all child processes
./script4
Store the PIDs in shell local variables
./script1 & pid1=$!
./script2 & pid2=$!
./script3 & pid3=$!
wait $pid1 $pid2 $pid3 # waits for 3 PIDs
./script4
Store the PIDs in temporary files
./script1 & echo $! >1.pid
./script2 & echo $! >2.pid
./script3 & echo $! >3.pid
wait $(<1.pid) $(<2.pid) $(<3.pid)
rm 1.pid 2.pid 3.pid # clean up
./script4
This last solution pollutes the current directory with three files (1.pid, 2.pid and 3.pid). One of these files may be corrupted before the wait call. Moreover, these files could be left on the file system in case of a crash.
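If PID files are still wanted, one way to limit those drawbacks is to keep them in a private temporary directory and remove it from an EXIT trap. This is only a sketch: run_barrier is a hypothetical name, and sleep stands in for the three scripts.

```bash
#!/bin/bash
# run_barrier (hypothetical name) runs in a subshell, so its EXIT trap
# fires as soon as the function body finishes or is aborted.
run_barrier() (
    pid_dir=$(mktemp -d) || exit 1
    trap 'rm -rf "$pid_dir"' EXIT          # clean up even on early exit

    sleep 0.2 & echo $! > "$pid_dir/1.pid"
    sleep 0.1 & echo $! > "$pid_dir/2.pid"

    wait "$(<"$pid_dir/1.pid")" "$(<"$pid_dir/2.pid")"
    echo "$pid_dir"                        # report which directory was used
)

used_dir=$(run_barrier)
if [ -e "$used_dir" ]; then
    echo "PID directory left behind: $used_dir"
else
    echo "PID directory was removed"
fi
```

A trap cannot run if the shell is killed with SIGKILL, so stale files remain possible in that case.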

From the bash man page:
wait [n ...]
Wait for each specified process and return its termination status.
Each `n` may be a process ID or a job specification.... If `n` is not
given, all currently active child processes are waited for, and the return
status is zero.
The easiest implementation might be for your last script to start the others. That way it's easy for it to store their PIDs and pass them to wait.

I whipped up something quickly years ago, but now I wanted nested parallelism. This is what I came up with:
# Run each supplied argument as a bash command, inheriting calling environment.
# bash_parallel's can be nested, though escaping quotes can be tricky -- define helper function for such cases.
# Example: bash_parallel "sleep 10" "ls -altrc"
function bash_parallel
{
(
i=0
unset BASH_PARALLEL_PIDS # Do not inherit BASH_PARALLEL_PIDS from parent bash_parallel (if any)
for cmd in "$@"
do
($cmd) & # In subshell, so sibling bash_parallel's won't interfere
BASH_PARALLEL_PIDS[$i]=$!
echo "bash_parallel started PID ${BASH_PARALLEL_PIDS[$i]}: $cmd"
i=$(($i + 1))
done
echo "bash_parallel waiting for PIDs: ${BASH_PARALLEL_PIDS[@]}"
wait ${BASH_PARALLEL_PIDS[@]}
) # In subshell, so ctrl-c will kill still-running children.
}
Use:
eisbaw@leno:~$ time (bash_parallel "sleep 10" "sleep 5")
bash_parallel started PID 30183: sleep 10
bash_parallel started PID 30184: sleep 5
bash_parallel waiting for PIDs: 30183 30184
real 0m10.007s
user 0m0.000s
sys 0m0.004s

Related

How to run a shell script as a background process and move on to the next script without waiting for the first to complete

I have below scripts ready with me -
1.sh:
echo "Good"
sleep 10
echo "Morning"
2.sh:
echo "Whats"
sleep 30
echo "Up"
script1.sh:
sh 1.sh &
sh 2.sh &
script2.sh:
echo "Hello world"
Requirement:
Execute script1.sh and do not wait for its completion or failure, i.e., let the script run in the background. As soon as script1.sh is triggered, execute script2.sh the very next second.
./script1.sh
./script2.sh
Challenge:
./script2.sh keeps waiting for the completion of ./script1.sh.
Like ./script2.sh, I have a lot of scripts to run one after another, but they should never wait for the completion of ./script1.sh.
Thanks,
B.J.
Just as you did inside script1.sh, you should append & after script1.sh:
#! /bin/bash
./script1.sh &
./script2.sh
exit 0
This creates a background process for script1.sh and continues in the main script with script2.sh.
Usually, it is good practice not to leave background processes behind (unless they are long-running servers, daemons, etc.). It is better to make the parent script wait for all of its children. Otherwise, you might end up with a lot of orphan processes, which may use resources and have unintended consequences (e.g., open files, logging, ...).
Consider
#! /bin/bash
script1.sh &
script2.sh
script3.sh
wait # wait for any backgrounded process
One immediate advantage is that killing the main script will also kill the running script1 and script2. If for some reason the main script exits before all background children have terminated, they cannot be easily stopped (other than by killing them by PID).
Also, ps/pstree will show the system status in a clear way.
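Whether background children really stop when the parent is killed depends on how the signal is delivered, so it can help to make the hand-over explicit. A minimal sketch, assuming a TERM trap is acceptable in the parent. parent_job is a hypothetical stand-in for the main script, and sleep for its background child:

```bash
#!/bin/bash
# parent_job (hypothetical) plays the role of a main script with one child.
parent_job() {
    # On TERM: pass the signal to the child, reap it, then exit.
    trap 'kill "$child" 2>/dev/null; wait "$child" 2>/dev/null; exit 143' TERM
    sleep 30 & child=$!
    echo "$child" > "$1"          # record the child's PID for the demonstration
    wait "$child"
}

pid_file=$(mktemp)
parent_job "$pid_file" & parent_pid=$!
sleep 0.5                         # give parent_job time to start its child
kill "$parent_pid"                # terminate the parent only
wait "$parent_pid" 2>/dev/null
child_pid=$(<"$pid_file")
rm -f "$pid_file"

# kill -0 only tests whether the PID still exists.
if kill -0 "$child_pid" 2>/dev/null; then
    child_state=running
else
    child_state=stopped
fi
echo "child is $child_state"
```

The trap runs because the parent is blocked in the wait built-in, which returns as soon as a trapped signal arrives.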

Bash files: run process in parallel and stop when one is over

I would like to start two C codes from a bash file in parallel and the second one stops when the first one has finished.
The instruction wait expects both processes to stop which is not what I would like to do.
Thanks for any suggestion.
GNU parallel can do this kind of job. See the termination section of its documentation: it can shut down the remaining processes based on the exit code (either success or failure):
parallel -j2 --halt now,success=1 ::: 'cmd1 args' 'cmd2 args'
When one of the jobs finishes successfully, parallel sends the TERM signal to the other jobs (if they do not terminate, it forces them with the KILL signal).
With $! you get the PID of the most recent command started in the background. See some nice examples here: Bash `wait` command, waiting for more than 1 PID to finish execution
For your particular problem I imagine something like:
#!/bin/bash
command_master() {
echo -e "Command_master"
sleep 1
}
command_tokill() {
echo -e "Command_tokill"
sleep 10
}
command_master & pid_master=($!)
command_tokill & pid_tokill=($!)
wait "$pid_master"
kill "$pid_tokill"
wait -n is what you are looking for. It waits for the next job to finish. You can then have a list of the PIDs of the remaining jobs with jobs -p if you want to kill them.
prog1 & pids=( $! )
prog2 & pids+=( $! )
wait -n
kill "${pids[@]}"
This requires bash.
The two programs are started as background jobs, and the shell waits for one of them to exit.
When this happens, kill is used to terminate both processes (this will cause an error since one of them is already dead).
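Where wait -n is not available (it needs bash 4.3 or newer), a polling loop built on kill -0 is one possible fallback. This is a sketch rather than a drop-in replacement. sleep stands in for prog1 and prog2, and the 0.1 second polling interval is an arbitrary choice:

```bash
#!/bin/bash
# Hypothetical stand-ins for prog1 and prog2.
sleep 0.3 & pid1=$!
sleep 30 & pid2=$!

# Poll until at least one of the two has gone away.
while kill -0 "$pid1" 2>/dev/null && kill -0 "$pid2" 2>/dev/null; do
    sleep 0.1
done

# Terminate whichever is left; errors for the one already gone are discarded.
kill "$pid1" "$pid2" 2>/dev/null
wait 2>/dev/null

if kill -0 "$pid2" 2>/dev/null; then
    survivor_state=running
else
    survivor_state=stopped
fi
echo "remaining job is $survivor_state"
```

Redirecting kill's standard error also hides the complaint about the process that has already exited.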

Bash: Start and kill child process

I have a program I want to start. Let's say this program runs a while(true) loop (so it does not terminate). I want to write a bash script which:
Starts the program (./endlessloop &)
Waits 1 second (sleep 1)
Kills the program --> How?
I cannot use $! to get pid from child because server is running a lot of instances concurrently.
Store the PID:
./endlessloop & endlessloop_pid=$!
sleep 1
kill "$endlessloop_pid"
You can also check whether the process is still running with kill -0:
if kill -0 "$endlessloop_pid"; then
echo "Endlessloop is still running"
fi
...and storing the content in a variable means it scales to multiple processes:
endlessloop_pids=( ) # initialize an empty array to store PIDs
./endlessloop & endlessloop_pids+=( "$!" ) # start one in background and store its PID
./endlessloop & endlessloop_pids+=( "$!" ) # start another and store its PID also
kill "${endlessloop_pids[@]}" # kill both endlessloop instances started above
See also BashFAQ #68, "How do I run a command, and have it abort (timeout) after N seconds?"
The ProcessManagement page on the Wooledge wiki also discusses relevant best practices.
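Building on kill and kill -0, a stop routine can give the process a grace period before escalating to SIGKILL. The helper name stop_gracefully, the number of polls and the polling interval are all assumptions made for this sketch, and sleep stands in for ./endlessloop:

```bash
#!/bin/bash
# stop_gracefully (hypothetical helper): send TERM, poll with kill -0,
# and fall back to KILL if the process is still there afterwards.
stop_gracefully() {
    local pid=$1 tries=${2:-10}
    kill "$pid" 2>/dev/null || return 0          # already gone
    while (( tries-- > 0 )); do
        kill -0 "$pid" 2>/dev/null || return 0   # exited after TERM
        sleep 0.1
    done
    kill -KILL "$pid" 2>/dev/null
}

sleep 30 & worker=$!          # stand-in for ./endlessloop
sleep 1
stop_gracefully "$worker"
wait "$worker" 2>/dev/null; worker_status=$?
echo "worker exit status: $worker_status"
```

An exit status above 128 from wait means the process was ended by a signal (128 plus the signal number).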
You can use the pgrep command for the same:
kill $(pgrep endlessloop)

WAIT for "1 of many process" to finish

Is there any built in feature in bash to wait for 1 out of many processes to finish? And then kill remaining processes?
pids=""
# Run five concurrent processes
for i in {1..5}; do
( longprocess ) &
# store PID of process
pids+=" $!"
done
if [ "one of them finished" ]; then
kill_rest_of_them;
fi
I'm looking for "one of them finished" command. Is there any?
bash 4.3 added a -n flag to the built-in wait command, which causes the script to wait for the next child to complete. The -p option to jobs also means you don't need to store the list of pids, as long as there aren't any background jobs that you don't want to wait on.
# Run five concurrent processes
for i in {1..5}; do
( longprocess ) &
done
wait -n
kill $(jobs -p)
Note that if there is another background job other than the 5 long processes that completes first, wait -n will exit when it completes. That would also mean you would still want to save the list of process ids to kill, rather than killing whatever jobs -p returns.
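A sketch of the variant that note suggests: keep the PIDs in an array and kill only those, so unrelated background jobs are left alone. sleep stands in for longprocess, with one short delay so that a job finishes quickly. It does not solve the other half of the caveat, since wait -n without arguments still returns for whichever job ends first:

```bash
#!/bin/bash
# Requires bash 4.3 or newer for wait -n.
pids=()
for delay in 0.2 30 30; do        # hypothetical stand-ins for longprocess
    sleep "$delay" &
    pids+=( "$!" )
done

wait -n; first_status=$?          # returns when the first job finishes

# Kill only the PIDs we started ourselves, then reap them.
kill "${pids[@]}" 2>/dev/null
wait "${pids[@]}" 2>/dev/null

echo "first job finished with status $first_status"
```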
It's actually fairly easy:
#!/bin/bash
set -o monitor
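killAll()
{
# Kill all remaining child processes. Reset the trap first so that the
# SIGCHLD signals caused by these kills do not re-enter killAll.
trap - SIGCHLD
kill $(jobs -p) 2>/dev/null
}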
# call function to kill all children on SIGCHLD from the first one
trap killAll SIGCHLD
# start your child processes here
# now wait for them to finish
wait
You just have to be really careful in your script to use only bash built-in commands. You can't start any utilities that run as a separate process after you issue the trap command - any child process exiting will send SIGCHLD - and you can't tell where it came from.

How to wait on all child (and grandchild etc) process spawned by a script

Context:
Users provide me their custom scripts to run. These scripts can be of any sort like scripts to start multiple GUI programs, backend services. I have no control over how the scripts are written. These scripts can be of blocking type i.e. execution waits till all the child processes (programs that are run sequentially) exit
#example of blocking script
echo "START"
first_program
second_program
echo "DONE"
or non blocking type i.e. ones that fork child process in the background and exit something like
#example of non-blocking script
echo "START"
first_program &
second_program &
echo "DONE"
What am I trying to achieve?
User-provided scripts can be of either of the above two types, or a mix of both. My job is to run the script, wait until all the processes started by it exit, and then shut down the node. If it is of the blocking type, the case is simple: get the PID of the script execution process and wait until ps -ef|grep -ef PID has no more entries. Non-blocking scripts are the ones giving me trouble.
Is there a way I can get list of PIDs of all the child process spawned by execution of a script? Any pointers or hints will be highly appreciated
You can use wait to wait for all the background processes started by userscript to complete. Since wait only works on children of the current shell, you'll need to source their script instead of running it as a separate process.
( source userscript; wait )
Sourcing the script in an explicit subshell should simulate starting a new process closely enough. If not, you can also background the subshell, which forces a new process to be started, then wait for it to complete.
( source userscript; wait ) & wait
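The sourcing approach can be tried out with a throwaway non-blocking script. Everything here is made up for the demonstration: the temporary directory, the generated userscript, and the marker file that shows the background work finished before the subshell returned:

```bash
#!/bin/bash
workdir=$(mktemp -d)

# A hypothetical non-blocking user script: it backgrounds its work and returns.
cat > "$workdir/userscript" <<EOF
echo "START"
( sleep 0.3; touch "$workdir/finished" ) &
echo "DONE"
EOF

# Source it in a subshell, then wait for everything it left in the background.
( source "$workdir/userscript"; wait )

if [ -e "$workdir/finished" ]; then
    barrier_result="background work completed"
else
    barrier_result="background work still pending"
fi
echo "$barrier_result"
rm -rf "$workdir"
```

This only covers jobs that are direct children of the sourcing subshell. A script that starts its own subprocesses which in turn background work is not caught this way.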
ps --ppid $PID will list all child processes of the process with $PID.
You can open a file descriptor that gets inherited by other processes, and then wait until it's no longer in use. This is a low overhead method that usually works fine, though it's possible for processes to work around it if they want:
foo=$(mktemp)
( flock -x 5000; theirscript; ) 5000> "$foo"
flock -x 0 < "$foo"
rm "$foo"
echo "The script and its subprocesses are done"
You can follow all invoked processes using ptrace, such as with strace. This is easier, but has some associated overhead and may not work when scripts invoke suid binaries:
strace -f -e none theirscript
You can use pgrep -P <parent_pid> to get a list of child processes. Example:
IFS=$'\n' read -ra CHILD_PROCS -d '' < <(exec pgrep -P "$1")
And to get the grand-children, simply do the same procedure on each child process.
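That recursion can be sketched as a small function. list_descendants is a hypothetical name, and the bash -c process with its sleep child is only a demonstration tree:

```bash
#!/bin/bash
# list_descendants (hypothetical helper): print the PIDs of all children,
# grandchildren and so on of the given PIDs, using pgrep -P at each level.
list_descendants() {
    local pid child
    for pid in "$@"; do
        for child in $(pgrep -P "$pid"); do
            echo "$child"
            list_descendants "$child"
        done
    done
}

# Demonstration tree: a bash process with a single sleep child.
bash -c 'sleep 30 & wait' & top=$!
sleep 0.5                                   # let the tree come up

descendants=( $(list_descendants "$top") )
echo "descendants of $top: ${descendants[*]}"

# Clean up the demonstration processes.
kill "${descendants[@]}" "$top" 2>/dev/null
wait "$top" 2>/dev/null
```

The listing is a snapshot, so processes that start or exit while it runs can be missed. Processes already re-parented to init are not found at all.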
Check out my blog Bash functions to list and kill or send signals to process trees.
You can use one of those functions to properly list all processes spawned under one process. Each has its own method or order of sending signals to the processes.
The only limitation is that the processes still have to be connected to the tree and not orphaned. If you could somehow find a way to group your processes, then that might be your solution.
To simply answer the question that was asked, you could store the process ID of each script you're calling in the same variable:
echo "START"
first_program &
child_process_ids+="$! "
second_program &
child_process_ids+="$! "
echo $child_process_ids
echo "DONE"
$child_process_ids would just be a space delimited string of process Ids. Now, this answers the question asked, however, what I would do would be a bit different. I would call each script from a for loop, store its process ID, then wait on each one in another for loop to finish and inspect each exit code individually. Using the same example, here's what it would look like.
echo "START"
scripts="first_program second_program"
for script in $scripts; do
#Call script and send to background
./$script &
#Store the script's processID that was just sent to the background
child_process_ids+="$! "
done
for child_process_id in $child_process_ids; do
#Pass each processId into the wait command to retrieve its exit
#code and store it in $rc
wait $child_process_id
rc=$?
#Inspect each process's exit code
if [ $rc -ne 0 ]; then
echo "$child_process_id failed with an exit code of $rc"
else
echo "$child_process_id was successful"
fi
done

Resources