Why does "(sleep 10 & sleep 1); wait" return after 1 second instead of 10? - bash

wait without arguments is supposed to wait for all child processes, however
(sleep 10 & sleep 1); wait
returns after 1 second instead of 10, and so it's failing to wait for sleep 10 to finish.
Why is that, and how could I fix it?

The parentheses create a subshell—an entirely new shell process just for those two commands.
The wait command only waits for a shell’s own children (in fact, that’s all it can wait for); grandchildren don’t count. Since the sleep processes are children of the subshell instead of the main shell, they cannot be waited for.
Therefore, what happens is:
1. A subshell is created to run sleep 10 & sleep 1, and the main shell waits for it to finish.
2. The subshell starts sleep 10 and, because of the &, continues processing immediately.
3. The subshell runs sleep 1 and waits for it to finish.
4. One second later, sleep 1 exits.
5. The subshell has no more commands to process, so it exits.
6. The subshell's children (i.e., sleep 10) are orphaned and re-parented to the init process (not the original shell).
7. The main shell continues to the wait command.
8. Because the shell has no children, wait returns immediately.
9. Nine seconds later, sleep 10 exits, and init reaps it.
The only way to get wait to see a command is to avoid executing it in a subshell. In this example, you can achieve that by using curly braces (which group commands without creating a new process), or by omitting the grouping entirely. In either case, sleep 10 is started in the background, sleep 1 then runs in the foreground, and when sleep 1 finishes, wait blocks until sleep 10 finishes.
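A minimal sketch of the curly-brace form (note that a brace group needs the trailing semicolon and surrounding spaces):

```shell
# Curly braces group commands in the current shell -- no subshell is created,
# so both sleeps are children of the shell that later runs `wait`.
{ sleep 10 & sleep 1; }
wait   # blocks until sleep 10 also exits, ~10 seconds in total
```
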

Related

SIGINT propagation for background process vs for subshell

I have two programs in two files that I run with bash:
The first:
(sleep 100) &
wait
The second:
sleep 100 &
wait
If I send a SIGINT to the first program, it also kills my sleep command. But for the second the sleep command remains and isn't killed.
Why the difference?
Thanks so much!

Wait for one of two processes in shell script

I am running a couple of background processes in my shell script. I want to exit the script when one of the two processes exit.
If I apply:
wait $PID1
wait $PID2
It will wait for process 1 to complete and then wait for process 2. The same happens for:
command 1 && command 2 && wait
Is there any way I could perform an or operation on the wait command?
You can trap SIGCHLD:
trap 'exit 0' SIGCHLD
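A sketch of how that might fit together, with sleeps standing in for the two real processes (set -o monitor is included so that SIGCHLD is reported in a non-interactive script):

```shell
#!/bin/bash
set -o monitor           # enable job control so SIGCHLD is reported
trap 'exit 0' SIGCHLD    # exit as soon as the first child exits

sleep 10 &               # stand-in for the first background process
sleep 1 &                # stand-in for the second
wait                     # interrupted by SIGCHLD roughly one second in
```
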

WAIT for "1 of many process" to finish

Is there any built-in feature in bash to wait for 1 out of many processes to finish? And then kill the remaining processes?
pids=""
# Run five concurrent processes
for i in {1..5}; do
( longprocess ) &
# store PID of process
pids+=" $!"
done
if [ "one of them finished" ]; then
kill_rest_of_them;
fi
I'm looking for "one of them finished" command. Is there any?
Bash 4.3 added a -n flag to the built-in wait command, which causes the script to wait for the next child to complete. The -p option to jobs also means you don't need to store the list of PIDs, as long as there aren't any background jobs that you don't want to wait on.
# Run five concurrent processes
for i in {1..5}; do
( longprocess ) &
done
wait -n
kill $(jobs -p)
Note that if there is another background job other than the 5 long processes that completes first, wait -n will exit when it completes. That would also mean you would still want to save the list of process ids to kill, rather than killing whatever jobs -p returns.
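Combining that advice with the original snippet — a sketch that saves the PIDs and kills exactly those (longprocess is the question's placeholder):

```shell
pids=()
for i in {1..5}; do
    ( longprocess ) &
    pids+=("$!")                 # remember exactly these five PIDs
done
wait -n                          # bash >= 4.3: returns when the first one exits
kill "${pids[@]}" 2>/dev/null    # kill the rest; already-dead PIDs are ignored
```
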
It's actually fairly easy:
#!/bin/bash
set -o monitor
killAll()
{
# code to kill all child processes
}
# call function to kill all children on SIGCHLD from the first one
trap killAll SIGCHLD
# start your child processes here
# now wait for them to finish
wait
You just have to be really careful in your script to use only bash built-in commands. You can't start any utilities that run as a separate process after you issue the trap command - any child process exiting will send SIGCHLD - and you can't tell where it came from.
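The caveat is easy to demonstrate (a quick sketch): any external utility raises SIGCHLD when it exits, and the trap cannot tell it apart from one of the real children.

```shell
#!/bin/bash
set -o monitor
trap 'echo "SIGCHLD received"' SIGCHLD
sleep 1 &    # the child we actually care about
date         # unrelated external utility: its exit also raises SIGCHLD
wait         # the trap fires again when sleep 1 exits
```
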

bg / fg inside a command line loop

ctrl-z (^z) acts in ways I do not understand when done inside a loop executed from a terminal.
Say I type
for ii in {0..100}; do echo $ii; sleep 1; done
then I hit ^z. I'll get:
[1]+ Stopped sleep 1
I can resume the job using fg or bg, but the job refers only to the sleep command. The rest of the loop has apparently disappeared, and no more numbers appear on the terminal.
I could use & after the command to immediately run it in the background, or another solution is to wrap the whole thing in a subshell:
( for ii in {0..100}; do echo $ii; sleep 1; done )
then ^z gives me
[1]+ Stopped ( for ii in {0..100};
do
echo $ii; sleep 1;
done )
This job can be resumed and everyone is happy. But I'm not generally in the habit of doing this when running a one-off task, and the question I am asking is why the first behavior happens in the first place. Is there a way to suspend a command-line loop that isn't subshell'd? And what happened to the rest of the loop in the first example?
Note that this is specific to the loop:
echo 1; sleep 5; echo 2
and hitting ^z during the sleep causes the echo 2 to execute:
1
^Z
[2]+ Stopped sleep 5
2
Or should I just get in the habit of using & and call it dark magic?
You cannot suspend execution of the current shell. When you run your loop from the command line, it executes in your current login shell/terminal. When you press Ctrl+Z, you are telling the shell to suspend the currently active process. Your loop is simply control flow inside the current shell; the process actually being executed is sleep, so the suspension only operates on sleep.
When you background a process or execute it in a subshell (roughly equivalent), it becomes a separate process that you can suspend in total.

Barrier in bash, can it be done easily?

Let's say I have a bash script that executes three scripts in parallel
./script1 &
./script2 &
./script3 &
Now, let us say that ./script4 depends on script1, script2 and script3. How can I force it to wait for those, while still executing the three scripts in parallel?
You can use wait, a built-in command available in Bash and in some other shells.
(Windows has a roughly equivalent WAITFOR command.)
wait documentation
Wait for each specified process to complete and return its termination
status.
Syntax
wait [n ...]
Key
n A process ID or a job specification
Each n can be a process ID or a job specification; if a job
specification is given, all processes in that job's pipeline are
waited for.
If n is not given, all currently active child processes are waited
for, and the return status is zero.
If n specifies a non-existent process or job, the return status is
127. Otherwise, the return status is the exit status of the last process or job waited for.
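The return-status rules can be checked directly (a quick sketch):

```shell
sleep 1 &
wait $!                   # wait for the background sleep
echo $?                   # 0: the exit status of the waited-for process

wait 999999 2>/dev/null   # a PID that is not a child of this shell
echo $?                   # 127, as documented above
```
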
Simple solution
Below, wait blocks until all currently active child processes have ended (in this case, the three scripts).
./script1 &
./script2 &
./script3 &
wait # waits for all child processes
./script4
Store the PIDs in shell local variables
./script1 & pid1=$!
./script2 & pid2=$!
./script3 & pid3=$!
wait $pid1 $pid2 $pid3 # waits for 3 PIDs
./script4
Store the PIDs in temporary files
./script1 & echo $! >1.pid
./script2 & echo $! >2.pid
./script3 & echo $! >3.pid
wait $(<1.pid) $(<2.pid) $(<3.pid)
rm 1.pid 2.pid 3.pid # clean up
./script4
This last solution pollutes the current directory with three files (1.pid, 2.pid and 3.pid). One of these files could be corrupted before the wait call, and the files may be left behind in the filesystem if the script crashes.
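A slightly safer variant (a sketch, reusing the scriptN placeholders) keeps the PID files in a private directory created with mktemp and removes them even on early exit:

```shell
piddir=$(mktemp -d)              # private temp directory for the PID files
trap 'rm -rf "$piddir"' EXIT     # cleaned up even if the script crashes

./script1 & echo $! >"$piddir/1.pid"
./script2 & echo $! >"$piddir/2.pid"
./script3 & echo $! >"$piddir/3.pid"
wait $(<"$piddir/1.pid") $(<"$piddir/2.pid") $(<"$piddir/3.pid")
./script4
```
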
From the bash man page:
wait [n ...]
Wait for each specified process and return its termination status.
Each `n` may be a process ID or a job specification.... If `n` is not
given, all currently active child processes are waited for, and the return
status is zero.
The easiest implementation might be for your last script to start the others. That way it's easy for it to store their PIDs and pass them to wait.
I whipped up something quickly years ago, but now I wanted nested parallelism. This is what I came up with:
# Run each supplied argument as a bash command, inheriting the calling environment.
# bash_parallel's can be nested, though escaping quotes can be tricky -- define a helper function for such cases.
# Example: bash_parallel "sleep 10" "ls -altrc"
function bash_parallel
{
  (
    i=0
    unset BASH_PARALLEL_PIDS   # Do not inherit BASH_PARALLEL_PIDS from parent bash_parallel (if any)
    for cmd in "$@"
    do
      ($cmd) &                 # In subshell, so sibling bash_parallel's won't interfere
      BASH_PARALLEL_PIDS[$i]=$!
      echo "bash_parallel started PID ${BASH_PARALLEL_PIDS[$i]}: $cmd"
      i=$(($i + 1))
    done
    echo "bash_parallel waiting for PIDs: ${BASH_PARALLEL_PIDS[@]}"
    wait ${BASH_PARALLEL_PIDS[@]}
  )                            # In subshell, so ctrl-c will kill still-running children.
}
Use:
eisbaw@leno:~$ time (bash_parallel "sleep 10" "sleep 5")
bash_parallel started PID 30183: sleep 10
bash_parallel started PID 30184: sleep 5
bash_parallel waiting for PIDs: 30183 30184
real 0m10.007s
user 0m0.000s
sys 0m0.004s