Why didn't the shell commands execute in the order I expect? - bash

I have written 3 shell scripts named s1.sh, s2.sh and s3.sh. They have the same content:
#!/bin/ksh
echo $0 $$
and s.sh invokes them in order:
#!/bin/sh
echo $0 $$
exec ./s1.sh &
exec ./s2.sh &
exec ./s3.sh &
but the results come out of order:
victor@ThinkPad-Edge:~$ ./s.sh
./s.sh 3524
victor@ThinkPad-Edge:~$ ./s1.sh 3525
./s3.sh 3527
./s2.sh 3526
Why not s1, s2, then s3 in sequence?
If I remove & in s.sh:
#!/bin/sh
echo $0 $$
exec ./s1.sh
exec ./s2.sh
exec ./s3.sh
the output:
$ ./s.sh
./s.sh 4022
./s1.sh 4022
Missing s2 and s3, why?

They have been executing in order, or at least starting in order: notice the PIDs are incrementing. You are starting 3 separate background processes for 3 separate programs, and one of them simply reaches its echo faster than the others. If you want them to run in sequence, take the execs and &s out, leaving just ./s1.sh, ./s2.sh, ./s3.sh.
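For example, a purely sequential s.sh would look like this (a sketch based on the scripts shown in the question):
#!/bin/sh
echo $0 $$
./s1.sh
./s2.sh
./s3.sh
Each script then runs to completion before the next one starts.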

The process scheduler achieves apparent multitasking by running a snippet of each task at a time, then rapidly switching to another. Depending on system load, I/O wait, priority, scheduling algorithm etc, two processes started at almost the same time may get radically different allotments of the available CPU. Thus there can be no guarantee as to which of your three processes reaches its echo statement first.
This is very basic Unix knowledge; perhaps you should read a book or online tutorial if you mean to use Unix seriously.
If you require parallel processes to execute in a particular order, use a locking mechanism (semaphore, shared memory, etc) to prevent one from executing a particular part of the code, called a "critical section", before another. (This isn't easy to do in shell script, though. Switch to Python or Perl if you don't want to go all the way to C. Or use a lock file if you can live with the I/O latency.)
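For instance, each of the parallel scripts could wrap its work in a block like this (a sketch, assuming the util-linux flock(1) utility is available; /tmp/s.lock is a hypothetical lock file name):
(
    flock 9                 # block here until the lock on fd 9 is acquired
    echo "$0 $$"            # critical section: only one script is in here at a time
) 9>/tmp/s.lock
Note this only guarantees mutual exclusion, not a particular ordering; enforcing s1 before s2 before s3 still needs explicit signalling, or simply running them sequentially.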
In your second example, the exec command replaces the current process with another. Thus s1 takes over completely, and the commands to start s2 and s3 are never seen by the shell.
(This was not apparent in your first example because the & caused the shell to fork a background process first, basically rendering the exec useless anyway.)

The & operator places each exec in the background. Effectively, you are running all 3 of your scripts in parallel. They don't stay in order because the operating system executes a bit of each script whenever it gets a chance, but it is also executing a bunch of other stuff too. One process can be given more time to run than the others, causing it to finish sooner.

Missing s2 and s3, why?
You are not really missing s2 and s3: they execute in a replacement shell or subshell, and when s.sh exits (or is replaced) they are detached from your interactive session, so their output appears on the TTY after the next prompt rather than where you expect it.
Other answers have discussed that s1, s2, s3 are all executed within replacement shells (with exec) or subshells (without exec), and how removing exec and & will force sequential execution of s1, s2, s3. There are two cases to discuss: one where exec is present and one where it is not. Where exec is present, the current shell is replaced by the executed process (as pointed out in the comments, the parent shell is killed).
Where exec is not used, s1, s2, s3 are executed in subshells. You are not seeing the output of s2 and s3 alongside s.sh because s.sh has finished and exited before s2 and s3 complete, so their connection to your interactive session is gone (if you look closely, you will see an additional prompt and then the output of the remaining scripts). But there is a way to require their completion before s.sh exits: use wait. wait tells s.sh not to return until all of its child processes s1, s2, and s3 have completed, which gives their output a path to the console before the next prompt appears. Example:
#!/bin/bash
echo $0 $$
exec ./s1.sh &
exec ./s2.sh &
exec ./s3.sh &
wait
output:
$ ./s.sh
./s.sh 11151
/home/david/scr/tmp/stack/s1.sh 11153
/home/david/scr/tmp/stack/s3.sh 11155
/home/david/scr/tmp/stack/s2.sh 11154

Related

Bash script is waiting to open second file in gedit until I close the first one [duplicate]

Waiting for completion of parallel running shell scripts

I have an array and using that array I need to run the shell scripts in parallel as
for i in arr
do
sh i.sh &
done
wait
I need to wait for the completion of their execution before proceeding to the next step.
I think your script doesn't do what you want for a different reason than the one you're expecting. sh i.sh & tries to run a file literally called i.sh; it is not using the variable i. To fix it, simply add $ before the i. The wait is working and does wait for commands to complete, just not for the ones you're expecting: the loop is actually trying to run the same nonexistent script a bunch of times.
for i in arr
do
sh $i.sh &
done
wait
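If you also want to detect failures, one common variant (a sketch in bash, with hypothetical array entries) collects the PIDs and waits on each one individually:
arr=(job1 job2 job3)            # hypothetical script names, for illustration
pids=""
for i in "${arr[@]}"            # expand the array properly, quoted
do
    sh "$i.sh" &
    pids="$pids $!"             # $! is the PID of the job just started
done
for pid in $pids
do
    wait "$pid" || echo "PID $pid exited with a non-zero status"
done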

shell: clean up leaked background processes which hang due to shared stdout/stderr

I need to run essentially arbitrary commands on a (remote) shell in ephemeral containers/VMs for a test execution engine. Sometimes these leak background processes which then cause the entire command to hang. This can be boiled down to this simple command:
$ sh -c 'sleep 30 & echo payload'
payload
$
Here the backgrounded sleep 30 plays the role of a leaked process (which in reality will be something like dbus-daemon) and the echo is the actual thing I want to run. The sleep 30 & echo payload should be considered as an atomic opaque example command here.
The above command is fine and returns immediately as the shell's and also sleep's stdout/stderr are a PTY. However, when capturing the output of the command to a pipe/file (a test runner wants to save everything into a log, after all), the whole command hangs:
$ sh -c 'sleep 30 & echo payload' | cat
payload
# ... does not return to the shell (until the sleep finishes)
Now, this could be fixed with some rather ridiculously complicated shell magic which determines the FDs of stdout/err from /proc/$$/fd/{1,2}, iterating over ls /proc/[0-9]*/fd/* and killing every process which also has the same stdout/stderr. But this involves a lot of brittle shell code and expensive shell string comparisons.
Is there a way to clean up these leaked background processes in a more elegant and simpler way? setsid does not help:
$ sh -c 'setsid -w sh -c "sleep 30 & echo payload"' | cat
payload
# hangs...
Note that process groups/sessions and killing them wholesale isn't sufficient as leaked processes (like dbus-daemon) often setsid themselves.
P.S. I can only assume POSIX shell or bash in these environments; no Python, Perl, etc.
Thank you in advance!
We had this problem with parallel tests in Launchpad. The simplest solution we had then - which worked well - was just to make sure that no processes share stdout/stdin/stderr (except ones where you actually want to hang if they haven't finished - e.g. the test workers themselves).
Hmm, having re-read this, I cannot give you the solution you are after (use systemd to kill them). What we came up with was to simply ignore the processes but to reliably not hang once the single process we were waiting for was done. Note that this is distinctly different from the pipes getting closed.
Another option, not perfect but useful, is to become a local reaper with prctl(2) and PR_SET_CHILD_SUBREAPER. This will allow you to become the parent of all the processes that would otherwise reparent to init. With this arrangement you could try to kill all the processes that have you as their ppid. This is terrible, but it's the next best thing to using cgroups.
But note, that unless you are running this helper as root you will find that practical testing might spawn some setuid thing that will lurk and won't be killable. It's an annoying problem really.
Use script -qfc instead of sh -c.
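That is, something along these lines (a sketch; script here is the util-linux script(1) utility, -q suppresses its start/done messages, -f flushes output as it is produced, -c runs the given command, and /dev/null discards the typescript file):
script -qfc 'sleep 30 & echo payload' /dev/null | cat
The idea is that the command, and anything it leaks, writes to a pseudo-terminal owned by script rather than to the pipe, so the pipe is no longer held open by the leaked background process.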

How to make bash interpreter stop until a command is finished?

I have a bash script with a loop that calls a hard calculation routine on every iteration. I use the results from each calculation as input to the next. I need to make bash stop reading further into the script until each calculation is finished.
for i in $(cat calculation-list.txt)
do
./calculation
(other commands)
done
I know about the sleep program, and I used to use it, but now the time the calculations take varies greatly.
Thanks for any help you can give.
P.S.
./calculation is another program, and it opens a subprocess. The script then passes instantly to the next step, but I get an error in the calculation because the previous one is not finished yet.
If your calculation daemon will work with a precreated empty logfile, then the inotify-tools package might serve:
touch $logfile
inotifywait -qqe close $logfile & ipid=$!
./calculation
wait $ipid
(edit: stripped a stray semicolon)
if it closes the file just once.
If it's doing an open/write/close loop, perhaps you can mod the daemon process to wrap some other filesystem event around the execution?
#!/bin/sh
# Uglier, but handles logfile being closed multiple times before exit:
# Have the ./calculation start this shell script, perhaps by substituting
# this for the program it's starting
trap 'echo >closed-on-calculation-exit' 0 1 2 3 15
./real-calculation-daemon-program
Well, guys, I've solved my problem with a different approach. When the calculation is finished, a logfile is created. I then wrote a simple until loop with a sleep command. Although this is very ugly, it works for me and it's enough.
for i in $(cat calculation-list.txt)
do
(calculations routine)
until [[ -f $logfile ]]; do
sleep 60
done
(other commands)
done
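One refinement worth considering with this approach (a sketch of the same loop): clear any logfile left over from a previous iteration before starting the calculation, so a stale file cannot end the wait early.
for i in $(cat calculation-list.txt)
do
    rm -f "$logfile"            # remove any logfile left from the previous iteration
    # (calculations routine)
    until [[ -f $logfile ]]; do
        sleep 60                # poll once a minute for the completion logfile
    done
    # (other commands)
done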
Easy. Get the process ID (PID) via some awk magic and then use wait to wait for that PID to end. Here are the details on wait from the Advanced Bash-Scripting Guide:
Suspend script execution until all jobs running in background have
terminated, or until the job number or process ID specified as an
option terminates. Returns the exit status of waited-for command.
You may use the wait command to prevent a script from exiting before a
background job finishes executing (this would create a dreaded orphan
process).
And using it within your code should work like this:
for i in $(cat calculation-list.txt)
do
./calculation >/dev/null 2>&1 & CALCULATION_PID=(`jobs -l | awk '{print $2}'`);
wait ${CALCULATION_PID}
(other commands)
done
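For reference, the awk step can be avoided entirely with the shell's $! parameter, which holds the PID of the most recently started background job (a sketch of the same loop):
for i in $(cat calculation-list.txt)
do
    ./calculation >/dev/null 2>&1 &
    CALCULATION_PID=$!          # PID of the backgrounded ./calculation
    wait ${CALCULATION_PID}
    # (other commands)
done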

Is it possible for bash commands to continue before the result of the previous command?

When running commands from a bash script, does bash always wait for the previous command to complete, or does it just start the command then go on to the next one?
ie: If you run the following two commands from a bash script is it possible for things to fail?
cp /tmp/a /tmp/b
cp /tmp/b /tmp/c
Yes: if you do nothing else, commands in a bash script are serialized. You can tell bash to run a bunch of commands in parallel and then wait for them all to finish by doing something like this:
command1 &
command2 &
command3 &
wait
The ampersands at the end of each of the first three lines tell bash to run the command in the background. The fourth command, wait, tells bash to wait until all the child processes have exited.
Note that if you do things this way, you'll be unable to get the exit status of the child commands (and set -e won't work), so you won't be able to tell whether they succeeded or failed in the usual way.
The bash manual has more information (search for wait, about two-thirds of the way down).
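If you do need the individual exit statuses, one workaround (a sketch, reusing the command1/command2 placeholders above) is to record each PID and wait on it explicitly:
command1 & pid1=$!              # remember each background job's PID
command2 & pid2=$!
wait "$pid1" || echo "command1 failed"
wait "$pid2" || echo "command2 failed"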
Add '&' at the end of a command to run it in parallel.
However, it is strange because in your case the second command depends on the final result of the first one. Either use sequential commands or copy to b and c from a like this:
cp /tmp/a /tmp/b &
cp /tmp/a /tmp/c &
Unless you explicitly tell bash to start a process in the background, it will wait until the process exits. So if you write this:
foo args &
bash will continue without waiting for foo to exit. But if you don't explicitly put the process in the background, bash will wait for it to exit.
Technically, a process can effectively put itself in the background by forking a child and then exiting. But since that technique is used primarily by long-lived processes, this shouldn't affect you.
In general, unless explicitly sent to the background or forking themselves off as a daemon, commands in a shell script are serialized.
They wait until the previous one is finished.
However, you can write 2 scripts and run them in separate processes, so they can be executed simultaneously. It's a wild guess, really, but I think you'll get an access error if a process tries to write in a file that's being read by another process.
I think what you want is the concept of a subshell. Here's one reference I just googled: http://www.linuxtopia.org/online_books/advanced_bash_scripting_guide/subshells.html
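If you go that route, a minimal sketch for this particular case is to group the two dependent copies in one background subshell, so they stay ordered relative to each other while the rest of the script continues:
# run the dependent copies in their own subshell, in order, in the background
( cp /tmp/a /tmp/b && cp /tmp/b /tmp/c ) &
# ... other work can happen here ...
wait    # block until the subshell (and any other background jobs) finish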
