Why does bash "forget" about my background processes? - bash

I have this code:
#!/bin/bash
pids=()
for i in $(seq 1 999); do
sleep 1 &
pids+=( "$!" )
done
for pid in "${pids[#]}"; do
wait "$pid"
done
I expect the following behavior:
spin through the first loop
wait about a second on the first pid
spin through the second loop
Instead, I get this error:
./foo.sh: line 8: wait: pid 24752 is not a child of this shell
(repeated 171 times with different pids)
If I run the script with shorter loop (50 instead of 999), then I get no errors.
What's going on?
Edit: I am using GNU bash 4.4.23 on Windows.

POSIX says:
The implementation need not retain more than the {CHILD_MAX} most recent entries in its list of known process IDs in the current shell execution environment.
{CHILD_MAX} here refers to the maximum number of simultaneous processes allowed per user. You can get the value of this limit using the getconf utility:
$ getconf CHILD_MAX
13195
Bash stores the statuses of at most twice as that many exited background processes in a circular buffer, and says not a child of this shell when you call wait on the PID of an old one that's been overwritten. You can see how it's implemented here.

The way you might reasonably expect this to work, as it would if you wrote a similar program in most other languages, is:
sleep is executed in the background via a fork+exec.
At some point, sleep exits leaving behind a zombie.
That zombie remains in place, holding its PID, until its parent calls wait to retrieve its exit code.
However, shells such as bash actually do this a little differently. They proactively reap their zombie children and store their exit codes in memory so that they can deallocate the system resources those processes were using. Then when you wait the shell just hands you whatever value is stored in memory, but the zombie could be long gone by then.
Now, because all of these exit statuses are being stored in memory, there is a practical limit to how many background processes can exit without you calling wait before you've filled up all the memory you have available for this in the shell. I expect that you're hitting this limit somewhere in the several hundreds of processes in your environment, while other users manage to make it into the several thousands in theirs. Regardless, the outcome is the same - eventually there's nowhere to store information about your children and so that information is lost.

I can reproduce on ArchLinux with docker run -ti --rm bash:5.0.18 bash -c 'pids=; for ((i=1;i<550;++i)); do true & pids+=" $!"; done; wait $pids' and any earlier. I can't reproduce with bash:5.1.0 .
What's going on?
It looks like a bug in your version of Bash. There were a couple of improvements in jobs.c and wait.def in Bash:5.1 and Make sure SIGCHLD is blocked in all cases where waitchld() is not called from a signal handler is mentioned in the changelog. From the look of it, it looks like an issue with handling a SIGCHLD signal while already handling another SIGCHLD signal.

Related

shell: clean up leaked background processes which hang due to shared stdout/stderr

I need to run essentially arbitrary commands on a (remote) shell in ephemeral containers/VMs for a test execution engine. Sometimes these leak background processes which then cause the entire command to hang. This can be boiled down to this simple command:
$ sh -c 'sleep 30 & echo payload'
payload
$
Here the backgrounded sleep 30 plays the role of a leaked process (which in reality will be something like dbus-daemon) and the echo is the actual thing I want to run. The sleep 30 & echo payload should be considered as an atomic opaque example command here.
The above command is fine and returns immediately as the shell's and also sleep's stdout/stderr are a PTY. However, when capturing the output of the command to a pipe/file (a test runner wants to save everything into a log, after all), the whole command hangs:
$ sh -c 'sleep 30 & echo payload' | cat
payload
# ... does not return to the shell (until the sleep finishes)
Now, this could be fixed with some rather ridiculously complicated shell magic which determines the FDs of stdout/err from /proc/$$/fd/{1,2}, iterating over ls /proc/[0-9]*/fd/* and killing every process which also has the same stdout/stderr. But this involves a lot of brittle shell code and expensive shell string comparisons.
Is there a way to clean up these leaked background processes in a more elegant and simpler way? setsid does not help:
$ sh -c 'setsid -w sh -c "sleep 30 & echo payload"' | cat
payload
# hangs...
Note that process groups/sessions and killing them wholesale isn't sufficient as leaked processes (like dbus-daemon) often setsid themselves.
P.S. I can only assume POSIX shell or bash in these environments; no Python, Perl, etc.
Thank you in advance!
We had this problem with parallel tests in Launchpad. The simplest solution we had then - which worked well - was just to make sure that no processes share stdout/stdin/stderr (except ones where you actually want to hang if they haven't finished - e.g. the test workers themselves).
Hmm, having re-read this I cannot give you the solution you are after (use systemd to kill them). What we came up with is to simply ignore the processes but reliably not hang when the single process we were waiting for is done. Note that this is distinctly different from the pipes getting closed.
Another option, not perfect but useful, is to become a local reaper with prctl(2) and PR_SET_CHILD_SUBREAPER. This will allow you to be the parent of all the processes that would otherwise reparent to init. With this arrangement you could try to kill all the processes that have you as ppid. This is terrible but it's the closest best thing to using cgroups.
But note, that unless you are running this helper as root you will find that practical testing might spawn some setuid thing that will lurk and won't be killable. It's an annoying problem really.
Use script -qfc instead of sh -c.

How can I create a process in Bash that has zero overhead but which gives me a process ID

For those of you who know what you're talking about I apologise for butchering the way that I'm going to phrase this question. I know nothing about bash whatsoever. With that caveat out of the way, let me get out my cleaver...
I am building a Rails app which has what's called a procfile which sets up any processes that need to be run in different environments
e.g.
web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
redis: redis-server
worker: bundle exec sidekiq
proxylocal: bin/proxylocal_local
Each one of these lines specs a process to be run. It also expects a pid to be returned after the process spins up. The syntax is
process_name: process_invokation_script
However the last process, proxylocal, only actually starts a process in development. In production it doesn't do anything.
Unfortunately that causes the Procfile to choke as it needs a process ID returned. So is there some super-simple, zero-overhead process that I can spawn in that case just to keep the procfile happy?
The sleep command does nothing for a specified period of time, with very low overhead. Give it an argument longer than your code will run.
For example
sleep 2147483647
does nothing for 231-1 seconds, just over 68 years. I picked that number because any reasonable implementation of sleep should be able to handle it.
In the unlikely event that that doesn't work (say if you're on an old 16-bit system that can't sleep for more than 216-1 seconds), you can do a sleep in an infinite loop:
sh -c 'while : ; do sleep 30000 ; done'
This assumes that you need the process to run for a very long time; that depends on what your application needs to do with the process ID. If it's required to be unique as long as the application is running, you need something that will continue to run for a long time; if the process terminates, its PID can be re-used by another process.
If that's not a requirement, you can use sleep 0 or true, which will terminate immediately.
If you need to give the application a little time to get the process ID before the process terminates, something like sleep 10 or even sleep 1 might work, though determining just how long it needs to run can be tricky and error-prone.
If Heroku isn't doing anything with proxylocal I'm not sure why you'd even want this in your Procifle. I'm also a bit confused about whether you want to change the Procfile or what bin/proxylocal_local does and how you would even do that.
That being said, if you are able to do anything you like for production your script can just call cat and it will create a pid and then just sit waiting for the next command (which never comes).
For truly minimal overhead, you don't want to run any external commands. When the shell starts a command, it first forks itself, then the child shell execs the external command. If the forked child can run a builtin, you can skip the exec.
Start by creating a read-only fifo somewhere.
mkfifo foo
chmod 400 foo
Then, whenever you need a do-nothing process, just fork a shell which tries to read from the fifo. It's read-only, so no one can write to it, so all reads will block.
read < foo &

Resource leaking of available PIDs by long running bash scripts

I am currently reading up on some more details on Bash scripting and especially process management here. In the section on "PIDs and Parents" I found the following statement:
A process's PID will NEVER be freed up for use after the process dies UNTIL the parent process waits for the PID to see whether it ended and retrieve its exit code.
So if I understand this correctly, if I start an process in a bash script, then the process terminates, that the PID cannot be used by any other process. Wouldn't this mean, that if I have a long running script, which repeatedly starts other sub-processes but never waits on them, that I'll eventually have a resource leak, because the used PIDs will not be returned back to the system?
How about if I actually wait for the other process, but the wait get's cancelled by a trap. Would this wait somehow still free up the PID, or do I have to wait again after the trap has been caught?
Luckily you won't. I can't tell you exactly why but you can easily test this. Run the following script (stop with Ctrl+C):
#!/bin/bash
while true; do
sleep 5 &
sleep 1
done
You can see you get no zombies (leaked PIDs) after 6+ seconds. To see some zombies use the following python code (again, stop with Ctrl+C):
#!/usr/bin/python
import subprocess, time
pl = []
while True:
pl.append(subprocess.Popen(["sleep", "5"]))
time.sleep(1)
After 6 seconds you'll see one zombie:
ps xaw | grep 'sleep'
...
26470 pts/2 Z+ 0:00 [sleep] <defunct>
...
My guess is that bash does wait and stores the results reaping the zombile processes with or without the builtin wait command. For the python script, if you remove the pl.append part the garbage collection releases the objects and does it's magic again reaping the zombies. Just for info a child may never become a zombie (from wikipedia, Zombie process):
...if the parent explicitly ignores SIGCHLD by setting its handler to SIG_IGN (rather
than simply ignoring the signal by default) or has the SA_NOCLDWAIT flag set, all
child exit status information will be discarded and no zombie processes will be left.
You don't have to explicitly wait on foreground processes because the shell in which your script is running waits on them. The next process won't start until the previous one finishes.
If you start many long running background processes, you could use all available PIDs, but that's subject to the limit of ulimit -u (which could be unlimited).

bash restart sub-process using trap SIGCHLD?

I've seen monitoring programs either in scripts that check process status using 'ps' or 'service status(on Linux)' periodically, or in C/C++ that forks and wait on the process...
I wonder if it is possible to use bash with trap and restart the sub-process when SIGCLD received?
I have tested a basic suite on RedHat Linux with following idea (and certainly it didn't work...)
#!/bin/bash
set -o monitor # can someone explain this? discussion on Internet say this is needed
trap startProcess SIGCHLD
startProcess() {
/path/to/another/bash/script.sh & # the one to restart
while [ 1 ]
do
sleep 60
done
}
startProcess
what the bash script being started just sleep for a few seconds and exit for now.
several issues observed:
when the shell starts in foreground, SIGCHLD will be handled only once. does trap reset signal handling like signal()?
the script and its child seem to be immune to SIGINT, which means they cannot be stopped by ^C
since cannot be closed, I closed the terminal. The script seems to be HUP and many zombie children left.
when run in background, the script caused terminal to die
... anyway, this does not work at all. I have to say I know too little about this topic.
Can someone suggest or give some working examples?
Are there scripts for such use?
how about use wait in bash, then?
Thanks
I can try to answer some of your questions but not all based on what I
know.
The line set -o monitor (or equivalently, set -m) turns on job
control, which is only on by default for interactive shells. This seems
to be required for SIGCHLD to be sent. However, job control is more of
an interactive feature and not really meant to be used in shell scripts
(see also this question).
Also keep in mind this is probably not what you intended to do
because once you enable job control, SIGCHLD will be sent for every
external command that exists (e.g. every time you run ls or grep or
anything, a SIGCHLD will fire when that command completes and your trap
will run).
I suspect the reason the SIGCHLD trap only appears to run once is
because your trap handler contains a foreground infinite loop, so your
script gets stuck in the trap handler. There doesn't seem to be a point
to that loop anyways, so you could simply remove it.
The script's "immunity" to SIGINT seems to be an effect of enabling
job control (the monitor part). My hunch is with job control turned on,
the sub-instance of bash that runs your script no longer terminates
itself in response to a SIGINT but instead passes the SIGINT through to
its foreground child process. In your script, the ^C i.e. SIGINT
simply acts like a continue statement in other programming languages
case, since SIGINT will just kill the currently running sleep 60,
whereupon the while loop will immediately run a new sleep 60.
When I tried running your script and then killing it (from another
terminal), all I ended up with were two stray sleep processes.
Backgrounding that script also kills my shell for me, although
the behavior is not terribly consistent (sometimes it happens
immediately, other times not at all). It seems typing any keys other
than enter causes an EOF to get sent somehow. Even after the terminal
exits the script continues to run in the background. I have no idea
what is going on here.
Being more specific about what you want to accomplish would help. If
you just want a command to run continuously for the lifetime of your
script, you could run an infinite loop in the background, like
while true; do
some-command
echo some-command finished
echo restarting some-command ...
done &
Note the & after the done.
For other tasks, wait is probably a better idea than using job control
in a shell script. Again, it would depend on what exactly you are trying
to do.

How to kill all children of the current shell on interrupt?

My scripts cdist-deploy-to and cdist-mass-deploy (from cdist configuration management) run interactively (i.e. are called by a user).
These scripts call a lot of scripts, which again call some scripts:
cdist-mass-deploy ...
cdist-deploy-to ...
cdist-explorer-run-global ...
cdist-dir ....
What I want is to exit / kill all scripts, as soon as cdist-mass-deploy is either stopped by control C (SIGINT) or killed with SIGTERM.
cdist-deploy-to can also be called interactively and should exhibit the same behaviour.
Using ps -ef... and co variants to find out all processes with the ppid looks like it could be quite unportable. Using $! does not work as in the deeper levels the children are no background processes.
I tried using the following code:
__cdist_kill_on_interrupt()
{
__cdist_tmp_removal
kill 0
exit 1
}
trap __cdist_kill_on_interrupt INT TERM
But this leads to ugly Terminated messages as well as to a segfault in the shells (dash, bash, zsh) and seems not to stop everything instantly anyway:
# cdist-mass-deploy -p ikq04.ethz.ch ikq05.ethz.ch
core: Waiting for cdist-deploy-to jobs to finish
^CTerminated
Terminated
Terminated
Terminated
Segmentation fault
So the question is, how to cleanly exit including all (sub-)children in a portable manner (bourne shell, no csh support needed)?
You don't need to handle ^C, that will result in a signal being sent to the whole process group, which will kill all the processes that are not in the background. So you don't need to catch INT.
The only reason you get a Terminated when you kill them is that kill sends TERM by default, but that's reasonable if you are handling a TERM in the first place. You could use kill -INT 0 if you want to avoid the messages.
(responding with extra info)
If the child processes are run in the background, you can get their process ids just after you start them, using the $! special shell variable. Gather these together in a variable and just kill them all when you need to terminate.

Resources