Shell Job Control: Track status of background jobs with WNOHANG wait - shell

What does GNU mean by this line (from here)?
The shell must also check on the status of background jobs so that it can report terminated and stopped jobs to the user; this can be done by calling waitpid with the WNOHANG option.
I don't understand why the shell should alert the user about background processes before executing a command. And what would that look like? Say I call ls, but a background process has completed, so that process's status is printed before ls runs?

It's for implementing notifications like the following for background jobs:
$ cmd_1 &
$ cmd_2
$ cmd_3
[1]+ Done cmd_1
$
(Something like sleep 5 is a good cmd_1 to try this out with.)
In the above, it's assumed that the backgrounded cmd_1 job finishes while cmd_3 is being typed in or run. The notification is delivered afterwards, just before printing the last prompt above.
waitpid(2) is used to wait for processes to change state (terminate, stop, or continue, as with e.g. Ctrl-Z and fg).
To implement the display above, the shell can call waitpid(2) to check if the background job has changed state each time before prompting for a new command. If it does this without passing WNOHANG, then the waitpid() call will block until the background job actually changes state, meaning the shell will be stuck until cmd_1 finishes before printing the second prompt. WNOHANG makes the waitpid() call non-blocking and allows the shell to "poll" for state changes in the job instead.
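For intuition, here is a rough shell-level analog of that non-blocking poll (a minimal sketch, assuming bash; a real shell does this internally with waitpid(..., WNOHANG) rather than kill):
sleep 5 &                                # the background job, like cmd_1 above
bgpid=$!
# ... run other commands here (cmd_2, cmd_3) ...
if kill -0 "$bgpid" 2>/dev/null; then    # kill -0 sends no signal; it only tests whether the process is still around
    echo "job $bgpid is still running"
else
    echo "[1]+ Done"                     # the kind of notification the shell prints before the next prompt
fi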

Related

Neovim process spawned from fish script terminates immediately

I'm trying to achieve the following:
from a fish script, open a PDF reader as a background job. Once it is opened, spawn another fish process (that runs an infinite while loop), also as a background job.
Next, open an editor (neovim) and allow it to take control of the running terminal. Once neovim terminates, also suspend the previous 2 background jobs (mupdf and the other fish process).
My current attempt looks something along the lines of:
mupdf $pdfpath &
set pid_mupdf $last_pid
fish -c "while inotifywait ...; [logic to rebuild the pdf file..]; end" &
set pid_sub $last_pid
nvim $mdpath && kill -2 $pid_mudf $pid_sub
First I open mupdf as a background job and save its PID in a variable. Next I spawn the other fish process, also as a background job, and I save its PID as well.
Next I run nvim (but not as a background job, as I intend to actually control it), and after it is terminated by the user, I gracefully kill the previous 2 background jobs.
However this doesn't work as intended.
mupdf and the second fish process open successfully, and so does nvim, but it quickly closes after around half a second, after which I get the following in the controlling terminal window: [screenshot omitted] (bote is just the filename of the script from which the lines above originate)
The 2 background processes stay running after that and I have to kill them manually.
I understand that the script is sent a SIGHUP because the controlling terminal now executes another application (neovim), but why does neovim close after that?
I also tried disowning the background processes after they're spawned but that didn't help.
How would I solve this issue?
The problem is that $last_pid, in fish 3, and %last, in fish 2, don't work by default in scripts. See https://github.com/fish-shell/fish-shell/issues/5036. You can "fix" this by putting status job-control full at the top of the script or by using the (jobs -lp) hack that Glenn mentioned.
Regarding the background process remaining running... I can't reproduce that. It works for me. However, note that your nvim && kill will only run the kill if nvim exits with a status of zero. If you always want the kill to be run you should just unconditionally execute it. Also, your use of signal two (SIGINT) should produce the desired result but is unusual. You should use kill -15 or just omit the signal in which case it defaults to 15 (SIGTERM).
You're getting the PID incorrectly. The $pid_mudf and $pid_sub variables are empty. You want
set pid_mupdf (jobs -lp)
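Combining both fixes, the script might become (a sketch in fish, keeping the question's placeholders):
status job-control full   # fish 3: makes $last_pid usable in scripts (see the linked issue)

mupdf $pdfpath &
set pid_mupdf $last_pid   # or: set pid_mupdf (jobs -lp)

fish -c "while inotifywait ...; [logic to rebuild the pdf file..]; end" &
set pid_sub $last_pid

nvim $mdpath
kill $pid_mupdf $pid_sub   # run unconditionally; with no signal given, SIGTERM is sent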

Multiple processes from one bash script [duplicate]

I'm trying to use a shell script to start a command. I don't care if/when/how/why it finishes. I want the process to start and run, but I want to be able to get back to my shell immediately...
You can just run the script in the background:
$ myscript &
Note that this is different from putting the & inside your script, which probably won't do what you want.
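To make that note concrete (a sketch, assuming bash; some-long-command is a placeholder):
#!/bin/bash
# myscript: the & here backgrounds some-long-command relative to the
# script's own shell, so your interactive shell's job control (jobs, %%, $!)
# never sees it, and the script itself still exits immediately
some-long-command &
Running myscript & instead makes the whole script a job that your interactive shell tracks.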
Everyone just forgot disown. So here is a summary:
& puts the job in the background: it makes the job block on attempting to read input, and makes the shell not wait for its completion.
disown removes the process from the shell's job control, but it still leaves it connected to the terminal.
One of the results is that the shell won't send it a SIGHUP (if the shell receives a SIGHUP, it normally forwards a SIGHUP to the process, which usually causes the process to terminate).
And obviously, it can only be applied to background jobs (because you cannot enter it while a foreground job is running).
nohup disconnects the process from the terminal, redirects its output to nohup.out, and shields it from SIGHUP.
The process won't receive any SIGHUP that is sent.
It's completely independent from job control and could in principle also be used for foreground jobs (although that's not very useful).
It is usually used with & (as a background job).
nohup cmd
doesn't hang up when you close the terminal. Output by default goes to nohup.out.
You can combine this with backgrounding,
nohup cmd &
and get rid of the output,
nohup cmd > /dev/null 2>&1 &
You can also disown a command: type cmd, then Ctrl-Z, bg, disown.
Alternatively, after you have got the program running, you can hit Ctrl-Z, which stops your program, and then type
bg
which puts your last stopped program in the background. (Useful if you started something without '&' and still want it in the background without restarting it.)
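Annotated, the whole sequence from the two answers above looks like this (long-running-cmd is a placeholder):
$ long-running-cmd    # started in the foreground, without &
^Z                    # Ctrl-Z stops (suspends) it
$ bg                  # resumes it in the background
$ disown              # removes it from job control, so the shell won't SIGHUP it on exit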
screen -m -d command starts the command in a detached session. You can use screen -r to attach to the started session. It is a wonderful tool, extremely useful also for remote sessions. Read more in man screen.

Is there a way to make bash job control quiet?

Bash is quite verbose when running jobs in the background:
$ echo toto&
toto
[1] 15922
[1]+ Done echo toto
Since I'm trying to run jobs in parallel and use the output, I'd like to find a way to silence bash. Is there a way to remove this superfluous output?
You can use parentheses to run a background command in a subshell, and that will silence the job control messages. For example:
(sleep 10 & )
Note: The following applies to interactive Bash sessions. In scripts, job-control messages are never printed.
There are 2 basic scenarios for silencing Bash's job-control messages:
Launch-and-forget:
CodeGnome's helpful answer suggests enclosing the background command in a simple subshell - e.g., (sleep 10 &) - which effectively silences job-control messages - both on job creation and on job termination.
This has an important side effect:
By using control operator & inside the subshell, you lose control of the background job - jobs won't list it, and neither %% (the job spec of the most recently launched job) nor $! (the PID of the last process launched as part of the most recent job) will reflect it.[1]
For launch-and-forget scenarios, this is not a problem:
You just fire off the background job,
and you let it finish on its own (and you trust that it runs correctly).
[1] Conceivably, you could go looking for the process yourself, by searching running processes for ones matching its command line, but that is cumbersome and not easy to make robust.
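That lookup might be attempted with pgrep (a sketch; fragile, as the footnote warns):
(sleep 10 &)            # launch-and-forget: the parent shell keeps no job or PID for it
pgrep -f 'sleep 10'     # matches any process whose command line contains 'sleep 10' - including unrelated ones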
Launch-and-control-later:
If you want to remain in control of the job, so that you can later:
kill it, if need be, or
synchronously wait (at some later point) for its completion,
a different approach is needed:
Silencing the creation job-control messages is handled below, but in order to silence the termination job-control messages categorically, you must turn the job-control shell option OFF:
set +m (set -m turns it back on)
Caveat: This is a global setting that has a number of important side effects, notably:
Stdin for background commands is then /dev/null rather than the current shell's.
The keyboard shortcuts for suspending (Ctrl-Z) and delay-suspending (Ctrl-Y) a foreground command are disabled.
For the full story, see man bash and (case-insensitively) search for occurrences of "job control".
To silence the creation job-control messages, enclose the background command in a group command and redirect the latter's stderr output to /dev/null
{ sleep 5 & } 2>/dev/null
The following example shows how to quietly launch a background job while retaining control of the job in principle.
$ set +m; { sleep 5 & } 2>/dev/null # turn job-control option off and launch quietly
$ jobs # shows the job just launched; it will complete quietly due to set +m
If you do not want to turn off the job-control option (set +m), the only way to silence the termination job-control message is to either kill the job or wait for it:
Caveat: There are two edge cases where this technique still produces output:
If the background command tries to read from stdin right away.
If the background command terminates right away.
To launch the job quietly (as above, but without set +m):
$ { sleep 5 & } 2>/dev/null
To wait for it quietly:
$ wait %% 2>/dev/null # use of %% is optional here
To kill it quietly:
{ kill %% && wait; } 2>/dev/null
The additional wait is necessary to turn the termination job-control message - which Bash normally displays asynchronously, at the time of actual process termination, shortly after the kill - into synchronous output from wait, which can then be silenced.
But, as stated, if the job completes by itself, a job-control message will still be displayed.
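Putting those two pieces together, without set +m (a sketch):
{ sleep 60 & } 2>/dev/null           # quiet launch
# ... do other work ...
{ kill %% && wait; } 2>/dev/null     # quiet termination, before the job can finish by itself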
Wrap it in a dummy script:
quiet.sh:
#!/bin/bash
"$@" &   # run the arguments as a command, in the background
then call it, passing your command to it as arguments:
./quiet.sh echo toto
You may need to play with quotes depending on your input.
Interactively, no. It will always display job status. You can influence when the status is shown using set -b.
There's nothing preventing you from using the output of your commands (via pipes, or storing it variables, etc). The job status is sent to the controlling terminal by the shell and doesn't mix with other I/O. If you're doing something complex with jobs, the solution is to write a separate script.
The job messages are only really a problem if you have, say, functions in your bashrc which make use of job control and which you want to have direct access to your interactive environment. Unfortunately, there's nothing you can do about it.
One solution (in bash anyway) is to route all the output to /dev/null
echo 'hello world' > /dev/null &
The above will not give you any output other than the id for the bg process.

bash restart sub-process using trap SIGCHLD?

I've seen monitoring programs either in scripts that check process status using 'ps' or 'service status(on Linux)' periodically, or in C/C++ that forks and wait on the process...
I wonder if it is possible to use bash with trap and restart the sub-process when SIGCLD received?
I have tested a basic suite on RedHat Linux with following idea (and certainly it didn't work...)
#!/bin/bash
set -o monitor # can someone explain this? discussions on the Internet say this is needed
trap startProcess SIGCHLD

startProcess() {
    /path/to/another/bash/script.sh & # the one to restart
    while [ 1 ]
    do
        sleep 60
    done
}

startProcess
The bash script being started just sleeps for a few seconds and exits, for now.
Several issues observed:
When the shell starts in the foreground, SIGCHLD is handled only once. Does trap reset signal handling like signal()?
The script and its child seem to be immune to SIGINT, which means they cannot be stopped by ^C.
Since it cannot be closed, I closed the terminal. The script seems to have been HUPed, and many zombie children were left.
When run in the background, the script caused the terminal to die.
... anyway, this does not work at all. I have to say I know too little about this topic.
Can someone suggest or give some working examples?
Are there scripts for such use?
How about using wait in bash, then?
Thanks
I can try to answer some of your questions, but not all, based on what I know.
The line set -o monitor (or equivalently, set -m) turns on job control, which is only on by default for interactive shells. This seems to be required for SIGCHLD to be sent. However, job control is more of an interactive feature and not really meant to be used in shell scripts (see also this question).
Also keep in mind this is probably not what you intended to do, because once you enable job control, SIGCHLD will be sent for every external command that exits (e.g. every time you run ls or grep or anything, a SIGCHLD will fire when that command completes and your trap will run).
I suspect the reason the SIGCHLD trap only appears to run once is that your trap handler contains a foreground infinite loop, so your script gets stuck in the trap handler. There doesn't seem to be a point to that loop anyway, so you could simply remove it.
The script's "immunity" to SIGINT seems to be an effect of enabling
job control (the monitor part). My hunch is with job control turned on,
the sub-instance of bash that runs your script no longer terminates
itself in response to a SIGINT but instead passes the SIGINT through to
its foreground child process. In your script, the ^C i.e. SIGINT
simply acts like a continue statement in other programming languages
case, since SIGINT will just kill the currently running sleep 60,
whereupon the while loop will immediately run a new sleep 60.
When I tried running your script and then killing it (from another terminal), all I ended up with were two stray sleep processes.
Backgrounding that script also kills my shell for me, although the behavior is not terribly consistent (sometimes it happens immediately, other times not at all). It seems typing any keys other than Enter causes an EOF to get sent somehow. Even after the terminal exits, the script continues to run in the background. I have no idea what is going on here.
Being more specific about what you want to accomplish would help. If you just want a command to run continuously for the lifetime of your script, you could run an infinite loop in the background, like
while true; do
    some-command
    echo some-command finished
    echo restarting some-command ...
done &
Note the & after the done.
For other tasks, wait is probably a better idea than using job control in a shell script. Again, it would depend on what exactly you are trying to do.
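A wait-based restart loop, for instance, needs no job control or SIGCHLD trap (a minimal sketch, assuming bash; some-command is a placeholder):
#!/bin/bash
while true; do
    some-command &     # start (or restart) the sub-process
    wait $!            # block until it exits - no SIGCHLD trap or set -o monitor needed
    echo "some-command exited; restarting ..."
done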

How to kill all children of the current shell on interrupt?

My scripts cdist-deploy-to and cdist-mass-deploy (from cdist configuration management) run interactively (i.e. are called by a user).
These scripts call a lot of scripts, which again call some scripts:
cdist-mass-deploy ...
cdist-deploy-to ...
cdist-explorer-run-global ...
cdist-dir ....
What I want is to exit / kill all scripts, as soon as cdist-mass-deploy is either stopped by control C (SIGINT) or killed with SIGTERM.
cdist-deploy-to can also be called interactively and should exhibit the same behaviour.
Using ps -ef ... and co. variants to find out all processes with the ppid looks like it could be quite unportable. Using $! does not work, as at the deeper levels the children are not background processes.
I tried using the following code:
__cdist_kill_on_interrupt()
{
    __cdist_tmp_removal
    kill 0
    exit 1
}

trap __cdist_kill_on_interrupt INT TERM
But this leads to ugly Terminated messages as well as to a segfault in the shells (dash, bash, zsh) and seems not to stop everything instantly anyway:
# cdist-mass-deploy -p ikq04.ethz.ch ikq05.ethz.ch
core: Waiting for cdist-deploy-to jobs to finish
^CTerminated
Terminated
Terminated
Terminated
Segmentation fault
So the question is, how to cleanly exit including all (sub-)children in a portable manner (bourne shell, no csh support needed)?
You don't need to handle ^C: that results in a signal being sent to the whole process group, which will kill all the processes that are not in the background. So you don't need to catch INT.
The only reason you get a Terminated when you kill them is that kill sends TERM by default, but that's reasonable if you are handling a TERM in the first place. You could use kill -INT 0 if you want to avoid the messages.
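Applied to the handler from the question, that advice would look something like this (a sketch):
__cdist_kill_on_interrupt()
{
    __cdist_tmp_removal
    kill -INT 0    # INT instead of the default TERM avoids the ugly 'Terminated' messages
    exit 1
}

trap __cdist_kill_on_interrupt TERM   # per the above, catching INT is unnecessary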
(responding with extra info)
If the child processes are run in the background, you can get their process ids just after you start them, using the $! special shell variable. Gather these together in a variable and just kill them all when you need to terminate.
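For example (a sketch, assuming bash; the child script names are placeholders):
pids=""
child-script-1 & pids="$pids $!"   # record each background child's PID as it is launched
child-script-2 & pids="$pids $!"
trap 'kill $pids' INT TERM         # on interrupt or termination, kill them all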
