Bash is quite verbose when running jobs in the background:
$ echo toto&
toto
[1] 15922
[1]+ Done echo toto
Since I'm trying to run jobs in parallel and use the output, I'd like to find a way to silence bash. Is there a way to remove this superfluous output?
You can use parentheses to run a background command in a subshell, and that will silence the job control messages. For example:
(sleep 10 & )
Note: The following applies to interactive Bash sessions. In scripts, job-control messages are never printed.
There are 2 basic scenarios for silencing Bash's job-control messages:
Launch-and-forget:
CodeGnome's helpful answer suggests enclosing the background command in a simple subshell - e.g., (sleep 10 &) - which effectively silences job-control messages - both on job creation and on job termination.
This has an important side effect:
By using control operator & inside the subshell, you lose control of the background job - jobs won't list it, and neither %% (the job spec (ID) of the most recently launched job) nor $! (the PID of the last process launched as part of the most recent job) will reflect it.[1]
For launch-and-forget scenarios, this is not a problem:
You just fire off the background job,
and you let it finish on its own (and you trust that it runs correctly).
[1] Conceivably, you could go looking for the process yourself, by searching running processes for ones matching its command line, but that is cumbersome and not easy to make robust.
Launch-and-control-later:
If you want to remain in control of the job, so that you can later:
kill it, if need be.
synchronously wait (at some later point) for its completion,
a different approach is needed:
Silencing the creation job-control messages is handled below, but in order to silence the termination job-control messages categorically, you must turn the job-control shell option OFF:
set +m (set -m turns it back on)
Caveat: This is a global setting that has a number of important side effects, notably:
Stdin for background commands is then /dev/null rather than the current shell's.
The keyboard shortcuts for suspending (Ctrl-Z) and delay-suspending (Ctrl-Y) a foreground command are disabled.
For the full story, see man bash and (case-insensitively) search for occurrences of "job control".
To silence the creation job-control messages, enclose the background command in a group command and redirect the latter's stderr output to /dev/null:
{ sleep 5 & } 2>/dev/null
The following example shows how to quietly launch a background job while retaining control of the job in principle.
$ set +m; { sleep 5 & } 2>/dev/null # turn job-control option off and launch quietly
$ jobs # shows the job just launched; it will complete quietly due to set +m
If you do not want to turn off the job-control option (set +m), the only way to silence the termination job-control message is to either kill the job or wait for it:
Caveat: There are two edge cases where this technique still produces output:
If the background command tries to read from stdin right away.
If the background command terminates right away.
To launch the job quietly (as above, but without set +m):
$ { sleep 5 & } 2>/dev/null
To wait for it quietly:
$ wait %% 2>/dev/null # use of %% is optional here
To kill it quietly:
{ kill %% && wait; } 2>/dev/null
The additional wait is necessary to turn the termination job-control message - which Bash normally displays asynchronously, at the time of the actual process termination shortly after the kill - into synchronous output from wait, which can then be silenced.
But, as stated, if the job completes by itself, a job-control message will still be displayed.
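To tie this back to the original goal (running jobs in parallel and using their output), here is a minimal sketch using the same approach; the sort commands, input files, and output files are just placeholders:
{ sort bigfile1 > out1.txt & } 2>/dev/null   # launch quietly, capture output in a file
{ sort bigfile2 > out2.txt & } 2>/dev/null
wait 2>/dev/null                             # wait for all jobs quietly (same caveats as above)
cat out1.txt out2.txt                        # use the results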
Wrap it in a dummy script:
quiet.sh:
#!/bin/bash
"$@" &   # run the arguments as a command, in the background
then call it, passing your command to it as an argument:
./quiet.sh echo toto
You may need to play with quotes depending on your input.
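For instance, with the "$@" version of quiet.sh shown above, simple commands need no special quoting, while compound commands are easiest to pass via bash -c (the paths here are just examples):
./quiet.sh sleep 5                                    # simple command
./quiet.sh bash -c 'echo "hello world" > /tmp/out'    # quote compound commands at the call site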
Interactively, no. It will always display job status. You can influence when the status is shown using set -b.
There's nothing preventing you from using the output of your commands (via pipes, or by storing it in variables, etc). The job status is sent to the controlling terminal by the shell and doesn't mix with other I/O. If you're doing something complex with jobs, the solution is to write a separate script.
The job messages are only really a problem if you have, say, functions in your bashrc which make use of job control which you want to have direct access to your interactive environment. Unfortunately there's nothing you can do about it.
One solution (in bash anyway) is to route all the output to /dev/null
echo 'hello world' > /dev/null &
The above will not give you any output other than the job number and PID of the background process.
Related
I found that some people run a program in a shell like this:
exe > the.log 2>&1 &!
I understand the first part: it redirects stderr to stdout, and "&" runs the program in the background. But I don't know what "&!" means - what does the exclamation mark do?
Within zsh the command &! is a shortcut for disown, i.e. the program won't get killed upon exiting the invoking shell.
See man zshbuiltins
disown [ job ... ]
job ... &|
job ... &!
Remove the specified jobs from the job table; the shell will no longer report their status, and will not complain if you try to exit an interactive shell with them running or stopped. If no job is specified, disown the current job. If the jobs are currently stopped and the AUTO_CONTINUE option is not set, a warning is printed containing information about how to make them running after they have been disowned. If one of the latter two forms is used, the jobs will automatically be made running, independent of the setting of the AUTO_CONTINUE option.
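Bash has no &! operator, but its disown builtin gives much the same effect; a rough equivalent (cmd is a placeholder) would be:
cmd & disown      # run in the background, then drop the job from Bash's job table
nohup cmd &       # alternative: ignore SIGHUP so the job survives logout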
When I start some background process in the shell, for example:
geth --maxpeers 0 --rpc &
It returns something like:
[1] 1859
...without any output stream from this process. I do not understand what it is. And how can I get the stdout of geth? The documentation says that the stdout of a background process is displayed in the shell by default.
My shell is running in remote Ubuntu system.
The "&" directs the shell to run the command in the background. It uses the fork system call to create a sub-shell and run the job asynchronously.
The stdout and stderr should still be printed to the screen.
If you do not want to see any output on the screen, redirect both stdout and stderr to a file by:
geth --maxpeers 0 --rpc > logfile 2>&1 &
Regarding the first part of your question:
...without any output stream from this process. I do not understand what it is.
This is part of the command execution environment (part of the shell itself) and is not the result of your script (it is how the shell handles backgrounding your script and keeps track of the process to allow pausing and resuming jobs).
If you look at man bash under JOB CONTROL, it explains what you are seeing in detail, e.g.
The shell associates a job with each pipeline. It keeps a table
of currently executing jobs, which may be listed with the jobs
command. When bash starts a job asynchronously (in the background),
it prints a line that looks like:
[1] 25647
I do not understand what it is: [1] 1859
It is the output from Bash's job-control feature, which enables managing background processes (jobs), and it contains information about the job just started, printed to stderr:
1 is the job ID (which, prefixed with %, can be used with builtins such as kill and wait)
25647 is the PID (process ID) of the background process.
Read more in the JOB CONTROL section of man bash.
how can I get the stdout of geth? The documentation says that the stdout of a background process is displayed in the shell by default.
Indeed, background jobs by default print their output to the current shell's stdout and stderr streams, but note that they do so asynchronously - that is, output from background jobs will appear as it is produced (potentially buffered), interleaved with output sent directly to the current shell, which can be disruptive.
You can apply redirections as usual to a background command in order to capture its output in file(s), as demonstrated in user3589054's helpful answer, but note that doing so will not silence the job-control message ([1] 1859 in the example above).
If you want to silence the job-control message on creation, use:
{ geth --maxpeers 0 --rpc & } 2>/dev/null
To silence the entire life cycle of a job, see this answer of mine.
What does GNU mean by this line (from here)?
The shell must also check on the status of background jobs so that it can report terminated and stopped jobs to the user; this can be done by calling waitpid with the WNOHANG option.
I don't understand why the shell should alert the user about background processes before executing the next command. What would that look like? Like, I call ls, but a background process completed, so that process's status is printed before ls?
It's for implementing notifications like the following for background jobs:
$ cmd_1 &
$ cmd_2
$ cmd_3
[1]+ Done cmd_1
$
(Something like sleep 5 is a good cmd_1 to try this out with.)
In the above, it's assumed that the backgrounded cmd_1 job finishes while cmd_3 is being typed in or run. The notification is delivered afterwards, just before printing the last prompt above.
waitpid(2) is used to wait for processes to change state (terminate, stop, or continue, as with e.g. Ctrl-Z and fg).
To implement the display above, the shell can call waitpid(2) to check if the background job has changed state each time before prompting for a new command. If it does this without passing WNOHANG, then the waitpid() call will block until the background job actually changes state, meaning the shell will be stuck until cmd_1 finishes before printing the second prompt. WNOHANG makes the waitpid() call non-blocking and allows the shell to "poll" for state changes in the job instead.
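The question concerns a shell implemented in C, but the poll-versus-block distinction can be sketched from Bash itself; as a rough illustration only, kill -0 stands in for a non-blocking liveness check (it only tests whether the process still exists, not whether it changed state), while wait is the blocking variant:
sleep 5 &            # stand-in for the backgrounded cmd_1
pid=$!

# Non-blocking "poll": check on the job, then carry on with other work.
if kill -0 "$pid" 2>/dev/null; then
    echo "job $pid is still running; continuing"
fi

# Blocking variant: does not return until the job terminates.
wait "$pid"
echo "job $pid finished with status $?"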
I've seen monitoring programs, either scripts that periodically check process status using 'ps' or 'service status' (on Linux), or C/C++ programs that fork and wait on the process...
I wonder if it is possible to use bash with trap and restart the sub-process when SIGCLD received?
I have tested a basic suite on RedHat Linux with the following idea (and it certainly didn't work...)
#!/bin/bash
set -o monitor # can someone explain this? discussions on the Internet say this is needed
trap startProcess SIGCHLD
startProcess() {
  /path/to/another/bash/script.sh & # the one to restart
  while [ 1 ]
  do
    sleep 60
  done
}
startProcess
The bash script being started just sleeps for a few seconds and exits, for now.
Several issues observed:
When the shell starts in the foreground, SIGCHLD is handled only once. Does trap reset signal handling like signal()?
The script and its child seem to be immune to SIGINT, which means they cannot be stopped by ^C.
Since it could not be stopped, I closed the terminal. The script seems to have received SIGHUP, and many zombie children were left behind.
When run in the background, the script caused the terminal to die.
... anyway, this does not work at all. I have to say I know too little about this topic.
Can someone suggest or give some working examples?
Are there scripts for such use?
How about using wait in bash, then?
Thanks
I can try to answer some of your questions, though not all, based on what I know.
The line set -o monitor (or equivalently, set -m) turns on job
control, which is only on by default for interactive shells. This seems
to be required for SIGCHLD to be sent. However, job control is more of
an interactive feature and not really meant to be used in shell scripts
(see also this question).
Also keep in mind this is probably not what you intended to do
because once you enable job control, SIGCHLD will be sent for every
external command that exits (e.g. every time you run ls or grep or
anything, a SIGCHLD will fire when that command completes and your trap
will run).
I suspect the reason the SIGCHLD trap only appears to run once is
because your trap handler contains a foreground infinite loop, so your
script gets stuck in the trap handler. There doesn't seem to be a point
to that loop anyways, so you could simply remove it.
The script's "immunity" to SIGINT seems to be an effect of enabling
job control (the monitor part). My hunch is with job control turned on,
the sub-instance of bash that runs your script no longer terminates
itself in response to a SIGINT but instead passes the SIGINT through to
its foreground child process. In your script, the ^C i.e. SIGINT
simply acts like a continue statement in other programming languages,
since SIGINT will just kill the currently running sleep 60,
whereupon the while loop will immediately run a new sleep 60.
When I tried running your script and then killing it (from another
terminal), all I ended up with were two stray sleep processes.
Backgrounding that script also kills my shell for me, although
the behavior is not terribly consistent (sometimes it happens
immediately, other times not at all). It seems typing any keys other
than enter causes an EOF to get sent somehow. Even after the terminal
exits the script continues to run in the background. I have no idea
what is going on here.
Being more specific about what you want to accomplish would help. If
you just want a command to run continuously for the lifetime of your
script, you could run an infinite loop in the background, like
while true; do
    some-command
    echo some-command finished
    echo restarting some-command ...
done &
Note the & after the done.
For other tasks, wait is probably a better idea than using job control
in a shell script. Again, it would depend on what exactly you are trying
to do.
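For example, if the underlying goal is simply to restart a child whenever it exits (the monitoring idea from the question), a minimal sketch using wait - with no job control and no SIGCHLD trap - might look like this, reusing the path from the question:
#!/bin/bash
# Minimal restart-on-exit supervisor sketch.
while true; do
    /path/to/another/bash/script.sh &    # the child to keep alive
    wait "$!"                            # block until that particular child exits
    echo "child exited; restarting ..." >&2
    sleep 1                              # avoid a tight respawn loop if it dies instantly
done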
In this answer to another question, I was told that
in scripts you don't have job control
(and trying to turn it on is stupid)
This is the first time I've heard this, and I've pored over the bash.info section on Job Control (chapter 7), finding no mention of either of these assertions. [Update: The man page is a little better, mentioning 'typical' use, default settings, and terminal I/O, but no real reason why job control is particularly ill-advised for scripts.]
So why doesn't script-based job-control work, and what makes it a bad practice (aka 'stupid')?
Edit: The script in question starts a background process, starts a second background process, then attempts to put the first process back into the foreground so that it has normal terminal I/O (as if run directly), which can then be redirected from outside the script. Can't do that to a background process.
As noted by the accepted answer to the other question, there exist other scripts that solve that particular problem without attempting job control. Fine. And the lambasted script uses a hard-coded job number — Obviously bad. But I'm trying to understand whether job control is a fundamentally doomed approach. It still seems like maybe it could work...
What he meant is that job control is by default turned off in non-interactive mode (i.e. in a script.)
From the bash man page:
JOB CONTROL
Job control refers to the ability to selectively stop (suspend)
the execution of processes and continue (resume) their execution at a
later point.
A user typically employs this facility via an interactive interface
supplied jointly by the system’s terminal driver and bash.
and
set [--abefhkmnptuvxBCHP] [-o option] [arg ...]
...
-m Monitor mode. Job control is enabled. This option is on by
default for interactive shells on systems that support it (see
JOB CONTROL above). Background processes run in a separate
process group and a line containing their exit status is
printed upon their completion.
When he said it "is stupid" he meant that not only:
is job control meant mostly for facilitating interactive control (whereas a script can work directly with the PIDs), but also,
to quote his original answer, it "... relies on the fact that you didn't start any other jobs previously in the script which is a bad assumption to make." Which is quite correct.
UPDATE
In answer to your comment: yes, nobody will stop you from using job control in your bash script -- there is no hard case for forcefully disabling set -m (i.e. yes, job control from the script will work if you want it to.) Remember that in the end, especially in scripting, there is always more than one way to skin a cat, but some ways are more portable, more reliable, make it simpler to handle error cases, parse the output, etc.
Your particular circumstances may or may not warrant a way different from what lhunath (and other users) deem "best practices".
Job control with bg and fg is useful only in interactive shells. But & in conjunction with wait is useful in scripts too.
On multiprocessor systems, spawning background jobs can greatly improve a script's performance, e.g. in build scripts where you want to start at least one compiler per CPU, or when processing images with ImageMagick tools in parallel, etc.
The following example runs up to 8 parallel gcc's to compile all source files in an array:
#!/bin/bash
...
for ((i = 0, end=${#sourcefiles[@]}; i < end;)); do
  for ((cpu_num = 0; cpu_num < 8; cpu_num++, i++)); do
    if ((i < end)); then gcc -c "${sourcefiles[i]}" & fi
  done
  wait
done
There is nothing "stupid" about this. But you'll require the wait command, which waits for all background jobs before the script continues. The PID of the last background job is stored in the $! variable, so you may also wait ${!}. Note also the nice command.
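For instance, a hedged variant of a single compile job that combines nice with waiting on the specific PID (rather than on all jobs) could look like:
nice -n 10 gcc -c "${sourcefiles[i]}" &   # run this compile at lower priority
pid=$!
# ... launch further jobs here if desired ...
wait "$pid"                               # wait for that particular job only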
Sometimes such code is useful in makefiles:
buildall:
	for cpp_file in *.cpp; do gcc -c $$cpp_file & done; wait
This gives much finer control than make -j.
Note that & is a line terminator like ; (write command& not command&;).
Hope this helps.
Job control is useful only when you are running an interactive shell, i.e., you know that stdin and stdout are connected to a terminal device (/dev/pts/* on Linux). Then, it makes sense to have something in the foreground, something else in the background, etc.
Scripts, on the other hand, don't have such a guarantee. Scripts can be made executable and run without any terminal attached. It doesn't make sense to have foreground or background processes in that case.
You can, however, run other commands non-interactively in the background (by appending "&" to the command line) and capture their PIDs with $!. Then you can use kill to kill or suspend them (simulating Ctrl-C or Ctrl-Z on the terminal, as if the shell were interactive). You can also use wait (instead of fg) to wait for the background process to finish.
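A small sketch of that approach (long_task is a placeholder command):
long_task &          # start a background process non-interactively
pid=$!               # capture its PID

kill -STOP "$pid"    # suspend it (roughly what Ctrl-Z does interactively)
kill -CONT "$pid"    # resume it
# kill -TERM "$pid"  # or terminate it (roughly what Ctrl-C does)

wait "$pid"          # wait for it to finish, instead of fg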
It could be useful to turn on job control in a script to set traps on
SIGCHLD. The JOB CONTROL section in the manual says:
The shell learns immediately whenever a job changes state. Normally,
bash waits until it is about to print a prompt before reporting
changes in a job's status so as to not interrupt any other output. If
the -b option to the set builtin command is enabled, bash reports
such changes immediately. Any trap on SIGCHLD is executed for each
child that exits.
(emphasis is mine)
Take the following script, as an example:
dualbus@debian:~$ cat children.bash
#!/bin/bash
set -m
count=0 limit=3
trap 'counter && { job & }' CHLD
job() {
  local amount=$((RANDOM % 8))
  echo "sleeping $amount seconds"
  sleep "$amount"
}
counter() {
  ((count++ < limit))
}
counter && { job & }
wait
dualbus@debian:~$ chmod +x children.bash
dualbus@debian:~$ ./children.bash
sleeping 6 seconds
sleeping 0 seconds
sleeping 7 seconds
Note: CHLD trapping seems to be broken as of bash 4.3
In bash 4.3, you could use 'wait -n' to achieve the same thing,
though:
dualbus@debian:~$ cat waitn.bash
#!/home/dualbus/local/bin/bash
count=0 limit=3
trap 'kill "$pid"; exit' INT
job() {
  local amount=$((RANDOM % 8))
  echo "sleeping $amount seconds"
  sleep "$amount"
}
for ((i=0; i<limit; i++)); do
  ((i>0)) && wait -n; job & pid=$!
done
dualbus@debian:~$ chmod +x waitn.bash
dualbus@debian:~$ ./waitn.bash
sleeping 3 seconds
sleeping 0 seconds
sleeping 5 seconds
You could argue that there are other ways to do this in a more
portable way, that is, without CHLD or wait -n:
dualbus@debian:~$ cat portable.sh
#!/bin/sh
count=0 limit=3
trap 'counter && { brand; job & }; wait' USR1
unset RANDOM; rseed=123459876$$
brand() {
  [ "$rseed" -eq 0 ] && rseed=123459876
  h=$((rseed / 127773))
  l=$((rseed % 127773))
  rseed=$((16807 * l - 2836 * h))
  RANDOM=$((rseed & 32767))
}
job() {
  amount=$((RANDOM % 8))
  echo "sleeping $amount seconds"
  sleep "$amount"
  kill -USR1 "$$"
}
counter() {
  [ "$count" -lt "$limit" ]; ret=$?
  count=$((count+1))
  return "$ret"
}
counter && { brand; job & }
wait
dualbus@debian:~$ chmod +x portable.sh
dualbus@debian:~$ ./portable.sh
sleeping 2 seconds
sleeping 5 seconds
sleeping 6 seconds
So, in conclusion, set -m is not that useful in scripts, since
the only interesting feature it brings to scripts is being able to
work with SIGCHLD. And there are other ways to achieve the same thing
either shorter (wait -n) or more portable (sending signals yourself).
Bash does support job control, as you say. In shell script writing, there is often an assumption that you can't rely on the fact that you have bash, but that you have the vanilla Bourne shell (sh), which historically did not have job control.
I'm hard-pressed these days to imagine a system in which you are honestly restricted to the real Bourne shell. Most systems' /bin/sh will be linked to bash. Still, it's possible. One thing you can do is instead of specifying
#!/bin/sh
You can do:
#!/bin/bash
That, and your documentation, would make it clear your script needs bash.
Possibly o/t, but I quite often use nohup when ssh'ing into a server to run a long-running job, so that if I get logged out the job still completes.
I wonder if people are confusing stopping and starting jobs from a master interactive shell with spawning background processes? The wait command allows you to spawn a lot of things and then wait for them all to complete, and, as I said, I use nohup all the time. It's more complex than this and very underused - sh supports this mode too. Have a look at the manual.
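A typical example of that nohup pattern (the command and log file names are placeholders):
nohup long_running_job > job.log 2>&1 &   # keeps running even if the ssh session ends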
You've also got
kill -STOP pid
I quite often do that if I want to suspend the currently running sudo, as in:
kill -STOP $$
But woe betide you if you've jumped out to the shell from an editor - it will all just sit there.
I tend to use mnemonic -KILL etc. because there's a danger of typing
kill - 9 pid # note the space
and in the old days you could sometimes bring the machine down because it would kill init!
jobs DO work in bash scripts
BUT, you ... NEED to watch for the spawned stuff,
like:
ls -1 /usr/share/doc/ | while read -r doc ; do ... done
jobs will have different context on each side of the |
One way around this is to use for instead of while:
for doc in `ls -1 /usr/share/doc` ; do ... done
The following should demonstrate how to use jobs in a script, with the caveat that the commented note in it is real (I don't know why that behaviour occurs):
#!/bin/bash
for i in `seq 7` ; do ( sleep 100 ) & done
jobs
while [ `jobs | wc -l` -ne 0 ] ; do
  for jobnr in `jobs | awk '{print $1}' | cut -d\[ -f2- | cut -d\] -f1` ; do
    kill %$jobnr
  done
  # this is REALLY ODD ... but while won't exit without this ... dunno why
  jobs >/dev/null 2>/dev/null
done
sleep 1
jobs