To redirect a command's stderr to syslog I use a helper like
with_logger my-tag command arg1 ...
where with_logger is
#!/bin/bash
syslog_tag="$1"
shift
exec "$#" 2> >(exec /usr/bin/logger -t "$syslog_tag")
Here the two exec calls and the process substitution avoid leaving a bash process waiting for the command or the logger to finish. However, this creates a zombie: when the logger process exits (after the command exits and closes its stderr), nobody waits for it. As a result, the parent process receives an unexpected signal about an unknown child process.
To solve this I suppose I have to somehow disown the >() process. Is there a way to do it?
Update to clarify the question
I need to invoke my wrapper script from another program, not from a bash script.
Update 2 - this was a wrong question
See the answer below.
I would just define a short shell function
to_logger () {
exec /usr/bin/logger -t "$1"
}
and call your code with the minimally longer
2> >(to_logger my-tag) command arg1 ...
This has several benefits:
- The command can be any shell construct; you aren't passing the command as arguments to another command, you are just redirecting the standard error of an arbitrary command.
- You are spawning one fewer process to handle the logging.
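To see the pattern in action without touching syslog, here is a sketch that substitutes sed for logger (an assumption made purely so the result can be checked locally); note that the redirection wraps a whole compound command, not a single argument list:

```shell
#!/bin/bash
# Stand-in for logger(1): tag each stderr line, so the process-
# substitution pattern can be verified without writing to syslog.
to_tagger () {
    exec sed "s/^/[$1] /"
}

errfile=$(mktemp)

# The redirection applies to an arbitrary shell construct,
# not just a single command passed as arguments:
{
    echo "normal output"
    echo "an error" >&2
} 2> >(to_tagger my-tag > "$errfile")

sleep 1   # give the process substitution time to flush and exit
cat "$errfile"
```

Only the stderr line reaches the tagger; stdout passes through untouched.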
My question was wrong.
In my setup I use supervisord to control a few processes. Since it has limited syslog support and does not allow using different tags when redirecting processes' stderr to syslog, I use the above shell script for that. While testing the script I noticed CRIT reaped unknown pid <number> messages in supervisord's own log. I assumed this was bad and tried to fix it.
But it turned out the messages are not critical at all. In fact, supervisord was doing its job properly, and in its latest source the message's level was changed from CRIT to INFO. So there is nothing to answer here, as there is no issue with the script in question :)
Related
The way I normally start a long-running shell script is
% (nohup ./script.sh </dev/null >script.log 2>&1 & )
The redirections close stdin and reopen stdout and stderr; the nohup stops a HUP from reaching the process when the owning process exits (I realise the 2>&1 is somewhat redundant, since nohup does something like this anyway); and the backgrounding within the subshell is the double-fork, which means that the ./script.sh process's parent has exited while it is still running, so it is reparented to the init process.
That doesn't completely work, however, because when I exit the shell from which I've invoked this (typically, of course, I'm doing this on a remote machine), it doesn't exit cleanly. I can do ^C to exit, and this is OK – the process does carry on in the background as intended. However I can't work out what is/isn't happening to require the ^C, and that's annoying me.
The actions above seem to tick most of the boxes in the unix FAQ (question 1.7), except that I'm not doing anything to detach this process from a controlling terminal, or to make it a session leader. The setsid(2) call exists on FreeBSD, but not the setsid command; nor, as far as I can see, is there an obvious substitute for that command. The same is true on macOS, of course.
So, the questions are:
Is there a differently-named caller of setsid on this platform, that I'm missing?
What, precisely, is happening when I exit the calling shell, that I'm killing with the ^C? Is there any way this could bite me?
Related questions (eg 1, 2) either answer a slightly different question, or assume the presence of the setsid command.
(This question has annoyed me for years, but because what I do here does, in fact, work, I've never before got around to investigating, getting stumped, and asking about it.)
On FreeBSD, out of the box, you could use the stock daemon(8) utility to "run detached from the controlling terminal". The -r option could be useful:
-r Supervise and restart the program after a one-second delay if it
has been terminated.
You could also try a supervisor, for example immortal is available for both platforms:
pkg install immortal # FreeBSD
brew install immortal # macOS
To daemonize your script and log (stdout/stderr) you could use:
immortal /path/to/your/script.sh -l /tmp/script.log
Or for more options, you could create a my-service.yml for example:
cmd: /path/to/script
cwd: /your/path
env:
  DEBUG: 1
  ENVIRONMENT: production
log:
  file: /tmp/app.log
stderr:
  file: /tmp/app-error.log
And then run it with immortal -c my-service.yml
More examples can be found here: https://immortal.run/post/examples
If you just want to use nohup and save stdout and stderr into a file, you could add this to your script:
#!/bin/sh
exec 2>&1
...
Check more about exec 2>&1 in this answer: https://stackoverflow.com/a/13088401/1135424
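A minimal illustration of what exec 2>&1 does, assuming a POSIX shell (the subshell and the both.log scratch file just keep the effect local to the demo):

```shell
#!/bin/sh
# Inside the subshell, exec 2>&1 permanently points fd 2 at whatever
# fd 1 currently is, so one redirection captures both streams.
(
    exec 2>&1
    echo "on stdout"
    echo "on stderr" >&2
) > both.log

cat both.log
```

Both lines end up in both.log, even though only stdout was redirected at the subshell boundary.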
And then simply call nohup /your/script.sh & and check the file nohup.out. From the man page:
FILES
     nohup.out          The output file of the nohup execution if standard
                        output is a terminal and if the current directory
                        is writable.
     $HOME/nohup.out    The output file of the nohup execution if standard
                        output is a terminal and if the current directory
                        is not writable.
I found that some people run a program in shell like this
exe > the.log 2>&1 &!
I understand the first part: it redirects stderr to stdout, and the "&" runs the program in the background. But I don't know what "&!" means. What does the exclamation mark do?
Within zsh, &! at the end of a command is a shortcut for disown, i.e. the program won't get killed upon exiting the invoking shell.
See man zshbuiltins
disown [ job ... ]
job ... &|
job ... &!
Remove the specified jobs from the job table; the shell will no longer report their status, and will not complain if you try to exit an interactive shell with them running or stopped. If no job is specified, disown the current job. If the jobs are currently stopped and the AUTO_CONTINUE option is not set, a warning is printed containing information about how to make them running after they have been disowned. If one of the latter two forms is used, the jobs will automatically be made running, independent of the setting of the AUTO_CONTINUE option.
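In bash, which lacks the &! shorthand, the equivalent is an explicit disown after backgrounding; a sketch:

```shell
#!/bin/bash
# bash analogue of zsh's `command &!`: background the job, then
# remove it from the job table so the shell won't send it SIGHUP
# (and won't report its status) when the shell exits.
sleep 30 &
bgpid=$!
disown "$bgpid"

jobs -p              # prints nothing: the job is no longer tracked
kill "$bgpid"        # clean up the demo process
```

After the disown, the process keeps running but the shell no longer considers it one of its jobs.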
When I start some background process in the shell, for example:
geth --maxpeers 0 --rpc &
It returns something like:
[1] 1859
...without any output stream from this process. I do not understand what this is. And how can I get the stdout of geth? The documentation says that the stdout of a background process is displayed in the shell by default.
My shell is running in remote Ubuntu system.
The "&" directs the shell to run the command in the background. It uses the fork system call to create a sub-shell and run the job asynchronously.
The stdout and stderr should still be printed to the screen.
If you do not want to see any output on the screen, redirect both stdout and stderr to a file by:
geth --maxpeers 0 --rpc > logfile 2>&1 &
Regarding the first part of your question:
...without any out stream of this process. I do not understand what is
it?
This is part of the command execution environment (part of the shell itself) and is not the result of your script. (It is how the shell handles backgrounding your script and keeps track of the process, to allow pausing and resuming jobs.)
If you look at man bash under JOB CONTROL, it explains what you are seeing in detail, e.g.
The shell associates a job with each pipeline. It keeps a table
of currently executing jobs, which may be listed with the jobs
command. When bash starts a job asynchronously (in the background),
it prints a line that looks like:
[1] 25647
I do not understand what this is: [1] 1859
It is the output from Bash's job feature, which enables managing background processes (jobs), and it contains information about the job just started, printed to stderr:
1 is the job ID (which, prefixed with %, can be used with builtins such as kill and wait)
1859 is the PID (process ID) of the background process.
Read more in the JOB CONTROL section of man bash.
how can I get stdout of geth? There is information in documentation that stdout of background process is displayed in shell by default.
Indeed, background jobs by default print their output to the current shell's stdout and stderr streams, but note that they do so asynchronously - that is, output from background jobs will appear as it is produced (potentially buffered), interleaved with output sent directly to the current shell, which can be disruptive.
You can apply redirections as usual to a background command in order to capture its output in file(s), as demonstrated in user3589054's helpful answer, but note that doing so will not silence the job-control message ([1] 1859 in the example above).
If you want to silence the job-control message on creation, use:
{ geth --maxpeers 0 --rpc & } 2>/dev/null
To silence the entire life cycle of a job, see this answer of mine.
I have a VM that I want running indefinitely. The server is always running but I want the script to keep running after I log out. How would I go about doing so? Creating a cron job?
In general the following steps are sufficient to convince most Unix shells that the process you're launching should not depend on the continued existence of the shell:
run the command under nohup
run the command in the background
redirect all file descriptors that normally point to the terminal to other locations
So, if you want to run command-name, you should do it like so:
nohup command-name >/dev/null 2>/dev/null </dev/null &
This tells the process that will execute command-name to send all stdout and stderr to nowhere (instead of to your terminal) and also to read stdin from nowhere (instead of from your terminal). Of course if you actually have locations to write to/read from, you can certainly use those instead -- anything except the terminal is fine:
nohup command-name >outputFile 2>errorFile <inputFile &
See also the answer in Petur's comment, which discusses this issue a fair bit.
I have a bash script server.sh which is maintained by an external source and ideally should not be modified. This script writes to stdout and stderr.
In fact, this server.sh itself is doing an exec tclsh immediately:
#!/bin/sh
# \
exec tclsh "$0" ${1+"$#"}
so in fact, it is just a wrapper around a Tcl script. I just mention this in case you think that this matters.
I need a Tcl script setup.tcl which is supposed to do some preparatory work, then invoke server.sh (in the background), then do some cleanup work (and display the PID of the background process), and terminate.
server.sh is supposed to continue running until explicitly killed.
setup.tcl is usually invoked manually, either from a Cygwin bash shell or from a Windows cmd shell. In the latter case, it is ensured that Cygwin's bash.exe is in the PATH.
The environment is Windows 7 and Cygwin. The Tcl is either Cygwin's (8.5) or ActiveState 8.4.
The first version (omitting error handling) went like this:
# setup.tcl:
# .... preparatory work goes here
set childpid [exec bash.exe server.sh &]
# .... clean up work goes here
puts $childpid
exit 0
While this works when started as ActiveState Tcl from a Windows CMD shell, it does not work in a pure Cygwin setup. The reason is that as soon as setup.tcl ends, a signal is sent to the child process, and it is killed too.
Using nohup would not help here, because I want to see the output of server.sh as soon as it occurs.
My next idea was to create an intermediate bash script, mediator.sh, which uses disown -h to detach the child process and keep it from being killed:
#!/usr/bin/bash
# mediator.sh
server.sh &
child=$!
disown -h $child
and invoke mediator.sh from setup.tcl. But aside from the fact that I don't see an easy way to pass the child PID up to setup.tcl, the main problem is that it doesn't work either: while mediator.sh does keep the child alive when called directly from the Cygwin command line, we get the same behaviour again (server.sh being killed when setup.tcl exits) when I call it via setup.tcl.
Does anybody know a solution for this?
You'll want to set a trap handler in your server script so you can handle/ignore certain signals.
For example, to ignore HUP signals, you can do something like the following:
#!/bin/bash
handle_signal() {
echo "Ignoring HUP signal"
}
trap handle_signal SIGHUP
# Rest of code goes here
In the example above, if the script receives a HUP signal it prints a message and continues as normal. It will still die on Ctrl-C, as that sends the INT signal, which is left unhandled.
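A quick, self-contained way to check that the handler really absorbs the signal is to have the script HUP itself and carry on (a sketch; the counter is just for verification):

```shell
#!/bin/bash
hup_count=0
handle_signal() {
    hup_count=$((hup_count + 1))
    echo "Ignoring HUP signal"
}
trap handle_signal SIGHUP

kill -HUP $$       # deliver a HUP to ourselves; the trap runs instead of dying
echo "still alive after $hup_count HUP(s)"
```

Without the trap line, the same script would be terminated by its own kill -HUP.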