After trying to figure out why a Capistrano task (which tried to start a daemon in the background) was hanging, I discovered that using && in bash over ssh prevents a subsequent program from running in the background. I tried it on bash 4.1.5 and 4.2.20.
The following will hang (i.e. wait for sleep to finish) in bash:
ssh localhost "cd /tmp && nohup sleep 10 >/dev/null 2>&1 &"
The following won't:
ssh localhost "cd /tmp ; nohup sleep 10 >/dev/null 2>&1 &"
Neither will this:
cd /tmp && nohup sleep 10 >/dev/null 2>&1 &
Both zsh and dash will execute it in the background in all cases, regardless of && and ssh. Is this normal/expected behavior for bash, or a bug?
One easy solution is to use:
ssh localhost "(cd /tmp && nohup sleep 10) >/dev/null 2>&1 &"
(this also works if you use braces, see second example below).
I did not experiment further but I am reasonably convinced it has to do with open file descriptors hanging around. Perhaps zsh and dash bind the && so that this means what has to be spelled as:
{ cd /tmp && nohup sleep 10; } >/dev/null 2>&1
in bash. Nope: a quick experiment in dash shows that echo foo && echo bar >file only redirects the latter. Still, it has to have something to do with lingering open file descriptors causing ssh to wait for more output; I've run into this a lot in the past.
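The lingering-descriptor theory can be checked without ssh. The sketch below is only a local analogy: it assumes that command substitution behaves like the ssh client in that it keeps reading until every process holding the write end of the pipe has closed it. The variable names (held, released) are made up for the demonstration.

```shell
#!/bin/sh
# Local analogy for the ssh hang: $(...) waits for EOF on a pipe, and EOF only
# arrives once every process that inherited the pipe has closed it.

start=$(date +%s)
out=$(sh -c 'sleep 3 & echo started')                    # sleep inherits the pipe
held=$(( $(date +%s) - start ))

start=$(date +%s)
out2=$(sh -c 'sleep 3 >/dev/null 2>&1 & echo started')   # sleep lets go of the pipe
released=$(( $(date +%s) - start ))

echo "stdout inherited:  waited ${held}s"
echo "stdout redirected: waited ${released}s"
```

The first call should take about as long as the sleep, the second should return almost at once, which is the same difference the redirections make over ssh.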
One more trick, not needed if you use the parentheses or braces for this particular case but might be useful in a more general context, where the set of commands to do with && are more complex. Since bash seems to be hanging on to the file descriptor inappropriately with && but not with ;, you can turn a && b && c into a || exit 1; b || exit 1; c. This works with the test case:
ssh localhost "true || exit 1; echo going on; nohup sleep 10 >/dev/null 2>&1 &"
Replace true with false and the echo of "going on" is omitted.
(You can also set -e, although sometimes that is a bigger hammer than desired.)
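To see the two fail-fast styles side by side, here is a small local sketch (no ssh involved). Each variant runs in a child sh so that its exit cannot end the calling script; the variable names are illustrative only.

```shell
#!/bin/sh
# Each variant runs in its own child shell; we record its output and exit status.

chained_rc=0
chained=$(sh -c 'false || exit 1; echo going on') || chained_rc=$?

set_e_rc=0
with_set_e=$(sh -c 'set -e; false; echo going on') || set_e_rc=$?

passing_rc=0
passing=$(sh -c 'true || exit 1; echo going on') || passing_rc=$?

echo "|| exit 1 after a failure: output='$chained' status=$chained_rc"
echo "set -e after a failure:    output='$with_set_e' status=$set_e_rc"
echo "|| exit 1 after success:   output='$passing' status=$passing_rc"
```

Both failing variants stop before the echo and report a non-zero status; only the passing one prints "going on".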
This seems to work:
ssh localhost "(exec 0>&- ; exec 1>&-; exec 2>&-; cd /tmp; sleep 20&)"
I'm creating a simple traffic simulator: a client that curls a web server every 10 seconds. The client (Debian) and the web server are configured using Ansible. The background loop is killed when the Ansible SSH connection is closed.
At first I launched the command:
$ while true; do curl python_webserver:8000; sleep 10; done </dev/null >/dev/null 2>&1 &
$ disown
It works fine from Bash, but if I put it into a script, it exits when the SSH connection ends.
I tried some other solutions, like using:
$ nohup [command]
$ nohup /bin/bash -c '[command]'
or using "daemonize", but nothing worked. Nothing I found online works; maybe I'm missing something important. (Writing the PID is important, but not essential.)
Here is the script; maybe there is a big newbie error.
#!/bin/bash
PORT=8000
while true; do
curl python_webserver:$PORT
sleep 10
done >/dev/null 2>&1 &
ANSIBLE_CLIENT_PID=$!
echo $ANSIBLE_CLIENT_PID >> /tmp/ANSIBLE_PID.txt
disown
The answer is
nohup sh -c 'while true; do curl python_webserver:'"$PORT"'; sleep 10; done >/dev/null 2>&1' &
Probably I made a mistake with quotation marks.
Thank you all for your help
(especially Kamil Cuk for the solution)
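For anyone puzzled by the quotation marks in that answer, the following sketch only builds the command string and prints it, so it can be run safely. It assumes the same PORT variable as the script above; cmd is a name introduced here for illustration.

```shell
#!/bin/sh
PORT=8000

# Single-quoted pieces stay literal; "$PORT" sits between them and is expanded
# by the outer shell before the string is handed to the inner one.
cmd='while true; do curl python_webserver:'"$PORT"'; sleep 10; done >/dev/null 2>&1'

printf '%s\n' "$cmd"
# The inner shell would receive exactly that text:  nohup sh -c "$cmd" &
```

Printing the string first is a cheap way to confirm that the port number, and nothing else, was substituted.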
I know I can run my bash script in the background by using bash script.sh & disown or alternatively, by using nohup. However, I want to run my script in the background by default, so when I run bash script.sh or after making it executable, by running ./script.sh it should run in the background by default. How can I achieve this?
Self-contained solution:
#!/bin/bash
# Re-spawn as a background process, if we haven't already.
if [[ "$1" != "-n" ]]; then
nohup "$0" -n &
exit $?
fi
# Rest of the script follows. This is just an example.
for i in {0..10}; do
sleep 2
echo $i
done
The if statement checks if the -n flag has been passed. If not, the script calls itself with nohup (to dissociate it from the calling terminal, so closing the terminal doesn't kill the script) and & (to put the process in the background and return to the prompt). The parent then exits to leave the background version to run. The background version is explicitly called with the -n flag, so it won't cause an infinite loop (which is hell to debug!).
The for loop is just an example. Use tail -f nohup.out to see the script's progress.
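Here is a hedged variant of the same idea that sticks to POSIX sh, forwards the caller's arguments, and logs next to the script instead of nohup.out. The file name selfbg.sh and the log location are assumptions made for this demo, which writes the script to a temporary directory and runs it once.

```shell
#!/bin/sh
# Demo harness: create a tiny self-backgrounding script in a temp dir and run it.
workdir=$(mktemp -d)

cat > "$workdir/selfbg.sh" <<'EOF'
#!/bin/sh
# Re-spawn in the background unless we already carry the -n marker.
if [ "$1" != "-n" ]; then
    nohup "$0" -n "$@" </dev/null >"$0.log" 2>&1 &
    exit 0
fi
shift                                   # drop the -n marker
echo "background copy got: $*"
EOF
chmod +x "$workdir/selfbg.sh"

"$workdir/selfbg.sh" hello world        # returns immediately
launcher_rc=$?

# Give the background copy a few seconds to write its log.
tries=0
while [ ! -s "$workdir/selfbg.sh.log" ] && [ "$tries" -lt 5 ]; do
    sleep 1
    tries=$((tries + 1))
done
cat "$workdir/selfbg.sh.log"
```

Redirecting stdin, stdout, and stderr in the re-spawn line keeps the background copy from holding on to the caller's terminal or pipe.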
Note that I pieced this answer together with this and this but neither were succinct or complete enough to be a duplicate.
Simply write a wrapper that calls your actual script with nohup actualScript.sh &.
Wrapper script wrapper.sh
#! /bin/bash
nohup ./actualScript.sh &
Actual script in actualScript.sh
#! /bin/bash
for i in {0..10}
do
sleep 10 #script is running, test with ps -eaf|grep actualScript
echo $i
done
tail -n 10 -f nohup.out
0
1
2
3
4
...
Adding to Heath Raftery's answer, what worked for me is a variation of what he suggested such as this:
if [[ "$1" != "-n" ]]; then
"$0" -n & disown
exit $?
fi
I have a script that uses ssh to login to a remote machine, cd to a particular directory, and then start a daemon. The original script looks like this:
ssh server "cd /tmp/path ; nohup java server 0</dev/null 1>server_stdout 2>server_stderr &"
This script appears to work fine. However, it is not robust to the case when the user enters the wrong path so the cd fails. Because of the ;, this command will try to run the nohup command even if the cd fails.
The obvious fix doesn't work:
ssh server "cd /tmp/path && nohup java server 0</dev/null 1>server_stdout 2>server_stderr &"
that is, the SSH command does not return until the server is stopped. Putting nohup in front of the cd instead of in front of the java didn't work.
Can anyone help me fix this? Can you explain why this solution doesn't work? Thanks!
Edit: cbuckley suggests using sh -c, from which I derived:
ssh server "nohup sh -c 'cd /tmp/path && java server 0</dev/null 1>master_stdout 2>master_stderr' 2>/dev/null 1>/dev/null &"
However, now the exit code is always 0 when the cd fails; whereas if I do ssh server cd /failed/path then I get a real exit code. Suggestions?
See Bash's Operator Precedence.
The & is being attached to the whole statement because it has lower precedence than &&: && and || bind more tightly than ; and &. You don't need ssh to verify this. Just run this in your shell:
$ sleep 100 && echo yay &
[1] 19934
If the & were only attached to the echo yay, then your shell would sleep for 100 seconds and then report the background job. However, the entire sleep 100 && echo yay is backgrounded and you're given the job notification immediately. Running jobs will show it hanging out:
$ sleep 100 && echo yay &
[1] 20124
$ jobs
[1]+ Running sleep 100 && echo yay &
You can use parentheses to create a subshell around echo yay &, giving you what you'd expect:
sleep 100 && ( echo yay & )
This would be similar to using bash -c to run echo yay &:
sleep 100 && bash -c "echo yay &"
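The difference is easy to time in a script, where there is no job notification to watch for. This is a rough sketch with a 2-second sleep instead of 100; whole_list and only_tail are names made up here.

```shell
#!/bin/sh
start=$(date +%s)
sleep 2 && echo yay >/dev/null &        # the whole AND-list goes to the background
whole_list=$(( $(date +%s) - start ))
wait                                    # let that background list finish

start=$(date +%s)
sleep 2 && ( echo yay >/dev/null & )    # only echo is backgrounded; sleep runs first
only_tail=$(( $(date +%s) - start ))

echo "a && b &      returned after ${whole_list}s"
echo "a && ( b & )  returned after ${only_tail}s"
```

The first form returns immediately, the second only after the sleep has finished in the foreground.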
Tossing these into an ssh, and we get:
# using parentheses...
$ ssh localhost "cd / && (nohup sleep 100 >/dev/null </dev/null &)"
$ ps -ef | grep sleep
me 20136 1 0 16:48 ? 00:00:00 sleep 100
# and using `bash -c`
$ ssh localhost "cd / && bash -c 'nohup sleep 100 >/dev/null </dev/null &'"
$ ps -ef | grep sleep
me 20145 1 0 16:48 ? 00:00:00 sleep 100
Applying this to your command, and we get
ssh server "cd /tmp/path && (nohup java server 0</dev/null 1>server_stdout 2>server_stderr &)"
or:
ssh server "cd /tmp/path && bash -c 'nohup java server 0</dev/null 1>server_stdout 2>server_stderr &'"
Also, with regard to your comment on the post,
"Right, sh -c always returns 0. E.g., sh -c exit 1 has error code 0"
this is incorrect. Directly from the manpage:
Bash's exit status is the exit status of the last command executed in
the script. If no commands are executed, the exit status is 0.
Indeed:
$ bash -c "true ; exit 1"
$ echo $?
1
$ bash -c "false ; exit 22"
$ echo $?
22
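Two details may explain the confusion about exit codes, sketched below with plain sh and a deliberately missing directory (/no/such/dir is a placeholder). First, quoting matters: without quotes the command string is just "exit" and the 1 becomes $0. Second, a list that ends in & always reports status 0, so wrapping the whole cd && ... & in sh -c hides a failed cd.

```shell
#!/bin/sh
quoted_rc=0
sh -c 'exit 1' || quoted_rc=$?          # the command string is "exit 1"

unquoted_rc=0
sh -c exit 1 || unquoted_rc=$?          # the command string is "exit"; 1 becomes $0

# The whole list is asynchronous, so its status is 0 even though cd fails.
hidden_rc=0
sh -c 'cd /no/such/dir 2>/dev/null && sleep 1 &' || hidden_rc=$?

# Only the last command is backgrounded, so the failure of cd stays visible.
visible_rc=0
sh -c 'cd /no/such/dir 2>/dev/null && { sleep 1 & }' || visible_rc=$?

echo "quoted=$quoted_rc unquoted=$unquoted_rc hidden=$hidden_rc visible=$visible_rc"
```

The exact non-zero value for the failed cd differs between shells, so only "zero or not" should be relied on.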
ssh server "test -d /tmp/path" && ssh server "nohup ... &"
Answer roundup:
Bad: Using sh -c to wrap the entire nohup command doesn't work for my purposes because it doesn't return error codes. (@cbuckley)
Okay: ssh <server> <cmd1> && ssh <server> <cmd2> works but is much slower (@joachim-nilsson)
Good: Create a shell script on <server> that runs the commands in succession and returns the correct error code.
The last is what I ended up using. I'd still be interested in learning why the original use-case doesn't work, if someone who understands shell internals can explain it to me!
I want to start a process using nohup on a remote machine via ssh. The problem is how to get the PID of the process started with nohup, so the "process actually doing something", not some outer shell instance or the like. Also, I want to store stdout and stderr in files, but that is not the issue here...
Locally, it works flawlessly using
nohup sleep 30 > out 2> err < /dev/null & echo $!
It is echoing me the exact PID of the command "sleep 30", which I can also see using "top" or "ps aux|grep sleep".
But I'm having trouble doing it remotely via ssh. I tried something like
ssh remote_machine 'nohup bash -c "( ( sleep 30 ) & )" > out 2> err < /dev/null'
but I cannot figure out where to place the "echo $!" so that it is displayed in my local shell. It is always showing me wrong PIDs, for example the one of the "bash" instance etc.
Has somebody an idea how to solve this?
EDIT:
OK, the "bash -c" might not be needed here. Like Lotharyx pointed out, I get the right PID just fine using
ssh remote 'nohup sleep 30 > out 2> err < /dev/null & echo $!'
but then the problem is that if you substitute "sleep 30" with something that produces output, say, "echo Hello World!", that output does not end up in the file "out", neither on the local nor on remote side. Anybody got an idea why?
EDIT2: My fault! There was just no space left on the other device, that's why the files "out" and "err" stayed empty!
So this is working. In addition, if one wants to call multiple commands in a row, separated by a semicolon (;), one can still use "bash -c", like so:
ssh remote 'nohup bash -c "echo bla;sleep 30;echo blupp" > out 2> err < /dev/null & echo $!'
Then it prints out the PID of the "bash -c" on the local side, which is just fine. (It is impossible to get the PID of the "innermost" or "busy" process, because every program itself can spawn new subprocesses, there is no way to find out...)
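The same pattern can be tried locally, with a child sh standing in for the remote shell that ssh would start. This is only a sketch: it checks that the printed PID belongs to a live process and then cleans up. The names pid and alive are illustrative.

```shell
#!/bin/sh
# A child sh plays the role of the remote shell; $! is the PID of the last
# background command, and echo sends it back over stdout.
pid=$(sh -c 'nohup sleep 30 >/dev/null 2>&1 </dev/null & echo $!')

# kill -0 sends no signal; it only tests whether the PID exists.
if kill -0 "$pid" 2>/dev/null; then alive=yes; else alive=no; fi
echo "PID $pid alive: $alive"

kill "$pid"                             # clean up the demo process
```

Because sleep's stdout, stderr, and stdin are all redirected, the command substitution returns as soon as the PID has been printed.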
I tried the following (the local machine is Debian; the remote machine is CentOS), and it worked exactly as I think you're expecting:
~# ssh someone@somewhere 'nohup sleep 30 > out 2> err < /dev/null & echo $!'
someone@somewhere's password:
14193
~#
On the remote machine, I did ps -e, and saw this line:
14193 ? 00:00:00 sleep
So, clearly, on my local machine, the output is the PID of "sleep" executing on the remote machine.
Why are you adding bash to your command when sending it across an SSH tunnel?
I'm wondering if there is a better way to make a daemon that waits for something using only sh than:
#! /bin/sh
trap processUserSig USR1
processUserSig() {
echo "doing stuff"
}
while true; do
sleep 1000
done
In particular, I'm wondering if there's any way to get rid of the loop and still have the thing listen for the signals.
Just backgrounding your script (./myscript &) will not daemonize it. See http://www.faqs.org/faqs/unix-faq/programmer/faq/, section 1.7, which describes what's necessary to become a daemon. You must disconnect it from the terminal so that SIGHUP does not kill it. You can take a shortcut to make a script appear to act like a daemon:
nohup ./myscript 0<&- &>/dev/null &
will do the job. Or, to capture both stderr and stdout to a file:
nohup ./myscript 0<&- &> my.admin.log.file &
Redirection explained (see bash redirection)
0<&- closes stdin
&> file sends stdout and stderr to a file
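Note that &> is a bash extension. Under a plain POSIX sh the same effect is spelled > file 2>&1, as in this small sketch (the temporary log file and the variable names are assumptions made for the demo):

```shell
#!/bin/sh
log=$(mktemp)

# POSIX spelling of bash's "&> file": send stdout to the file, then point
# stderr at wherever stdout now goes.
sh -c 'echo to-stdout; echo to-stderr >&2' >"$log" 2>&1
captured=$(grep -c '' "$log")           # number of lines that reached the file

# With stdin closed (0<&-), an attempt to read fails instead of blocking.
read_rc=0
sh -c 'read line' 0<&- 2>/dev/null || read_rc=$?

echo "lines captured: $captured, read status with closed stdin: $read_rc"
```

The order matters: 2>&1 > file would leave stderr pointing at the original stdout.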
However, there may be further important aspects that you need to consider. For example:
You will still have a file descriptor open to the script, which means that the filesystem it resides on cannot be unmounted. To be a true daemon you should chdir("/") (or cd / inside your script), and fork so that the parent exits and the original descriptor is closed.
Perhaps run umask 0. You may not want to depend on the umask of the caller of the daemon.
For an example of a script that takes all of these aspects into account, see Mike S' answer.
Some of the top-upvoted answers here are missing some important parts of what makes a daemon a daemon, as opposed to just a background process, or a background process detached from a shell.
This FAQ (http://www.faqs.org/faqs/unix-faq/programmer/faq/) describes what is necessary to be a daemon. The answer "Run bash script as daemon" implements the setsid step, though it misses the chdir to root.
The original poster's question was actually more specific than "How do I create a daemon process using bash?", but since the subject and answers discuss daemonizing shell scripts generally, I think it's important to point it out (for interlopers like me looking into the fine details of creating a daemon).
Here's my rendition of a shell script that would behave according to the FAQ. Set DEBUG to true to see pretty output (but it also exits immediately rather than looping endlessly):
#!/bin/bash
DEBUG=false
# This part is for fun, if you consider shell scripts fun- and I do.
trap process_USR1 SIGUSR1
process_USR1() {
echo 'Got signal USR1'
echo 'Did you notice that the signal was acted upon only after the sleep was done'
echo 'in the while loop? Interesting, yes? Yes.'
exit 0
}
# End of fun. Now on to the business end of things.
print_debug() {
whatiam="$1"; tty="$2"
[[ "$tty" != "not a tty" ]] && {
echo "" >$tty
echo "$whatiam, PID $$" >$tty
ps -o pid,sess,pgid -p $$ >$tty
tty >$tty
}
}
me_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
me_FILE=$(basename $0)
cd /
#### CHILD HERE --------------------------------------------------------------------->
if [ "$1" = "child" ] ; then # 2. We are the child. We need to fork again.
shift; tty="$1"; shift
$DEBUG && print_debug "*** CHILD, NEW SESSION, NEW PGID" "$tty"
umask 0
$me_DIR/$me_FILE XXrefork_daemonXX "$tty" "$@" </dev/null >/dev/null 2>/dev/null &
$DEBUG && [[ "$tty" != "not a tty" ]] && echo "CHILD OUT" >$tty
exit 0
fi
##### ENTRY POINT HERE -------------------------------------------------------------->
if [ "$1" != "XXrefork_daemonXX" ] ; then # 1. This is where the original call starts.
tty=$(tty)
$DEBUG && print_debug "*** PARENT" "$tty"
setsid $me_DIR/$me_FILE child "$tty" "$@" &
$DEBUG && [[ "$tty" != "not a tty" ]] && echo "PARENT OUT" >$tty
exit 0
fi
##### RUNS AFTER CHILD FORKS (actually, on Linux, clone()s. See strace) ------------->
# 3. We have been reforked. Go to work.
exec >/tmp/outfile
exec 2>/tmp/errfile
exec 0</dev/null
shift; tty="$1"; shift
$DEBUG && print_debug "*** DAEMON" "$tty"
# The real stuff goes here. To exit, see fun (above)
$DEBUG && [[ "$tty" != "not a tty" ]] && echo NOT A REAL DAEMON. NOT RUNNING WHILE LOOP. >$tty
$DEBUG || {
while true; do
echo "Change this loop, so this silly no-op goes away." >/dev/null
echo "Do something useful with your life, young padawan." >/dev/null
sleep 10
done
}
$DEBUG && [[ "$tty" != "not a tty" ]] && sleep 3 && echo "DAEMON OUT" >$tty
exit # This may never run. Why is it here then? It's pretty.
# Kind of like, "The End" at the end of a movie that you
# already know is over. It's always nice.
Output looks like this when DEBUG is set to true. Notice how the session and process group ID (SESS, PGID) numbers change:
<shell_prompt>$ bash blahd
*** PARENT, PID 5180
PID SESS PGID
5180 1708 5180
/dev/pts/6
PARENT OUT
<shell_prompt>$
*** CHILD, NEW SESSION, NEW PGID, PID 5188
PID SESS PGID
5188 5188 5188
not a tty
CHILD OUT
*** DAEMON, PID 5198
PID SESS PGID
5198 5188 5188
not a tty
NOT A REAL DAEMON. NOT RUNNING WHILE LOOP.
DAEMON OUT
# double background your script to have it detach from the tty
# cf. http://www.linux-mag.com/id/5981
(./program.sh &) &
Use your system's daemon facility, such as start-stop-daemon.
Otherwise, yes, there has to be a loop somewhere.
$ ( cd /; umask 0; setsid your_script.sh </dev/null &>/dev/null & ) &
It really depends on what is the binary itself going to do.
For example I want to create some listener.
Starting the daemon is a simple task:
lis_deamon:
#!/bin/bash
# We will start the listener as a daemon process
#
LISTENER_BIN=/tmp/deamon_test/listener
test -x $LISTENER_BIN || exit 5
PIDFILE=/tmp/deamon_test/listener.pid
case "$1" in
start)
echo -n "Starting Listener Daemon .... "
startproc -f -p $PIDFILE $LISTENER_BIN
echo "running"
;;
*)
echo "Usage: $0 start"
exit 1
;;
esac
This is how we start the daemon (the common way for all /etc/init.d/ stuff).
Now, as for the listener itself: it must contain some kind of loop or alert that triggers the script to do what you want. For example, if you want your script to sleep 10 minutes, then wake up and ask how you are doing, you would do it with:
while true ; do sleep 600 ; echo "How are you?" ; done
Here is a simple listener that listens for commands from a remote machine and executes them locally:
listener:
#!/bin/bash
# Starting listener on some port
# we will run it as a daemon and we will send commands to it.
#
IP=$(hostname --ip-address)
PORT=1024
FILE=/tmp/backpipe
count=0
while [ -e "$FILE" ] ; do # If the file exists, assume it is used by another program
FILE=$FILE.$count
count=$(($count + 1))
done
# Now we know that no such file exists.
# You can have the daemon itself remove these files,
# or do it in a different part of the program.
mknod $FILE p
while true ; do
netcat -l -s $IP -p $PORT < $FILE |/bin/bash > $FILE
done
rm $FILE
To start it up: /tmp/deamon_test/listener start
and to send commands from a shell (or wrap this in a script):
test_host#netcat 10.184.200.22 1024
uptime
20:01pm up 21 days 5:10, 44 users, load average: 0.62, 0.61, 0.60
date
Tue Jan 28 20:02:00 IST 2014
punt! (Ctrl+C)
Hope this will help.
Have a look at the daemon tool from the libslack package:
http://ingvar.blog.linpro.no/2009/05/18/todays-sysadmin-tip-using-libslack-daemon-to-daemonize-a-script/
On Mac OS X use a launchd script for shell daemon.
If I had a script.sh and wanted to execute it from bash and leave it running even after I close my bash session, I would combine nohup and & at the end.
Example: nohup ./script.sh < inputFile.txt > ./logFile 2>&1 &
inputFile.txt can be any file. If your script reads no input, the usual choice is /dev/null. So the command would be:
nohup ./script.sh < /dev/null > ./logFile 2>&1 &
After that, close your bash session, open another terminal and execute ps aux | grep "script.sh"; you will see that your script is still running in the background. Of course, if you want to stop it, run the same ps command to find the PID and then kill -9 <PID-OF-YOUR-SCRIPT>
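If you would rather not grep through ps later, you can save the PID at launch time. The sketch below is an illustration under assumptions: an endless loop stands in for ./script.sh, the files live in a temporary directory, and a plain kill (SIGTERM) is tried instead of going straight to kill -9.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Stand-in for ./script.sh: a loop that would otherwise run forever.
nohup sh -c 'while :; do sleep 1; done' </dev/null >"$workdir/logFile" 2>&1 &
echo $! >"$workdir/script.pid"          # remember the PID at launch time

pid=$(cat "$workdir/script.pid")
if kill -0 "$pid" 2>/dev/null; then before=running; else before=missing; fi

kill "$pid"                             # polite SIGTERM first
wait "$pid" 2>/dev/null || true         # reap it; wait reports the signal as failure
if kill -0 "$pid" 2>/dev/null; then after=running; else after=stopped; fi

echo "before: $before, after: $after"
```

nohup only shields the process from SIGHUP, so an ordinary kill is still enough to stop it.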
See Bash Service Manager project: https://github.com/reduardo7/bash-service-manager
Implementation example
#!/usr/bin/env bash
export PID_FILE_PATH="/tmp/my-service.pid"
export LOG_FILE_PATH="/tmp/my-service.log"
export LOG_ERROR_FILE_PATH="/tmp/my-service.error.log"
. ./services.sh
run-script() {
local action="$1" # Action
while true; do
echo "### Running action '${action}'"
echo foo
echo bar >&2
[ "$action" = "run" ] && return 0
sleep 5
[ "$action" = "debug" ] && exit 25
done
}
before-start() {
local action="$1" # Action
echo "* Starting with $action"
}
after-finish() {
local action="$1" # Action
local serviceExitCode=$2 # Service exit code
echo "* Finish with $action. Exit code: $serviceExitCode"
}
action="$1"
serviceName="Example Service"
serviceMenu "$action" "$serviceName" run-script "$workDir" before-start after-finish
Usage example
$ ./example-service
# Actions: [start|stop|restart|status|run|debug|tail(-[log|error])]
$ ./example-service start
# Starting Example Service service...
$ ./example-service status
# Serive Example Service is runnig with PID 5599
$ ./example-service stop
# Stopping Example Service...
$ ./example-service status
# Service Example Service is not running
Here is the minimal change to the original proposal to create a valid daemon in Bourne shell (or Bash):
#!/bin/sh
if [ "$1" != "__forked__" ]; then
setsid "$0" __forked__ "$@" &
exit
else
shift
fi
trap 'siguser1=true' USR1
trap 'echo "Clean up and exit"; kill $sleep_pid; exit' TERM
exec > outfile
exec 2> errfile
exec 0< /dev/null
while true; do
(sleep 30000000 >/dev/null 2>&1) &
sleep_pid=$!
wait
kill $sleep_pid >/dev/null 2>&1
if [ -n "$siguser1" ]; then
siguser1=''
echo "Wait was interrupted by SIGUSR1, do things here."
fi
done
Explanation:
Line 2-7: A daemon must be forked so it doesn't have a parent. Using an artificial argument to prevent endless forking. "setsid" detaches from starting process and terminal.
Line 9: Our desired signal needs to be differentiated from other signals.
Line 10: Cleanup is required to get rid of dangling "sleep" processes.
Line 11-13: Redirect stdout, stderr and stdin of the script.
Line 16: sleep in the background
Line 18: wait waits for end of sleep, but gets interrupted by (some) signals.
Line 19: Kill sleep process, because that is still running when signal is caught.
Line 22: Do the work if SIGUSR1 has been caught.
Guess it does not get any simpler than that.
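The key mechanism, wait being interrupted by a trapped signal, can be watched in isolation. This is a cut-down sketch rather than the daemon itself: the script signals itself after one second through a helper subshell, and the names got_usr1, wait_rc, and elapsed are made up for the demo.

```shell
#!/bin/sh
got_usr1=''
trap 'got_usr1=yes' USR1

sleep 30 &                              # stand-in for the long background sleep
sleep_pid=$!

( sleep 1; kill -USR1 $$ ) &            # helper: signal this shell after one second

start=$(date +%s)
wait_rc=0
wait "$sleep_pid" || wait_rc=$?         # returns early, with status > 128, on the signal
elapsed=$(( $(date +%s) - start ))

kill "$sleep_pid"                       # the sleep is still running, so clean it up
echo "trap ran: $got_usr1, wait status: $wait_rc, waited ${elapsed}s"
```

A foreground sleep would delay the trap until it finished; backgrounding the sleep and waiting on it is what makes the signal take effect promptly.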
Like many answers, this one is not "real" daemonization but rather an alternative to the nohup approach.
echo "script.sh" | at now
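There are obviously differences from using nohup. For one, there is no detaching from the parent in the first place, because the job is started by the at daemon rather than by your shell. Also, "script.sh" doesn't inherit the parent's terminal or live environment; at records the working directory, umask, and most environment variables at submission time instead.
This is by no means a better alternative. It is simply a different (and somewhat lazy) way of launching processes in the background.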
P.S. I personally upvoted carlo's answer as it seems to be the most elegant and works both from terminal and inside scripts
Try executing it with &. If you save this file as program.sh, you can use:
$ . program.sh &