How do I receive notification in a bash script when a specific child process terminates?

I wonder if anyone can help with this?
I have a bash script. It starts a sub-process which is another GUI-based application. The bash script then goes into an interactive mode, getting input from the user. This interactive mode continues indefinitely. I would like it to terminate when the GUI application in the sub-process exits.
I have looked at SIGCHLD but this doesn't seem to be the answer. Here's what I've tried but I don't get a signal when the prog ends.
set -o monitor
"${prog}" &
prog_pid=$!
function check_pid {
kill -0 $1 2> /dev/null
}
function cleanup {
### does cleanup stuff here
exit
}
function sigchld {
check_pid $prog_pid
[[ $? == 1 ]] && cleanup
}
trap sigchld SIGCHLD
Updated following answers. I now have this working using the suggestion from 'nosid'. I have another, related, issue now which is that the interactive process that follows is a basic menu driven process that blocks waiting for key input from the user. If the child process ends the USR1 signal is not handled until after input is received. Is there any way to force the signal to be handled immediately?
The wait loop looks like this:
stty raw # set the tty driver to raw mode
max=$1 # maximum valid choice
choice=$(expr $max + 1) # invalid choice
while [[ $choice -gt $max ]]; do
choice=`dd if=/dev/tty bs=1 count=1 2>/dev/null`
done
stty sane # restore tty
Updated with solution. I have solved this. The trick was to use nonblocking I/O for the read. Now, with the answer from 'nosid' and my modifications, I have exactly what I want. For completeness, here is what works for me:
#!/bin/bash -bm
{
"${1}"
kill -USR1 $$
} &
function cleanup {
# cleanup stuff
exit
}
trap cleanup SIGUSR1
while true ; do
stty raw # set the tty driver to raw mode
max=9 # maximum valid choice
while [[ $choice -gt $max || -z $choice ]]; do
choice=`dd iflag=nonblock if=/dev/tty bs=1 count=1 2>/dev/null`
done
stty sane # restore tty
# process choice
done

Here is a different approach. Instead of using SIGCHLD, you can execute an arbitrary command as soon as the GUI application terminates.
{
some_command args...
kill -USR1 $$
} &
function sigusr1() { ... }
trap sigusr1 SIGUSR1
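To make the pattern above concrete, here is a minimal self-contained sketch. It assumes `sleep 1` as a stand-in for the GUI application, and the names `child_done` and `polls` are only illustrative; none of them come from the answer itself.

```shell
#!/usr/bin/env bash
# Sketch only: 'sleep 1' stands in for the GUI application, and
# child_done / polls are illustrative names.
child_done=0
trap 'child_done=1' USR1      # the handler just records that the child ended

{
    sleep 1                   # the "application"
    kill -USR1 "$$"           # $$ is still the main script's PID in here
} &

polls=0
while [ "$child_done" -eq 0 ]; do
    polls=$((polls + 1))      # the interactive work would go here
    sleep 0.2                 # keep each step short: traps run between commands
done
echo "child finished; main loop ran $polls times"
```

The handler only sets a flag, so the main loop decides when it is safe to stop. A long blocking command inside the loop would delay the reaction, which is the effect the question's update describes.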

Ok. I think I understand what you need. Have a look at my .xinitrc:
xrdb ~/.Xdefaults
source ~/.xinitrc.hw.settings
xcompmgr &
xscreensaver &
# after starting some arbitrary crap we want to start the main gui.
startfluxbox & PIDOFAPP=$! ## THIS IS THE IMPORTANT PART
setxkbmap genja
wmclockmon -bl &
sleep 1
wmctrl -s 3 && aterms sone &
sleep 1
wmctrl -s 0
wait $PIDOFAPP ## THIS IS THE SECOND PART OF THE IMPORTANT PART
xeyes -geometry 400x400+500+400 &
sleep 2
echo im out!
What happens is that after you send a process to the background, you can use wait to block until that process dies. Whatever comes after wait will not be executed as long as the application is running. You can use this to exit after the GUI has been shut down.
PS: I run bash.
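wait also hands back the exit status of the process it waited for, which helps if the script should react differently to a crash. A minimal sketch, assuming a subshell that exits with status 3 as a stand-in for the GUI (`app_pid` and `app_status` are illustrative names):

```shell
#!/usr/bin/env bash
# Sketch only: the subshell below plays the role of the GUI application.
( sleep 1; exit 3 ) &
app_pid=$!
echo "doing other start-up work while PID $app_pid runs"

app_status=0
if wait "$app_pid"; then
    echo "application exited cleanly"
else
    app_status=$?             # exit status of the background process
    echo "application failed with status $app_status"
fi
```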

I think you need to do:
set -bm
or
set -o monitor -o notify
As per the bash manual:
-b
Cause the status of terminated background jobs to be reported immediately, rather than before printing the next primary prompt.

The shell's main job is executing child processes, and it needs to catch SIGCHLD for its own purposes. This seems to limit how it passes the signal on to the script itself.
Could you instead just check for child processes and send the alert based on that? You can find the children as below:
bash_pid=$$
while true
do
children=`ps -eo ppid | grep -w $bash_pid`
if [ -z "$children" ]; then
cleanup
alert
exit
fi
done

Related

how do I watch for a process to have died in shell script?

I'm running a shell test program that shows a progress bar, but when I run it I keep getting a unary error. Is kill -0 a way to kill a subprocess in shell?
Or is there another method to test if my process has died?
Here's my code to run a progress bar until my command ends:
#!/bin/sh
# test my progress bar
spin[0]="-"
spin[1]="\\"
spin[2]="|"
spin[3]="/"
sleep 10 2>/dev/null & # run as background process
pid=$! # grab process id
echo -n "[sleeping] ${spin[0]}"
while [ kill -0 $pid ] # wait for process to end
do
for i in "${spin[@]}"
do
echo -ne "\b$i"
sleep 0.1
done
done
1. Is kill -0 a way to kill a subprocess in shell?
No. kill -0 sends no signal at all: signal 0 is the POSIX "null signal", which only performs error checking. It is a way to test whether a process exists and whether you are allowed to signal it.
If the process is running (and you may signal it), kill returns 0; if not, it returns a non-zero status.
ps $pid >/dev/null 2>&1 could do the same job.
To actually kill a process, one generally sends SIGTERM/15 (the default for kill) or SIGKILL/9. The process can trap SIGTERM and make a clean exit, or it can ignore it; SIGKILL cannot be trapped or ignored, so the OS terminates the process 'quick and dirty'.
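In practice a script often asks politely first and only forces the issue if the process does not exit. Here is a minimal sketch of that escalation, assuming SIGTERM (the signal kill sends by default) followed by SIGKILL after a grace period. `stop_process` and its arguments are illustrative names, not an existing tool.

```shell
#!/usr/bin/env bash
# stop_process PID [GRACE_SECONDS]  (hypothetical helper)
stop_process() {
    local pid=$1 grace=${2:-5} waited=0
    kill -TERM "$pid" 2>/dev/null || return 0     # nothing to stop
    while kill -0 "$pid" 2>/dev/null; do          # signal 0: existence check only
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null || true # cannot be trapped or ignored
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

sleep 30 &
victim=$!
stop_process "$victim" 3
wait "$victim" || echo "sleep ended with status $? (128 + signal number)"
```

Polling with kill -0 keeps the helper free of traps, at the cost of reacting only once per second.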
2. test and '['
The square bracket '[' is a utility (/bin/[) that evaluates a test expression. kill -0 $pid is a command, not a test expression, so '[' rejects it.
The syntax of while is while list; do list; done, where the loop keeps running as long as list returns exit code 0, so you can use the command itself as the condition.
3. how do I watch for a process to have died in shell script?
Like you did, the code below will do the job:
#!/bin/bash
spin[0]="-"
spin[1]="\\"
spin[2]="|"
spin[3]="/"
sleep 10 2>/dev/null & # run as background process
pid=$! # grab process id
echo -n "[sleeping] ${spin[0]}"
#while ps -p $pid >/dev/null 2>&1 # using ps
while kill -0 $pid >/dev/null 2>&1 # using kill
do
for i in "${spin[@]}"
do
echo -ne "\b$i"
sleep 0.5
done
done
CAVEATS
I use /bin/bash as the interpreter, because some Bourne shells (sh) do not support arrays (i.e. spin[n]).
It's probably cleaner to run the spinner in the background and kill it when the process (running in the foreground) terminates. Or, you could open another file descriptor and write something into it after the background process terminates, and have the main process block on a read. eg:
#!/bin/bash
# test my progress bar
spin[0]='-'
spin[1]='\'
spin[2]='|'
spin[3]='/'
{ { { sleep 10 2>/dev/null; echo >&5; } & # run as background process
} 5>&1 1>&3 | { # wait for process to end
while ! read -t 1; do
printf "\r[sleeping] ${spin[ $(( i = ++i % 4 )) ]}"
done
}
} 3>&1
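The first suggestion in this answer, a background spinner that is killed once the foreground command returns, might look like the sketch below. It assumes `sleep 2` as a placeholder for the real command; `spinner`, `spinner_pid` and `work_status` are illustrative names.

```shell
#!/usr/bin/env bash
# Sketch: spinner in the background, real work in the foreground.
spinner() {
    while :; do
        for c in '-' '\' '|' '/'; do
            printf '\r[sleeping] %s' "$c"
            sleep 0.1
        done
    done
}

spinner &
spinner_pid=$!

work_status=0
sleep 2 || work_status=$?                 # placeholder for the foreground job

kill "$spinner_pid"
wait "$spinner_pid" 2>/dev/null || true   # reap it; usually hides the "Terminated" notice
printf '\r[done] exit status %s\n' "$work_status"
```

With this layout the exit status of the real command is available directly, with no polling of its PID.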

shell script - how to stop "watch" command in the shell script [duplicate]

I have a bash script that launches a child process that crashes (actually, hangs) from time to time and with no apparent reason (closed source, so there isn't much I can do about it). As a result, I would like to be able to launch this process for a given amount of time, and kill it if it did not return successfully after a given amount of time.
Is there a simple and robust way to achieve that using bash?
P.S.: tell me if this question is better suited to serverfault or superuser.
(As seen in BASH FAQ entry #68: "How do I run a command, and have it abort (timeout) after N seconds?")
If you don't mind installing something, use timeout. Most systems already have it installed; otherwise use sudo apt-get install coreutils (or sudo apt-get install timeout). Use it like this:
timeout 10 ping www.goooooogle.com
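To tell a timeout apart from an ordinary failure, check the exit status: GNU timeout exits with 124 when it had to stop the command. A small sketch, assuming GNU coreutils and using `sleep 5` as a stand-in for the hanging process:

```shell
#!/usr/bin/env bash
# Sketch: 'sleep 5' stands in for the command that sometimes hangs.
status=0
if timeout 1 sleep 5; then
    echo "command finished in time"
else
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "command timed out"
    else
        echo "command failed on its own with status $status"
    fi
fi
```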
If you don't want to download something, do what timeout does internally:
( cmdpid=$BASHPID; (sleep 10; kill $cmdpid) & exec ping www.goooooogle.com )
In case that you want to do a timeout for longer bash code, use the second option as such:
( cmdpid=$BASHPID;
(sleep 10; kill $cmdpid) \
& while ! ping -w 1 www.goooooogle.com
do
echo crap;
done )
# Spawn a child process:
(dosmth) & pid=$!
# in the background, sleep for 10 secs then kill that process
(sleep 10 && kill -9 $pid) &
or to get the exit codes as well:
# Spawn a child process:
(dosmth) & pid=$!
# in the background, sleep for 10 secs then kill that process
(sleep 10 && kill -9 $pid) & waiter=$!
# wait on our worker process and record its exit code
# (wait must run in this shell; inside $(...) it cannot see the child)
wait $pid
exitcode=$?
# kill the waiter subshell, if it still runs
kill -9 $waiter 2>/dev/null
# 0 if we killed the waiter, cause that means the process finished before the waiter
finished_gracefully=$?
sleep 999&
t=$!
sleep 10
kill $t
I also had this question and found two more things very useful:
The SECONDS variable in bash.
The command "pgrep".
So I use something like this on the command line (OSX 10.9):
ping www.goooooogle.com & PING_PID=$(pgrep 'ping'); SECONDS=0; while pgrep -q 'ping'; do sleep 0.2; if [ $SECONDS = 10 ]; then kill $PING_PID; fi; done
As this is a loop I included a "sleep 0.2" to keep the CPU cool. ;-)
(BTW: ping is a bad example anyway, you just would use the built-in "-t" (timeout) option.)
Assuming you have (or can easily make) a pid file for tracking the child's pid, you could then create a script that checks the modtime of the pid file and kills/respawns the process as needed. Then just put the script in crontab to run at approximately the period you need.
Let me know if you need more details. If that doesn't sound like it'd suit your needs, what about upstart?
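A rough sketch of such a watchdog follows. It rests on assumptions that are not in the answer above: the child refreshes the pid file's modification time while it is healthy (a heartbeat), and find supports the -mmin extension (GNU and BSD find do). The function names are hypothetical.

```shell
#!/usr/bin/env bash
# pidfile_is_stale FILE MINUTES -> succeeds if FILE was last modified
# more than MINUTES minutes ago (relies on find's -mmin extension).
pidfile_is_stale() {
    [ -n "$(find "$1" -mmin +"$2" 2>/dev/null)" ]
}

# check_child FILE MINUTES -> kills the recorded PID if the heartbeat is stale.
check_child() {
    local pidfile=$1 max_age=$2 pid
    [ -f "$pidfile" ] || return 0
    if pidfile_is_stale "$pidfile" "$max_age"; then
        pid=$(cat "$pidfile")
        kill "$pid" 2>/dev/null || true
        rm -f "$pidfile"
        echo "stale: killed $pid"   # a real watchdog would respawn the child here
    fi
}
```

A crontab entry would then call a script that runs something like check_child /path/to/child.pid 5 every few minutes; the path is a placeholder.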
One way is to run the program in a subshell, and communicate with the subshell through a named pipe with the read command. This way you can check the exit status of the process being run and communicate this back through the pipe.
Here's an example of timing out the yes command after 3 seconds. It gets the PID of the process using pgrep (possibly only works on Linux). There is also a problem with using a pipe: a process opening a pipe for reading will hang until it is also opened for writing, and vice versa. So to prevent the read command from hanging, I've "wedged" the pipe open with a background subshell that holds it open for writing. (Another way to prevent a freeze is to open the pipe read-write, i.e. read -t 5 <>finished.pipe; however, that also may not work except on Linux.)
rm -f finished.pipe
mkfifo finished.pipe
{ yes >/dev/null; echo finished >finished.pipe ; } &
SUBSHELL=$!
# Get command PID
while : ; do
PID=$( pgrep -P $SUBSHELL yes )
test "$PID" = "" || break
sleep 1
done
# Open pipe for writing
{ exec 4>finished.pipe ; while : ; do sleep 1000; done } &
read -t 3 FINISHED <finished.pipe
if [ "$FINISHED" = finished ] ; then
echo 'Subprocess finished'
else
echo 'Subprocess timed out'
kill $PID
fi
rm finished.pipe
Here's an attempt which tries to avoid killing a process after it has already exited, which reduces the chance of killing another process with the same process ID (although it's probably impossible to avoid this kind of error completely).
run_with_timeout ()
{
t=$1
shift
echo "running \"$*\" with timeout $t"
(
# first, run process in background
(exec sh -c "$*") &
pid=$!
echo $pid
# the timeout shell
(sleep $t ; echo timeout) &
waiter=$!
echo $waiter
# finally, allow process to end naturally
wait $pid
echo $?
) \
| (read pid
read waiter
if test $waiter != timeout ; then
read status
else
status=timeout
fi
# if we timed out, kill the process
if test $status = timeout ; then
kill $pid
exit 99
else
# if the program exited normally, kill the waiting shell
kill $waiter
exit $status
fi
)
}
Use like run_with_timeout 3 sleep 10000, which runs sleep 10000 but ends it after 3 seconds.
This is like other answers which use a background timeout process to kill the child process after a delay. I think this is almost the same as Dan's extended answer (https://stackoverflow.com/a/5161274/1351983), except the timeout shell will not be killed if it has already ended.
After this program has ended, there will still be a few lingering "sleep" processes running, but they should be harmless.
This may be a better solution than my other answer because it does not use the non-portable shell feature read -t and does not use pgrep.
Here's the third answer I've submitted here. This one handles signal interrupts and cleans up background processes when SIGINT is received. It uses the $BASHPID and exec trick used in the top answer to get the PID of a process (in this case $$ in a sh invocation). It uses a FIFO to communicate with a subshell that is responsible for killing and cleanup. (This is like the pipe in my second answer, but having a named pipe means that the signal handler can write into it too.)
run_with_timeout ()
{
t=$1 ; shift
trap cleanup 2
F=$$.fifo ; rm -f $F ; mkfifo $F
# first, run main process in background
"$@" & pid=$!
# sleeper process to time out
( sh -c "echo \$\$ >$F ; exec sleep $t" ; echo timeout >$F ) &
read sleeper <$F
# control shell. read from fifo.
# final input is "finished". after that
# we clean up. we can get a timeout or a
# signal first.
( exec 0<$F
while : ; do
read input
case $input in
finished)
test $sleeper != 0 && kill $sleeper
rm -f $F
exit 0
;;
timeout)
test $pid != 0 && kill $pid
sleeper=0
;;
signal)
test $pid != 0 && kill $pid
;;
esac
done
) &
# wait for process to end
wait $pid
status=$?
echo finished >$F
return $status
}
cleanup ()
{
echo signal >$$.fifo
}
I've tried to avoid race conditions as far as I can. However, one source of error I couldn't remove is when the process ends near the same time as the timeout. For example, run_with_timeout 2 sleep 2 or run_with_timeout 0 sleep 0. For me, the latter gives an error:
timeout.sh: line 250: kill: (23248) - No such process
as it is trying to kill a process that has already exited by itself.
#Kill command after 10 seconds
timeout 10 command
#If you don't have timeout installed, this is almost the same:
sh -c '(sleep 10; kill "$$") & command'
#The same as above, with muted duplicate messages:
sh -c '(sleep 10; kill "$$" 2>/dev/null) & command'

Wait for process to finish, or user input

I have a backgrounded process that I would like to wait for (in case it fails or dies), unless I receive user input. Said another way, the user input should interrupt my waiting.
Here's a simplified snippet of my code
#!/bin/bash
...
mplayer -noconsolecontrols "$media_url" &
sleep 10 # enough time for it to fail
ps -p $!
if [ $? -ne 0 ]
then
fallback
else
read
kill $!
fi
The line that I particularly dislike is sleep 10, which is bad because it could be too much time, or not enough time.
Is there a way to wait $! || read or the equivalent?
Use kill -0 to validate that the process is still there and read with a timeout of 0 to test for user input. Something like this?
pid=$!
while kill -0 $pid; do
read -t 0 && exit
sleep 1
done
Original
ps -p to check the process. read -t 1 to wait for user input.
pid=$!
got_input=142
while ps -p $pid > /dev/null; do
if read -t 1; then
got_input=$?
kill $pid
fi
done
This allows for branching based whether the process died, or was killed due to user input.
All credit to gubblebozer. The only reason I'm posting this answer is the claim by moderators that my edits to his post constituted altering his intent.
Anti Race-Condition
First off, a race condition involving pids is (very likely) not a concern if you're fairly quick, because they're reused on a cycle.
Even so, I guess anything is possible... Here's some code that handles that possibility, without breaking your head on traps.
got_input=142
while true; do
if read -t 1; then
got_input=$?
pkill --ns $$ name > /dev/null
break
elif ! pgrep --ns $$ name > /dev/null; then
break
fi
done
Now, we've accomplished our goal, while (probably) completely eliminating the race condition.
Any loop with a sleep or similar timeout in it, will introduce a race condition. It's better to actively wait for the process to die, or, in this case, to trap the signal that's sent when a child dies.
#!/bin/bash
set -o monitor
trap stop_process SIGCHLD
stop_process()
{
echo sigchld received
exit
}
# the background process: (this simulates a process that exits after 10 seconds)
sleep 10 &
procpid=$!
echo pid of process: $procpid
echo -n hit enter:
read
# not reached when SIGCHLD is received
echo killing pid $procpid
kill $procpid
I'm not 100% sure this eliminates any race condition, but it's a lot closer than a sleep loop.
edit: the shorter, less verbose version
#!/bin/bash
set -o monitor
trap exit SIGCHLD
sleep 5 &
read -p "hit enter: "
kill $!
edit 2: setting the trap before starting the background process prevents another race condition in which the process would die before the trap was installed

Catch SIGINT in bash, handle AND ignore

Is it possible in bash to intercept a SIGINT, do something, and then ignore it (keep bash running)?
I know that I can ignore the SIGINT with
trap '' SIGINT
And I can also do something on the sigint with
trap handler SIGINT
But that will still stop the script after the handler executes. E.g.
#!/bin/bash
handler()
{
kill -s SIGINT $PID
}
program &
PID=$!
trap handler SIGINT
wait $PID
#do some other cleanup with results from program
When I press ctrl+c, the SIGINT to program will be sent, but bash will skip the wait BEFORE program was properly shut down and created its output in its signal handler.
Using @suspectus's answer I can change the wait $PID to:
while kill -0 $PID > /dev/null 2>&1
do
wait $PID
done
This actually works for me I am just not 100% sure if this is 'clean' or a 'dirty workaround'.
The trap handler does run and control returns to the script, but only after the command that was executing when the signal arrived has finished.
So the solution is a little clumsy, but I think it does what is required. trap handler INT will also work.
trap 'echo "Be patient"' INT
for ((n=20; n; n--))
do
sleep 1
done
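The timing difference between a foreground command and the wait builtin can be observed with a small experiment. This is only a sketch: it uses USR1 instead of SIGINT so that it needs no keyboard, relies on bash's SECONDS variable, and all variable names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: when does a trap handler actually run?
handled_at=-1
trap 'handled_at=$SECONDS' USR1

# Case 1: the signal arrives while a foreground command is running.
SECONDS=0
( sleep 1; kill -USR1 "$$" ) &
sleep 3                               # handler is deferred until sleep returns
fg_handled_at=$handled_at

# Case 2: the signal arrives while the shell sits in the wait builtin.
SECONDS=0
( sleep 1; kill -USR1 "$$" ) &
sleep 3 &
sleep_pid=$!
wait_status=0
wait "$sleep_pid" || wait_status=$?   # returns early with 128 + signal number
bg_handled_at=$handled_at
kill "$sleep_pid" 2>/dev/null || true # the sleep itself is still running

echo "foreground: handled after ${fg_handled_at}s"
echo "wait: handled after ${bg_handled_at}s (wait status $wait_status)"
```

In the first case the handler only runs once the three-second sleep is over; in the second, wait returns as soon as the signal arrives, which is why the original script skipped ahead while the program was still shutting down.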
The short answer:
SIGINT in bash can be caught, handled and then ignored, assuming that "ignored" here means that bash continues to run the script.
The actions of the handler can even be postponed to build a kind of "transaction", so that SIGINT is acted upon (or "ignored") AFTER a group of statements has done its work.
But since the above example touches many aspects of bash (foreground vs. background behavior, trap and wait) AND 8 years have passed since then, the solution discussed here may not immediately work on all systems without further fine-tuning.
The solution discussed here was successfully tested on a "Linux mint-mate 5.4.0-73-generic x86_64" system with "GNU bash, Version 4.4.20(1)-release":
The wait shell builtin command IS DESIGNED to be interruptible. But one can examine the exit status of wait, which is 128 + signal number = 130 (in the case of SIGINT).
So if you want to work around this and wait until the background process has really finished, you can do something like this:
wait ${programPID}
while [ $? -ge 128 ]; do
# 1st opportunity to place your **handler actions** is here
wait ${programPID}
done
Note, however, that we ran into a bug/feature while testing all of this: wait kept returning 130 even after the background process was no longer there. The documentation says that wait returns 127 for an unknown process ID, but this did not happen in our tests.
If you also run into this problem, remember to check that the background process still exists before running the wait command in the while loop.
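One way to combine the re-wait loop with that existence check is sketched below. `wait_until_done` is a hypothetical helper, and USR1 again stands in for SIGINT so the demo can run unattended.

```shell
#!/usr/bin/env bash
# wait_until_done PID -> blocks until PID has really exited and returns the
# last status reported by wait. Illustrative helper only.
wait_until_done() {
    local pid=$1 status
    while :; do
        status=0
        wait "$pid" 2>/dev/null || status=$?
        # A status above 128 is ambiguous: wait was interrupted by a trapped
        # signal, or the child itself was killed by one. Retry only while
        # the process still exists.
        if [ "$status" -gt 128 ] && kill -0 "$pid" 2>/dev/null; then
            continue
        fi
        return "$status"
    done
}

trap 'echo "got USR1, still waiting"' USR1   # USR1 stands in for SIGINT here
( sleep 2; exit 7 ) &
worker=$!
( sleep 1; kill -USR1 "$$" ) &               # interrupts the first wait
final=0
wait_until_done "$worker" || final=$?
echo "worker exited with status $final"
```

There is still a small window in which the child can exit between the interrupted wait and the kill -0 check; in that case the helper returns the interrupted status rather than the child's own.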
Assume that the following script is your program, which simply counts down from 5 to 0 and also tees its output to a file named program.out. The while loop here is treated as a "transaction" that must not be disturbed by SIGINT. One last comment: this code does NOT ignore SIGINT after doing the postponed actions, but instead restores the old SIGINT handler and raises a SIGINT:
#!/bin/bash
rm -f program.out
# Will be set to 1 by the SIGINT ignoring/postponing handler
declare -ig SIGINT_RECEIVED=0
# On <CTRL>+C or "kill -s SIGINT $$" set flag for [later|postponed] examination
function _set_SIGINT_RECEIVED {
SIGINT_RECEIVED=1
}
# Remember current SIGINT handler
old_SIGINT_handler=$(trap -p SIGINT)
# Prepare for later restoration via ${old_SIGINT_handler}
old_SIGINT_handler=${old_SIGINT_handler:-trap - SIGINT}
# Start your "transaction", which should NOT be disturbed by SIGINT
trap -- '_set_SIGINT_RECEIVED' SIGINT
count=5
echo $count | tee -a program.out
while (( count-- )); do
sleep 1
echo $count | tee -a program.out
done
# End of your "transaction"
# Look whether SIGINT was received
if [ ${SIGINT_RECEIVED} -eq 1 ]; then
# Your **handler actions** are here
echo "SIGINT was received during transaction..." | tee -a program.out
echo "... doing postponed work now..." | tee -a program.out
echo "... restoring old SIGINT handler and sending SIGINT" | tee -a program.out
echo "program finished after SIGINT postponed." | tee -a program.out
${old_SIGINT_handler}
kill -s SIGINT $$
fi
echo "program finished without having received SIGINT." | tee -a program.out
Note also that we ran into problems after sending program to the background. The problem was that program inherited a trap '' SIGINT, which means that SIGINT was ignored altogether and program was NOT able to set another handler via trap -- '_set_SIGINT_RECEIVED' SIGINT.
We solved this problem by putting program in a subshell and sending this subshell to the background, as you will see in the MAIN script example, which runs in the foreground. One last comment: in this script you can decide via the variable ignore_SIGINT_after_handling whether to finally ignore SIGINT and continue running the script OR to execute the default SIGINT behavior after your handler action has finished its work:
#!/bin/bash
# Will be set to 1 by the SIGINT ignoring/postponing handler
declare -ig SIGINT_RECEIVED=0
# On <CTRL>+C or "kill -s SIGINT $$" set flag for later examination
function _set_SIGINT_RECEIVED {
SIGINT_RECEIVED=1
}
# Set to 1 if you want to keep bash running after handling SIGINT in a particular way
# or to 0 (or any other value) to run original SIGINT action after postponing SIGINT
ignore_SIGINT_after_handling=1
# Remember current SIGINT handler
old_SIGINT_handler=$(trap -p SIGINT)
# Prepare for later restoration via ${old_SIGINT_handler}
old_SIGINT_handler=${old_SIGINT_handler:-trap - SIGINT}
# Start your "transaction", which should NOT be disturbed by SIGINT
trap -- '_set_SIGINT_RECEIVED' SIGINT
# Do your work, for example
(./program) &
programPID=$!
wait ${programPID}
while [ $? -ge 128 ]; do
# 1st opportunity to place a part of your **handler actions** is here
# i.e. send SIGINT to ${programPID} and make sure that it is only sent once
# even if MAIN receives more SIGINT's during this loop
wait ${programPID}
done
# End of your "transaction"
# Look whether SIGINT was received
if [ ${SIGINT_RECEIVED} -eq 1 ]; then
# Your postponed **handler actions** are here
echo -e "\nMAIN is doing postponed work now..."
if [ ${ignore_SIGINT_after_handling} -eq 1 ]; then
echo "... and continuing with normal program execution..."
else
echo "... and restoring old SIGINT handler and sending SIGINT via 'kill -s SIGINT \$\$'"
${old_SIGINT_handler}
kill -s SIGINT $$
fi
fi
# Restore "old" SIGINT behaviour
${old_SIGINT_handler}
# Prepare for next "transaction"
SIGINT_RECEIVED=0
echo ""
echo "This message has to be shown in the case of normal program execution"
echo "as well as after a caught and handled and then ignored SIGINT"
echo "End of MAIN script received"
Hope this helps a bit.
Shall everybody have a good time.
I had the same problem: my script was exiting after my SIGINT handler ran.
I solved this by recursion:
#! /bin/sh
# devloop.sh
# run command in infinite loop
# wait before restarting, to allow stopping the loop
# license: MIT, author: milahu
# https://stackoverflow.com/questions/15785522/catch-sigint-in-bash-handle-and-ignore
restart_delay=2
command="$1" # TODO use all args: $@
# example: drop cache, run vite
#command="rm -rf node_modules/.vite/ ; npx vite --clearScreen false"
if [ -z "$command" ]
then
command="( set -x; sleep 5 ); false # example command: sleep 5 seconds, set rc=1"
fi
loop_next() {
echo
echo "starting command. hit Ctrl+C to restart"
echo " $command"
(eval "$command") &
command_pid=$!
#echo "main pid: $$"; echo "cmd pid: $command_pid" # debug
restart_command() {
echo
echo "restarting command in $restart_delay seconds. hit Ctrl+C to stop"
sleep $restart_delay
loop_next # recursion
}
stop_command() {
echo
echo "got Ctrl+C -> stopping command"
kill $command_pid
trap exit SIGINT # handle second Ctrl+C
restart_command
}
trap stop_command SIGINT # handle first Ctrl+C
wait $command_pid # this is blocking
echo "command stopped. return code: $?"
restart_command
}
echo starting loop
loop_next

Best way to make a shell script daemon?

I'm wondering if there is a better way to make a daemon that waits for something using only sh than:
#! /bin/sh
trap processUserSig SIGUSR1
processUserSig() {
echo "doing stuff"
}
while true; do
sleep 1000
done
In particular, I'm wondering if there's any way to get rid of the loop and still have the thing listen for the signals.
Just backgrounding your script (./myscript &) will not daemonize it. See http://www.faqs.org/faqs/unix-faq/programmer/faq/, section 1.7, which describes what's necessary to become a daemon. You must disconnect it from the terminal so that SIGHUP does not kill it. You can take a shortcut to make a script appear to act like a daemon;
nohup ./myscript 0<&- &>/dev/null &
will do the job. Or, to capture both stderr and stdout to a file:
nohup ./myscript 0<&- &> my.admin.log.file &
Redirection explained (see bash redirection)
0<&- closes stdin
&> file sends stdout and stderr to a file
However, there may be further important aspects that you need to consider. For example:
You will still have a file descriptor open to the script, which means that the filesystem it is mounted on cannot be unmounted. To be a true daemon you should chdir("/") (or cd / inside your script), and fork so that the parent exits, and thus the original descriptor is closed.
Perhaps run umask 0. You may not want to depend on the umask of the caller of the daemon.
For an example of a script that takes all of these aspects into account, see Mike S' answer.
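The two redirections used in the nohup commands above can be tried in isolation. In this sketch the function `noisy` is a hypothetical stand-in for ./myscript and the log file is a temporary file; `&>` is a bash feature, equivalent to `> file 2>&1`.

```shell
#!/usr/bin/env bash
# Sketch: what '0<&-' and '&> file' do to a command's three standard streams.
log=$(mktemp)

# Hypothetical stand-in for ./myscript: writes one line to each stream.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

noisy 0<&- &> "$log"          # stdin closed, stdout and stderr both go to $log

line_count=$(wc -l < "$log")
echo "captured $line_count lines:"
cat "$log"
rm -f "$log"
```

Both lines end up in the log, so nothing the script prints can reach the terminal it was started from.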
Some of the top-upvoted answers here are missing some important parts of what makes a daemon a daemon, as opposed to just a background process, or a background process detached from a shell.
This http://www.faqs.org/faqs/unix-faq/programmer/faq/ describes what is necessary to be a daemon. And this Run bash script as daemon implements the setsid, though it misses the chdir to root.
The original poster's question was actually more specific than "How do I create a daemon process using bash?", but since the subject and answers discuss daemonizing shell scripts generally, I think it's important to point it out (for interlopers like me looking into the fine details of creating a daemon).
Here's my rendition of a shell script that would behave according to the FAQ. Set DEBUG to true to see pretty output (but it also exits immediately rather than looping endlessly):
#!/bin/bash
DEBUG=false
# This part is for fun, if you consider shell scripts fun- and I do.
trap process_USR1 SIGUSR1
process_USR1() {
echo 'Got signal USR1'
echo 'Did you notice that the signal was acted upon only after the sleep was done'
echo 'in the while loop? Interesting, yes? Yes.'
exit 0
}
# End of fun. Now on to the business end of things.
print_debug() {
whatiam="$1"; tty="$2"
[[ "$tty" != "not a tty" ]] && {
echo "" >$tty
echo "$whatiam, PID $$" >$tty
ps -o pid,sess,pgid -p $$ >$tty
tty >$tty
}
}
me_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
me_FILE=$(basename $0)
cd /
#### CHILD HERE --------------------------------------------------------------------->
if [ "$1" = "child" ] ; then # 2. We are the child. We need to fork again.
shift; tty="$1"; shift
$DEBUG && print_debug "*** CHILD, NEW SESSION, NEW PGID" "$tty"
umask 0
$me_DIR/$me_FILE XXrefork_daemonXX "$tty" "$@" </dev/null >/dev/null 2>/dev/null &
$DEBUG && [[ "$tty" != "not a tty" ]] && echo "CHILD OUT" >$tty
exit 0
fi
##### ENTRY POINT HERE -------------------------------------------------------------->
if [ "$1" != "XXrefork_daemonXX" ] ; then # 1. This is where the original call starts.
tty=$(tty)
$DEBUG && print_debug "*** PARENT" "$tty"
setsid $me_DIR/$me_FILE child "$tty" "$@" &
$DEBUG && [[ "$tty" != "not a tty" ]] && echo "PARENT OUT" >$tty
exit 0
fi
##### RUNS AFTER CHILD FORKS (actually, on Linux, clone()s. See strace -------------->
# 3. We have been reforked. Go to work.
exec >/tmp/outfile
exec 2>/tmp/errfile
exec 0</dev/null
shift; tty="$1"; shift
$DEBUG && print_debug "*** DAEMON" "$tty"
# The real stuff goes here. To exit, see fun (above)
$DEBUG && [[ "$tty" != "not a tty" ]] && echo NOT A REAL DAEMON. NOT RUNNING WHILE LOOP. >$tty
$DEBUG || {
while true; do
echo "Change this loop, so this silly no-op goes away." >/dev/null
echo "Do something useful with your life, young padawan." >/dev/null
sleep 10
done
}
$DEBUG && [[ "$tty" != "not a tty" ]] && sleep 3 && echo "DAEMON OUT" >$tty
exit # This may never run. Why is it here then? It's pretty.
# Kind of like, "The End" at the end of a movie that you
# already know is over. It's always nice.
Output looks like this when DEBUG is set to true. Notice how the session and process group ID (SESS, PGID) numbers change:
<shell_prompt>$ bash blahd
*** PARENT, PID 5180
PID SESS PGID
5180 1708 5180
/dev/pts/6
PARENT OUT
<shell_prompt>$
*** CHILD, NEW SESSION, NEW PGID, PID 5188
PID SESS PGID
5188 5188 5188
not a tty
CHILD OUT
*** DAEMON, PID 5198
PID SESS PGID
5198 5188 5188
not a tty
NOT A REAL DAEMON. NOT RUNNING WHILE LOOP.
DAEMON OUT
# double background your script to have it detach from the tty
# cf. http://www.linux-mag.com/id/5981
(./program.sh &) &
Use your system's daemon facility, such as start-stop-daemon.
Otherwise, yes, there has to be a loop somewhere.
$ ( cd /; umask 0; setsid your_script.sh </dev/null &>/dev/null & ) &
It really depends on what the binary itself is going to do.
For example, say I want to create a listener.
Starting the daemon is a simple task:
lis_deamon:
#!/bin/bash
# We will start the listener as Deamon process
#
LISTENER_BIN=/tmp/deamon_test/listener
test -x $LISTENER_BIN || exit 5
PIDFILE=/tmp/deamon_test/listener.pid
case "$1" in
start)
echo -n "Starting Listener Deamon .... "
startproc -f -p $PIDFILE $LISTENER_BIN
echo "running"
;;
*)
echo "Usage: $0 start"
exit 1
;;
esac
This is how we start the daemon (the common way for all /etc/init.d/ stuff).
Now, as for the listener itself:
it must contain some kind of loop or alert that triggers the script
to do what you want. For example, if you want your script to sleep 10 minutes,
then wake up and ask how you are doing, you can do this with:
while true ; do sleep 600 ; echo "How are u ? " ; done
Here is a simple listener that will listen for your
commands from a remote machine and execute them on the local one:
listener :
#!/bin/bash
# Starting listener on some port
# we will run it as deamon and we will send commands to it.
#
IP=$(hostname --ip-address)
PORT=1024
FILE=/tmp/backpipe
count=0
while [ -a $FILE ] ; do # If the file exists, assume it is used by another program
FILE=$FILE.$count
count=$(($count + 1))
done
# Now we know that no such file exists.
# You can put the removal of those files in the daemon itself
# or in a different part of the program
mknod $FILE p
while true ; do
netcat -l -s $IP -p $PORT < $FILE |/bin/bash > $FILE
done
rm $FILE
To start it: /tmp/deamon_test/listener start
and to send commands from a shell (or wrap this in a script):
test_host#netcat 10.184.200.22 1024
uptime
20:01pm up 21 days 5:10, 44 users, load average: 0.62, 0.61, 0.60
date
Tue Jan 28 20:02:00 IST 2014
punt! (Cntrl+C)
Hope this will help.
Have a look at the daemon tool from the libslack package:
http://ingvar.blog.linpro.no/2009/05/18/todays-sysadmin-tip-using-libslack-daemon-to-daemonize-a-script/
On Mac OS X use a launchd script for shell daemon.
If I had a script.sh and I wanted to execute it from bash and leave it running even after closing my bash session, I would combine nohup and & at the end.
Example: nohup ./script.sh < inputFile.txt > ./logFile 2>&1 &
inputFile.txt can be any file. If your script needs no input, /dev/null is the usual choice. So the command would be:
nohup ./script.sh < /dev/null > ./logFile 2>&1 &
After that, close your bash session, open another terminal and execute: ps -aux | egrep "script.sh" and you will see that your script is still running in the background. Of course, if you want to stop it, find its PID with the same ps command and run kill -9 <PID-OF-YOUR-SCRIPT>
See Bash Service Manager project: https://github.com/reduardo7/bash-service-manager
Implementation example
#!/usr/bin/env bash
export PID_FILE_PATH="/tmp/my-service.pid"
export LOG_FILE_PATH="/tmp/my-service.log"
export LOG_ERROR_FILE_PATH="/tmp/my-service.error.log"
. ./services.sh
run-script() {
local action="$1" # Action
while true; do
echo "### Running action '${action}'"
echo foo
echo bar >&2
[ "$action" = "run" ] && return 0
sleep 5
[ "$action" = "debug" ] && exit 25
done
}
before-start() {
local action="$1" # Action
echo "* Starting with $action"
}
after-finish() {
local action="$1" # Action
local serviceExitCode=$2 # Service exit code
echo "* Finish with $action. Exit code: $serviceExitCode"
}
action="$1"
serviceName="Example Service"
serviceMenu "$action" "$serviceName" run-script "$workDir" before-start after-finish
Usage example
$ ./example-service
# Actions: [start|stop|restart|status|run|debug|tail(-[log|error])]
$ ./example-service start
# Starting Example Service service...
$ ./example-service status
# Serive Example Service is runnig with PID 5599
$ ./example-service stop
# Stopping Example Service...
$ ./example-service status
# Service Example Service is not running
Here is the minimal change to the original proposal to create a valid daemon in Bourne shell (or Bash):
#!/bin/sh
if [ "$1" != "__forked__" ]; then
setsid "$0" __forked__ "$@" &
exit
else
shift
fi
trap 'siguser1=true' SIGUSR1
trap 'echo "Clean up and exit"; kill $sleep_pid; exit' SIGTERM
exec > outfile
exec 2> errfile
exec 0< /dev/null
while true; do
(sleep 30000000 &>/dev/null) &
sleep_pid=$!
wait
kill $sleep_pid &>/dev/null
if [ -n "$siguser1" ]; then
siguser1=''
echo "Wait was interrupted by SIGUSR1, do things here."
fi
done
Explanation:
Line 2-7: A daemon must be forked so it doesn't have a parent. Using an artificial argument to prevent endless forking. "setsid" detaches from starting process and terminal.
Line 9: Our desired signal needs to be differentiated from other signals.
Line 10: Cleanup is required to get rid of dangling "sleep" processes.
Line 11-13: Redirect stdout, stderr and stdin of the script.
Line 16: sleep in the background
Line 18: wait waits for end of sleep, but gets interrupted by (some) signals.
Line 19: Kill sleep process, because that is still running when signal is caught.
Line 22: Do the work if SIGUSR1 has been caught.
Guess it does not get any simpler than that.
Like many answers this one is not a "real" daemonization but rather an alternative to nohup approach.
echo "script.sh" | at now
There are obviously differences from using nohup. For one there is no detaching from the parent in the first place. Also "script.sh" doesn't inherit parent's environment.
This is by no means a better alternative. It is simply a different (and somewhat lazy) way of launching processes in the background.
P.S. I personally upvoted carlo's answer as it seems to be the most elegant and works both from terminal and inside scripts
try executing using &
if you save this file as program.sh
you can use
$. program.sh &
