How to trap and exit all child processes - bash

I am trying to get hollywood to run in a way that I can exit it with a normal Ctrl+C signal.
Currently I have to press Ctrl+C a bunch of times just to get stuck in the tmux instance that hollywood created. Looking at the source code, there is a trap command:
trap "pkill -f -9 lib/hollywood/ >/dev/null 2>&1; exit 0" INT
But apparently that is not enough. I've tried replacing it with several different ones, but none of them was able to do it right:
trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT
trap 'kill $(jobs -p)' EXIT
trap 'pkill -f -9 lib/hollywood/ >/dev/null 2>&1; kill -9 $(ps -eo pid,command | grep tmux | grep byobu | grep hollywood | sed -r "s/^[^0-9]*([0-9]+).*/\1/") >/dev/null 2>&1; exit 0' INT
trap "exit" INT TERM
trap "kill 0" EXIT
I've tried several answers to this question: How do I kill background processes / jobs when my shell script exits?
But none of those worked. (I still had to press Ctrl+C a bunch of times and then manually exit the tmux session.)
Is there a simple way to fix this? (I would prefer not to have to mess with the source code too much.)

They aren't background processes, they are running in different tmux panes - tmux is the parent process, not the hollywood script. So most of the commands you list won't have any effect.
pkill should work if the pattern is right. Does the pkill work if you run it from outside tmux? It looks like each hollywood widget installs its own trap with the same pkill so it should kill them all if any one is killed.
Alternatively, if you are not running it in your own tmux server, you could simply make C-c kill the tmux window - change hollywood to do something like this when it creates the tmux session (around line 78):
$tmux_launcher bind -n C-c kill-window
If you have an existing tmux that you don't want to kill, this is harder because you want C-c for other purposes. You could change your C-c binding to something like:
bind -n C-c if -F '#{==:#{window_name},hollywood}' 'kill-window' 'send C-c'
But you might not want to do this unless you are using hollywood a lot.
It might be more useful to ask this in the hollywood issue tracker than here TBH.

Related

How to kill a process group with kill in bash?

I have a script which is much more complicated but I managed to produce a short script that exhibits the same problem.
I create a process and make it a session leader and then send SIGINT to it. The kill builtin doesn't fail but the process doesn't get killed either (i.e. the default behaviour for SIGINT is to kill). I tried with kill -INT -pid (which should be equivalent to what I do currently) and the /bin/kill command but the behaviour is the same.
The script is as follows:
#!/bin/bash
# Run in a new session so that I don't have to kill the shell
setsid bash -c "sleep 50" &
procs=$(ps --ppid $$ -o pid,pgid,command | grep 'sleep' | head -1)
if [[ -z "$procs" ]]; then
echo "Couldn't find process group"
exit 1
fi
PID=$(echo $procs | cut -d ' ' -f 1)
pgid=$(echo $procs | cut -d ' ' -f 2)
if ! kill -n SIGINT $pgid; then
echo "kill failed"
fi
echo "done"
ps -P $pgid
My expectation is that the last ps command shouldn't report anything (as kill didn't report failure and hence the process should have died) but it does.
I am looking for an explanation of the above noted behaviour and how I can kill a process group (i.e. both the bash and the sleep it starts -- the setsid line above) running in a separate session.
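I think you'll find that your sleep is ignoring SIGINT: when job control is off, as it is in a script, the shell starts background commands with SIGINT and SIGQUIT ignored. Take a look at the signal masks of your sleep process (the SigIgn line in /proc/<pid>/status) and see. On my Linux box I find: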
SigIgn: 0000000000000006
The second bit from the right is set (6 = 4 + 2 + 0), and from the above link:
--> 2 = SIGINT
Try sending a HUP, and you'll find it does kill the sleep.
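The mask can also be decoded mechanically. The following is a minimal sketch, assuming the Linux /proc/<pid>/status layout, in which bit n-1 of the hexadecimal SigIgn value stands for signal number n. The helper names sigign_of and sig_ignored are made up for illustration.

```shell
# Print the SigIgn mask of a process (Linux only: reads /proc/<pid>/status).
sigign_of() {
    awk '/^SigIgn:/ { print $2 }' "/proc/$1/status"
}

# Succeed if signal number $2 is set in the hex mask $1.
# Bit (n - 1) of the mask corresponds to signal number n.
sig_ignored() {
    [ $(( (0x$1 >> ($2 - 1)) & 1 )) -eq 1 ]
}

# Decode the mask quoted above for SIGHUP (1), SIGINT (2),
# SIGQUIT (3) and SIGTERM (15).
mask=0000000000000006
for sig in 1 2 3 15; do
    if sig_ignored "$mask" "$sig"; then
        echo "signal $sig: ignored"
    else
        echo "signal $sig: not ignored"
    fi
done
```

For a live process, something like sig_ignored "$(sigign_of "$pid")" 2 would do the same check. To signal the whole group rather than a single PID, kill accepts a negated process group ID, for example kill -HUP -- "-$pgid".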

Telnet Process Continues after Bash Script

I am running the following script
#! /bin/bash
HOSTLIST="192.168.0.5 192.168.22.1"
DELAY=3
stty echo
exec 4>&1
for HOST in $HOSTLIST ; do
telnet $HOST 135 | grep Connected & pid=$!
echo "Checking $HOST"
sleep $DELAY
kill -9 $pid &> /dev/null
done
However, when it finishes, the Telnet connections are still being attempted in the background, which spams annoying "telnet: unable to connect" errors randomly for the next few moments. I tried killing the process to stop this, but it still happens. Am I doing something wrong when killing the process?
Also I have to use telnet, can't use netcat or nmap.
The pid you are trying to kill is the pid of the grep, since $! is the pid of the most recently executed background command, and for a pipeline that is its last command. If you hadn't thrown away stderr when trying to kill, it might have provided some clue...
BTW, kill -9 is a serious code smell. Any well-behaved process can be killed by at least one of the -INT, -HUP, -TERM or -QUIT signals. You should never need to kill -KILL. It's bad because it doesn't give the process the opportunity to clean up its mess.
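One way to act on this is to avoid the pipe altogether, so that $! really is the telnet process. This is only a sketch: the helper name probe and the temporary file are illustrative choices, not part of the original script.

```shell
# Run a command in the background for a fixed time, then stop it and
# print whatever it wrote. Usage: probe SECONDS COMMAND [ARG...]
probe() {
    local delay=$1 out pid
    shift
    out=$(mktemp) || return 1
    "$@" >"$out" 2>&1 &          # no pipeline, so $! is the command itself
    pid=$!
    sleep "$delay"
    kill "$pid" 2>/dev/null      # default SIGTERM; no -9 needed
    wait "$pid" 2>/dev/null      # reap it and discard the shell's status message
    cat "$out"
    rm -f "$out"
}

# Hypothetical use with the variables from the question:
# for HOST in $HOSTLIST; do
#     echo "Checking $HOST"
#     probe "$DELAY" telnet "$HOST" 135 | grep Connected
# done
```

Because the command's output goes to a file first, grep only runs after telnet has been stopped, and there is nothing left in the background when the loop ends.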

Letting other users stop/restart simple bash daemons – use signals or what?

I have a web server where I run some slow-starting programs as daemons. These sometimes need quick restarting (or stopping) when I recompile them or switch to another installation of them.
Inspired by http://mywiki.wooledge.org/ProcessManagement, I'm writing a script
called daemonise.sh that looks like
#!/bin/sh
while :; do
./myprogram lotsadata.xml
echo "Restarting server..." 1>&2
done
to keep a "daemon" running. Since I sometimes need to stop it, or just
restart it, I run that script in a screen session, like:
$ ./daemonise.sh & DPID=$!
$ screen -d
Then perhaps I recompile myprogram, install it to a new path, start
the new one up and want to kill the old one:
$ screen -r
$ kill $DPID
$ screen -d
This works fine when I'm the only maintainer, but now I want to let
someone else stop/restart the program, no matter who started it. And
to make things more complicated, the daemonise.sh script in fact
starts about 16 programs, making it a hassle to kill every single one
if you don't know their PIDs.
What would be the "best practices" way of letting another user
stop/restart the daemons?
I thought about shared screen sessions, but that just sounds hacky and
insecure. The best solution I've come up with for now is to wrap
starting and killing in a script that catches certain signals:
#!/bin/bash
DPID=
trap './daemonise.sh & DPID=$!' USR1
trap 'kill $DPID' USR2 EXIT
# Ensure trapper wrapper doesn't exit:
while :; do
sleep 10000 & wait $!
done
Now, should another user need to stop the daemons and I can't do it,
she just has to know the pid of the wrapper, and e.g. sudo kill -s
USR2 $wrapperpid. (Also, this makes it possible to run the daemons
on reboots, and still kill them cleanly.)
Is there a better solution? Are there obvious problems with this
solution that I'm not seeing?
(After reading Greg's Bash Wiki, I'd like to avoid any solution involving pgrep or PID-files …)
I recommend a PID-based init script. Anyone with sudo privileges for the script will be able to start and stop the server processes.
On improving your approach: wouldn't it be advisable to make sure that your sleep command in sleep 10000 & wait $! gets properly terminated if your pidwrapper script exits somehow?
Otherwise there would remain a dangling sleep process in the process table for quite some time.
Similarly, wouldn't it be cleaner to terminate myprogram in daemonise.sh properly on restart (i. e. if daemonise.sh receives a TERM signal)?
In addition, it is possible to suppress job notification messages and test for pid existence before killing.
#!/bin/sh
# cat daemonise.sh
# cf. "How to suppress Terminated message after killing in bash?",
# http://stackoverflow.com/q/81520
trap '
echo "server shut down..." 1>&2
kill $spid1 $spid2 $spid3 &&
wait $spid1 $spid2 $spid3 2>/dev/null
exit
' TERM
while :; do
echo "Starting server..." 1>&2
#./myprogram lotsadata.xml
sleep 100 &
spid1=${!}
sleep 100 &
spid2=${!}
sleep 100 &
spid3=${!}
wait
echo "Restarting server..." 1>&2
done
#------------------------------------------------------------
#!/bin/bash
# cat pidwrapper
DPID=
trap '
kill -0 ${!} 2>/dev/null && kill ${!} && wait ${!} 2>/dev/null
./daemonise.sh & DPID=${!}
' USR1
trap '
kill -0 ${!} 2>/dev/null && kill ${!} && wait ${!} 2>/dev/null
kill -0 $DPID 2>/dev/null && kill $DPID && wait ${DPID} 2>/dev/null
' USR2
trap '
trap - EXIT
kill -0 $DPID 2>/dev/null && kill $DPID && wait ${DPID} 2>/dev/null
kill -0 ${!} 2>/dev/null && kill ${!} && wait ${!} 2>/dev/null
exit 0
' EXIT
# Ensure trapper wrapper does not exit:
while :; do
sleep 10000 & wait $!
done
#------------------------------------------------------------
# test
{
wrapperpid="`exec sh -c './pidwrapper & echo ${!}' | head -1`"
echo "wrapperpid: $wrapperpid"
for n in 1 2 3 4 5; do
sleep 2
# start daemonise.sh
kill -s USR1 $wrapperpid
sleep 2
# kill daemonise.sh
kill -s USR2 $wrapperpid
done
sleep 2
echo kill $wrapperpid
kill $wrapperpid
}

Set trap in bash for different process with PID known

I need to set a trap for a bash process I'm starting in the background. The background process may run very long and has its PID saved in a specific file.
Now I need to set a trap for that process, so if it terminates, the PID file will be deleted.
Is there a way I can do that?
EDIT #1
It looks like I was not precise enough with my description of the problem. I have full control over all the code, but the long running background process I have is this:
cat /dev/random >> myfile&
When I now add the trap at the beginning of the script this statement is in, $$ will be the PID of that bigger script not of this small background process I am starting here.
So how can I set traps for that background process specifically?
(./jobsworthy& echo $! > $pidfile; wait; rm -f $pidfile)&
disown
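The one-liner above can be spelled out as a small function. This is a sketch under the same idea (a wrapping subshell that waits for the command and then removes the file); the name run_with_pidfile and the paths in the usage comment are hypothetical.

```shell
# Start COMMAND in the background, record its PID in PIDFILE, and remove
# PIDFILE once the command ends for any reason (normal exit or signal).
# Usage: run_with_pidfile PIDFILE COMMAND [ARG...]
run_with_pidfile() {
    local pidfile=$1
    shift
    (
        "$@" &
        echo "$!" >"$pidfile"
        wait                 # returns once the command has terminated
        rm -f "$pidfile"
    ) &
}

# Hypothetical use for the command in the question (exec makes the
# recorded PID that of cat rather than of the intermediate sh):
# run_with_pidfile /tmp/random.pid sh -c 'exec cat /dev/random >> myfile'
# disown    # bash only: detach the wrapper from the shell's job table
```

Note that the clean-up is done by the wrapping subshell, so the PID file is left behind if that subshell itself is killed before the command ends.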
Add this to the beginning of your Bash script.
#!/bin/bash
trap 'rm -f "$pidfile"; exit' EXIT SIGQUIT SIGINT SIGTERM ERR  # SIGSTOP cannot be trapped, so it is not listed
pidfile=$(tempfile -p foo -s $$)
echo $$ > "$pidfile"
# from here, do your long running process
You can run your long running background process in an explicit subshell, as already shown by Petesh's answer, and set a trap inside this specific subshell to handle the exiting of your long running background process. The parent shell remains unaffected by this subshell trap.
(
trap '
trap - EXIT ERR
kill -0 ${!} 1>/dev/null 2>&1 && kill ${!}
rm -f pidfile.pid
exit
' EXIT QUIT INT TERM ERR
# simulate background process
sleep 15 &
echo ${!} > pidfile.pid
wait
) &
disown
# remove background process by hand
# kill -TERM ${!}
You do not need trap to just run some command after a background process terminates, you can instead run through a shell command line and add the command following after the background process, separated with semicolon (and let this shell run in the background instead of the background process).
If you would still like some notification in your shell script, send and trap SIGUSR2, for instance:
#!/bin/sh
BACKGROUND_PROCESS=xterm # for my testing, replace with what you have
sh -c "$BACKGROUND_PROCESS; rm -f the_pid_file; kill -USR2 $$" &
trap "echo $BACKGROUND_PROCESS ended" USR2
while sleep 1
do
echo -n .
done

How to suppress Terminated message after killing in bash?

How can you suppress the Terminated message that comes up after you kill a
process in a bash script?
I tried set +bm, but that doesn't work.
I know another solution involves calling exec 2> /dev/null, but is that
reliable? How do I reset it back so that I can continue to see stderr?
In order to silence the message, you must be redirecting stderr at the time the message is generated. Because the kill command sends a signal and doesn't wait for the target process to respond, redirecting stderr of the kill command does you no good. The bash builtin wait was made specifically for this purpose.
Here is a very simple example that kills the most recent background command. (Learn more about $! here.)
kill $!
wait $! 2>/dev/null
Because both kill and wait accept multiple pids, you can also do batch kills. Here is an example that kills all background processes (of the current process/script of course).
kill $(jobs -rp)
wait $(jobs -rp) 2>/dev/null
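One caveat with the batch form: the second jobs -rp runs after the kill, so it may no longer list the jobs that were just signalled. A sketch that records the PIDs once and reuses them for both steps follows; kill_quietly is a made-up name.

```shell
# Send SIGTERM to each given PID that is still alive, then wait for it
# with stderr redirected so that the shell's status message is discarded.
# Usage: kill_quietly PID...
kill_quietly() {
    local pid
    for pid in "$@"; do
        if kill -0 "$pid" 2>/dev/null; then
            kill "$pid" 2>/dev/null
        fi
        wait "$pid" 2>/dev/null
    done
    return 0
}

# Example: remember the PIDs at start time instead of asking `jobs` twice.
sleep 100 &
pids=$!
sleep 100 &
pids="$pids $!"
kill_quietly $pids
```

The kill -0 test only checks that the process exists and can be signalled, so PIDs that have already exited are skipped without an error message.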
I was led here from bash: silently kill background function process.
The short answer is that you can't. Bash always prints the status of foreground jobs. The monitoring flag only applies for background jobs, and only for interactive shells, not scripts.
see notify_of_job_status() in jobs.c.
As you say, you can redirect so standard error is pointing to /dev/null but then you miss any other error messages. You can make it temporary by doing the redirection in a subshell which runs the script. This leaves the original environment alone.
(script 2> /dev/null)
which will lose all error messages, but just from that script, not from anything else run in that shell.
You can save and restore standard error, by redirecting a new filedescriptor to point there:
exec 3>&2 # 3 is now a copy of 2
exec 2> /dev/null # 2 now points to /dev/null
script # run script with redirected stderr
exec 2>&3 # restore stderr to saved
exec 3>&- # close saved version
But I wouldn't recommend this -- the only upside over the first approach is that it saves a subshell invocation, while being more complicated and possibly even altering the behavior of the script, if the script alters file descriptors.
EDIT:
For a more appropriate answer, see the answer given by Mark Edgar.
Solution: use SIGINT (works only in non-interactive shells)
Demo:
cat > silent.sh <<"EOF"
sleep 100 &
kill -INT $!
sleep 1
EOF
sh silent.sh
http://thread.gmane.org/gmane.comp.shells.bash.bugs/15798
Maybe detach the process from the current shell process by calling disown?
The Terminated message is logged by the default signal handler of bash 3.x and 4.x. Just trap the TERM signal at the very start of the child process:
#!/bin/sh
## assume script name is test.sh
foo() {
trap 'exit 0' TERM ## here is the key
while true; do sleep 1; done
}
echo before child
ps aux | grep 'test\.s[h]\|slee[p]'
foo &
pid=$!
sleep 1 # wait trap is done
echo before kill
ps aux | grep 'test\.s[h]\|slee[p]'
kill $pid ## no need to redirect stdin/stderr
sleep 1 # wait kill is done
echo after kill
ps aux | grep 'test\.s[h]\|slee[p]'
Is this what we are all looking for?
Not wanted:
$ sleep 3 &
[1] 234
<pressing enter a few times....>
$
$
[1]+ Done sleep 3
$
Wanted:
$ (set +m; sleep 3 &)
<again, pressing enter several times....>
$
$
$
$
$
As you can see, no job end message. Works for me in bash scripts as well, also for killed background processes.
'set +m' disables job control (see 'help set') for the current shell. So if you enter your command in a subshell (as done here in brackets) you will not influence the job control settings of the current shell. Only disadvantage is that you need to get the pid of your background process back to the current shell if you want to check whether it has terminated, or evaluate the return code.
This also works for killall (for those who prefer it):
killall -s SIGINT (yourprogram)
suppresses the message... I was running mpg123 in background mode.
It could only silently be killed by sending a ctrl-c (SIGINT) instead of a SIGTERM (default).
disown did exactly the right thing for me -- the exec 3>&2 is risky for a lot of reasons -- set +bm didn't seem to work inside a script, only at the command prompt
Had success with adding 'jobs 2>&1 >/dev/null' to the script, not certain if it will help anyone else's script, but here is a sample.
while true; do echo $RANDOM; done | while read line
do
echo Random is $line the last jobid is $(jobs -lp)
jobs 2>&1 >/dev/null
sleep 3
done
Another way to disable job notifications is to place your command to be backgrounded in a sh -c 'cmd &' construct.
#!/bin/bash
# ...
pid="`sh -c 'sleep 30 & echo ${!}' | head -1`"
kill "$pid"
# ...
# or put several cmds in sh -c '...' construct
sh -c '
sleep 30 &
pid="${!}"
sleep 5
kill "${pid}"
'
I found that putting the kill command in a function and then backgrounding the function suppresses the termination output
function killCmd() {
kill $1
}
killCmd $somePID &
Simple:
{ kill $!; } 2>/dev/null
Advantage? You can use any signal.
For example:
{ kill -9 $PID; } 2>/dev/null

Resources