Starting a process over ssh using bash and then killing it on sigint - bash

I want to start a couple of jobs on different machines using ssh. If the user then interrupts the main script I want to shut down all the jobs gracefully.
Here is a short example of what I'm trying to do:
#!/bin/bash
trap "aborted" SIGINT SIGTERM
aborted() {
kill -SIGTERM $bash2_pid
exit
}
ssh -t remote_machine /foo/bar.sh &
bash2_pid=$!
wait
However, the bar.sh process is still running on the remote machine. If I run the same commands in a terminal window, it shuts down the process on the remote host.
Is there an easy way to make this happen when I run the bash script? Or do I need to make it log on to the remote machine, find the right process and kill it that way?
edit:
It seems I have to go with option B: killing the remote script through another ssh connection.
So now I want to know: how do I get the remote PID?
I've tried something along the lines of:
remote_pid=$(ssh remote_machine '{ /foo/bar.sh & } ; echo $!')
This doesn't work since it blocks.
How do I wait for a variable to print and then "release" a subprocess?

It would definitely be preferable to keep your cleanup managed by the ssh that starts the process rather than moving in for the kill with a second ssh session later on.
When ssh is attached to your terminal, it behaves quite well. However, detach it from your terminal and it becomes (as you've noticed) a pain to signal or manage remote processes. You can shut down the link, but not the remote processes.
That leaves you with one option: use the link as a way for the remote process to get notified that it needs to shut down. The cleanest way to do this is by using blocking I/O. Make the remote side read input from ssh, and when you want the process to shut down, send it some data so that the remote's read operation unblocks and it can proceed with the cleanup:
command & read; kill $!
This is what we would want to run on the remote. We invoke our command that we want to run remotely; we read a line of text (blocks until we receive one) and when we're done, signal the command to terminate.
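A minimal local sketch of that remote-side pattern, with `sleep 30` standing in for the real command and a plain pipe standing in for the ssh link (both are assumptions for illustration; no ssh is involved). The function name `remote_side` is hypothetical.

```shell
#!/bin/sh
# Stand-in for what would run on the remote host: start the job,
# block on read, then terminate the job once a line arrives.
remote_side() {
    sleep 30 &                 # placeholder for the real command
    job=$!
    read -r line               # blocks until a line (or EOF) arrives on stdin
    kill "$job"                # default signal is SIGTERM
    wait "$job" 2>/dev/null    # reap the job and collect its status
    echo "job exit status: $?"
}

# Sending a single newline is enough to unblock the read.
result=$(echo | remote_side)
echo "$result"
```

The background job does not compete with `read` for input here, because a non-interactive shell starts background commands with stdin redirected from /dev/null.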
To send the signal from our local script to the remote, all we need to do now is send it a line of text. Unfortunately, Bash does not give you a lot of good options, here. At least, not if you want to be compatible with bash < 4.0.
With bash 4 we can use co-processes:
coproc ssh user@host 'command & read; kill $!'
trap 'echo >&"${COPROC[1]}"' EXIT
...
Now, when the local script exits (don't trap on INT, TERM, etc. Just EXIT), it sends a newline to the file descriptor stored in the second element of the COPROC array. That descriptor is a pipe connected to ssh's stdin, effectively routing our line to ssh. The remote command reads the line, ends the read and kills the command.
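The co-process wiring can be tried without a remote host. This is only a sketch under assumptions: it needs bash 4.0 or later, a local `bash -c` stands in for the ssh command, `sleep 30` stands in for the real job, and the newline is written directly rather than from an EXIT trap so the reply can be observed.

```shell
#!/bin/bash
# Requires bash >= 4.0 (coproc).
# Local stand-in for: coproc ssh user@host 'command & read; kill $!'
coproc bash -c 'sleep 30 & read -r line; kill $!; wait $! 2>/dev/null; echo stopped'
pid=$COPROC_PID

# Duplicate the co-process pipes onto fds 3 and 4. Bash closes the COPROC
# descriptors once the co-process exits, and the reply is read after that.
exec 3<&"${COPROC[0]}" 4>&"${COPROC[1]}"

echo >&4                  # the line an EXIT trap would send
IFS= read -r reply <&3    # confirmation written by the stand-in
wait "$pid" 2>/dev/null
exec 3<&- 4>&-
echo "$reply"
```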
Before bash 4 things get a bit harder since we don't have co-processes. In that case, we need to do the piping ourselves:
mkfifo /tmp/mysshcommand
ssh user@host 'command & read; kill $!' < /tmp/mysshcommand &
trap 'echo > /tmp/mysshcommand; rm /tmp/mysshcommand' EXIT
This should work in pretty much any bash version.

Try this:
ssh -tt host command </dev/null &
When you kill the local ssh process, the remote pty will close and SIGHUP will be sent to the remote process.

Referencing the answer by lhunath and https://unix.stackexchange.com/questions/71205/background-process-pipe-input I came up with this script
run.sh:
#!/bin/bash
log="log"
eval "$@" \&
PID=$!
echo "running" "$@" "in PID $PID"> $log
{ (cat <&3 3<&- >/dev/null; kill $PID; echo "killed" >> $log) & } 3<&0
trap "echo EXIT >> $log" EXIT
wait $PID
The difference being that this version kills the process when the connection is closed, but also returns the exit code of the command when it runs to completion.
$ ssh localhost ./run.sh true; echo $?; cat log
0
running true in PID 19247
EXIT
$ ssh localhost ./run.sh false; echo $?; cat log
1
running false in PID 19298
EXIT
$ ssh localhost ./run.sh sleep 99; echo $?; cat log
^C130
running sleep 99 in PID 20499
killed
EXIT
$ ssh localhost ./run.sh sleep 2; echo $?; cat log
0
running sleep 2 in PID 20556
EXIT
For a one-liner:
ssh localhost "sleep 99 & PID=\$!; { (cat <&3 3<&- >/dev/null; kill \$PID) & } 3<&0; wait \$PID"
For convenience:
HUP_KILL="& PID=\$!; { (cat <&3 3<&- >/dev/null; kill \$PID) & } 3<&0; wait \$PID"
ssh localhost "sleep 99 $HUP_KILL"
Note: kill 0 may be preferred to kill $PID depending on the behavior needed with regard to spawned child processes. You can also kill -HUP or kill -INT if you desire.
Update:
A secondary job control channel is better than reading from stdin.
ssh -n -R9002:localhost:8001 -L8001:localhost:9001 localhost ./test.sh sleep 2
Set job control mode and monitor the job control channel:
set -m
trap "kill %1 %2 %3" EXIT
(sleep infinity | netcat -l 127.0.0.1 9001) &
(netcat -d 127.0.0.1 9002; kill -INT $$) &
"$@" &
wait %3
Finally, here's another approach and a reference to a bug filed on openssh:
https://bugzilla.mindrot.org/show_bug.cgi?id=396#c14
This is the best way I have found to do this. You want something on the server side that attempts to read stdin and then kills the process group when that fails, but you also want a stdin on the client side that blocks until the server side process is done and will not leave lingering processes like <(sleep infinity) might.
ssh localhost "sleep 99 < <(cat; kill -INT 0)" <&1
It doesn't actually seem to redirect stdout anywhere but it does function as a blocking input and avoids capturing keystrokes.

The solution for bash 3.2:
mkfifo /tmp/mysshcommand
ssh user@host 'command & read; kill $!' < /tmp/mysshcommand &
trap 'echo > /tmp/mysshcommand; rm /tmp/mysshcommand' EXIT
doesn't work. The ssh command is not on the ps list on the "client" machine. Only after I echo something into the pipe will it appear in the process list of the client machine. The process that appears on the "server" machine would just be the command itself, not the read/kill part.
Writing again into the pipe does not terminate the process.
So summarizing, I need to write into the pipe for the command to start up, and if I write again, it does not kill the remote command, as expected.
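The delayed start described above is consistent with how FIFOs behave: opening a FIFO for reading blocks until some process opens it for writing, so the backgrounded command cannot begin before the first writer appears. One possible workaround, sketched locally, is to keep a write descriptor open in the script for its whole lifetime. The assumptions here: `sh -c` stands in for `ssh user@host`, `sleep 30` stands in for the real command, and the file names under a temporary directory are hypothetical.

```shell
#!/bin/sh
dir=$(mktemp -d)
fifo="$dir/stop"
mkfifo "$fifo"

# Stand-in for: ssh user@host 'command & read; kill $!' < "$fifo" &
sh -c 'sleep 30 & read -r line; kill $!; wait $! 2>/dev/null; echo stopped' \
    < "$fifo" > "$dir/out" &
reader=$!

exec 3> "$fifo"    # opening the write end unblocks the reader; keep it open
echo >&3           # later (for example from an EXIT trap): send the stop line
wait "$reader"
exec 3>&-

result=$(cat "$dir/out")
rm -rf "$dir"
echo "$result"
```

Holding fd 3 open also means the reader does not see end-of-file as soon as a single `echo > fifo` finishes.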

You may want to consider mounting the remote file system and running the script from the master box. For instance, if your kernel is compiled with FUSE support (you can check with the following):
/sbin/lsmod | grep -i fuse
You can then mount the remote file system with the following command:
sshfs user@remote_system: mount_point
Now just run your script on the file located in mount_point.

Related

Unable to fully close remote SSH tunnel in script's exit

I'm writing a script, which double-hop SSH-forwards port 80 from our remotely deployed VMs, and opens this "status page" in a local browser. To open it, the SSH tunnel must be "backgrounded", however doing so causes the SSH tunnel to exit with a persistent tunnel remaining on the SSH server that I'm tunneling through (bastion). Here is the script, so far:
#!/bin/bash
# SSH needs a HUP when this script exits
shopt -s huponexit
echo "SSH Forwards the VM status page for a given host..."
read -p "Host Name: " CODE
PORT=$(($RANDOM + 1024))
# "-t -t" (force tty) needed to avoid orphan tunnels on bastion after exit. (Only seems to work when not backgrounded?)
ssh -t -t -4L $PORT:localhost:$PORT user1@bastion sudo ssh -4NL $PORT:localhost:80 root@$CODE.internal-vms &
PID=$!
# Open browser to VM Status Page
sleep 1
open http://localhost:$PORT/
# Runs the SSH tunnel in the background, ensuring it gets killed on shell's exit...
bash
kill -CONT $PID
#kill -QUIT $PID
echo "Killed SSH Tunnel. Exiting..."
sleep 2
Unfortunately, given the backgrounding of the SSH tunnel (using & at the end of the ssh command), when the script is killed (via CTRL-C), the "bastion" server ends up having an orphaned SSH connection remaining indefinitely.
The "-t -t" and "shopt -s huponexit" are fixes I've tried, but they don't seem to help. I've also tried various signals in the final kill. What am I doing wrong here? Thanks for the assistance!
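The -f flag can be used to background the process. To end the connection, ssh -O exit user1@bastion is a better option than kill, which is rather violent. Note that -O exit talks to a connection-multiplexing master process, so it only works if the tunnel was started as one (for example with -M and -S, or with ControlMaster/ControlPath in the ssh configuration).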
I would do it like this. Fyi, I didn't test the modified script, although I regularly use a similar, long SSH command.
#!/bin/bash
# SSH needs a HUP when this script exits
shopt -s huponexit
echo "SSH Forwards the VM status page for a given host..."
read -p "Host Name: " CODE
PORT=$(($RANDOM + 1024))
# "-t -t" (force tty) needed to avoid orphan tunnels on bastion after exit. (Only seems to work when not backgrounded?)
ssh -t -t -f -4L $PORT:localhost:$PORT user1@bastion sudo ssh -4NL $PORT:localhost:80 root@$CODE.internal-vms
#PID=$!
# Open browser to VM Status Page
sleep 1
open http://localhost:$PORT/
# Runs the SSH tunnel in the background, ensuring it gets killed on shell's exit...
#bash
#kill -CONT $PID
#kill -QUIT $PID
ssh -O exit user1@bastion
echo "Killed SSH Tunnel. Exiting..."
sleep 2

Telnet Process Continues after Bash Script

I am running the following script
#! /bin/bash
HOSTLIST="192.168.0.5 192.168.22.1"
DELAY=3
stty echo
exec 4>&1
for HOST in $HOSTLIST ; do
telnet $HOST 135 | grep Connected & pid=$!
echo "Checking $HOST"
sleep $DELAY
kill -9 $pid &> /dev/null
done
However, when it finishes, the Telnet connections are still being attempted in the background which spams annoying "telnet: unable to connect" errors randomly for the next few moments. I tried adding killing the process to stop this but it still does it. Am I doing something wrong for killing the process?
Also I have to use telnet, can't use netcat or nmap.
The pid you are trying to kill is the pid of the grep since $! is the pid of the most recently executed background command. If you hadn't thrown away stderr when trying to kill it might have provided some clue...
BTW, kill -9 is a serious code smell. Any well behaved process can be killed by at least one of the -INT, -HUP, -TERM or -QUIT signals. You should never need to kill -KILL. It's bad because it doesn't give the process opportunity to clean up its mess.
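One way to act on both points, sketched under assumptions: send the client's output to a temporary file instead of piping it into grep, so that `$!` is the client itself, and use the default SIGTERM rather than -9. Here `sleep 30` is a stand-in for `telnet "$HOST" 135` so the sketch runs anywhere; with real telnet the result file would hold its output.

```shell
#!/bin/sh
out=$(mktemp)

# Stand-in for: telnet "$HOST" 135 > "$out" 2>&1 &
sleep 30 > "$out" 2>&1 &
pid=$!                      # PID of the client, not of a trailing grep

# (the real script would sleep "$DELAY" here)
kill "$pid"                 # default SIGTERM
wait "$pid" 2>/dev/null     # reap it so nothing lingers after the loop
status=$?

# Inspect the captured output only after the client is gone.
if grep -q Connected "$out"; then verdict=open; else verdict=closed; fi
rm -f "$out"
echo "status=$status verdict=$verdict"
```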

Get the PID of a process started with nohup via ssh

I want to start a process using nohup on a remote machine via ssh. The problem is how to get the PID of the process started with nohup, so the "process actually doing something", not some outer shell instance or the like. Also, I want to store stdout and stderr in files, but that is not the issue here...
Locally, it works flawlessly using
nohup sleep 30 > out 2> err < /dev/null & echo $!
It is echoing me the exact PID of the command "sleep 30", which I can also see using "top" or "ps aux|grep sleep".
But I'm having trouble doing it remotely via ssh. I tried something like
ssh remote_machine 'nohup bash -c "( ( sleep 30 ) & )" > out 2> err < /dev/null'
but I cannot figure out where to place the "echo $!" so that it is displayed in my local shell. It is always showing me wrong PIDs, for example the one of the "bash" instance etc.
Has somebody an idea how to solve this?
EDIT:
OK, the "bash -c" might not be needed here. Like Lotharyx pointed out, I get the right PID just fine using
ssh remote 'nohup sleep 30 > out 2> err < /dev/null & echo $!'
but then the problem is that if you substitute "sleep 30" with something that produces output, say, "echo Hello World!", that output does not end up in the file "out", neither on the local nor on remote side. Anybody got an idea why?
EDIT2: My fault! There was just no space left on the other device, that's why the files "out" and "err" stayed empty!
So this is working. In addition, if one wants to call multiple commands in a row, separated by a semicolon (;), one can still use "bash -c", like so:
ssh remote 'nohup bash -c "echo bla;sleep 30;echo blupp" > out 2> err < /dev/null & echo $!'
Then it prints out the PID of the "bash -c" on the local side, which is just fine. (It is impossible to get the PID of the "innermost" or "busy" process, because every program itself can spawn new subprocesses, there is no way to find out...)
I tried the following (the local machine is Debian; the remote machine is CentOS), and it worked exactly as I think you're expecting:
~# ssh someone@somewhere 'nohup sleep 30 > out 2> err < /dev/null & echo $!'
someone@somewhere's password:
14193
~#
On the remote machine, I did ps -e, and saw this line:
14193 ? 00:00:00 sleep
So, clearly, on my local machine, the output is the PID of "sleep" executing on the remote machine.
Why are you adding bash to your command when sending it across an SSH tunnel?
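The working form from this thread can be exercised locally before pointing it at a real host. In this sketch `sh -c` stands in for `ssh remote` (an assumption for illustration), and the files out and err land in a throwaway directory.

```shell
#!/bin/sh
dir=$(mktemp -d)

# Stand-in for:
#   ssh remote 'nohup sleep 30 > out 2> err < /dev/null & echo $!'
pid=$(cd "$dir" && sh -c 'nohup sleep 30 > out 2> err < /dev/null & echo $!')

# The captured PID refers to the running sleep: nohup execs the command,
# so the PID reported by $! stays the same.
if kill -0 "$pid" 2>/dev/null; then alive=yes; else alive=no; fi

kill "$pid"
rm -rf "$dir"
echo "pid captured, process alive: $alive"
```

Because all three standard streams of the background job are redirected, the command substitution returns as soon as `echo $!` has run instead of waiting for the job to finish.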

Terminating SSH session executed by bash script

I have a script I can run locally to remotely start a server:
#!/bin/bash
ssh user@host.com <<EOF
nohup /path/to/run.sh &
EOF
echo 'done'
After running nohup, it hangs. I have to hit ctrl-c to exit the script.
I've tried adding an explicit exit at the end of the here doc and using "-t" argument for ssh. Neither works. How do I make this script exit immediately?
EDIT: The client is OSX 10.6, server is Ubuntu.
I think the problem is that nohup can't redirect output when you come in from ssh: it only redirects to nohup.out when it thinks it's connected to a terminal, and the stdin override you have will prevent that, even with -t.
A workaround might be to redirect the output yourself, then the ssh client can disconnect - it's not waiting for the stream to close. Something like:
nohup /path/to/run.sh > run.log &
(This worked for me in a simple test connecting to an Ubuntu server from an OS X client.)
The problem might be that ssh is respecting the POSIX standard when not closing the session if a process is still attached to the tty.
Therefore a solution might be to detach the stdin of the nohup command from the tty:
nohup /path/to/run.sh </dev/null &
See: SSH Hangs On Exit When Using nohup
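The redirection workarounds above rest on one idea: a background process that still holds an inherited stream keeps whoever is reading that stream waiting. A local analogy, not an ssh test: command substitution also waits for end-of-file on a pipe, so it shows the same effect. The helper name `elapsed` is hypothetical, and `date +%s` is assumed to be available.

```shell
#!/bin/sh
# Print how many whole seconds a command substitution of "$@" blocks.
elapsed() {
    start=$(date +%s)
    out=$("$@")
    end=$(date +%s)
    echo $((end - start))
}

# The background sleep inherits stdout, so the reader waits for it.
held=$(elapsed sh -c 'sleep 2 & echo started')

# With its output redirected, the reader is released immediately.
released=$(elapsed sh -c 'sleep 2 > /dev/null 2>&1 & echo started')

echo "inherited stdout: waited ${held}s"
echo "redirected stdout: waited ${released}s"
```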
Yet another approach might be to use ssh -t -t to force pseudo-tty allocation even if stdin isn't a terminal.
man ssh | less -Ip 'multiple -t'
ssh -t -t user@host.com <<EOF
nohup /path/to/run.sh &
EOF
See: BASH spawn subshell for SSH and continue with program flow
Redirecting the stdin of the remote host from a here document while invoking ssh without an explicit command leads to the message: Pseudo-terminal will not be allocated because stdin is not a terminal.
To avoid this message either use ssh's -T switch to tell the remote host there is no need to allocate a pseudo-terminal or explicitly specify a command (such as /bin/sh) for the remote host to execute the commands provided by the here document.
If an explicit command is given to ssh, the default is to provide no login shell in the form of a pseudo-terminal, i.e. there will be no normal login session when a command is specified (see man ssh).
Without a command specified for ssh, on the other hand, the default is to create a pseudo-tty for an interactive login session on the remote host.
- ssh user@host.com <<EOF
+ ssh -T user@host.com <<EOF
+ ssh user@host.com /bin/bash <<EOF
As a rule, ssh -t or even ssh -t -t should only be used if there are commands that expect stdin / stdout to be a terminal (such as top or vim) or if it is necessary to kill the remote shell and its children when the ssh client command finishes execution (see: ssh command unexpectedly continues on other system after ssh terminates).
As far as I can tell, the only way to combine an ssh command that does not allocate a pseudo-tty and a nohup command that writes to nohup.out on the remote host is to let the nohup command execute in a pseudo-terminal not created by the ssh mechanism. This can be done with the script command, for example, and will avoid the tcgetattr: Inappropriate ioctl for device message.
#!/bin/bash
ssh localhost /bin/sh <<EOF
#0<&- script -q /dev/null nohup sleep 10 1>&- &
#0<&- script -q -c "nohup sh -c 'date; sleep 10 1>&- &'" /dev/null # Linux
0<&- script -q /dev/null nohup sh -c 'date; sleep 10 1>&- &' # FreeBSD, Mac OS X
cat nohup.out
exit 0
EOF
echo 'done'
exit 0
You need to add an exit 0 at the end.

How to suppress Terminated message after killing in bash?

How can you suppress the Terminated message that comes up after you kill a process in a bash script?
I tried set +bm, but that doesn't work.
I know another solution involves calling exec 2> /dev/null, but is that reliable? How do I reset it back so that I can continue to see stderr?
In order to silence the message, you must be redirecting stderr at the time the message is generated. Because the kill command sends a signal and doesn't wait for the target process to respond, redirecting stderr of the kill command does you no good. The bash builtin wait was made specifically for this purpose.
Here is a very simple example that kills the most recent background command. ($! holds the PID of the most recently started background command.)
kill $!
wait $! 2>/dev/null
Because both kill and wait accept multiple pids, you can also do batch kills. Here is an example that kills all background processes (of the current process/script of course).
kill $(jobs -rp)
wait $(jobs -rp) 2>/dev/null
I was led here from bash: silently kill background function process.
The short answer is that you can't. Bash always prints the status of foreground jobs. The monitoring flag only applies for background jobs, and only for interactive shells, not scripts.
see notify_of_job_status() in jobs.c.
As you say, you can redirect so standard error is pointing to /dev/null but then you miss any other error messages. You can make it temporary by doing the redirection in a subshell which runs the script. This leaves the original environment alone.
(script 2> /dev/null)
which will lose all error messages, but just from that script, not from anything else run in that shell.
You can save and restore standard error, by redirecting a new filedescriptor to point there:
exec 3>&2 # 3 is now a copy of 2
exec 2> /dev/null # 2 now points to /dev/null
script # run script with redirected stderr
exec 2>&3 # restore stderr to saved
exec 3>&- # close saved version
But I wouldn't recommend this: the only upside over the first approach is that it saves a subshell invocation, while being more complicated and possibly even altering the behavior of the script if the script alters file descriptors.
EDIT:
For a more appropriate answer, see the one given by Mark Edgar.
Solution: use SIGINT (works only in non-interactive shells)
Demo:
cat > silent.sh <<"EOF"
sleep 100 &
kill -INT $!
sleep 1
EOF
sh silent.sh
http://thread.gmane.org/gmane.comp.shells.bash.bugs/15798
Maybe detach the process from the current shell process by calling disown?
The Terminated message is logged by the default signal handler of bash 3.x and 4.x. Just trap the TERM signal at the very start of the child process:
#!/bin/sh
## assume script name is test.sh
foo() {
trap 'exit 0' TERM ## here is the key
while true; do sleep 1; done
}
echo before child
ps aux | grep 'test\.s[h]\|slee[p]'
foo &
pid=$!
sleep 1 # wait trap is done
echo before kill
ps aux | grep 'test\.s[h]\|slee[p]'
kill $pid ## no need to redirect stdin/stderr
sleep 1 # wait kill is done
echo after kill
ps aux | grep 'test\.s[h]\|slee[p]'
Is this what we are all looking for?
Not wanted:
$ sleep 3 &
[1] 234
<pressing enter a few times....>
$
$
[1]+ Done sleep 3
$
Wanted:
$ (set +m; sleep 3 &)
<again, pressing enter several times....>
$
$
$
$
$
As you can see, no job end message. Works for me in bash scripts as well, also for killed background processes.
'set +m' disables job control (see 'help set') for the current shell. So if you enter your command in a subshell (as done here in brackets) you will not influence the job control settings of the current shell. Only disadvantage is that you need to get the pid of your background process back to the current shell if you want to check whether it has terminated, or evaluate the return code.
This also works for killall (for those who prefer it):
killall -s SIGINT (yourprogram)
suppresses the message... I was running mpg123 in background mode.
It could only silently be killed by sending a ctrl-c (SIGINT) instead of a SIGTERM (default).
disown did exactly the right thing for me -- the exec 3>&2 is risky for a lot of reasons -- set +bm didn't seem to work inside a script, only at the command prompt
Had success with adding 'jobs 2>&1 >/dev/null' to the script, not certain if it will help anyone else's script, but here is a sample.
while true; do echo $RANDOM; done | while read line
do
echo Random is $line the last jobid is $(jobs -lp)
jobs 2>&1 >/dev/null
sleep 3
done
Another way to disable job notifications is to place your command to be backgrounded in a sh -c 'cmd &' construct.
#!/bin/bash
# ...
pid="`sh -c 'sleep 30 & echo ${!}' | head -1`"
kill "$pid"
# ...
# or put several cmds in sh -c '...' construct
sh -c '
sleep 30 &
pid="${!}"
sleep 5
kill "${pid}"
'
I found that putting the kill command in a function and then backgrounding the function suppresses the termination output
function killCmd() {
kill $1
}
killCmd $somePID &
Simple:
{ kill $!; } 2>/dev/null
Advantage? It can use any signal.
For example:
{ kill -9 $PID; } 2>/dev/null
