How to check connection to a list of servers in bash? - bash

I'm trying to check connections for a list of servers. I want to loop through the list, check whether a connection works and, if so, do some stuff; if not, echo a problem message.
My problem is:
the script stops at the first node without echoing $?.
So, what's wrong with my for loop?
These vars are included from a config file:
$nodes is a list of server IPs like 1.1.1.1,2.2.2.2,10.10.10.10
$user is one string
for node in $(echo $nodes | sed "s/,/ /g")
do
echo "Checking Node: $node"
ssh -q -o ConnectTimeout=3 $user@$node echo ok
echo $?
if [[ $? != 0 ]]
then
echo "Problem in logging into $node"
else
# do some stuff here
fi
done
EDIT #1:
for node in $(echo $nodes | sed "s/,/ /g")
do
echo "Checking Node: $node"
ssh -q -t -o ConnectTimeout=3 $user@$node "echo ok"
retcode=$?
echo $retcode
if [[ "$retcode" -ne 0 ]]
then
echo "Problem in logging into $node"
else
echo "OK"
fi
done

This happens because ssh first asks you to confirm the authenticity of the host key and, once you accept it, asks for a password. That is why your command does not return to the shell: it is waiting for input.
If your intention is just to check that the ssh port is reachable, you may consider using
telnet <your_host> <port> < /dev/null
But if your intent is to run commands, you need a trust relationship (key-based authentication) between the hosts. In that case, execute these commands:
ssh-keygen -t rsa
then
ssh-copy-id -i root@ip_address
Now you can connect with
ssh <user>@<host>
Further information

You can add -t to allocate a pseudo-terminal and put quotes around the command:
ssh -q -t -o ConnectTimeout=3 ${user}@${node} "echo ok"
Also use -ne instead of !=, which compares strings:
if [[ "$?" -ne 0 ]]
Also, echo $? clobbers the return code: once it has run, $? holds the exit status of echo, not of ssh. You should use something like:
ssh -q -t -o ConnectTimeout=3 ${user}@${node} "echo ok"
retcode=$?
echo $retcode
if [[ "$retcode" -ne 0 ]]
You can rewrite the ssh command like this to avoid the host key prompt:
ssh -q -t -o StrictHostKeyChecking=no -o ConnectTimeout=3 ${user}@${node} "echo ok"
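Putting these points together, the loop could be sketched as follows. This is only a sketch under assumptions: the values of nodes and user are placeholders for what the config file provides, and probe and check_nodes are hypothetical helper names. BatchMode=yes makes ssh fail instead of prompting for a password, and -n stops ssh from reading the script's standard input.

```shell
#!/bin/bash
# Placeholder values; in the question these come from a config file.
nodes="1.1.1.1,2.2.2.2,10.10.10.10"
user="admin"

# Split the comma-separated list into an array without calling sed.
IFS=',' read -r -a node_list <<< "$nodes"

# probe NODE: hypothetical helper, returns the exit status of a
# non-interactive ssh attempt (no password or host key prompts are answered).
probe() {
    ssh -n -q -o BatchMode=yes -o ConnectTimeout=3 "$user@$1" true
}

# check_nodes: hypothetical helper, reports one line per node.
check_nodes() {
    local node retcode
    for node in "${node_list[@]}"; do
        probe "$node"
        retcode=$?    # save immediately; any later command overwrites $?
        if [[ $retcode -ne 0 ]]; then
            echo "Problem logging into $node (exit status $retcode)"
        else
            echo "OK: $node"
        fi
    done
}
```

Calling check_nodes at the end of the script runs the check for every node, and a node that fails no longer stops the loop.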

Related

How to wait until ssh is available?

I'm trying to code a script which will wait for a server to be up and check if ssh is running.
#!/bin/bash
until [ $(ssh -o BatchMode=yes -o ConnectTimeout=5 root@HOST echo ok 2>&1) = "ok" ]; do
echo "Trying again..."
done
echo "SSH is running"
I get this error if the server is powered off:
test3: line 3: [: too many arguments
Trying again...
^C
If the server is running, it outputs:
ok
The trivial fix is to put double quotes around the string which might come up empty.
until [ "$(ssh ...)" = "ok" ]; do ...
The Bash-only test [[ is more tolerant, so you could use [[ ... ]] instead of [ ... ] and not have to add quotes.
... but a better solution is to look for the exit status from ssh:
until ssh ...; do ...
If you want the operation to be silent, add a redirection.
until ssh user@hostname true >/dev/null 2>&1; do ...
with whatever additional options you want, of course. You might need to add one or more ssh -t options if it complains about not being connected to a TTY, for example.
Your ssh command is expanding to nothing, or to multiple words; you should quote it (and run Shellcheck on your script):
until [ "$(ssh ... )" = ok ]; do
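An until loop on the exit status never gives up if the server stays down. One way to bound it is a small retry helper; the sketch below assumes a hypothetical function name wait_for, and the attempt count and delay are arbitrary choices, not something from the question.

```shell
#!/bin/bash
# wait_for MAX_TRIES DELAY COMMAND...: hypothetical helper that reruns
# COMMAND until it succeeds, giving up after MAX_TRIES attempts.
wait_for() {
    local max_tries=$1 delay=$2 attempt=1
    shift 2
    until "$@" >/dev/null 2>&1; do
        if (( attempt >= max_tries )); then
            return 1    # still failing after the last allowed attempt
        fi
        echo "Trying again... (attempt $attempt of $max_tries)" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Example call (HOST is the placeholder used in the question):
# wait_for 30 5 ssh -o BatchMode=yes -o ConnectTimeout=5 root@HOST true \
#     && echo "SSH is running"
```

Because the helper only looks at the exit status, it never has to compare the command's output with a string, which avoids the quoting problem entirely.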

Variable Not Picking up when in Quotes

I'm trying to rsync a directory from one server to hundreds of servers using the script at the bottom.
But when I put single or double quotes around the ${host} variable, the host names are not picked up properly or not resolved.
The error looks like this:
server1.example.com
Host key verification failed.
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
When I run only the rsync command, as below, it works. But the output doesn't contain the hostname, which I need in order to correlate the output with the associated host.
hostname -f && rsync -arpn --stats /usr/xyz ${host}:/usr/java
Can you please review the script and suggest how to make it work even with quotes around the host variable,
so that the output contains the hostname and the output of rsync together?
==============================================
#!/bin/bash
tmpdir=${TMPDIR:-/home/user}/output.$$
mkdir -p $tmpdir
count=0
while IFS= read -r host; do
ssh -n -o BatchMode=yes ${host} '\
hostname -f && \
rsync -arpn --stats /usr/xyz '${host}':/usr/java && \
ls -ltr /usr/xyz'
> ${tmpdir}/${host} 2>&1 &
count=`expr $count + 1`
done < /home/user/servers/non_java7_nodes.list
while [ $count -gt 0 ]; do
wait $pids
count=`expr $count - 1`
done
echo "Output for hosts are in $tmpdir"
exit 0
UPDATE:
Based on what I see with set -x, the host name is being resolved on the remote host itself; it is supposed to be resolved on the initiating host. I think the problem is solved once we know how to get the host name resolved on the initiating host even when the quotes are in place.
As far as I can tell, what you're looking for is something like:
#!/bin/bash
tmpdir=${TMPDIR:-/home/user}/output.$$
mkdir -p "$tmpdir"
host_for_pid=( )
while IFS= read -r host <&3; do
{
ssh -n -o BatchMode=yes "$host" 'hostname -f' && \
rsync -arpn --stats /usr/xyz "$host:/usr/java" && \
ssh -n -o BatchMode=yes "$host" 'ls -ltr /usr/java'
} </dev/null >"${tmpdir}/${host}" 2>&1 & host_for_pid[$!]=$host
done 3< /home/user/servers/non_java7_nodes.list
for pid in "${!host_for_pid[@]}"; do
if wait "$pid"; then
:
else
echo "ERROR: Process for host ${host_for_pid[$pid]} had exit status $?" >&2
fi
done
echo "Output for hosts are in $tmpdir"
Note that the rsync is no longer inside the ssh command, so it's run locally, not remotely.
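The pid-to-host bookkeeping can be tried without any remote hosts. In this minimal sketch the host names are made up and each background job is a stand-in for the ssh/rsync group (the job for "beta" is made to fail on purpose); only the array handling and the wait loop are the point.

```shell
#!/bin/bash
host_for_pid=( )    # sparse indexed array: index is the PID, value the host
failed=( )

for host in alpha beta gamma; do    # hypothetical host names
    # Stand-in for the real per-host work; exits non-zero only for "beta".
    { [[ $host != beta ]]; } &
    host_for_pid[$!]=$host          # $! is the PID of the job just started
done

# wait PID returns that job's exit status, so failures map back to a host.
for pid in "${!host_for_pid[@]}"; do
    if ! wait "$pid"; then
        failed+=( "${host_for_pid[$pid]}" )
    fi
done

echo "failed hosts: ${failed[*]}"
```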

The bash script only reboots the router without echoing whether it is up or down

#!/bin/bash
ip route add 10.105.8.100 via 192.168.1.100
date
cat /home/xxx/Documents/list.txt | while read output
do
ping="ping -c 3 -w 3 -q 'output'"
if $ping | grep -E "min/avg/max/mdev" > /dev/null; then
echo 'connection is ok'
else
echo "router $output is down"
then
cat /home/xxx/Documents/roots.txt | while read outputs
do
cd /home/xxx/Documents/routers
php rebootRouter.php "outputs" admin admin
done
fi
done
The other documents are:
lists.txt
10.105.8.100
roots.txt
192.168.1.100
When I run the script, the result is a reboot of the router I am trying to ping. It doesn't ping.
Is there a problem with the bash script?
If your files only contain a single line, there's no need for the while-loop, just use read:
read -r router_addr < /home/xxx/Documents/list.txt
# the grep is unnecessary, the return-code of the ping will be non-zero if the host is down
if ping -c 3 -w 3 -q "$router_addr" &> /dev/null; then
echo "connection to $router_addr is ok"
else
echo "router $router_addr is down"
read -r outputs < /home/xxx/Documents/roots.txt
cd /home/xxx/Documents/routers
php rebootRouter.php "$outputs" admin admin
fi
If your files contain multiple lines, you should redirect the file from the right-side of the while-loop:
while read -r output; do
...
done < /foo/bar/baz
Also make sure your files contain a newline at the end, or use the following pattern in your while-loops:
while read -r output || [[ -n $output ]]; do
...
done < /foo/bar/baz
where || [[ -n $output ]] is true even if the file doesn't end in a newline.
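The difference between the two loops can be seen with a file whose last line has no trailing newline. This is a small sketch using a temporary file as a stand-in for the real list; the variable names are chosen only for illustration.

```shell
#!/bin/bash
list=$(mktemp)
# Two addresses, the second one deliberately without a trailing newline.
printf '10.105.8.100\n192.168.1.100' > "$list"

plain=0
while read -r addr; do
    plain=$((plain + 1))
done < "$list"

guarded=0
# read returns non-zero at end of file but still fills addr with the
# partial last line; the -n test lets the loop body run for it.
while read -r addr || [[ -n $addr ]]; do
    guarded=$((guarded + 1))
done < "$list"

echo "plain loop saw $plain line(s), guarded loop saw $guarded"
rm -f "$list"
```

The plain loop processes only the first address, while the guarded loop processes both.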
Note that the way you're checking your router's status is somewhat brittle, as even a single failed ping run will force a reboot. For example, if the checking computer returns from a sleep state just as the script is running, the ping can fail because the network is still down, while the admin script succeeds because the network comes up just at that time.

Unable to kill remote processes with ssh

I need to kill remote processes with a shell script as follows:
#!/bin/bash
ip="172.24.63.41"
user="mag"
timeout 10s ssh -q $user@$ip exit
if [ $? -eq 124 ]
then
echo "can not connect to $ip, timeout out."
else
echo "connected, executing commands"
scp a.txt $user@$ip://home/mag
ssh -o ConnectTimeout=10 $user@$ip > /dev/null 2>&1 << remoteCmd
touch b.txt
jobPid=`jps -l | grep jobserver | awk '{print $1}'`
if [ ! $jobPid == "" ]; then
kill -9 $jobPid
fi
exit
remoteCmd
echo "commands executed."
fi
After executing it, I found that the scp and touch commands had been executed, but the kill command had not run successfully and the process was still there. If I run the lines from "jobPid= ..." to "fi" on the remote machine, the process is killed. How can I fix it?
I put a script on the remote machine that finds and kills the process, then ran a script on the local machine that executes the remote script over ssh. The scripts are as follows:
Local script:
#!/bin/bash
ip="172.24.63.41"
user="mag"
timeout 10s ssh -q $user@$ip exit
if [ $? -eq 124 ]
then
echo "can not connect to $ip, timeout out."
else
echo "connected, executing commands"
ssh -q $user@$ip "/home/mag/local.sh"
echo "commands executed."
fi
remote script:
#!/bin/bash
jobPid=`jps -l | grep jobserver | awk '{print $1}'`
if [ ! $jobPid == "" ]; then
kill -9 $jobPid
fi
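A separate remote script works because nothing in it is expanded locally. With a here-document whose delimiter is unquoted, the local shell expands variables and backticks before the text is sent, so a variable assigned inside the here-document is already replaced by its (empty) local value. Quoting the delimiter sends the text verbatim. The sketch below demonstrates the difference with a local bash standing in for the ssh connection; run_remote and marker are hypothetical names used only for this demonstration.

```shell
#!/bin/bash
# Stand-in for "ssh user@host bash -s": a shell that reads its script from stdin.
run_remote() { bash -s; }

marker="local"

# Unquoted delimiter: "$marker" is expanded by the local shell first.
unquoted=$(run_remote <<EOF
marker="remote"
echo "$marker"
EOF
)

# Quoted delimiter: the text is passed through untouched, so "$marker"
# is expanded by the receiving shell.
quoted=$(run_remote <<'EOF'
marker="remote"
echo "$marker"
EOF
)

echo "unquoted delimiter: $unquoted"
echo "quoted delimiter:   $quoted"
```

With ssh the same quoting applies, for example by writing the delimiter as 'remoteCmd' in the original script.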
Your script needs root access (WHICH IS NEVER A GOOD IDEA). Alternatively, make sure the program you want to kill is running under your web user/group.

Checking SSH failure in a script

Hi, what is the best way to check whether SSH fails for whatever reason?
Can I use an if statement (if it fails, then do something)?
I'm using the ssh command in a loop and passing my host names from a flat file,
so I do something like:
for i in `cat /tmp/hosts` ; do ssh $i 'hostname;sudo ethtool eth1'; done
Sometimes I get this error, or I just cannot connect:
ssh: host1 Temporary failure in name resolution
I want to skip the hosts that I cannot connect to if SSH fails. What is the best way to do this? Is there a runtime error I can trap to bypass the hosts that I cannot ssh into for whatever reason, perhaps because ssh is not allowed or I do not have the right password?
Thanking you in advance
Cheers
To check if there was a problem connecting and/or running the remote command:
if ! ssh host command
then
echo "SSH connection or remote command failed"
fi
To check if there was a problem connecting, regardless of success of the remote command (unless it happens to return status 255, which is rare):
if ssh host command; [ $? -eq 255 ]
then
echo "SSH connection failed"
fi
Applied to your example, this would be:
for i in `cat /tmp/hosts` ;
do
if ! ssh $i 'hostname;sudo ethtool eth1';
then
echo "Connection or remote command on $i failed";
fi
done
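The two checks can be combined so that each host is reported as reachable, unreachable, or reachable with a failing remote command. This is a sketch, not a definitive implementation: classify_ssh_status and run_on_hosts are hypothetical helper names, and it relies on ssh exiting with 255 when the connection itself fails.

```shell
#!/bin/bash
# classify_ssh_status STATUS: hypothetical helper that turns an ssh exit
# status into a short description.
classify_ssh_status() {
    case $1 in
        0)   echo "ok" ;;
        255) echo "connection failed" ;;
        *)   echo "remote command failed (status $1)" ;;
    esac
}

# run_on_hosts FILE: hypothetical helper that runs the remote command from
# the question on every host listed in FILE, one host per line.
run_on_hosts() {
    local host status
    while IFS= read -r host <&3; do
        [[ -n $host ]] || continue    # skip empty lines
        # -n keeps ssh off stdin; BatchMode=yes fails instead of prompting.
        ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" 'hostname; sudo ethtool eth1'
        status=$?
        echo "$host: $(classify_ssh_status "$status")" >&2
    done 3< "$1"
}

# Example call:
# run_on_hosts /tmp/hosts
```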
You can check the return value that ssh gives you as originally shown here:
How to create a bash script to check the SSH connection?
$ ssh -q user@downhost exit
$ echo $?
255
$ ssh -q user@uphost exit
$ echo $?
0
EDIT - I cheated and used nc
Something like this:
#!/bin/bash
ssh_port_is_open() { nc -z ${1:?hostname} 22 > /dev/null; }
for host in `cat /tmp/hosts` ; do
if ssh_port_is_open $host; then
ssh -o "BatchMode=yes" $host 'hostname; sudo ethtool eth1';
else
echo " $host Down"
fi
done
