Reboot if web service (tomcat7) is not responding - shell

I am running a web app on a Tomcat server. A hard-to-detect problem in the server code causes it to crash once or twice every day. I will dig in and fix it when I have time, but until then, restarting Tomcat (/etc/init.d/tomcat7 restart) or simply rebooting the machine when the problem occurs seems a good enough solution. I want to check the liveness of the server with wget rather than grep or something similar, because my service may be down even though Tomcat is running.
wget localhost:8080/MyService/
outputs
--2012-12-04 14:10:20-- http://localhost:8080/MyService/
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2777 (2.7K) [text/html]
Saving to: “index.html.3”
100%[======================================>] 2,777 --.-K/s in 0s
2012-12-04 14:10:20 (223 MB/s) - “index.html.3” saved [2777/2777]
when my service is up. And outputs
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:8080... failed: Connection refused.
or just hangs after printing
--2012-12-04 14:07:34-- http://localhost:8080/MyService/
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response...
otherwise. Can you suggest a shell script, with a cron job or something else, to do that? I would prefer not to use cron if there is an alternative.

Why not use cron for that? Anyway, I googled for tomcat + watchdog and found the following blog post.
It should give you an idea of how to solve your problem.
hth

OK, I found a solution: add a script under /etc/rc5.d/, as described here: http://www.linuxforums.org/forum/mandriva-linux/27687-how-make-command-run-startup.html
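#!/bin/bash
# Log each boot, then give the service 10 minutes before the first check.
echo "Restarted: $(date)" >> /home/ec2-user/reboot.log
sleep 600
while true
do
    # lastEntry.php prints the age of the most recent table entry
    var=$(php -f /home/ec2-user/lastEntry.php)
    # Output longer than 3 characters is treated as "no entry in the last 10 minutes"
    if [ ${#var} -gt 3 ]
    then
        echo "Restarting: $(date)" >> /home/ec2-user/reboot.log
        reboot
    fi
    sleep 60
done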
where lastEntry.php queries a table to see whether there has been any entry in the last 10 minutes.
<?php
mysql_connect('ip', 'uname', 'pass') or die(mysql_error());
mysql_select_db("db") or die(mysql_error());
$query="SELECT min(now()-time) as last FROM table;";
$result = mysql_query($query)or die(mysql_error());
$row = mysql_fetch_array($result);
echo $row['last'];
?>
There are more straightforward ways to check whether Tomcat is running, but this one looks at the service's most recent output, so it is a more accurate check.
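For comparison, the wget-based check the question asked for could be sketched roughly as follows. This is only a sketch under assumptions: the URL and restart command are the ones from the question, the 10-second timeout is an arbitrary choice, and the function names and the RESTART_CMD variable are hypothetical. The timeout matters because one of the failure modes above is wget hanging on "awaiting response...".

```shell
#!/bin/sh
# Values taken from the question; override them via the environment if needed.
URL="${URL:-http://localhost:8080/MyService/}"
RESTART_CMD="${RESTART_CMD:-/etc/init.d/tomcat7 restart}"

# Succeeds only if the URL answers within 10 seconds (one attempt, page discarded).
service_is_up() {
    wget --quiet --tries=1 --timeout=10 --output-document=/dev/null "$1"
}

# Prints the state and runs the restart command when the check fails.
check_and_restart() {
    if service_is_up "$1"; then
        echo "up"
    else
        echo "down, restarting"
        # Left unquoted on purpose so the command and its argument are split.
        $RESTART_CMD
    fi
}
```

To use it, append check_and_restart "$URL" as the last line and run the script from cron, or call it inside a loop with sleep like the script above.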

Related

Bash script to test status of site

I have a script for testing the status of a site, to then run a command if it is offline. However, I've since realised because the site is proxied through Cloudflare, it always shows the 200 status, even if the site is offline. So I need to come up with another approach. I tried testing the site using curl and HEAD. Both get wrong response (from Cloudflare).
What I have found is that HTTPie command gets the response I need. Although only when I use the -h option (I have no idea why that makes a difference, since visually the output looks identical to when I don't use -h).
Assuming this is an okay way to go about reaching my aim ... I'd like to know how I can test if a certain string appears more than 0 times.
The string is location: https:/// (with three forward slashes).
The command I use to get the header info from the actual site (and not simply from what Cloudflare is dishing up) is, http -h https://example.com/.
I am able to test for the string using, http -h https://example.com | grep -c 'location: https:///'. This will output 1 when the string exists.
What I now want to do is run a command if the output is 1. But this is where I need help. My bash skills are minimal, and I am going about it the wrong way. What I came up with (which doesn't work) is:
#!/bin/bash
STR=$(http -h https://example.com/)
if (( $(grep -c 'location: https:///' $STR) != 1 )); then
echo "Site is UP"
exit
else
echo "Site is DOWN"
sudo wo clean --all && sudo wo stack reload --all
fi
Please explain to me why it's not working, and how to do this correctly.
Thank you.
ADDITIONS:
What the script is testing for is an odd situation in which the site suddenly starts redirecting to, literally, https:///. This obviously causes the site to be down. Safari, for instance, takes this as a redirection to localhost. Chrome simply spits the dummy with a redirect error, ERR_INVALID_REDIRECT.
When this is occurring, the headers from the site are:
HTTP/2 301
server: nginx
date: Thu, 12 May 2022 10:19:58 GMT
content-type: text/html; charset=UTF-8
content-length: 0
location: https:///
x-redirect-by: WordPress
x-powered-by: WordOps
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
referrer-policy: no-referrer, strict-origin-when-cross-origin
x-download-options: noopen
x-srcache-fetch-status: HIT
x-srcache-store-status: BYPASS
I choose to test for the string location: https:/// since that's the most specific (and unique) to this issue. Could also test for HTTP/2 301.
The intention of the script is to remedy the problem when it occurs, as a temporary solution whilst I figure out what's causing Wordpress to generate such an odd redirect. Also in case it happens whilst I am not at work, or sleeping. :-) I will have a cron job running the script every 5 mins, so at least the site is never down for longer than that.
grep reads a file, not a string. Also, you need to quote strings, especially if they might contain whitespace or shell metacharacters.
More tangentially, grep -q is the usual way to check if a string exists at least once. Perhaps see also Why is testing “$?” to see if a command succeeded or not, an anti-pattern?
I can see no reason to save the string in a variable which you only examine once; though if you want to (for debugging reasons etc) probably avoid upper case variables. See also Correct Bash and shell script variable capitalization
The remediation should run only when the site is down, so it belongs in that branch; the explicit exit in the other branch is then unnecessary.
Nothing here is Bash-specific, so I changed the shebang to use sh instead, which is more portable and sometimes faster. Perhaps see also Difference between sh and bash
#!/bin/sh
# The broken redirect header is present only while the site is down
if http -h https://example.com/ | grep -q 'location: https:///'
then
    echo "Site is DOWN"
    sudo wo clean --all && sudo wo stack reload --all
else
    echo "Site is UP"
fi
For basic diagnostics, probably try http://shellcheck.net/ before asking for human assistance.
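One way to try the header test without touching the live site is to separate it from the fetch and feed it headers on standard input. The following is a minimal sketch; the function names has_broken_redirect and site_status are hypothetical, and the sample headers come from the question.

```shell
#!/bin/sh
# Reads response headers on stdin; succeeds if the broken redirect is present.
has_broken_redirect() {
    grep -qi '^location: https:///'
}

# Prints DOWN or UP for the headers given on stdin.
site_status() {
    if has_broken_redirect; then
        echo "DOWN"
    else
        echo "UP"
    fi
}
```

With the real site this would be called as http -h https://example.com/ | site_status, and for a dry run as printf 'location: https:///\n' | site_status.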

wget resolves to a different IP than host

I have a shell script in which I use host to get the IP of the target site to update ufw and allow outbound traffic to that IP. However, when I make the subsequent wget call to the same base URL, it resolves to a different IP, and thus is blocked by ufw. Just to test, I tried pinging the URL, and it returned a different third IP.
We're blocking all outbound traffic by default in ufw, and only enable what we need to go out, so I need the script to update the correct IP so I can wget the content. The IP in each instance (host vs wget) is consistently the same, but they return different values with respect to each other, so I don't think it's simply a DNS issue. How do I get a consistent IP to update the firewall with, so that the subsequent wget request performs successfully? I disabled the firewall as a test, and was able to download from the URL successfully, so the issue is definitely in getting a consistent IP to point to.
HOSTNAME=<name of site to resolve>
LOGFILE=<logfile path>
Current_IP=$(host $HOSTNAME | head -n 1 | cut -d " " -f 4)
#this echoes the correct value
echo $Current_IP
if [ ! -f $LOGFILE ]; then
/usr/sbin/ufw allow out from any to $Current_IP
echo $Current_IP > $LOGFILE
echo New IP address found and logged >> ./download.log
else
Old_IP=$(cat $LOGFILE)
if [ "$Current_IP" = "$Old_IP" ] ; then
echo IP address has not changed >> ./download.log
else
/usr/sbin/ufw delete allow out from any to $Old_IP
/usr/sbin/ufw allow out from any to $Current_IP
echo $Current_IP > $LOGFILE
echo IP Address was updated in ufw >> ./download.log
fi
fi
After the script updates the firewall, a subsequent wget to HOSTNAME attempts to go out to a different IP than the one that was just allowed.
Turns out the difference was "www.". When I was resolving host I was not using www, and when I was using wget I was using www, and thus they resolved to different IPs for this particular site.
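A small sketch of how the lookup could be made less fragile: resolve exactly the name that wget will use, and pick the first address record instead of the first output line, since host may print an alias line first for names such as www. The helper name first_ipv4 is hypothetical, and the addresses in the usage note are documentation placeholders.

```shell
#!/bin/sh
# Reads `host` output on stdin and prints the first IPv4 address found.
first_ipv4() {
    awk '/ has address / { print $4; exit }'
}
```

In the script above this would replace the head/cut pipeline, for example Current_IP=$(host "$HOSTNAME" | first_ipv4), with HOSTNAME set to the same name (including any www. prefix) that is later passed to wget.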

Bash How to kill wget process after a given timeout? [duplicate]

UPDATE:
This question has been closed, so I asked a new one:
Bash how to run multiple wget with output by sequence (NOT Parallel) and delete the wget file for speedtest purpose
I used the timeout solution:
#!/bin/bash
function speedtest() {
local key=$1
local url=$2
timeout 10 wget $url
echo -e "\033[40;32;1m$key is completed.\033[0m"
}
speedtest "Lisbon" "https://lg-lis.fdcservers.net/100MBtest.zip"
speedtest "London" "https://lg-lon.fdcservers.net/100MBtest.zip"
speedtest "Madrid" "https://lg-mad.fdcservers.net/100MBtest.zip"
speedtest "Paris" "https://lg-par2.fdcservers.net/100MBtest.zip"
When I run the bash script, this is the output.
» ./wget_speedtest.sh [2021/05/8 |15:42:49]
Redirecting output to ‘wget-log’.
Lisbon is completed.
Redirecting output to ‘wget-log.1’.
London is completed.
Redirecting output to ‘wget-log.2’.
Madrid is completed.
Redirecting output to ‘wget-log.3’.
Paris is completed.
I expected to see the download speed (KB/s) after each wget had run for 10 seconds.
I have a list of files to download with wget and want to see the download speed for each. I just want each download to run for about 10 seconds and then print the download speed.
I have 20 different server files to test. My goal is to see the download speed in KB/s over those 10 seconds.
e.g.
> wget https://lg-lis.fdcservers.net/100MBtest.zip
--2021-05-08 13:37:37-- https://lg-lis.fdcservers.net/100MBtest.zip
Resolving lg-lis.fdcservers.net (lg-lis.fdcservers.net)... 50.7.43.4
Connecting to lg-lis.fdcservers.net (lg-lis.fdcservers.net)|50.7.43.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104857600 (100M) [application/zip]
Saving to: ‘100MBtest.zip’
100MBtest.zip 0%[ ] 679.66K 174KB/s eta 9m 26s ^C
This is my bash file
#!/bin/bash
function speedtest() {
local key=$1
local url=$2
( cmdpid=$$;
(sleep 10; kill $cmdpid; rm -f 100M) \
& while ! wget "$url"
do
echo -e "\033[40;32;1m$key for 10 seconds done.\033[0m"
done )
}
speedtest "Lisbon" "https://lg-lis.fdcservers.net/100MBtest.zip"
speedtest "London" "https://lg-lon.fdcservers.net/100MBtest.zip"
speedtest "Madrid" "https://lg-mad.fdcservers.net/100MBtest.zip"
speedtest "Paris" "https://lg-par2.fdcservers.net/100MBtest.zip"
However, the above code does not work: it still downloads in the background and redirects its output to wget-log.
> ./wget_speedtest.sh
--2021-05-08 13:41:56-- https://lg-lis.fdcservers.net/100MBtest.zip
Resolving lg-lis.fdcservers.net (lg-lis.fdcservers.net)... 50.7.43.4
Connecting to lg-lis.fdcservers.net (lg-lis.fdcservers.net)|50.7.43.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104857600 (100M) [application/octet-stream]
Saving to: ‘100M’
100M 5%[==> ] 5.84M 1.02MB/s eta 79s [1] 21251 terminated ./wget_speedtest.sh
Redirecting output to ‘wget-log’.
Use the timeout command (part of GNU coreutils):
$ timeout 10 wget https://lg-lis.fdcservers.net/100MBtest.zip
--2021-05-07 22:53:20-- https://lg-lis.fdcservers.net/100MBtest.zip
Resolving lg-lis.fdcservers.net (lg-lis.fdcservers.net)... 50.7.43.4
Connecting to lg-lis.fdcservers.net (lg-lis.fdcservers.net)|50.7.43.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104857600 (100M) [application/zip]
Saving to: ‘100MBtest.zip’
100MBtest.zip 0%[ ] 23.66K 10.8KB/s
$
If the specified time is exceeded, timeout exits with a status of 124.
See timeout --help or info coreutils timeout for more information.
If you're on MacOS, see Timeout command on Mac OS X? for some suggested alternatives.
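The "Redirecting output to ‘wget-log’" lines in the question may appear because timeout runs its command outside the terminal's foreground process group by default, which wget treats as running in the background. The sketch below assumes GNU coreutils timeout, whose --foreground option is meant for that situation; the helper name run_limited is hypothetical, and speedtest mirrors the function from the question.

```shell
#!/bin/bash
# Runs a command for at most $1 seconds and reports when it was cut off.
run_limited() {
    local secs="$1"
    shift
    # --foreground lets the command keep using the terminal for its output
    timeout --foreground "$secs" "$@"
    local status=$?
    # GNU timeout exits with 124 when the time limit was reached
    if [ "$status" -eq 124 ]; then
        echo "stopped after ${secs}s"
    fi
    return "$status"
}

# Downloads for up to 10 seconds, discarding the data, so only the progress shows.
speedtest() {
    local key="$1"
    local url="$2"
    run_limited 10 wget -O /dev/null "$url"
    echo "$key is completed."
}
```

It would be called as in the question, for example speedtest "Lisbon" "https://lg-lis.fdcservers.net/100MBtest.zip". Writing to /dev/null also avoids having to delete the partial file afterwards.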

Bash script - check how many times public IP changes

I am trying to create my first bash script. The goal of this script is to check at what rate my public IP changes. It is a fairly straightforward script. First it checks if the new address is different from the old one. If so, it should update the old one to the new one and print out the date along with the new IP address.
At this point I have created a simple script in order to accomplish this. But I have two main problems.
First, the script keeps printing the IP even though it hasn't changed and I have updated PREV_IP with CUR_IP.
My second problem is that I want the output to direct to a file instead of outputting it into the terminal.
The interval is currently set to 1 second for test purposes. This will change to a higher interval in the final product.
#!/bin/bash
while true
PREV_IP=00
do
CUR_IP=$(curl https://ipinfo.io/ip)
if [ $PREV_IP != "$CUR_IP" ]; then
PREV_IP=$CUR_IP
"$(date)"
echo "$CUR_IP"
sleep 1
fi
done
I also get a really weird output. I have edited my public IP to xx.xxx.xxx.xxx:
Sat 20 Mar 09:45:29 CET 2021
xx.xxx.xxx.xxx
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:--
while true
PREV_IP=00
do
is the reason you are seeing the IP on every iteration. It is the same as while true; PREV_IP=00; do. The exit status of the list true; PREV_IP=00 is the exit status of its last command, and the exit status of an assignment is 0 (success), so the loop always executes. But PREV_IP is reset to 00 on every iteration. This is a typo: you meant to set PREV_IP once, before the loop starts.
"$(date)"
will try to execute the output of the date command as the next command. So it will print:
$ "$(date)"
bash: sob, 20 mar 2021, 10:57:02 CET: command not found
And finally, to silence curl, read man curl first and then find out about -s. I use -sS so errors are also visible.
Do not use uppercase variables in your scripts; prefer lowercase variables. Check your scripts with http://shellcheck.net. Quote variable expansions.
I would sleep each loop. Your script could look like this:
#!/bin/bash
prev=""
while true; do
cur=$(curl -sS https://ipinfo.io/ip)
if [ "$prev" != "$cur" ]; then
prev="$cur"
echo "$(date) $cur"
fi
sleep 1
done
"... I want the output to direct to a file instead of outputting it into the terminal."
Then research how redirection works in shell and how to use it. The simplest would be to redirect echo output.
echo "$(date) $cur" >> "a_file.txt"
"The interval is currently set to 1 second for test purposes. This will change to a higher interval in the final product."
You are still limited by the time it takes to connect to https://ipinfo.io/ip. And from the ipinfo.io documentation:
"Free usage of our API is limited to 50,000 API requests per month."
And finally, I wrote a script, get_ip_external, that tries as many public services as I could find for getting the external IP address. You could use several public services for getting the IPv4 address and pick one at random or round-robin, so that rate limiting doesn't kick in as fast.
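The round-robin idea could look roughly like this. It is only a sketch: pick_service is a hypothetical helper, https://ipinfo.io/ip comes from the question, and the other two URLs are assumed examples of services that return the caller's IP as plain text.

```shell
#!/bin/sh
# Prints the service URL to use for request number $1 (0, 1, 2, ...).
pick_service() {
    n=$1
    # Assumed list of plain-text "what is my IP" services; edit as needed.
    set -- https://ipinfo.io/ip https://api.ipify.org https://icanhazip.com
    # Rotate through the list: skip (n mod list length) entries.
    shift $(( $n % $# ))
    printf '%s\n' "$1"
}
```

Inside the loop above, a counter incremented on each iteration would then select the service, for example cur=$(curl -sS "$(pick_service "$count")").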

How to make script in bash aware that a server is still busy installing/configuring and wait for reboot?

The issue / dilemma
I am currently busy creating a script to kickstart servers (with CentOS 6.x and CentOS 7.x) remotely. So far the script is working, but hangs on one minor thing. Well actually it does not hang, but it does not give detailed information about what is happening. In other words, I am not getting the correct information back in bash about the job being finished correctly.
I have tried various things, however it's hanging with the following message (which is being repeated endlessly):
servername is still installing and configuring packages...
PING 100.125.150.175 (100.125.150.175) 56(84) bytes of data.
64 bytes from 100.125.150.175: icmp_seq=1 ttl=63 time=0.152 ms
64 bytes from 100.125.150.175: icmp_seq=2 ttl=63 time=0.157 ms
64 bytes from 100.125.150.175: icmp_seq=3 ttl=63 time=0.157 ms
64 bytes from 100.125.150.175: icmp_seq=4 ttl=63 time=0.143 ms
64 bytes from 100.125.150.175: icmp_seq=5 ttl=63 time=0.182 ms
--- 100.125.150.175 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 120025ms
rtt min/avg/max/mdev = 0.143/0.158/0.182/0.015 ms
servername is still installing and configuring packages...
PING 100.125.150.175 (100.125.150.175) 56(84) bytes of data.
64 bytes from 100.125.150.175: icmp_seq=1 ttl=63 time=0.153 ms
64 bytes from 100.125.150.175: icmp_seq=2 ttl=63 time=0.132 ms
64 bytes from 100.125.150.175: icmp_seq=3 ttl=63 time=0.142 ms
etc....
So for some reason it does not continue to the next line of code or perform the next action. Since it's only feedback to me (or another user), it's not a major issue. But it would be nice to get this working and have it report (detailed) information about the current progress or what the script/server is actually doing at the moment. Unfortunately, this is not the case for the above (last) piece of code.
This is the current code snippet I have (yes, it's a mess):
while true;
do
#ping -c3 -i3 $HWNODEIP > /dev/null
#ping -c5 -i30 $HWNODEIP > /dev/null
ping -c5 -i30 $HWNODEIP
if [ $? -eq 1 ] || [ $? -eq 2 ] || [ $? -eq 68 ]
then
echo -e " "
echo -e "Kickstart part II also done. $HOSTNAME will be rebooted one more time."
sleep 5
######return 0
echo -e " "
printf "%s" "Waiting for $HOSTNAME to come back online: "
while ! ping -c 1 -n -w 30 $HWNODEIP &> /dev/null
do
printf "%c" "."
#sleep 10
done
echo -e " "
echo -e "Reboot is done and $HOSTNAME is back online. Performing final check. Please wait..."
sleep 10
echo -e " "
sudo /usr/local/collectHWdata.pl $HWNODEIP
ssh root@$HWNODEIP "while ! test -e /root/kickstart-DONE; do sleep 3; done; echo KICKSTART IS DONE\!"
echo -e " "
exit
else
echo -e " "
echo -e "$HOSTNAME is still installing and configuring packages..."
fi
done
Sidenote: I removed > /dev/null #5 for debugging (not that it helped)
I am guessing I am using things incorrectly, and I am by no means an experienced scripter; I can only do minor stuff, but of course I am doing my best. I have been fooling around with this since last week and still have no result on this part.
What am I trying to achieve?
The server is rebooted after the selected CentOS version, creating partitions and setting up the network. This all works. The above snippet is after that reboot. Now it will install packages I selected, configure various things (like Nagios) and install/compile certain PERL modules. And a few other minor things.
This is done correctly in the background. I wanted the script (the above piece of code) to report that the server is still busy installing things and such. Since I lack the knowledge to do that, I decided on a different approach: check if the server is online (in other words, that it's still installing). As long as the server is online, it's obviously still installing/configuring things. After that is done, the server will reboot once more to perform the final 2 commands (as seen in my snippet). However (here is the problem), it never runs those commands, even though the kickstart is completely done.
So I am guessing I am doing something wrong and might even have messed things up (or got confused by doing so). Maybe someone has an idea, a solution or a completely different approach to tackle and fix this problem (or at least I hope so).
Other things I have tried so far? Well, I tried various ping commands and I also tried nc (netcat), but also without a good result. Every single time I hit a brick wall with the last 2 commands, and it keeps pinging instead of showing that the kickstart was done... I think I have spent several hours (since last week) on this already without getting anywhere.
So I am hoping someone can take a look at this and tell me what I am doing wrong and maybe there is a better approach (other than pinging a server) to see if it's still busy. Maybe a (remote) check on yum, perl or a service, so that the script knows it's still busy.
Sorry for the long post, but I know when I provide as much information as possible including code examples and results, this is more "appreciated". So I am hoping I provided adequate information. If not, let me know. I will try to add as much information as I can. As always I am always willing to learn or change my approach.
Thank you already for reading my post!
As noted in the comments under the question:
The server may already have rebooted by the time ping -c5 -i30 $HWNODEIP finishes. The command sends 5 packets (-c flag), waiting 30 seconds between packets (-i interval flag). Five packets mean four intervals, so that's about 4*30 = 120 seconds, which matches the time 120025ms in the statistics above. A server could reboot just fine within 2 minutes, especially if an SSD is in use. So try lowering the total time this command takes to complete.
[ $? -eq 68 ] is probably unnecessary. $HWNODEIP is just an IP address, and exit code 68 is for a domain name not being resolved, which doesn't apply to IP addresses.
The if statement could be simplified to
if ! ping -c5 -i30 "$HWNODEIP"
These are minor suggestions, probably not bulletproof. As confirmed by OP in the comments, lowering the interval helps. There are other small improvements that could be made (like quoting variables), but that's outside the scope of the question, so I'll leave it for now.
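One more detail in the original test: in [ $? -eq 1 ] || [ $? -eq 2 ], the second $? holds the exit status of the first [ ... ] test rather than that of ping, which is another reason to test the command directly. A rough sketch of the "wait until it goes down, then wait until it is back" logic follows; the function names, retry counts and delays are hypothetical, and the ping flags are the Linux ones used in the question.

```shell
#!/bin/sh
# wait_until TRIES DELAY CMD...: rerun CMD until it succeeds, at most TRIES times.
wait_until() {
    wu_tries=$1
    wu_delay=$2
    shift 2
    wu_i=0
    while [ "$wu_i" -lt "$wu_tries" ]; do
        if "$@"; then
            return 0
        fi
        wu_i=$((wu_i + 1))
        sleep "$wu_delay"
    done
    return 1
}

# One quick probe: a single packet with a 5-second deadline.
host_up() {
    ping -c 1 -n -w 5 "$1" > /dev/null 2>&1
}

host_down() {
    ! host_up "$1"
}
```

The script could then call wait_until 360 5 host_down "$HWNODEIP" to notice the reboot starting, followed by wait_until 120 5 host_up "$HWNODEIP" before running the final commands. Short probes keep the window in which a fast reboot can be missed small.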

Resources