I am fairly new to bash scripting. I have an issue with a cron job: I get too many emails whenever the "ntpq: read: Connection refused" error comes up. I want to add a conditional so that no email is sent when this error shows up.
However, I can't seem to parse the output of "ntpq -nc peers". I tried redirecting the output of the cron job to a test.txt file and then creating another cron job that parses that file, but I feel there must be a better solution.
Thanks for your help!
Here is my code for the cronjob
#!/bin/bash
limit=10101010101010101010000 # Set your limit in milliseconds here
offsets=$(/usr/sbin/ntpq -nc peers | /usr/bin/tail -n +3 | awk 'BEGIN { FS = " " } ; { print $9 }' | /usr/bin/tr -d '-')
for offset in ${offsets}; do
if echo $offset $limit | awk '{exit $1>$2?0:1}'
then
echo "NTPD offset $offset > $limit. Please investigate"
exit 1
fi
done
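One possible way to skip the email is to capture the ntpq output once, stderr included, and stop quietly when it contains the error text. This is only a sketch: it assumes the message appears exactly as quoted above, and `peer_offsets` is a hypothetical helper name, not part of ntpq.

```shell
#!/bin/bash
# Hypothetical helper: reads the combined stdout/stderr of "ntpq -nc peers"
# on stdin. Returns 2 without printing anything if the daemon refused the
# connection; otherwise prints the absolute offset of each peer.
peer_offsets() {
    local out
    out=$(cat)
    if printf '%s\n' "$out" | grep -q 'Connection refused'; then
        return 2
    fi
    printf '%s\n' "$out" | tail -n +3 | awk '{ print $9 }' | tr -d '-'
}

# Intended use in the cron script (cron only sends mail when there is output):
#   offsets=$(/usr/sbin/ntpq -nc peers 2>&1 | peer_offsets) || exit 0
```

The rest of the loop over `${offsets}` can stay as it is; the early `exit 0` simply prevents any output when ntpd is not reachable.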
I originally have this Lua script
function temp_watch()
warn_value=60
crit_value=80
temperature=tonumber(conky_parse("${hwmon 1 temp 1}"))
if temperature<warn_value then
settings_table[1]['fg_colour']=normal
elseif temperature<crit_value then
settings_table[1]['fg_colour']=warn
else
settings_table[1]['fg_colour']=crit
end
end
but for some reason, hwmon 1 temp 1 is just stuck reporting 25C. For this reason I switched to sensors. In conky I am executing it using this:
${exec sensors | grep 'Package id 0' | cut -d ' ' -f 5 | cut -c 2,3,4,5,6,7}
I tried using this solution: https://unix.stackexchange.com/questions/666250/how-to-use-conky-variable-with-external-command. Basically, I replaced the temperature=tonumber... line with
temperature=tonumber(conky_parse("${eval $${exec sensors | grep 'Package id 0' | cut -d ' ' -f 5 | cut -c 2,3,4,5,6,7}}"))
I also tried the approach from "is it possible to pipe output from commandline to lua?", replacing the temperature=tonumber... line with
local cpu_tmp = io.popen("exec sensors | grep 'Package id 0' | cut -d ' ' -f 5 | cut -c 2,3,4,5,6,7")
temperature=tonumber(cpu_tmp)
Both outputted this error:
llua_do_call: function conky_main execution failed: /home/joe/conky/conky-grapes/rings-v2_gen.lua:530: attempt to compare nil with number
Am I missing some variable conversion, or is there another syntax for executing bash from Lua?
Thanks in advance :-)
lua.org gave me an answer, in the section "The Complete I/O Model".
local file= io.popen("sensors -u | awk '/temp1_input:/ {print $2; exit}'")
local temperature = tonumber(file:read('*a'))
<SOME CODE HERE>
file:close()
end
It is not elegant, but it seems to work. I will keep the Lua script running for quite some time to make sure it does not hit a "too many open files" error.
Many thanks to @Fravadona for pointing me in the right direction :-)
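The awk filter passed to io.popen can be checked on its own in a shell, without conky or Lua. The sketch below feeds it a made-up sample in the style of `sensors -u` output; the sample text and the function name `first_temp1_input` are illustrative assumptions, and real output differs per machine.

```shell
# Same awk filter as in the io.popen call, reading "sensors -u" style text
# from stdin so it can be tried without the sensors binary.
first_temp1_input() {
    awk '/temp1_input:/ { print $2; exit }'
}

# Illustrative sample only, not captured from a real machine.
sample='coretemp-isa-0000
Adapter: ISA adapter
Package id 0:
  temp1_input: 47.000
  temp1_max: 100.000
Core 0:
  temp2_input: 45.000'

# Prints the value that follows the first "temp1_input:" label.
printf '%s\n' "$sample" | first_temp1_input
```

Because awk exits after the first match, only one number reaches tonumber, which is what avoids the nil comparison.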
I would like to get an opinion on how best to do this in bash, thank you.
For x number of servers, each has its own list of replication agreements and their status. It's easy to run a few commands and get this data, e.g.;
get servers, output (setting/variable in/from a local config file);
. ./ldap-config ; echo "$MASTER $REPLICAS"
dc1-server1 dc1-server2 dc2-server1 dc2-server2 dc3...
for dc1-server1, get agreements, output;
ipa-replica-manage -p $(cat ~/.dspw) list -v $SERVER.$DOMAIN | grep ': replica' | sed 's/: replica//'
dc2-server1
dc3-server1
dc4-server1
for dc1-server1, get agreement status codes, output;
ipa-replica-manage -p $(cat ~/.dspw) list -v $SERVER.$DOMAIN | grep 'status: Error (' | sed -e 's/.*status: Error (//' -e 's/).*//'
0
0
18
so output would be several columns based on the 'get servers' list with each 'replica: status' under each server, for that server
looking to achieve something like;
dc2-server1: 0 dc2-server2: 0 dc1-server1: 0 ...
dc3-server1: 0 dc3-server2: 18 dc3-server1: 13 ...
dc4-server1: 18 dc4-server2: 0 dc4-server1: 0 ...
Generally eval is considered evil. Nevertheless, I'm going to use it.
paste is handy for printing files side-by-side.
Bash process substitutions can be used where you'd use a filename.
So, I'm going to dynamically build up a paste command and then eval it
I'm going to use get.sh as a placeholder for your mystery commands.
cmd="paste"
while read -ra servers; do
for server in "${servers[@]}"; do
cmd+=" <(./get.sh \"$server\" agreements | sed 's/\$/:/')"
cmd+=" <(./get.sh \"$server\" status)"
done
done < <(./get.sh servers)
eval "$cmd" | column -t
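If you would rather avoid eval, a similar side-by-side table can be built by writing each column to a temporary file and handing the file list to paste. This is a different technique from the one above, shown as a sketch: `get` is a hypothetical stub standing in for get.sh, returning canned data, and `side_by_side` is an illustrative name.

```shell
#!/bin/bash
# Hypothetical stub standing in for get.sh; it returns canned data so the
# sketch can run anywhere.
get() {
    case "$2" in
        agreements) printf '%s\n' "$1-peer1" "$1-peer2" ;;
        status)     printf '%s\n' 0 18 ;;
    esac
}

# Write each server's two columns to temporary files, then paste them.
side_by_side() {
    local tmpdir server files=()
    tmpdir=$(mktemp -d) || return 1
    for server in "$@"; do
        get "$server" agreements | sed 's/$/:/' > "$tmpdir/$server.agreements"
        get "$server" status > "$tmpdir/$server.status"
        files+=("$tmpdir/$server.agreements" "$tmpdir/$server.status")
    done
    paste "${files[@]}"
    rm -rf "$tmpdir"
}

# Tab-separated output; pipe it through "column -t" to align the columns.
side_by_side dc1 dc2
```

The trade-off is a few short-lived files in exchange for never passing server names through eval.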
I want to find errors which have occurred in log file during the last hour. This is because I plan to schedule the script every hour in cron. I want to search for multiple error patterns and want to send mail if any one of those error patterns have been found during the last hour. Can someone please help me?
#Script which I tried to write is below
#!/bin/bash
SUBJECT="Critical errors found on $HOSTNAME"
TO="abc@example.com"
FNAME="/var/log/log4j/test.log"
PATTERN1="StackOverflow"
PATTERN2="OutOfMemory"
if [ ! -f stack.txt ]; then
touch stack.txt
fi
if [ ! -f comp_stack.txt ]; then
touch comp_stack.txt
fi
#first 19 bytes of log entry represents date/timestamp
cat stack.txt > comp_stack.txt
first_date="$(head -c19 comp_stack.txt)"
echo "first date is $first_date"
tac "$FNAME" | grep -m 1 -i "$PATTERN1\|$PATTERN2" > stack.txt
next_date="$(head -c19 stack.txt)"
echo "next date is $next_date"
if [ -s stack.txt ] && [ "$next_date" != "$first_date" ]; then
echo "dates not equal and file exists"
mail -s "$SUBJECT" "$TO" < stack.txt
fi
My script is creating 2 files stack.txt and comp_stack.txt if they don't exist.
It is supposed to search for pattern1 and pattern2 during last hour (I used "-m 1" to achieve this, but it gives me only 1 pattern in the output even if there are multiple patterns which match the error).
As I don't want the script to report same error multiple times, I am saving and comparing dates in my .txt files. So that error will be reported only if error timestamp is different from what was previously saved in the .txt file.
I used "tac" to search my log files from bottom to top as my log files are huge and I want to save time.
I am new to linux and scripting. Please help me write a working script.
Here is one line sample of my log file -
2020-09-01 01:27:16,500 | DEBUG | WebContainer : 17 | | | com.hertz.rates.common.utils.jdbc.RecordCallableStatement | DB Response: Activate/Cancel | 65 | READY | 2020-09-01 01:27:16,500 | DEBUG | WebContainer : 17 | | | com.hertz.rates.common.utils.jdbc.RecordCallableStatement | DB Response: Activate/Cancel | 65 |
Try this
mail="abc@example.com" # email address
subj="Critical errors found on $HOSTNAME" # email subject
logf="/var/log/log4j/test.log" # log file name
errs="StackOverflow|OutOfMemory" # define errors to check
printf -v time '%(%Y-%m-%d %H:)T' # get time template
check=$(grep "$time" "$logf" | grep -E "$errs") # -E so that | means "or"; read from the pipe, not the file again
[[ $check ]] && mail -s "$subj" "$mail" <<< "$check"
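Note that a template built from the current time matches the current clock hour, not the previous sixty minutes. A small sketch of the same two-stage filter with the hour prefix passed in as a parameter is below, so the hourly cron job could ask for the previous hour instead. The function name `errors_in_hour` and the sample log lines are assumptions for illustration.

```shell
# Print log lines (read from stdin) whose timestamp starts with the given
# "YYYY-MM-DD HH:" prefix and which match the extended regex of errors.
errors_in_hour() {
    local prefix=$1 patterns=$2
    grep "^$prefix" | grep -E "$patterns"
}

# Illustrative log lines, not taken from a real system.
log='2020-09-01 01:27:16,500 | ERROR | java.lang.OutOfMemoryError
2020-09-01 01:30:00,000 | DEBUG | all fine
2020-09-01 02:05:00,000 | ERROR | java.lang.StackOverflowError'

printf '%s\n' "$log" | errors_in_hour '2020-09-01 01:' 'StackOverflow|OutOfMemory'

# With GNU date, the previous hour's prefix could be built like this:
#   prefix=$(date -d '1 hour ago' '+%Y-%m-%d %H:')
```

Run a few minutes past each hour, this reports every matching line from the hour that just ended, so there is no need to remember the last reported timestamp.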
I'm building an interface to basically list computers on my local network that are 'alive' and more or less have a list of these nodes, and their 'status'.
I've created a file called farm_ping.sh located under /Volumes/raid/farm_scripts/_apps/_scripts/farm_ping.sh
This file contains the following, which simply pings the IP and writes its result to a txt file named after that IP:
HOSTS="192.168.1.110"
# no ping request
COUNT=1
for myHost in $HOSTS
do
count=$(ping -c $COUNT $myHost | grep 'received' | awk -F',' '{ print $2 }' | awk '{ print $1 }')
if [ $count -eq 0 ]; then
# 100% failed
echo "Host : $myHost is down (ping failed) at $(date)" > /Volumes/raid/farm_script/nodes_response/$myHost.txt
else
# 100% Passed
echo "Host : $myHost is running (ping successful) at $(date)" > /Volumes/raid/farm_script/nodes_response/$myHost.txt
fi
done
I want to run this script every minute. Here's what I did to create the cron job:
env EDITOR=nano crontab -e
And in the cron job I wrote:
1 * * * * /Volumes/raid/farm_script/_apps/_scripts/farm_ping.sh
I saved this file, but it's been 30 minutes and nothing has been written yet. What have I done wrong?
Issue was cronjob syntax.
1 * * * *
In a crontab, * means every possible value and a number means one particular value. So this literally means minute 1 of every hour of every day of every month, on every day of the week, i.e. once an hour.
Replacing the 1 with a * makes it run every minute.
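To make the difference concrete, here is a toy matcher for the minute field. It is only a sketch of the rule described above: `minute_matches` is a made-up name, and it understands just `*` and a single number, not the lists, ranges or steps that real cron accepts.

```shell
# Toy matcher for a crontab minute field: "*" matches any minute,
# a number matches only that minute.
minute_matches() {
    local field=$1 minute=$2
    case "$field" in
        '*') return 0 ;;
        *)   [ "$field" -eq "$minute" ] ;;
    esac
}

# Show which of the first few minutes of an hour each field would fire on.
for minute in 0 1 2; do
    minute_matches 1 "$minute"   && echo "field 1 fires at minute $minute"
    minute_matches '*' "$minute" && echo "field * fires at minute $minute"
done
```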
I'm having an issue porting my bash script to Nagios. The script works fine when I run it on the console, but when I run it from Nagios I get the message "(null)". In the Nagios debug log I can see that it parses the script fine, but it still returns that error message.
I'm not very good at scripting, so I guess I'll need some help.
The objective of the script is to check the *.ear versions on some servers, md5 them, and compare the output to see whether the versions match.
To do that, I have a JSON document on these servers that prints the name of each *.ear and its md5.
The first part of the script gets that info from the JSON with curl and stores just the md5 in a temp file per server. It then compares both temp files, and if they match I get the $STATE_OK message. If they don't, it creates a .datetmp file with the date (the objective is to print a message after 48 hours of inconsistency). Then I diff the date in the .datetmp file against the current time: if the result is less than 48 hours it prints $STATE_WARNING, and if it is more than 48 hours it prints $STATE_CRITICAL.
The syntax of the script is " $ sh script.sh nameoftheear.ear server1 server2 "
Thanks in advance
#/bin/bash
#Variables For Nagios
cont=$1
bas1=$2
bas2=$3
## Here you set the servers hostname
svr1= curl -s "http://$bas1.domain.com:7877/apps.json" | grep -Po '"EAR File":.*? [^\\]",' | grep $cont | awk '{ print $5 }' > .$cont-tmpsvr1
svr2= curl -s "http://$bas2.domain.com:7877/apps.json" | grep -Po '"EAR File":.*? [^\\]",' | grep $cont | awk '{ print $5 }' > .$cont-tmpsvr2
file1=.$cont-tmpsvr1
file2=.$cont-tmpsvr2
md51=$(head -n 1 .$cont-tmpsvr1)
md52=$(head -n 1 .$cont-tmpsvr2)
datenow=$(date +%s)
#Error Msg
ERR_WAR="Not updated $bas1: $cont $md51 --- $bas2: $cont $md52 "
ERR_CRI="48 hs un-updated $bas1: $cont $md51 --- $bas2: $cont $md52 "
OK_MSG="Is up to date $bas1: $cont $md51 --- $bas2: $cont $md52 "
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
##Matching md5 Files
if cmp -s "$file1" "$file2"
then
echo $STATE_OK
echo $OK_MSG
# I do the rm to delete the date tmp file so i can get the $STATE_OK or $STATE_WARNING
rm .$cont-datetmp
exit 0
elif
echo $datenow >> .$cont-datetmp
#Vars to set modification date
datetmp=$(head -n 1 .$cont-datetmp)
diffdate=$(( ($datenow - $datetmp) /60 ))
#This var is to set the time of the critical ERR
days=$((48*60))
[ $diffdate -lt $days ]
then
echo $STATE_WARNING
echo $ERR_WAR
exit 1
else
echo $STATE_CRITICAL
echo $ERR_CRI
exit 2
fi
I am guessing some kind of permission problem. More specifically, I don't think the nagios user can write to its own home directory. You can either fix those permissions or write to a file in /tmp (and consider using mktemp?).
...but ideally you'd skip writing all those files, as far as I can see all of those comparisons etc could be kept in memory.
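As a rough sketch of the in-memory idea: the two checksums can live in shell variables and be compared as strings, with no temp files involved. `md5_state` is a hypothetical helper name, and the rule that an empty value never counts as a match is an assumption added here to guard against failed downloads.

```shell
# Compare two checksum strings held in variables instead of temp files.
# An empty value (e.g. a failed download) is never treated as a match.
md5_state() {
    if [ -n "$1" ] && [ "$1" = "$2" ]; then
        echo match
    else
        echo mismatch
    fi
}

# In the plugin, md51 and md52 would be filled by command substitution
# around the existing curl pipelines rather than by reading temp files.
md5_state "d41d8cd98f00b204e9800998ecf8427e" "d41d8cd98f00b204e9800998ecf8427e"
```

Only the timestamp of the first mismatch still needs to persist between runs, so that is the one file worth keeping.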
UPDATE
Looked at your script again - I see some obvious errors you can look into:
You are printing out the exit value before you print the message.
You print the exit value rather than exit with the exit value.
...so this:
echo $STATE_WARNING
echo $ERR_WAR
exit 1
Should rather be:
echo $ERR_WAR
exit $STATE_WARNING
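The same pattern applies to all three branches, so it may be worth wrapping in a small function. A minimal sketch is below, assuming the usual plugin convention that Nagios reads the status text from stdout and the state from the exit code (0 OK, 1 WARNING, 2 CRITICAL); `report` is a made-up helper name.

```shell
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2

# Print the status text, then hand the state back as the return code so
# the caller can exit with it.
report() {
    local message=$1 state=$2
    echo "$message"
    return "$state"
}

# In the plugin each branch would end with, for example:
#   report "$ERR_WAR" "$STATE_WARNING"; exit $?
```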
Also, I am wondering whether this is really the script or whether something got lost when pasting. There seems to be a missing 'if' and a superfluous line break in the last piece of code. It should rather be:
if [ $diffdate -lt $days ]
then
...
else
...
fi
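Filled in, the warning/critical decision could look like the sketch below. It keeps the 48-hour limit in minutes as in the question; `state_for_age` is a hypothetical name and takes the two epoch timestamps as arguments so it can be tried without the .datetmp file.

```shell
# Decide the state from the time of the first mismatch and the current
# time (both in epoch seconds) and a limit in minutes.
# Prints 1 (WARNING) while the mismatch is younger than the limit,
# otherwise 2 (CRITICAL).
state_for_age() {
    local first_seen=$1 now=$2 limit_minutes=$3
    local age_minutes=$(( (now - first_seen) / 60 ))
    if [ "$age_minutes" -lt "$limit_minutes" ]; then
        echo 1
    else
        echo 2
    fi
}

# A mismatch first seen one hour ago, with a 48-hour limit.
state_for_age 0 3600 $((48 * 60))
```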