We have a script which is checking and sending an alert if process goes down. For some reason it is not capturing it properly for all the users and not sending the alerts in all scenarios.
Please suggest what could be the problem.
Environments – uatwrk1, uatwrk2, uatwrk3 ------- uatwrk100
ServerName - myuatserver
Process to be checked - Amc/apache/bin/httpd
Script is :
#!/bin/ksh
i=1
while (( i<=100 ))
do
myuser=uatwrk$i
NoOfProcess=`ps -ef | grep -v grep | grep $myuser | grep "Amc/apache/bin/httpd" | wc -l`
if [[ $NoOfProcess -eq 0 ]]
then
echo "Amc process is down, sending an alert"
# Assume sendAlert.ksh is fine
./sendAlert.ksh
else
echo "Amc process is running fine" >> /dev/null
fi
(( i+=1 ))
done
I think #Mahesh already indicated the problem in a comment.
When you only want to have a mail once, you can count the users running a httpd process. The backslash in the following command is for avoiding grep -v grep.
ps -ef | grep "A\mc/apache/bin/httpd" | cut -d " " -f1 | grep "^uatwrk"| sort -u | wc -l
Related
I have a script that will run and fill up a drive without remorse. I want to set a highwater mark of 90%.
I created a cron that will check if the script is installed and if so if running.
#!/bin/sh
COUNTER=$(df -Ph | grep -vE '^tmpfs|cdrom' | sed s/%//g | sed 1d | awk '{ if($5 > 90) print $1;}' | wc -l)
if [[ -e /var/stats/automation/stats-collection.pl ]]; then
_running=$(ps faux | grep stats-collection.pl | grep -v grep | awk '{print $2}')
if [[ $_running -eq 1 ]]; then
echo "Stats running"
if [[ $COUNTER -gr 0 ]]; then
echo "Drive found above highwater mark. Ending Platform stats."
kill $_running
fi
else
echo "Platform stats not running."
else
echo "Platform stats not found"
fi
How would I combine the benefits of a continuously running cron job to run and stop the offending script from running. Would a while loop be the answer?
I am trying to capture the number of active processes by running a command and trying to capture the result in a variable of shell script, but unfortunately, nothing is getting captured. The code is as below:
#!/bin/ksh
## Checking whether or not the Previous Build is Completed
count_build_status=`ps -ef | grep BDD_PreCheck.sh | grep -c FT_BGmgmt` | tee -a ${logFile}
echo "The Value of Count Build Status is $count_build_status"
if [[ "${count_build_status}" != "0" ]]
then
echo INFO - The previous build has not ended yet. Please Wait for some time or contact the Administrator | tee -a $logFile
exit 1;
fi
exit 0;
Here, ps -ef | grep BDD_PreCheck.sh | grep -c FT_BGmgmt gives result as 0 if executed individually, but the value stored in 'count_build_status' is null.
Can anyone help?
The problem here is backtick, which does not assign value to the variable when tee'ing is not done inside backtick. It should be included at the end of logFile as in the code below.
#!/bin/bash
logFile=log.txt
## Checking whether or not the Previous Build is Completed
count_build_status=`ps -ef | grep BDD_PreCheck.sh | grep -c FT_BGmgmt | tee -a ${logFile}`
echo "The Value of Count Build Status is $count_build_status"
if [[ "${count_build_status}" != "0" ]]
then
echo INFO - The previous build has not ended yet. Please Wait for some time or contact the Administrator | tee -a $logFile
exit 1;
fi
exit 0;
example: capture output of a cmd-line to a variable and append a logfile with tee:
var=`ps -ef | grep 0 | grep -c 1 | tee -a log`
echo $var
Context
Got a daft script that checks a process is running on a group of hosts, like a watchdog, as I say it's a daft script so bear in mind it isn't 'perfect' by scripting standards
Problem
I've ran bash -x and can see that the script finishes its first check without actually redirecting the output of the command to the file which is very frustrating, it means each host is actually being evaluated to the last hosts output
Code
#!/bin/bash
FILE='OUTPUT'
for host in $(cat /etc/hosts | grep webserver.[2][1-2][0-2][0-9] | awk {' print $2 ' })
do ssh -n -f $host -i <sshkey> 'ps ax | grep myprocess | wc -l' > $FILE 2> /dev/null
cat $FILE
if grep '1' $FILE ; then
echo "Process is NOT running on $host"
cat $FILE
else
cat $FILE
echo "ALL OK on $host"
fi
cat $FILE
done
Script traceback
++ cat /etc/hosts
++ awk '{ print $2 }'
++ grep 'webserver.[2][1-2][0-2][0-9]'
+ for host in '$(cat /etc/hosts | grep webserver.[2][1-2][0-2][0-9] | awk {'\'' print $2 '\''})'
+ ssh -n -f webserver.2100 -i <omitted> 'ps ax | grep myprocess | wc -l'
+ cat OUTPUT
+ grep 1 OUTPUT
+ cat OUTPUT
+ echo 'ALL OK on webserver.2100'
ALL OK on webserver.2100
+ cat OUTPUT
+ printf 'webserver.2100 checked \n'
webserver.2100 checked
+ for host in '$(cat /etc/hosts | grep webserver.[2][1-2][0-2][0-9] | awk {'\'' print $2 '\''})'
+ ssh -n -f webserver.2101 -i <omitted> 'ps ax | grep myprocess | wc -l'
+ cat OUTPUT
2
+ grep 1 OUTPUT
+ cat OUTPUT
2
+ echo 'ALL OK on webserver.2101'
ALL OK on webserver.2101
+ cat OUTPUT
2
+ printf 'webserver.2101 checked \n'
webserver.2101 checked
Issue
As you can see, it's registering nothing for the first host, then after it is done, it's piping the data into the file, then the second host is being evaluated for the previous hosts data...
I suspect its to do with redirection, but in my eyes this should work, it doesn't so it's frustrating.
I think you're assuming that ps ax | grep myprocess will always return at least one line (the grep process). I'm not sure that's true. I'd rewrite that like this:
awk '/webserver.[2][1-2][0-2][0-9]/ {print $2}' /etc/hosts | while IFS= read -r host; do
output=$( ssh -n -f "$host" -i "$sshkey" 'ps ax | grep "[m]yprocess"' )
if [[ -z "$output" ]]; then
echo "Process is NOT running on $host"
else
echo "ALL OK on $host"
fi
done
This trick ps ax | grep "[m]yprocess" effectively removes the grep process from the ps output:
the string "myprocess" matches the regular expression "[m]yprocess" (that's the running "myprocess" process), but
the string "[m]yprocess" does not match the regular expression "[m]yprocess" (that's the running "grep" process)
I want to check if my VPN is connected to a specific country. The VPN client has a status option but sometimes it doesn't return the correct country, so I wrote a script to check if I'm for instance connected to Sweden. My script looks like this:
#!/bin/bash
country=Sweden
service=expressvpn
while true; do
if ((curl -s https://www.iplocation.net/find-ip-address | grep $country | grep -v "grep" | wc -l) > 0 )
then
echo "$service connected!!!"
else
echo "$service not connected!"
$service connect $country
fi;
sleep 5;
done
The problem is, it always says "service connected", even when it isn't. When I enter the curl command manually, wc -l returns 0 if it didn't find Sweden and 1 when it does. What's wrong with the if statement?
Thank you
Peter
(( )) enters a math context -- anything inside it is interpreted as a mathematical expression. (You want your code to be interpreted as a math expression -- otherwise, > 0 would be creating a file named 0 and storing wc -l's output in that file, not comparing the output of wc -l to 0).
Since you aren't using )) on the closing side, this is presumably exactly what's happening: You're storing the output of wc -l in a file named 0, and then using its exit status (successful, since it didn't fail) to decide to follow the truthy branch of the if statement. [Just adding more parens on the closing side won't fix this, either, since curl -s ... isn't valid math syntax].
Now, if you want to go the math approach, what you can do is run a command substitution, which replaces the command with its output; that is a math expression:
# smallest possible change that works -- but don't do this; see other sections
if (( $(curl -s https://www.iplocation.net/find-ip-address | grep $country | grep -v "grep" | wc -l) > 0 )); then
...if your curl | grep | grep | wc becomes 5, then after the command substitution this looks like:
if (( 5 > 0 )); then
...and that does what you'd expect.
That said, this is silly. You want to know if your target country is in curl's output? Just check for that directly with shell builtins alone:
if [[ $(curl -s https://www.iplocation.net/find-ip-address) = *"$country"* ]]; then
echo "Found $country in output of curl" >&2
fi
...or, if you really want to use grep, use grep -q (which suppresses output), and check its exit status (which is zero, and thus truthy, if and only if it successfully found a match):
if curl -s https://www.iplocation.net/find-ip-address | grep -q -e "$country"; then
echo "Found $country in output of curl with grep" >&2
fi
This is more efficient in part because grep -q can stop as soon as it finds a match -- it doesn't need to keep reading more content -- so if your file is 16KB long and the country name is in the first 1KB of output, then grep can stop reading from curl (and curl can stop downloading) as soon as that first match 1KB in is seen.
The result of the curl -s https://www.iplocation.net/find-ip-address | grep $country | grep -v "grep" | wc -l statement is text. You compare text and number, that is why your if statement does not work.
This might solve your problem;
if [ $(curl -s https://www.iplocation.net/find-ip-address | grep $country | grep -v "grep" | wc -l) == "0" ] then ...
That worked, thank you for your help, this is what my script looks now:
#!/bin/bash
country=Switzerland
service=expressvpn
while true; do
if curl -s https://www.iplocation.net/find-ip-address | grep -q -e "$country"; then
echo "Found $country in output of curl with grep" >&2
echo "$service not connected!!!"
$service connect Russia
else
echo "$service connected!"
fi;
sleep 5;
done
First off, I'm new to this. I have some experience with windows scripting and apple script but not much with bash. What I'm trying to do is grab the PID and %CPU of a specific process. then compare the %CPU against a set number, and if it's higher, kill the process. I feel like I'm close, but now I'm getting the following error:
[[: 0.0: syntax error: invalid arithmetic operator (error token is ".0")
what am I doing wrong? here's my code so far:
#!/bin/bash
declare -i app_pid
declare -i app_cpu
declare -i cpu_limit
app_name="top"
cpu_limit="50"
app_pid=`ps aux | grep $app_name | grep -v grep | awk {'print $2'}`
app_cpu=`ps aux | grep $app_name | grep -v grep | awk {'print $3'}`
if [[ ! $app_cpu -gt $cpu_limit ]]; then
echo "crap"
else
echo "we're good"
fi
Obviously I'm going to replace the echos in the if/then statement but it's acting as if the statement is true regardless of what the cpu load actually is (I tested this by changing the -gt to -lt and it still echoed "crap"
Thank you for all the help. Oh, and this is on a OS X 10.7 if that is important.
I recommend taking a look at the facilities of ps to avoid multiple horrible things you do.
On my system (ps from procps on linux, GNU awk) I would do this:
ps -C "$app-name" -o pid=,pcpu= |
awk --assign maxcpu="$cpu_limit" '$2>maxcpu {print "crappy pid",$1}'
The problem is that bash can't handle decimals. You can just multiply them by 100 and work with plain integers instead:
#!/bin/bash
declare -i app_pid
declare -i app_cpu
declare -i cpu_limit
app_name="top"
cpu_limit="5000"
app_pid=`ps aux | grep $app_name | grep -v grep | awk {'print $2'}`
app_cpu=`ps aux | grep $app_name | grep -v grep | awk {'print $3*100'}`
if [[ $app_cpu -gt $cpu_limit ]]; then
echo "crap"
else
echo "we're good"
fi
Keep in mind that CPU percentage is a suboptimal measurement of application health. If you have two processes running infinite loops on a single core system, no other application of the same priority will ever go over 33%, even if they're trashing around.
#!/bin/sh
PROCESS="java"
PID=`pgrep $PROCESS | tail -n 1`
CPU=`top -b -p $PID -n 1 | tail -n 1 | awk '{print $9}'`
echo $CPU
I came up with this, using top and bc.
Use it by passing in ex: ./script apache2 50 # max 50%
If there are many PIDs matching your program argument, only one will be calculated, based on how top lists them. I could have extended the script by catching them all and avergaing the percentage or something, but this will have to do.
You can also pass in a number, ./script.sh 12345 50, which will force it to use an exact PID.
#!/bin/bash
# 1: ['command\ name' or PID number(,s)] 2: MAX_CPU_PERCENT
[[ $# -ne 2 ]] && exit 1
PID_NAMES=$1
# get all PIDS as nn,nn,nn
if [[ ! "$PID_NAMES" =~ ^[0-9,]+$ ]] ; then
PIDS=$(pgrep -d ',' -x $PID_NAMES)
else
PIDS=$PID_NAMES
fi
# echo "$PIDS $MAX_CPU"
MAX_CPU="$2"
MAX_CPU="$(echo "($MAX_CPU+0.5)/1" | bc)"
LOOP=1
while [[ $LOOP -eq 1 ]] ; do
sleep 0.3s
# Depending on your 'top' version and OS you might have
# to change head and tail line-numbers
LINE="$(top -b -d 0 -n 1 -p $PIDS | head -n 8 \
| tail -n 1 | sed -r 's/[ ]+/,/g' | \
sed -r 's/^\,|\,$//')"
# If multiple processes in $PIDS, $LINE will only match\
# the most active process
CURR_PID=$(echo "$LINE" | cut -d ',' -f 1)
# calculate cpu limits
CURR_CPU_FLOAT=$(echo "$LINE"| cut -d ',' -f 9)
CURR_CPU=$(echo "($CURR_CPU_FLOAT+0.5)/1" | bc)
echo "PID $CURR_PID: $CURR_CPU""%"
if [[ $CURR_CPU -ge $MAX_CPU ]] ; then
echo "PID $CURR_PID ($PID_NAMES) went over $MAX_CPU""%"
echo "[[ $CURR_CPU""% -ge $MAX_CPU""% ]]"
LOOP=0
break
fi
done
echo "Stopped"
Erik, I used a modified version of your code to create a new script that does something similar. Hope you don't mind it.
A bash script to get the CPU usage by process
usage:
nohup ./check_proc bwengine 70 &
bwegnine is the process name we want to monitor 70 is to log only when the process is using over 70% of the CPU.
Check the logs at: /var/log/check_procs.log
The output should be like:
DATE | TOTAL CPU | CPU USAGE | Process details
Example:
03/12/14 17:11 |20.99|98| ProdPROXY-ProdProxyPA.tra
03/12/14 17:11 |20.99|100| ProdPROXY-ProdProxyPA.tra
Link to the full blog:
http://felipeferreira.net/?p=1453
It is also useful to have app_user information available to test whether the current user has the rights to kill/modify the running process. This information can be obtained along with the needed app_pid and app_cpu by using read eliminating the need for awk or any other 3rd party parser:
read app_user app_pid tmp_cpu stuff <<< \
$( ps aux | grep "$app_name" | grep -v "grep\|defunct\|${0##*/}" )
You can then get your app_cpu * 100 with:
app_cpu=$((${tmp_cpu%.*} * 100))
Note: Including defunct and ${0##*/} in grep -v prevents against multiple processes matching $app_name.
I use top to check some details. It provides a few more details like CPU time.
On Linux this would be:
top -b -n 1 | grep $app_name
On Mac, with its BSD version of top:
top -l 1 | grep $app_name