First off, I'm new to this. I have some experience with Windows scripting and AppleScript, but not much with bash. What I'm trying to do is grab the PID and %CPU of a specific process, then compare the %CPU against a set number and, if it's higher, kill the process. I feel like I'm close, but now I'm getting the following error:
[[: 0.0: syntax error: invalid arithmetic operator (error token is ".0")
What am I doing wrong? Here's my code so far:
#!/bin/bash
declare -i app_pid
declare -i app_cpu
declare -i cpu_limit
app_name="top"
cpu_limit="50"
app_pid=`ps aux | grep $app_name | grep -v grep | awk {'print $2'}`
app_cpu=`ps aux | grep $app_name | grep -v grep | awk {'print $3'}`
if [[ ! $app_cpu -gt $cpu_limit ]]; then
echo "crap"
else
echo "we're good"
fi
Obviously I'm going to replace the echos in the if/then statement, but it's acting as if the statement is true regardless of what the CPU load actually is (I tested this by changing the -gt to -lt and it still echoed "crap").
Thank you for all the help. Oh, and this is on OS X 10.7, if that is important.
I recommend taking a look at the facilities of ps itself; they let you avoid several of the fragile things this script does, such as the grep | grep -v grep | awk pipeline.
On my system (ps from procps on Linux, GNU awk) I would do this:
ps -C "$app_name" -o pid=,pcpu= |
awk --assign maxcpu="$cpu_limit" '$2>maxcpu {print "crappy pid",$1}'
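For systems without procps (as far as I know, the BSD ps on OS X gives -C a different meaning), the same comparison can be sketched with POSIX-style ps options only. This is a minimal sketch rather than a drop-in replacement: the function name over_limit is made up for illustration, and it assumes command names contain no spaces.

```bash
#!/bin/bash
# over_limit NAME LIMIT  (hypothetical helper, not part of ps or awk)
# Reads lines of the form "PID %CPU COMMAND" on stdin and prints the PID of
# every process whose command basename is NAME and whose %CPU exceeds LIMIT.
# awk does the comparison, so decimals such as 0.0 or 73.5 are not a problem.
over_limit() {
    awk -v name="$1" -v limit="$2" '
        {
            n = split($3, parts, "/")      # strip a leading directory, if any
            if (parts[n] == name && $2 + 0 > limit + 0)
                print $1
        }'
}
```

It could be fed with something like ps -A -o pid= -o pcpu= -o comm= | over_limit top 50, which prints one PID per offending process.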
The problem is that bash can't handle decimals. You can just multiply them by 100 and work with plain integers instead:
#!/bin/bash
declare -i app_pid
declare -i app_cpu
declare -i cpu_limit
app_name="top"
cpu_limit="5000"
app_pid=`ps aux | grep $app_name | grep -v grep | awk {'print $2'}`
app_cpu=`ps aux | grep $app_name | grep -v grep | awk {'print $3*100'}`
if [[ $app_cpu -gt $cpu_limit ]]; then
echo "crap"
else
echo "we're good"
fi
Keep in mind that CPU percentage is a suboptimal measurement of application health. If you have two processes running infinite loops on a single-core system, no other application of the same priority will ever go over 33%, even if it is thrashing.
#!/bin/sh
PROCESS="java"
PID=`pgrep $PROCESS | tail -n 1`
CPU=`top -b -p $PID -n 1 | tail -n 1 | awk '{print $9}'`
echo $CPU
I came up with this, using top and bc.
Use it by passing in ex: ./script apache2 50 # max 50%
If there are many PIDs matching your program argument, only one will be calculated, based on how top lists them. I could have extended the script by catching them all and averaging the percentage or something, but this will have to do.
You can also pass in a number, ./script.sh 12345 50, which will force it to use an exact PID.
#!/bin/bash
# 1: ['command\ name' or PID number(,s)] 2: MAX_CPU_PERCENT
[[ $# -ne 2 ]] && exit 1
PID_NAMES=$1
# get all PIDS as nn,nn,nn
if [[ ! "$PID_NAMES" =~ ^[0-9,]+$ ]] ; then
PIDS=$(pgrep -d ',' -x $PID_NAMES)
else
PIDS=$PID_NAMES
fi
# echo "$PIDS $MAX_CPU"
MAX_CPU="$2"
MAX_CPU="$(echo "($MAX_CPU+0.5)/1" | bc)"
LOOP=1
while [[ $LOOP -eq 1 ]] ; do
sleep 0.3s
# Depending on your 'top' version and OS you might have
# to change head and tail line-numbers
LINE="$(top -b -d 0 -n 1 -p $PIDS | head -n 8 \
| tail -n 1 | sed -r 's/[ ]+/,/g' | \
sed -r 's/^\,|\,$//')"
# If multiple processes in $PIDS, $LINE will only match
# the most active process
CURR_PID=$(echo "$LINE" | cut -d ',' -f 1)
# calculate cpu limits
CURR_CPU_FLOAT=$(echo "$LINE"| cut -d ',' -f 9)
CURR_CPU=$(echo "($CURR_CPU_FLOAT+0.5)/1" | bc)
echo "PID $CURR_PID: $CURR_CPU""%"
if [[ $CURR_CPU -ge $MAX_CPU ]] ; then
echo "PID $CURR_PID ($PID_NAMES) went over $MAX_CPU""%"
echo "[[ $CURR_CPU""% -ge $MAX_CPU""% ]]"
LOOP=0
break
fi
done
echo "Stopped"
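The rounding step in the script above, "($x+0.5)/1" piped to bc, exists because [[ ... -ge ... ]] only accepts integers. If bc is not installed, the same add-0.5-and-truncate idea can be sketched with awk instead. The function names round_pct and cpu_at_or_over are made up for this example.

```bash
#!/bin/bash
# round_pct VALUE  (hypothetical helper)
# Adds 0.5 and truncates, which mirrors "(VALUE+0.5)/1" in bc for
# non-negative input.
round_pct() {
    awk -v x="$1" 'BEGIN { printf "%d\n", x + 0.5 }'
}

# cpu_at_or_over CURRENT LIMIT  (hypothetical helper)
# Succeeds when the rounded CURRENT is greater than or equal to the
# rounded LIMIT, using plain integer comparison.
cpu_at_or_over() {
    [[ $(round_pct "$1") -ge $(round_pct "$2") ]]
}
```

With these, cpu_at_or_over "$CURR_CPU_FLOAT" "$MAX_CPU" && echo "over the limit" does the same job as the two bc calls plus the test.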
Erik, I used a modified version of your code to create a new script that does something similar. Hope you don't mind it.
A bash script to get the CPU usage by process
usage:
nohup ./check_proc bwengine 70 &
bwengine is the process name we want to monitor; 70 means log only when the process is using over 70% of the CPU.
Check the logs at: /var/log/check_procs.log
The output should be like:
DATE | TOTAL CPU | CPU USAGE | Process details
Example:
03/12/14 17:11 |20.99|98| ProdPROXY-ProdProxyPA.tra
03/12/14 17:11 |20.99|100| ProdPROXY-ProdProxyPA.tra
Link to the full blog:
http://felipeferreira.net/?p=1453
It is also useful to have app_user available, to test whether the current user has the rights to kill or modify the running process. This information can be obtained along with the needed app_pid and app_cpu by using read, which eliminates the need for awk or any other third-party parser:
read app_user app_pid tmp_cpu stuff <<< \
$( ps aux | grep "$app_name" | grep -v "grep\|defunct\|${0##*/}" )
You can then get your app_cpu * 100 with:
app_cpu=$((${tmp_cpu%.*} * 100))
Note: including defunct and ${0##*/} (the script's own name) in the grep -v pattern filters out zombie processes and the script itself, which reduces the chance of several lines matching $app_name.
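To make the read and ${tmp_cpu%.*} steps concrete, here is a small sketch that runs them on a fixed line instead of live ps output. The helper name parse_ps_line and the sample line are invented for illustration.

```bash
#!/bin/bash
# parse_ps_line LINE  (hypothetical helper)
# Splits one "ps aux"-style line with read and prints "user pid cpu*100".
parse_ps_line() {
    local app_user app_pid tmp_cpu stuff app_cpu
    # read assigns the first three fields; the rest of the line lands in "stuff"
    read -r app_user app_pid tmp_cpu stuff <<< "$1"
    # ${tmp_cpu%.*} removes the shortest suffix matching ".*", i.e. the decimals
    app_cpu=$(( ${tmp_cpu%.*} * 100 ))
    echo "$app_user $app_pid $app_cpu"
}

parse_ps_line 'alice 4242 12.5 0.3 123456 7890 ?? S 9:00AM 0:01.23 top'
# prints: alice 4242 1200
```

Note that the decimals are dropped before multiplying, so 12.5 becomes 1200 rather than 1250; that is usually good enough for a threshold check.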
I use top to check some details. It provides a few more details like CPU time.
On Linux this would be:
top -b -n 1 | grep $app_name
On Mac, with its BSD version of top:
top -l 1 | grep $app_name
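If one script has to run on both systems, the choice between the two invocations can be made from uname. A minimal sketch covering only the two cases mentioned here; the function name top_snapshot_cmd is made up, and other BSDs may need different flags.

```bash
#!/bin/bash
# top_snapshot_cmd [KERNEL]  (hypothetical helper)
# Prints the top invocation for a single non-interactive snapshot.
# KERNEL defaults to the output of "uname -s".
top_snapshot_cmd() {
    case "${1:-$(uname -s)}" in
        Darwin) echo "top -l 1" ;;      # BSD top on OS X: one sample
        *)      echo "top -b -n 1" ;;   # procps top on Linux: one batch iteration
    esac
}
```

It would be used as $(top_snapshot_cmd) | grep -- "$app_name", leaving the command substitution unquoted so that the words are split into command and options.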
Related
I need help completing this. I'm trying to find user sessions that have been sitting idle for more than 15 minutes, which aren't being kicked off by sshd_config, and kill them. This is what I have to pull the sessions; how do I filter for greater than 15 minutes?
#!/bin/bash
IFS=$'\n'
for output in $(w | tr -s " " | cut -d" " -f1,5 | tail -n+3 | awk '{print $2}')
do
echo "$output \> 15:00"
done
If you are using Awk anyway, a shell loop is a clumsy antipattern. Awk already knows how to loop over lines; use it.
A serious complication is that the output from w is system-dependent and typically reformatted for human legibility.
tripleee$ w | head -n 4
8:16 up 37 days, 19:02, 17 users, load averages: 3.49 3.21 3.11
USER TTY FROM LOGIN@ IDLE WHAT
tripleee console - 27Aug18 38days -
tripleee s003 - 27Aug18 38 ssh -t there screen -D -r
If yours looks similar, you can probably select every line where the IDLE field contains non-numeric information (such as 38days) or is greater than 15:
w -h | awk '$5 ~ /[^0-9]/ || $5 > 15'
This prints the entire w output line. You might want to extract just the TTY field ({print $2} on my system) and figure out from there which session to kill.
A more fruitful approach on Linux-like systems is probably to examine the /proc filesystem.
You can try something like this …
for i in $(w --no-header | awk '{print $4}')
do
    # An idle time expressed in days is always over 15 minutes
    if echo "$i" | grep -q days
    then
        echo "Greater than 15 mins"
    fi
    if echo "$i" | grep -q min
    then
        # Strip the "min"/"mins" suffix to leave the bare number
        mins=$(echo "$i" | sed -e 's/mins*//')
        if [ "$mins" -gt 15 ]
        then
            echo "Greater than 15 mins"
        fi
    fi
done
The tricky part is going to be figuring out what pid to kill.
We were given a task to write a script in a course. The script has to find out which process is "deepest" in the process hierarchy, something like the pstree command, but the output should be "depth_of_process : processes_with_the_depth".
I have started something, but I can't make it work. Could you please look at it and help me? I haven't even started producing the output; I am working on the algorithm now, trying to make it into something like a reverse depth-first search. In case the code is not self-explanatory enough, please let me know and I will do my best to describe it.
#!/bin/bash
PROCS=$(ps -eo "%p %P" | tail -n +2 | sort -nr)
declare -a array
while read -r line; do
counter=1
read kid parent
while read -r otherline; do
read kid2 parent2
if [ "$parent" = "$kid2" ]; then
counter=$((counter+1))
parent="$parent2"
fi
done <<< "$PROCS"
test=2
array["$kid"]="$counter"
done <<< "$PROCS"
#for value in "${!array[@]}"; do
# echo "$value ${array[value]}"
#done
echo "$PROCS"
If pstree is allowed I could offer this (thanks @tripleee for optimizing):
for processid in $(ps -ax | awk 'NR>1 {print $1}' ); do
depth=$(pstree -sA $processid | head -n1 | sed -e 's#-+-.*#---foobar#' -e 's#---*#\n#g' -eq | wc -l)
echo "$depth: $processid"
done
It might have issues if your processes contain two or more dashes in a row.
Of course you can add " | sort -n" after "done" so that the deepest processes end up at the bottom of the output.
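If pstree is not allowed, the depth can also be computed from PID/PPID pairs alone by following the parent links upwards. A minimal sketch, assuming input lines of the form "PID PPID"; the function name process_depths is made up, and depth here counts the process itself, so a process whose parent is not in the list gets depth 1.

```bash
#!/bin/bash
# process_depths  (hypothetical helper)
# Reads "PID PPID" lines on stdin and prints "DEPTH: PID" for each process.
process_depths() {
    awk '
        { parent[$1] = $2 }
        END {
            for (pid in parent) {
                depth = 0
                p = pid
                # Climb until we leave the table; the depth cap and the
                # self-parent test guard against endless loops.
                while ((p in parent) && parent[p] != p && depth < 10000) {
                    p = parent[p]
                    depth++
                }
                print depth ": " pid
            }
        }'
}
```

Something like ps -e -o pid= -o ppid= | process_depths | sort -n | tail -n 1 would then show one of the deepest processes.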
We have a script which checks whether a process has gone down and sends an alert if it has. For some reason it does not detect this properly for all the users and does not send the alerts in all scenarios.
Please suggest what could be the problem.
Environments – uatwrk1, uatwrk2, uatwrk3 ------- uatwrk100
ServerName - myuatserver
Process to be checked - Amc/apache/bin/httpd
Script is :
#!/bin/ksh
i=1
while (( i<=100 ))
do
myuser=uatwrk$i
NoOfProcess=`ps -ef | grep -v grep | grep $myuser | grep "Amc/apache/bin/httpd" | wc -l`
if [[ $NoOfProcess -eq 0 ]]
then
echo "Amc process is down, sending an alert"
# Assume sendAlert.ksh is fine
./sendAlert.ksh
else
echo "Amc process is running fine" >> /dev/null
fi
(( i+=1 ))
done
I think @Mahesh already indicated the problem in a comment.
If you want only a single mail, you can count the users that are running an httpd process. The backslash in the following command keeps grep from matching its own entry in the ps output, so grep -v grep is not needed.
ps -ef | grep "A\mc/apache/bin/httpd" | cut -d " " -f1 | grep "^uatwrk"| sort -u | wc -l
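One thing worth checking regardless of the comment mentioned above: grep $myuser matches substrings, so uatwrk1 also matches uatwrk10 through uatwrk19 and uatwrk100, and a stopped httpd for uatwrk1 can be masked by those users. Below is a minimal sketch of an exact-match check per user. The function name users_missing_process is made up, and it assumes ps prints full user names (some ps versions shorten or replace long names in the user column).

```bash
#!/bin/bash
# users_missing_process PATTERN USER...  (hypothetical helper)
# Reads "USER COMMAND..." lines on stdin and prints every listed USER
# that has no line whose command contains the literal text PATTERN.
users_missing_process() {
    local pattern=$1
    shift
    awk -v pat="$pattern" -v users="$*" '
        {
            user = $1
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")    # drop the user column
            if (index($0, pat)) seen[user] = 1 # exact user, literal pattern
        }
        END {
            n = split(users, u, " ")
            for (i = 1; i <= n; i++)
                if (!(u[i] in seen)) print u[i]
        }'
}
```

It could be fed with ps -e -o user= -o args= and the list of expected users, sending one alert per name it prints.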
I've written a bash script (searchuser) which should display all the users who are executing a specific program or script (at least a bash script). But searching for scripts fails, because the command the OS is executing is something like bash scriptname.
The script parses the output of ps: it searches for all occurrences of the specified program name, extracts the user and the program name, verifies that the program name is the one we're searching for, and if so displays the relevant information (in this case the user name and the program name; it might be better to output the PID too, but that is quite simple). The verification rejects lines whose program name merely contains the name we are searching for: if we're searching for gedit, we don't want to find sgedit or gedits.
Other issues I have are:
I would like to avoid the use of a tmp file.
I would like not to be tied to GNU extensions.
The script has to be executed as:
root# searchuser programname <Enter>
The script searchuser is the following:
#!/bin/bash
i=0
search=$1
tmp=`mktemp`
ps -aux | tr -s ' ' | grep "$search" > $tmp
while read fileline
do
user=`echo "$fileline" | cut -f1 -d' '`
prg=`echo "$fileline" | cut -f11 -d' '`
prg=`basename "$prg"`
if [ "$prg" = "$search" ]; then
echo "$user - $prg"
i=`expr $i + 1`
fi
done < $tmp
if [ $i = 0 ]; then
echo "No users are executing $search"
fi
rm $tmp
exit $i
Do you have any suggestions for solving these issues?
One approach might look like this:
IFS=$'\n' read -r -d '' -a pids < <(pgrep -x -- "$1"; printf '\0')
if (( ! ${#pids[@]} )); then
    echo "No users are executing $1"
fi
for pid in "${pids[@]}"; do
    # build a more accurate command line than the one ps emits
    args=( )
    while IFS= read -r -d '' arg; do
        args+=( "$arg" )
    done </proc/"$pid"/cmdline
    (( ${#args[@]} )) || continue # exited while we were running
    printf -v cmdline_str '%q ' "${args[@]}"
    user=$(stat --format=%U /proc/"$pid") || continue # exited while we were running
    printf '%q - %s\n' "$user" "${cmdline_str% }"
done
Unlike the output from ps, which doesn't distinguish between ./command "some argument" and ./command "some" "argument", this will emit output which correctly shows the arguments run by each user, with quoting which will re-run the given command correctly.
What about:
ps -e -o user,comm | egrep "^[^ ]+ +$1$" | cut -d' ' -f1 | sort -u
* Addendum *
This statement:
ps -e -o user,pid,comm | egrep "^\s*\S+\s+\S+\s*$1$" | while read a b; do echo $a; done | sort | uniq -c
or this one:
ps -e -o user,pid,comm | egrep "^\s*\S+\s+\S+\s*sleep$" | xargs -L1 echo | cut -d ' ' -f1 | sort | uniq -c
shows the number of process instances by user.
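For the two wishes in the question (no tmp file, no GNU extensions), here is a sketch that sticks to POSIX awk and also covers the "bash scriptname" case by looking at the first argument as well as the command. This is a heuristic, not a definitive answer: the function name users_running is made up, and it would also report someone running, say, vim gedit.

```bash
#!/bin/bash
# users_running NAME  (hypothetical helper)
# Reads "USER COMMAND [ARG...]" lines on stdin and prints each user, once,
# whose command basename or first-argument basename equals NAME.
users_running() {
    awk -v name="$1" '
        function base(path,   parts, n) {
            n = split(path, parts, "/")
            return parts[n]
        }
        # $2 is the command; $3, if present, may be a script given to an interpreter
        base($2) == name || (NF >= 3 && base($3) == name) { print $1 }
    ' | sort -u
}
```

It could be fed with ps -e -o user= -o args= | users_running gedit; because the match is on the whole basename, sgedit and gedits are not reported.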
Hello, I have a short script that kills processes which have been running longer than a specified time, for UIDs at or above a given minimum. How can I exclude, for example, the mc command from being killed?
#!/bin/bash
#
#Put the minimum(!) UID to kill processes
UID_KILL=500
#Put the time in seconds which the process is allowed to run below
KILL_TIME=300
KILL_LIST=`{
ps -eo uid,pid,lstart | tail -n+2 |
while read PROC_UID PROC_PID PROC_LSTART; do
SECONDS=$[$(date +%s) - $(date -d"$PROC_LSTART" +%s)]
if [ $PROC_UID -ge $UID_KILL -a $SECONDS -gt $KILL_TIME ]; then
echo -n "$PROC_PID "
fi
done
}`
#KILLING LOOP
while sleep 1
do
if [[ -n $KILL_LIST ]]
then
kill $KILL_LIST
fi
done
Change the inner command like this:
ps -eo comm,uid,pid,lstart | tail -n+2 | grep -v '^your_command' | ...
This will exclude 'your_command' from the list. Because comm is now the first column, the read in the loop needs an extra leading variable for it.
See STANDARD FORMAT SPECIFIERS in man ps for more about ps -o.
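The whole selection (minimum UID, maximum age, excluded commands) can also be done in a single awk pass. The sketch below assumes a ps that offers the etimes column (elapsed time in seconds, which I believe procps-ng has) and command names without spaces; the function name select_pids is made up.

```bash
#!/bin/bash
# select_pids MIN_UID MAX_AGE [SPARED_COMMAND...]  (hypothetical helper)
# Reads "UID PID ELAPSED_SECONDS COMMAND" lines on stdin and prints the PID
# of every process with UID >= MIN_UID that is older than MAX_AGE seconds
# and whose command is not in the spared list.
select_pids() {
    local min_uid=$1 max_age=$2
    shift 2
    awk -v min_uid="$min_uid" -v max_age="$max_age" -v spare="$*" '
        BEGIN {
            n = split(spare, s, " ")
            for (i = 1; i <= n; i++) keep[s[i]] = 1
        }
        $1 >= min_uid && $3 > max_age && !($4 in keep) { print $2 }'
}
```

With such a ps it would be used as ps -e -o uid= -o pid= -o etimes= -o comm= | select_pids 500 300 mc, and the resulting list handed to kill.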