I'm trying to check ("ping") whether some applications are running on some remote machines.
To do that I have a file with the servers and applications, like:
server1:application1
server2:application1
server3:application2
Etc.
I expect to obtain the number of instances of each named application running on its server.
To connect and check, I need an ssh connection.
My script is written in bash and looks like this:
Ping_Applications () {
SetParam
cat $APPFILE | while read next
do
server=`echo $next | cut -d : -f1`
app=`echo $next | awk -F":" '{print $2}'`
commando="/bin/ssh $server '/bin/ps -fea | /bin/grep $app | /bin/grep -v grep | /bin/wc -l'"
eval COMA=\$\($commando\)
echo $commando
if [ $COMA != 0 ]
then
echo -e "$TIME : Application $app of on server $server is \E[32m[ RUNNING ]\E[0m";
else
echo -e "$TIME : Application $app of on server $server is \E[31m[ NOT RUNNING ]\E[0m";
fi
done;
}
My problem is that when I send
ssh $server 'ps -fea | grep $app | grep -v grep | wc -l'
This returns the number, but when it is sent by the script I get no answer, because (I think) the pipe opens a new shell.
I don't know how to solve this.
Any idea?
Thanks
Luis
This is Bash FAQ 50
You want to put the command in an array, and not use eval to invoke it:
commando=( /bin/ssh $server '/bin/ps -fea | /bin/grep "'"$app"'" | /bin/grep -v grep | /bin/wc -l' )
COMA=$( "${commando[@]}" )
commando is an array with 3 elements, so the last element can be passed to the remote server as a single word. Note the careful quoting around $app
Also, since $COMA will be a number, use numeric comparison: if [ $COMA -ne 0 ]
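The array technique can be tried without a remote host. The following is a minimal local sketch, not the answerer's exact code: `sh -c` stands in for `/bin/ssh $server` (an assumption, chosen because both accept the whole pipeline as one command string), and the application name and sample lines are hypothetical.

```shell
#!/bin/bash
# Hypothetical application name; it contains a space to show why quoting matters.
app="my app"

# 'sh -c' stands in for '/bin/ssh "$server"': both take one command string.
# The whole pipeline is a single array element, so it reaches the child shell intact.
commando=( sh -c 'printf "%s\n" "my app --daemon" "other" | grep -c -- "'"$app"'"' )

# Expand the array as separate words; no eval is involved.
COMA=$( "${commando[@]}" )
echo "elements: ${#commando[@]}, matches: $COMA"
```

Because the pipeline is stored as one element, the local shell never tries to interpret the `|` characters itself.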
You might want to separate the /bin/ssh command and its arguments. Also put the "commando" in double quotes:
Instead of:
commando="/bin/ssh $server '/bin/ps -fea | /bin/grep $app | /bin/grep -v grep | /bin/wc -l'"
eval COMA=\$\($commando\)
Try this:
commando="/bin/ps -fea | /bin/grep $app ..."
/bin/ssh $server "$commando"
Bash variables get expanded inside double quotes, but not inside single quotes.
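The difference between the two quoting styles is easy to see in isolation. In this small sketch the value of `app` is a made-up example:

```shell
#!/bin/bash
app="httpd"   # hypothetical application name

single='/bin/grep $app'   # single quotes: $app stays literal
double="/bin/grep $app"   # double quotes: $app is expanded right here

echo "$single"
echo "$double"
```

With single quotes the remote shell would receive the text `$app` and expand it there, where the variable is most likely unset.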
Thanks for your help
I combined the answers from Eric Renouf and Glenn Jackman, and the solution looks like this:
Ping_Applications () {
SetParam
cat $APPFILE | while read next
do
server=`echo $next | cut -d : -f1`
app=`echo $next | awk -F":" '{print $2}'`
commando=( /bin/ssh $server '/bin/pgrep -f "'"$app"'" | /bin/wc -l' )
COMA=$( "${commando[@]}" )
if [ $COMA -gt 0 ]
then
echo -e "$TIME : Application $app of on server $server is \E[32m[ RUNNING ]\E[0m";
else
echo -e "$TIME : Application $app of on server $server is \E[31m[ NOT RUNNING ]\E[0m";
fi
done;
}
Now I have a different problem:
This solution works, but only for the first input line; then the script stops.
I have around 10 different applications in my $APPFILE, but only the first one is executed.
Regards.
Luis
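A likely cause, offered as a guess since the thread does not confirm it: ssh reads from standard input by default, so inside a `while read` loop it can swallow the remaining lines of `$APPFILE`. The sketch below reproduces the effect locally, with `cat` as a stand-in for ssh (an assumption, so that no remote host is needed) and hypothetical sample lines.

```shell
#!/bin/bash
# Count loop iterations over three input lines.
# 'cat >/dev/null' stands in for ssh: like ssh, it reads whatever is on stdin.
count_iterations() {
    local redirect=$1 n=0 line
    while read -r line; do
        n=$((n + 1))
        if [ "$redirect" = yes ]; then
            cat >/dev/null </dev/null   # stdin detached, the loop input is safe
        else
            cat >/dev/null              # eats the rest of the loop's input
        fi
    done <<'EOF'
server1:application1
server2:application1
server3:application2
EOF
    echo "$n"
}

broken=$(count_iterations no)
fixed=$(count_iterations yes)
echo "without redirect: $broken, with redirect: $fixed"
```

In the real script the equivalent change would be to call `/bin/ssh -n $server ...`, or to append `</dev/null` to the ssh command, so that ssh leaves the loop's input alone.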
I have a scenario with a list of hundreds of servers. I want to check whether all of those servers can reach a specified destination server, by telnetting from each of them to that destination.
I have written the code below:
#!/bin/bash
#bash to check telnet status.
#set -x;
#
#clear
SetParam() {
export URLFILE="Host_PortFile.txt"
export TIME=`date +%d-%m-%Y_%H.%M.%S`
export port=80
export STATUS_UP=`echo -e "\E[32m[ RUNNING ]\E[0m"`
export STATUS_DOWN=`echo -e "\E[31m[ DOWN ]\E[0m"`
export MAIL_TO="admin(at)techpaste(dot)com"
export SHELL_LOG="`basename $0`.log"
}
Telnet_Status() {
SetParam
cat $URLFILE | while read next
do
server=`echo $next | cut -d : -f1`
port=`echo $next | awk -F":" '{print $2}'`
TELNETCOUNT=`sleep 5 | telnet $server $port | grep -v "Connection refused" | grep "Connected to" | grep -v grep | wc -l`
if [ $TELNETCOUNT -eq 1 ] ; then
echo -e "$TIME : Port $port of URL http://$server:$port/ is \E[32m[ OPEN ]\E[0m";
else
echo -e "$TIME : Port $port of URL http://$server:$port/ is \E[31m[ NOT OPEN ]\E[0m";
echo -e "$TIME : Port $port of URL http://$server:$port/ is NOT OPEN" | mailx -s "Port $port of URL $server:$port/ is DOWN!!!" $MAIL_TO;
fi
done;
}
Main() {
Telnet_Status
}
SetParam
Main | tee -a $SHELL_LOG
My Host_PortFile.txt file looks like,
gmail.com:443
But this way I have to log in to each individual server and run the script there, which takes a lot of time. Is there any modification that would let me run the script from one machine, read all the source server names from a text file, and check whether each of them can reach the destination server? Can anyone suggest something, please?
We have a script which is checking and sending an alert if process goes down. For some reason it is not capturing it properly for all the users and not sending the alerts in all scenarios.
Please suggest what could be the problem.
Environments – uatwrk1, uatwrk2, uatwrk3 ------- uatwrk100
ServerName - myuatserver
Process to be checked - Amc/apache/bin/httpd
Script is :
#!/bin/ksh
i=1
while (( i<=100 ))
do
myuser=uatwrk$i
NoOfProcess=`ps -ef | grep -v grep | grep $myuser | grep "Amc/apache/bin/httpd" | wc -l`
if [[ $NoOfProcess -eq 0 ]]
then
echo "Amc process is down, sending an alert"
# Assume sendAlert.ksh is fine
./sendAlert.ksh
else
echo "Amc process is running fine" >> /dev/null
fi
(( i+=1 ))
done
I think @Mahesh already indicated the problem in a comment.
If you want only one mail, you can count the users running an httpd process. The backslash in the following command removes the need for grep -v grep.
ps -ef | grep "A\mc/apache/bin/httpd" | cut -d " " -f1 | grep "^uatwrk"| sort -u | wc -l
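A more portable way to get the same effect is a bracket expression, because a backslash before an ordinary letter is not well defined in grep patterns. The sketch below is an illustration under assumptions: the function name `count_httpd_users` and the sample `ps -ef` lines are made up, and the function reads from standard input so it can be tried on sample data.

```shell
#!/bin/bash
# Count distinct uatwrk* users that own a matching httpd process.
# Reads 'ps -ef'-style lines on stdin so it can be tried on sample data.
count_httpd_users() {
    # "[A]mc" matches "Amc" but not the literal text "[A]mc" in grep's own entry
    grep '[A]mc/apache/bin/httpd' | awk '$1 ~ /^uatwrk/ {print $1}' | sort -u | wc -l
}

# Hypothetical sample; the last line imitates the grep process itself.
sample='uatwrk1 101 1 0 10:00 ? 00:00:01 /opt/Amc/apache/bin/httpd -k start
uatwrk1 102 101 0 10:00 ? 00:00:00 /opt/Amc/apache/bin/httpd -k start
uatwrk2 201 1 0 10:00 ? 00:00:01 /opt/Amc/apache/bin/httpd -k start
root 900 1 0 10:01 pts/0 00:00:00 grep [A]mc/apache/bin/httpd'

users=$(printf '%s\n' "$sample" | count_httpd_users)
echo "users with httpd: $((users))"
```

On a live system the call would be `ps -ef | count_httpd_users`.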
I've written a bash script (searchuser) that should display all the users who are executing a specific program or script (at least a bash script). But searching for scripts fails, because the command the OS is executing is something like bash scriptname.
The script parses the output of the ps command: it searches for all occurrences of the specified program name, extracts the user and the program name, verifies that the program name is the one we are searching for, and if so displays the relevant information (in this case the user name and the program name; it might be better to also output the PID, but that is quite simple). The verification rejects all lines whose program name merely contains the name we are searching for; if we are searching for gedit, we don't want to find sgedit or gedits.
Other issues I have:
I would like to avoid the use of a tmp file.
I would like not to be tied to GNU extensions.
The script has to be executed as:
root# searchuser programname <Enter>
The script searchuser is the following:
#!/bin/bash
i=0
search=$1
tmp=`mktemp`
ps -aux | tr -s ' ' | grep "$search" > $tmp
while read fileline
do
user=`echo "$fileline" | cut -f1 -d' '`
prg=`echo "$fileline" | cut -f11 -d' '`
prg=`basename "$prg"`
if [ "$prg" = "$search" ]; then
echo "$user - $prg"
i=`expr $i + 1`
fi
done < $tmp
if [ $i = 0 ]; then
echo "No users are executing $search"
fi
rm $tmp
exit $i
Do you have any suggestions for solving these issues?
One approach might look like this:
IFS=$'\n' read -r -d '' -a pids < <(pgrep -x -- "$1"; printf '\0')
if (( ! ${#pids[@]} )); then
echo "No users are executing $1"
fi
for pid in "${pids[@]}"; do
# build a more accurate command line than the one ps emits
args=( )
while IFS= read -r -d '' arg; do
args+=( "$arg" )
done </proc/"$pid"/cmdline
(( ${#args[@]} )) || continue # exited while we were running
printf -v cmdline_str '%q ' "${args[@]}"
user=$(stat --format=%U /proc/"$pid") || continue # exited while we were running
printf '%q - %s\n' "$user" "${cmdline_str% }"
done
Unlike the output from ps, which doesn't distinguish between ./command "some argument" and ./command "some" "argument", this will emit output which correctly shows the arguments run by each user, with quoting which will re-run the given command correctly.
What about:
ps -e -o user,comm | egrep "^[^ ]+ +$1$" | cut -d' ' -f1 | sort -u
* Addendum *
This statement:
ps -e -o user,pid,comm | egrep "^\s*\S+\s+\S+\s*$1$" | while read a b; do echo $a; done | sort | uniq -c
or this one:
ps -e -o user,pid,comm | egrep "^\s*\S+\s+\S+\s*sleep$" | xargs -L1 echo | cut -d ' ' -f1 | sort | uniq -c
shows the number of process instances by user.
First off, I'm new to this. I have some experience with Windows scripting and AppleScript, but not much with bash. What I'm trying to do is grab the PID and %CPU of a specific process, then compare the %CPU against a set number and, if it's higher, kill the process. I feel like I'm close, but now I'm getting the following error:
[[: 0.0: syntax error: invalid arithmetic operator (error token is ".0")
What am I doing wrong? Here's my code so far:
#!/bin/bash
declare -i app_pid
declare -i app_cpu
declare -i cpu_limit
app_name="top"
cpu_limit="50"
app_pid=`ps aux | grep $app_name | grep -v grep | awk {'print $2'}`
app_cpu=`ps aux | grep $app_name | grep -v grep | awk {'print $3'}`
if [[ ! $app_cpu -gt $cpu_limit ]]; then
echo "crap"
else
echo "we're good"
fi
Obviously I'm going to replace the echos in the if/then statement, but it's acting as if the statement is true regardless of what the CPU load actually is (I tested this by changing the -gt to -lt and it still echoed "crap").
Thank you for all the help. Oh, and this is on OS X 10.7, if that is important.
I recommend taking a look at the facilities of ps to avoid multiple horrible things you do.
On my system (ps from procps on linux, GNU awk) I would do this:
ps -C "$app_name" -o pid=,pcpu= |
awk --assign maxcpu="$cpu_limit" '$2>maxcpu {print "crappy pid",$1}'
The problem is that bash can't handle decimals. You can just multiply them by 100 and work with plain integers instead:
#!/bin/bash
declare -i app_pid
declare -i app_cpu
declare -i cpu_limit
app_name="top"
cpu_limit="5000"
app_pid=`ps aux | grep $app_name | grep -v grep | awk {'print $2'}`
app_cpu=`ps aux | grep $app_name | grep -v grep | awk {'print $3*100'}`
if [[ $app_cpu -gt $cpu_limit ]]; then
echo "crap"
else
echo "we're good"
fi
Keep in mind that CPU percentage is a suboptimal measurement of application health. If you have two processes running infinite loops on a single-core system, no other application of the same priority will ever go over 33%, even if it is thrashing around.
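As an alternative to scaling by 100, the comparison itself can be handed to awk, which works in floating point. This is a minimal sketch rather than part of the answer above; the function name `cpu_over_limit` and the sample values are hypothetical.

```shell
#!/bin/bash
# Succeeds (exit status 0) when $1 is numerically greater than $2.
# awk does the floating-point comparison that [[ ... -gt ... ]] cannot.
cpu_over_limit() {
    awk -v cpu="$1" -v limit="$2" 'BEGIN { exit !(cpu + 0 > limit + 0) }'
}

if cpu_over_limit "72.5" "50"; then
    echo "over the limit"
else
    echo "we're good"
fi
```

Adding 0 to each value forces a numeric comparison, so 9.5 is correctly treated as smaller than 50.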
#!/bin/sh
PROCESS="java"
PID=`pgrep $PROCESS | tail -n 1`
CPU=`top -b -p $PID -n 1 | tail -n 1 | awk '{print $9}'`
echo $CPU
I came up with this, using top and bc.
Use it by passing, for example: ./script apache2 50 # max 50%
If there are many PIDs matching your program argument, only one will be calculated, based on how top lists them. I could have extended the script by catching them all and averaging the percentage or something, but this will have to do.
You can also pass in a number, ./script.sh 12345 50, which will force it to use an exact PID.
#!/bin/bash
# 1: ['command\ name' or PID number(,s)] 2: MAX_CPU_PERCENT
[[ $# -ne 2 ]] && exit 1
PID_NAMES=$1
# get all PIDS as nn,nn,nn
if [[ ! "$PID_NAMES" =~ ^[0-9,]+$ ]] ; then
PIDS=$(pgrep -d ',' -x $PID_NAMES)
else
PIDS=$PID_NAMES
fi
# echo "$PIDS $MAX_CPU"
MAX_CPU="$2"
MAX_CPU="$(echo "($MAX_CPU+0.5)/1" | bc)"
LOOP=1
while [[ $LOOP -eq 1 ]] ; do
sleep 0.3s
# Depending on your 'top' version and OS you might have
# to change head and tail line-numbers
LINE="$(top -b -d 0 -n 1 -p $PIDS | head -n 8 \
| tail -n 1 | sed -r 's/[ ]+/,/g' | \
sed -r 's/^\,|\,$//')"
# If multiple processes in $PIDS, $LINE will only match\
# the most active process
CURR_PID=$(echo "$LINE" | cut -d ',' -f 1)
# calculate cpu limits
CURR_CPU_FLOAT=$(echo "$LINE"| cut -d ',' -f 9)
CURR_CPU=$(echo "($CURR_CPU_FLOAT+0.5)/1" | bc)
echo "PID $CURR_PID: $CURR_CPU""%"
if [[ $CURR_CPU -ge $MAX_CPU ]] ; then
echo "PID $CURR_PID ($PID_NAMES) went over $MAX_CPU""%"
echo "[[ $CURR_CPU""% -ge $MAX_CPU""% ]]"
LOOP=0
break
fi
done
echo "Stopped"
Erik, I used a modified version of your code to create a new script that does something similar. Hope you don't mind it.
A bash script to get the CPU usage by process
usage:
nohup ./check_proc bwengine 70 &
bwengine is the process name we want to monitor; 70 means to log only when the process is using over 70% of the CPU.
Check the logs at: /var/log/check_procs.log
The output should be like:
DATE | TOTAL CPU | CPU USAGE | Process details
Example:
03/12/14 17:11 |20.99|98| ProdPROXY-ProdProxyPA.tra
03/12/14 17:11 |20.99|100| ProdPROXY-ProdProxyPA.tra
Link to the full blog:
http://felipeferreira.net/?p=1453
It is also useful to have app_user information available to test whether the current user has the rights to kill/modify the running process. This information can be obtained along with the needed app_pid and app_cpu by using read eliminating the need for awk or any other 3rd party parser:
read app_user app_pid tmp_cpu stuff <<< \
$( ps aux | grep "$app_name" | grep -v "grep\|defunct\|${0##*/}" )
You can then get your app_cpu * 100 with:
app_cpu=$((${tmp_cpu%.*} * 100))
Note: Including defunct and ${0##*/} in grep -v guards against multiple processes matching $app_name.
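One thing to be aware of is that `${tmp_cpu%.*}` drops the fractional part before the multiplication. The sketch below shows that, plus a hypothetical helper (`cpu_times_100`, not part of the answer above) that keeps the tenths digit, assuming the value always has exactly one digit after the decimal point.

```shell
#!/bin/bash
tmp_cpu="12.7"   # sample %CPU value as printed by ps aux

# Drop everything from the last dot onward, then scale: 12.7 -> 12 -> 1200
app_cpu=$(( ${tmp_cpu%.*} * 100 ))

# Hypothetical helper that keeps the tenths digit instead of truncating.
# Assumes exactly one digit after the decimal point; 10# forces base 10
# so that a value such as "0.8" (which becomes "08") is not read as octal.
cpu_times_100() {
    local v=$1
    echo $(( 10#${v/./} * 10 ))
}

echo "$app_cpu $(cpu_times_100 "$tmp_cpu") $(cpu_times_100 "0.8")"
```

Whether the truncation matters depends on how close to the limit the process is expected to run.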
I use top to check some details. It provides a few more details like CPU time.
On Linux this would be:
top -b -n 1 | grep $app_name
On Mac, with its BSD version of top:
top -l 1 | grep $app_name
This is my bash script:
#!/usr/local/bin/bash -x
touch /usr/local/p
touch /usr/local/rec
DATA_FULL=`date +%Y.%m.%d.%H`
CHECK=`netstat -an | grep ESTAB | egrep '(13001|13002|13003|13004|13061|13099|16001|16002|16003|16004|16061|16099|18001|18002|18003|18004|18061|18099|20001|20002|20003|20004|20061|20099|13000|16000|18000|20000)' | awk '{ print $5 }' | sort -u | wc -l`
netstat -an | grep ESTAB | egrep '(13001|13002|13003|13004|13061|13099|16001|16002|16003|16004|16061|16099|18001|18002|18003|18004|18061|18099|20001|20002|20003|20004|20061|20099|13000|16000|18000|20000)' | awk '{ print $5 }' | sort -u | wc -l > /usr/local/www/p
STAT=`cat /usr/local/www/rec`
if [ "$CHECK" -gt "$STAT" ]; then
echo $CHECK"\n"$DATA_FULL > /usr/local/p
fi
Of course I ran chmod +x script.sh and then sh script.sh, and then I received the following message: [: : bad number.
Why does this happen?
Run your script using
sh -x script.sh
It'll print every line it executes and the variable output.
Run the netstat command and the cat command that fills STAT outside the script and check their output.
If these are integer for sure, use this syntax,
if [ "0$(echo $CHECK|tr -d ' ')" -gt "0$(echo $STAT|tr -d ' ')" ];
A simple hack. It only works if $STAT is always either empty or a positive number.
Are you sure that both STAT and CHECK are numbers that can be compared with -gt?
Probably your /usr/local/www/rec is empty. Try
STAT=`cat /usr/local/www/rec 2>/dev/null || echo 0`
maybe.
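Note that `cat file || echo 0` supplies the 0 only when cat fails (for example, when the file is missing); an empty file still leaves STAT empty. A default-value expansion covers both cases. The sketch below is an illustration with a hypothetical helper name, `is_greater`. It may also be worth checking the paths: the script touches /usr/local/rec but reads /usr/local/www/rec.

```shell
#!/bin/bash
# Compare two counts, treating an empty or unset value as 0.
is_greater() {
    local check=${1:-0} stat=${2:-0}
    [ "$check" -gt "$stat" ]
}

STAT=""    # what $(cat file) yields when the file exists but is empty
CHECK=3    # sample connection count

if is_greater "$CHECK" "$STAT"; then
    echo "CHECK is larger"
fi
```

The same expansion can be used inline, as in `[ "${CHECK:-0}" -gt "${STAT:-0}" ]`.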