We need to check the status of a process over SSH, exit the ssh session and the script if the process is running, and trigger an email if it is not running.
Below is a snippet of the current script, which calls the status_email function in both cases.
I am looking for your help to invoke the function only when the process is not running.
status_email () {
******
****
}
for hostname in `cat ai_hosts.txt`; do
ssh actional@"$hostname" /bin/sh << 'EOF'
pid=`ps -ef | grep <<Process_Details>> | grep -v grep | awk '{print $2}'`
if [ "${pid:-null}" = null ]; then
echo "not running"
else
echo "running"
exit
fi
EOF
status_email;
done
You can store the output of ssh in a variable, then check the variable and, depending on its value, do something. For example, if you save the following code as remote-process-check.sh:
#!/bin/sh
host=your-host-name-or-ip-address
status_email () {
echo 'Status email was sent.';
}
output=$(ssh root@"$host" "ps -C $1 >/dev/null && echo 'Running' || echo 'Not running'")
if [ "$output" = 'Not running' ]; then
status_email
fi
You can use it like:
bash remote-process-check.sh process-name
bash remote-process-check.sh mysqld
bash remote-process-check.sh apache2
and if the process is not running, the status_email function will be invoked and you will see "Status email was sent." in your console.
The interesting part of the script is:
output=$(ssh root@"$host" "ps -C $1 >/dev/null && echo 'Running' || echo 'Not running'")
where ps -C $1 checks whether the process is running, >/dev/null redirects its output to the black hole, and if the command succeeds (meaning the process is running) echo 'Running' is executed, otherwise echo 'Not running'. The result is stored in the variable output.
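A minimal sketch of how this could be folded back into the original loop over ai_hosts.txt (the actional user and the <<Process_Details>> placeholder are taken from the question; replace the grep pattern with your real process details):
#!/bin/sh
status_email () {
echo 'Status email was sent.'
}
while read -r hostname; do
output=$(ssh actional@"$hostname" "ps -ef | grep '<<Process_Details>>' | grep -v grep >/dev/null && echo 'Running' || echo 'Not running'")
if [ "$output" = 'Not running' ]; then
status_email
fi
done < ai_hosts.txt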
My ssh command:
ssh -l prdmover $newFtpHostname "u/prdmover/checkTriggerFilePresent.sh $newFtpFolderPath ${TriggerFileName[0]}"
checkTriggerFilePresent.sh code below:
#!/bin/ksh
triggerFileLocation=$1
triggerFileName=$2
echo "Inside checkTriggerFilePresnt script for product feed..."
if [ -f $triggerFileLocation$triggerFileName ]
then
echo "Trigger File is there..."
exit 0
else
echo "No Trigger File is there..."
exit 1
fi
Depending on the condition I am returning values to my main script.
But in every case it returns 127 to my main script, and I want it to return 0 or 1.
Please advise.
After doing
ssh -l prdmover $newFtpHostname "u/prdmover/checkTriggerFilePresent.sh $newFtpFolderPath ${TriggerFileName[0]}"
echo $?
I would expect $? to contain the return code of the ssh command, not necessarily the return code of the script (127 usually means the remote shell could not find the command it was asked to run, which is worth double-checking here).
I would try to grab the exit status through the output of the script:
R=$(ssh -l prdmover $newFtpHostname "u/prdmover/checkTriggerFilePresent.sh $newFtpFolderPath ${TriggerFileName[0]}; echo \$?"| tail -1)
echo $R
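The captured value can then drive the branching in the main script; a minimal sketch (the 0/1 codes are the ones checkTriggerFilePresent.sh already returns):
if [ "$R" -eq 0 ]; then
echo "Trigger file is present"
elif [ "$R" -eq 1 ]; then
echo "No trigger file"
else
echo "Unexpected status $R - ssh or remote shell problem?"
fi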
I have a script to monitor my servers; it sends a mail alert if a server is not ping-able.
But when I set the script up as a cron job, it throws errors such as "ping command not recognized" and "mailx command not recognized", while the same script works when executed manually.
Below is the code of the script
#!/bin/sh
cd `dirname $0`
serverIPs="192.0.0.40 192.0.0.140"
count=4
##checking the status by pinging the individual ips in serverIps variable
for host in $serverIPs
do
recCount=$(ping -c $count $host | grep 'received' | awk -F',' '{ print $2 }' | awk '{ print $1 }')
if [ $recCount -eq 0 ]; then
# 100% failed
echo "Host : $host is down (ping failed) at $(date)" |mailx -s "Server is not responding completely " jagdeep.gupta#gmail.com
elif [ $recCount -lt 4 ]
then
echo "Host : $host is not responding well there is loss of packets , please check " |mailx -s "Server is not responding partially " jagdeep.gupta#gmail.com
fi
done
Your cron daemon likely runs the job with a stripped-down environment, including a minimal $PATH variable. Try adding
export PATH=/bin:/usr/bin
at the beginning of your script. (This should suffice. If it does not, check the output of echo $PATH in an interactive shell and use that as the value.)
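Alternatively, with a Vixie-style cron you can set PATH at the top of the crontab itself so every job inherits it; a sketch (the schedule and script path are placeholders):
PATH=/bin:/usr/bin:/usr/sbin
*/5 * * * * /home/user/ping-check.sh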
In a loop in a shell script, I am connecting to various servers and running some commands. For example:
#!/bin/bash
FILENAME=$1
cat $FILENAME | while read HOST
do
0</dev/null ssh $HOST 'echo password| sudo -S
echo $HOST
echo $?
pwd
echo $?'
done
Here I am running the "echo $HOST" and "pwd" commands and getting their exit status via "echo $?".
My question is that I want to store the exit status of the commands I run remotely in some variable and then, based on whether the command succeeded or not, write a log entry to a local file.
Any help and code is appreciated.
ssh will exit with the exit code of the remote command. For example:
$ ssh localhost exit 10
$ echo $?
10
So after your ssh command exits, you can simply check $?. You need to make sure that you don't mask your return value. For example, your ssh command finishes up with:
echo $?
That echo itself always exits 0, so ssh reports 0 no matter what the earlier commands did. What you probably want is something more like this:
while read HOST; do
echo $HOST
if ssh $HOST 'somecommand' < /dev/null; then
echo SUCCESS
else
echo FAIL
fi
done
You could also write it like this:
while read HOST; do
echo $HOST
ssh $HOST 'somecommand' < /dev/null
if [ $? -eq 0 ]; then
echo SUCCESS
else
echo FAIL
fi
done
You can assign the exit status to a variable as simply as doing:
variable=$?
right after the command you are trying to inspect. Do not echo $? before that, or the new value of $? will be the exit code of that echo (usually 0).
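Putting that together with the question's loop, a minimal sketch of writing to a local log based on the stored status (the log file name is just an example):
while read HOST; do
ssh "$HOST" 'pwd' < /dev/null
status=$?
if [ $status -eq 0 ]; then
echo "$(date): $HOST OK" >> remote-check.log
else
echo "$(date): $HOST FAILED (exit $status)" >> remote-check.log
fi
done < "$FILENAME"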
An interesting approach would be to retrieve the whole output of each ssh command set into a local variable using backticks, and even separate the fields with a special character (for simplicity, say ":"), something like:
export MYVAR=`ssh $HOST 'echo -n ${HOSTNAME}\:;pwd'`
After this you can use awk to split MYVAR into its parts and continue your bash tests.
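For example, a small sketch of splitting on the ":" separator (the variable names are just illustrative):
remote_host=$(echo "$MYVAR" | awk -F: '{print $1}')
remote_dir=$(echo "$MYVAR" | awk -F: '{print $2}')
echo "host=$remote_host dir=$remote_dir"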
Perhaps prepare the log file on the other side and pipe it to stdout, like this:
ssh -n user@example.com 'x() { local ret; "$@" >&2; ret=$?; echo "[`date +%Y%m%d-%H%M%S` $ret] $*"; return $ret; };
x true
x false
x sh -c "exit 77";' > local-logfile
Basically just prefix everything you want to invoke on the remote side with this x wrapper. It works for conditionals, too, as it does not alter the exit code of a command.
You can easily loop this command.
This example writes into the log something like:
[20141218-174611 0] true
[20141218-174611 1] false
[20141218-174611 77] sh -c exit 77
Of course you can make it easier to parse, or adapt it to whatever you want the logfile to look like. Note that the uncaught normal stdout of the remote programs is written to stderr (see the redirection in x()).
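To run this against several machines you could wrap it in a loop; a sketch (hosts.txt with one host per line is hypothetical, and -n keeps ssh from consuming the loop's stdin):
while read -r host; do
ssh -n "user@$host" 'x() { local ret; "$@" >&2; ret=$?; echo "[`date +%Y%m%d-%H%M%S` $ret] $*"; return $ret; };
x uptime
x df -h' > "local-logfile-$host"
done < hosts.txt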
If you need a recipe to catch and prepare the output of a command for the logfile, here is a copy of such a catcher from https://gist.github.com/hilbix/c53d525f113df77e323d - but yes, it is a somewhat bigger piece of boilerplate to "Run something in current context of shell, postprocessing stdout+stderr without disturbing return code":
# Redirect lines of stdin/stdout to some other function
# outfn and errfn get following arguments
# "cmd args.." "one line full of output"
: catch outfn errfn cmd args..
catch()
{
local ret o1 o2 tmp
tmp=$(mktemp "catch_XXXXXXX.tmp")
mkfifo "$tmp.out"
mkfifo "$tmp.err"
pipestdinto "$1" "${*:3}" <"$tmp.out" &
o1=$!
pipestdinto "$2" "${*:3}" <"$tmp.err" &
o2=$!
"${#:3}" >"$tmp.out" 2>"$tmp.err"
ret=$?
rm -f "$tmp.out" "$tmp.err" "$tmp"
wait $o1
wait $o2
return $ret
}
: pipestdinto cmd args..
pipestdinto()
{
local x
while read -r x; do "$@" "$x" </dev/null; done
}
STAMP()
{
date +%Y%m%d-%H%M%S
}
# example output function
NOTE()
{
echo "NOTE `STAMP`: $*"
}
ERR()
{
echo "ERR `STAMP`: $*" >&2
}
catch_example()
{
# Example use
catch NOTE ERR find /proc -ls
}
See the second-to-last line of the code (catch NOTE ERR find /proc -ls) for an example of how it is called.
I have a short bash script to check to see if a Python program is running. The program writes out a PID file when it runs, so comparing this to the current list of running processes gives me what I need. But I'm having a problem with a variable being changed and then apparently changing back! Here's the script:
#!/bin/bash
# Test whether Home Server is currently running
PIDFILE=/tmp/montSvr.pid
isRunning=0
# does a pid file exist?
if [ -f "$PIDFILE" ]; then
# pid file exists
# now get contents of pid file
cat $PIDFILE | while read PID; do
if [ $PID != "" ]; then
PSGREP=$(ps -A | grep $PID | awk '{print $1}')
if [ -n "$PSGREP" ]; then
isRunning=1
echo "RUNNING: $isRunning"
fi
fi
done
fi
echo "Running: $isRunning"
exit $isRunning
The output I get, when the Python script is running, is:
RUNNING: 1
Running: 0
And the exit value of the bash script is 0. So isRunning is getting changed within all those if statements (i.e., the code is performing as expected), but then somehow isRunning reverts to 0 again. Confused...
Commands after a pipe | are run in a subshell. Changes to variable values in a subshell do not propagate to the parent shell.
Solution: change your loop to
while read PID; do
# ...
done < $PIDFILE
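A quick way to see the effect in isolation (a throwaway sketch; the second half uses a bash here-string):
x=0
echo hello | while read line; do x=1; done
echo $x    # prints 0: the loop ran in a subshell
x=0
while read line; do x=1; done <<< hello
echo $x    # prints 1: redirection instead of a pipe, so no subshell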
It's the pipe that is the problem. Using a pipe in this way means that the loop runs in a sub-shell, with its own environment. Kill the cat, use this syntax instead:
while read PID; do
if [ $PID != "" ]; then
PSGREP=$(ps -A | grep $PID | awk '{print $1}')
if [ -n "$PSGREP" ]; then
isRunning=1
echo "RUNNING: $isRunning"
fi
fi
done < "$PIDFILE"
I have a script that I only want to be running one time. If the script gets called a second time I'm having it check to see if a lockfile exists. If the lockfile exists then I want to see if the process is actually running.
I've been messing around with pgrep but am not getting the expected results:
#!/bin/bash
COUNT=$(pgrep $(basename $0) | wc -l)
PSTREE=$(pgrep $(basename $0) ; pstree -p $$)
echo "###"
echo $COUNT
echo $PSTREE
echo "###"
echo "$(basename $0) :" `pgrep -d, $(basename $0)`
echo sleeping.....
sleep 10
The results I'm getting are:
$ ./test.sh
###
2
2581 2587 test.sh(2581)---test.sh(2587)---pstree(2591)
###
test.sh : 2581
sleeping.....
I don't understand why I'm getting a "2" when only one process is actually running.
Any ideas? I'm sure it's the way I'm calling it. I've tried a number of different combinations and can't quite seem to figure it out.
SOLUTION:
What I ended up doing was this (a portion of my script):
function check_lockfile {
# Check for previous lockfiles
if [ -e $LOCKFILE ]
then
echo "Lockfile $LOCKFILE already exists. Checking to see if process is actually running...." >> $LOGFILE 2>&1
# is it running?
if [ $(ps -elf | grep $(cat $LOCKFILE) | grep $(basename $0) | wc -l) -gt 0 ]
then
abort "ERROR! - Process is already running at PID: $(cat $LOCKFILE). Exitting..."
else
echo "Process is not running. Removing $LOCKFILE" >> $LOGFILE 2>&1
rm -f $LOCKFILE
fi
else
echo "Lockfile $LOCKFILE does not exist." >> $LOGFILE 2>&1
fi
}
function create_lockfile {
# Check for previous lockfile
check_lockfile
#Create lockfile with the contents of the PID
echo "Creating lockfile with PID:" $$ >> $LOGFILE 2>&1
echo -n $$ > $LOCKFILE
echo "" >> $LOGFILE 2>&1
}
# Acquire lock file
create_lockfile >> $LOGFILE 2>&1 \
|| echo "ERROR! - Failed to acquire lock!"
The argument for pgrep is an extended regular expression pattern.
In your case the command pgrep $(basename $0) will evaluate to pgrep test.sh, which will match any process name containing test followed by any single character followed by sh. So it will also match btest8sh, atest_shell, etc.
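If the loose regular-expression matching is the concern, an exact-match variant could look like this (assuming a procps-style pgrep, where -x requires the whole process name to match):
pgrep -x "$(basename "$0")"
COUNT=$(pgrep -x "$(basename "$0")" | wc -l)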
You should create a lock file. If the lock file exists, the program should exit.
lock=$(basename $0).lock
if [ -e $lock ]
then
echo Process is already running with PID=`cat $lock`
exit
else
echo $$ > $lock
fi
You are already opening a lock file. Use it to make your life easier.
Write the process id to the lock file. When you see the lock file exists, read it to see what process id it is supposedly locking, and check to see if that process is still running.
Then in version 2, you can also write program name, program arguments, program start time, etc. to guard against the case where a new process starts with the same process id.
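A minimal sketch of that idea (the lock file path is illustrative; kill -0 only tests whether the PID still exists):
lock=/tmp/$(basename $0).lock
if [ -f "$lock" ] && kill -0 "$(cat "$lock")" 2>/dev/null; then
echo "Already running as PID $(cat "$lock")"
exit 1
fi
echo $$ > "$lock"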
Put this near the top of your script...
pid=$$
script=$(basename $0)
guard="/tmp/$script-$(id -nu).pid"
if test -f $guard ; then
echo >&2 "ERROR: Script already runs... own PID=$pid"
ps auxw | grep $script | grep -v grep >&2
exit 1
fi
trap "rm -f $guard" EXIT
echo $pid >$guard
And yes, there IS a small window for a race condition between the test and echo commands, which can be fixed by appending to the guard file, and then checking that the first line is indeed our own PID. Also, the diagnostic output in the if can be commented out in a production version.
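A rough sketch of that append-and-verify idea (same variable names as above; note the EXIT trap would also need adjusting so a losing instance does not remove the winner's guard file):
echo $pid >> $guard
if [ "$(head -n 1 $guard)" != "$pid" ]; then
echo >&2 "ERROR: another instance won the race for $guard"
exit 1
fi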