How to modify call stack in Bash?

Suppose I want to write a smart logging function, log, that reads the line immediately after its invocation and stores that line and its output in a log file. The function can find, read and execute the line in question. The problem is that when the function returns, bash executes the line again.
Everything works fine except that the assignment to BASH_LINENO[0] is silently discarded. From http://wiki.bash-hackers.org/syntax/shellvars#bash_lineno I learned that the variable is not read-only.
function log()
{
BASH_LINENO[0]=$((${BASH_LINENO[0]}+1))
file=${BASH_SOURCE[1]##*/}
linenr=$((${BASH_LINENO[0]} + 1 ))
line=`sed "1,$((${linenr}-1)) d;${linenr} s/^ *//; q" $file`
if [ -f /tmp/tmp.txt ]; then
rm /tmp/tmp.txt
fi
exec 3>&1 4>&2 >>/tmp/tmp.txt 2>&1
set -x
eval $line
exitstatus=$?
set +x
exec 1>&3 2>&4 4>&- 3>&-
#Here goes the code that parses the /tmp/tmp.txt and stores it in the log
if [ "$exitstatus" -ne "0" ]; then
exit $exitstatus
fi
}
#Test case:
log
echo "Unfortunately this line gets appended twice" | tee -a bla.txt;

After consulting the users of the bug-bash@gnu.org mailing list, it appears that modifying the call stack is not possible after all. Here is the answer I got from Chet Ramey:
BASH_LINENO is a call stack; assignments to it should be (and are)
ignored. That's been the case since at least bash-3.2 (that's where I
quit looking).
There is an indirect way to force bash to not execute the next
command: set the extdebug option and have the DEBUG trap return a
non-zero status.
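A minimal sketch of that technique, assuming a reasonably recent bash. The names skip_next, skip_guard and demo are hypothetical and are not part of the log function discussed here. While skip_next is set, the DEBUG trap returns 1 and, because extdebug is enabled, bash does not execute the command that was about to run:

```bash
#!/usr/bin/env bash
# Sketch only: skip a single command via extdebug + a DEBUG trap.
shopt -s extdebug

skip_next=0   # hypothetical flag: 1 means "skip the next command"

# Runs before every simple command while the DEBUG trap is installed.
skip_guard() {
    if (( skip_next )); then
        skip_next=0
        return 1   # non-zero status: bash skips the pending command
    fi
    return 0       # zero status: the pending command runs normally
}

demo() {
    trap skip_guard DEBUG
    echo "first"
    skip_next=1
    echo "second"   # skipped, because skip_guard returns 1 here
    echo "third"
    trap - DEBUG
}

demo
```

The trap only decides whether the pending command runs; it does not change BASH_LINENO, which stays read-only in practice.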
The above technique works very well for my purposes. I am finally able to do a production version of the log function.
#!/bin/bash
shopt -s extdebug
repetition_count=0
_ERR_HDR_FMT="%.8s %s@%s:%s:%s"
_ERR_MSG_FMT="[${_ERR_HDR_FMT}]%s \$ "
msg() {
printf "$_ERR_MSG_FMT" $(date +%T) $USER $HOSTNAME $PWD/${BASH_SOURCE[2]##*/} ${BASH_LINENO[1]}
echo ${@}
}
function rlog()
{
case $- in *x*) USE_X="-x";; *) USE_X=;; esac
set +x
if [ "${BASH_LINENO[0]}" -ne "$myline" ]; then
repetition_count=0
return 0;
fi
if [ "$repetition_count" -gt "0" ]; then
return -1;
fi
if [ -z "$log" ]; then
return 0
fi
file=${BASH_SOURCE[1]##*/}
line=`sed "1,$((${myline}-1)) d;${myline} s/^ *//; q" $file`
if [ -f /tmp/tmp.txt ]; then
rm /tmp/tmp.txt
fi
echo "$line" > /tmp/tmp2.txt
mymsg=`msg`
exec 3>&1 4>&2 >>/tmp/tmp.txt 2>&1
set -x
source /tmp/tmp2.txt
exitstatus=$?
set +x
exec 1>&3 2>&4 4>&- 3>&-
repetition_count=1 #This flag is to prevent multiple execution of the current line of code. This condition gets checked at the beginning of the function
frstline=`sed '1q' /tmp/tmp.txt`
[[ "$frstline" =~ ^(\++)[^+].*$ ]]
# echo "BASH_REMATCH[1]=${BASH_REMATCH[1]}"
eval 'tmp="${BASH_REMATCH[1]}"'
pluscnt=$(( (${#tmp} + 1) *2 ))
pluses="\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+"
pluses=${pluses:0:$pluscnt}
commandlines="`awk \" gsub(/^${pluses}\\s/,\\\"\\\")\" /tmp/tmp.txt`"
n=0
#There might be more than 1 command in the debugged line. The next loop appends each command to the log.
while read -r line; do
if [ "$n" -ne "0" ]; then
echo "+ $line" >>$log
else
echo "${mymsg}$line" >>$log
n=1
fi
done <<< "$commandlines"
#Next line extracts all lines that are prefixed by a sufficient number of "+" (usually 3), that are immediately after the last line prefixed with $pluses, i.e. after the last command line.
awk "BEGIN {flag=0} /${pluses}/ { flag=1 } /^[^+]/ { if (flag==1) print \$0; }" /tmp/tmp.txt | tee -a $log
if [ "$exitstatus" -ne "0" ]; then
echo "## Exit status: $exitstatus" >>$log
fi
echo >>$log
if [ "$exitstatus" -ne "0" ]; then
exit $exitstatus
fi
if [ -n "$USE_X" ]; then
set -x
fi
return -1
}
log_next_line='eval if [ -n "$log" ]; then myline=$(($LINENO+1)); trap "rlog" DEBUG; fi;'
logoff='trap - DEBUG'
The usage of the file is intended as follows:
#!/bin/bash
log=mylog.log
if [ -f mylog.log ]; then
rm mylog.log
fi
. ./log.sh
a=example
x=a
$log_next_line
echo "KUKU!"
$log_next_line
echo $x
$log_next_line
echo ${!x}
$log_next_line
echo ${!x} > /dev/null
$log_next_line
echo "Proba">/tmp/mtmp.txt
$log_next_line
touch ${!x}.txt
$log_next_line
if [ $(( ${#a} + 6 )) -gt 10 ]; then echo "Too long string"; fi
$log_next_line
echo "\$a and \$x">/dev/null
$log_next_line
echo $x
$log_next_line
ls -l
$log_next_line
mkdir /ddad/adad/dad #Generates an error
The output (mylog.log):
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:14] $ echo 'KUKU!'
KUKU!
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:16] $ echo a
a
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:18] $ echo example
example
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:20] $ echo example
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:22] $ echo 1,2,3
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:24] $ touch example.txt
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:26] $ '[' 13 -gt 10 ']'
+ echo 'Too long string'
Too long string
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:28] $ echo '$a and $x'
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:30] $ echo a
a
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:32] $ ls -l
total 12
-rw-rw-r-- 1 adam adam 0 gru 4 13:39 example.txt
lrwxrwxrwx 1 adam adam 66 gru 4 13:29 log.sh -> /home/Adama-docs/Adam/Adam/MyDocs/praca/Puppet/bootstrap/common.sh
-rwxrwxr-x 1 adam adam 520 gru 4 13:29 log-test-case.sh
-rw-rw-r-- 1 adam adam 995 gru 4 13:39 mylog.log
[13:39:51 adam@adam-N56VZ:/home/Adama-docs/Adam/Adam/linux/tmp/log/log-test-case.sh:34] $ mkdir /ddad/adad/dad
mkdir: cannot create directory ‘/ddad/adad/dad’: No such file or directory
## Exit status: 1
The standard output is unchanged.
Limitations
Limitations are serious, unfortunately.
Exit code of logged command gets discarded
First of all, the exit code of the logged command is discarded, so user cannot test for it in the next statement. The current code exits the script if there was an error (which I believe is the best behavior). It is possible to modify the script to test
Limited support for bash tracing
The function honors bash tracing with -x. If it finds that the user traces output, it temporarily disables the output (as it would interfere with the trace anyway), and restores it back at the end. Unfortunately, it also appends a few extra lines to the trace.
Unless the user turns off logging (with $logoff), there is a considerable speed penalty for all commands after the first $log_next_line, even if no logging takes place.
In an ideal world the function would disable debug trapping (trap - DEBUG) after each invocation. Unfortunately I don't know how to do that, so beginning with the first $log_next_line macro, the interpretation of each line invokes a custom function.
I use this function before every key command in my complex bootstrapping scripts. With it I can see what exactly and when was executed and what was the output, without the need to really understand the logic of the lengthy and sometimes messy scripts.

Related

How to detect a non-rolling log file and pattern match in a shell script which is using tail, while, read, and?

I am monitoring a log file and if PATTERN didn't appear in it within THRESHOLD seconds, the script should print "error", otherwise, it should print "clear". The script is working fine, but only if the log is rolling.
I've tried using read with a timeout, but it didn't work.
log_file=/tmp/app.log
threshold=120
tail -Fn0 ${log_file} | \
while read line ; do
echo "${line}" | awk '/PATTERN/ { system("touch pattern.tmp") }'
# code to calculate how long ago pattern.tmp was touched; the result is assigned to diff
if [ ${diff} -gt ${threshold} ]; then
echo "Error"
else
echo "Clear"
fi
done
It is working as expected only when there is 'any' line printed in the app.log.
If the application got hung for any reason and the log stopped rolling, there won't be any output by the script.
Is there a way to detect the 'no output' of tail and do some command at that time?
It looks like the problem you're having is that the timing calculations inside your while loop never get a chance to run when read is blocking on input. In that case, you can pipe the tail output into a while true loop, inside of which you can do if read -t $timeout:
log_file=/tmp/app.log
threshold=120
timeout=10
tail -Fn0 "$log_file" | while true; do
if read -t $timeout line; then
echo "${line}" | awk '/PATTERN/ { system("touch pattern.tmp") }'
fi
# code to calculate how long ago pattern.tmp touched and same is assigned to diff
if [ ${diff} -gt ${threshold} ]; then
echo "Error"
else
echo "Clear"
fi
done
As Ed Morton pointed out, all caps variable names are not a good idea in bash scripts, so I used lowercase variable names.
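The placeholder comment in the loop can be filled in several ways. One possible sketch, assuming GNU stat (stat -c %Y prints the modification time in seconds since the epoch; BSD and macOS stat use different flags). The function name file_age is hypothetical:

```bash
#!/usr/bin/env bash
# Print the age of a file in seconds, based on its modification time.
# If the file does not exist, print the current epoch time instead, so a
# missing file always looks older than any realistic threshold.
file_age() {
    local file=$1 now mtime
    now=$(date +%s)
    if mtime=$(stat -c %Y "$file" 2>/dev/null); then   # GNU stat assumed
        echo $(( now - mtime ))
    else
        echo "$now"
    fi
}

# Inside the loop from the answer above, this could be used as:
#   diff=$(file_age pattern.tmp)
```

Treating a missing pattern.tmp as "very old" means the loop reports "Error" until the pattern has been seen at least once, which may or may not be what you want.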
How about something simple like:
sleep "$threshold"
grep -q 'PATTERN' "$log_file" && { echo "Clear"; exit; }
echo "Error"
If that's not all you need then edit your question to clarify your requirements. Don't use all upper case for non exported shell variable names btw - google it.
To build further on your idea, it might be beneficial to run the awk part in the background and a continuous loop to do the checking.
#!/usr/bin/env bash
log_file="log.txt"
# threshold in seconds
threshold=10
# run the following process in the background
stdbuf -oL tail -fn0 "$log_file" \
| awk '/PATTERN/ { system("touch pattern.tmp") }' &
while true; do
match=$(find . -type f -iname "pattern.tmp" -newermt "-${threshold} seconds")
if [[ -z "${match}" ]]; then
echo "Error"
else
echo "Clear"
fi
sleep 1 # avoid a busy loop
done
This looks to me like a watchdog timer. I've implemented something like this by forcing a background process to update my log, so I don't have to worry about read -t. Here's a working example:
#!/usr/bin/env bash
threshold=10
grain=2
errorstate=0
while sleep "$grain"; do
date '+[%F %T] watchdog timer' >> log
done &
trap "kill -HUP $!" 0 HUP INT QUIT TRAP ABRT TERM
printf -v lastseen '%(%s)T'
tail -F log | while read line; do
printf -v now '%(%s)T'
if (( now - lastseen > threshold )); then
echo "ERROR"
errorstate=1
else
if (( errorstate )); then
echo "Recovered, yay"
errorstate=0
fi
fi
if [[ $line =~ .*PATTERN.* ]]; then
lastseen=$now
fi
done
Run this in one window, wait $threshold seconds for it to trigger, then in another window echo PATTERN >> log to see the recovery.
While this can be made as granular as you like (I've set it to 2 seconds in the example), it does pollute your log file.
Oh, and note that the printf '%(%s)T' format requires bash 4.2 or later; calling it without an argument to get the current time, as done here, requires bash 4.3 or later.

trap ERR doesn't work with pipes

I am trying to make a system backup script with trap "" ERR. I realized the trap doesn't get called when commands are part of a pipeline (|).
Here are some parts of my code that don't work with trap "" ERR ...
OpenFiles=$(lsof "$Source" | wc -l)
PackagesList=$(dpkg --get-selections | awk '!/deinstall|purge|hold/ {print $1}' | tee "$FilePackagesList")
How can I get this to work without using if [ "$?" -eq 0 ]; then, or similar code? Avoiding that is the reason I declared a trap in the first place.
Here is the script ...
root@Lian-Li:~# cat /usr/local/bin/create_incremental_backup_of_system.sh
#!/bin/bash
# Create an incremental GNU-standard backup of important system-files.
# This script works with Debian Jessie and newer systems.
# Created for my lian-li NAS 2016-11-27.
MailTo="admin@example.com" # Mail Address of an admin
Source="boot etc root usr/local usr/lib/cgi-bin var/www"
BackupDirectory=/media/hdd1/backups/lian-li
SubDir="system.d"
FileTimeStamp=$(date "+%Y%m%d%H%M%S")
FileName=$(uname -n)
File="${BackupDirectory}/${SubDir}/${FileName}-${FileTimeStamp}.tgz"
FileIncremental="${BackupDirectory}/${SubDir}/${FileName}.gtar"
FilePackagesList="${BackupDirectory}/${SubDir}/installed_packages_on_${FileName}.txt"
# have2do ...
# Backup rotate
MailContent="None"
TimeStamp=$(date "+%F %T") # This format "2011-12-31 23:59:59" is needed to read the journal
exec 1> >(logger -i -s -t "$0" -p 3) 2>&1 # all error messages are redirected to syslog journal and after that to stdout
trap "BriefExit" ERR # Provide information for an admin (via sendmail) when an error occurred and exit the script
function BriefExit(){
rm -f "$File"
if [ "$MailContent" = "None" ]
then
case "$LANG" in
de_DE.UTF-8)
echo "Beende Skript, aufgrund vorherige Fehler." 1>&2
;;
*)
echo "Stopping script because of previous error(s)." 1>&2
;;
esac
MailContent=$(journalctl -p 3 -o "short" --since="$TimeStamp" --no-pager)
ScriptName="${0##*/}"
SystemName=$(uname -n)
MailSubject="${SystemName}: ${ScriptName}"
echo -e "Subject: ${MailSubject}\n\n${MailContent}\n" | sendmail "$MailTo"
fi
exit 1
}
if [ ! -d "${BackupDirectory}/${SubDir}" ]
then
mkdir -p "${BackupDirectory}/${SubDir}"
fi
LoopCount=0
OpenFiles=1
cd /
while [ "$OpenFiles" -ne 0 ]
do
if [ "$LoopCount" -le 180 ]
then
sleep 1
OpenFiles=$(lsof $Source | wc -l)
LoopCount=$(($LoopCount + 1))
else
echo "Closing Script. Reason: Can't create incremental backup, because some files are open." 1>&2
BriefExit
fi
done
tar -cpzf "$File" -g "$FileIncremental" $Source
chmod 0700 "$File"
PackagesList=$(dpkg --get-selections | awk '!/deinstall|purge|hold/ {print $1}' | tee "$FilePackagesList")
while read -r PackageName
do
case "$PackageName" in
minidlna)
# Code ...
;;
slapd)
# Code ...
;;
esac
done <<< "$PackagesList"
exit 0
This isn't a problem with ERR traps at all, or with command substitutions, but with pipelines.
false | true
returns true, unless the pipefail option is set.
Thus in OpenFiles=$(lsof "$Source" | wc -l), only a failure in wc will cause the pipeline to be considered a failure, or in PackagesList=$(dpkg --get-selections | awk '!/deinstall|purge|hold/ {print $1}' | tee "$FilePackagesList"), only a failure in tee will cause the command as a whole to be considered failed.
Put the command set -o pipefail at the top of your script if you want a failure from any pipeline component (as opposed to the last component alone) to cause the command as a whole to be considered failed -- and note the other caveats for ERR traps given in BashFAQ #105.
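The difference can be seen in a small sketch. The function name demo_err_trap and the messages are hypothetical; each case runs in its own subshell so that the option and the trap do not leak into the rest of the script:

```bash
#!/usr/bin/env bash
demo_err_trap() {
    (
        # Default behaviour: only the last stage of the pipeline counts.
        trap 'echo "ERR trap fired (default)"' ERR
        false | true
        echo "default: pipeline status $?"
    )
    (
        # With pipefail, a failure in any stage fails the whole pipeline.
        set -o pipefail
        trap 'echo "ERR trap fired (pipefail)"' ERR
        false | true
        echo "pipefail: reached the next command"
    )
}

demo_err_trap
```

Only the second case triggers the ERR trap. Note that the trap here merely prints a message; a handler such as BriefExit in the question would exit instead of continuing.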
Another alternative is to look at the status for each stage in the pipeline:
# cat test_bash_return.bash
true | true | false | true
echo "${PIPESTATUS[@]}"
# ./test_bash_return.bash
0 0 1 0

Write and read from a fifo from two different script

I have two bash scripts.
One script writes to a fifo. The second one reads from the fifo, but only AFTER the first one has finished writing.
But something does not work, and I do not understand where the problem is. Here is the code.
The first script is (the writer):
#!/bin/bash
fifo_name="myfifo";
# If it does not exist, create the fifo;
[ -p $fifo_name ] || mkfifo $fifo_name;
exec 3<> $fifo_name;
echo "foo" > $fifo_name;
echo "bar" > $fifo_name;
The second script is (the reader):
#!/bin/bash
fifo_name="myfifo";
while true
do
if read line <$fifo_name; then
# if [[ "$line" == 'ar' ]]; then
# break
#fi
echo $line
fi
done
Can anyone help me please?
Thank you
Replace the second script with:
#!/bin/bash
fifo_name="myfifo"
while true
do
if read line; then
echo $line
fi
done <"$fifo_name"
This opens the fifo only once and reads every line from it.
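A self-contained sketch of the "open once" idea, with both ends in one script for illustration. The function name fifo_demo and the temporary directory are assumptions, not part of the original scripts. The writer opens the fifo once and writes two lines; the reader opens it once and uses while read, so the loop ends at end-of-file when the writer closes its end:

```bash
#!/usr/bin/env bash
fifo_demo() {
    local dir fifo line
    dir=$(mktemp -d) || return 1
    fifo="$dir/myfifo"
    mkfifo "$fifo" || return 1

    # Writer: a single open of the fifo covers both echo commands.
    { echo "foo"; echo "bar"; } > "$fifo" &

    # Reader: the redirection on "done" opens the fifo exactly once.
    while read -r line; do
        echo "got: $line"
    done < "$fifo"

    wait            # reap the background writer
    rm -rf "$dir"
}

fifo_demo
```

Unlike the while true loop above, this loop terminates once the writer has finished, which is a design choice: keep while true only if the reader should wait for later writers.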
The problem with your setup is that the fifo is created in the wrong script: if you want fifo access to be limited to the time when the reader is actually running, the reader should create (and remove) the fifo. To correct the problem, you can do something like this:
reader: fifo_read.sh
#!/bin/bash
fifo_name="/tmp/myfifo" # fifo name
trap "rm -f $fifo_name" EXIT # set trap to rm fifo_name at exit
[ -p "$fifo_name" ] || mkfifo "$fifo_name" # if fifo not found, create
exec 3< $fifo_name # redirect fifo_name to fd 3
# (not required, but makes read clearer)
while :; do
if read -r -u 3 line; then # read line from fifo_name
if [ "$line" = 'quit' ]; then # if line is quit, quit
printf "%s: 'quit' command received\n" "$fifo_name"
break
fi
printf "%s: %s\n" "$fifo_name" "$line" # print line read
fi
done
exec 3<&- # reset fd 3 redirection
exit 0
writer: fifo_write.sh
#!/bin/bash
fifo_name="/tmp/myfifo"
# If it does not exist, exit :);
[ -p "$fifo_name" ] || {
printf "\n Error fifo '%s' not found.\n\n" "$fifo_name"
exit 1
}
[ -n "$1" ] &&
printf "%s\n" "$1" > "$fifo_name" ||
printf "pid: '%s' writing to fifo\n" "$$" > "$fifo_name"
exit 0
operation: (start reader in 1st terminal)
$ ./fifo_read.sh # you can background with & at end
(launch writer in second terminal)
$ ./fifo_write.sh "message from writer" # second terminal
$ ./fifo_write.sh
$ ./fifo_write.sh quit
output in 1st terminal:
$ ./fifo_read.sh
/tmp/myfifo: message from writer
/tmp/myfifo: pid: '28698' writing to fifo
/tmp/myfifo: 'quit' command received
The following script should do the job:
#!/bin/bash
FIFO="/tmp/fifo"
if [ ! -e "$FIFO" ]; then
mkfifo "$FIFO"
fi
for script in "$@"; do
echo "$script" > "$FIFO" &
done
while read -r script; do
/bin/bash -c "$script"
done < "$FIFO"
Given two scripts a.sh and b.sh that print "a" and "b" to stdout, respectively, one will get the following result (given that the script above is called test.sh):
./test.sh /tmp/a.sh /tmp/b.sh
a
b

How to exit from a method in shell script

I am new to shell scripting and stuck on a problem. In my shell function, if a validation issue is found, the rest of the program should not execute and a message should be shown to the user. The validation itself works, but when I use exit 0 it only leaves the validation loop, not the whole function.
config_wuigm_parameters () {
echo "Starting to config parameters for WUIGM....." | tee -a $log
prepare_wuigm_conf_file
echo "Configing WUIGM parameters....." | tee -a $log
local parafile=`dirname $0`/wuigm.conf
local pname=""
local pvalue=""
create_preference_template
cat ${parafile} |while read -r line;do
pname=`echo $line | egrep -e "^([^#]*)=(.*)" | cut -d '=' -f 1`
if [ -n "$pname" ] ; then
lsearch=`echo $line | grep "[<|>|\"]" `
if [ -n "$lsearch" ] ; then
echo validtion=$lsearch
echo "< or > character present , Replace < with &lt; and > with &gt;"
exit 1;
else
pvalue=`echo $line | egrep -e "^([^#]*)=(.*)" | cut -d '=' -f 2- `
echo "<entry key=\"$pname\" value=\"$pvalue\"/>" >> $prefs
echo "Configured : ${pname} = ${pvalue} " | tee -a $log
fi
fi
done
echo $validtion
echo "</map>" >> $prefs
# Copy the file to the original location
cp -f $prefs /root/.java/.userPrefs/com/ericsson/pgm/xwx
# removing the local temp file
rm -f $prefs
reboot_server
}
Any help would be great
This happens because the construction
cat file | while read ...
starts a new (sub)shell.
The following example shows the difference:
echoline() {
cat "$1" | while read -r line
do
echo ==$line==
exit 1
done
echo "Still here after the exit"
}
echoline "$@"
Compare that with this version:
echoline() {
while read -r line
do
echo ==$line==
exit 1
done < "$1"
echo "This is not printed after the exit"
}
echoline "$@"
Using return doesn't help either (because of the subshell). This version:
echoline() {
cat "$1" | while read -r line
do
echo ==$line==
return 1
done
echo "Still here"
}
echoline "$@"
will still print "Still here".
So, if you want to exit the script, use:
while read ...
do
...
done < input # this does not start a new subshell
If you want to exit just the function (return from it), you must check the exit status of the preceding command, like this:
echoline() {
cat "$1" | while read -r line
do
echo ==$line==
exit 1
done || return 1
echo "In case of exit (or return), this is not printed"
}
echoline "$@"
echo "After the function call"
Instead of || you can use
[ $? != 0 ] && return 1
just after the while.
You use the return instruction to exit a function with a value.
return [n]
Causes a function to exit with the return value specified by n. If n is omitted, the return status is that of the last command executed in the function body. If used outside a function, but during execution of a script by the . (source) command, it causes the shell to stop executing that script and return either n or the exit status of the last command executed within the script as the exit status of the script. If used outside a function and not during execution of a script by ., the return status is false. Any command associated with the RETURN trap is executed before execution resumes after the function or script.
If you want to exit a loop, use the break instruction instead:
break [n]
Exit from within a for, while, until, or select loop. If n is specified, break n levels. n must be ≥ 1. If n is greater than the number of enclosing loops, all enclosing loops are exited. The return value is 0 unless n is not greater than or equal to 1.
The exit instruction, by contrast, exits the current shell, and so normally the program as a whole. If you use subshells (code written between parentheses), only that subshell exits.
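The three instructions can be compared side by side in a short sketch. The function names below are hypothetical and only serve as illustration:

```bash
#!/usr/bin/env bash
# return: leaves the function as soon as a negative number is found.
first_negative() {
    local n
    for n in "$@"; do
        if (( n < 0 )); then
            echo "$n"
            return 0
        fi
    done
    return 1
}

# break: leaves only the loop; the code after the loop still runs.
count_until_zero() {
    local n count=0
    for n in "$@"; do
        (( n == 0 )) && break
        count=$(( count + 1 ))
    done
    echo "$count"
}

# exit inside parentheses: only the subshell ends, the function goes on.
subshell_exit() {
    ( exit 3 )
    echo "subshell exited with $?; function still running"
}

first_negative 3 -2 5
count_until_zero 4 5 0 6
subshell_exit
```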

Determining if process is running using pgrep

I have a script that I only want to be running one time. If the script gets called a second time I'm having it check to see if a lockfile exists. If the lockfile exists then I want to see if the process is actually running.
I've been messing around with pgrep but am not getting the expected results:
#!/bin/bash
COUNT=$(pgrep $(basename $0) | wc -l)
PSTREE=$(pgrep $(basename $0) ; pstree -p $$)
echo "###"
echo $COUNT
echo $PSTREE
echo "###"
echo "$(basename $0) :" `pgrep -d, $(basename $0)`
echo sleeping.....
sleep 10
The results I'm getting are:
$ ./test.sh
###
2
2581 2587 test.sh(2581)---test.sh(2587)---pstree(2591)
###
test.sh : 2581
sleeping.....
I don't understand why I'm getting a "2" when only one process is actually running.
Any ideas? I'm sure it's the way I'm calling it. I've tried a number of different combinations and can't quite seem to figure it out.
SOLUTION:
What I ended up doing (portion of my script):
function check_lockfile {
# Check for previous lockfiles
if [ -e $LOCKFILE ]
then
echo "Lockfile $LOCKFILE already exists. Checking to see if process is actually running...." >> $LOGFILE 2>&1
# is it running?
if [ $(ps -elf | grep $(cat $LOCKFILE) | grep $(basename $0) | wc -l) -gt 0 ]
then
abort "ERROR! - Process is already running at PID: $(cat $LOCKFILE). Exiting..."
else
echo "Process is not running. Removing $LOCKFILE" >> $LOGFILE 2>&1
rm -f $LOCKFILE
fi
else
echo "Lockfile $LOCKFILE does not exist." >> $LOGFILE 2>&1
fi
}
function create_lockfile {
# Check for previous lockfile
check_lockfile
#Create lockfile with the contents of the PID
echo "Creating lockfile with PID:" $$ >> $LOGFILE 2>&1
echo -n $$ > $LOCKFILE
echo "" >> $LOGFILE 2>&1
}
# Acquire lock file
create_lockfile >> $LOGFILE 2>&1 \
|| echo "ERROR! - Failed to acquire lock!"
The argument for pgrep is an extended regular expression pattern.
In your case the command pgrep $(basename $0) evaluates to pgrep test.sh, which matches any process whose name contains test followed by any single character followed by sh. So it will also match btest8sh, atest_shell, etc.
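The matching rule can be checked without pgrep, since bash's =~ operator also uses extended regular expressions. This is a sketch that assumes both agree for such a simple pattern; the function name matching_names is hypothetical:

```bash
#!/usr/bin/env bash
# Print every name (arguments 2..n) that matches the ERE in argument 1.
matching_names() {
    local pattern=$1 name
    shift
    for name in "$@"; do
        if [[ $name =~ $pattern ]]; then
            echo "$name"
        fi
    done
}

echo "unanchored pattern:"
matching_names 'test.sh' test.sh btest8sh atest_shell

echo "anchored pattern with an escaped dot:"
matching_names '^test\.sh$' test.sh btest8sh atest_shell
```

The unanchored pattern matches all three names, the anchored one only test.sh. With pgrep itself, the -x option requires an exact match of the process name.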
You should create a lock file. If the lock file exists program should exit.
lock=$(basename $0).lock
if [ -e $lock ]
then
echo Process is already running with PID=`cat $lock`
exit
else
echo $$ > $lock
fi
You are already opening a lock file. Use it to make your life easier.
Write the process id to the lock file. When you see the lock file exists, read it to see what process id it is supposedly locking, and check to see if that process is still running.
Then in version 2, you can also write program name, program arguments, program start time, etc. to guard against the case where a new process starts with the same process id.
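A minimal sketch of that check, assuming the lock file contains nothing but the PID. The function name lock_is_stale is hypothetical. It uses kill -0, which sends no signal and only tests whether the process can be signalled:

```bash
#!/usr/bin/env bash
# Succeeds (status 0) if the lock file exists but its PID is not running.
lock_is_stale() {
    local lockfile=$1 pid
    [ -f "$lockfile" ] || return 1            # no lock file: nothing stale
    pid=$(cat "$lockfile" 2>/dev/null)
    [[ $pid =~ ^[0-9]+$ ]] || return 0        # unreadable content: stale
    if kill -0 "$pid" 2>/dev/null; then
        return 1                              # process exists: lock is live
    fi
    return 0
}

# Usage sketch (the path is an assumption):
#   if lock_is_stale /tmp/myscript.lock; then rm -f /tmp/myscript.lock; fi
```

One caveat: kill -0 also fails when the process exists but belongs to another user, so this sketch is only reliable when all instances run under the same account.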
Put this near the top of your script...
pid=$$
script=$(basename $0)
guard="/tmp/$script-$(id -nu).pid"
if test -f $guard ; then
echo >&2 "ERROR: Script already runs... own PID=$pid"
ps auxw | grep $script | grep -v grep >&2
exit 1
fi
trap "rm -f $guard" EXIT
echo $pid >$guard
And yes, there IS a small window for a race condition between the test and echo commands, which can be fixed by appending to the guard file, and then checking that the first line is indeed our own PID. Also, the diagnostic output in the if can be commented out in a production version.
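The append-and-check idea could look roughly like this. It is a sketch only; the function name acquire_guard is hypothetical, and it assumes that short appends to a local file do not interleave:

```bash
#!/usr/bin/env bash
# Append our PID to the guard file, then check who got there first.
# Succeeds only if the first line of the file is our own PID.
acquire_guard() {
    local guard=$1 pid=$2 first
    echo "$pid" >> "$guard" || return 1
    read -r first < "$guard" || return 1
    [ "$first" = "$pid" ]
}

# Usage sketch, replacing the test/echo pair above:
#   acquire_guard "$guard" "$$" || { echo >&2 "ERROR: Script already runs"; exit 1; }
#   trap "rm -f $guard" EXIT
```

The losing process leaves its PID as an extra line in the file, which is harmless as long as the winner removes the guard file on exit.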
