locking a PID file - bash

I have a function in a bash script that runs indefinitely in the background and that shall be terminated by running the same script again. It is a sort of switch: when I invoke this script, it starts the function, or kills it if it is already running. To do this I use a PID file:
#!/bin/bash
background_function() {
...
}
if [[ ! -s myscript.pid ]]
then
background_function &
echo $! > myscript.pid
else
kill $(cat myscript.pid) && rm myscript.pid
fi
Now, I would like to avoid multiple instances running and race conditions. I tried to use flock and I rewrote the above code in this way:
#!/bin/bash
background_function() {
...
}
exec 200>myscript.pid
if flock -n 200
then
background_function &
echo $! > myscript.pid
else
kill $(cat myscript.pid) && rm myscript.pid
fi
In doing so, however, I have a lock on the PID file, but every time I launch the script again the PID file is truncated by exec 200>myscript.pid, and so I am unable to retrieve the PID of the already-running instance and kill it.
What can I do? Should I use two different files, a PID file and a lock file? Or would it be better to implement another locking mechanism using mkdir or touch? Thanks.

If an echo $$ is atomic enough for you, you could use:
echo $$ >> lock.pid
lockedby=`head -1 lock.pid`
if [ "$$" != "$lockedby" ] ; then
kill -9 $lockedby
echo $$ > lock.pid
echo "Murdered $lockedby because it had the lock"
fi
# do things in the script
rm lock.pid


What's a quick-and-dirty way to make sure that only one instance of a shell script is running at a given time?
Use flock(1) to make an exclusive scoped lock on a file descriptor. This way you can even synchronize different parts of the script.
#!/bin/bash
(
# Wait for lock on /var/lock/.myscript.exclusivelock (fd 200) for 10 seconds
flock -x -w 10 200 || exit 1
# Do stuff
) 200>/var/lock/.myscript.exclusivelock
This ensures that code between ( and ) is run only by one process at a time and that the process doesn’t wait too long for a lock.
Caveat: this particular command is a part of util-linux. If you run an operating system other than Linux, it may or may not be available.
Naive approaches that test the existence of "lock files" are flawed.
Why? Because they don't check whether the file exists and create it in a single atomic action. Because of this, there is a race condition that WILL make your attempts at mutual exclusion break.
Instead, you can use mkdir. mkdir creates a directory if it doesn't exist yet, and if it does, it sets an exit code. More importantly, it does all this in a single atomic action making it perfect for this scenario.
if ! mkdir /tmp/myscript.lock 2>/dev/null; then
echo "Myscript is already running." >&2
exit 1
fi
For all details, see the excellent BashFAQ: http://mywiki.wooledge.org/BashFAQ/045
If you want to take care of stale locks, fuser(1) comes in handy. The only downside here is that the operation takes about a second, so it isn't instant.
Here's a function I wrote once that solves the problem using fuser:
# mutex file
#
# Open a mutual exclusion lock on the file, unless another process already owns one.
#
# If the file is already locked by another process, the operation fails.
# This function defines a lock on a file as having a file descriptor open to the file.
# This function uses FD 9 to open a lock on the file. To release the lock, close FD 9:
# exec 9>&-
#
mutex() {
local file=$1 pid pids
exec 9>>"$file"
{ pids=$(fuser -f "$file"); } 2>&- 9>&-
for pid in $pids; do
[[ $pid = $$ ]] && continue
exec 9>&-
return 1 # Locked by a pid.
done
}
You can use it in a script like so:
mutex /var/run/myscript.lock || { echo "Already running." >&2; exit 1; }
If you don't care about portability (these solutions should work on pretty much any UNIX box), Linux' fuser(1) offers some additional options and there is also flock(1).
Here's an implementation that uses a lockfile and echoes a PID into it. This serves as a protection if the process is killed before removing the pidfile:
LOCKFILE=/tmp/lock.txt
if [ -e ${LOCKFILE} ] && kill -0 `cat ${LOCKFILE}`; then
echo "already running"
exit
fi
# make sure the lockfile is removed when we exit and then claim it
trap "rm -f ${LOCKFILE}; exit" INT TERM EXIT
echo $$ > ${LOCKFILE}
# do stuff
sleep 1000
rm -f ${LOCKFILE}
The trick here is the kill -0 which doesn't deliver any signal but just checks if a process with the given PID exists. Also the call to trap will ensure that the lockfile is removed even when your process is killed (except kill -9).
There's a wrapper around the flock(2) system call called, unimaginatively, flock(1). This makes it relatively easy to reliably obtain exclusive locks without worrying about cleanup etc. There are examples on the man page as to how to use it in a shell script.
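For reference, a minimal sketch of the command form from the man page; the lock path and script name here are placeholders, and the locked file itself carries no data, it only serves as the lock:
#!/bin/bash
# Run the real work under an exclusive lock, giving up immediately (-n)
# if another instance already holds it.
flock -n /var/lock/myscript.lock -c "/usr/local/bin/myscript-body.sh" || {
echo "another instance is running" >&2
exit 1
}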
To make locking reliable you need an atomic operation. Many of the above proposals are not atomic. The proposed lockfile(1) utility looks promising, as its man page mentions that it is "NFS-resistant". If your OS does not support lockfile(1) and your solution has to work on NFS, you do not have many options...
NFSv2 has two atomic operations:
symlink
rename
With NFSv3 the create call is also atomic.
Directory operations are NOT atomic under NFSv2 and NFSv3 (please refer to the book 'NFS Illustrated' by Brent Callaghan, ISBN 0-201-32570-5; Brent is a NFS-veteran at Sun).
Knowing this, you can implement spin-locks for files and directories (in shell, not PHP):
lock current dir:
while ! ln -s . lock; do :; done
lock a file:
while ! ln -s ${f} ${f}.lock; do :; done
unlock current dir (assuming the running process really acquired the lock):
mv lock deleteme && rm deleteme
unlock a file (assuming the running process really acquired the lock):
mv ${f}.lock ${f}.deleteme && rm ${f}.deleteme
Remove is also not atomic, therefore first the rename (which is atomic) and then the remove.
For the symlink and rename calls, both filenames have to reside on the same filesystem. My proposal: use only simple filenames (no paths) and put file and lock into the same directory.
You need an atomic operation, like flock, else this will eventually fail.
But what to do if flock is not available? Well, there is mkdir. That's an atomic operation too. Only one process will succeed with the mkdir; all others will fail.
So the code is:
if mkdir /var/lock/.myscript.exclusivelock
then
# do stuff
:
rmdir /var/lock/.myscript.exclusivelock
fi
You need to take care of stale locks, else after a crash your script will never run again.
Another option is to use shell's noclobber option by running set -C. Then > will fail if the file already exists.
In brief:
set -C
lockfile="/tmp/locktest.lock"
if echo "$$" > "$lockfile"; then
echo "Successfully acquired lock"
# do work
rm "$lockfile" # XXX or via trap - see below
else
echo "Cannot acquire lock - already locked by $(cat "$lockfile")"
fi
This causes the shell to call:
open(pathname, O_CREAT|O_EXCL)
which atomically creates the file or fails if the file already exists.
According to a comment on BashFAQ 045, this may fail in ksh88, but it works in all my shells:
$ strace -e trace=creat,open -f /bin/bash /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = 3
$ strace -e trace=creat,open -f /bin/zsh /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_LARGEFILE, 0666) = 3
$ strace -e trace=creat,open -f /bin/pdksh /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_TRUNC|O_LARGEFILE, 0666) = 3
$ strace -e trace=creat,open -f /bin/dash /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = 3
Interesting that pdksh adds the O_TRUNC flag, but obviously it's redundant:
either you're creating an empty file, or you're not doing anything.
How you do the rm depends on how you want unclean exits to be handled.
Delete on clean exit
New runs fail until the issue that caused the unclean exit is resolved and the lockfile is manually removed.
# acquire lock
# do work (code here may call exit, etc.)
rm "$lockfile"
Delete on any exit
New runs succeed provided the script is not already running.
trap 'rm "$lockfile"' EXIT
You can use GNU Parallel for this as it works as a mutex when called as sem. So, in concrete terms, you can use:
sem --id SCRIPTSINGLETON yourScript
If you want a timeout too, use:
sem --id SCRIPTSINGLETON --semaphoretimeout -10 yourScript
A timeout of <0 means exit without running the script if the semaphore is not released within the timeout; a timeout of >0 means run the script anyway.
Note that you should give it a name (with --id) else it defaults to the controlling terminal.
GNU Parallel is a very simple install on most Linux/OSX/Unix platforms - it is just a Perl script.
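For instance, a sketch of wrapping a job this way; the id and path are examples, and --fg keeps sem in the foreground so the job's exit code is propagated:
#!/bin/bash
# Only one nightly-backup may run at a time; a second invocation waits
# on the semaphore instead of starting concurrently.
sem --id nightly-backup --fg /usr/local/bin/backup.sh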
For shell scripts, I tend to go with mkdir over flock as it makes the locks more portable.
Either way, using set -e isn't enough. That only exits the script if any command fails. Your locks will still be left behind.
For proper lock cleanup, you really should set your traps to something like this pseudo code (lifted, simplified, and untested, but from actively used scripts):
#=======================================================================
# Predefined Global Variables
#=======================================================================
TMP_DIR=/tmp/myapp
[[ ! -d $TMP_DIR ]] \
&& mkdir -p $TMP_DIR \
&& chmod 700 $TMP_DIR
LOCK_DIR=$TMP_DIR/lock
#=======================================================================
# Functions
#=======================================================================
function mklock {
__lockdir="$LOCK_DIR/$(date +%s.%N).$$" # Private Global. Use Epoch.Nano.PID
# If it can create $LOCK_DIR then no other instance is running
if mkdir $LOCK_DIR 2>/dev/null
then
mkdir $__lockdir # create this instance's specific lock in queue
LOCK_EXISTS=true # Global
else
echo "FATAL: Lock already exists. Another copy is running or manually lock clean up required."
exit 1001 # Or work out some sleep_while_execution_lock elsewhere
fi
}
function rmlock {
[[ ! -d $__lockdir ]] \
&& echo "WARNING: Lock is missing. $__lockdir does not exist" \
|| rmdir $__lockdir
}
#-----------------------------------------------------------------------
# Private Signal Traps Functions {{{2
#
# DANGER: SIGKILL cannot be trapped. So, try not to `kill -9 PID` or
# there will be *NO CLEAN UP*. You'll have to manually remove
# any locks in place.
#-----------------------------------------------------------------------
function __sig_exit {
# Place your clean up logic here
# Remove the LOCK
[[ -n $LOCK_EXISTS ]] && rmlock
}
function __sig_int {
echo "WARNING: SIGINT caught"
exit 1002
}
function __sig_quit {
echo "SIGQUIT caught"
exit 1003
}
function __sig_term {
echo "WARNING: SIGTERM caught"
exit 1015
}
#=======================================================================
# Main
#=======================================================================
# Set TRAPs
trap __sig_exit EXIT # SIGEXIT
trap __sig_int INT # SIGINT
trap __sig_quit QUIT # SIGQUIT
trap __sig_term TERM # SIGTERM
mklock
# CODE
exit # No need for cleanup code here being in the __sig_exit trap function
Here's what will happen: all traps produce an exit, so the function __sig_exit will always run (barring a SIGKILL), and it cleans up your locks.
Note: my exit values are not low values. Why? Various batch processing systems make or have expectations of the numbers 0 through 31. Setting them to something else, I can have my scripts and batch streams react accordingly to the previous batch job or script.
Really quick and really dirty? This one-liner on the top of your script will work:
[[ $(pgrep -c "`basename \"$0\"`") -gt 1 ]] && exit
Of course, just make sure that your script name is unique. :)
Here's an approach that combines atomic directory locking with a check for stale lock via PID and restart if stale. Also, this does not rely on any bashisms.
#!/bin/dash
SCRIPTNAME=$(basename $0)
LOCKDIR="/var/lock/${SCRIPTNAME}"
PIDFILE="${LOCKDIR}/pid"
if ! mkdir $LOCKDIR 2>/dev/null
then
# lock failed, but check for stale one by checking if the PID is really existing
PID=$(cat $PIDFILE)
if ! kill -0 $PID 2>/dev/null
then
echo "Removing stale lock of nonexistent PID ${PID}" >&2
rm -rf $LOCKDIR
echo "Restarting myself (${SCRIPTNAME})" >&2
exec "$0" "$#"
fi
echo "$SCRIPTNAME is already running, bailing out" >&2
exit 1
else
# lock successfully acquired, save PID
echo $$ > $PIDFILE
fi
trap "rm -rf ${LOCKDIR}" QUIT INT TERM EXIT
echo hello
sleep 30s
echo bye
Create a lock file in a known location and check for existence on script start? Putting the PID in the file might be helpful if someone's attempting to track down an errant instance that's preventing execution of the script.
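As a sketch of that idea (path and message are illustrative; note it still has the check-then-create race discussed in other answers):
PIDFILE=/var/run/myscript.pid
if [ -e "$PIDFILE" ]; then
echo "Already running? Lock held by PID $(cat "$PIDFILE")" >&2
exit 1
fi
echo $$ > "$PIDFILE" # record our PID for whoever needs to track us down
trap 'rm -f "$PIDFILE"' EXIT
# script body here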
This example is explained in man flock, but it needs some improvements, because we should handle errors and exit codes:
#!/bin/bash
# set -e is not used here: it makes the script fail whenever any command exits with a status above 0, with no way to capture exit codes; not every command that exits >0 has failed.
( #start subprocess
# Wait for lock on /var/lock/.myscript.exclusivelock (fd 200) for 10 seconds
flock -x -w 10 200
if [ "$?" != "0" ]; then echo Cannot lock!; exit 1; fi
echo $$ >> /var/lock/.myscript.exclusivelock # for backward lockdir compatibility; note this runs AFTER the redirection "200>/var/lock/.myscript.exclusivelock" at the bottom has opened the file
# Do stuff
# You can properly manage exit codes across multiple commands and processing steps.
# I suggest moving all of this into an external procedure that can properly handle exit codes.
) 200>/var/lock/.myscript.exclusivelock #exit subprocess
FLOCKEXIT=$? #save exitcode status
#do some finish commands
exit $FLOCKEXIT # return the exit code properly; may be useful inside external scripts
You could use another method that I have used in the past: listing processes. But it is more complicated than the method above: you list processes with ps, filter by the script's name, add grep -v grep to remove the grep itself, finally count the matches with grep -c . and compare the number. It's complicated and unreliable.
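For illustration only, that fragile ps-based count might look like this, with all the caveats just mentioned:
# count instances of this script, excluding the grep itself
running=$(ps aux | grep "myscript.sh" | grep -v grep | grep -c .)
if [ "$running" -gt 1 ]; then
echo "Another instance appears to be running" >&2
exit 1
fi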
The existing answers posted either rely on the CLI utility flock or do not properly secure the lock file. The flock utility is not available on all non-Linux systems (e.g. FreeBSD), and does not work properly on NFS.
In my early days of system administration and system development, I was told that a safe and relatively portable method of creating a lock file was to create a temp file using mktemp(3) or mktemp(1), write identifying information to the temp file (e.g. the PID), then hard link the temp file to the lock file. If the link was successful, then you have successfully obtained the lock.
When using locks in shell scripts, I typically place an obtain_lock() function in a shared profile and then source it from the scripts. Below is an example of my lock function:
obtain_lock()
{
LOCK="${1}"
LOCKDIR="$(dirname "${LOCK}")"
LOCKFILE="$(basename "${LOCK}")"
# create temp lock file
TMPLOCK=$(mktemp -p "${LOCKDIR}" "${LOCKFILE}XXXXXX" 2> /dev/null)
if test "x${TMPLOCK}" == "x";then
echo "unable to create temporary file with mktemp" 1>&2
return 1
fi
echo "$$" > "${TMPLOCK}"
# attempt to obtain lock file
ln "${TMPLOCK}" "${LOCK}" 2> /dev/null
if test $? -ne 0;then
rm -f "${TMPLOCK}"
echo "unable to obtain lockfile" 1>&2
if test -f "${LOCK}";then
echo "current lock information held by: $(cat "${LOCK}")" 1>&2
fi
return 2
fi
rm -f "${TMPLOCK}"
return 0;
};
The following is an example of how to use the lock function:
#!/bin/sh
. /path/to/locking/profile.sh
PROG_LOCKFILE="/tmp/myprog.lock"
clean_up()
{
rm -f "${PROG_LOCKFILE}"
}
obtain_lock "${PROG_LOCKFILE}"
if test $? -ne 0;then
exit 1
fi
trap clean_up SIGHUP SIGINT SIGTERM
# bulk of script
clean_up
exit 0
# end of script
Remember to call clean_up at any exit points in your script.
I've used the above in both Linux and FreeBSD environments.
Add this line at the beginning of your script
[ "${FLOCKER}" != "$0" ] && exec env FLOCKER="$0" flock -en "$0" "$0" "$#" || :
It's a boilerplate code from man flock.
If you want more logging, use this one
[ "${FLOCKER}" != "$0" ] && { echo "Trying to start build from queue... "; exec bash -c "FLOCKER='$0' flock -E $E_LOCKED -en '$0' '$0' '$#' || if [ \"\$?\" -eq $E_LOCKED ]; then echo 'Locked.'; fi"; } || echo "Lock is free. Completing."
This sets and checks locks using flock utility.
This code detects whether it was run for the first time by checking the FLOCKER variable. If it is not set to the script name, the script restarts itself recursively under flock, with FLOCKER initialized. If FLOCKER is set correctly, the flock in the previous iteration succeeded and it is OK to proceed. If the lock is busy, it fails with a configurable exit code.
It seems not to work on Debian 7 (it writes "flock: ... Text file busy"), but seems to work again with the experimental util-linux 2.25 package. The problem can be worked around by removing write permission from your script.
When targeting a Debian machine I find the lockfile-progs package to be a good solution. procmail also comes with a lockfile tool. However sometimes I am stuck with neither of these.
Here's my solution which uses mkdir for atomic-ness and a PID file to detect stale locks. This code is currently in production on a Cygwin setup and works well.
To use it simply call exclusive_lock_require when you need to get exclusive access to something. An optional lock name parameter lets you share locks between different scripts. There are also two lower-level functions (exclusive_lock_try and exclusive_lock_retry) should you need something more complex.
function exclusive_lock_try() # [lockname]
{
local LOCK_NAME="${1:-`basename $0`}"
LOCK_DIR="/tmp/.${LOCK_NAME}.lock"
local LOCK_PID_FILE="${LOCK_DIR}/${LOCK_NAME}.pid"
if [ -e "$LOCK_DIR" ]
then
local LOCK_PID="`cat "$LOCK_PID_FILE" 2> /dev/null`"
if [ ! -z "$LOCK_PID" ] && kill -0 "$LOCK_PID" 2> /dev/null
then
# locked by non-dead process
echo "\"$LOCK_NAME\" lock currently held by PID $LOCK_PID"
return 1
else
# orphaned lock, take it over
( echo $$ > "$LOCK_PID_FILE" ) 2> /dev/null && local LOCK_PID="$$"
fi
fi
if [ "`trap -p EXIT`" != "" ]
then
# already have an EXIT trap
echo "Cannot get lock, already have an EXIT trap"
return 1
fi
if [ "$LOCK_PID" != "$$" ] &&
! ( umask 077 && mkdir "$LOCK_DIR" && umask 177 && echo $$ > "$LOCK_PID_FILE" ) 2> /dev/null
then
local LOCK_PID="`cat "$LOCK_PID_FILE" 2> /dev/null`"
# unable to acquire lock, new process got in first
echo "\"$LOCK_NAME\" lock currently held by PID $LOCK_PID"
return 1
fi
trap "/bin/rm -rf \"$LOCK_DIR\"; exit;" EXIT
return 0 # got lock
}
function exclusive_lock_retry() # [lockname] [retries] [delay]
{
local LOCK_NAME="$1"
local MAX_TRIES="${2:-5}"
local DELAY="${3:-2}"
local TRIES=0
local LOCK_RETVAL
while [ "$TRIES" -lt "$MAX_TRIES" ]
do
if [ "$TRIES" -gt 0 ]
then
sleep "$DELAY"
fi
local TRIES=$(( $TRIES + 1 ))
if [ "$TRIES" -lt "$MAX_TRIES" ]
then
exclusive_lock_try "$LOCK_NAME" > /dev/null
else
exclusive_lock_try "$LOCK_NAME"
fi
LOCK_RETVAL="${PIPESTATUS[0]}"
if [ "$LOCK_RETVAL" -eq 0 ]
then
return 0
fi
done
return "$LOCK_RETVAL"
}
function exclusive_lock_require() # [lockname] [retries] [delay]
{
if ! exclusive_lock_retry "$@"
then
exit 1
fi
}
If flock's limitations, which have already been described elsewhere on this thread, aren't an issue for you, then this should work:
#!/bin/bash
{
# exit if we are unable to obtain a lock; this would happen if
# the script is already running elsewhere
# note: -x (exclusive) is the default
flock -n 100 || exit
# put commands to run here
sleep 100
} 100>/tmp/myjob.lock
Some unixes have lockfile which is very similar to the already mentioned flock.
From the manpage:
lockfile can be used to create one or more semaphore files. If lockfile can't create all the specified files (in the specified order), it waits sleeptime (defaults to 8) seconds and retries the last file that didn't succeed. You can specify the number of retries to do until failure is returned. If the number of retries is -1 (default, i.e., -r-1) lockfile will retry forever.
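A typical use, assuming procmail's lockfile is installed (the retry count and paths are examples):
#!/bin/bash
# Give up after 3 retries instead of the default endless wait.
lockfile -r 3 /tmp/myscript.lock || { echo "already running" >&2; exit 1; }
trap 'rm -f /tmp/myscript.lock' EXIT # lockfile creates the file read-only; rm -f still removes it
# critical section here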
I use a simple approach that handles stale lock files.
Note that some of the above solutions that store the pid ignore the fact that the pid can wrap around. So just checking whether there is a valid process with the stored pid is not enough, especially for long-running scripts.
I use noclobber to make sure only one script can open and write to the lock file at one time. Further, I store enough information to uniquely identify a process in the lockfile. I define the set of data to uniquely identify a process to be pid,ppid,lstart.
When a new script starts up, if it fails to create the lock file, it then verifies that the process that created the lock file is still around. If not, we assume the original process died an ungraceful death, and left a stale lock file. The new script then takes ownership of the lock file, and all is well the world, again.
Should work with multiple shells across multiple platforms. Fast, portable and simple.
#!/usr/bin/env sh
# Author: rouble
LOCKFILE=/var/tmp/lockfile #customize this line
trap release INT TERM EXIT
# Creates a lockfile. Sets global variable $ACQUIRED to true on success.
#
# Returns 0 if it is successfully able to create lockfile.
acquire () {
set -C #Shell noclobber option. If file exists, > will fail.
UUID=`ps -eo pid,ppid,lstart $$ | tail -1`
if (echo "$UUID" > "$LOCKFILE") 2>/dev/null; then
ACQUIRED="TRUE"
return 0
else
if [ -e $LOCKFILE ]; then
# We may be dealing with a stale lock file.
# Bring out the magnifying glass.
CURRENT_UUID_FROM_LOCKFILE=`cat $LOCKFILE`
CURRENT_PID_FROM_LOCKFILE=`cat $LOCKFILE | cut -f 1 -d " "`
CURRENT_UUID_FROM_PS=`ps -eo pid,ppid,lstart $CURRENT_PID_FROM_LOCKFILE | tail -1`
if [ "$CURRENT_UUID_FROM_LOCKFILE" == "$CURRENT_UUID_FROM_PS" ]; then
echo "Script already running with following identification: $CURRENT_UUID_FROM_LOCKFILE" >&2
return 1
else
# The process that created this lock file died an ungraceful death.
# Take ownership of the lock file.
echo "The process $CURRENT_UUID_FROM_LOCKFILE is no longer around. Taking ownership of $LOCKFILE"
release "FORCE"
if (echo "$UUID" > "$LOCKFILE") 2>/dev/null; then
ACQUIRED="TRUE"
return 0
else
echo "Cannot write to $LOCKFILE. Error." >&2
return 1
fi
fi
else
echo "Do you have write permissons to $LOCKFILE ?" >&2
return 1
fi
fi
}
# Removes the lock file only if this script created it ($ACQUIRED is set),
# OR, if we are removing a stale lock file (first parameter is "FORCE")
release () {
#Destroy lock file. Take no prisoners.
if [ "$ACQUIRED" ] || [ "$1" == "FORCE" ]; then
rm -f $LOCKFILE
fi
}
# Test code
# int main( int argc, const char* argv[] )
echo "Acquring lock."
acquire
if [ $? -eq 0 ]; then
echo "Acquired lock."
read -p "Press [Enter] key to release lock..."
release
echo "Released lock."
else
echo "Unable to acquire lock."
fi
I wanted to do away with lockfiles, lockdirs, special locking programs and even pidof since it isn't found on all Linux installations. Also wanted to have the simplest code possible (or at least as few lines as possible). Simplest if statement, in one line:
if [[ $(ps axf | awk -v pid=$$ '$1!=pid && $6~/'$(basename $0)'/{print $1}') ]]; then echo "Already running"; exit; fi
Actually, although the answer of bmdhacks is almost good, there is a slight chance that the second script runs after the first has checked the lockfile but before it has written it. So both will write the lock file and both will be running. Here is how to make it work for sure:
lockfile=/var/lock/myscript.lock
if ( set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null ; then
trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT
else
# or you can decide to skip the "else" part if you want
echo "Another instance is already running!"
fi
The noclobber option makes sure that the redirect command fails if the file already exists. So the redirect command is effectively atomic: you write and check the file with one command. You don't need to remove the lockfile at the end of the script; it will be removed by the trap. I hope this helps people who read it later.
P.S. I didn't see that Mikel had already answered the question correctly, although he didn't include the trap command to reduce the chance of the lock file being left over after stopping the script with, for example, Ctrl-C. So this is the complete solution.
An example with flock(1) but without a subshell. The flock()ed file /tmp/foo is never removed, but that doesn't matter since it gets flock()ed and un-flock()ed.
#!/bin/bash
exec 9<> /tmp/foo
flock -n 9
RET=$?
if [[ $RET -ne 0 ]] ; then
echo "lock failed, exiting"
exit
fi
#Now we are inside the "critical section"
echo "inside lock"
sleep 5
exec 9>&- #close fd 9, and release lock
#The part below is outside the critical section (the lock)
echo "lock released"
sleep 5
This one-line answer comes from a related Ask Ubuntu Q&A:
[ "${FLOCKER}" != "$0" ] && exec env FLOCKER="$0" flock -en "$0" "$0" "$@" || :
# This is useful boilerplate code for shell scripts. Put it at the top of
# the shell script you want to lock and it'll automatically lock itself on
# the first run. If the env var $FLOCKER is not set to the shell script
# that is being run, then execute flock and grab an exclusive non-blocking
# lock (using the script itself as the lock file) before re-execing itself
# with the right arguments. It also sets the FLOCKER env var to the right
# value so it doesn't run again.
PID files and lockfiles are definitely the most reliable. When you attempt to run the program, it can check for the lockfile and, if it exists, use ps to see if the process is still running. If it is not, the script can start, updating the PID in the lockfile to its own.
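A bare-bones sketch of that pidfile-plus-ps check (the pidfile path is an arbitrary example):
PIDFILE=/tmp/myprog.pid
# if a pidfile exists and ps still finds that PID, another instance is running
if [ -f "$PIDFILE" ] && ps -p "$(cat "$PIDFILE")" > /dev/null 2>&1; then
exit 1
fi
echo $$ > "$PIDFILE" # otherwise take over the pidfile with our own PID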
I find that bmdhack's solution is the most practical, at least for my use case. Using flock and lockfile rely on removing the lockfile using rm when the script terminates, which can't always be guaranteed (e.g., kill -9).
I would change one minor thing about bmdhack's solution: It makes a point of removing the lock file, without stating that this is unnecessary for the safe working of this semaphore. His use of kill -0 ensures that an old lockfile for a dead process will simply be ignored/over-written.
My simplified solution is therefore to simply add the following to the top of your singleton:
## Test the lock
LOCKFILE=/tmp/singleton.lock
if [ -e ${LOCKFILE} ] && kill -0 `cat ${LOCKFILE}`; then
echo "Script already running. bye!"
exit
fi
## Set the lock
echo $$ > ${LOCKFILE}
Of course, this script still has the flaw that processes that are likely to start at the same time have a race hazard, as the lock test and set operations are not a single atomic action. But the proposed solution for this by lhunath to use mkdir has the flaw that a killed script may leave behind the directory, thus preventing other instances from running.
The semaphoric utility uses flock (as discussed above, e.g. by presto8) to implement a counting semaphore. It enables any specific number of concurrent processes you want. We use it to limit the level of concurrency of various queue worker processes.
It's like sem but much lighter-weight. (Full disclosure: I wrote it after finding that sem was way too heavy for our needs and that there wasn't a simple counting-semaphore utility available.)
Answered a million times already, but another way, without the need for external dependencies:
LOCK_FILE="/var/lock/$(basename "$0").pid"
trap "rm -f ${LOCK_FILE}; exit" INT TERM EXIT
if [[ -f $LOCK_FILE && -d /proc/`cat $LOCK_FILE` ]]; then
// Process already exists
exit 1
fi
echo $$ > $LOCK_FILE
Each time it writes the current PID ($$) into the lockfile and on script startup checks if a process is running with the latest PID.
Using the process's own lock is much stronger and takes care of ungraceful exits as well.
lock_file is kept open as long as the process is running. It will be closed (by the shell) once the process exits (even if it gets killed).
I found this to be very efficient:
lock_file=/tmp/`basename $0`.lock
if fuser $lock_file > /dev/null 2>&1; then
echo "WARNING: Other instance of $(basename $0) running."
exit 1
fi
exec 3> $lock_file
I use a one-liner at the very beginning of the script:
#!/bin/bash
if [[ $(pgrep -afc "$(basename "$0")") -gt "1" ]]; then echo "Another instance of $0 has already been started!" && exit; fi
.
the_beginning_of_actual_script
It only checks for the presence of the process in memory (no matter what the status of the process is), but it does the job for me.
If you do not want to or cannot use flock (e.g. you are not using a shared file system), consider using an external service like lockable.
It exposes advisory lock primitives, much like flock would. In particular, you can acquire a lock via:
https://lockable.dev/api/acquire/my-lock-name
and release it via
https://lockable.dev/api/release/my-lock-name
By wrapping script execution with lock acquisition and release, you can make sure only a single instance of the process is running at any given time.
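A wrapper might look roughly like the following sketch. It assumes the acquire endpoint signals a held lock through a non-success HTTP status, which you should verify against the service's documentation:
#!/bin/bash
LOCK_NAME=my-lock-name # example lock name
# -f makes curl fail on HTTP errors, assumed here to mean "lock is held"
if curl -fsS "https://lockable.dev/api/acquire/${LOCK_NAME}" > /dev/null; then
trap 'curl -fsS "https://lockable.dev/api/release/${LOCK_NAME}" > /dev/null' EXIT
# do the work here
else
echo "Lock ${LOCK_NAME} is held elsewhere" >&2
exit 1
fi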

Check if bash script already running except itself with arguments

So I've looked up other questions and answers for this and, as you can imagine, there are lots of ways to do it. However, my situation is kind of different.
I'm able to check whether a bash script is already running or not and I want to kill the script if it's already running.
The problem is that with the below code, since I'm running this within the same script, the script kills itself too because it sees a script already running.
result=`ps aux | grep -i "myscript.sh" | grep -v "grep" | wc -l`
if [ $result -ge 1 ]
then
echo "script is running"
else
echo "script is not running"
fi
So how can I check whether the script is already running besides its own self, kill itself if another instance of the same script is running, and otherwise continue without killing itself?
I thought I could combine the above code with the $$ variable to find the script's own PID and differentiate them this way, but I'm not sure how to do that.
Also a side note: my script can be run multiple times at the same time on the same machine, but with different arguments, and that's fine. I only need to identify whether the script is already running with the same arguments.
pid=$(pgrep myscript.sh | grep -x -v $$)
# filter non-existent pids
pid=$(<<<"$pid" xargs -n1 sh -c 'kill -0 "$1" 2>/dev/null && echo "$1"' --)
if [ -n "$pid" ]; then
echo "Other script is running with pid $pid"
echo "Killing him!"
kill $pid
fi
pgrep lists the pids that match the name myscript.sh. From the list we filter out the current shell's $$ with grep -v. If the result is non-empty, then you can kill the other pid.
Without the xargs it would work, but pgrep myscript.sh will also pick up the temporary pid created for the command substitution or the pipe. So the pid list would never be empty and the kill would always execute, complaining about a non-existent process. To handle that, for each pid in pids, I check whether the pid exists with kill -0. If it does, it is printed, effectively filtering out all nonexistent pids.
You could also use a normal for loop to filter the pids:
# filter non-existent pids
pid=$(
for i in $pid; do
if kill -0 "$i" 2>/dev/null; then
echo "$i"
fi
done
)
Alternatively, you could use flock to lock the file and use lsof to list current open files, filtering out the current one. As it is now, I think it will also kill editors that have the file open, and such. I believe the lsof output could be filtered better to accommodate this.
if [ "${FLOCKER}" != "$0" ]; then
pids=$(lsof -p "^$$" -- ./myscript.sh | awk 'NR>1{print $2}')
if [ -n "$pids" ]; then
echo "Other processes with $(echo $pids) found. Killing them"
kill $pids
fi
exec env FLOCKER="$0" flock -en "$0" "$0" "$#"
fi
I would go with either of 2 ways to solve this problem.
1st solution: Create a watchdog file, say a .lck file, at a known location before starting the script's execution (making sure to use trap etc. in case the script is aborted, so that the .lck file is removed), AND remove it once the script has completed successfully.
Example script for the 1st solution: This is just a test example. We also need to take care of interruptions in the script; say the script gets interrupted by a command, then we could use trap there too, since at that time it would not have completed and you may need to kick it off again (since last time it was not completed).
cat file.ksh
#!/bin/bash
PWD=`pwd`
watchdog_file="$PWD/script.lck"
if [[ -f "$watchdog_file" ]]
then
echo "Please wait script is still running, exiting from script now.."
exit 1;
else
touch $watchdog_file
fi
while true
do
echo "singh" > test1
done
if [[ -f "$watchdog_file" ]]
then
rm "$watchdog_file"
fi
2nd solution: Take the PID of the currently running shell using $$ and save it in a file. Then check whether that process is still running: come out of the script if it is, and if it is NOT running then move on to run the statements in the script.
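A sketch of that 2nd solution (the pidfile location is arbitrary):
#!/bin/bash
PIDFILE=/tmp/myscript.pid
# exit if the recorded PID still belongs to a live process
if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
echo "Script is already running, exiting.."
exit 1
fi
echo $$ > "$PIDFILE"
trap 'rm -f "$PIDFILE"' EXIT
# rest of the script here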

Bash script that kills other instances of itself if they're running

So, I want to make a bash script that I'm going to have run on boot, but I'd like to be able to update the script and run it again without a reboot. What I want is for the script, when it is loaded, to check whether any other instances of it are running, and terminate any instances of the script other than itself. I want it to check the running instances of bash, get the paths of the scripts being run, and kill any instances of scripts that have the same path name as its own. How can I do this?
Example: If I am in directory /foo/bar and I run the script ../tball/script.sh, it will kill any instances of bash that are running the script /foo/tball/script.sh if they exist.
Here's the basis
kill_others() {
local mypid=$$ # capture this run's pid
declare pids=($(pgrep -f ${0##*/})) # get all the pids running this script
for pid in ${pids[@]/$mypid/}; do # cycle through all pids except this one
kill $pid # kill the other pids
sleep 1 # give time to complete
done
}
declare -i count=0
while [[ $(pgrep -f ${0##*/}|wc -l) -gt 1 ]]; do
kill_others
((++count))
if [[ $count -gt 10 ]]; then
echo "ERROR: can't kill pids" >&2
exit 1
fi
done
The best approach is a file containing the PID of the process in a volatile filesystem like this:
echo $$ > /run/script.pid
You could refine it further by checking if that PID exists with:
if [ ! -d /proc/$(< /run/script.pid) ] ; then
rm /run/script.pid
fi
In your script you should have something like this, to remove the file on exit or if it receives a signal that kills the process:
trap "rm -f /run/script.pid" EXIT INT QUIT TERM
EDIT: Or you could append the PID to a well known pathname and kill all instances of the script with something like this before saving the PID:
kill $(< /run/script.pid) ; sleep 10 ; kill -9 $(< /run/script.pid)

How can I make an external program interruptible in this trap-captured bash script?

I am writing a script which will run an external program (arecord) and do some cleanup if it's interrupted by either a POSIX signal or input on a named pipe. Here's the draft in full
#!/bin/bash
X=`date '+%Y-%m-%d_%H.%M.%S'`
F=/tmp/$X.wav
P=/tmp/$X.$$.fifo
mkfifo $P
trap "echo interrupted && (rm $P || echo 'couldnt delete $P') && echo 'removed fifo' && exit" INT
# this forked process will wait for input on the fifo
(echo 'waiting for fifo' && cat $P >/dev/null && echo 'fifo hit' && kill -s SIGINT $$)&
while true
do
echo waiting...
sleep 1
done
#arecord $F
This works perfectly as it is: the script ends when a signal arrives and a signal is generated if the fifo is written-to.
But instead of the while true loop I want the now-commented-out arecord command; if I run that program instead of the loop, the SIGINT doesn't get caught by the trap and arecord keeps running.
What should I do?
It sounds like you really need this to work more like an init script. So, start arecord in the background and put the pid in a file. Then use the trap to kill the arecord process based on the pidfile.
#!/bin/bash
PIDFILE=/var/run/arecord-runner.pid #Just somewhere to store the pid
LOGFILE=/var/log/arecord-runner.log
#Just one option for how to format your trap call
#Note that this does not use &&, so one failed function will not
# prevent other items in the trap from running
trapFunc() {
echo interrupted
(rm $P || echo "couldn't delete $P")
echo 'removed fifo'
kill $(cat $PIDFILE)
exit 0
}
X=`date '+%Y-%m-%d_%H.%M.%S'`
F=/tmp/$X.wav
P=/tmp/$X.$$.fifo
mkfifo $P
trap "trapFunc" INT
# this forked process will wait for input on the fifo
(echo 'waiting for fifo' && cat $P >/dev/null && echo 'fifo hit' && kill -s SIGINT $$)&
arecord $F 1>$LOGFILE 2>&1 & #Run in the background, sending logs to file
echo $! > $PIDFILE #Save pid of the last background process to file
while true
do
echo waiting...
sleep 1
done
Also... you may have your trap written with '&&' clauses for a reason, but as an alternative, you can give a function name as I did above, or a sort of anonymous function like this:
trap "{ command1; command2 args; command3; exit 0; }"
Just make sure that each command is followed by a semicolon and there are spaces between the braces and the commands. The risk of using && in the trap is that your script will continue to run past the interrupt if one of the commands before the exit fails to execute (but maybe you want that?).

Best way to make a shell script daemon?

I'm wondering if there is a better way to make a daemon that waits for something using only sh than:
#! /bin/sh
trap processUserSig SIGUSR1
processUserSig() {
echo "doing stuff"
}
while true; do
sleep 1000
done
In particular, I'm wondering if there's any way to get rid of the loop and still have the thing listen for the signals.
Just backgrounding your script (./myscript &) will not daemonize it. See http://www.faqs.org/faqs/unix-faq/programmer/faq/, section 1.7, which describes what's necessary to become a daemon. You must disconnect it from the terminal so that SIGHUP does not kill it. You can take a shortcut to make a script appear to act like a daemon:
nohup ./myscript 0<&- &>/dev/null &
will do the job. Or, to capture both stderr and stdout to a file:
nohup ./myscript 0<&- &> my.admin.log.file &
Redirection explained (see bash redirection)
0<&- closes stdin
&> file sends stdout and stderr to a file
However, there may be further important aspects that you need to consider. For example:
You will still have a file descriptor open to the script, which means that the directory it's mounted in would be unmountable. To be a true daemon you should chdir("/") (or cd / inside your script), and fork so that the parent exits, and thus the original descriptor is closed.
Perhaps run umask 0. You may not want to depend on the umask of the caller of the daemon.
For an example of a script that takes all of these aspects into account, see Mike S' answer.
Some of the top-upvoted answers here are missing some important parts of what makes a daemon a daemon, as opposed to just a background process, or a background process detached from a shell.
This http://www.faqs.org/faqs/unix-faq/programmer/faq/ describes what is necessary to be a daemon. And this Run bash script as daemon implements the setsid, though it misses the chdir to root.
The original poster's question was actually more specific than "How do I create a daemon process using bash?", but since the subject and answers discuss daemonizing shell scripts generally, I think it's important to point it out (for interlopers like me looking into the fine details of creating a daemon).
Here's my rendition of a shell script that would behave according to the FAQ. Set DEBUG to true to see pretty output (but it also exits immediately rather than looping endlessly):
#!/bin/bash
DEBUG=false
# This part is for fun, if you consider shell scripts fun- and I do.
trap process_USR1 SIGUSR1
process_USR1() {
echo 'Got signal USR1'
echo 'Did you notice that the signal was acted upon only after the sleep was done'
echo 'in the while loop? Interesting, yes? Yes.'
exit 0
}
# End of fun. Now on to the business end of things.
print_debug() {
whatiam="$1"; tty="$2"
[[ "$tty" != "not a tty" ]] && {
echo "" >$tty
echo "$whatiam, PID $$" >$tty
ps -o pid,sess,pgid -p $$ >$tty
tty >$tty
}
}
me_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
me_FILE=$(basename $0)
cd /
#### CHILD HERE --------------------------------------------------------------------->
if [ "$1" = "child" ] ; then # 2. We are the child. We need to fork again.
shift; tty="$1"; shift
$DEBUG && print_debug "*** CHILD, NEW SESSION, NEW PGID" "$tty"
umask 0
$me_DIR/$me_FILE XXrefork_daemonXX "$tty" "$#" </dev/null >/dev/null 2>/dev/null &
$DEBUG && [[ "$tty" != "not a tty" ]] && echo "CHILD OUT" >$tty
exit 0
fi
##### ENTRY POINT HERE -------------------------------------------------------------->
if [ "$1" != "XXrefork_daemonXX" ] ; then # 1. This is where the original call starts.
tty=$(tty)
$DEBUG && print_debug "*** PARENT" "$tty"
setsid $me_DIR/$me_FILE child "$tty" "$#" &
$DEBUG && [[ "$tty" != "not a tty" ]] && echo "PARENT OUT" >$tty
exit 0
fi
##### RUNS AFTER CHILD FORKS (actually, on Linux, clone()s. See strace -------------->
# 3. We have been reforked. Go to work.
exec >/tmp/outfile
exec 2>/tmp/errfile
exec 0</dev/null
shift; tty="$1"; shift
$DEBUG && print_debug "*** DAEMON" "$tty"
# The real stuff goes here. To exit, see fun (above)
$DEBUG && [[ "$tty" != "not a tty" ]] && echo NOT A REAL DAEMON. NOT RUNNING WHILE LOOP. >$tty
$DEBUG || {
while true; do
echo "Change this loop, so this silly no-op goes away." >/dev/null
echo "Do something useful with your life, young padawan." >/dev/null
sleep 10
done
}
$DEBUG && [[ "$tty" != "not a tty" ]] && sleep 3 && echo "DAEMON OUT" >$tty
exit # This may never run. Why is it here then? It's pretty.
# Kind of like, "The End" at the end of a movie that you
# already know is over. It's always nice.
Output looks like this when DEBUG is set to true. Notice how the session and process group ID (SESS, PGID) numbers change:
<shell_prompt>$ bash blahd
*** PARENT, PID 5180
PID SESS PGID
5180 1708 5180
/dev/pts/6
PARENT OUT
<shell_prompt>$
*** CHILD, NEW SESSION, NEW PGID, PID 5188
PID SESS PGID
5188 5188 5188
not a tty
CHILD OUT
*** DAEMON, PID 5198
PID SESS PGID
5198 5188 5188
not a tty
NOT A REAL DAEMON. NOT RUNNING WHILE LOOP.
DAEMON OUT
# double background your script to have it detach from the tty
# cf. http://www.linux-mag.com/id/5981
(./program.sh &) &
Use your system's daemon facility, such as start-stop-daemon.
Otherwise, yes, there has to be a loop somewhere.
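On Debian-style systems, for example, that could look like this (paths are placeholders):
# start the script detached, recording its PID for later control
start-stop-daemon --start --background \
--make-pidfile --pidfile /var/run/myscript.pid \
--exec /usr/local/bin/myscript.sh
# and to stop it again
start-stop-daemon --stop --pidfile /var/run/myscript.pid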
$ ( cd /; umask 0; setsid your_script.sh </dev/null &>/dev/null & ) &
It really depends on what the binary itself is going to do.
For example, I want to create some listener.
Starting the daemon is a simple task:
lis_deamon :
#!/bin/bash
# We will start the listener as a daemon process
#
LISTENER_BIN=/tmp/deamon_test/listener
test -x $LISTENER_BIN || exit 5
PIDFILE=/tmp/deamon_test/listener.pid
case "$1" in
start)
echo -n "Starting Listener Deamon .... "
startproc -f -p $PIDFILE $LISTENER_BIN
echo "running"
;;
*)
echo "Usage: $0 start"
exit 1
;;
esac
this is how we start the daemon (the common way for all /etc/init.d/ stuff)
now as for the listener itself,
it must contain some kind of loop/alert or whatever else will trigger the script
to do what you want. For example, if you want your script to sleep 10 minutes,
wake up, and ask how you are doing, you would do this with the
while true ; do sleep 600 ; echo "How are you?" ; done
Here is a simple listener you can build that will listen for your
commands from a remote machine and execute them on the local one:
listener :
#!/bin/bash
# Starting listener on some port
# We will run it as a daemon and send commands to it.
#
IP=$(hostname --ip-address)
PORT=1024
FILE=/tmp/backpipe
count=0
while [ -a $FILE ] ; do # if the file exists, I assume it is used by another program
FILE=$FILE.$count
count=$(($count + 1))
done
# Now we know that no such file exists,
# you can put the removal of those files into the daemon itself
# or into a different part of the program
mknod $FILE p
while true ; do
netcat -l -s $IP -p $PORT < $FILE |/bin/bash > $FILE
done
rm $FILE
So to start it up: /tmp/deamon_test/listener start
and to send commands from a shell (or wrap them in a script):
test_host#netcat 10.184.200.22 1024
uptime
20:01pm up 21 days 5:10, 44 users, load average: 0.62, 0.61, 0.60
date
Tue Jan 28 20:02:00 IST 2014
punt! (Ctrl+C)
Hope this helps.
Have a look at the daemon tool from the libslack package:
http://ingvar.blog.linpro.no/2009/05/18/todays-sysadmin-tip-using-libslack-daemon-to-daemonize-a-script/
On Mac OS X use a launchd script for shell daemon.
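A minimal launch agent might be installed like this (label and script path are placeholders; KeepAlive tells launchd to restart the script if it dies):
cat > ~/Library/LaunchAgents/com.example.mydaemon.plist <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key><string>com.example.mydaemon</string>
<key>ProgramArguments</key>
<array><string>/usr/local/bin/myscript.sh</string></array>
<key>RunAtLoad</key><true/>
<key>KeepAlive</key><true/>
</dict>
</plist>
EOF
launchctl load ~/Library/LaunchAgents/com.example.mydaemon.plist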
If I had a script.sh and I wanted to execute it from bash and leave it running even when I close my bash session, I would combine nohup and & at the end.
example: nohup ./script.sh < inputFile.txt > ./logFile 2>&1 &
inputFile.txt can be any file. If your script has no input, we usually use /dev/null, so the command would be:
nohup ./script.sh < /dev/null > ./logFile 2>&1 &
After that, close your bash session, open another terminal, and execute ps aux | egrep "script.sh"; you will see that your script is still running in the background. Of course, if you want to stop it, execute the same command (ps) and then kill -9 <PID-OF-YOUR-SCRIPT>.
See Bash Service Manager project: https://github.com/reduardo7/bash-service-manager
Implementation example
#!/usr/bin/env bash
export PID_FILE_PATH="/tmp/my-service.pid"
export LOG_FILE_PATH="/tmp/my-service.log"
export LOG_ERROR_FILE_PATH="/tmp/my-service.error.log"
. ./services.sh
run-script() {
local action="$1" # Action
while true; do
echo "### Running action '${action}'"
echo foo
echo bar >&2
[ "$action" = "run" ] && return 0
sleep 5
[ "$action" = "debug" ] && exit 25
done
}
before-start() {
local action="$1" # Action
echo "* Starting with $action"
}
after-finish() {
local action="$1" # Action
local serviceExitCode=$2 # Service exit code
echo "* Finish with $action. Exit code: $serviceExitCode"
}
action="$1"
serviceName="Example Service"
serviceMenu "$action" "$serviceName" run-script "$workDir" before-start after-finish
Usage example
$ ./example-service
# Actions: [start|stop|restart|status|run|debug|tail(-[log|error])]
$ ./example-service start
# Starting Example Service service...
$ ./example-service status
# Service Example Service is running with PID 5599
$ ./example-service stop
# Stopping Example Service...
$ ./example-service status
# Service Example Service is not running
Here is the minimal change to the original proposal to create a valid daemon in Bourne shell (or Bash):
#!/bin/sh
if [ "$1" != "__forked__" ]; then
setsid "$0" __forked__ "$#" &
exit
else
shift
fi
trap 'siguser1=true' SIGUSR1
trap 'echo "Clean up and exit"; kill $sleep_pid; exit' SIGTERM
exec > outfile
exec 2> errfile
exec 0< /dev/null
while true; do
(sleep 30000000 >/dev/null 2>&1) &
sleep_pid=$!
wait
kill $sleep_pid >/dev/null 2>&1
if [ -n "$siguser1" ]; then
siguser1=''
echo "Wait was interrupted by SIGUSR1, do things here."
fi
done
Explanation:
Line 2-7: A daemon must be forked so it doesn't have a parent. Using an artificial argument to prevent endless forking. "setsid" detaches from starting process and terminal.
Line 9: Our desired signal needs to be differentiated from other signals.
Line 10: Cleanup is required to get rid of dangling "sleep" processes.
Line 11-13: Redirect stdout, stderr and stdin of the script.
Line 16: sleep in the background
Line 18: wait waits for end of sleep, but gets interrupted by (some) signals.
Line 19: Kill sleep process, because that is still running when signal is caught.
Line 22: Do the work if SIGUSR1 has been caught.
Guess it does not get any simpler than that.
Like many answers, this one is not a "real" daemonization but rather an alternative to the nohup approach.
echo "script.sh" | at now
There are obviously differences from using nohup. For one, there is no detaching from the parent in the first place. Also, "script.sh" doesn't inherit the parent's environment.
By no means is this a better alternative. It is simply a different (and somewhat lazy) way of launching processes in the background.
P.S. I personally upvoted carlo's answer, as it seems to be the most elegant and works both from a terminal and inside scripts.
Try executing it using &. If you save this file as program.sh, you can run it with:
$ ./program.sh &
