Bash file locking including flock for subprocesses - bash

I am trying to protect my scripts from parallel execution by incorporating flock. I have read a number of threads here and came across a reference to this: http://www.kfirlavi.com/blog/2012/11/06/elegant-locking-of-bash-program/ which incorporates many of the examples presented in the other threads.
My scripts will eventually run on Ubuntu (>14), OS X 10.7 and 10.11.4. I am mainly testing on OS X 10.11.4 and have installed flock via Homebrew.
When I run the script below, the lock is created, but I think I am forking the subscripts, and it is these subscripts that I want to ensure never run more than one instance each.
#!/bin/bash
#----------------------------------------------------------------
set -vx
set -euo pipefail
IFS=$'\n\t'

readonly PROGNAME=$(basename "$0")
readonly LOCKFILE_DIR=/tmp
readonly LOCK_FD=200

subprocess1="/bash$/subprocess1.sh"
subprocess2="/bash$/subprocess2.sh"

lock() {
    local prefix=$1
    local fd=${2:-$LOCK_FD}
    local lock_file=$LOCKFILE_DIR/$prefix.lock

    # create the lock file
    eval "exec $fd>$lock_file"

    # acquire the lock
    flock -n $fd \
        && return 0 \
        || return 1
}

eexit() {
    local error_str="$*"
    echo "$error_str" >&2
    exit 1
}

main() {
    lock "$PROGNAME" \
        || eexit "Only one instance of $PROGNAME can run at one time."

    ## My child scripts
    sh "$subprocess1"   # wait for it to finish, then run the next
    sh "$subprocess2"
}
main
$subprocess1 is a script that loads ncftpget and logs into a remote server to grab some files. Once finished, the connection closes. I want to run subprocess1 every 15 minutes via cron. I have done so with success, but sometimes there are many files to grab and the job takes longer than 15 minutes. It is rare, but it does happen. In such a case, I want to ensure a second instance of $subprocess1 can't be started. For clarity, a small example of such a subscript is:
#!/bin/bash

remoteftp="someftp.ftp"
ncftplog="somelog.log"
localdir="some/local/dir"

ncftpget -R -T -f "$remoteftp" -d "$ncftplog" "$localdir" "*.files"
EXIT_V="$?"

case $EXIT_V in
    0) O="Success!";;
    1) O="Could not connect to remote host.";;
    2) O="Could not connect to remote host - timed out.";;
    3) O="Transfer failed.";;
    4) O="Transfer failed - timed out.";;
    5) O="Directory change failed.";;
    6) O="Directory change failed - timed out.";;
    7) O="Malformed URL.";;
    8) O="Usage error.";;
    9) O="Error in login configuration file.";;
    10) O="Library initialization failed.";;
    11) O="Session initialization failed.";;
esac

if [ "$EXIT_V" = 0 ]; then
    echo "$O"
else
    echo "There has been an error: $O"
    echo "Exiting now..."
    exit 1
fi
echo "Goodbye"
and an example of subprocess2 is:
#!/bin/bash
# ...preamble, script setup items, etc., and then:
java -jar /some/javaprog.jar
When I execute the parent script with "sh lock.sh", it progresses through the script without error and exits. The first issue I have is that if I load the script up again, I get an error indicating that only one instance of lock.sh can run. What should I have added to the script to indicate that the processes have not completed yet, rather than merely exiting and giving back the prompt?
However, if subprocess1 was already running on its own, lock.sh would happily start a second instance of it, because subprocess1 itself was never locked. How would one go about locking the child scripts, and ideally ensuring that forked processes are taken care of as well? If someone had run subprocess1 at the terminal, or there was a runaway instance, and cron then started lock.sh, I would want lock.sh to fail when trying to start its own instances of subprocess1 and subprocess2, not merely exit when cron tries to start two lock.sh instances.
My main concern is loading multiple instances of ncftpget, which is called by subprocess1, and further, of a third script I hope to incorporate, "subprocess2", which launches a java program that deals with the downloaded files. Neither ncftpget nor the java program can run in parallel without breaking many things, but I'm at a loss on how to control them adequately.
I thought I could use something similar to this in the main() function of lock.sh:
# This is where I try to lock the subscript
# (the lock file must be a separate file, not the subprocess script itself,
# since exec 200> truncates whatever it opens)
pidfile="/tmp/$(basename "$subprocess1").pid"
# lock it
exec 200>"$pidfile"
flock -n 200 || exit 1
pid=$$
echo "$pid" 1>&200
but am not sure how to incorporate it.
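One possibility (a minimal, untested sketch; the lock-file names under /tmp are made up) might be to skip the fd juggling for the children entirely and wrap each child invocation in its own flock(1) call, so each child holds its own lock for exactly as long as it runs:
#!/bin/bash
# Sketch: each child gets its own lock, independent of the parent's.
# flock opens the named lock file itself and runs the command while
# holding the lock, so the lock lasts exactly as long as the child does.
subprocess1="/bash$/subprocess1.sh"
subprocess2="/bash$/subprocess2.sh"

flock -n /tmp/subprocess1.lock bash "$subprocess1" \
    || echo "subprocess1 is already running; skipping." >&2
flock -n /tmp/subprocess2.lock bash "$subprocess2" \
    || echo "subprocess2 is already running; skipping." >&2
That way a runaway subprocess1 started by hand would still hold /tmp/subprocess1.lock, and a cron-started lock.sh would fail on that child instead of starting a duplicate.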

Related

Whether a shell script can be executed if another instance of the same script is already running

I have a shell script which usually runs nearly 10 minutes for a single run, but I need to know: if another request to run the script comes in while an instance of the script is already running, does the new request need to wait for the existing instance to complete, or will a new instance be started?
I need a new instance to be started whenever a request arrives for the same script.
How do I do it?
The shell script is a polling script which looks for a file in a directory and executes that file. The execution of the file takes nearly 10 minutes or more, but if a new file arrives during execution, it also has to be executed simultaneously.
The shell script is below; how do I modify it to execute multiple requests?
#!/bin/bash
while [ 1 ]; do
    newfiles=`find /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/ -newer /afs/rch/usr$
    touch /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/.my_marker
    if [ -n "$newfiles" ]; then
        echo "found files $newfiles"
        name2=`ls /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/ -Art |tail -n 2 |head $
        echo " $name2 "
        mkdir -p -m 0755 /afs/rch/usr8/fsptools/WWW/dumpspace/$name2
        name1="/afs/rch/usr8/fsptools/WWW/dumpspace/fipsdumputils/fipsdumputil -e -$
        $name1
        touch /afs/rch/usr8/fsptools/WWW/dumpspace/tempfiles/$name2
    fi
    sleep 5
done
When writing scripts like the one you describe, I take one of two approaches.
First, you can use a pid file to indicate that a second copy should not run. For example:
#!/bin/sh
pidfile=/var/run/${0##*/}.pid

# remove pid if we exit normally or are terminated
trap "rm -f $pidfile" 0 1 3 15

# Write the pid as a symlink
if ! ln -s "pid=$$" "$pidfile"; then
    echo "Already running. Exiting." >&2
    exit 0
fi

# Do your stuff
I like using symlinks to store the pid because writing a symlink is an atomic operation; two processes can't conflict with each other. You don't even need to check for the existence of the pid symlink first, because a failure of ln clearly indicates that the pid cannot be set: that's either a permission or path problem, or the symlink is already there.
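A side benefit of encoding the pid in the link target (a small sketch of my own, not from the answer above; the lock name is hypothetical): you can read it back later with readlink, for example to report who holds the lock:
pidfile=/var/run/myjob.pid              # hypothetical name
if ln -s "pid=$$" "$pidfile" 2>/dev/null; then
    echo "got the lock"
else
    other=$(readlink "$pidfile")        # e.g. "pid=12345"
    echo "already running: ${other#pid=}" >&2
fi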
The second option is to make it possible .. nay, preferable .. not to block additional instances, and instead configure whatever it is that this script does to permit multiple servers to run at the same time on different queue entries. "Single-queue-single-server" is never as good as "single-queue-multi-server". Since you haven't included code in your question, I have no way to know whether this approach would be useful for you, but here's some explanatory meta bash:
#!/usr/bin/env bash

workdir=/var/tmp # Set a better $workdir than this.

a=( $(get_list_of_queue_ids) ) # A command? A function? Up to you.

for qid in "${a[@]}"; do
    # Set a "lock" for this item .. or don't, and move on.
    if ! ln -s "pid=$$" $workdir/$qid.working; then
        continue
    fi

    # Do your stuff with just this $qid.
    ...

    # And finally, clean up after ourselves
    remove_qid_from_queue $qid
    rm $workdir/$qid.working
done
The effect of this is to transfer the idea of "one at a time" from the handler to the data. If you have a multi-CPU system, you probably have enough capacity to handle multiple queue entries at the same time.
ghoti's answer shows some helpful techniques, if modifying the script is an option.
Generally speaking, for an existing script:
Unless you know with certainty that:
- the script has no side effects other than output to the terminal or writes to files with instance-specific names (such as filenames incorporating $$, the current shell's PID) or some other instance-specific location,
- OR the script was explicitly designed for parallel execution,
I would assume that you cannot safely run multiple copies of the script simultaneously; a sketch of the kind of side effect that breaks this follows below.
It is not reasonable to expect the average shell script to be designed for concurrent use.
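For example, here is a hypothetical script (not from the thread) with exactly the kind of side effect that makes parallel runs unsafe:
#!/bin/sh
# NOT safe to run twice at once: both instances write the same fixed
# temp file, so each clobbers the other's data mid-run.
tmp=/tmp/report.tmp
grep ERROR /var/log/app.log > "$tmp"
wc -l < "$tmp"

# A per-instance name avoids the collision, e.g.:
# tmp=/tmp/report.$$.tmp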
From the viewpoint of the operating system, several processes may of course execute the same program in parallel; there is no need to worry about that.
However, it is conceivable that a (careless) programmer wrote the program in such a way that it produces incorrect results when two copies are executed in parallel.

Check processes run by cronjob to avoid multiple execution

How do I prevent a cronjob from executing the same command multiple times? I have tried to check for the process and kill it, but it doesn't work with the code below: it keeps entering the else branch when it is supposed to detect "running". Any idea which part I got wrong?
#!/bin/sh
devPath=`ps aux | grep "[i]mport_shell_script"` | xargs
if [ ! -z "$devPath" -a "$devPath" != " " ]; then
    echo "running"
    exit
else
    while true
    do
        sudo /usr/bin/php /var/www/html/xxx/import_from_datafile.php > /dev/null 2>&1
        sleep 5
    done
fi
exit
cronjob:
*/2 * * * * root /bin/sh /var/www/html/xxx/import_shell_script.sh > /dev/null 2>&1
I don't see the point of adding a cron job which then starts a loop that runs a job. Either use cron to run the job every minute, or use a daemon script to make sure your service is started and kept running.
To check whether your script is already running, you can use a lock directory (unless your daemon framework already does that for you):
LOCK=/tmp/script.lock # You may want a better name here
mkdir $LOCK || exit 1 # Exit with error if script is already running
trap "rmdir $LOCK" EXIT # Remove the lock when the script terminates
...normal code...
If your OS supports it, then /var/lock/script might be a better path.
Your next question is probably how to write a daemon. To answer that, I need to know what kind of Linux you're using and whether you have things like systemd, daemonize, etc.
Check for the presence of a file at the beginning of your script (for example /tmp/runonce-import_shell_script). If it exists, that means the same script is already running (or the previous run halted with an error).
You can also store a timestamp in that file, so you can check how long the script has been running (and maybe decide to run it again after 24 h even if the file is present); see the sketch below.
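A minimal sketch of that idea (the lock name comes from the suggestion above; the 24-hour staleness check via find's -mmin is just one way to do it):
#!/bin/sh
LOCK=/tmp/runonce-import_shell_script

if [ -e "$LOCK" ]; then
    # Treat a lock older than 24 hours (1440 minutes) as stale and reclaim it.
    if [ -n "$(find "$LOCK" -mmin +1440)" ]; then
        rm -f "$LOCK"
    else
        echo "Already running, or a previous run failed. Exiting." >&2
        exit 1
    fi
fi

date +%s > "$LOCK"           # record when this run started
trap 'rm -f "$LOCK"' EXIT    # clean up on normal termination

# ...actual work here...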

Bash: How to keep the lock also when running an outside script

Here is my bash code:
(
    flock -n -e 200 || (echo "This script is currently being run" && exit 1)
    sleep 10
    # ...call some functions written in another script...
    sleep 5
) 200>/tmp/blah.lockfile
I'm running the script from two shells in succession, and as long as the first one is at "sleep 5", all goes well, meaning the second one doesn't start. But when the first moves on to the code from the other script (the other file), the second run starts executing.
So I have two questions here:
What should I do to prevent this script and all its "children" from running while the script OR one of its "children" is still running?
(I didn't find a more appropriate expression for a script called from another script than a "child", sorry for that :) ).
According to the man page, -n causes the process to exit when it fails to gain the lock, but as far as I can see it just waits until it can run. What am I missing?
Your problem may be fairly mundane. Namely,
false || ( exit 1 )
does not cause the script to exit. Rather, the exit instructs the subshell to exit. So change your first line to:
flock -n -e 200 || { echo "This script is currently being run"; exit 1; } >&2
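A quick way to see the difference (an illustration of the point above, not part of the original answer):
#!/bin/bash
false || ( echo "in subshell"; exit 1 )   # exit only leaves the ( ) subshell
echo "still running"                      # this line IS reached
false || { echo "in group"; exit 1; }     # exit terminates the script itself
echo "never reached"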

Shell script that continuously checks a text file for log data and then runs a program

I have a java program that often stops due to errors, which are logged in a .log file. What would be a simple shell script to detect a particular text in the last/latest line, say
[INFO] Stream closed
and then run the following command
java -jar xyz.jar
This should keep happening forever (possibly checking every two minutes or so), because xyz.jar writes the log file.
The text "Stream closed" can appear many times in the log file; I just want to take action when it appears on the last line.
How about
while true; do
    sleep 120
    # -F makes grep match the text literally; otherwise [INFO] is a
    # bracket expression matching a single I, N, F or O.
    if tail -n 1 logfile | grep -qF "[INFO] Stream closed"; then
        java -jar xyz.jar &
    fi
done
There may be a condition where the tailed last log line says "Stream closed" but it is not really the final log entry, and the process is still logging messages. We can avoid this condition by checking whether the process is alive. If the process has exited and the last log line is "Stream closed", then we need to restart the application.
#!/bin/bash
java -jar xyz.jar &
PID=$!                 # $! is the PID of the java process just started
while true
do
    sleep 20
    # If the process is still alive, there is nothing to do yet.
    if kill -0 "$PID" 2>/dev/null; then
        continue
    fi
    # The process has exited; restart only if the last log line confirms it.
    if tail -n 1 logfile | grep -qF "Stream closed"; then
        java -jar xyz.jar &
        PID=$!
    fi
done
I would prefer to check whether the corresponding process is still running, and to restart the program when it dies. There might be other errors that cause the process to stop. You can use a cronjob to perform such a check periodically (say, every minute); see the sketch below.
Also, you might want to improve your java code so that it does not crash that often (if you have access to the code).
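For example (a hypothetical check script; the script name, log path and crontab line are assumptions for illustration):
#!/bin/sh
# /usr/local/bin/check_xyz.sh
# Suggested crontab entry:
# * * * * * /usr/local/bin/check_xyz.sh
if ! pgrep -f "java -jar xyz.jar" > /dev/null; then
    java -jar xyz.jar >> /var/log/xyz.out 2>&1 &
fi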
I solved this using a watchdog script that checks directly (with grep) whether the program(s) are running. By calling the watchdog every minute (from cron under Ubuntu), I basically guarantee (my programs and environment are VERY stable) that no program will stay offline for more than 59 seconds.
The script checks a list of programs, using the names stored in an array, to see if each one is running, and starts any that are not:
#!/bin/bash
#
# watchdog
#
# Run as a cron job to keep an eye on what_to_monitor which should always
# be running. Restart what_to_monitor and send notification as needed.
#
# This needs to be run as root or a user that can start system services.
#
# Revisions: 0.1 (20100506), 0.2 (20100507)

# first prog to check
NAME[0]=soc_gt2
# 2nd
NAME[1]=soc_gt0
# 3rd, etc etc
NAME[2]=soc_gp00

# START=/usr/sbin/$NAME
NOTIFY=you@gmail.com
NOTIFYCC=you2@mail.com
GREP=/bin/grep
PS=/bin/ps
NOP=/bin/true
DATE=/bin/date
MAIL=/bin/mail
RM=/bin/rm

for nameTemp in "${NAME[@]}"; do
    $PS -ef | $GREP -v grep | $GREP $nameTemp >/dev/null 2>&1
    case "$?" in
        0)
            # It is running in this case so we do nothing.
            echo "$nameTemp is RUNNING OK. Relax."
            $NOP
            ;;
        1)
            echo "$nameTemp is NOT RUNNING. Starting $nameTemp and sending notices."
            START=/usr/sbin/$nameTemp
            $START >/dev/null 2>&1 &
            NOTICE=/tmp/watchdog.txt
            echo "$nameTemp was not running and was started on `$DATE`" > $NOTICE
            # $MAIL -n -s "watchdog notice" -c $NOTIFYCC $NOTIFY < $NOTICE
            $RM -f $NOTICE
            ;;
    esac
done
exit
I do not use log verification, though you could easily incorporate that into your own version (just swap the grep for a log check, for example).
If you run it from the command line (or PuTTY, if you are remotely connected), you will see what was working and what wasn't. I have been using it for months now without a hiccup; just call it whenever you want to see what's working (regardless of it also running under cron).
You could also place all your critical programs in one folder, do a directory listing, and check that every file in that folder has a program running under the same name. Or read a txt file line by line, with every line corresponding to a program that is supposed to be running. Etc., etc.
A good way is to use the awk command:
tail -f somelog.log | awk '/\[INFO\] Stream closed/ { system("java -jar xyz.jar") }'
This continually monitors the log stream, and when the regular expression matches, it fires off whatever system command you have set, which can be anything you would type into a shell.
If you really want to be thorough, you can put that line into a .sh file and run that .sh file from a process-monitoring daemon like upstart to ensure that it never dies; a sketch of such a wrapper follows.
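Something like this, for instance (a sketch; using tail -F instead of -f is my own assumption, so the pipe survives log rotation):
#!/bin/bash
# monitor.sh: the one-liner above in a file, ready for a supervisor
# such as upstart to restart it if the pipeline itself dies.
tail -F somelog.log | awk '/\[INFO\] Stream closed/ { system("java -jar xyz.jar") }'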
Nice and clean =D

Bash script to kill and restart Hudson

I am a novice at Bash scripting but I'm a quick learner. Usually. I'm trying to write a script to kill and restart an instance of Hudson--it needs to be restarted to pick up changes in environment variables. What I have so far:
#!/bin/bash
h=`pgrep -f hudson`
if test "$h" != ""; then
    kill $h
    while [ "$h" != "" ]; do
        sleep 1
        unset h
        h=`pgrep -f hudson`
    done
fi
java -jar ~/hudson/hudson.war &
The script correctly determines the running Hudson instance's PID and kills it. However, it just waits after the "kill" line and doesn't proceed. If I hit a key there it completes killing the process and exits the script, never even getting to the while loop. Clearly I'm missing something about how the process should be killed. It's not that the Hudson process is hung and not responding to "kill"; it exits normally, just not until I intervene.
I'm also sure this could be much more efficient but right now I would just like to understand where I'm going wrong.
Thanks in advance.
This represents some straightforward improvements to your script:
#!/bin/bash
h=$(pgrep -f hudson)          # $() is preferred over backticks
if [[ -n $h ]]; then          # -n checks whether a variable is non-empty
    kill $h
    while [[ -n $h ]]; do
        sleep 1
        h=$(pgrep -f hudson)  # it's usually unnecessary to unset a variable before you set it
    done
fi
java -jar ~/hudson/hudson.war &
However, it's likely that this is all you need (or use the provided facility that mrooney referred to):
while pkill hudson; do sleep 1; done
java -jar ~/hudson/hudson.war &
How about being nice to Hudson and letting it shut itself down? I found the following statement in the Hudson forum:
I added http://server/hudson/exit in 1.161. Accessing this URL will shut down the VM that runs Hudson.
You can call the URL with wget. You can still kill Hudson if it doesn't shut down within an appropriate time.
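A rough sketch of that approach (the URL is the one quoted above; the 30-second grace period and the restart line are assumptions):
#!/bin/bash
# Ask Hudson to shut itself down via its exit URL.
wget -q -O /dev/null http://server/hudson/exit
# Give it up to 30 seconds to go away on its own.
for i in $(seq 1 30); do
    pgrep -f hudson > /dev/null || break
    sleep 1
done
# Force it if it is still running, then restart with the new environment.
pkill -f hudson 2>/dev/null
java -jar ~/hudson/hudson.war &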
EDIT: I just stumbled over another thread with interesting restart options. It uses commands of the built-in Winstone server. I'm not sure if it will pick up changes to environment variables.
If you are using Hudson via an RPM, it comes with an init script already. If not, I'd take a look at those and see if you can base your script on them: https://hudson.dev.java.net/svn/hudson/trunk/hudson/main/rpm/SOURCES/ (guest//guest).
