What is the simplest way to make sure that make target does not run concurrently? If two processes run make target at the same time, both will run the target and will likely step on each other's toes.
A shell snippet such as
dotlockfile target.lock || exit 1
trap "dotlockfile -u target.lock" EXIT
make target
works well: if two processes run that snippet at the same time, one of them acquires the lock and the other waits for it to finish. When the first process completes, the other gets to run make target, which returns immediately and does nothing because the target has already been built.
I'm hoping there is a simpler way to do the same thing.
The following (in your makefile) should significantly reduce the probability that several processes collide:
target:
	@if [ -f making.target ]; then \
		echo "Already making target..."; \
		exit 1; \
	else \
		touch making.target; \
		<do whatever is needed to make target>; \
		rm making.target; \
	fi
You can easily test it by replacing <do whatever is needed to make target> with sleep 10 and trying to make target twice, with and without waiting for the first invocation to finish before launching the second. Note that you can also simply skip the recipe instead of aborting the whole make run: just remove the exit 1 line.
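If you want the check-and-create step to be atomic rather than merely unlikely to race, a variant of the same idea (my sketch, not part of the original answer) is to use mkdir as the marker, since creating a directory either succeeds or fails as a single operation; this is the same trick a later answer uses for a cron script:

target:
	@if mkdir making.target 2>/dev/null; then \
		<do whatever is needed to make target>; \
		rmdir making.target; \
	else \
		echo "Already making target..."; \
		exit 1; \
	fi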
The simpler solution is to use flock as follows:
flock /tmp/target.lock make target
If this command is run while another process is holding the lock, it will wait for that process to finish and will run when the lock is released. Alternatively, if the intention is to give up when a make target is already running, the -n flag can be used like so:
flock -n /tmp/target.lock make target
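If waiting indefinitely is not acceptable either, flock also accepts a timeout in seconds via -w (the 60 below is just an example value):

flock -w 60 /tmp/target.lock make target || echo "gave up waiting for the lock" >&2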
Related
I am trying to work through securing my scripts from parallel execution by incorporating flock. I have read a number of threads here and came across a reference to this: http://www.kfirlavi.com/blog/2012/11/06/elegant-locking-of-bash-program/ which incorporates many of the examples presented in the other threads.
My scripts will eventually run on Ubuntu (>14), OS X 10.7 and 10.11.4. I am mainly testing on OS X 10.11.4 and have installed flock via homebrew.
When I run the script below, locks are created, but I think I am forking the subscripts, and it is these subscripts that I am trying to ensure do not run more than one instance each.
#!/bin/bash
#----------------------------------------------------------------
set -vx
set -euo pipefail
set -o errexit
IFS=$'\n\t'
readonly PROGNAME=$(basename "$0")
readonly LOCKFILE_DIR=/tmp
readonly LOCK_FD=200
subprocess1="/bash$/subprocess1.sh"
subprocess2="/bash$/subprocess2.sh"
lock() {
    local prefix=$1
    local fd=${2:-$LOCK_FD}
    local lock_file=$LOCKFILE_DIR/$prefix.lock

    # create lock file
    eval "exec $fd>$lock_file"

    # acquire the lock
    flock -n $fd \
        && return 0 \
        || return 1
}
eexit() {
    local error_str="$@"
    echo $error_str
    exit 1
}
main() {
    lock $PROGNAME \
        || eexit "Only one instance of $PROGNAME can run at one time."

    ##My child scripts
    sh "$subprocess1" #wait for it to finish then run
    sh "$subprocess2"
}
main
$subprocess1 is a script that loads ncftpget and logs into a remote server to grab some files. Once finished, the connection closes. I want to run $subprocess1 every 15 minutes via cron. I have done so with success, but sometimes there are many files to grab and the job takes longer than 15 minutes. It is rare, but it does happen. In such a case, I want to ensure a second instance of $subprocess1 can't be started. For clarity, a small example of such a subscript is:
#!/bin/bash
remoteftp="someftp.ftp"
ncftplog="somelog.log"
localdir="some/local/dir"
ncftpget -R -T -f "$remoteftp" -d "$ncftplog" "$localdir" "*.files"
EXIT_V="$?"
case $EXIT_V in
0) O="Success!";;
1) O="Could not connect to remote host.";;
2) O="Could not connect to remote host - timed out.";;
3) O="Transfer failed.";;
4) O="Transfer failed - timed out.";;
5) O="Directory change failed.";;
6) O="Directory change failed - timed out.";;
7) O="Malformed URL.";;
8) O="Usage error.";;
9) O="Error in login configuration file.";;
10) O="Library initialization failed.";;
11) O="Session initialization failed.";;
esac
if [ "$EXIT_V" = 0 ];
then
echo ""$O"
else
echo "There has been an error: "$O""
echo "Exiting now..."
exit
fi
echo "Goodbye"
and an example of subprocess2 is:
#!/bin/bash
...preamble script setup items etc and then:
java -jar /some/javaprog.java
When I execute the parent script with "sh lock.sh", it progresses through the script without error and exits. The first issue I have is that if I load up the script again, I get an error indicating that only one instance of lock.sh can run. What should I have added to the script so that it indicates the processes have not completed yet (rather than merely exiting and giving back the prompt)?
However, if subprocess1 were already running on its own, lock.sh would still launch a second instance of subprocess1 because the subscript itself is not locked. How would one go about locking the child scripts, and ideally ensuring that forked processes are taken care of as well? If someone had run subprocess1 at the terminal, or there were a runaway instance, and cron then loaded lock.sh, I would want lock.sh to fail when trying to start its own instances of subprocess1 and subprocess2, not merely exit because cron tried to load two lock.sh instances.
My main concern is loading multiple instances of ncftpget, which is called by subprocess1, and of a further script I hope to incorporate, "subprocess2," which launches a java program that deals with the downloaded files. Neither ncftpget nor the java program can run in parallel without breaking many things, but I'm at a loss on how to control them adequately.
I thought I could use something similar to this in the main() function of lock.sh:
#This is where I try to lock the subscript
pidfile="${subprocess1}"
# lock it
exec 200>$pidfile
flock -n 200 || exit 1
pid=$$
echo $pid 1>&200
but I am not sure how to incorporate it.
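For the child scripts, one common pattern (a sketch of my own, not from the linked blog; the lock file names are made up) is to wrap each call in flock so that the lock is held exactly as long as the child runs, and the call refuses to start if the lock is already taken:

# in main() of lock.sh, instead of calling the subscripts directly:
flock -n /tmp/subprocess1.lock sh "$subprocess1" \
    || eexit "subprocess1 is already running."
flock -n /tmp/subprocess2.lock sh "$subprocess2" \
    || eexit "subprocess2 is already running."

Note that this only protects against copies started through the same wrapper; an instance launched by hand at the terminal will not hold the lock unless the subscript itself acquires it (for example with an exec 200> and flock -n block, as above) as its first step.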
I have a shell script which usually takes nearly 10 minutes for a single run. I need to know whether, if another request to run the script comes in while an instance is already running, the new request has to wait for the existing instance to complete, or whether a new instance will be started.
I need a new instance to be started whenever a request for the same script comes in. How do I do that?
The shell script is a polling script which looks for files in a directory and executes them. Executing a file takes nearly 10 minutes or more, but if a new file arrives during that time, it also has to be executed simultaneously. The shell script is below; how do I modify it to execute multiple requests?
#!/bin/bash
while [ 1 ]; do
    newfiles=`find /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/ -newer /afs/rch/usr$
    touch /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/.my_marker
    if [ -n "$newfiles" ]; then
        echo "found files $newfiles"
        name2=`ls /afs/rch/usr8/fsptools/WWW/cgi-bin/upload/ -Art |tail -n 2 |head $
        echo " $name2 "
        mkdir -p -m 0755 /afs/rch/usr8/fsptools/WWW/dumpspace/$name2
        name1="/afs/rch/usr8/fsptools/WWW/dumpspace/fipsdumputils/fipsdumputil -e -$
        $name1
        touch /afs/rch/usr8/fsptools/WWW/dumpspace/tempfiles/$name2
    fi
    sleep 5
done
When writing scripts like the one you describe, I take one of two approaches.
First, you can use a pid file to indicate that a second copy should not run. For example:
#!/bin/sh
pidfile=/var/run/${0##*/}.pid

# remove pid if we exit normally or are terminated
trap "rm -f $pidfile" 0 1 3 15

# Write the pid as a symlink
if ! ln -s "pid=$$" "$pidfile"; then
    echo "Already running. Exiting." >&2
    exit 0
fi

# Do your stuff
I like using symlinks to store the pid because writing a symlink is an atomic operation; two processes can't conflict with each other. You don't even need to check for the existence of the pid symlink, because a failure of ln clearly indicates that a pid cannot be set. That's either a permission or path problem, or it's due to the symlink already being there.
The second option is to make it possible .. nay, preferable .. not to block additional instances, and instead configure whatever it is that this script does to permit multiple servers to run at the same time on different queue entries. "Single-queue-single-server" is never as good as "single-queue-multi-server". Since you haven't included code in your question, I have no way to know whether this approach would be useful for you, but here's some explanatory meta bash:
#!/usr/bin/env bash

workdir=/var/tmp # Set a better $workdir than this.

a=( $(get_list_of_queue_ids) ) # A command? A function? Up to you.

for qid in "${a[@]}"; do
    # Set a "lock" for this item .. or don't, and move on.
    if ! ln -s "pid=$$" $workdir/$qid.working; then
        continue
    fi

    # Do your stuff with just this $qid.
    ...

    # And finally, clean up after ourselves
    remove_qid_from_queue $qid
    rm $workdir/$qid.working
done
The effect of this is to transfer the idea of "one at a time" from the handler to the data. If you have a multi-CPU system, you probably have enough capacity to handle multiple queue entries at the same time.
ghoti's answer shows some helpful techniques, if modifying the script is an option.
Generally speaking, for an existing script:
Unless you know with certainty that:
the script has no side effects other than to output to the terminal or to write to files with shell-instance specific names (such as incorporating $$, the current shell's PID, into filenames) or some other instance-specific location,
OR that the script was explicitly designed for parallel execution,
I would assume that you cannot safely run multiple copies of the script simultaneously.
It is not reasonable to expect the average shell script to be designed for concurrent use.
From the viewpoint of the operating system, several processes may of course execute the same program in parallel. No need to worry about this.
However, it is conceivable that a (careless) programmer wrote the program in such a way that it produces incorrect results when two copies are executed in parallel.
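A toy example of such carelessness (entirely hypothetical, just to illustrate the failure mode): a script that does a non-atomic read-modify-write of a shared file loses updates when two copies overlap.

#!/bin/bash
# counter.sh - naive increment of a number stored in ./counter
n=$(cat counter)            # read the current value
sleep 1                     # window in which a second copy can read the same value
echo $((n + 1)) > counter   # write back; one of two overlapping increments is lost

Start with 0 in the file, run two copies at the same time, and the file usually ends up containing 1 rather than 2.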
How do I prevent a cron job from executing the same command multiple times? I have tried to look around, and tried to check for and kill the running process, but it doesn't work with the code below: it keeps entering the else condition where it is supposed to print "running". Any idea which part I did wrong?
#!/bin/sh
devPath=`ps aux | grep "[i]mport_shell_script"` | xargs

if [ ! -z "$devPath" -a "$devPath" != " " ]; then
    echo "running"
    exit
else
    while true
    do
        sudo /usr/bin/php /var/www/html/xxx/import_from_datafile.php /dev/null 2>&1
        sleep 5
    done
fi

exit
cronjob:
*/2 * * * * root /bin/sh /var/www/html/xxx/import_shell_script.sh /dev/null 2>&1
I don't see the point of adding a cron job which then starts a loop that runs the job. Either use cron to run the job every minute, or use a daemon script to make sure your service is started and kept running.
To check whether your script is already running, you can use a lock directory (unless your daemon framework already does that for you):
LOCK=/tmp/script.lock # You may want a better name here
mkdir $LOCK || exit 1 # Exit with error if script is already running
trap "rmdir $LOCK" EXIT # Remove the lock when the script terminates
...normal code...
If your OS supports it, then /var/lock/script might be a better path.
Your next question is probably how to write a daemon. To answer that, I need to know what kind of Linux you're using and whether you have things like systemd, daemonize, etc.
Check for the presence of a file at the beginning of your script (for example /tmp/runonce-import_shell_script). If it exists, that means the same script is already running (or a previous run halted with an error).
You can also put a timestamp in that file so you can check how long the script has been running (and maybe decide to run it again after 24h even if the file is still present).
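A minimal sketch of that idea (the file name and the 24h threshold are only examples):

LOCK=/tmp/runonce-import_shell_script

if [ -e "$LOCK" ]; then
    # treat the lock as stale after 24 hours (1440 minutes)
    if [ -n "$(find "$LOCK" -mmin +1440)" ]; then
        echo "Stale lock from $(cat "$LOCK"), removing it." >&2
        rm -f "$LOCK"
    else
        echo "Already running since $(cat "$LOCK")." >&2
        exit 1
    fi
fi

date > "$LOCK"            # record when this run started
trap 'rm -f "$LOCK"' EXIT

# ...normal code...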
I have the following in my script:
#!/bin/bash
[ ! -S ./notify ] && find ./stylesheets/sass/ \-maxdepth 1 \-type f \-regex '.*/[^_][^/]*\.scss$' | entr \+notify &
What entr does here is create notify as a named pipe.
[ insert ]
while read F; do
...
#some processing on found files
#(does not matter for this question at all)
...
done < notify
The problem is that the first time I run the script, it sees there is no notify pipe, so it creates one and puts the process into the background. But then the following while loop complains that it cannot find notify to read from. However, when I run the script immediately after that, for the second time, it continues normally through the rest of the program (the while loop part). How would I fix this so it all runs correctly in one go?
EDIT:
If I put sleep 1; into the [ insert ] placeholder above, it works, but I would like a better solution for checking when that notify fifo exists, as sometimes it may need more than 1 second.
You can always poll for the named pipe to be created:
until [ -p notify ]; do read -t 0.1; done
If you don't specifically need to maintain variables between runs, you could also consider having entr run a script directly rather than using its +notify fifo. That would avoid the problem.
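For instance (process_changed.sh is a made-up name standing in for whatever the while loop body currently does), entr can run the handler each time one of the watched files changes:

find ./stylesheets/sass/ -maxdepth 1 -type f -regex '.*/[^_][^/]*\.scss$' | entr ./process_changed.sh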
I've got the following Makefile:
#runs the working directory unit tests
test:
	@NODE_ENV=test; \
	mocha --ignore-leaks $(shell find ./test -name \*test.js);

#deploys working directory
deploy:
	@make test; \
	make deploy-git; \
	make deploy-servers;

#deploys working to git deployment branch
deploy-git:
	@status=$$(git status --porcelain); \
	if test "x$${status}" = x; then \
		git branch -f deployment; \
		git push origin deployment; \
		echo "Done deploying to git deployment branch."; \
	else \
		git status; \
		echo "Error: cannot deploy. Working directory is dirty."; \
	fi

deploy-servers:
# for each server
# @DEPLOY_SERVER_IP = "127.0.0.1"; \
# make deploy-server

#deploy-server:
# connect to this server with ssh
# check if app is already running
# stop the app on the server if already running
# set working directory to app folder
# update deployment git branch
# use git to move head to deployment branch
# start app again
Note that deploy-servers and deploy-server are just dummies for now. This is what the deploy command should do:
run the tests (make test), exit on failure
push current head to deployment branch (make deploy-git), exit on failure
pull deployment branch on servers (make deploy-servers)
You can see this in the Makefile as:
deploy:
	@make test; \
	make deploy-git; \
	make deploy-servers;
The issue is that I am not sure how to prevent make deploy-git from executing when make test fails, and how to prevent make deploy-servers from executing when the tests fail or when make deploy-git fails.
Is there a clear way to do this, or should I resort to using shell files or write these tools in a normal programming language?
The exit status of a shell command list is the exit status of the last command in the list. Simply turn your command list into separate single simple commands. By default, make stops when a command returns nonzero. So you get what you want with
deploy:
	@make test
	make deploy-git
	make deploy-servers
Should you ever want to ignore the exit status of a simple command, you can prefix it with a dash:
target:
	cmd1
	-cmd2 # It is okay if this fails
	cmd3
Your make manual has all the details.
Others have given answers which are based on splitting the "recipe" into individual commands.
In situations where that is not viable, what you can do is set -e in the shell script to make it terminate if a command fails:
target:
	set -e ; \
	command1 ; \
	command2 ; command3 ; \
	... commandN
This is the same set -e that you would put near the top of a shell script to get it to bail when some command terminates unsuccessfully.
Suppose instead that only command2 must succeed, and that it is okay if command1 and command3 indicate failure, or do not reliably use their termination status. Then, instead of set -e, we can code an explicit exit test:
target:
	command1 ; \
	command2 || exit 1 ; \
	command3 ; \
	true # exit 0 will do here also.
Since command3 can indicate failure, and we don't want it to fail our build, we add a successful dummy command.
make should already do this; it executes complex commands with sh -e, which (as long as it's not in a loop in a POSIX compatible shell) will abort execution if a command exits nonzero, and aborts the entire Makefile on failure of a command unless you specifically tell it not to. If you're feeling paranoid, you can use && in place of ; in your commands.
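For example, a sketch of the same deploy recipe with && instead of ;, so that a failure at any step stops the rest of the line even within a single shell invocation:

deploy:
	@make test && \
	make deploy-git && \
	make deploy-servers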
I solved this very issue by proxying to a new make command at the potential breakpoint:
.PHONY: cmd_name cmd_name_contd

cmd_name:
	if [ "`pwd`" = "/this/dir" ]; then make cmd_name_contd; fi

cmd_name_contd:
	@echo "The directory was good, continuing"
That way, if the directory is wrong, it just exits silently. You could also add an else condition with a message to display on failure, as sketched below.
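For example (the message text is arbitrary):

cmd_name:
	if [ "`pwd`" = "/this/dir" ]; then \
		make cmd_name_contd; \
	else \
		echo "Not in /this/dir, refusing to continue." >&2; \
	fi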