Say I have a simple Makefile.
a1: some_script b
	some_command $< b
a2: some_other_script b
	some_command $< b
b: c
	touch $@
c:
	touch $@
Where b is some database-like file that is required to make a1 and a2. However, every time b is accessed (even if not altered) its modification date changes. Therefore, any time the rule for a2 is executed, Make thinks that a1 needs to be remade because the database b was used (even if c hasn't changed and b remains the same). I only want to update a1 and a2 if c is newer (and b therefore actually needs rebuilding).
I could simply have a1 and a2 depend on c directly, but that misrepresents the true workflow.
I do not want to remove b, so having it as an intermediate file won't work.
I've also tried including b as an order-only dependency, but a1 and a2 will never be re-made unless forced to.
Notes: The Makefile is meant to automate executing scripts and keep track of dependencies for a research project (rather than a software project). Perhaps Make is not the right tool for this. The database-like files are GeoPackages.
If you can't rely on the timestamp of b to be accurate, then you need to not use it in your makefile. You can do something like this:
a1: some_script .buildc
	some_command $< b
a2: some_other_script .buildc
	some_command $< b
.buildc: c
	command to update b
	touch $@
c:
	touch $@
This will run command to update b only if c is newer than .buildc, whose timestamp is set each time that command is invoked, not each time b is used.
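A minimal runnable sketch of the stamp-file idea, with plain touch commands standing in for the real scripts (demo.mk is a made-up name):

```shell
# Sketch of the stamp-file idea: .buildc records when b was last actually
# rebuilt, so later changes to b's mtime (from mere reads) cannot
# retrigger a1. demo.mk and the plain "touch" commands are stand-ins.
touch c
printf 'a1: .buildc\n\t@touch a1\n.buildc: c\n\t@touch b\n\t@touch .buildc\n' > demo.mk
make -f demo.mk a1     # rebuilds b and the .buildc stamp, then a1
touch b                # simulate the mtime change caused by accessing b
make -f demo.mk a1     # a1 stays up to date: only .buildc's mtime matters
```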
You could maybe just prevent useless changes of the timestamp of b:
a1: some_script b
	touch -r b .b.a1
	some_command $< b
	touch -r .b.a1 b && rm .b.a1
a2: some_other_script b
	touch -r b .b.a2
	some_command $< b
	touch -r .b.a2 b && rm .b.a2
But be careful: if you run make in parallel mode (make -j), the a1 and a2 recipes could run in parallel, with potential race conditions. It is thus probably safer to serialize them with .NOTPARALLEL: or by using flock in the recipes.
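A sketch of the flock variant as plain shell, with .b.lock and .b.stamp as assumed scratch-file names and a touch standing in for the command that bumps b's mtime:

```shell
# Sketch of the flock variant: take an exclusive lock around the
# save/use/restore of b's timestamp so parallel recipes cannot race.
# .b.lock and .b.stamp are assumed scratch-file names.
touch b
touch -t 202001010000 b        # give b a known old timestamp for the demo
(
  flock 9                      # exclusive lock on fd 9 (.b.lock)
  touch -r b .b.stamp          # save b's timestamp
  touch b                      # stand-in for a command whose access bumps b's mtime
  touch -r .b.stamp b          # restore the saved timestamp
  rm .b.stamp
) 9> .b.lock
```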
I think the "order-only prerequisite" might do the trick:
a1: some_script | b
	some_command $< b
a2: some_other_script | b
	some_command $< b
b: c
	touch $@
c:
	touch $@
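The order-only semantics can be checked with a small sketch (s and demo.mk are made-up names; plain touch stands in for the real commands):

```shell
# Sketch: with an order-only prerequisite (after the |), b must exist
# before a1 is built, but b's timestamp never makes a1 out of date.
# s stands in for some_script; plain touch stands in for the commands.
printf 'a1: s | b\n\t@touch a1\nb:\n\t@touch b\ns:\n\t@touch s\n' > demo.mk
make -f demo.mk a1     # builds s, b and a1
touch b                # b becomes newer than a1
make -f demo.mk a1     # a1 is still considered up to date
```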
I'd like to use make to process a large number of inputs into outputs using a script (Python, say). The problem is that the script takes an incredibly short amount of time to run per input, but the initialization takes a while (Python engine + library initialization). So a naive makefile that just has an input->output rule ends up being dominated by this initialization time. Parallelism doesn't help with that.
The python script can accept multiple inputs and outputs, as so:
python my_process -i in1 -o out1 -i in2 -o out2 ...
and this is the recommended way to use the script.
How can I make a Makefile rule that best uses my_process, by sending in out of date input-output pairs in batches? Something like parallel but aware of which outputs are out of date.
I would prefer to avoid recursive make, if at all possible.
I don't completely grasp your problem: do you really want make to operate in batches, or do you want a kind of perpetual make process that checks the file system on the fly and feeds files to the Python process whenever it finds it necessary? If the latter, this is quite the opposite of a batch mode and rather a pipeline.
For the batch mode there is a workaround which needs a dummy file recording the time of the last run. In this case we are abusing make, because this part of the makefile is a one-trick pony, which is unintuitive and against good practice:
SOURCES := $(wildcard in*)
lastrun : $(SOURCES)
	python my_process $(foreach src,$?,-i $(src) -o $(patsubst in%,out%,$(src)))
	touch lastrun
PS: please note that this solution has a substantial flaw: it does not detect updates to in-files that happen during the run of the makefile. All in all it is more advisable to have the update process itself collect the filenames of the in-files it updated, and avoid make altogether.
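The core of this workaround is $?, which expands to only the prerequisites newer than the target. A minimal runnable sketch (demo.mk, in1 and in2 are made-up names):

```shell
# Sketch: $? lists only the prerequisites newer than the target, so the
# lastrun rule above feeds only out-of-date inputs to the batch command.
printf 'lastrun: in1 in2\n\t@echo updated: $?\n\t@touch lastrun\n' > demo.mk
touch in1 in2
make -f demo.mk lastrun   # first run: both inputs are newer than lastrun
touch in2
make -f demo.mk lastrun   # second run: only in2 appears in $?
```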
This is what I ended up going with, a makefile with one layer of recursion.
I tried using $? with both grouped and ungrouped targets, but couldn't get the exact behavior needed. If one of the output targets was deleted, the rule would be re-run, but $? would contain some input files and not necessarily the corresponding input file, which was very strange.
Makefile:
all:
INDIR=in
OUTDIR=out
INFILES=$(wildcard in/*)
OUTFILES=$(patsubst in/%, out/%, $(INFILES))
ifdef FIRST_PASS
#Discover which input-output pairs are out of date
$(shell mkdir -p $(OUTDIR); echo -n > $(OUTDIR)/.needs_rebuild)
$(OUTFILES) : out/% : in/%
	@echo $@ $^ >> $(OUTDIR)/.needs_rebuild
all: $(OUTFILES)
	@echo -n
else
#Recurse to run FIRST_PASS, builds .needs_rebuild:
$(shell $(MAKE) -f $(CURDIR)/$(firstword $(MAKEFILE_LIST)) FIRST_PASS=1)
#Convert .needs_rebuild into batches, creates all_batches phony target for convenience
$(shell cat $(OUTDIR)/.needs_rebuild | ./make_batches.sh 32 > $(OUTDIR)/.batches)
-include $(OUTDIR)/.batches
batch%:
	#In this rule, $^ is all inputs needing rebuild.
	#The corresponding outputs can be computed using a patsubst:
	targets="$(patsubst in/%, out/%, $^)"; touch $$targets
clean:
	rm -rf $(OUTDIR)
all: all_batches
endif
make_batches.sh:
#!/bin/bash
set -beEu -o pipefail
batch_size=$1
function _make_batches {
    batch_num=$1
    shift 1
    #echo ".PHONY: batch$batch_num"
    echo "all_batches: batch$batch_num"
    while (( $# >= 1 )); do
        read out in <<< $1
        shift 1
        echo "batch$batch_num: $in"
        echo "$out: batch$batch_num"
    done
}
export -f _make_batches
echo ".PHONY: all_batches"
parallel -N$batch_size -- _make_batches {#} {} \;
Unfortunately, the makefile is a one trick pony and there's quite a bit of boilerplate to pull this recipe off.
GNU make has the following option to avoid recompilation:
-o file, --old-file=file, --assume-old=file
     Do not remake the file file even if it is older than its dependencies,
     and do not remake anything on account of changes in file. Essentially
     the file is treated as very old and its rules are ignored.
So given the following Makefile:
A : B
	touch A
B : C
	touch B
C :
	touch C
Assuming all files exist at some point, I can run the following commands:
$ touch C
$ make A -o B
make: `A' is up to date. # Good. C is out-of-date, but A does not get updated.
Question
How can I change my Makefile such that B is always assumed old, but only when rebuilding A?
A : <WHAT_GOES_HERE> B
	touch A
B : C
	touch B
C :
	touch C
I'm specifically looking for the following results:
Result 1: when no files exist yet, make A should create all of them, as before.
$ make A
touch C
touch B
touch A
Result 2: when C is out-of-date, make B should update B, as before. This means we can't have an order-only dependency between B and C (i.e. B : | C). Assuming all files exist again:
$ touch C
$ make B
touch B # Good. Don't lose this property.
Result 3: when C is out-of-date, make A should be a no-op, without extra flags necessary.
$ touch C
$ make A
make: `A' is up to date.
In other words,
when deciding whether to rebuild A,
make should ignore the timestamp of B.
In other words,
B is not a pre-requisite of A.
Just leave B off in A's dependency line.
Reading your answer though, it appears that you want to build B iff it doesn't already exist.
A: $(if $(wildcard B),,B)
⋮
or, more directly,
A: $(wildcard B)
⋮
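Assuming GNU make, here is a runnable sketch checking the first form against the three results from the question (demo.mk is a made-up name):

```shell
# Sketch: A depends on B only when B does not yet exist at the time the
# makefile is read, so an existing-but-newer B never re-makes A.
printf 'A: $(if $(wildcard B),,B)\n\ttouch A\nB: C\n\ttouch B\nC:\n\ttouch C\n' > demo.mk
rm -f A B C
make -f demo.mk A    # nothing exists yet: builds C, B, then A (Result 1)
touch C              # C is now newer than B
make -f demo.mk B    # still rebuilds B (Result 2)
make -f demo.mk A    # reports A up to date: B's age is ignored (Result 3)
```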
Here is one solution. Check if B exists explicitly, and call make recursively if it doesn't:
A :
	@if [ ! -f B ]; \
	then $(MAKE) --no-print-directory B && echo touch A && touch A; \
	fi
B : C
	touch B
C :
	touch C
It works, but I was hoping for a solution using only Makefile directives, no shell script.
$ make A
touch C
touch B
touch A
$ touch C
$ make B
touch B
$ touch C
$ make A
make: `A' is up to date.
I have this simple Makefile:
a:
	touch a
b: a
	touch b
all: b
	touch myapp
make all returns:
touch a
touch b
touch myapp
clearmake all (or clearmake -C gnu all) returns:
touch a

touch b

touch myapp
How to get rid of the unnecessary newlines?
There are not many possibilities to change the output of clearmake (or clearmake -C gnu):
(-n only prints the commands, -s does not print them at all)
That leaves you with workarounds like:
clearmake all | grep -v '^$'
I need to restart the make process in case some intermediate target gets (re)built.
This is the case when a PIP requirements file gets (re)compiled, because the checksum of the resulting file is used in the path to the virtualenv that is used in the Makefile.
If the requirements file gets updated, the same $(MAKECMDGOALS) should get rebuilt from the beginning.
I have come up with the following, but this requires to make the outer/inital make process fail (exit 1), which I would like to avoid.
foo:
	echo "foo"
	# Need to restart make if this has been used as intermediate target.
	if [ "$(MAKECMDGOALS)" != "$@" ]; then \
		echo "Restarting make..."; \
		touch $@; \
		$(MAKE) $(MAKECMDGOALS); \
		exit 1; \
	fi
bar: foo
	echo "bar"
.PHONY: bar
A makefile along the lines below might fit the bill. bar and baz are goals that will be sabotaged if they intermediately make foo (a.k.a. tricky goals).
There's no restarting involved: instead, it is arranged that the tricky
goals are only made by a fresh $(MAKE), after the top make has ensured foo is
up-to-date.
.PHONY: all clean
tricky = bar baz
all:
	$(MAKE) $(tricky)
foo:
	echo "foo"
	touch $@
clean:
	rm -f foo
ifndef simple
tricky_goals = $(filter $(tricky),$(MAKECMDGOALS))
ifneq ($(tricky_goals),)
$(tricky_goals): foo
	$(MAKE) simple=y $@
endif
else # ifndef simple
# All tricky goals follow...
bar: foo
	echo "bar"
baz: foo
	echo "baz"
endif # ifndef simple
The behaviour depends on whether the variable simple is defined, which
by default it is not.
If simple is defined then the goals, whatever they are, will be made
the obvious way - the way they'd be made if it weren't for the foo pitfall.
If simple is undefined then tricky_goals becomes the tricky members of $(MAKECMDGOALS).
If $(tricky_goals) is empty there is nothing tricky to do. Otherwise the target list
$(tricky_goals) becomes dependent on foo. So if foo is out-of-date, it gets made now.
Finally each tricky goal is made by calling a simple $(MAKE) for it, in which foo will be up-to-date. Any non-tricky goals will be made the simple way.
I've included all and clean goals for completeness. Note that all needs to be coded to go the tricky way.
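The goal-classification step can be sketched in isolation; $(info ...) just prints which command-line goals count as tricky (demo.mk and qux are made-up names):

```shell
# Sketch: MAKECMDGOALS holds the goals named on the command line, and
# filter picks out the tricky ones. This is the mechanism the makefile
# above uses to decide which goals get the two-stage treatment.
printf 'tricky := bar baz\n$(info tricky goals: $(filter $(tricky),$(MAKECMDGOALS)))\nbar baz qux:\n\t@:\n' > demo.mk
make -f demo.mk bar qux    # prints "tricky goals: bar"
```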
Assume a target foo.tar, that depends on a list of files foo.files, e.g.
FOO_FILES := $(shell cat foo.files)
foo.tar: foo.files $(FOO_FILES)
	tar -cf $@ $(FOO_FILES)
Now suppose that foo.files itself needs to be generated, e.g.:
foo.files: foo.files.template
	sed -e "s/#{VERSION}/$(VERSION)/" < $< > $@
It is clear that foo.files depends on foo.files.template, but how can one make sure that FOO_FILES is evaluated after foo.files is generated?
Your original rules are correct. Because updating foo.files causes the value of FOO_FILES to become outdated, you just need to make sure your Makefile is re-evaluated by GNU make when foo.files has been updated, by making your Makefile depend on foo.files:
Makefile : foo.files
So, I found an answer reading about Advanced Auto-Dependency Generation over at mad-scientist.net. Basically, it is possible to re-evaluate a makefile by way of a GNU/Make feature. When there is a rule to generate an included makefile, the entire makefile will be re-read after the generation of the included file. Thus --
# -*- mode: make -*-
VERSION := 1.2.3
foo.tar: foo.files $(FOO_FILES)
	tar cf $@ $(FOO_FILES)
clean:
	rm -f foo.files foo_files.mk foo.tar
foo.files: foo.files.template
	sed -e "s/#{VERSION}/$(VERSION)/" < $< > $@
# -- voodoo starts here --
# This will ensure that FOO_FILES will be evaluated only
# *after* foo.files is generated.
foo_files.mk: foo.files
	echo "FOO_FILES := `xargs < $<`" > $@
include foo_files.mk
# -- voodoo ends here --
.PHONY: clean
-- seems to do the right thing.
... and just for completeness:
foo.files.template is:
a-#{VERSION}
b-#{VERSION}
and assume the presence of a-1.2.3 and b-1.2.3.
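The re-read mechanism can also be checked with a reduced sketch, assuming GNU make (demo.mk, demo.files and show are made-up names, and cp stands in for the sed step):

```shell
# Sketch: when an included makefile has a rule and gets (re)built, GNU
# make re-executes itself, so FOO_FILES is fresh on the second pass.
printf 'a-1.2.3 b-1.2.3\n' > demo.files.template
printf 'demo.files: demo.files.template\n\tcp $< $@\ndemo_files.mk: demo.files\n\techo "FOO_FILES := $$(cat $<)" > $@\n-include demo_files.mk\nshow:\n\t@echo FOO_FILES=$(FOO_FILES)\n' > demo.mk
make -f demo.mk show   # builds demo.files and demo_files.mk, restarts,
                       # then prints FOO_FILES=a-1.2.3 b-1.2.3
```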
It can't be done in one pass; Make determines which targets must be rebuilt before it actually executes any rule, and in this case the full list of targets doesn't exist until one of the rules is executed.
This should do it:
FOO_FILES := $(shell cat foo.files)
foo.tar: foo.files
	$(MAKE) foo-tarball
.PHONY: foo-tarball
foo-tarball: $(FOO_FILES)
	tar -cf foo.tar $^
EDIT:
As the OP points out, this will not work as written; I left out a prerequisite:
foo.tar: foo.files $(FOO_FILES)
	...
Note that this will recurse even if foo.files has not changed, which is not strictly necessary; it is possible to correct this, but not elegantly. (For comparison, the selected solution, which I admit is cleaner than mine, recurses even if the target has nothing to do with foo.tar.)