How to write makefile so that it ignores "irrelevant" changes?

Say I have a Makefile like this:
B: A
	quick-custom-script < $< > $@
C: B
	slow-custom-script < $< > $@
Also assume that changes in A may well produce the same B. In that case I would like the expensive making of C to be skipped, because it is certainly unnecessary work when its input is unchanged.
My idea was to put the output of quick-custom-script into a temporary file, diff that against the current B, and overwrite B only if differences are found.
In this case, the C rule would still see the old B and do nothing. Unfortunately, this creates another problem (and perhaps more that I don't see): on any subsequent run, even without any changes, A will be newer than the non-overwritten B, and hence the first script will run again, unnecessarily, even if it is quick.
I think this can be mitigated somewhat as follows:
Btemp: A
	quick-custom-script < $< > $@
B: Btemp
	diff -q $< $@ || cp $< $@
C: B
	slow-custom-script < $< > $@
Nevertheless I wonder if there is any smarter way to achieve my goal?

This is pretty close to the usual convention. It puts the comparison into the same recipe, like this:
B: A
	quick-custom-script < $< > $@T
	$(move-if-change)
With this definition:
move-if-change = @if cmp -s $@ $@T ; then rm $@T ; else mv $@T $@; fi
This combination into one rule has the advantage that if quick-custom-script terminates abnormally and the makefile is run again, the recipe will start from scratch, and the partially written output file is discarded.
Using cmp is usually quicker than diff, and mv is atomic, so it avoids reintroducing the same potential corruption.
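Put together, the pattern might look like the sketch below. It assumes GNU make and reuses the question's placeholder commands quick-custom-script and slow-custom-script; the `$@T` temporary name simply follows the convention shown above.

```makefile
# Replace the target only when the freshly generated temp file differs.
move-if-change = @if cmp -s $@ $@T ; then rm $@T ; else mv $@T $@ ; fi

# The quick step writes to a temp file first, then conditionally moves it.
B: A
	quick-custom-script < $< > $@T
	$(move-if-change)

# The slow step runs only when B's timestamp actually changed.
C: B
	slow-custom-script < $< > $@
```

When A changes but the generated output is identical, B keeps its old timestamp and C is left alone. The quick script still reruns on later invocations, because B stays older than A.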

Related

Define a target that depends on a value/variable that need to be resolved

Updated my question as it seemed to be not clear enough!
I was listing when to use make over bash. One thing I like about make is its declarative way of describing necessary steps; we can write a rule by relying on other rules knowing how to provide necessary files (or other external states).
I'm wondering how I can get the same benefit for a value not a file, without changing outer world (like leaving a temporary file).
hello.txt: (here, tell that it needs to resolve person's name)
	# Here, person's name is available.
	echo Hello $(var_name) > $@
We can imperatively prepare a necessary value with $(call prepare_name, ...) at the beginning of a command in a rule, but that's not what I'm after here.
I posted my attempts as an answer when I opened this question. Hopefully that adds more info on what I'm trying to achieve.

It's not overly clear what you're after, however to clarify a few concepts:
A target must be dependent on other targets. It cannot be dependent on a variable name. It can be dependent on the value of a variable, if that variable resolves to a target name.
So you could do:
VAR=some_target
hello.txt: $(VAR)
	echo "hello $^" > $@
some_target:
	touch $@
You CANNOT do:
VAR=some_target
hello.txt: VAR
and expect it to work (make would try to build VAR which likely doesn't exist and it would fail).
I'm assuming from the question that you want make to ask for a person's name and put it into hello.txt. In that case you would likely want to store the name in a temporary file and use that for the output:
.getname.txt:
	@read -p "enter name: " name && echo "$$name" > $@
hello.txt: .getname.txt
	@echo "hello $$(cat $<)" > $@
This will create .getname.txt only if it doesn't already exist (so it will not necessarily ask on every invocation of make). You could make .getname.txt a .PHONY target, and then it will run every time.
If you do want to run every time, then you can simply do:
hello.txt:
	@read -p "enter name: " name && echo "hello $$name" > $@
.PHONY: hello.txt
Which will invoke the hello.txt rule regardless of whether hello.txt already exists, and will always prompt the user for a name and rebuild hello.txt.

I can think of a way using the eval function. Below, suppose foo is a value obtained by a complex calculation.
hello.txt: var_name
	echo Hello $($<) > $@
.PHONY: var_name
var_name:
	$(eval $@ = foo)
Or with an .INTERMEDIATE target, this also works, but I feel it's more complicated.
var_name = var-name.txt
hello.txt: $(var_name)
	echo Hello $$(< $<) > $@
.PHONY: $(var_name)
.INTERMEDIATE: $(var_name)
$(var_name):
	rm -f $@ # In case the var file already exists
	echo bar > $@
Another way could be to use a target-specific variable. It's not listing a variable as a prerequisite, but I still don't need to think about how to get var_name when writing echo Hello ....
define get_name
$(shell echo foo)
endef
hello.txt: var_name = $(call get_name)
hello.txt:
	echo Hello $(var_name) > $@

As noted in other answers, make tracks dependencies between files, using timestamps. The usual solution for handling a value is to store it in a file (or to generate it into a file). Assuming that there is significant work to do whenever the data changes, you can follow one of the patterns below to implement a dependency check on the file's content.
The following makefile snippet will trigger a rebuild of complex-result only when the content of var-value.txt is modified. This is useful when var-value.txt is continuously regenerated but its content does not change very frequently.
all: complex-result
last-value.txt: var-value.txt
	cmp -s $< $@ || cat <$^ > $@
complex-result: last-value.txt
	echo Building for "$$(cat var-value.txt)"
	touch $@
Or a more realistic example: trigger a build if the content of any data file was modified, using md5 checksums:
all: complex-result
var-value.txt: $(wildcard *.data)
	md5sum $^ > $@
last-value.txt: var-value.txt
	cmp -s $< $@ || cat <$^ > $@
complex-result: last-value.txt
	echo Building for "$$(cat var-value.txt)"
	touch $@
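The same compare-before-overwrite idea can be applied to a plain make variable instead of a generated file. The sketch below is one possible arrangement, not taken from the answers above: NAME, name.stamp and FORCE are illustrative names, and it assumes GNU make and a value without single quotes in it.

```makefile
# NAME is a hypothetical variable; override it with `make NAME=Alice`.
NAME ?= world

# FORCE is phony, so this recipe runs on every invocation, but the stamp
# file is rewritten (and its timestamp bumped) only when the value differs.
name.stamp: FORCE
	@printf '%s\n' '$(NAME)' | cmp -s - $@ || printf '%s\n' '$(NAME)' > $@

# Rebuilt only when name.stamp is newer, i.e. when NAME actually changed.
hello.txt: name.stamp
	echo "Hello $$(cat $<)" > $@

FORCE:
.PHONY: FORCE
```

The cost is one small stamp file left on disk, which is the trade-off the question hoped to avoid; in exchange, make's ordinary timestamp logic handles the rest.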

Always process outermost file extension (and strip extensions along the way)

I have a bunch of different source files in my static HTML blog. The outermost extensions explain the format to be processed next.
Example: Source file article.html.md.gz (with target article.html) should be processed by gunzip, then by my markdown processor.
Further details:
The order of the extensions may vary
Sometimes an extension is not used (article.html.gz)
I know how to process all different extensions
I know that the final form is always article.html
Ideally I would have liked to just write rules as follows:
...
all-articles: $(ALL_HTML_FILES)
%: %.gz
gunzip ...
%: %.md
markdown ...
%: %.zip
unzip ...
And let make figure out the path to take based on the sequence of extensions.
From the documentation however, I understand that there are constraints on match-all rules, and the above is not possible.
What's the best way forward? Can make handle this situation at all?
Extensions are made up examples. My actual source files make more sense :-)

I'm on holiday so I'll bite.
I'm not a fan of pattern rules, they are too restricted and yet too arbitrary at the same time for my tastes. You can achieve what you want quite nicely in pure make:
.DELETE_ON_ERROR:
all: # Default target
files := a.html.md.gz b.html.gz
cmds<.gz> = gzip -d <$< >$@
cmds<.md> = mdtool $< -o $@
define rule-text # 1:suffix 2:basename
$(if $(filter undefined,$(flavor cmds<$1>)),$(error Cannot handle $1 files: [$2$1]))
$2: $2$1 ; $(value cmds<$1>)
all: $2
endef
emit-rule = $(eval $(call rule-text,$1,$2))# 1:suffix 2:basename
emit-hierachy = $(if $(suffix $2),$(call emit-rule,$1,$2)$(call emit-hierachy,$(suffix $2),$(basename $2)))# 1:suffix 2:basename
emit-rules = $(foreach _,$1,$(call emit-hierachy,$(suffix $_),$(basename $_)))# 1:list of source files
$(call emit-rules,${files})
.PHONY: all
all: ; : $@ Success
The key here is to set $files to your list of files.
This list is then passed to emit-rules.
emit-rules passes each file one-at-a-time to emit-hierachy.
emit-hierachy strips off each extension in turn,
generates the appropriate make syntax, which it passes to $(eval …).
emit-hierachy carries on until the file has only one extension left.
Thus a.html.md.gz becomes this make syntax:
a.html.md: a.html.md.gz ; gzip -d <$< >$@
a.html: a.html.md ; mdtool $< -o $@
all: a.html
Similarly, b.html.gz becomes:
b.html: b.html.gz ; gzip -d <$< >$@
all: b.html
Neato, or what?
If you give emit-rules a file with an unrecognised extension (c.html.pp say),
you get a compile-time error:
1:20: *** Cannot handle .pp files: [c.html.pp]. Stop.
Compile-time? Yeah, before any shell commands are run.
You can tell make how to handle .pp files by defining cmds<.pp> :-)
For extra points it's also parallel safe. So you can use -j9 on your 8 CPU laptop, and -j33 on your 32 CPU workstation. Modern life eh?

Makefile with variable number of targets

I am attempting to do a data pipeline with a Makefile. I have a big file that I want to split in smaller pieces to process in parallel. The number of subsets and the size of each subset is not known beforehand. For example, this is my file
$ for i in {1..100}; do echo $i >> a.txt; done
The first step in the Makefile should compute the ranges; let's make them fixed for now:
ranges.txt: a.txt
	for i in 0 25 50 75; do echo $$(($$i+1))'\t'$$(($$i+25)) >> $@; done
Next step should read from ranges.txt, and create a target file for each range in ranges.txt, a_1.txt, a_2.txt, a_3.txt, a_4.txt. Where a_1.txt contains lines 1 through 25, a_2.txt lines 26-50, and so on... Can this be done?

You don't say what version of make you're using, but I'll assume GNU make. There are a few ways of doing things like this; I wrote a set of blog posts about metaprogramming in GNU make (by which I mean having make generate its own rules automatically).
If it were me I'd probably use the constructed include files method for this. So, I would have your rule above for ranges.txt instead create a makefile, perhaps ranges.mk. The makefile would contain a set of targets such as a_1.txt, a_2.txt, etc. and would define target-specific variables defining the start and stop values. Then you can -include the generated ranges.mk and make will rebuild it. One thing you haven't described is when you want to recompute the ranges: does this really depend on the contents of a.txt?
Anyway, something like:
.PHONY: all
all:
ranges.mk: a.txt # really? why?
	for i in 0 25 50 75; do \
	    echo "a_$$i.txt : RANGE_START := $$(($$i+1))"; \
	    echo "a_$$i.txt : RANGE_END := $$(($$i+25))"; \
	    echo "TARGETS += a_$$i.txt"; \
	done > $@
-include ranges.mk
all: $(TARGETS)
$(TARGETS) : a.txt # seems more likely
	process --out $@ --in $< --start $(RANGE_START) --end $(RANGE_END)
(or whatever command; you don't give any example).
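To make the idea concrete: for the fixed ranges in the question, the generated ranges.mk is meant to contain lines like the ones below. The sed recipe is only an assumption standing in for the unspecified processing command; it copies the selected lines of a.txt into each target.

```makefile
# Intended contents of ranges.mk for the first two ranges (0 and 25).
a_0.txt : RANGE_START := 1
a_0.txt : RANGE_END := 25
TARGETS += a_0.txt
a_25.txt : RANGE_START := 26
a_25.txt : RANGE_END := 50
TARGETS += a_25.txt

# One possible recipe: extract lines RANGE_START..RANGE_END of a.txt.
# The target-specific variables above are in scope inside the recipe.
$(TARGETS): a.txt
	sed -n '$(RANGE_START),$(RANGE_END)p' $< > $@
```

With this naming the pieces are called a_0.txt, a_25.txt and so on, rather than a_1.txt through a_4.txt as in the question; the generator loop would need a counter to produce the latter.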

multiple targets from one recipe and parallel execution

I have a project which includes a code generator which generates several .c and .h files from one input file with just one invocation of the code generator. I have a rule which has the .c and .h files as multiple targets, the input file as the prerequisite, and the recipe is the invocation of the code generator. I then have further rules to compile and link the generated .c files.
This works fine with a -j factor of 1, but if I increase the j factor, I find I get multiple invocations of the code generator, up to the -j factor or the number of expected target files, whichever is smallest. This is bad because multiple invocations of the code generator can cause failures due to the generated code being written multiple times.
I'm not going to post my actual (large) code here, but I have been able to construct a small example which appears to demonstrate the same behavior.
The Makefile looks like this:
output.concat: output5 output4 output3 output2 output1
	cat $^ > $@
output1 output2 output3 output4 output5: input
	./frob input
clean:
	rm -rf output*
Instead of a code generator, for this example I have written a simple shell script, frob which generates multiple output files from one input file:
#!/bin/bash
for i in {1..5}; do
{
echo "This is output${i}, generated from ${1}. input was:"
cat ${1}
} > output${i}
done
When I run this Makefile with non-unity -j factors, I get the following output:
$ make -j2
./frob input
./frob input
cat output5 output4 output3 output2 output1 > output.concat
$
We see ./frob here gets invoked twice, which is bad. Is there some way I can construct this rule such that the recipe only gets invoked once, even with a non-unity -j factor?
I have considered changing the rule so that just one of the expected output files is the target, then adding another rule with no recipe such that its targets are the remaining expected output files, and the prerequisite is the first expected output file. But I'm not sure this would work, because I don't know if I can guarantee the order in which the files are generated, and thus may end up with circular dependencies.

This is how make is defined to work. A rule like this:
foo bar baz : boz ; $(BUILDIT)
is exactly equivalent, to make, to writing these three rules:
foo : boz ; $(BUILDIT)
bar : boz ; $(BUILDIT)
baz : boz ; $(BUILDIT)
There is no way (in GNU make) to define an explicit rule with the characteristics you want; that is that one invocation of the recipe will build all three targets.
However, if your output files and your input file share a common base, you CAN write a pattern rule like this:
%.foo %.bar %.baz : %.boz ; $(BUILDIT)
Strangely, for implicit rules with multiple targets GNU make assumes that a single invocation of the recipe WILL build all the targets, and it will behave exactly as you want.

Correctly generate and update multiple targets a b c in parallel make -j from input files i1 i2:
all: a b c
.INTERMEDIATE: d
a: d
b: d
c: d
d: i1 i2
	cat i1 i2 > a
	cat i1 i2 > b
	cat i1 i2 > c
If any of a,b,c are missing, the pseudo-target d is remade. The file d is never created; the single rule for d avoids several parallel invocations of the recipe.
.INTERMEDIATE ensures that missing file d doesn't trigger the d recipe.
Some other ways to handle multiple targets are described in John Graham-Cumming's "The GNU Make Book", pp. 92-96.

@MadScientist's answer is promising - I think I could possibly use that. In the meantime, I have been playing with this some more and come up with a different possible solution, as hinted at in the question. I can split the rule in two as follows:
INPUT_FILE = input
OUTPUT_FILES = output5 output4 output3 output2 output1
OUTPUT_FILE1 = $(firstword $(OUTPUT_FILES))
OUTPUT_FILES_REST = $(wordlist 2,$(words $(OUTPUT_FILES)),$(OUTPUT_FILES))
$(OUTPUT_FILE1): $(INPUT_FILE)
	./frob $<
	touch $(OUTPUT_FILES_REST)
$(OUTPUT_FILES_REST): $(OUTPUT_FILE1)
Giving only one output file as a target fixes the possible parallelism problem. Then we make this one output file as a prerequisite to the rest of the output files. Importantly in the frob recipe, we touch all the output files with the exception of the first so we are guaranteed that the first will have an older timestamp than all the rest.

As of make 4.3 (Jan 2020) make allows grouped targets. As per docs the following will update all targets only once if any of the targets is missing or outdated:
foo bar biz &: baz boz
	echo $^ > foo
	echo $^ > bar
	echo $^ > biz
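Applied to the frob example from the question, a grouped-target version might look like the following sketch. It assumes GNU make 4.3 or later; the feature check at the top is an optional guard so that an older make stops with an error instead of silently treating `&` as an ordinary target name.

```makefile
# Stop early on versions of make that lack grouped targets.
ifeq (,$(filter grouped-target,$(.FEATURES)))
$(error This makefile needs GNU make 4.3 or later for grouped targets)
endif

output.concat: output5 output4 output3 output2 output1
	cat $^ > $@

# "&:" declares that a single run of the recipe produces all five files,
# so make -j invokes ./frob at most once per rebuild.
output1 output2 output3 output4 output5 &: input
	./frob input
```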

Answer by Ivan Zaentsev almost worked for me, with exception of the following issue. Only when running parallel make (-j2 or above), when a prerequisite of the generated file was changed, the generated file was regenerated successfully, however, the subsequent targets that depend on the generated file were not rebuilt.
The workaround I found was to provide a recipe for the generated files (the trivial copy command), besides the dependency on the intermediate target (d):
d: i1 i2
	cat i1 i2 > a.gen
	cat i1 i2 > b.gen
	cat i1 i2 > c.gen
.INTERMEDIATE: d
a.gen : d
b.gen : d
c.gen : d
a: a.gen d
	cp $< $@
b: b.gen d
	cp $< $@
c: c.gen d
	cp $< $@
e: a b c
	some_command $@ $^
The clue was this debug output from make when running without the workaround (where 'e' was not rebuilt with make -j2, despite a,b,c being rebuilt):
Finished prerequisites of target file `a'.
Prerequisite `d' of target `a' does not exist.
No recipe for `a' and no prerequisites actually changed.
No need to remake target `a'.

Here is the solution that seemed to work for me (credit to @Ivan Zaentsev for the main solution and to @alexei for pointing out the problem with it). It is similar to the original approach with one major change: instead of generating temporary files (*.gen as suggested), it just touches the files that depend on the INTERMEDIATE file:
default: f
.INTERMEDIATE: temp
a b c: temp
	touch $@
temp: i1 i2
	echo "BUILD: a b c"
	cat i1 i2 > a
	cat i1 i2 > b
	cat i1 i2 > c
e: a b c
	echo "BUILD: e"
	touch $@
f: e
	echo "BUILD: f"
	touch $@

stop on error when target of makefile rule is a foreach function

I have a makefile that defines several rules where the target is a foreach function.
$(foreach var,$(list), $($(var)_stuff) $($(var)_more_stuff)):
	@echo Building $@ from $^...
	$(CC) $(FLAGS) ...
Is there any way to get make to quit when encountering an error without going through the entire list.

One workaround is to "manually" invoke exit on failure.
For example, assume we have a directory called scripts with a number of shell scripts (with filenames that end with .sh) that we want to execute.
Then a variable declaration like this:
LIST_OF_SCRIPTS ?= $(wildcard scripts/*.sh)
will give us a list of those scripts, and a target like this:
run-all-scripts:
	@$(foreach scriptfile,$(LIST_OF_SCRIPTS),$(scriptfile);)
will run all of those scripts, but as you note, the foreach loop will keep going whether or not one of the scripts returns an error code. Adding a || exit to the command will force the subcommand to exit on error, which Make will then treat as a failure.
E.g.,
run-all-scripts:
	@$(foreach scriptfile,$(LIST_OF_SCRIPTS),$(scriptfile) || exit;)
will do what you want (I believe).
Specifically, using your pseudo-code example, I think you want something like this:
$(foreach var,$(list), $($(var)_stuff) $($(var)_more_stuff)):
	@echo Building $@ from $^...
	($(CC) $(FLAGS) ...) || exit
(where all I've changed is wrapping the $(CC) $(FLAGS) ... bit in parens and appending || exit to make it fail on error).

The foreach is completely evaluated and substituted before any of the rules are executed. So the behaviour of this should be identical to as if you had hardcoded the rule without using the foreach. In other words, it's not directly relevant to the problem.
There are only a few possible explanations for what you're seeing, mostly described in the manual here:
You are running Make with -k or --keep-going
You are running Make with -i or --ignore-errors
Your target is listed as a prerequisite of the special .IGNORE target
Your recipe starts with a -
Your recipe isn't actually returning a non-zero exit status

Not sure about your example, but maybe the problem is the ; separator. Look at what make shows and executes for this Makefile:
dirs = $(shell ls)
clean:
	$(foreach dir,$(dirs),echo $(dir);)
This produces:
$ make clean
echo bin; echo install.sh; echo Makefile; echo README.md; echo utils;
So make checks the exit code of the last command only: echo utils.
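Because the whole loop expands to one shell command line, another way to stop at the first failure is to turn on the shell's errexit option at the start of that line. This is a minimal sketch, not from the answers above; it reuses the LIST_OF_SCRIPTS variable from the earlier answer and assumes the scripts can be run with sh.

```makefile
# LIST_OF_SCRIPTS mirrors the variable used in the earlier answer.
LIST_OF_SCRIPTS ?= $(wildcard scripts/*.sh)

.PHONY: run-all-scripts
run-all-scripts:
	@set -e; $(foreach scriptfile,$(LIST_OF_SCRIPTS),sh $(scriptfile);)
```

With set -e the shell exits as soon as one of the commands returns a non-zero status, and make then reports the recipe as failed.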
