Can GNU Make use pattern matching to look up variables?

I'm trying to get Make to build some data analysis, where there are file lists controlled by one overall parameter.
To write it explicitly would be something like:
A_EXTS = a b c d e
B_EXTS = f g h i j
C_EXTS = k l m n o
A.dat : $(foreach EXT, ${A_EXTS}, prefix1_${EXT}.dat prefix2_${EXT}.dat)
	python analyse.py $^ > $@
B.dat : $(foreach EXT, ${B_EXTS}, prefix1_${EXT}.dat prefix2_${EXT}.dat)
	python analyse.py $^ > $@
C.dat : $(foreach EXT, ${C_EXTS}, prefix1_${EXT}.dat prefix2_${EXT}.dat)
	python analyse.py $^ > $@
Obviously the only difference between the three rules is the A vs B vs C.
I thought to try something like
%.dat : $(foreach EXT, ${%_EXTS}, prefix1_${EXT}.dat prefix2_${EXT}.dat)
	python analyse.py $^ > $@
…but that doesn't work; e.g. make B.dat runs the rule for B.dat but ignores the dependencies; $^ is set to the empty string.
The files starting prefix2_ are generated by another recipe, so I can't just specify them within the recipe, they need to be marked as dependencies here.
Is it possible to express these dependencies without repeating the same rule?

Well, you can't do it quite like you want to here, but it's not related to looking up variable names: it's because of expansion order.
Variables in targets and prerequisites are expanded when the makefile is parsed, but make doesn't expand the patterns in pattern rules until much later. That means when make expands the ${%_EXTS} variable as it parses the makefile, it has no idea what the value of % will be later when it's actually trying to build things.
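A minimal sketch of that ordering problem, assuming a hypothetical makefile that contains only the lines below. The $(info ...) line runs while the makefile is being parsed, when % is still an ordinary character, so make looks up a variable literally named %_EXTS and finds nothing:

```makefile
B_EXTS = f g

# Parse time: '%' has no value yet, so this reads an (undefined) variable
# whose name is literally "%_EXTS" and prints empty brackets.
$(info parse-time value: [${%_EXTS}])

# The prerequisite list is expanded at the same moment, so it is empty too.
%.dat : $(foreach EXT,${%_EXTS},prefix1_${EXT}.dat prefix2_${EXT}.dat)
	@echo "building $@ from [$^]"
```

Running make B.dat in a directory without B.dat should report an empty parse-time value and an empty prerequisite list, matching the behaviour described in the question.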
You can use secondary expansion to delay expansion of variables until make's second pass where it is actually finding target names. I pulled the logic out into a separate variable and used call to make it a bit more readable:
.SECONDEXPANSION:
EXPANDDEPS = $(foreach EXT,${$1_EXTS},prefix1_${EXT}.dat prefix2_${EXT}.dat)
%.dat : $$(call EXPANDDEPS,$$*)
	python analyse.py $^ > $@
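To check what the prerequisite list becomes once the stem is known, one option is to call EXPANDDEPS by hand and print the result. This is only a sketch: the shortened extension lists and the check target are assumptions for illustration, not part of the original makefile.

```makefile
A_EXTS = a b
B_EXTS = f g
EXPANDDEPS = $(foreach EXT,${$1_EXTS},prefix1_${EXT}.dat prefix2_${EXT}.dat)

# Print the list that the second expansion would produce for stems A and B.
# For A this is: prefix1_a.dat prefix2_a.dat prefix1_b.dat prefix2_b.dat
$(info A -> $(call EXPANDDEPS,A))
$(info B -> $(call EXPANDDEPS,B))

# Hypothetical no-op target so that plain 'make' has something to do.
.PHONY: check
check: ;
```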

Related

Makefile dependencies based on target

I have a Makefile with user-specified input files in the variable INPUT_FILES.
For each input file, I need to create an input file prime.
Some notes:
Each input file can have an arbitrary file location
It is reasonable to assume there aren't duplicate filenames
Each output file needs to go into $(OUTPUT_DIR)
My basic strategy has been to generate the set of targets based on INPUT_FILES and then try to determine which input file is the actual dependency of each target.
A few variations I've tried:
# Create a list of targets
OUTPUT_FILES = $(foreach file,$(notdir $(INPUT_FILES)),$(OUTPUT_DIR)/$(file))
# This doesn't work, because all input files are dependencies of each output file
$(OUTPUT_FILES): $(INPUT_FILES)
	program --input $^ --output $@
# This doesn't work because $@ hasn't been resolved yet
$(OUTPUT_FILES): $(filter,$(notdir $@),$(INPUT_FILES))
	program --input $^ --output $@
# This doesn't work, I think because $@ is evaluated too late
.SECONDEXPANSION:
$(OUTPUT_FILES): $(filter,$(notdir $$@),$(INPUT_FILES))
	program --input $^ --output $@
# This doesn't work either
.SECONDEXPANSION:
$(OUTPUT_FILES): $$(filter,$(notdir $@),$(INPUT_FILES))
	program --input $^ --output $@
I've looked into static pattern rules as well, but I'm not sure if it can help with what I need.
In your case .SECONDEXPANSION: works because you can use make functions (filter) to compute the prerequisite of each output file. In other circumstances that may be impossible. GNU make has another feature for cases like yours: you can programmatically instantiate make statements using foreach-eval-call. Just remember that the macro used as the statement pattern gets expanded twice, which is why you must double some $ signs (more on this later):
OUTPUT_DIR := dir
OUTPUT_FILES := $(addprefix $(OUTPUT_DIR)/,$(notdir $(INPUT_FILES)))
.PHONY: all
all: $(OUTPUT_FILES)
# The macro used as statements pattern where $(1) is the input file
define MY_RULE
$(1)-output-file := $(OUTPUT_DIR)/$$(notdir $(1))
$$($(1)-output-file): $(1)
	@echo program --input $$^ --output $$@
endef
$(foreach i,$(INPUT_FILES),$(eval $(call MY_RULE,$(i))))
Demo:
$ mkdir -p a/a b
$ touch a/a/a b/b c
$ make INPUT_FILES="a/a/a b/b c"
program --input a/a/a --output dir/a
program --input b/b --output dir/b
program --input c --output dir/c
Explanation:
When make parses the Makefile it expands $(foreach ...): it iterates over all words of $(INPUT_FILES), for each it assigns the word to variable i and expands $(eval $(call MY_RULE,$(i))) in this context. So for word foo/bar/baz it expands $(eval $(call MY_RULE,$(i))) with i = foo/bar/baz.
$(eval PARAMETER) expands PARAMETER and instantiates the result as new make statements. So, for foo/bar/baz, make expands $(call MY_RULE,$(i)) with i = foo/bar/baz and considers the result as regular make statements. The expansion of $(eval ...) has no other effect, the result is the empty string. This is why in our case $(foreach ...) expands as the empty string. But it does something: create new make statements dynamically for each input file.
$(call NAME,PARAMETER) expands PARAMETER, assigns it to temporary variable 1 and expands the value of make variable NAME in this context. So, $(call MY_RULE,$(i)) with i = foo/bar/baz expands as the expanded value of variable MY_RULE with $(1) = foo/bar/baz:
foo/bar/baz-output-file := dir/$(notdir foo/bar/baz)
$(foo/bar/baz-output-file): foo/bar/baz
	@echo program --input $^ --output $@
which is what is instantiated by eval as new make statements. Note that we had a first expansion here and the $$ became $. Note also that call can have more parameters: $(call NAME,P1,P2) will do the same with $(1) = P1 and $(2) = P2.
When make parses these new statements (as any other statements) it expands them (second expansion) and finally adds the following to its list of variables:
foo/bar/baz-output-file := dir/baz
and the following to its list of rules:
dir/baz: foo/bar/baz
	@echo program --input $^ --output $@
This may look complicated but it is not if you remember that the make statements added by eval are expanded twice. First when $(eval ...) is parsed and expanded by make, and a second time when make parses and expands the added statements. This is why you frequently need to escape the first of these two expansions in your macro definition by using $$ instead of $.
And it is so powerful that it is good to know.
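A common way to see the result of the first expansion is to swap eval for info, so that make prints the generated text instead of evaluating it. The sketch below assumes a shortened INPUT_FILES list and a no-op all target, both chosen only for illustration:

```makefile
INPUT_FILES := a/a/a b/b
OUTPUT_DIR := dir

define MY_RULE
$(1)-output-file := $(OUTPUT_DIR)/$$(notdir $(1))
$$($(1)-output-file): $(1)
	@echo program --input $$^ --output $$@
endef

# Same loop as before, but $(info ...) prints the text that $(eval ...)
# would otherwise parse: each $$ has already become a single $.
$(foreach i,$(INPUT_FILES),$(info $(call MY_RULE,$(i))))

.PHONY: all
all: ;
```

Once the printed statements look right, changing info back to eval instantiates them.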
When asking for help please provide some kind of actual example names so we can understand more clearly what you have. It also helps us use terminology which is not confusing.
You really want to use $< in your recipes, not $^, I expect.
IF your "input files" are truly input-only (that is, they are not themselves generated by other make rules) then you can easily solve this problem with VPATH.
Just use this:
VPATH := $(sort $(dir $(INPUT_FILES)))
$(OUTPUT_DIR)/% : %
	program --input $< --output $@
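A fuller sketch of the VPATH approach, under the assumption that the inputs are plain source files. The file names, the all target, the mkdir step and the echo in front of program are hypothetical additions for illustration:

```makefile
# Hypothetical inputs living in arbitrary directories.
INPUT_FILES := src/one/x.txt other/y.txt
OUTPUT_DIR := dir
OUTPUT_FILES := $(addprefix $(OUTPUT_DIR)/,$(notdir $(INPUT_FILES)))

# Search every input directory when looking for a prerequisite.
VPATH := $(sort $(dir $(INPUT_FILES)))

.PHONY: all
all: $(OUTPUT_FILES)

# dir/x.txt depends on x.txt, which make locates through VPATH,
# so $< carries the directory where the file was actually found.
$(OUTPUT_DIR)/% : %
	@mkdir -p $(@D)
	@echo program --input $< --output $@
```

As with the other approaches, this relies on base names being unique, and an input whose base name matches a file or directory in the current directory would be picked up from there first.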
I finally found a permutation that works - I think the problem was forgetting that filter requires a % for matching patterns. The rule is:
.SECONDEXPANSION:
$(OUTPUT_FILES): $$(filter %$$(@F),$(INPUT_FILES))
	program --input $^ --output $@
I also realized I can use $$(@F) (equivalent to $$(notdir $$@)) for cleaner syntax.
The rule gets the target's filename on its second expansion ($$(@F)) and then gets the input file (with path) that corresponds to it ($$(filter %$$(@F),$(INPUT_FILES))).
Of course, the rule only works if filenames are unique. If someone has a cleaner solution, feel free to post.
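The difference the % makes can be checked in isolation. This sketch reuses the file list from the demo above; the check target is a hypothetical no-op:

```makefile
INPUT_FILES := a/a/a b/b c

# Without '%', filter keeps only words equal to the pattern, so nothing matches.
$(info exact:   [$(filter b,$(INPUT_FILES))])

# With a leading '%', any word ending in "b" matches, so b/b is kept.
$(info pattern: [$(filter %b,$(INPUT_FILES))])

.PHONY: check
check: ;
```

Because %b matches any word that merely ends in b, a file such as x/ab would match as well, which is one more reason the approach depends on unambiguous file names.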

Can I simplify this Makefile involving files in subfolders?

I have, for example, the following Makefile to generate PDF files from Markdown files in subdirectories:
FOLDERS = f1 f2 f3
.PHONY: $(FOLDERS)
f1: f1/f1.md
	cd $@ && pandoc $(notdir $^) -o $(patsubst %.md,%.pdf,$(notdir $^))
f2: f2/f2.md
	cd $@ && pandoc $(notdir $^) -o $(patsubst %.md,%.pdf,$(notdir $^))
f3: f3/f3.md
	cd $@ && pandoc $(notdir $^) -o $(patsubst %.md,%.pdf,$(notdir $^))
The expected result is that make f1 requires the existence of f1/f1.md, and generates the resulting PDF as f1/f1.pdf. The same for f2 and f3. This works, but the declarations seem unnecessarily repetitive.
Is there any way to combine these three rules into one, generic rule? That is, without needing to explicitly write out all of the paths to the PDF files or Markdown files, as I may be dynamically adding subfolders and I'd prefer to just change the definition of FOLDERS in the first line. I've googled around and tried a few things, but I feel like either I can't find the right incantation to use, or I'm missing a piece of knowledge about how Makefiles work. Could someone please point me in the right direction?
First, note that there's no good reason to use PHONY targets here, since these rules appear to be building files whose names are known beforehand. Targets like f1/f1.pdf would be much better.
Unfortunately we can't use a pattern rule when the stem (e.g. f1) is repeated in a prerequisite. But a "canned recipe" can do the trick:
define pdf_template
$(1): $(1)/$(1).md
	cd $$@ && pandoc $$(notdir $$^) -o $$(patsubst %.md,%.pdf,$$(notdir $$^))
endef
$(eval $(call pdf_template,f1))
$(eval $(call pdf_template,f2))
$(eval $(call pdf_template,f3))
(Note how you must escape the $ signs in the template.)
If those $(eval...) lines look too repetitive, you can replace them with a loop:
$(foreach folder,$(FOLDERS),$(eval $(call pdf_template,$(folder))))
EDIT: Come to think of it, there's another way. You can't construct a pattern rule that uses the stem more than once:
$(FOLDERS): %: %/%.md
	cd $@ && ... this won't work
And you can't use the automatic variables in the prerequisite list, because they aren't yet defined when they're needed:
$(FOLDERS): $@/$@.md
	cd $@ && ... this won't work either
But you can use them there if you use Secondary Expansion, which causes Make to expand the prereq list a second time:
.SECONDEXPANSION:
$(FOLDERS): $$@/$$@.md
	cd $@ && ... this works
Again, note the escaped $ symbols.
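Following the earlier remark that real file targets are preferable to phony ones, a possible alternative is sketched below. It assumes the PDFs themselves are the targets; the PDFS variable and the all target are hypothetical names, and the folder would then be built with make f1/f1.pdf rather than make f1:

```makefile
FOLDERS = f1 f2 f3
PDFS := $(foreach f,$(FOLDERS),$(f)/$(f).pdf)

.PHONY: all
all: $(PDFS)

# For f1/f1.pdf the prerequisite is f1/f1.md, so the stem is used only once.
# $(@D) is the target's directory; $(<F) and $(@F) are the bare file names.
%.pdf: %.md
	cd $(@D) && pandoc $(<F) -o $(@F)
```

With file targets, make can also skip PDFs that are already newer than their Markdown source.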

split a path name for dependencies in a makefile

I need to split the path of a variable into a list.
For example, to convert a/b/c/d into a b c d.
The question is similar to this question, but only a workaround was given, which cannot work with dependencies.
I need to split a file name for dependencies.
For example the rule
wd/%.o : $1.c $2.c
	cc -o $@ -c $1.c $2.c
applied to wd/a/x.o would depend on a.c and x.c.
I managed to create a specialized function that splits the string. But it works only if we know in advance all the possible values of $1 and moreover, combining call and % does not work, so I cannot get the result for dependencies.
For example,
wd/%.o : $(call SPLIT,%.o) # DOES NOT WORK
	cc -o $@ -c $^
called on target wd/a/x.o would have only one dependency, a/x.o, even though SPLIT works fine on the command line.
Any ideas? Thank you for helping!
The subst function can split the path. To use it in the prerequisite list, use Secondary Expansion:
.SECONDEXPANSION:
wd/%.o : $$(addsuffix .c,$$(subst /, ,$$*))
	cc -o $@ -c $^
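The two function calls can be tried on their own. In this sketch the stem variable stands in for the value of $* when the target is wd/a/x.o, and the check target is a hypothetical no-op:

```makefile
# Stand-in for $* with target wd/a/x.o and pattern wd/%.o.
stem := a/x

# subst turns each slash into a space, giving the words: a x
$(info split:  $(subst /, ,$(stem)))

# addsuffix then appends .c to every word, giving: a.c x.c
$(info prereq: $(addsuffix .c,$(subst /, ,$(stem))))

.PHONY: check
check: ;
```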

Makefile with multiple rules sharing same recipe with patternrules?

I want to remove the duplication of recipe in a makefile like the following
SHELL := /bin/bash
a_% : a1_% a2_%
	cat $^ > $@
b_% : b1_% b2_% %_b3
	cat $^ > $@
However the following does not work. I guess the trick in this SO question does not work with pattern rules.
SHELL := /bin/bash
a_% : a1_% a2_%
b_% : b1_% b2_% %_b3
a_% b_%:
	cat $^ > $@
Any suggestions? (In my original makefile, the recipe is duplicated across 4 targets, and each of those takes 3 substitutions, so I can't unroll the targets.)
--EDIT--
I realized that one way to solve this was the following.
CMD1 = cat $^ > $@
a_% : a1_% a2_%
	$(CMD1)
b_% : b1_% b2_% %_b3
	$(CMD1)
I believe this does what you asked for:
SHELL := /bin/bash
define STUFF
$(1)_%: $(1)1_% $(1)2_% $(2)
	cat $$^ > $$@
endef
$(eval $(call STUFF,a))
$(eval $(call STUFF,b,%_b3))
How this works:
1. The general form of the rule is defined as STUFF. (You'd obviously want a better name in your own Makefile.) Note the doubling of dollar signs in $$^ and $$@. This protects them from evaluation when $(call ...) is executed. $(1) and $(2) will be replaced by $(call ...) with positional arguments.
2. $(call STUFF,a) "calls" STUFF with $(1) set to the string a and $(2) set to the empty string. The return value is:
a_%: a1_% a2_%
	cat $^ > $@
Note how one $ was stripped from the remaining variables.
3. $(eval ...) evaluates the return value obtained in the previous step as if that string had been put in the Makefile. So it creates the rule.
Steps 2 and 3 also happen for the b files. It is similar to what happens for the a files except that this time $(2) is set to the string %_b3.
This is essentially the method I've used in the past to avoid duplication of rules for cases where the rules were rather complex. For the specific case you show in your question, I'd use the shared command variable you mention in your question.
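With more than two prefixes, the individual $(eval ...) lines could be driven by a loop. The sketch below is one possible arrangement, not part of the original answer: PREFIXES and the EXTRA_ variables are hypothetical names introduced here to hold the per-prefix extra prerequisites.

```makefile
SHELL := /bin/bash

PREFIXES := a b
# Hypothetical per-prefix extras; a prefix without an entry gets none.
EXTRA_b := %_b3

define STUFF
$(1)_%: $(1)1_% $(1)2_% $(2)
	cat $$^ > $$@
endef

# Instantiate one pattern rule per prefix, passing its extras as $(2).
$(foreach p,$(PREFIXES),$(eval $(call STUFF,$(p),$(EXTRA_$(p)))))
```

Adding a prefix then means extending PREFIXES and, if needed, defining the matching EXTRA_ variable.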

GNU make with many target directories

I have to integrate the generation of many HTML files in an existing Makefile.
The problem is that the HTML files need to reside in many different directories.
My idea is to write an implicit rule that converts the source file (*.st) to the corresponding html file
%.html: %.st
	$(HPC) -o $@ $<
and a rule that depends on all html files
all: $(html)
If the HTML file is not in the builddir, make doesn't find the implicit rule: *** No rule to make target.
If I change the implicit rule like so
$(rootdir)/build/doc/2009/06/01/%.html: %.st
	$(HPC) -o $@ $<
it's found, but then I have to have an implicit rule for nearly every file in the project.
According to Implicit Rule Search Algorithm in the GNU make manual, rule search works like this:
1. Split the entire target name t into a directory part, called d, and the rest, called n. For example, if t is src/foo.o, then d is src/, and n is foo.o.
2. Make a list of all the pattern rules one of whose targets matches t or n. If the target pattern contains a slash, it is matched against t; otherwise, against n.
Why is the implicit rule not found, and what would be the most elegant solution, assuming GNU make is used?
Here is a stripped down version of my Makefile:
rootdir = /home/user/project/doc
HPC = /usr/local/bin/hpc
html = $(rootdir)/build/doc/2009/06/01/some.html
%.html: %.st
	$(HPC) -o $@ $<
#This works, but requires a rule for every output dir
#$(rootdir)/build/doc/2009/06/01/%.html: %.st
#	$(HPC) -o $@ $<
.PHONY: all
all: $(html)
The best solution I found so far is to generate an implicit rule per target directory via foreach-eval-call, as explained in the GNU make manual. I have no idea how this scales to a few thousand target directories, but we will see...
If you have a better solution, please post it!
Here is the code:
rootdir = /home/user/project/doc
HPC = /usr/local/bin/hpc
html = $(rootdir)/build/doc/2009/06/01/some.html \
$(rootdir)/build/doc/2009/06/02/some.html
targetdirs = $(rootdir)/build/doc/2009/06/01 \
$(rootdir)/build/doc/2009/06/02
define generateHtml
$(1)/%.html: %.st
	-mkdir -p $(1)
	$(HPC) -o $$@ $$<
endef
$(foreach targetdir, $(targetdirs), $(eval $(call generateHtml, $(targetdir))))
.PHONY: all
all: $(html)
Like Maria Shalnova I like recursive make (though I disagree with "Recursive Make Considered Harmful"), and in general it's better to make something HERE from a source THERE, not the reverse. But if you must, I suggest a slight improvement: have generateHtml generate only the RULE, not the COMMANDS.
Your active implicit rule makes $(rootdir)/build/doc/2009/06/01/some.html depend on $(rootdir)/build/doc/2009/06/01/some.st. If $(rootdir)/build/doc/2009/06/01/some.st doesn't exist then the rule won't be used/found.
The commented out rule makes $(rootdir)/build/doc/2009/06/01/some.html depend on some.st.
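The prerequisite that the plain %.html: %.st rule looks for can be previewed with patsubst, which applies the same kind of substitution. This is only an illustration; the check target is a hypothetical no-op:

```makefile
rootdir = /home/user/project/doc
html = $(rootdir)/build/doc/2009/06/01/some.html

# The directory part of the target is carried over to the prerequisite,
# so the rule wants some.st next to some.html, not in the current directory.
$(info wanted prerequisite: $(patsubst %.html,%.st,$(html)))

.PHONY: check
check: ;
```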
One solution is to make your source layout match your destination/result layout.
Another option is to create the rules as required with eval. But that will be quite complicated:
define HTML_template
$(1) : $(basename $(1))
	cp $$< $$@
endef
$(foreach htmlfile,$(html),$(eval $(call HTML_template,$(htmlfile))))
Another possibility is to have make call itself recursively, using the -C argument, once for every output directory.
Recursive make is somewhat the standard way to deal with subdirectories, but beware of the implications mentioned in the article "Recursive Make Considered Harmful"
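A minimal sketch of the recursive approach, assuming each output directory contains its own Makefile that knows how to build the HTML files there. The SUBDIRS variable and the relative paths are hypothetical:

```makefile
# Hypothetical list of output directories, each with its own Makefile.
SUBDIRS := build/doc/2009/06/01 build/doc/2009/06/02

.PHONY: all $(SUBDIRS)
all: $(SUBDIRS)

# Run a sub-make inside each directory; $(MAKE) passes along flags such as -j.
$(SUBDIRS):
	$(MAKE) -C $@
```

Declaring the directories phony makes the sub-make run every time, leaving it to each directory's Makefile to decide what is out of date.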

Resources