code management: generate source files with slight variations of various rules - makefile

I have a source file in a declarative language (twolc, actually) that I need to write many variations on: a normative version and many non-normative versions, each with one or more variations from the norm. For example, say the normative file has three rules:
Rule A:
Do something A-ish
Rule B:
Do something B-ish
Rule C:
Do something C-ish
Then one variation might have the exact same rules as the norm for A and C, but a different rule for B, which I will call B-1:
Rule A:
Do something A-ish
Rule B-1:
Do something B-ish, but with a flourish
Rule C:
Do something C-ish
Imagine that you have many different subtle variations on many different rules, and you have my situation. The problem I am worried about is code maintainability. If, later on, I decide that Rule A needs to be refactored somehow, then I will have 50+ files that need to have the exact same rule edited by hand.
My idea is to have separate files for each rule and concatenate them into variations using cat: cat A.twolc B.twolc C.twolc > norm.twolc, cat A.twolc B-1.twolc C.twolc > not-norm.twolc, etc.
Are there any tools designed to manage this kind of problem? Is there a better approach than the one I have in mind? Does my proposed solution have weaknesses I should watch out for?

As you added the makefile tag, here is a solution based on GNU make (it relies on GNU-only features):
# Edit this
RULES := A B B-1 C
VARIATIONS := norm not-norm
norm-rules := A B C
not-norm-rules := A B-1 C
# Do not edit below this line
VARIATIONSTWOLC := $(patsubst %,%.twolc,$(VARIATIONS))
all: $(VARIATIONSTWOLC)
define GEN_rules
$(1).twolc: $$(patsubst %,%.twolc,$$($(1)-rules))
	cat $$^ > $$@
endef
$(foreach v,$(VARIATIONS),$(eval $(call GEN_rules,$(v))))
clean:
	rm -f $(VARIATIONSTWOLC)
patsubst is straightforward. The foreach-eval-call is a bit more tricky. Long story short: it loops over all variations (foreach). For each variation v, it expands (call) GEN_rules by replacing $(1) by $(v) (the current variation) and $$ by $. Each expansion result is then instantiated (eval) as a normal make rule. Example: for v=norm, the GEN_rules expansion produces:
norm.twolc: $(patsubst %,%.twolc,$(norm-rules))
	cat $^ > $@
which is in turn expanded as (step-by-step):
step1:
norm.twolc: $(patsubst %,%.twolc,A B C)
	cat $^ > $@
step2:
norm.twolc: A.twolc B.twolc C.twolc
	cat $^ > $@
step3:
norm.twolc: A.twolc B.twolc C.twolc
	cat A.twolc B.twolc C.twolc > norm.twolc
which does what you want: if norm.twolc does not exist or if any of A.twolc, B.twolc, C.twolc is more recent than norm.twolc, the recipe is executed.
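To keep the variation lists maintainable too, each variation can be derived from the normative list instead of being retyped. This is a minimal sketch, assuming a variation only swaps whole rule names. The `not-norm-rules` line would replace the hand-written list in the "Edit this" part.

```makefile
norm-rules     := A B C
# With no % in the pattern, patsubst only replaces words that match exactly,
# so this evaluates to "A B-1 C".
not-norm-rules := $(patsubst B,B-1,$(norm-rules))
```

With this layout, adding a new rule to `norm-rules` propagates to every variation derived from it.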

Related

Makefile: how to set up prerequisites for two lists of files

I have two lists of files as prerequisites
input_i.xx
config_j.yy
and I need to run all of their combinations. A single one looks like this:
input1_config3.output: input1.xx config3.yy
	run_script $^
Also in reality, their names are not numbered, but I already have their stems defined in INPUTS and CONFIGS. With that, I can generate all the targets together
TARGETS:=$(foreach input,$(INPUTS),$(foreach config,$(CONFIGS),$(input)_$(config).output))
But I have difficulty with the prerequisites. It seems I need to
get basename
split on _
add the extensions .xx and .yy
.SECONDEXPANSION
$(TARGETS): $(basename $@)
	run_script $^
Can someone show me how to do that? I'm not sure this is the proper way; maybe a bottom-up approach is easier?
make is not really suitable for keeping track of an M x N matrix of results. The fundamental problem is that you can't have two stems in a rule, so you can't say something like
# BROKEN
input%{X}_config%{Y}.output: input%{X}.xx config%{Y}.yy
As a rough approximation, you could use a recursive make rule to set a couple of parameters, and take it from there, but this is rather clumsy.
.PHONY: all
all:
	$(MAKE) -$(MAKEFLAGS) X=1 Y=6 input1_config6.output
	$(MAKE) -$(MAKEFLAGS) X=1 Y=7 input1_config7.output
	$(MAKE) -$(MAKEFLAGS) X=2 Y=6 input2_config6.output
:
input$X_config$Y.output: input$X.xx config$Y.yy
	run_script $^
It would be a lot easier if you provided a complete sample example with a complete set of targets and prerequisites and exactly what you wanted to happen.
Using .SECONDEXPANSION might work, but you're not using it correctly; please re-read the documentation. The critical aspect of .SECONDEXPANSION is that you have to escape (with $$) the variables that you want left unexpanded until the second pass. In your example you've not escaped anything, so .SECONDEXPANSION isn't actually doing anything at all here. However, as @tripleee points out, it's not easy to use multiple variable values in a single target.
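For illustration, here is a rough sketch of the escaping with a static pattern rule. It assumes that no input or config stem contains an underscore, so the stem can be split on `_`. The values of `INPUTS` and `CONFIGS` are placeholders.

```makefile
# Placeholder stems; the real ones come from your own INPUTS and CONFIGS.
INPUTS  := input1 input2
CONFIGS := config3 config4
TARGETS := $(foreach i,$(INPUTS),$(foreach c,$(CONFIGS),$(i)_$(c).output))

.SECONDEXPANSION:
# Each $$ survives the first expansion as a single $, so $* (the stem,
# e.g. input1_config3) is only split into its two words on the second pass.
$(TARGETS): %.output: $$(word 1,$$(subst _, ,$$*)).xx $$(word 2,$$(subst _, ,$$*)).yy
	run_script $^
```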
To do this more easily you'll probably want to use eval. Something like this:
define DECLARE
$1_$2.output: $1.xx $2.yy
TARGETS += $1_$2.output
endef
TARGETS :=
$(foreach input,$(INPUTS),$(foreach config,$(CONFIGS),$(eval $(call DECLARE,$(input),$(config)))))
$(TARGETS):
	run_script $^
I have another solution using include and bash for loop.
include trees.mk
trees.mk:
	@for input in $(INPUTS); do \
	  for config in $(CONFIGS); do \
	    echo "$${input}_$$config.output: $${input}.xx $$config.yy"; \
	    printf '\trun_script $$^\n'; \
	  done; \
	done > $@
At the beginning, trees.mk doesn't exist. The nested for loops write the rules to the target through the redirection > $@.
I got this idea from Managing Projects with GNU Make, Third Edition, by Robert Mecklenburg, page 56.

Always process outermost file extension (and strip extensions along the way)

I have a bunch of different source files in my static HTML blog. The outermost extensions explain the format to be processed next.
Example: Source file article.html.md.gz (with target article.html) should be processed by gunzip, then by my markdown processor.
Further details:
The order of the extensions may vary
Sometimes an extension is not used (article.html.gz)
I know how to process all different extensions
I know that the final form is always article.html
Ideally I would have liked to just write rules as follows:
...
all-articles: $(ALL_HTML_FILES)
%: %.gz
	gunzip ...
%: %.md
	markdown ...
%: %.zip
	unzip ...
And let make figure out the path to take based on the sequence of extensions.
From the documentation however, I understand that there are constraints on match-all rules, and the above is not possible.
What's the best way forward? Can make handle this situation at all?
Extensions are made up examples. My actual source files make more sense :-)
I'm on holiday so I'll bite.
I'm not a fan of pattern rules, they are too restricted and yet too arbitrary at the same time for my tastes. You can achieve what you want quite nicely in pure make:
.DELETE_ON_ERROR:
all: # Default target
files := a.html.md.gz b.html.gz
cmds<.gz> = gzip -d <$< >$@
cmds<.md> = mdtool $< -o $@
define rule-text # 1:suffix 2:basename
$(if $(filter undefined,$(flavor cmds<$1>)),$(error Cannot handle $1 files: [$2$1]))
$2: $2$1 ; $(value cmds<$1>)
all: $2
endef
emit-rule = $(eval $(call rule-text,$1,$2))# 1:suffix 2:basename
emit-hierachy = $(if $(suffix $2),$(call emit-rule,$1,$2)$(call emit-hierachy,$(suffix $2),$(basename $2)))# 1:suffix 2:basename
emit-rules = $(foreach _,$1,$(call emit-hierachy,$(suffix $_),$(basename $_)))# 1:list of source files
$(call emit-rules,${files})
.PHONY: all
all: ; : $@ Success
The key here is to set $files to your list of files.
This list is then passed to emit-rules.
emit-rules passes each file one-at-a-time to emit-hierachy.
emit-hierachy strips off each extension in turn,
generates the appropriate make syntax, which it passes to $(eval …).
emit-hierachy carries on until the file has only one extension left.
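The peeling relies on two built-in functions, `$(suffix …)` and `$(basename …)`. A small stand-alone sketch of what each returns for one file name (the variable `f` and the dummy `all` target are only there for illustration):

```makefile
f := a.html.md.gz
$(info suffix:      $(suffix $f))
$(info basename:    $(basename $f))
$(info next suffix: $(suffix $(basename $f)))

# Dummy target so that make has something to do after printing.
all: ; @:
```

Running it prints `.gz`, `a.html.md` and `.md` in turn, which is the sequence of suffixes the recursion walks through.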
Thus a.html.md.gz becomes this make syntax:
a.html.md: a.html.md.gz ; gzip -d <$< >$@
a.html: a.html.md ; mdtool $< -o $@
all: a.html
Similarly, b.html.gz becomes:
b.html: b.html.gz ; gzip -d <$< >$@
all: b.html
Neato, or what?
If you give emit-rules a file with an unrecognised extension (c.html.pp say),
you get a compile-time error:
1:20: *** Cannot handle .pp files: [c.html.pp]. Stop.
Compile-time? Yeah, before any shell commands are run.
You can tell make how to handle .pp files by defining cmds<.pp> :-)
For extra points it's also parallel safe. So you can use -j9 on your 8 CPU laptop, and -j33 on your 32 CPU workstation. Modern life eh?

makefile: remove duplicate words without sorting

Is there a possibility to remove duplicates in a list of words without sorting in a makefile?
$(sort foo bar lose)
does remove duplicates (which is for me the main functionality in this case), but also sorts (for me an unfortunate side effect in this case). I want to avoid that.
[update]
bobbogo's answer works very nicely. Just remember to use define uniq for v3.81 and (did not check this) define uniq = for later versions.
larsmans' answer works very nicely too if your record separator is not a space, e.g. if you want to remove duplicates from _foo_bar_lose_lose_bar_baz_ or the like. Just remember to use the RS and ORS awk options instead of tr, and wrap it all with $(firstword $(shell ... ))
Boring $(eval)-based method:
define uniq =
$(eval seen :=)
$(foreach _,$1,$(if $(filter $_,${seen}),,$(eval seen += $_)))
${seen}
endef
w := z z x x y c x
$(info $(sort $w))
$(info $(call uniq,$w))
Extremely fiendish make standard library recursive call (recursive make considered extremely fiendish?):
uniq = $(if $1,$(firstword $1) $(call uniq,$(filter-out $(firstword $1),$1)))
It's worth noting that no variables are damaged in this second formulation (see seen in the first). It is preferable just for that (given the lack of locals in make)!
EDIT
My obscure comment about recursive make above seems to have muddied the waters somewhat.
"Recursive" in the context of this post means recursive function.
It really has nothing to do with the execrable recursive make.
The latter (recursive) definition of uniq is extremely nice, performant, small, and is definitely the one to use.
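A short usage sketch of the recursive definition, under the assumption that no word contains a `%` (which `filter-out` would treat as a wildcard). The brackets and the `$(strip …)` wrapper are additions for illustration, since the recursion can leave a trailing space.

```makefile
uniq = $(if $1,$(firstword $1) $(call uniq,$(filter-out $(firstword $1),$1)))

w := z z x x y c x
# The brackets make stray whitespace visible; $(strip ...) removes the
# trailing space left behind when the recursion bottoms out.
$(info [$(strip $(call uniq,$w))])

# Dummy target so that make has something to do after printing.
all: ; @:
```

This should print `[z x y c]`, keeping the order of first occurrence.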
Depends on where you need it and whether you use GNU make. If you just want to uniq the list of target prerequisites, it's as easy as (http://www.gnu.org/software/make/manual/make.html#Quick-Reference) :
The value of $^ omits duplicate prerequisites, while $+ retains them and preserves their order.
So, a rule like
exe: $(OBJS)
	$(LD) -o $@ $^
will filter duplicates from $(OBJS) automagically, while still leaving order of other items the same.
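The difference between the two automatic variables can be seen with a throwaway rule. The target and prerequisite names here are hypothetical.

```makefile
# a.o is deliberately listed twice.
demo: a.o b.o a.o
	@echo 'with $$^: $^'
	@echo 'with $$+: $+'

# Do-nothing recipes so the example runs without real object files.
a.o b.o: ; @:
```

The first line lists `a.o b.o`, the second `a.o b.o a.o`.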
You could echo the words through awk:
echo foo bar foo baz bar | tr ' ' '\n' | awk '!a[$0]++'
Deduping one-liner taken from catonmat.
(Don't forget to double the $ to $$ in a Makefile.)
The following works for me under GNU make v3.82:
uniq = $(eval _uniq := $1)$(strip $(foreach _,$(_uniq),$(if $(filter $_,$(_uniq)),$(eval _uniq := $(filter-out $_,$(_uniq)))$_)))
It is not recursive, and it leaves its input untouched by working on a copy in _uniq.

Can prerequisites in a static pattern rule be filtered?

I'm trying to limit $(all_possible_inputs) to $(relevant_inputs). $(all_possible_inputs) is a concatenation of multiple files from other included makefiles. The following functions correctly (the perl scripts know how to ignore the extra inputs), but everything is rebuilt if a single input changes:
$(step2_outputs): $(data)/%.step2: $(routines)/step2.%.pl $(all_possible_inputs)
	perl $^ > $@
UPDATE: Filter must match more than one *.step1 file. If step1 produced:
A.foo.step1
A.bar.step1
B.foo.step1
B.bar.step1
B.baz.step1
Then step2's rules should expand to:
A.step2: routines/step2.A.pl A.foo.step1 A.bar.step1
B.step2: routines/step2.B.pl B.foo.step1 B.bar.step1 B.baz.step1
Logically, this is what I want to work:
$(step2_outputs): $(data)/%.step2: $(routines)/step2.%.pl $(filter $(data)/%.*.step1,$(all_possible_inputs))
	perl $^ > $@
The % is supposed to match the static pattern rule stem. The * is supposed to be a wildcard (which I'm aware won't work). I believe the problem is that filter repurposes '%', so the filter expression fails. I thought it might be solvable with Secondary Expansion, but I tried this, and the filter still returned the empty string:
UPDATE: I switched the examples to use $$* based on Beta's good suggestion:
.SECONDEXPANSION:
$(step2_outputs): $(data)/%.step2: $(routines)/step2.%.pl $$(filter $(data)/$$*.%.step1,$(all_possible_inputs))
	perl $^ > $@
This is running on GNU make 3.81 in a Linux environment.
Your third method works for me, but you can try this: instead of % (which is expanded in the first phase) use $$*
.SECONDEXPANSION:
$(step2_outputs): $(data)/%.step2: $(routines)/step2.%.pl $$(filter $(data)/$$*.step1,$(all_possible_inputs))
	perl $^ > $@
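If secondary expansion proves awkward, another option is to generate one explicit rule per stem with `eval`, so that `filter` runs with the stem already substituted. This is only a sketch: the `STEP2_RULE` macro and the `stems` list are hypothetical names, and the `$(data)` and `$(routines)` directories are left out for brevity.

```makefile
all_possible_inputs := A.foo.step1 A.bar.step1 B.foo.step1 B.bar.step1 B.baz.step1
# Hypothetical list of step2 stems.
stems := A B

define STEP2_RULE
# $1 is the stem; the % here is an ordinary filter wildcard.
$1.step2: step2.$1.pl $(filter $1.%.step1,$(all_possible_inputs))
	perl $$^ > $$@
endef

$(foreach s,$(stems),$(eval $(call STEP2_RULE,$s)))
```

For stem `A` this generates a rule whose prerequisites are `step2.A.pl A.foo.step1 A.bar.step1`, so a change to a `B` input no longer rebuilds `A.step2`.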

multi-wildcard pattern rules of GNU Make

I want to write something like regex:
SRC:="a.dat.1 a.dat.2"
$(SRC): %.dat.%: (\\1).rlt.(\\2)
	dat2rlt $^ $@
so that a.dat.1 and a.dat.2 will give a.rlt.1 and a.rlt.2.
In GNU Make info page, it says "the % can be used only once".
Is there some trick to achieve this in GNU Make?
I'm afraid what you are trying to do is not possible the way you suggest to do it, since - as you already mention - (GNU) make only allows a single stem '%', see http://www.gnu.org/software/make/manual/make.html#Pattern-Rules:
A pattern rule looks like an ordinary rule, except that its target
contains the character ‘%’ (exactly one of them).
Without it, creating such 'multi-dimensional' targets is cumbersome.
One way around this is by rebuilding the name of the dependency in the command (rather than in the dependency list):
SRC := a.dat.1 a.dat.2
all : $(SRC:%=%.dat2rlt)
%.dat2rlt :
	dat2rlt $(word 1,$(subst ., ,$*)).rlt.$(word 3,$(subst ., ,$*)) $*
The downside is that make no longer knows about the dependency, so it will not rebuild the target when the .rlt file has been updated.
The only way I can see to address that is by generating the rules explicitly:
SRC := a.dat.1 a.dat.2
all : $(SRC)
define GEN_RULE
$1.dat.$2 : $1.rlt.$2
	dat2rlt $$< $$@
endef
$(foreach src,$(SRC),$(eval $(call GEN_RULE,$(word 1,$(subst ., ,$(src))),$(word 3,$(subst ., ,$(src))))))
Using named variables, we can write more readable code (based on answer of Paljas):
letters:=a b c
numbers:=1 2 3 4
define GEN_RULE
$(letter).dat.$(number) : $(letter).rlt.$(number)
	./rlt2dat $$< $$@
endef
$(foreach number,$(numbers), \
$(foreach letter,$(letters), \
$(eval $(GEN_RULE)) \
) \
)
We can generate SRC in a similar way. Note that using that method SRC will contain all the combinations. That may or may not be beneficial.
Building on the answer of Erzsébet Frigó, you might additionally choose to:
in the inner loop, eval not the macro itself but the result of calling it
name the macro after the program you're calling, rlt2dat
in combination, allowing you to
refer to the program name using make's ${0}
define a target, ${0}s (expanding to rlt2dats - note the pluralization), whose prerequisites are all the combinations of letters and numbers for which rlt2dat was called
Like this:
letters:=a b c
numbers:=1 2 3 4
define rlt2dat
${0}s::$(letter).dat.$(number)
$(letter).dat.$(number): $(letter).rlt.$(number)
	./${0} $$< $$@
endef
$(foreach number,$(numbers), \
$(foreach letter,$(letters), \
$(eval $(call rlt2dat))))
allowing you to build all rlt2dat targets as:
make rlt2dats
For the limited example you gave, you can use a pattern with one %.
SRC := a.dat.1 a.dat.2
${SRC}: a.dat.%: a.rlt.%
	dat2rlt $^ $@
$* in the recipe will expand to whatever the % matched.
Note that the "s around your original macro are definitely wrong.
Have a look at .SECONDEXPANSION in the manual for more complicated stuff (or over here).
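As a rough idea of what the .SECONDEXPANSION route could look like for this case, the prerequisite can be computed from the target name itself. This sketch assumes every target contains the literal text `.dat.` exactly where `.rlt.` should go.

```makefile
SRC := a.dat.1 a.dat.2

.SECONDEXPANSION:
# $$@ survives the first expansion as $@ and is replaced by each target name
# during the second one, so a.dat.1 gets the prerequisite a.rlt.1.
$(SRC): $$(subst .dat.,.rlt.,$$@)
	dat2rlt $< $@
```

Unlike the single-`%` version, this does not hard-code the `a` prefix, so targets such as `b.dat.7` could be added to `SRC` without touching the rule.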
