makefile: remove duplicate words without sorting

Is there a possibility to remove duplicates in a list of words without sorting in a makefile?
$(sort foo bar lose)
does remove duplicates (which is for me the main functionality in this case), but also sorts (for me an unfortunate side effect in this case). I want to avoid that.
[update]
bobbogo's answer works very nicely. Just remember to use define uniq for v3.81 and (did not check this) define uniq = for later versions.
larsmans' answer works very nicely too if your record separator is not a space, e.g. if you want to remove duplicates from _foo_bar_lose_lose_bar_baz_ or the like. Just remember to use the RS and ORS awk options instead of tr, and wrap it all with $(firstword $(shell ... ))

Boring $(eval)-based method:
define uniq =
$(eval seen :=)
$(foreach _,$1,$(if $(filter $_,${seen}),,$(eval seen += $_)))
${seen}
endef
w := z z x x y c x
$(info $(sort $w))
$(info $(call uniq,$w))
Extremely fiendish make standard library recursive call (recursive make considered extremely fiendish?):
uniq = $(if $1,$(firstword $1) $(call uniq,$(filter-out $(firstword $1),$1)))
It's worth noting that no variables are damaged in this second formulation (see seen in the first). It is preferable just for that (given the lack of locals in make)!
EDIT
My obscure comment about recursive make above seems to have muddied the waters somewhat.
"Recursive" in the context of this post means recursive function.
It really has nothing to do with the execrable recursive make.
The latter (recursive) definition of uniq is extremely nice, performant, small, and is definitely the one to use.
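A minimal, self-contained sketch of the recursive definition in use; the variable w and the do-nothing all target are only there for illustration. The recursion leaves a trailing space, so the call is wrapped in $(strip ...) here. Note also that filter-out treats % as a pattern character, so words containing % may not behave as expected.

```makefile
# Recursive uniq: keep the first word, drop its later copies, recurse on the rest.
uniq = $(if $1,$(firstword $1) $(call uniq,$(filter-out $(firstword $1),$1)))

w := z z x x y c x
$(info sorted: $(sort $w))
$(info uniq: $(strip $(call uniq,$w)))

# Do-nothing default target so make has something to build.
all: ; @:
```

Running make on this file should print "sorted: c x y z" followed by "uniq: z x y c".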

Depends on where you need it and whether you use GNU make. If you just want to uniq the list of target prerequisites, it's as easy as (http://www.gnu.org/software/make/manual/make.html#Quick-Reference) :
The value of $^ omits duplicate prerequisites, while $+ retains them and preserves their order.
So, a rule like
exe: $(OBJS)
	$(LD) -o $@ $^
will filter duplicates from $(OBJS) automagically, while still leaving order of other items the same.
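To see the difference between the two automatic variables, here is a small sketch. The targets show, a and b are hypothetical phony targets, and a is listed twice on purpose.

```makefile
# a appears twice in the prerequisite list.
.PHONY: show a b
show: a b a ; @echo 'caret: $^ | plus: $+'

# Empty recipes: a and b exist only to be listed as prerequisites.
a b: ;
```

With GNU make this should print "caret: a b | plus: a b a": $^ keeps the first occurrence of each prerequisite, in order.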

You could echo the words through awk:
echo foo bar foo baz bar | tr ' ' '\n' | awk '!a[$0]++'
Deduping one-liner taken from catonmat.
(Don't forget to double the $ to $$ in a Makefile.)
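Inside a makefile, the pipeline could be wrapped in $(shell ...) roughly like this. This is a sketch: the variable names words and uniq_words are made up for the example, and it assumes tr and awk are available to the shell that make uses.

```makefile
words := foo bar foo baz bar

# $$0 reaches awk as $0; $(shell) turns the newlines of the output back into spaces.
uniq_words := $(shell echo $(words) | tr ' ' '\n' | awk '!seen[$$0]++')

$(info $(uniq_words))

# Do-nothing default target so make has something to build.
all: ; @:
```

This should print "foo bar baz", keeping the order of first appearance.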

The following works for me under GNU make v3.82:
uniq = $(eval _uniq := $1)$(strip $(foreach _,$(_uniq),$(if $(filter $_,$(_uniq)),$(eval _uniq := $(filter-out $_,$(_uniq)))$_)))
It works on a copy of its input held in _uniq, so the input itself is not modified, and it's not recursive.

Related

Makefile: how to set up prerequisites for two lists of files

I have two lists of files as prerequisites
input_i.xx
config_j.yy
and I need to run all of their combinations. A single one looks like this:
input1_config3.output: input1.xx config3.yy
	run_script $^
Also in reality, their names are not numbered, but I already have their stems defined in INPUTS and CONFIGS. With that, I can generate all the targets together
TARGETS:=$(foreach input,$(INPUTS),$(foreach config,$(CONFIGS),$(input)_$(config).output))
But I have difficulty with the prerequisites. It seems I need to
get basename
split on _
add the extensions .xx and .yy
.SECONDEXPANSION
$(TARGETS): $(basename $@)
	run_script $^
Can someone show me how to do that? I'm not sure if this is the proper way; maybe a bottom-up approach is easier?
make is not really suitable for keeping track of an M x N matrix of results. The fundamental problem is that you can't have two stems in a rule, so you can't say something like
# BROKEN
input%{X}_config%{Y}.output: input%{X}.xx config%{Y}.yy
As a rough approximation, you could use a recursive make rule to set a couple of parameters, and take it from there, but this is rather clumsy.
.PHONY: all
all:
	$(MAKE) -$(MAKEFLAGS) X=1 Y=6 input1_config6.output
	$(MAKE) -$(MAKEFLAGS) X=1 Y=7 input1_config7.output
	$(MAKE) -$(MAKEFLAGS) X=2 Y=6 input2_config6.output
	:
input$X_config$Y.output: input$X.xx config$Y.yy
	run_script $^
It would be a lot easier if you provided a complete sample example with a complete set of targets and prerequisites and exactly what you wanted to happen.
Using .SECONDEXPANSION might work, but you're not using it correctly; please re-read the documentation. The critical aspect of .SECONDEXPANSION is that you have to escape the variables that you want to avoid expanding until the second pass. In your example you've not escaped anything, so .SECONDEXPANSION isn't actually doing anything at all here. However, as @tripleee points out, it's not easy to use multiple variable values in a single target.
To do this more easily you'll probably want to use eval. Something like this:
define DECLARE
$1_$2.output: $1.xx $2.yy
TARGETS += $1_$2.output
endef
TARGETS :=
$(foreach input,$(INPUTS),$(foreach config,$(CONFIGS),$(eval $(call DECLARE,$(input),$(config)))))
$(TARGETS):
	run_script $^
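For comparison, the .SECONDEXPANSION route could look roughly like the sketch below. It assumes GNU make 3.82 or later and that neither the input stems nor the config stems contain an underscore. The helper part and the sample values of INPUTS and CONFIGS are hypothetical, introduced only for this example.

```makefile
# Sample stems; replace with the real lists.
INPUTS  := input1 input2
CONFIGS := config3 config4
TARGETS := $(foreach input,$(INPUTS),$(foreach config,$(CONFIGS),$(input)_$(config).output))

.PHONY: all
all: $(TARGETS)

.SECONDEXPANSION:
# Hypothetical helper: word number $1 of the target name, split on "_".
part = $(word $1,$(subst _, ,$(basename $@)))

# The doubled $$ defers the call until the second expansion, when $@ is set.
$(TARGETS): $$(call part,1).xx $$(call part,2).yy ; run_script $^
```

The key point is the escaping: $$(call part,1) survives the first expansion as $(call part,1) and is only evaluated per target, once $@ has a value.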
I have another solution using include and bash for loop.
include trees.mk
trees.mk:
	@for input in $(INPUTS); do \
	  for config in $(CONFIGS); do \
	    echo "$${input}_$$config.output: $${input}.xx $$config.yy"; \
	    printf '\trun_script $$^\n'; \
	  done; \
	done > $@
At the beginning, trees.mk doesn't exist. The nested for loops write the rules out to the target using the file redirection > $@.
I got this idea from Managing Projects with GNU Make, Third Edition, by Robert Mecklenburg, page 56.

How to know if a makefile variable is a string of char or numbers?

It probably sounds very elementary but I am unable to find a way to classify a makefile variable into text or number. My pseudocode is like this:
ifeq ($N, 'numeric')
CFLAGS+=-D$N
endif
How to do this? I am using the GNU Make (in cygwin/Windows). I read the make.pdf that comes with it but could not find a way.
Thanks in Advance
EDIT: adopted a suggestion by bobbogo that does not depend on the number of characters to purge.
I assume you use GNU make. Here is a make-only solution, without calling the shell. For performance reasons, depending on your use of it, it can be preferable. Moreover, it does not depend on which shell make uses. Last but not least, it uses recursion and I like recursion:
define PURGE
$(if $(2),$(call PURGE,$(subst $(firstword $(2)),,$(1)),$(filter-out $(firstword $(2)),$(2))),$(1))
endef
DIGITS := 0 1 2 3 4 5 6 7 8 9
define IS_NOT_A_NUMBER
$(call PURGE,$(1),$(DIGITS))
endef
CFLAGS += $(if $(call IS_NOT_A_NUMBER,$(N)),,-D$(N))
all:
	$(info N=$(N) => CFLAGS=$(CFLAGS))
Demo:
host> make N=12345
N=12345 => CFLAGS=-D12345
make: 'all' is up to date.
host> make N=foobar
N=foobar => CFLAGS=
make: 'all' is up to date.
Explanation: PURGE is a recursive macro that takes two arguments. The first one ($(1)) is a string to test, the second one ($(2)) is a list of words to match. If $(2) is the empty list PURGE returns $(1). Else, it calls itself with two new parameters:
the value of $(1) where the first word of $(2) has been substituted by nothing,
$(2) from which the first word has been removed
and returns the result. So, if you call PURGE with a string and the list of all digits, it returns the empty string if and only if the string contained only digits.
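To make the recursion concrete, here is a small sketch that prints what PURGE leaves behind for a few sample strings. The sample values and the do-nothing all target are only illustrative. The brackets make an empty result visible.

```makefile
define PURGE
$(if $(2),$(call PURGE,$(subst $(firstword $(2)),,$(1)),$(filter-out $(firstword $(2)),$(2))),$(1))
endef

DIGITS := 0 1 2 3 4 5 6 7 8 9

# Whatever remains after deleting every digit is printed between the brackets.
$(info [$(call PURGE,12345,$(DIGITS))])
$(info [$(call PURGE,12a45,$(DIGITS))])
$(info [$(call PURGE,foobar,$(DIGITS))])

# Do-nothing default target so make has something to build.
all: ; @:
```

This should print "[]", "[a]" and "[foobar]". Note that an empty string also purges to the empty string, so an unset N would pass the test as well; guarding with an extra $(if $(N),...) may be worthwhile.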
All make variables are strings. To find out whether a string is in fact a number, you need some elementary text analysis functions. GNU make itself does not offer anything convenient in this area, but you could run a shell command to do the job, perhaps like this:
define is_number
$(shell test '$(1)' -eq '$(1)' 2>/dev/null && echo yes || echo no)
endef
ifeq ($(call is_number,$(N)),yes)
default:
	@echo N is a number
else
default:
	@echo N is not a number
endif
This results in:
$ make N=5
N is a number
$ make N=string
N is not a number
However, such string processing can be quite unreliable if the string contains special characters.

code management: generate source files with slight variations of various rules

I have a source file in a declarative language (twolc, actually) that I need to write many variations on: a normative version and many non-normative versions, each with one or more variations from the norm. For example, say the normative file has three rules:
Rule A:
Do something A-ish
Rule B:
Do something B-ish
Rule C:
Do something C-ish
Then one variation might have the exact same rules as the norm for A and C, but a different rule for B, which I will call B-1:
Rule A:
Do something A-ish
Rule B-1:
Do something B-ish, but with a flourish
Rule C:
Do something C-ish
Imagine that you have many different subtle variations on many different rules, and you have my situation. The problem I am worried about is code maintainability. If, later on, I decide that Rule A needs to be refactored somehow, then I will have 50+ files that need to have the exact same rule edited by hand.
My idea is to have separate files for each rule and concatenate them into variations using cat: cat A.twolc B.twolc C.twolc > norm.twolc, cat A.twolc B-1.twolc C.twolc > not-norm.twolc, etc.
Are there any tools designed to manage this kind of problem? Is there a better approach than the one I have in mind? Does my proposed solution have weaknesses I should watch out for?
As you added the makefile tag, here is a GNU-make-based (and GNU make only) solution:
# Edit this
RULES := A B B-1 C
VARIATIONS := norm not-norm
norm-rules := A B C
not-norm-rules := A B-1 C
# Do not edit below this line
VARIATIONSTWOLC := $(patsubst %,%.twolc,$(VARIATIONS))
all: $(VARIATIONSTWOLC)
define GEN_rules
$(1).twolc: $$(patsubst %,%.twolc,$$($(1)-rules))
	cat $$^ > $$@
endef
$(foreach v,$(VARIATIONS),$(eval $(call GEN_rules,$(v))))
clean:
	rm -f $(VARIATIONSTWOLC)
patsubst is straightforward. The foreach-eval-call is a bit more tricky. Long story short: it loops over all variations (foreach). For each variation v, it expands (call) GEN_rules by replacing $(1) by $(v) (the current variation) and $$ by $. Each expansion result is then instantiated (eval) as a normal make rule. Example: for v=norm, the GEN_rules expansion produces:
norm.twolc: $(patsubst %,%.twolc,$(norm-rules))
	cat $^ > $@
which is in turn expanded as (step-by-step):
step1:
norm.twolc: $(patsubst %,%.twolc,A B C)
	cat $^ > $@
step2:
norm.twolc: A.twolc B.twolc C.twolc
	cat $^ > $@
step3:
norm.twolc: A.twolc B.twolc C.twolc
	cat A.twolc B.twolc C.twolc > norm.twolc
which does what you want: if norm.twolc does not exist or if any of A.twolc, B.twolc, C.twolc is more recent than norm.twolc, the recipe is executed.

multi-wildcard pattern rules of GNU Make

I want to write something like regex:
SRC:="a.dat.1 a.dat.2"
$(SRC): %.dat.%: (\\1).rlt.(\\2)
	dat2rlt $^ $@
so that a.dat.1 and a.dat.2 will give a.rlt.1 and a.rlt.2.
In GNU Make info page, it says "the % can be used only once".
Is there some trick to achieve this in GNU Make?
I'm afraid what you are trying to do is not possible the way you suggest to do it, since - as you already mention - (GNU) make only allows a single stem '%', see http://www.gnu.org/software/make/manual/make.html#Pattern-Rules:
A pattern rule looks like an ordinary rule, except that its target
contains the character ‘%’ (exactly one of them).
Without support for more than one stem, creating such 'multi-dimensional' targets is cumbersome.
One way around this is by rebuilding the name of the dependency in the command (rather than in the dependency list):
SRC := a.dat.1 a.dat.2
all : $(SRC:%=%.dat2rlt)
%.dat2rlt :
	dat2rlt $(word 1,$(subst ., ,$*)).rlt.$(word 3,$(subst ., ,$*)) $*
Of course, this way you lose the dependency: the target will not be rebuilt when the rlt file is updated.
The only way I can see to address that is by generating the rules explicitly:
SRC := a.dat.1 a.dat.2
all : $(SRC)
define GEN_RULE
$1.dat.$2 : $1.rlt.$2
	dat2rlt $$< $$@
endef
$(foreach src,$(SRC),$(eval $(call GEN_RULE,$(word 1,$(subst ., ,$(src))),$(word 3,$(subst ., ,$(src))))))
Using named variables, we can write more readable code (based on answer of Paljas):
letters:=a b c
numbers:=1 2 3 4
define GEN_RULE
$(letter).dat.$(number) : $(letter).rlt.$(number)
	./rlt2dat $$< $$@
endef
$(foreach number,$(numbers), \
$(foreach letter,$(letters), \
$(eval $(GEN_RULE)) \
) \
)
We can generate SRC in a similar way. Note that using that method SRC will contain all the combinations. That may or may not be beneficial.
Building on the answer of Erzsébet Frigó, you might additionally choose to:
in the inner loop, eval not the macro itself but the result of calling it
name the macro after the program you're calling, rlt2dat
in combination, allowing you to
refer to the program name using make's ${0}
define a target, ${0}s (expanding to rlt2dats; note the pluralization), whose prerequisites are all the combinations of letters and numbers on which rlt2dat was called
Like this:
letters:=a b c
numbers:=1 2 3 4
define rlt2dat
${0}s::$(letter).dat.$(number)
$(letter).dat.$(number): $(letter).rlt.$(number)
	./${0} $$< $$@
endef
$(foreach number,$(numbers), \
$(foreach letter,$(letters), \
$(eval $(call rlt2dat))))
allowing you to build all rlt2dat targets as:
make rlt2dats
For the limited example you gave, you can use a pattern with one %.
SRC := a.dat.1 a.dat.2
${SRC}: a.dat.%: a.rlt.%
	dat2rlt $^ $@
$* in the recipe will expand to whatever the % matched.
Note that the "s around your original macro are definitely wrong.
Have a look at .SECONDEXPANSION in the manual for more complicated stuff (or over here).
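As a rough sketch of what the .SECONDEXPANSION approach might look like for this case: each target derives its own prerequisite by replacing .dat. with .rlt. in its name. This assumes GNU make with secondary expansion support and that the .rlt. files already exist; dat2rlt is the program from the question.

```makefile
SRC := a.dat.1 a.dat.2

all: $(SRC)

.SECONDEXPANSION:
# The doubled dollar signs defer evaluation until the target name is known.
$(SRC): $$(subst .dat.,.rlt.,$$@) ; dat2rlt $^ $@
```

Because the prerequisite is computed from the full target name, this is not limited to a single varying part the way a one-% pattern is.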

Simplest way to reverse the order of strings in a make variable

Let's say you have a variable in a makefile fragment like the following:
MY_LIST=a b c d
How do I then reverse the order of that list? I need:
$(warning MY_LIST=${MY_LIST})
to show
MY_LIST=d c b a
Edit: the real problem is that
ld -r some_object.o ${MY_LIST}
produces an a.out with undefined symbols because the items in MY_LIST are actually archives, but in the wrong order. If the order of MY_LIST is reversed, it will link correctly (I think). If you know a smarter way to get the link order right, clue me in.
A solution in pure GNU make:
default: all
foo = please reverse me
reverse = $(if $(1),$(call reverse,$(wordlist 2,$(words $(1)),$(1)))) $(firstword $(1))
all : ; @echo $(call reverse,$(foo))
Gives:
$ make
me reverse please
An improvement to the GNU make solution:
reverse = $(if $(wordlist 2,2,$(1)),$(call reverse,$(wordlist 2,$(words $(1)),$(1))) $(firstword $(1)),$(1))
better stopping condition, original uses the empty string wasting a function call
doesn't add a leading space to the reversed list, unlike the original
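Applied to the list from the question, the improved definition can be tried out with a short sketch like this (the do-nothing all target is only there so make has something to build):

```makefile
MY_LIST = a b c d

# Recurse on everything after the first word, then append the first word.
reverse = $(if $(wordlist 2,2,$(1)),$(call reverse,$(wordlist 2,$(words $(1)),$(1))) $(firstword $(1)),$(1))

$(info MY_LIST=$(call reverse,$(MY_LIST)))

all: ; @:
```

This should print "MY_LIST=d c b a".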
Doh! I could have just used a shell script-let:
(for d in ${MY_LIST}; do echo $$d; done) | tac
You can also define search groups with ld:
ld -r foo.o -( a.a b.a c.a -)
Will iterate through a.a, b.a, and c.a until no new unresolved symbols can be satisfied by any object in the group.
If you're using GNU ld, you can also do:
ld -r -o foo.o --whole-archive bar.a
Which is slightly stronger, in that it will include every object from bar.a regardless of whether it satisfies an unresolved symbol from foo.o.
Playing off of both Ben Collins' and elmarco's answers, here's a punt to bash which handles whitespace "properly" [1]:
reverse = $(shell printf "%s\n" $(strip $1) | tac)
which does the right thing, thanks to $(shell) automatically cleaning whitespace and printf automatically formatting each word in its arg list:
$(info [ $(call reverse, one two three four ) ] )
yields:
[ four three two one ]
[1] ...according to my limited test case (i.e., the $(info ...) line above).
