Dealing with dependency poisoning in makefiles

I'm working on rewriting an old build that was originally "designed" (or not, as the case may be) to be recursive. To preface, there will come a day when we'll move to something more modern, expressive, and powerful (e.g., SCons); however, that day is not now.
As part of this effort I'm in the process of consolidating what should be generic variables/macros & targets/recipes into a few concise rulefiles that'll be included as part of the primary build. Each sub-section of the build will use a small makefile that adds targets & dependencies with little in the way of variables being added in these sub-makefiles. The top level makefile will then include all the makefiles, allowing everything to contribute to the dependency tree.
I must admit I'm not at all confident that people will use good judgement in modifying makefiles. As an example of what I'm worried about:
CFLAGS = initial cflags
all: A.so
%.so %.o:
	@echo "${@}: ${CFLAGS} : ${filter %.o,${^}} ${filter %.c,${^}}"
%.c :
	true
%.o : %.c
A.so : B.so a1.o a2.o a3.o
B.so : b1.o b2.o b3.o
A.so : CFLAGS += flags specific to building A.so
Provided I didn't screw up copying that example, the situation is thus: A.so would link to B.so, and A.so's objects need special flags to be built; however, B.so and B's objects will inherit the changes to CFLAGS.
I would prefer to have a single target for building most if not all object files, even to the extent of modifying CFLAGS specifically for those objects that need slightly different flags, in order to promote re-use of more generic targets (debugging is easier if there's only one target/recipe to worry about).
After I finish re-architecting this build, I'm not at all confident someone won't do something stupid like this; worse, it's likely to pass peer review if I'm not around to review it.
I've been kicking around the idea of doing something like this:
% : CFLAGS = initial cflags
... which would prevent dependency poisoning unless someone then updates it with:
% : CFLAGS += some naive attempt at altering CFLAGS for a specific purpose
However, if there are just 1000 targets (an extremely conservative estimate) and approximately 1 KB of memory allocated to variables per target, then we're at around 1 MB of overhead, which could significantly impact the time it takes to look up the CFLAGS value when working through recipes (depending on gmake's architecture, of course).
In short, I suppose my question is: what's a sane/good way to prevent dependency poisoning in a makefile? Is there a better strategy than what I've outlined?
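One option not raised in the question, assuming GNU Make 3.82 or newer: the `private` modifier on a target-specific variable stops prerequisites from inheriting it. The sketch below uses made-up flag values and a stripped-down version of the A.so/B.so example; it is an illustration, not the actual build.

```makefile
# Hypothetical flags; only the inheritance behaviour matters here.
CFLAGS = -O2

all: A.so

A.so: B.so
B.so:

# 'private' keeps this addition from being inherited by A.so's prerequisites.
A.so: private CFLAGS += -DONLY_FOR_A

# "target: ; recipe" puts the recipe on the rule line, so no TAB is needed.
%.so: ; @echo "$@: $(CFLAGS)"
```

Running `make` should show B.so with only the global flags and A.so with the extra flag appended. Dropping `private` brings back the poisoning described above.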
edit
If anyone out there attempts to go down the path of scoping variables as described above, I ran into a nuance that wasn't entirely obvious at first.
% : INCLUDES :=
# ...
SOMEVAR := /some/path
% : INCLUDES += -I${SOMEVAR}
...
SOMEVAR :=
When a variable is created using :=, everything to the right of := should be evaluated immediately, whereas if it just used = it would delay evaluation until the target recipe evaluates INCLUDES.
However, SOMEVAR evaluates to nothing when a target recipe is evaluated. If you change the definition to:
% : INCLUDES := whatever
# ...
SOMEVAR := /some/path
% : INCLUDES := ${INCLUDES} -I${SOMEVAR}
...
SOMEVAR :=
... then it forces SOMEVAR to be evaluated immediately instead of delaying evaluation, but INCLUDES doesn't evaluate to its previously scoped definition, rather to the global definition.
$(flavor ...) says INCLUDES is simple, and $(origin ...) returns file; this occurs whether you use := or +=.
In short, if you use += on scoped variables, it'll only use the definition of the variable scoped to that target; it doesn't look at globals. If you use :=, it only uses globals.

If you abstain from unusual characters in your filenames, you can select target-specific variables with variable name substitution:
A.so_CFLAGS = flags specific to building A.so
%.so %.o:
	@echo "${@}: ${CFLAGS} ${$@_CFLAGS} : ${filter %.o,${^}} ${filter %.c,${^}}"
Which obviously doesn't propagate the name of the currently built archive to the objects built for that library, but I don't know if this is desired.
This approach has some obvious deficiencies, naturally, like the inability to actually override CFLAGS. However, considering that automake has the same problem to solve and resorts to stupid text substitution, I think a lot of people already failed to find a nice solution here.
As a side-note, you might consider using automake instead of re-engineering it.
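A minimal, self-contained sketch of the name-substitution approach above, with hypothetical flag values. It only echoes what would be used, so it can be run with no source files present.

```makefile
# Hypothetical flags for illustration only.
CFLAGS = -O2
# Looked up only when the target being built ($@) is A.so.
A.so_CFLAGS = -DBUILDING_A

all: A.so

A.so: B.so
B.so:

# $($@_CFLAGS) expands $@ first, then looks up e.g. A.so_CFLAGS.
# For targets without such a variable it expands to nothing.
%.so: ; @echo "$@: $(CFLAGS) $($@_CFLAGS)"
```

Because the lookup is keyed on the target name, B.so does not pick up A.so's flags, and neither would any objects built for A.so, as noted above.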

Related

Make overflow¹, or “How to override a target?”

I have a series (dozens) of projects that consist of large amounts of content in git repositories. Each repository has a git submodule of a common toolkit. The toolkit contains libraries and scripts needed to process the content repositories and build a publishable result. All the repositories are pushed to a host that runs CI and publishes the results. The idea is to keep the repeated code to an absolute minimum, have mostly content in the repositories, and rely on the toolkit to put it all together the same way for every project.
Each project has a top level Makefile that typically only has a couple lines, for example:
STAGE = stage
include toolkit/Makefile
The STAGE variable has some info about what stage this particular project is in, which determines which formats get built. Pretty much everything else is handled by the 600-line Makefile in the toolkit. Building some of the output formats can require a long chain of dependencies: processing a source might trigger a target rule, but to get to the target there might be 8–10 intermediate dependencies where various files get generated before the final target can be made.
I've run across a couple situations where I want to completely replace (not just extend) a specific target rule in just one project. The target gets triggered in the middle of a chain of dependencies but I want to do something completely different for that one step.
I've tried just replacing the target in the top level Makefile:
STAGE = stage
%-target.fmt:
commands
include toolkit/Makefile
This is specifically documented not to be supported, but tantalizingly it works some of the time. I've tried changing the order of declaring the custom target and the include, but that doesn't seem to significantly affect this. In case it matters, yes, the use of patterns in targets is important. The documentation says:
"Sometimes it is useful to have a makefile that is mostly just like another makefile. You can often use the ‘include’ directive to include one in the other, and add more targets or variable definitions. However, it is invalid for two makefiles to give different recipes for the same target."
Interestingly if I put custom functions in the top level Makefile below the include I can override the functions from the toolkit such that $(call do_thing) will use my override:
STAGE = stage
include toolkit/Makefile
define do_thing
commands
endef
However, the same does not seem to be true for targets. I am aware of the double-colon syntax, but I do not want to just extend an existing target with more dependencies; I want to replace the target entirely with a different way of generating the same file.
I've thought about using recursive calls to make as suggested in the documentation, but then the environment, including helper functions that are extensively set up in the toolkit Makefile, would not be available to any targets in the top level Makefile. This would be a show stopper.
Is there any way to make make make an exception for me? Can it be coerced into overriding targets? I'm using exclusively recent versions of GNU-Make and am not too concerned about portability. Failing that is there another conceptual way to accomplish the same ends?
¹ My brain hasn't had enough coffee today. In trying to open Stack Overflow to ask this question I typed makeoverflow.com into my browser and was confused why auto-completion wasn't kicking in.
Updated answer:
If your recipe has dependencies, these cannot be overridden by default. In that case $(eval) might save you, like this:
In toolkit have a macro definition with your generic rule:
ifndef TARGET_FMT_COMMANDS
define TARGET_FMT_COMMANDS
	command1 # note that these commands should be prefixed with a TAB character
	command2
endef
endif
define RULE_TEMPLATE
%-target.fmt: $(1)
	$$(call TARGET_FMT_COMMANDS)
endef
# define the default dependencies, notice the ?= assignment
TARGET_DEPS?=list dependencies here
# instantiate the template for the default case
$(eval $(call RULE_TEMPLATE,$(TARGET_DEPS)))
Then, in the calling code, just define TARGET_FMT_COMMANDS and TARGET_DEPS before including the toolkit and this should do the trick.
(please forgive the names of the defines/variables, they are only an example)
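To make the moving parts concrete, here is a single-file sketch of the template approach above. The toolkit portion is inlined so the example runs on its own; in practice it would live in toolkit/Makefile and be pulled in with `include`. The file names `custom.src` and `default.src` and the echo recipes are made up for illustration.

```makefile
# --- project Makefile part: overrides defined BEFORE the toolkit is read ---
define TARGET_FMT_COMMANDS
@echo "custom recipe for $@ from $^"
endef
TARGET_DEPS = custom.src

# --- toolkit part (would normally be: include toolkit/Makefile) ---
# ifndef and ?= only supply defaults when the project did not set anything.
ifndef TARGET_FMT_COMMANDS
define TARGET_FMT_COMMANDS
@echo "default recipe for $@"
endef
endif
TARGET_DEPS ?= default.src

# $(1) is expanded by $(call ...); $$ defers the recipe until it actually runs.
define RULE_TEMPLATE
%-target.fmt: $(1) ; $$(call TARGET_FMT_COMMANDS)
endef
$(eval $(call RULE_TEMPLATE,$(TARGET_DEPS)))

# Dummy prerequisites so the sketch needs no real files.
custom.src default.src: ; @:
```

With this file, `make foo-target.fmt` should run the custom recipe with `custom.src` as its prerequisite. Removing the first section falls back to the toolkit defaults.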
Initial answer:
Well, I'd write this in the toolkit:
define TARGET_FMT_COMMANDS
	command1 # note that these commands should be prefixed with a TAB character
	command2
endef
%-target.fmt:
	$(call TARGET_FMT_COMMANDS)
Then you could simply redefine TARGET_FMT_COMMANDS after include toolkit/Makefile.
The trick is to systematically have the TAB character precede the commands inside the definition; if not, you get weird errors.
You can also give parameters to the definition, just as usual.
I ran into the same issue. Overriding a target does not override its prerequisites, so I worked around the problem by also overriding the prerequisites' rules, forcing some of them to have empty recipes.
in toolkit/Makefile:
test: depend depend1 depend2
	@echo test toolkit
...
in Makefile:
include toolkit/Makefile
depend1 depend2: ;
test: depend
	@echo test
Notice how depend1 and depend2 now have empty recipes, so the test target's command is overridden and the dependencies are effectively overridden as well.

GNU Make recursively expanded variables examples

Could somebody provide a real-world example of using recursively expanded variables (REV)? In the docs or various blog posts people give only useless toy examples, like
foo = $(bar)
bar = $(ugh)
ugh = Huh?
I cannot find a real use for REV besides creating custom functions with $(call). I also found that in the past people were using REV to supply additional parameters to a compiler for specific targets but that trick is considered outdated now because GNU Make has target-specific variables.
Both recursively expanded variables and simply expanded variables recurse their expansions. The major difference is when that expansion happens.
So your example above works just fine with simply expanded variables if you invert the assignments:
ugh := Huh?
bar := $(ugh)
foo := $(bar)
So the major thing that recursively expanded variables get you is the freedom to assign values in whatever order you need (which means you don't need to worry about inclusion order for included makefiles, etc.).
In a project at work we have a dozen or so included makefiles that have inter-dependent relationships. These are expressed through the usage of known-format variable names (e.g. module A generates an A_provides variable, etc.) Modules that need to utilize the things that module A provides can then list $(A_provides) in their makefiles.
With simply expanded variables (which we had been using until somewhat recently) this meant that the inclusion of the generated makefiles required a manually sorted order to force the inclusion of assigning makefiles before consuming makefiles (into other variables).
With recursively expanded variables this order does not matter. (This would not be the case if the variables were used in any immediately evaluated context in these makefiles but luckily they are not, they only set variables that are used later in the main makefiles.)
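A minimal sketch of that order-independence. The object names and the `B_needs` variable are made up; only `A_provides` comes from the description above.

```makefile
# The consumer appears first, as if its makefile had been included first.
# '=' defers expansion of $(A_provides) until B_needs is actually used.
B_needs = $(A_provides) b_extra.o

# The provider is defined later, as if included afterwards.
A_provides = a1.o a2.o

# "target: ; recipe" form, so no TAB is needed.
show: ; @echo "B_needs = $(B_needs)"
```

Running `make show` should list the objects from `A_provides` followed by `b_extra.o`. If `B_needs` were assigned with `:=` instead, `$(A_provides)` would be empty at that point and the result would depend on inclusion order.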
Well, one simple example are variables which contain commands for recipes; perhaps:
buildc = $(CC) $(CPPFLAGS) $(CFLAGS) -c -o $@ $<
%.o: %.c
	$(buildc)
I probably wouldn't write a compile rule this way, but if you have a much more complex recipe it can be very useful.
Personally I don't consider "additional parameters ... for specific targets" (by which I assume you mean recursively defined variables such as $($*_FLAGS)) to be outdated or obsoleted by target-specific variables, by any stretch. If nothing else recursively defined variables are much more portable than target-specific variables.
It just means that simply expanded variables (:=) are set at the time of definition only, whereas recursively expanded variables (=) are re-expanded each time they are used.
The best example I can find is this
x := foo
y := $(x) bar
x := later
and is equivalent to
y := foo bar
x := later

Writing a Makefile to be includable by other Makefiles

Background
I have a (large) project A and a (large) project B, such that A depends on B.
I would like to have two separate makefiles -- one for project A and one for project B -- for performance and maintainability.
Based on the comments to an earlier question, I have decided to entirely rewrite B's makefile such that A's makefile can include it. This will avoid the evils of recursive make: allow parallelism, not remake unnecessarily, improve performance, etc.
Current solution
I can find the directory of the currently executing makefile by including this at the top (before any other includes):
TOP := $(dir $(lastword $(MAKEFILE_LIST)))
I am writing each target as
$(TOP)/some-target: $(TOP)/some-src
and making changes to any necessary shell commands, e.g. find dir to find $(TOP)/dir.
While this solves the problems it has a couple disadvantages:
Targets and rules are longer and a little less readable. (This is likely unavoidable. Modularity has a price).
Using gcc -M to auto-generate dependencies requires post-processing to add $(TOP) everywhere.
Is this the usual way to write makefiles that can be included by others?
If by "usual" you mean, "most common", then the answer is "no". The most common thing people do, is to improvise some changes to the includee so the names do not clash with the includer.
What you did, however, is "good design".
In fact, I take your design even further.
I compute a stack of directories: if the inclusion is recursive, you need to keep the current directories on a stack as you parse the makefile tree. $D is the current directory, which is shorter for people to type than $(TOP)/,
and I prepend $D/ to everything in the includee, so you have variables:
$D/FOOBAR :=
and phony targets:
$D/phony:
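A rough sketch of what such an includee might look like. The file names `in.txt` and `out.txt` are hypothetical, and the directory stack for nested includes is left out. One assumption worth flagging: recipes are expanded late, so `$D` may have been reassigned by another include by the time they run; the sketch uses `$(@D)` inside recipes for that reason.

```makefile
# Directory of this makefile. patsubst strips the trailing slash left by $(dir ...).
D := $(patsubst %/,%,$(dir $(lastword $(MAKEFILE_LIST))))

# Targets and prerequisites are expanded immediately, so $D is safe here.
$D/out.txt: $D/in.txt ; cp $< $@

# Recipes are expanded when they run, so take the directory from the target.
.PHONY: $D/clean
$D/clean: ; rm -f $(@D)/out.txt
```

An includer can then `include` this file from any directory and refer to the targets by their prefixed names.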

Why .PHONY:target and not target:.PHONY?

I still don't understand why "phony" rules in Makefiles have ".PHONY" as their target. It would be much more logical as a prerequisite.
Do I have to elaborate on this? If A depends on B and B is phony, then A is phony too. So the dependency graph .PHONY←B→A is way more surprising than .PHONY→B→A. (Another argument is that an implementation of make must handle the .PHONY target very specially.)
This critique may seem rather theoretical (or even pointless): "since make is so ancient, its syntax is here to stay". But I am not proposing any syntax change; there is an alternative:
With GNU Make (at least), the following Makefile declares a phony target_A:
target_A: _PHONY
	touch target_A
_PHONY:
	#noop
Question 1: This is so simple and clean, surely I am not its first inventor. In fact, given this alternative, why did make ever need the special syntax?
It seems to me that this would also quite nicely solve questions about wildcards in phony targets, and could even shed some light on .PHONY's meaning when beginners doubt.
Question 2: Can you think of any circumstance where this approach is inferior? (Is invoking make .PHONY of any use?)
(I should mention that while I have invoked other makes, GNU Make is the only implementation that I have some experience with - reading and writing Makefiles.)
One big problem with using target_A: .PHONY is that it makes it much harder to use many of make's built-in variables. Take this common recipe as an example:
%.a: $(OBJ_FILES)
	$(LD) $(LFLAGS) -o $@ $^
The $^ variable pulls in everything that's listed as a prerequisite. If .PHONY was also listed there then it would be passed to the linker on the command-line, which would probably not result in anything good happening. Using meta-targets like .PHONY as prerequisites makes these built-in variables significantly less useful, as they require a lot of extra processing like $(filter-out .PHONY,$^) every time they are used. Inverting the relationship and instead making .PHONY the target is a bit awkward for the sake of thinking about dependency trees, but it cleans up the rest of the makefile.
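A small sketch of that effect, using the `_PHONY` prerequisite from the question. The name `dep1` is made up, and the recipes only echo so that nothing is actually built.

```makefile
# _PHONY shows up in $^ like any other prerequisite,
# so every use of $^ has to filter it out again.
target_A: dep1 _PHONY ; @echo "all prereqs: $^ | filtered: $(filter-out _PHONY,$^)"

# Dummy rules so the sketch needs no real files.
dep1 _PHONY: ; @:
```

Running `make` should print both prerequisites in the first list and only `dep1` in the filtered one.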

Building hierarchical Makefile with GNU Make

I have a project divided in modules, each hosted in a directory, say:
root
|_module_A
|  |_module.cpp
|  |_Makefile
|_module_B
|  |_Makefile
|_main.c
|_Makefile
main.c depends on targets defined in Makefiles related to module_A and module_B.
I want to write my root/Makefile with respect to targets defined in Makefiles of both modules.
Now, I know that I could use the include directive, but the problem here is that targets and filenames in module_A and module_B aren't prepended with their directory, so I get something like this:
make: *** No rule to make target `module.o', needed by `main.c'. Stop.
Is there a good way to solve this?
Thanks.
There are a couple of ways to do this, none of them perfect. The basic problem is that Make is good at using things there to make things here, but not the other way around.
You haven't said what the targets in module_B are; I'll be pessimistic and suppose that module_A and module_B both have targets called module (different source files, different recipes), so you really can't use include.
The biggest choice you have to make is whether to use recursive Make:
If you don't, then root/Makefile must know how to build module_A/module and module_B/module, so you'll simply have to put those rules in. Then you must either leave the redundant rules in the subdir makefiles (and run the risk that they'll drift out of agreement with the master makefile), or eliminate them, or have them call the master makefile recursively (which you wouldn't have to do very often, but it sure would look silly).
If you do, then root/Makefile will look something like this:
main: main.o module_A/module.o module_B/module.o
	...
main.o: main.c
	...
%/module.o:
	$(MAKE) -C $(@D) $(@F)
This will work well enough, but it will know nothing about dependencies within the subdirectories, so it will sometimes fail to rebuild an object that is out of date. You can make clean (recursively) beforehand every time, just to be on the safe side, crude but effective. Or force the %/module.o rule, which is less wasteful but a little more complicated. Or duplicate the dependency information in root/Makefile, which is tedious and untidy.
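The "force the %/module.o rule" option might look like the sketch below. It uses the common `FORCE` idiom, which is an assumption about what was meant rather than something spelled out above, and it presumes the module_A and module_B layout from the question.

```makefile
# FORCE has no recipe and never exists as a file, so any target that depends
# on it is always considered out of date. The sub-make is therefore always
# invoked and decides for itself whether module.o really needs rebuilding.
%/module.o: FORCE ; $(MAKE) -C $(@D) $(@F)

FORCE:
```

The cost is one sub-make invocation per module on every build, which is cheaper than a recursive `make clean` but still more than a non-recursive setup would need.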
It's just a question of your priorities.
Can't you write the makefile in a non-recursive way?
Recursive Make Considered Harmful

Resources