How to force make to spawn jobs only for recursions

I'm working on a huge project based on Qt, which takes a couple of hours to compile even on a 6-core machine.
The reason for this is that when I run make, only one of the cores compiles the sources; the others remain idle.
The solution would be to execute make with the -j option (something like make -j6), using all 6 cores of my machine.
The problem with this is that the parallel jobs are not coordinated across the recursive make invocations.
For example:
I have 4 modules, A, B, C and D:
- D depends on A, B and C.
- B depends on A.
- C depends on system libraries only.
- A depends on system libraries only.
The qmake app generated a Makefile for each of the above modules and one Makefile to compile all the modules.
When I run make -j6, the 6 jobs start compiling all the modules at once, instead of compiling them one by one. This behavior is a problem because when module D must be linked against the other modules, those might not be built yet, producing a "not found" error.
Is it possible to change this behavior with a make option? Or might this be a software engineering problem (the modules are not well designed)?

It sounds like you have missing dependencies between the recursive make invocations. You haven't shown us the top-level makefile that invokes the recursive makes, but I guess it looks something like this:
all: A B C D
A B C D:
	$(MAKE) -C $@
You can fix this by adding the necessary dependencies between the recursive makes:
all: A B C D
A B C D:
	$(MAKE) -C $@

B: A
D: A B C
This strategy will give you correct parallel builds, although at the cost of some performance: there is a lot of work that could safely be parallelized between those submakes, and it's a shame to serialize all of it just for the sake of the one or two commands in each that really have to be serialized. A better solution is to use non-recursive make, which would require a more serious refactoring of your makefiles, or to use Electric Make, which can solve this problem for you without requiring you to change the makefiles at all (not even to add the extra dependencies). I've written about how Electric Make fixes recursive make on my blog.
(Disclaimer: I'm the architect of Electric Make)
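For reference, here is a minimal sketch of what the non-recursive alternative could look like for the modules in the question (the object lists and file names are hypothetical): a single makefile sees the whole dependency graph, so make -j6 can interleave compilations from every module while still linking D last.
# One make instance sees the whole graph, so parallel jobs are
# scheduled across modules without breaking the link order.
A_OBJS := A/a1.o A/a2.o    # hypothetical source lists
B_OBJS := B/b1.o
D_OBJS := D/main.o

%.o: %.cpp
	g++ -c -o $@ $<

libA.a: $(A_OBJS)
	ar rcs $@ $^

libB.a: $(B_OBJS)
	ar rcs $@ $^

D: $(D_OBJS) libB.a libA.a
	g++ -o $@ $^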

Try building with makepp. That will unravel the recursion (see Recursive Make Considered Harmful) by piping the sub-make's targets back to the main process, as though the makefiles had been cleanly designed from the outset.
Then it will do its normal dependency analysis, not abstractly on a directory level, but by drilling down to the real per-file dependencies across your whole directory tree. So it can parallelize the maximum that's safely possible.
Actually, I don't know why cmake generates makefiles... All they do is call cmake to do everything behind make's back. So much so that makepp has some special handling for cmake to avoid endless recursion for dependencies that make overlooks. Now qmake is not exactly cmake, and I'd be delighted to hear whether it's close enough for this to work!

I've finally discovered how to do this automatically with qmake.
It couldn't be simpler: just add CONFIG += ordered to the .pro file and everything will just work. More about this flag can be found in the qmake Variable Reference.
Note that it only works with the subdirs project template, but this is not an issue, since parallel compilation is only a problem when there are dependencies between different binaries.
Example:
# ABC.pro
TEMPLATE = subdirs
CONFIG += ordered
SUBDIRS += \
    A \
    B \
    C
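If full serialization costs too much parallelism, the subdirs template also lets you declare dependencies between individual subdirectories with .depends (assuming a Qt version recent enough to support it), so independent modules can still build in parallel:
# ABCD.pro -- sketch: per-subdir ordering instead of CONFIG += ordered
TEMPLATE = subdirs
SUBDIRS = A B C D
B.depends = A
D.depends = A B C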

Related

GNU Make: When invoking parallel make, if prerequisites are supplied during the build, will make try to remake those?

This is an order of operations question.
Suppose I declare a list of requirements:
required := $(patsubst %.foo,%.bar,$(shell find * -name '*.foo'))
And a rule to make those requirements:
$(required):
	./foo.py $@
Finally, I invoke the work with:
make foo -j 10
Suppose further that the job is taking days and days (up to a week on this slow desktop computer).
In order to speed things up, I'd like to generate a list of commands and do some of the work on the much faster laptop. I can't do all of the work on the laptop because, for whatever reason, it can't stay up for hours and hours without discharging and suspending (if I had to guess, probably due to thermal throttling):
make -n foo > outstanding_jobs
cat outstanding_jobs | sort -r | sponge outstanding_jobs
scp slow_box:outstanding_jobs fast_laptop:outstanding_jobs
ssh fast_laptop
head -n 200 outstanding_jobs | parallel -j 12
scp *.bar slow_box:.
The question is:
If I put *.bar in the directory where the original make job was run, will make still try to do that job on the slow box?
OR do I have to halt the job on the slow box and re-invoke make to "get credit" in the make recipe for the new work that I've synced over onto the slow box?
Before it starts building anything, make constructs a dependency graph to guide it, based on an analysis of the requested goal(s), the applicable build rules, and, to some extent, the files already present. It then walks the graph, starting from the goal nodes, to determine which are out of date with respect to their prerequisites and update them.
Although it does not necessarily evaluate the whole graph before running any recipes, once it decides that a given target needs to be updated, make is committed to updating it. In particular, once make decides that some direct or indirect prerequisite of T is out of date, it is committed to (re)building T, too, regardless of any subsequent action on T by another process.
So, ...
If I put *.bar in the directory where the original make job was run,
will make still try to do that job on the slow box?
Adding files to the build directory after make starts building things will not necessarily affect which targets the ongoing make run will attempt to build, nor which recipes it uses to build them. The nearer a target is to a root of the dependency graph, the less likely it is that the approach described will affect whether make performs a rebuild, especially if you're running a parallel make.
It's possible that you would see some time savings, but you must also consider the possibility that you end up with an inconsistent build.
OR do I have to halt the job on the slow box and re-invoke make to "get credit" in the make recipe for the new work that I've synced over onto the slow box?
If the possibility of an inconsistent build can be discounted, then that is probably a viable option. A new make run will take the then-existing files into account. Depending on the defined rules and the applicable timestamps, it is still possible that some targets would be rebuilt that did not really need to be, but unless the makefile engages in unusual shenanigans, chances are good that at least most of the built files imported from the helper machine will be accepted and used without rebuilding.
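Put concretely, the safer sequence is to stop make, sync the files over, and re-invoke it, rather than dropping files under a running make. A sketch, using the paths from the question:
# on the slow box:
# 1. stop the running make first (Ctrl-C, or kill its process)
# 2. import the laptop-built files
scp 'fast_laptop:*.bar' .
# 3. re-invoke; .bar targets that are now up to date are skipped
make foo -j 10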

How to rebuild when the recipe has changed

I apologize if this question has already been asked; it's not easy to search for.
make has been designed with the assumption that the Makefile is kinda god-like: it is all-knowing about the future of your project and will never need any modification besides adding new source files. Which is obviously not true.
I used to make all my targets in a Makefile depend on the Makefile itself, so that if I change anything in the Makefile, the whole project is rebuilt.
This has two main limitations:
- It rebuilds too often. Adding a linker option or a new source file rebuilds everything.
- It won't rebuild if I pass a variable on the command line, like make CFLAGS=-O3.
I see a few ways of doing it correctly, but none of them seems satisfactory at first glance.
- Make every target depend on a file that contains the content of the recipe.
- Generate the whole rule with its recipe into a file destined to be included from the Makefile.
- Conditionally add a dependency to the targets to force them to be rebuilt whenever necessary.
- Use the eval function to generate the rules.
But all these solutions need an uncommon way of writing the recipes: either putting the whole rule as a string in a variable, or wrapping the recipes in a function that does some magic.
What I'm looking for is a way to write the rules that is as straightforward as possible, with as little additional junk as possible. How do people usually do this?
I have projects that compile for multiple platforms. When building a single project which had previously been compiled for a different architecture, one can force a rebuild manually. However, when compiling all projects for OpenWRT, manual cleanup is unmanageable.
My solution was to create a marker identifying the platform. If missing, everything will recompile.
ARCH ?= $(shell uname -m)
CROSS ?= $(shell uname -s).$(ARCH)

# marker for the last built architecture
BUILT_MARKER := out/$(CROSS).built

$(BUILT_MARKER):
	@-rm -f out/*.built
	@touch $(BUILT_MARKER)

build: $(BUILT_MARKER)
	# TODO: add your build commands here
If your flags are too long, you may reduce them to a checksum.
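As a sketch of that checksum idea, in the same spirit as the marker above (GNU make; the variable names are just an example):
# name the marker after a checksum of the flags, so the recipe
# runs exactly when the flags change
FLAGS_SUM := $(shell echo '$(CFLAGS) $(LDFLAGS)' | cksum | cut -d' ' -f1)
FLAGS_MARKER := out/flags.$(FLAGS_SUM).built

$(FLAGS_MARKER):
	@mkdir -p out
	@-rm -f out/flags.*.built
	@touch $@

build: $(FLAGS_MARKER)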
"make has been designed with the assumption that the Makefile is kinda god-like. It is all-knowing about the future of your project and will never need any modification beside adding new source files."
I disagree. make was designed at a time when having your source tree sitting in a hierarchical file system was about all you needed to know about software configuration management, and it took this idea to its logical consequence, namely that all that is, is a file (with a timestamp). So, having linker options, locator tables, compiler flags and everything else but the kitchen sink in a file, and putting the dependencies thereof also in a file, will yield a consistent, complete and error-free build environment as far as make is concerned.
This means that passing data to a process (which is nothing other than saying that this process depends on that data) has to be done via a file; command-line arguments as make variables are an abuse of make's capabilities and lead to erroneous results. make clean is the technical remedy for a systemic misbehaviour. It wouldn't be necessary had the software engineer designed the make process properly and correctly.
The problem is that a clean build process is hard to design and maintain. BUT: in a modern software process, transient/volatile build parameters such as make all CFLAGS=-O3 never have a place anyway, as they wreck all good foundations of configuration management.
The only thing that can be criticised about make may be that it isn't the be-all and end-all solution to software building. I question whether a program with this task would have reached even one percent of make's popularity.
TL;DR
Place your compiler/linker/locator options into separate files (at a central, prominent, easy-to-maintain, logical location), decide on the level of control through the granularity of information (e.g. compiler flags in one file, linker flags in another), and write down the true dependencies for all files; voilà, you will have exactly the necessary amount of compilation and a correct build.
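A minimal sketch of that layout (the file names are only an example):
# Makefile
include make/cflags.mk      # defines CFLAGS, nothing else
include make/ldflags.mk     # defines LDFLAGS

# every object depends on the file holding the flags that shaped it
%.o: %.c make/cflags.mk
	$(CC) $(CFLAGS) -c -o $@ $<

prog: main.o util.o make/ldflags.mk
	$(CC) $(LDFLAGS) -o $@ main.o util.o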

Number of parallel build jobs in recursive make call

I have a makefile which wraps the real build in a single recursive call, in order to grab and release a license around the build, regardless of whether the build succeeds. Example:
.PHONY: main_target
main_target:
	@license_grab &
	@sleep 2
	-@$(MAKE) real_target
	@license_release
This works great, until I want to run the make in parallel. On some systems it works fine, but on other systems I run into the documented problem with the -j flag when passing options to a sub-make:
If your operating system doesn’t support the above communication, then
‘-j 1’ is always put into MAKEFLAGS instead of the value you
specified. This is because if the ‘-j’ option were passed down to
sub-makes, you would get many more jobs running in parallel than you
asked for.
Since I only ever have one recursive call, I'd really like to pass the -j flag untouched to the sub-make. I don't want to hard-code the -j flag (with a number) into the makefile, because the makefile runs on multiple machines, with differing numbers of processors. I don't want to use -j with no limit, because that launches a bunch of processes right away rather than waiting for jobs to finish. I tried using the -l flag when I build, but I found that the limit doesn't apply right away, probably because limits don't apply until make can start sampling.
Is there a way to force a sub-make to use multiple jobs? Or, a way to make the -l flag actually accomplish something?
I think I could do it using a makefile modification like -@$(MAKE) $(JOBS) real_target, and invoking make like make JOBS="-j4" main_target.
But, I'd prefer to just use standard make parameters, not adding extra dependencies on variables. I'd also prefer to modify the makefile as little as possible.
There is no way to change this behavior on systems which don't support jobserver capabilities. Something like your JOBS variable is the only way I can think of.
I'll point out a few things: first, I assume when you say other systems you mean Windows systems: that's the only commonly-used platform I'm aware of which doesn't support the jobserver. If so, note that as of GNU make version 4.0, jobserver support is implemented on Windows. So another option is to upgrade to a newer version of GNU make.
Second, the issues with the -l option were at least partly solved in GNU make version 3.81, where an extra algorithm was introduced to artificially adjust the load average based on the number of jobs make invokes. You shouldn't see this issue any longer with versions of make after that.
The problem is that for whatever reason my make does not support the job server, thus it does not pass the -j flag on in the $(MAKEFLAGS) variable. Since upgrading make to a version that does pass the flag is not an option, the solution is to pass the -j flag elsewhere. Just like the $(MAKE) variable can be abused to pass the -f flag, it can be used in the same way to pass the -j flag:
make MAKE="make -j4" main_target
This will start the main_target build with one job, but invoke make with 4 jobs on the sub-make process. Obviously, if you need a special make tool (the normal purpose of $(MAKE)), then you'll need to specify it in the MAKE string as well as in the command.

Why do we describe build procedures with Makefiles instead of shell scripts?

Remark This is a variation on the question “What is the purpose
of linking object files separately in a
Makefile?” by user4076675 taking
a slightly different point of view. See also the corresponding META
discussion.
Let us consider the classical case of a C project. The gcc compiler
is able to compile and link programs in one step. We can then easily
describe the build routine with a shell script:
case $1 in
    build) gcc -o test *.c;;
    clean) rm -f test;;
esac
# This script is intentionally very brittle, to keep
# the example simple.
However, it appears to be idiomatic to describe the build procedure
with a Makefile, involving extra steps to compile each compilation
unit to an object file and ultimately linking these files. The
corresponding GNU Makefile would be:
.PHONY: all

SOURCES=$(wildcard *.cpp)
OBJECTS=$(SOURCES:.cpp=.o)

%.o: %.cpp
	g++ -c -o $@ $<

all: default

default: $(OBJECTS)
	g++ -o test $^

clean:
	rm -rf *.o
This second solution is arguably more involved than the simple shell script we wrote before. It also has a drawback: it clutters the source directory with object files. So, why do we describe build procedures with Makefiles instead of shell scripts? Judging by the previous example, it seems to be a useless complication.
In the simple case where we compile and link three moderately sized files, any approach is likely to be equally satisfying. I will therefore consider the general case, but many benefits of using Makefiles are only important on larger projects. Once we have learned the best tool, which allows us to master complicated cases, we want to use it in simple cases as well.
Let me highlight the benefits of using make instead of a simple shell script for compilation jobs. But first, I would like to make an innocuous observation.
The procedural paradigm of shell scripts is wrong for compilation-like jobs
Writing a Makefile is similar to writing a shell script with a slight
change of perspective. In a shell script, we describe a procedural
solution to a problem: we can start to describe the whole procedure in
very abstract terms using undefined functions, and we refine this
description until we reach the most elementary level of description,
where a procedure is just a plain shell command. In a Makefile, we do
not introduce any similar abstraction, but we focus on the files we
want to produce and how we can produce them. This works well because
in UNIX, everything is a file, therefore each processing step is accomplished by a program which reads its input data from input files, does some computation, and writes the results to some output files.
If we want to compute something complicated, we have to use a lot of
input files which are treated by programs whose outputs are used as
inputs to other programs, and so on until we have produced our final
files containing our result. If we translate the plan to prepare our
final file into a bunch of procedures in a shell script, then the
current state of the processing is made implicit: the plan executor
knows “where it is at” because it is executing a given procedure,
which implicitly guarantees that such and such computations were
already done, that is, that such and such intermediary files were
already prepared. Now, which data describes “where the plan executor
is at”?
Innocuous observation The data which describes “where the plan
executor is at” is precisely the set of intermediary files which
were already prepared, and this is exactly the data which is made
explicit when we write Makefiles.
This innocuous observation is actually the conceptual difference
between shell scripts and Makefiles which explains all the advantages
of Makefiles over shell scripts in compilation jobs and similar jobs.
Of course, to fully appreciate these advantages, we have to write
correct Makefiles, which might be hard for beginners.
Make makes it easy to continue an interrupted task where it left off
When we describe a compilation job with a Makefile, we can easily interrupt it and resume it later. This is a consequence of the innocuous observation. A similar effect can only be achieved with considerable effort in a shell script, while it is built into make.
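For instance (a sketch):
$ make -j4     # interrupted part-way through
^C
$ make -j4     # resumes: objects built before the interrupt are
               # up to date and are not recompiled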
Make makes it easy to work with several builds of a project
You observed that Makefiles will clutter the source tree with object
files. But Makefiles can actually be parametrised to store these
object files in a dedicated directory. I work with BSD Owl
macros for bsdmake and use
MAKEOBJDIR='/usr/home/michael/obj${.CURDIR:S#^/usr/home/michael##}'
so that all object files end up under ~/obj and do not pollute my
sources. See this
answer
for more details.
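For readers using GNU make rather than bsdmake, a rough equivalent (a sketch, not BSD Owl itself) is to prefix object paths with a build directory:
OBJDIR ?= build
SOURCES := $(wildcard *.cpp)
OBJECTS := $(patsubst %.cpp,$(OBJDIR)/%.o,$(SOURCES))

$(OBJDIR)/%.o: %.cpp
	@mkdir -p $(@D)
	g++ -c -o $@ $<

test: $(OBJECTS)
	g++ -o $@ $^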
Advanced Makefiles allow us to have simultaneously several directories
containing several builds of a project with distinct compilation
options. For instance, with distinct features enabled, or debug
versions, etc. This is also a consequence of the innocuous observation
that Makefiles are actually articulated around the set of intermediary
files. This technique is illustrated in the testsuite of BSD Owl.
Make makes it easy to parallelise builds
We can easily build a program in parallel since this is a standard
function of many versions of make. This is also a consequence of
innocuous observation: because “where the plan executor is at” is an
explicit data in a Makefile, it is possible for make to reason about
it. Achieving a similar effect in a shell script would require a
great effort.
The parallel mode of any version of make will only work correctly if the dependencies are correctly specified. This might be quite complicated to achieve, but bsdmake has a feature which practically annihilates the problem. It is called the META mode. It uses a first, non-parallel pass of a compilation job to compute the actual dependencies by monitoring file accesses, and uses this information in later parallel builds.
Makefiles are easily extensible
Because of the special perspective used to write Makefiles (that is, as another consequence of the innocuous observation), we can easily extend them by hooking into all aspects of our build system.
For instance, if we decide that all our database I/O boilerplate code
should be written by an automatic tool, we just have to write in the
Makefile which files the automatic tool should use as inputs to write
the boilerplate code. Nothing less, nothing more. And we can add this
description pretty much where we like, make will get it
anyway. Doing such an extension in a shell script build would be
harder than necessary.
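As a sketch (all names here are hypothetical), such a hook is just one more rule:
# boilerplate source regenerated whenever the schema or the
# generator itself changes; everything depending on it follows
db_boilerplate.c: schema.sql gen_boilerplate.py
	./gen_boilerplate.py schema.sql > $@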
This ease of extension is a great incentive for Makefile code reuse.

Makefile structure....which one is running?

I am working at Ericsson R&D, analyzing their makefiles, which were written a long time ago. I run make in a directory. That directory has a makefile that does not contain any dependencies, rules, etc. It just includes:
-include ./Makefile.local
include make/def.mk
include congif.mk
When I run the makefile through make all, it prints
all
totalclean
and then enters a subdirectory of the current directory.
How can I find out where it is taking the dependencies from? Where are the rules for make all written?
The makefile also contains the names of the subdirs, and these subdirs are the directories where my make runs after totalclean.
Please help. I need to reduce the compile time of the process, and I am not getting any direction.
With GNU make you can do
make -pn
to get a list of all rules along with the file name and line number for every command.
In the output, look for the line starting with the rule name followed by a colon. One of the comments after that line will contain the filename and line number. For example:
# commands to execute (from `build/build.mk', line 45):
To find the all rule, look through those three files, Makefile.local, make/def.mk and congif.mk. If you don't see it, you could try removing the include directives from the main makefile one by one, to see when "all" stops working. You can look for the totalclean rule the same way.
Chances are the "all" message is from the all rule and the "totalclean" message from the totalclean rule (but as jcomeau_ictx points out, not every person who writes a makefile is civilized; the messages could come from anywhere and mean anything). The fact that "totalclean" comes after "all" suggests that this is recursion, not dependency.
You haven't said what, if anything, these makefiles are actually doing. If you want to reduce the compile time by tinkering with the build process, your only hope is to prevent unnecessary compilation, which means removing unnecessary dependencies in the makefiles (and perhaps unnecessary coupling in the source code).
EDIT:
Ankit, you're asking for a simple formula for reducing unnecessary dependencies in a big legacy makefile system; there simply isn't one. We don't have enough information to give you detailed directions, but I'll take a shot in the dark: it looks as if your makefiles run totalclean every time and rebuild from the ground up. This is almost always unnecessary. So look for the call to totalclean and turn it off; see if that speeds things up.
EDIT:
Now you have three problems: you're dealing with a big, badly designed makefile system, you're a makefile novice, and the managers are interfering.
Yes, use make -j .... This might speed things up and almost certainly can't do harm.
You can try to explain to the officials that if you run totalclean every time, you must then recompile everything you need, and that puts a hard lower limit on build times.
You can look for unnecessary dependencies in the makefiles. There is no easy, fast way to do this, because the machine cannot know which prerequisites are really needed. If you understand the build process for a particular target, look at the rule and judge whether each prerequisite is necessary. If you're not sure, you can remove a prerequisite from a rule, make totalclean, make the target, then make all; if the target build fails, the prerequisite was necessary; if it succeeds but the all build fails, then the prerequisite is necessary but should be in a different rule.
MAKAO may possibly be of use to you. It allows you to print a call graph of an executed make.
