How can I tell what -j option was provided to make

In Racket's build system, we have a build step that invokes a program that can run several parallel tasks at once. Since this is invoked from make, it would be nice to respect the -j option that make was originally invoked with.
However, as far as I can tell, there's no way to get the value of the -j option from inside the Makefile, or even as an environment variable in the programs that make invokes.
Is there a way to get this value, or the command line that make was invoked with, or something similar that would have the relevant information? It would be ok to have this only work in GNU make.

In make 4.2.1 they finally got MAKEFLAGS right. That is, you can have a target in your Makefile such as
opts:
	@echo $(MAKEFLAGS)
and building it will report the value of the -j parameter correctly:
$ make -j10 opts
-j10 --jobserver-auth=3,4
(In make 4.1 it is still broken.) Needless to say, instead of echo you can invoke a script that does proper parsing of MAKEFLAGS.
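For example, here is a minimal sketch of such a target, assuming GNU make 4.2 or later so that the literal -jN value really does appear in MAKEFLAGS; the parsing uses make's own filter and patsubst functions inside the recipe:
jobs:
	@echo "full MAKEFLAGS: $(MAKEFLAGS)"
	@echo "requested jobs: $(patsubst -j%,%,$(filter -j%,$(MAKEFLAGS)))"
If -j was given without a number, MAKEFLAGS only contains a bare -j, so a real script should treat an empty result as "no limit".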

Note: this answer concerns make version 3.82 and earlier. For a better answer as of version 4.2, see the answer by Dima Pasechnik.
You cannot tell what -j option was provided to make. Information about the number of jobs is not accessible by regular means from make or its sub-processes, according to the following quote:
The top make and all its sub-make processes use a pipe to communicate with
each other to ensure that no more than N jobs are started across all makes.
(taken from the file called NEWS in the make 3.82 source code tree)
The top make process acts as a job server, handing out tokens to the sub-make processes via the pipe. It seems to be your goal to do your own parallel processing and still honor the indicated maximum number of simultaneous jobs as provided to make. In order to achieve that, you would somehow have to insert yourself into the communication via that pipe. However, this is an unnamed pipe and as far as I can see, there is no way for your own process to join the job-server mechanism.
By the way, the "preprocessed version of the flags" that you mention contains the expression --jobserver-fds=3,4, which is used to communicate information about the endpoints of the pipe between the make processes. This exposes a little bit of what is going on under the hood...
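For completeness: later GNU make releases do document a way for external tools to take part in this protocol (this does not apply to the 3.82-era behaviour discussed above). A recipe line prefixed with + inherits the pipe's file descriptors, and the tool may read one byte from the read end for each extra job it wants to run, writing the byte back when the job finishes. A rough bash sketch, assuming a make version that advertises --jobserver-auth=R,W in MAKEFLAGS (older versions used --jobserver-fds=R,W):
#!/bin/bash
# Sketch only: acquire and release one jobserver token. This works only when
# the recipe running this script is prefixed with '+', so that the descriptors
# named in --jobserver-auth=R,W are actually inherited.
auth=$(printf '%s\n' "$MAKEFLAGS" | tr ' ' '\n' | sed -n 's/^--jobserver-auth=//p')
R=${auth%,*}
W=${auth#*,}
token=""
if [ -n "$auth" ]; then
    # We already own one implicit job slot; read one byte to get a second one.
    IFS= read -r -n 1 -u "$R" token || token=""
fi
# ... run up to two tasks in parallel here ...
# Return the borrowed token so other jobs can use it.
[ -n "$token" ] && printf '%s' "$token" >&$W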

Related

Can GNU makefile rules have processes as requirements, and if so, how?

At some step of my software build automation, which I am trying to implement with GNU make Makefiles, I run into a case where a target's prerequisites are not only source files: as a different kind of requirement, I would like the target to depend on another piece of software having been started, and hence existing as an operating system process.
Such a program could be a background process, but also a foreground process such as a web browser running an HTML5 application, which might play a role in the build by, for instance, interacting with files that are fed to it by the build process.
I would hence like to write a rule somewhat like this:
.PHONY: firefoxprocess

Html5DataResultFile: HTML5DataSourceFile firefoxprocess
	cp HTML5DataSourceFile folder/checked/by/html5app/
	waitforHtml5DataResultFile

firefoxprocess:
	/usr/bin/firefox file://url/to/html5app &
As you can see, I have taken the idea that .PHONY targets are non-file targets, and hence might allow requiring that a process be started.
Yet I am unsure whether that is right. The documentation of GNU make is excellent and quite large, and I am not sure I have understood it completely. To the best of my knowledge it does not really cover the use of processes in rules, which motivates this question.
My feeling has been that pidfiles are something of a link between processes and files, but they come with several problems (e.g. race conditions, uniqueness, etc.).
Sometimes a Makefile dependency tree includes elements that aren't naturally or necessarily time-dependent files. There are two answers:
create a file to represent the step, or
just do the work "in line" as part of the step.
The second option is usually easiest. For instance, if a target file is to be created in a directory that might not exist yet, you don't want to make the directory name itself a dependency, because that would cause the file to be out of date whenever the directory changed. Instead, I do:
d/foo:
	@test -d d || mkdir -p d
	...
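A related sketch, specific to GNU make, uses an order-only prerequisite instead, so the directory is created when missing but its timestamp never makes the target out of date (the ... recipe bodies are placeholders, as above):
d/foo: | d
	...

d:
	mkdir -p d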
In your case, you could do something similar; you just need a way to test for a running instance of Firefox, and a way to start it. Something like this might do:
Html5DataResultFile: HTML5DataSourceFile
	pgrep firefox || { /usr/bin/firefox & sleep 5; }
	cp HTML5DataSourceFile folder/checked/by/html5app/
	waitforHtml5DataResultFile
The sleep call just lets FF initialize, because it might not be ready to do anything the instant it returns.
The problem with option #1 in your case is that it's undependable and a little circular. Firefox won't reliably remove the pidfile if the process dies. If it does successfully remove the file when it exits, and re-creates it when it restarts, you have a new problem: the timestamp on the file spuriously defines any dependencies as out of date, when in fact the restarted process hasn't invalidated them.

Number of parallel build jobs in recursive make call

I have a makefile which wraps the real build in a single recursive call, in order to grab and release a license around the build, regardless of whether the build succeeds. Example:
.PHONY: main_target
main_target:
	@license_grab &
	@sleep 2
	-@$(MAKE) real_target
	@license_release
This works great, until I want to run the make in parallel. On some systems it works fine, but on other systems I run into the documented problem with the -j flag when passing options to a sub-make:
If your operating system doesn’t support the above communication, then
‘-j 1’ is always put into MAKEFLAGS instead of the value you
specified. This is because if the ‘-j’ option were passed down to
sub-makes, you would get many more jobs running in parallel than you
asked for.
Since I only ever have one recursive call, I'd really like to pass the -j flag untouched to the sub-make. I don't want to hard-code the -j flag (with a number) into the makefile, because the makefile runs on multiple machines, with differing numbers of processors. I don't want to use -j with no limit, because that launches a bunch of processes right away rather than waiting for jobs to finish. I tried using the -l flag when I build, but I found that the limit doesn't apply right away, probably because limits don't apply until make can start sampling.
Is there a way to force a sub-make to use multiple jobs? Or, a way to make the -l flag actually accomplish something?
I think I could do it with a makefile modification like -@$(MAKE) $(JOBS) real_target, and by invoking make as make JOBS="-j4" main_target.
But, I'd prefer to just use standard make parameters, not adding extra dependencies on variables. I'd also prefer to modify the makefile as little as possible.
There is no way to change this behavior on systems which don't support jobserver capabilities. Something like your JOBS variable is the only way I can think of.
I'll point out a few things: first, I assume when you say other systems you mean Windows systems: that's the only commonly-used platform I'm aware of which doesn't support the jobserver. If so, note that as of GNU make version 4.0, jobserver support is implemented on Windows. So another option is to upgrade to a newer version of GNU make.
Second, the issues with the -l option were at least partly solved in GNU make version 3.81, where an extra algorithm was introduced to artificially adjust the load average based on the number of jobs make invokes. You shouldn't see this issue any longer with versions of make after that.
The problem is that for whatever reason my make does not support the job server, thus it does not pass the -j flag on in the $(MAKEFLAGS) variable. Since upgrading make to a version that does pass the flag is not an option, the solution is to pass the -j flag elsewhere. Just like the $(MAKE) variable can be abused to pass the -f flag, it can be used in the same way to pass the -j flag:
make MAKE="make -j4" main_target
This will start the main_target build with one job, but invoke make with 4 jobs in the sub-make process. Obviously, if you need a special make tool (the normal purpose of $(MAKE)), then you'll need to specify it in the MAKE string as well as on the command line.
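If the objection to a JOBS variable is only that the right number differs per machine, a hedged variant (GNU make, assuming the nproc command is available) derives a default inside the wrapper makefile while still allowing a command-line override:
# Sketch only: default to one job per processor; "make JOBS=-j4 main_target" still overrides it.
NPROC := $(shell nproc 2>/dev/null || echo 1)
JOBS ?= -j$(NPROC)

.PHONY: main_target
main_target:
	@license_grab &
	@sleep 2
	-@$(MAKE) $(JOBS) real_target
	@license_release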

How can I cause make -j to produce nice output?

I have a large project that is built using make. Because of the size of the project and the way the dependencies are organized, there's a real benefit to building in parallel using make -j. However, the output (that is, the logs and error messages) that make -j produces is all mixed up, because all of the parallel tasks write to stdout at the same time.
How can I tell make to organize the output nicely? Ideally, I'd like it to buffer the logs from each task separately, and then output them in order as they complete. Is there any standard method of doing this?
You can use the -O or --output-sync command line options:
-O[type], --output-sync[=type]
When running multiple jobs in parallel with -j, ensure the output of each job is collected together rather than interspersed with output from other jobs. If type is not specified or is target the output from the entire recipe for each target is grouped together. If type is line the output from each command line within a recipe is grouped together. If type is recurse output from an entire recursive make is grouped together. If type is none output synchronization is disabled.
The online manual has more information.
(Note that you need GNU Make 4.0 for this to work.)
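For instance, to group the output per target while building with eight jobs:
$ make -j8 --output-sync=target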

Why do we describe build procedures with Makefiles instead of shell scripts?

Remark This is a variation on the question “What is the purpose
of linking object files separately in a
Makefile?” by user4076675 taking
a slightly different point of view. See also the corresponding META
discussion.
Let us consider the classical case of a C project. The gcc compiler
is able to compile and link programs in one step. We can then easily
describe the build routine with a shell script:
case $1 in
build) gcc -o test *.c;;
clean) rm -f test;;
esac
# This script is intentionally very brittle, to keep
# the example simple.
However, it appears to be idiomatic to describe the build procedure
with a Makefile, involving extra steps to compile each compilation
unit to an object file and ultimately linking these files. The
corresponding GNU Makefile would be:
.PHONY: all
SOURCES=$(wildcard *.c)
OBJECTS=$(SOURCES:.c=.o)

%.o: %.c
	gcc -c -o $@ $<

all: default

default: $(OBJECTS)
	gcc -o test $^

clean:
	rm -rf *.o
This second solution is arguably more involved than the simple shell script we wrote before. It also has a drawback, as it clutters the source directory with object files. So, why do we describe build procedures with Makefiles instead of shell scripts? Judging by the previous example, it seems to be a useless complication.
In the simple case where we compile and link three moderately sized files, either approach is likely to be equally satisfying. I will therefore consider the general case, keeping in mind that many benefits of using Makefiles only become important on larger projects. Once we have learned the best tool for mastering complicated cases, we want to use it in simple cases as well.
Let me highlight the benefits of using make instead of a simple shell script for compilation jobs. But first, I would like to make an innocuous observation.
The procedural paradigm of shell scripts is wrong for compilation-like jobs
Writing a Makefile is similar to writing a shell script with a slight change of perspective. In a shell script, we describe a procedural solution to a problem: we can start by describing the whole procedure in very abstract terms using undefined functions, and we refine this description until we reach the most elementary level, where a procedure is just a plain shell command. In a Makefile, we do not introduce any similar abstraction; instead, we focus on the files we want to produce and on how we can produce them. This works well because in UNIX everything is a file, so each treatment is accomplished by a program which reads its input data from input files, does some computation, and writes the results to some output files.
If we want to compute something complicated, we have to use a lot of
input files which are treated by programs whose outputs are used as
inputs to other programs, and so on until we have produced our final
files containing our result. If we translate the plan to prepare our
final file into a bunch of procedures in a shell script, then the
current state of the processing is made implicit: the plan executor
knows “where it is at” because it is executing a given procedure,
which implicitly guarantees that such and such computations were
already done, that is, that such and such intermediary files were
already prepared. Now, which data describes “where the plan executor
is at”?
Innocuous observation The data which describes “where the plan
executor is at” is precisely the set of intermediary files which
were already prepared, and this is exactly the data which is made
explicit when we write Makefiles.
This innocuous observation is actually the conceptual difference
between shell scripts and Makefiles which explains all the advantages
of Makefiles over shell scripts in compilation jobs and similar jobs.
Of course, to fully appreciate these advantages, we have to write
correct Makefiles, which might be hard for beginners.
Make makes it easy to continue an interrupted task where it was at
When we describe a compilation job with a Makefile, we can easily
interrupt it and resume it later. This is a consequence of the
innocuous observation. A similar effect can only be achieved with considerable effort in a shell script, whereas it comes built into make.
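A hypothetical session with a Makefile like the one in the question illustrates this (the object file names are invented for the example):
$ make all     # interrupted with Ctrl-C after foo.o and bar.o were produced
^C
$ make all     # resumes: only the remaining object files and the final link are redone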
Make makes it easy to work with several builds of a project
You observed that Makefiles will clutter the source tree with object
files. But Makefiles can actually be parametrised to store these
object files in a dedicated directory. I work with BSD Owl
macros for bsdmake and use
MAKEOBJDIR='/usr/home/michael/obj${.CURDIR:S#^/usr/home/michael##}'
so that all object files end under ~/obj and do not pollute my
sources. See this answer for more details.
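For plain GNU make, without BSD Owl, a comparable sketch keeps the objects under an obj/ subdirectory; the directory name and layout here are only assumptions for illustration:
# Sketch only: compile the C example above with object files kept under obj/.
OBJDIR := obj
SOURCES := $(wildcard *.c)
OBJECTS := $(patsubst %.c,$(OBJDIR)/%.o,$(SOURCES))

$(OBJDIR)/%.o: %.c | $(OBJDIR)
	gcc -c -o $@ $<

$(OBJDIR):
	mkdir -p $@

test: $(OBJECTS)
	gcc -o $@ $^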
Advanced Makefiles allow us to have simultaneously several directories
containing several builds of a project with distinct compilation
options. For instance, with distinct features enabled, or debug
versions, etc. This is also a consequence of the innocuous observation
that Makefiles are actually articulated around the set of intermediary
files. This technique is illustrated in the testsuite of BSD Owl.
Make makes it easy to parallelise builds
We can easily build a program in parallel, since this is a standard feature of many versions of make. This is also a consequence of the innocuous observation: because “where the plan executor is at” is explicit data in a Makefile, it is possible for make to reason about it. Achieving a similar effect in a shell script would require a great effort.
The parallel mode of any version of make will only work correctly if the dependencies are correctly specified. This might be quite complicated to achieve, but bsdmake has a feature which literally annihilates the problem. It is called the META mode. It uses a first, non-parallel pass of a compilation job to compute actual dependencies by monitoring file access, and uses this information in later parallel builds.
Makefiles are easily extensible
Because of the special perspective used to write Makefiles (that is, as another consequence of the innocuous observation), we can easily extend them by hooking into all aspects of our build system.
For instance, if we decide that all our database I/O boilerplate code should be written by an automatic tool, we just have to state in the Makefile which files the automatic tool should use as inputs to write the boilerplate code. Nothing less, nothing more. And we can add this description pretty much wherever we like; make will pick it up anyway. Doing such an extension in a shell-script build would be harder than necessary.
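As an illustration, a hypothetical generator can be hooked in with a single extra rule (the names dbgen and schema.sql are invented here); the generated file is then compiled like any other unit:
# Sketch only: dbgen and schema.sql are hypothetical names.
db_io.c: schema.sql
	dbgen -o db_io.c schema.sql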
This extensibility ease is a great incentive for Makefile code reuse.

Can I change gnu make parallelism factor on the fly?

I want to run my make with -j8 if I'm not using distcc, but -j40 if distcc is enabled.
If I don't figure out whether or not I can use distcc until deep in the execution of the makefile, is there a way to change the -j factor at that late date? Or do I have to make the decision in a wrapper script before I invoke make? (I really don't want to run make recursively, with a different -j factor in the sub-make).
There's no way to change the number of jobs on the fly. The jobserver is configured right at the beginning of make, and it's not possible to reconfigure it with a different size without restarting make.
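The decision therefore has to be made before make starts, for example in a wrapper script along these lines (assuming, unlike the situation described in the question, that distcc availability can be probed up front):
#!/bin/sh
# Sketch only: choose the -j factor before invoking make.
if command -v distcc >/dev/null 2>&1; then
    exec make -j40 "$@"
else
    exec make -j8 "$@"
fi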
