Can I change gnu make parallelism factor on the fly?

I want to run my make with -j8 if I'm not using distcc, but -j40 if distcc is enabled.
If I don't figure out whether or not I can use distcc until deep in the execution of the makefile, is there a way to change the -j factor at that late date? Or do I have to make the decision in a wrapper script before I invoke make? (I really don't want to run make recursively, with a different -j factor in the sub-make).

There's no way to change the number of jobs on the fly. The jobserver is configured right at the beginning of make, and it's not possible to reconfigure it with a different size without restarting make.
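Since the job count is fixed when make starts, the decision has to be made before the invocation, for example in the wrapper script the question mentions. A minimal sketch, assuming purely for illustration that a non-empty DISTCC_HOSTS environment variable is what "distcc is enabled" means on your setup; choose_jobs is a made-up helper name, and 8 and 40 are the figures from the question:

```shell
#!/bin/sh
# choose_jobs: print the -j factor to use for this build.
# Assumption: distcc counts as "enabled" when DISTCC_HOSTS is non-empty.
choose_jobs() {
    if [ -n "${DISTCC_HOSTS:-}" ]; then
        echo 40
    else
        echo 8
    fi
}

# Intended use at the end of the wrapper (left commented out so the
# function can be sourced and tried on its own):
#   exec make -j"$(choose_jobs)" "$@"
```

If your real check is more involved (probing build hosts, say), it would replace the test inside choose_jobs; the point is only that it runs before make does.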

Related

Gnu Make: When invoking parallel make, if pre-requisites are supplied during the build, will make try to remake those?

This is an order of operations question.
Suppose I declare a list of requirements:
required:=$(patsubst %.foo,%.bar, $(shell find * -name '*.foo'))
And a rule to make those requirements:
$(required):
	./foo.py $@
Finally, I invoke the work with:
make foo -j 10
Suppose further the job is taking days and days (up to a week on this slow desktop computer).
In order to speed things up, I'd like to generate a list of commands and do some of the work on the much faster laptop. I can't do all of the work on the laptop because, for whatever reason, it can't stay up for hours and hours without discharging and suspending (if I had to guess, probably due to thermal throttling):
make -n foo > outstanding_jobs
cat outstanding_jobs | sort -r | sponge outstanding_jobs
scp slow_box:outstanding_jobs fast_laptop:outstanding_jobs
ssh fast_laptop
head -n 200 outstanding_jobs | parallel -j 12
scp *.bar slow_box:.
The question is:
If I put *.bar in the directory where the original make job was run, will make still try to do that job on the slow box?
OR do I have to halt the job on the slow box and re-invoke make to "get credit" in the make recipe for the new work that I've synced over onto the slow box?

NOTE: substantially revised.
Before it starts building anything, make constructs a dependency graph to guide it, based on an analysis of the requested goal(s), the applicable build rules, and, to some extent, the files already present. It then walks the graph, starting from the goal nodes, to determine which are out of date with respect to their prerequisites and update them.
Although it does not necessarily evaluate the whole graph before running any recipes, once it decides that a given target needs to be updated, make is committed to updating it. In particular, once make decides that some direct or indirect prerequisite of T is out of date, it is committed to (re)building T, too, regardless of any subsequent action on T by another process.
So, ...
"If I put *.bar in the directory where the original make job was run, will make still try to do that job on the slow box?"
Adding files to the build directory after make starts building things will not necessarily affect which targets the ongoing make run will attempt to build, nor which recipes it uses to build them. The nearer a target is to a root of the dependency graph, the less likely that the approach described will affect whether make performs a rebuild, especially if you're running a parallel make.
It's possible that you would see some time savings, but you must also consider the possibility that you end up with an inconsistent build.
"OR do I have to halt the job on the slow box and re-invoke make to 'get credit' in the make recipe for the new work that I've synced over onto the slow box?"
If the possibility of an inconsistent build can be discounted, then that is probably a viable option. A new make run will take the then-existing files into account. Depending on the defined rules and the applicable timestamps, it is still possible that some targets would be rebuilt that did not really need to be, but unless the makefile engages in unusual shenanigans, chances are good that at least most of the built files imported from the helper machine will be accepted and used without rebuilding.

Make uses multiple cores even without -j argument?

I've noticed on my MacBook Pro (Quad-core) that when I run make, it takes the same amount of time as make -j, and sure enough, Activity Monitor shows all four cores getting high usage. Why is this? Is there some default setting that Apple has? I mean, it would make sense for -j to be the default, but from what I've seen on the web make with no arguments should only be using one thread.
This isn't necessarily a problem, but I'd like to understand the cause nonetheless.

The -j|--jobs flag specifies/limits the number of commands that can be run simultaneously, not the number of threads to allocate to a single command. Think of this option as concurrency instead of parallelism.
For example, I can specify --jobs=2 and have both an ES6 transpiler and a SASS preprocessor running in the background, in the same terminal window, watching for any file changes I may make.

GNU Make - how to add timestamp output (with minimal makefile modification)

I want to get a better idea of my build job metrics but unfortunately, make doesn't output timestamps per se.
If I run make --print-data-base, for a given target it outputs a line
# Last modified 2016-08-15 13:53:16
but that doesn't give me the duration.
QUESTION
Is there a way to get duration of building a target without modifying each target? Some targets are inside makefiles which are generated DURING the build so not feasible to modify their recipes.
POSSIBLE SOLUTION
I could implement a pre- and post-recipe for every target and output a timestamp that way.
Is that a good idea given this is parallel make? Obviously there would be increased build time from calling a pre- and post-recipe for every target but I'd be fine with that.

If this is a parallel make, then the "preactions", "actions" and "postactions" may be interleaved. That is, you might get output like:
Pre-action 12:03:05
Pre-action 12:03:06
building foo...
building bar...
Post-action 12:04:17
Post-action 12:04:51
So it would behoove you to pass a TARGETNAME variable to the pre-action and post-action scripts.
Also, start and end times are not all there is to know about how long an action takes when you are running things in parallel; rule A might take longer than rule B simply because rule B is running alone while rule A is sharing the processor with rules C through J.
Other than that, I see no problem with this approach.
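A rough sketch of the pre-/post-action idea with the target name attached, so that interleaved lines from a parallel build can still be matched up. The helper name timed_step and the output format are made up for illustration; in a real makefile this would have to live in a script that each recipe calls, since every recipe line runs in its own shell:

```shell
#!/bin/sh
# timed_step TARGET COMMAND [ARGS...]
# Prints a timestamped line tagged with TARGET before and after running
# COMMAND, and returns COMMAND's exit status unchanged.
timed_step() {
    target=$1
    shift
    echo "Pre-action $(date +%H:%M:%S) $target"
    "$@"
    rc=$?
    echo "Post-action $(date +%H:%M:%S) $target"
    return "$rc"
}
```

A call such as timed_step foo ./build_foo.sh (with ./build_foo.sh standing in for whatever the recipe actually runs) then yields a "Pre-action ... foo" and a "Post-action ... foo" line around the build output. As noted above, the difference between the two timestamps is wall-clock time under whatever load the other jobs were creating, not the cost of the rule in isolation.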

Number of parallel build jobs in recursive make call

I have a makefile which wraps the real build in a single recursive call, in order to grab and release a license around the build, regardless of whether the build succeeds. Example:
.PHONY: main_target
main_target:
	@license_grab &
	@sleep 2
	-@$(MAKE) real_target
	@license_release
This works great, until I want to run the make in parallel. On some systems it works fine, but on other systems I run into the documented problem with the -j flag when passing options to a sub-make:
If your operating system doesn’t support the above communication, then
‘-j 1’ is always put into MAKEFLAGS instead of the value you
specified. This is because if the ‘-j’ option were passed down to
sub-makes, you would get many more jobs running in parallel than you
asked for.
Since I only ever have one recursive call, I'd really like to pass the -j flag untouched to the sub-make. I don't want to hard-code the -j flag (with a number) into the makefile, because the makefile runs on multiple machines, with differing numbers of processors. I don't want to use -j with no limit, because that launches a bunch of processes right away rather than waiting for jobs to finish. I tried using the -l flag when I build, but I found that the limit doesn't apply right away, probably because limits don't apply until make can start sampling.
Is there a way to force a sub-make to use multiple jobs? Or, a way to make the -l flag actually accomplish something?
I think I could do it using a makefile modification like using -@$(MAKE) $(JOBS) real_target, and invoking make like make JOBS="-j4" main_target.
But, I'd prefer to just use standard make parameters, not adding extra dependencies on variables. I'd also prefer to modify the makefile as little as possible.

There is no way to change this behavior on systems which don't support jobserver capabilities. Something like your JOBS variable is the only way I can think of.
I'll point out a few things: first, I assume when you say other systems you mean Windows systems: that's the only commonly-used platform I'm aware of which doesn't support the jobserver. If so, note that as of GNU make version 4.0, jobserver support is implemented on Windows. So another option is to upgrade to a newer version of GNU make.
Second, the issues with the -l option were at least partly solved in GNU make version 3.81, where an extra algorithm was introduced to artificially adjust the load average based on the number of jobs make invokes. You shouldn't see this issue any longer with versions of make after that.

The problem is that for whatever reason my make does not support the job server, thus it does not pass the -j flag on in the $(MAKEFLAGS) variable. Since upgrading make to a version that does pass the flag is not an option, the solution is to pass the -j flag elsewhere. Just like the $(MAKE) variable can be abused to pass the -f flag, it can be used in the same way to pass the -j flag:
make MAKE="make -j4" main_target
This will start the main_target build with one job, but invoke make with 4 jobs on the sub-make process. Obviously, if you need a special make tool (the normal purpose of $(MAKE)), then you'll need to specify it in the MAKE string as well as in the command.
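To avoid hard-coding the 4 on machines with different processor counts, the MAKE string can be built by the calling shell. A small sketch, assuming getconf _NPROCESSORS_ONLN is available (it is on common Linux and macOS systems, but it is not guaranteed everywhere, hence the fallback to 1); sub_make_command is a made-up helper name:

```shell
#!/bin/sh
# sub_make_command [JOBS]
# Prints the string to pass as MAKE=... so that the sub-make runs with
# JOBS parallel jobs. Without an argument, JOBS defaults to the number of
# online processors, or to 1 if that cannot be determined.
sub_make_command() {
    jobs=${1:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}
    echo "make -j$jobs"
}

# Intended use:
#   make MAKE="$(sub_make_command)" main_target
```

If you use a make binary other than plain make, that name would have to replace make inside the echoed string as well.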

How can I tell what -j option was provided to make

In Racket's build system, we have a build step that invokes a program that can run several parallel tasks at once. Since this is invoked from make, it would be nice to respect the -j option that make was originally invoked with.
However, as far as I can tell, there's no way to get the value of the -j option from inside the Makefile, or even as an environment variable in the programs that make invokes.
Is there a way to get this value, or the command line that make was invoked with, or something similar that would have the relevant information? It would be ok to have this only work in GNU make.

In make 4.2.1 finally they got MAKEFLAGS right. That is, you can have in your Makefile a target
opts:
	@echo $(MAKEFLAGS)
and making it will correctly report the value of the -j parameter.
$ make -j10 opts
-j10 --jobserver-auth=3,4
(In make 4.1 it is still broken.) Needless to say, instead of echo you can invoke a script that does proper parsing of MAKEFLAGS.
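As a sketch of what such a parsing script might look like, assuming MAKEFLAGS has the shape shown in the output above (a -jN word among the flags, as in make 4.2.1). The function name jobs_from_makeflags is made up, and reporting 1 when no -j word is present is this sketch's own choice:

```shell
#!/bin/sh
# jobs_from_makeflags "MAKEFLAGS-STRING"
# Prints N for a -jN word, "unlimited" for a bare -j, and 1 if neither
# appears before a "--" word (which make uses to separate the flags from
# command-line variable definitions).
jobs_from_makeflags() {
    for word in $1; do
        case $word in
            --) break ;;
            -j) echo unlimited; return 0 ;;
            -j[0-9]*) echo "${word#-j}"; return 0 ;;
        esac
    done
    echo 1
}
```

From a recipe, the flags would be handed over as an argument, for example by a script that calls jobs_from_makeflags "$1" and is invoked with "$(MAKEFLAGS)". The exact layout of MAKEFLAGS differs between make versions, so treat the patterns here as a starting point and check them against the make you actually run.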

Note: this answer concerns make version 3.82 and earlier. For a better answer as of version 4.2, see the answer by Dima Pasechnik.
You cannot tell what -j option was provided to make. Information about the number of jobs is not accessible in the regular way from make or its sub-processes, according to the following quote:
The top make and all its sub-make processes use a pipe to communicate with
each other to ensure that no more than N jobs are started across all makes.
(taken from the file called NEWS in the make 3.82 source code tree)
The top make process acts as a job server, handing out tokens to the sub-make processes via the pipe. It seems to be your goal to do your own parallel processing and still honor the indicated maximum number of simultaneous jobs as provided to make. In order to achieve that, you would somehow have to insert yourself into the communication via that pipe. However, this is an unnamed pipe and as far as I can see, there is no way for your own process to join the job-server mechanism.
By the way, the "preprocessed version of the flags" that you mention contain the expression --jobserver-fds=3,4 which is used to communicate information about the endpoints of the pipe between the make processes. This exposes a little bit of what is going on under the hood...
