makefile with multiple jobs -j and subdirectories - makefile

Imagine you have a makefile with subdirectories to be built using "make -C". Imagine you have 4 target directories. 3 out of 4 are done, 1 needs to run. You run the overall makefile with:
make -j 4
Is there a way to tell the makefile system to run the remaining target with make -C -j 4 instead of just 1? If two targets were missing, I would like make -C -j 2 for each one.

I'll expand on Beta's (correct) answer. All the individual make processes communicate with each other and guarantee that there will never be more than N jobs running across all the different make invocations, when you use -jN. At the same time, they always guarantee that (assuming there are at least N jobs that can possibly be run across all the make invocations), N jobs will always be running.
Suppose instead that you had 4 directories with "something to do", which somehow you could know a priori, and so instead of invoking one instance of make with -j4 and letting that make invoke the 4 submakes normally, you force each of the submakes to be invoked with -j1. Now suppose that the first directory had 10 targets out of date, the second had 5, the third had 20, and the fourth had 100 out of date targets. At first you have 4 jobs running in parallel. Then once the second directory's 5 targets are built, you only have 3 jobs running in parallel, then 2, then for the rest of the build of the fourth directory you'll have only one target being built at a time and no parallelism. That's much Less Good.
The way GNU make works, instead, all four instances of make are communicating. When the second directory is done, the jobs it was running are available to the other directories, etc. By the end of the build the fourth directory is building four jobs at a time in parallel. That's much More Good.
Maybe if you explained why you want to do this, it would be more helpful to us in constructing an answer.
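To see the sharing described above in action, here is a minimal sketch. It assumes GNU make on a system with jobserver support; the directory names a to d and the targets t1 to t3 are made up for illustration. The parent recurses through $(MAKE) and never passes -j to the sub-makes itself, yet all of them draw from the single -j4 budget.

```shell
#!/bin/sh
# Sketch: one parent makefile, four hypothetical subdirectories, one -j4 budget.
set -e
work=$(mktemp -d)
cd "$work"

# Parent makefile. The "target: ; recipe" form avoids literal tab characters.
cat > Makefile <<'EOF'
SUBDIRS := a b c d
.PHONY: all $(SUBDIRS)
all: $(SUBDIRS)
$(SUBDIRS): ; $(MAKE) -C $@
EOF

# Each subdirectory gets a tiny makefile with three independent targets.
for d in a b c d; do
    mkdir "$d"
    cat > "$d/Makefile" <<'EOF'
all: t1 t2 t3
t1 t2 t3: ; touch $@
EOF
done

# Only the top-level invocation names a job count.
make -j4 >/dev/null

# Count the files the sub-makes produced (3 targets in each of 4 directories).
built=$(ls a b c d | grep -c '^t[123]$')
echo "targets built: $built"
```

Using $(MAKE) rather than a literal make in the recipe matters here: that is how make recognises the command as a sub-make and hands the shared job slots down to it.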

Make handles that automatically. From the manual:
If you set [-j] to some numeric value ‘N’ and your operating system
supports it... the parent make and all the sub-makes will communicate
to ensure that there are only ‘N’ jobs running at the same time
between them all.


Gnu Make: When invoking parallel make, if pre-requisites are supplied during the build, will make try to remake those?

This is an order of operations question.
Suppose I declare a list of requirements:
required:=$(patsubst %.foo,%.bar, $(shell find * -name '*.foo'))
And a rule to make those requirements:
$(required):
	./foo.py $@
Finally, I invoke the work with:
make foo -j 10
Suppose further the job is taking days and days (up to a week on this slow desktop computer).
In order to speed things up, I'd like to generate a list of commands and do some of the work on the Much faster laptop. I can't do all of the work on the laptop because, for whatever reason, it can't stay up for hours and hours without discharging and suspending (if I had to guess, probably due to thermal throttling):
make -n foo > outstanding_jobs
cat outstanding_jobs | sort -r | sponge outstanding_jobs
scp slow_box:outstanding_jobs fast_laptop:outstanding_jobs
ssh fast_laptop
head -n 200 outstanding_jobs | parallel -j 12
scp *.bar slow_box:.
The question is:
If I put *.bar in the directory where the original make job was run, will make still try to do that job on the slow box?
OR do I have to halt the job on the slow box and re-invoke make to "get credit" in the make recipe for the new work that I've synced over onto the slow box?
NOTE: substantially revised.
Before it starts building anything, make constructs a dependency graph to guide it, based on an analysis of the requested goal(s), the applicable build rules, and, to some extent, the files already present. It then walks the graph, starting from the goal nodes, to determine which are out of date with respect to their prerequisites and update them.
Although it does not necessarily evaluate the whole graph before running any recipes, once it decides that a given target needs to be updated, make is committed to updating it. In particular, once make decides that some direct or indirect prerequisite of T is out of date, it is committed to (re)building T, too, regardless of any subsequent action on T by another process.
So, ...
If I put *.bar in the directory where the original make job was run,
will make still try to do that job on the slow box?
Adding files to the build directory after make starts building things will not necessarily affect which targets the ongoing make run will attempt to build, nor which recipes it uses to build them. The nearer a target is to a root of the dependency graph, the less likely that the approach described will affect whether make performs a rebuild, especially if you're running a parallel make.
It's possible that you would see some time savings, but you must also consider the possibility that you end up with an inconsistent build.
OR do I have to halt the job on the slow box and re-invoke make to "get credit" in the make recipe for the new work that I've synced over onto the slow box?
If the possibility of an inconsistent build can be discounted, then that is probably a viable option. A new make run will take the then-existing files into account. Depending on the defined rules and the applicable timestamps, it is still possible that some targets would be rebuilt that did not really need to be, but unless the makefile engages in unusual shenanigans, chances are good that at least most of the built files imported from the helper machine will be accepted and used without rebuilding.
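As a rough illustration of the "fresh run takes existing files into account" point, the sketch below uses a dry run to list what a newly started make would still build. The pattern rule with a .foo prerequisite and the cp recipe are assumptions made for the example (the makefile in the question has no prerequisites, so there mere existence of the .bar file would be enough).

```shell
#!/bin/sh
# Sketch: which recipes would a fresh make still run after files were copied in?
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical rule: each .bar is built from the matching .foo.
cat > Makefile <<'EOF'
required := $(patsubst %.foo,%.bar,$(wildcard *.foo))
all: $(required)
%.bar: %.foo ; cp $< $@
EOF

touch x.foo y.foo
sleep 1          # give the "imported" result a strictly newer timestamp
touch y.bar      # stands in for a result copied over from the helper machine

# Dry run: print the commands that are still outstanding, without running them.
pending=$(make -n all)
echo "$pending"   # prints: cp x.foo x.bar
```

If the copied files arrive with timestamps older than their prerequisites (for example, when a copy preserves the original modification times and the sources were touched later), they would show up in this list again.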

Number of parallel build jobs in recursive make call

I have a makefile which wraps the real build in a single recursive call, in order to grab and release a license around the build, regardless of whether the build succeeds. Example:
.PHONY: main_target
main_target:
	@license_grab &
	@sleep 2
	-@$(MAKE) real_target
	@license_release
This works great, until I want to run the make in parallel. On some systems it works fine, but on other systems I run into the documented problem with the -j flag when passing options to a sub-make:
If your operating system doesn’t support the above communication, then
‘-j 1’ is always put into MAKEFLAGS instead of the value you
specified. This is because if the ‘-j’ option were passed down to
sub-makes, you would get many more jobs running in parallel than you
asked for.
Since I only ever have one recursive call, I'd really like to pass the -j flag untouched to the sub-make. I don't want to hard-code the -j flag (with a number) into the makefile, because the makefile runs on multiple machines, with differing numbers of processors. I don't want to use -j with no limit, because that launches a bunch of processes right away rather than waiting for jobs to finish. I tried using the -l flag when I build, but I found that the limit doesn't apply right away, probably because limits don't apply until make can start sampling.
Is there a way to force a sub-make to use multiple jobs? Or, a way to make the -l flag actually accomplish something?
I think I could do it with a makefile modification, such as using -@$(MAKE) $(JOBS) real_target and invoking make like make JOBS="-j4" main_target.
But, I'd prefer to just use standard make parameters, not adding extra dependencies on variables. I'd also prefer to modify the makefile as little as possible.
There is no way to change this behavior on systems which don't support jobserver capabilities. Something like your JOBS variable is the only way I can think of.
I'll point out a few things: first, I assume when you say other systems you mean Windows systems: that's the only commonly-used platform I'm aware of which doesn't support the jobserver. If so, note that as of GNU make version 4.0, jobserver support is implemented on Windows. So another option is to upgrade to a newer version of GNU make.
Second, the issues with the -l option were at least partly solved in GNU make version 3.81, where an extra algorithm was introduced to artificially adjust the load average based on the number of jobs make invokes. You shouldn't see this issue any longer with versions of make after that.
The problem is that for whatever reason my make does not support the job server, thus it does not pass the -j flag on in the $(MAKEFLAGS) variable. Since upgrading make to a version that does pass the flag is not an option, the solution is to pass the -j flag elsewhere. Just like the $(MAKE) variable can be abused to pass the -f flag, it can be used in the same way to pass the -j flag:
make MAKE="make -j4" main_target
This will start the main_target build with one job, but invoke make with 4 jobs on the sub-make process. Obviously, if you need a special make tool (the normal purpose of $(MAKE)), then you'll need to specify it in the MAKE string as well as in the command.
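For comparison, the JOBS variant mentioned in the question could look roughly like this. It is only a sketch: the echo lines stand in for license_grab and license_release, part1 and part2 are hypothetical build steps, and the script checks the order of the wrapper steps, not the degree of parallelism.

```shell
#!/bin/sh
# Sketch: pass a job count to the single sub-make through a JOBS variable.
set -e
work=$(mktemp -d)
cd "$work"

cat > Makefile <<'EOF'
# JOBS is empty by default, so a plain "make main_target" still works.
JOBS ?=
.PHONY: main_target real_target part1 part2
# Remember the sub-make's exit status so the release step always runs.
main_target: ; @echo grab >> log; $(MAKE) $(JOBS) real_target; rc=$$?; echo release >> log; exit $$rc
real_target: part1 part2 ; @echo build-done >> log
part1 part2: ; @echo $@ >> log
EOF

# The top-level make runs serially; only the sub-make receives -j2.
make JOBS=-j2 main_target >/dev/null

first=$(head -n 1 log)
last=$(tail -n 1 log)
echo "first step: $first, last step: $last"
```

The MAKE="make -j4" override from the answer above achieves the same thing without touching the makefile, at the cost of replacing whatever $(MAKE) would otherwise expand to.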

How can I cause make -j to produce nice output?

I have a large project that is built using make. Because of the size of the project and the way the dependencies are organized, there's a real benefit to building in parallel using make -j. However, the output (that is, the logs and errors messages) that make -j produces is all mixed up, because all of the parallel tasks write to stdout at the same time.
How can I tell make to organize the output nicely? Ideally, I'd like it to buffer the logs from each task separately, and then output them in order as they complete. Is there any standard method of doing this?
You can use the -O or --output-sync command line options:
-O[type], --output-sync[=type]
When running multiple jobs in parallel with -j, ensure the output of each job is collected together rather than interspersed with output from other jobs. If type is not specified or is target the output from the entire recipe for each target is grouped together. If type is line the output from each command line within a recipe is grouped together. If type is recurse output from an entire recursive make is grouped together. If type is none output synchronization is disabled.
The online manual has more information.
(Note that you need GNU Make 4.0 for this to work.)
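A small experiment shows the grouping. This is a sketch that assumes GNU make 4.0 or later; the targets a and b and their sleep times are invented so that the two recipes overlap in time.

```shell
#!/bin/sh
# Sketch: two overlapping recipes, with output grouped per target by -O.
set -e
work=$(mktemp -d)
cd "$work"

# Target a prints, waits, prints again; target b prints both lines in between.
cat > Makefile <<'EOF'
.PHONY: all a b
all: a b
a: ; @echo a1; sleep 2; echo a2
b: ; @sleep 1; echo b1; echo b2
EOF

# With -Otarget each recipe's output is held back until that target finishes.
# paste joins the lines with spaces so the order is easy to read.
grouped=$(make -j2 -Otarget --no-print-directory | paste -sd' ' -)
echo "$grouped"
```

Running the same makefile with -j2 but without -O should interleave the lines of a and b, which is the behaviour the question complains about.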

Get current job number in makefile

Is there a way to get current job number to use in makefile rule?
Let me give you a little context.
I am using a tool which runs on multiple files. Naturally I use parallel jobs to speed things up. The real issue here is that this tool spawns multiple threads and I would like them to run on a single core, since my tests show it runs faster that way.
I need the job number to set process affinity to a CPU core.
There is no way to get a "job number" because make doesn't track a job number. Basically all the instances of make in a single build share a list of identical tokens. When make wants to start a job it obtains a token. When make is finished with a job it adds the token back to the list. If it tries to get a token and one is not available it will sleep until one becomes available. There's no distinguishing characteristic to the tokens so there's no way to have a "job number".
To learn more about how GNU make handles parallel builds, you can read http://make.mad-scientist.net/jobserver.html
I'm not quite sure how this helps you anyway. Make doesn't know anything about threads, it only starts processes. If a single process consists of multiple threads, make will still think of it as a single job.
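The token idea can be imitated in plain shell, which may make it clearer why there is nothing like a job number to read. The following is a simulation only, loosely modelled on the description above; it is not make's actual implementation, and the FIFO, the log file and the job function are all invented for the sketch.

```shell
#!/bin/sh
# Simulation (not GNU make itself): a pool of identical tokens in a FIFO.
set -e
work=$(mktemp -d)
log="$work/events"
mkfifo "$work/tokens"
exec 3<>"$work/tokens"   # open read/write so the open itself does not block
printf 'xx' >&3          # two identical tokens: at most two jobs at a time

job() {
    dd bs=1 count=1 <&3 >/dev/null 2>&1   # take a token; blocks while none is free
    echo + >> "$log"                      # record "job started"
    sleep 1                               # stand-in for the real work
    echo - >> "$log"                      # record "job finished"
    printf 'x' >&3                        # put the token back
}

# Start five jobs at once; the token pool throttles them.
for i in 1 2 3 4 5; do
    job &
done
wait

# Replay the event log to find the highest number of jobs running together.
peak=$(awk '$1 == "+" { n++ } $1 == "-" { n-- } n > max { max = n } END { print max + 0 }' "$log")
finished=$(grep -c '^-$' "$log")
echo "jobs finished: $finished, peak concurrency: $peak"
```

Every token is the same single byte, so a job that obtains one learns only that it may run, not which slot it occupies.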
EDIT:
Assuming that you are in a single, non-recursive invocation of make you can do it like this:
COUNT :=
%.foo :
	$(eval COUNT += x)
	@echo "$@: Number of rules run is $(words $(COUNT))"
all: a.foo b.foo c.foo d.foo

Trouble with parallel make not always starting another job when one finishes

I'm working on a system with four logical CPUs (two dual-core CPUs if it matters). I'm using make to parallelize twelve trivially parallelizable tasks and doing it from cron.
The invocation looks like:
make -k -j 4 -l 3.99 -C [dir] [12 targets]
The trouble I'm running into is that sometimes one job will finish but the next one won't start up, even though it shouldn't be stopped by the load average limiter. Each target takes about four hours to complete and I'm wondering if this might be part of the problem.
Edit: Sometimes a target does fail but I use the -k option to have the rest of the make still run. I haven't noticed any correlation with jobs failing and the next job not starting.
I'd drop the '-l'
If all you plan to run on the system is this build, I think the -j 4 does what you want.
Based on my memory, if you have anything else running (crond?), that can push the load average over 4.
GNU make ref
Does make think one of the targets is failing? If so, it will stop the make after the running jobs finish. You can use -k to tell it to continue even if an error occurs.
@BCS
I'm 99.9% sure that the -l isn't causing the problem because I can watch the load average on the machine and it drops down to about three and sometimes as low as one (!) without starting the next job.
