Shorter question:
Make targets have files as dependencies; let's say one example dependency is the file "D." I would like Make to traverse its dependency graph, and for each "D," also depend on success being recorded in a log file of "D's" recipe's exit status ("D.status.log"; for simplicity's sake, just includes process exit status or the string "Started"). Is this possible without digging into Make's source myself and modifying the graph logic? (I.e. has somebody already written this as a patch or another Make-like utility?)
Details:
I am a fan in spirit of using Makefiles to run data processing workflows. I am not alone, as searching for "makefile data" yields a few like-minded folks:
http://www.bioinformaticszen.com/post/decomplected-workflows-makefiles/
http://bost.ocks.org/mike/make/
However, in practice, I find it a glorious pain in the neck. Multi-step processes generate output from programs that don't necessarily finish. Running a multi-step workflow on thousands of input files means cobbling together some find ... rm commands, which feels like a fragile data management strategy.
Basically, I'd like a well-logged Make for data that has this style of interface: I'll call it fantasymake below.
Makefile:
all: results1 results2
results1: script input1
	script input1 >results1
results2: script input2
	script input2 >results2
results2beyond: script results2
	script results2 >results2beyond
Example directory tree before:
Makefile
input1
input2
Directory after running fantasymake:
Makefile
input1
input2
results1
results1.err.log
results1.out.log
results1.status.log
results2
results2.err.log
results2.out.log
results2.status.log
results2beyond
results2beyond.err.log
results2beyond.out.log
results2beyond.status.log
Presently, I could get the logs with this bit of Bash, but I haven't found a graceful way to integrate these wrapper commands into Makefile rules:
echo Started. >results.status.log
some_program >results.out.log 2>results.err.log
echo $? >results.status.log
(Recall that each recipe line not joined by a backslash runs in its own shell, so an in-Makefile wrapper would need a continuation line (backslash) between some_program ... and echo $$? to make sure both run in the same shell.)
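For illustration, the wrapper above could be packaged as a small shell function. This is only a sketch: the name run_logged and the temporary directory are hypothetical, and it assumes a POSIX-like sh.

```shell
# Hypothetical helper: run a command for TARGET and record its logs.
# Usage: run_logged TARGET COMMAND [ARGS...]
run_logged() {
    target=$1
    shift
    echo "Started" >"$target.status.log"
    # Capture the exit status without aborting under `set -e`.
    rc=0
    "$@" >"$target.out.log" 2>"$target.err.log" || rc=$?
    # Overwrite "Started" with the final exit status.
    echo "$rc" >"$target.status.log"
    return "$rc"
}

workdir=$(mktemp -d)
run_logged "$workdir/results1" sh -c 'echo data; echo warning >&2'
run_logged "$workdir/results2" sh -c 'exit 3' || echo "results2 failed"
cat "$workdir/results1.status.log" "$workdir/results2.status.log"
```

A status log that still says "Started" would then indicate a step that was interrupted before it could record an exit status.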
Back to the fantasymake behaviors, this would be the directory after running fantasymake clean:
Makefile
input1
input2
Suppose that while running fantasymake, results2 failed or was terminated (and suppose we didn't run fantasymake clean). Then results2beyond would not get generated. Here's where I don't think I can rely on unmodified Make: results2.status.log records that results2 failed, so fantasymake would not proceed to results2beyond on the next invocation.
To get the build to finish, a clean-failed rule could sweep away erroneous results. You may need this if you have, say, a database dependency (or live connection) that was easier to leave out of Make. Here's what the directory would look like after running fantasymake clean-failed instead of fantasymake clean:
Makefile
input1
input2
results1
results1.err.log
results1.out.log
results1.status.log
Suppose after running fantasymake clean-failed, script is updated. Then running fantasymake would regenerate results1 and its logs alongside results2.
From glancing at Wikipedia (List of build automation software), it looks like none of makepp, omake, or cmake do the trick. The list on that page (I lack the reputation to link anymore) is a bit lengthy, so I turn to this lovely crowd that has helped lurking me many times already.
Is this an extension I'd have to hack together, or does it already exist?
For the wrappers, this is trivial if you use GNU make. Just use a user-defined function:
TARGETS = one two three
# Invoke this with $(call LOG,<cmdline>)
# The command and the status echo are joined so $$? sees the command's status.
define LOG
	echo "$$(date): Started." >'$@'.status.log
	($1) >'$@'.out.log 2>'$@'.err.log; \
	status=$$?; \
	echo "$$(date): Completed: $$status" >>'$@'.status.log; \
	exit $$status
endef
all: $(TARGETS)
$(TARGETS):
	$(call LOG, echo "$@ out"; echo "$@ error" 1>&2)
I'm not really sure what exactly you're trying to accomplish with the "clean" stuff. If you just want a target clean-failed that will remove the logs for any target which doesn't exist, that's simple enough:
TARGETS = one two three
clean-failed:
	for t in $(TARGETS); do [ -f "$$t" ] || rm -f "$$t".*.log; done
The rest of your requirements sound, to me, like standard make functionality.
I think you can achieve this with regular make; you just have to be a bit smarter about how you set up your rules. Specifically, don't put your results file in place until you are sure it is complete and consistent. Change your makefile like this:
all: results1 results2
results1: script input1
	script input1 >results1.tmp && mv results1.tmp results1
results2: script input2
	script input2 >results2.tmp && mv results2.tmp results2
results2beyond: script results2
	script results2 >results2beyond.tmp && mv results2beyond.tmp results2beyond
Now if the power dies or your disk fills up or something like that, the workflow will pick up wherever it left off. Any result files that exist are guaranteed to be complete and consistent, because the shell will not execute the mv command unless the previous command finished successfully.
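The write-then-rename idea can be tried outside make. The following is a minimal sketch under the assumption of a POSIX-like sh; the helper name atomic_write and the file names are made up for the demonstration.

```shell
# Hypothetical helper: publish RESULT only if the command succeeds.
# Usage: atomic_write RESULT COMMAND [ARGS...]
atomic_write() {
    result=$1
    shift
    # Same shape as the recipes above: write to .tmp, rename on success.
    "$@" >"$result.tmp" && mv "$result.tmp" "$result"
}

workdir=$(mktemp -d)
atomic_write "$workdir/good" echo "complete"
atomic_write "$workdir/bad" sh -c 'echo partial; exit 1' || echo "bad was not published"
ls "$workdir"
```

After the failed run only the .tmp file is left behind, so a later make would still see the real target as missing and rebuild it.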
UPDATE:
If you're using GNU make you can simplify the makefile somewhat:
PROCESS=script $< > $@.tmp && mv $@.tmp $@
all: results1 results2
results%: input% script
	$(PROCESS)
results2beyond: results2 script
	$(PROCESS)
Depending on how determined you are, you can probably simplify this even more, but that's left as an exercise for the reader.
Related
I have a number of makefiles that build and run tests. I would like to create a script that makes each one and notes whether the tests passed or failed. Though I can determine test status within each make file, I am having trouble finding a way to communicate that status to the caller of the make command.
My first thought is to somehow affect the return value of the make command, though this does not seem possible. Can I do this? Is there some other form of communication I can use to express the test status to the bash script that will be calling make? Perhaps by using environment variables?
Thanks
Edit: It seems that I cannot set the return code for make, so for the time being I will have to make the tests, run them in the calling script instead of the makefile, note the results, and then manually run a make clean. I appreciate everyone's assistance.
Make will only return one of the following, according to the source:
#define MAKE_SUCCESS 0
#define MAKE_TROUBLE 1
#define MAKE_FAILURE 2
MAKE_SUCCESS and MAKE_FAILURE should be self-explanatory; MAKE_TROUBLE is only returned when running make with the -q option.
That's pretty much all you get from make, there doesn't seem to be any way to set the return code.
The default behavior of make is to return failure and abandon any remaining targets if something failed.
for directory in */; do
if ( cd "$directory" && make ); then
echo "$0: Make in $directory succeeded" >&2
else
echo "$0: Make in $directory failed" >&2
fi
done
Simply ensure each test leaves its result in a file unique to that test. The least friction will be to create test.pass if the test passes, and otherwise create test.fail. At the end of the test run, gather up all the files and generate a report.
This scheme has two advantages that I can see:
You can run the tests in parallel (You do use the -jn flag, don't you? (hint: it's the whole point of make))
You can use the result files to record whether the test needs to be re-run (standard culling of work (hint: this is nearly the whole point of make))
Assuming the tests are called test-blah where blah is any string, and that you have a list of tests in ${tests} (after all, you have just built them, so it's not an unreasonable assumption).
A sketch:
fail = ${@:%.pass=%.fail}
test-passes := $(addsuffix .pass,${tests})
${test-passes}: test-%.pass: test-%
	rm -f ${fail}
	touch $@
	$< || mv $@ ${fail}
.PHONY: all
all: ${test-passes}
all:
	# Count the .pass files, and the .fail files
	echo '$(words $(wildcard *.pass)) passes'
	echo '$(words $(wildcard *.fail)) failures'
In more detail:
test-passes := $(addsuffix .pass,${tests})
If ${tests} contains test-1 test-2 (say), then ${test-passes} will be test-1.pass test-2.pass
${test-passes}: test-%.pass: test-%
You've just gotta love static pattern rules.
This says that the file test-1.pass depends on the file test-1. Similarly for test-2.pass.
If test-1.pass does not exist, or is older than the executable test-1, then make will run the recipe.
rm -f ${fail}
${fail} expands to the target with pass replaced by fail, or test-1.fail in this case. The -f ensures the rm returns no error in the case that the file does not exist.
touch $@ (create the .pass file)
$< || mv $@ ${fail}
Here we run the executable
If it returns success, our work is finished
If it fails, the output file is deleted, and test-1.fail is put in its place
Either way, make sees no error
.PHONY: all — The all target is symbolic and is not a file
all: ${test-passes}
Before we run the recipe for all, we build and run all the tests
echo '$(words $(wildcard *.pass)) passes'
Before passing the text to the shell, make expands $(wildcard) into a list of pass files, and then counts the files with $(words). The shell gets the command echo 4 passes (say)
You run this with
$ make -j9 all
Make will keep 9 jobs running at once — lovely if you have 8 CPUs.
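The marker-file recipe can also be exercised on its own in plain shell. This is a rough sketch, assuming a POSIX-like sh; record_result is a hypothetical name, and true and false stand in for real test executables.

```shell
# Hypothetical helper mirroring the recipe: leave NAME.pass or NAME.fail.
# Usage: record_result NAME COMMAND [ARGS...]
record_result() {
    name=$1
    shift
    rm -f "$name.fail"
    touch "$name.pass"
    # On failure the .pass marker is renamed, so the helper itself succeeds.
    "$@" || mv "$name.pass" "$name.fail"
}

workdir=$(mktemp -d)
record_result "$workdir/test-1" true
record_result "$workdir/test-2" false
record_result "$workdir/test-3" true

# Count the marker files of each kind.
passes=$(ls "$workdir" | grep -c '\.pass$')
failures=$(ls "$workdir" | grep -c '\.fail$')
echo "$passes passes"
echo "$failures failures"
```

A calling script could read the same marker files after make returns, instead of relying on make's exit status.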
Our project uses Makefiles with the following type of rule for each multi-directory sub-make:
DIRS = lib audio conf parser control
all: $(DIRS)
	@for DIR in $(DIRS); \
	do \
	( cd $$DIR; $(MAKE) $(MFLAGS) all; ) \
	done
If any file fails to compile in one of the leaf makes, the build stops in that directory - but the rest of the make continues. How do I set up these Makefiles so the first error at any level will stop the entire make?
Thanks
From the for loops section of the bash manual:
The return status is the exit status of the last command that
executes.
So, you do not need to capture return statuses. You need your recipe to fail if any sub-make fails:
DIRS = lib audio conf parser control
all: $(DIRS)
	@for DIR in $(DIRS); do \
	$(MAKE) -C $$DIR $(MFLAGS) all || exit 1; \
	done
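The effect of the guard can be seen in plain shell without any sub-makes. In this sketch fake_make is a hypothetical stand-in that fails only for the directory named conf, and the loops live in functions, so return 1 plays the role that exit 1 plays in the recipe.

```shell
# Stand-in for a sub-make: succeeds unless the directory is named conf.
fake_make() {
    [ "$1" != conf ]
}

# Without a guard, only the last iteration decides the loop's status.
unguarded() {
    for dir in lib conf parser; do
        fake_make "$dir"
    done
}

# With a guard, the loop stops at the first failure.
guarded() {
    for dir in lib conf parser; do
        fake_make "$dir" || return 1
    done
}

if unguarded; then echo "unguarded: success reported"; fi
if guarded; then :; else echo "guarded: failure reported"; fi
```

The unguarded loop reports success even though the middle step failed, which matches the behaviour described in the question.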
But it would be much better to have individual recipes per directory, instead of a single for loop:
DIRS = lib audio conf parser control
all: $(DIRS)
.PHONY: all $(DIRS)
$(DIRS):
	$(MAKE) -C $@ $(MFLAGS) all
This way, if a sub-make fails, it is the complete rule's recipe that fails and make stops. Note the .PHONY special target: in this case it is needed because you want to run the recipe even if the directory already exists.
There is another advantage with this structure: if you run make in parallel mode (make -j N) it will launch several sub-makes simultaneously instead of just one with the for loop. And each sub-make, in turn, will launch several recipes in parallel, up to N jobs. On a multi-processor or multi-core architecture the speed-up factor can be significant.
But this advantage can become a drawback if your project is not parallel safe, that is, if the order of processing of your directories matters and is not properly defined in the makefiles. If you are in this situation you can add a:
.NOTPARALLEL:
special target at the beginning of your main makefile to tell make not to run that makefile's recipes in parallel. But it would be better to explicitly define the inter-directory dependencies. And if you do not know how to do this, please ask another question.
I found the answer in this question's answer: I have to rewrite to capture the return status of each submake.
I tried looking for answers to this question, so I apologize in advance if this is a duplicate of a question I didn't find. Also sorry that I cannot directly provide the code that I am working with (it would require a lot of environmental dependencies, anyway).
I have a sequence of actions, which all depend on the success of the previous actions, and also don't need to be repeated unless they are out of date. A make solution seemed like the proper one. I've come up with a solution that does almost all of it. Here is the sequence of steps I am trying to replicate, with the output of each step listed below its input:
ZIP file
extract to package/
package/directory/*.comp
execute uncomp.py to create a .uncomp file from a .comp file
Everything works fine up to this point
package/directory/*.uncomp
For *.uncomp files, execute script1 to produce a .html file
For *_ext.uncomp files, execute script2 to produce numbered *_ext.##.png file(s)
Multiple numbered files (_ext.0.png, _ext.1.png, _ext.2.png) are possible, and may not be present at the time make is run. However, make should know that they are the output of the previous step, and only run this recipe if these files (a) don't exist or (b) any are older than the *_ext.uncomp file.
I have put together a Makefile which does almost what I'm looking for, except that it delegates all of the last portion (numbered files) to a shell script which I could program to look at file times, but that defeats the purpose of using make in the first place, in my opinion.
Environment
Debian 8.8 (x86)
GNU Make 4.0
Built for x86_64-pc-linux-gnu
My Question
What rules and recipes can I use to inform GNU make of the relationship between the *_ext.uncomp files and the _ext.##.png files so that those recipes only get executed as necessary (and say 'Target is up-to-date' if all .png files are at least as new as the _ext.uncomp file), that won't also apply to the *.uncomp files, and that will still work if there are no .png files in the output?
I will also need to indicate the relationship between non-_ext files and their corresponding HTML counterparts, so that script1 only gets executed when the HTML file is out of date or doesn't exist. This recipe/rule should not pay attention to _ext.uncomp files.
Any other advice on my Makefile would also be appreciated, because I am not overly familiar with it.
Generalized contents of my current Makefile
.PHONY : all
all : package package/directory/*.uncomp
./process $^
%.comp.uncomp : %.comp package
python uncomp.py $<
package : *.zip
rm -rf package/
unzip *.zip -d package/
Contents of the process script
This script should no longer exist if all the goals of the question are met (make will handle everything). It works great, but it always processes .uncomp files no matter what, even if the output from them already exists and is newer than the source.
#!/bin/bash
if [ $# -lt 2 ]; then
echo "$0 expects at least 2 arguments"
exit 1
fi
# Discard the first argument, it's always 'package'
shift
# Iterate over each of the remaining arguments
while [ $# -gt 0 ]; do
if [[ $1 == *_ext.uncomp ]] ; then
python script2 $1
elif [[ $1 == *.uncomp ]] ; then
python script1 $1
else
echo "Warning: Unknown file type: $1"
fi
shift
done
I learned a lot about GNU make trying to get this to work. I discovered that the solution to my problem was in not overthinking it.
The most important realization was that I didn't need make to track all of the numbered output files, but just the first one (if the first one is out of date or missing, they all will be, and they all get re-extracted by the script, so a 1:1 relationship was all I needed to indicate there).
I found out that GNU make 3.82 and later uses "shortest stem first" order instead of definition order when matching pattern rules. To make my file compatible with both versions, I made sure to define the most specific stems first.
After that it was a matter of setting up some implicit rules, and just telling make what to expect to be able to find—the concept is a little backwards to my way of thinking which is why I had some trouble at first (look for this file that doesn't exist yet; now, here's a way to make it from a file that does exist). The end result, fully functional:
PACKAGE := package
COMP := .comp
UNCOMP := .comp.uncomp
PNG0 := .comp.0.png
TXT := .comp.txt
SUFFIX := _ext
COMPFILES = $(wildcard $(PACKAGE)/subdir/*$(COMP))
UNCOMPFILES = $(COMPFILES:$(COMP)=$(UNCOMP))
SUFFIXFILES = $(filter %$(SUFFIX)$(UNCOMP),$(UNCOMPFILES))
PNGFILES = $(SUFFIXFILES:$(UNCOMP)=$(PNG0))
NOSUFFIXFILES = $(filter-out %$(SUFFIX)$(UNCOMP),$(UNCOMPFILES))
TXTFILES = $(NOSUFFIXFILES:$(UNCOMP)=$(TXT))
.PHONY : all
all : pngs txts htaccess
.PHONY : txts
txts : $(TXTFILES)
.PHONY : pngs
pngs : $(PNGFILES)
.PHONY : uncomp
uncomp : $(UNCOMPFILES)
make pngs
make txts
.PHONY : htaccess
htaccess : $(PACKAGE)/.htaccess
%$(SUFFIX)$(PNG0) : %$(SUFFIX)$(UNCOMP)
## Ignore failures when extracting PNG files
	-python script1.py $<
%$(TXT) : %$(UNCOMP)
## Ignore failures when dumping TXT files
	-python script2.py $< > $@
%$(UNCOMP) : %$(COMP)
## Ignore decompression failure
	-python uncomp.py $<
$(PACKAGE)/.htaccess : .htaccess | $(PACKAGE)
cp .htaccess $(PACKAGE)/
$(PACKAGE) : *.zip
rm -rf $(PACKAGE)/
unzip *.zip -d $(PACKAGE)/
make uncomp
.PHONY : clean
clean :
rm -rf $(PACKAGE)/
It looks to me like Makefile rules can be roughly classified into "positive" and "negative" ones: "positive" rules create missing or update outdated files, while "negative" ones remove files.
Writing prerequisites for "positive" rules is quite easy: if the target and the prerequisite are file names, make by default runs the recipe if the target is missing or outdated (a missing file in this context may be viewed as an infinitely old file).
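As a rough illustration of that default, here is a shell sketch of the decision for one target and one prerequisite. The helper name needs_rebuild is hypothetical, and the -nt test is an assumption: it is widely supported (bash, dash, ksh) but not strictly POSIX.

```shell
# Hypothetical helper: succeed when TARGET must be rebuilt from PREREQ.
needs_rebuild() {
    target=$1
    prereq=$2
    # A missing target counts as infinitely old.
    [ ! -e "$target" ] || [ "$prereq" -nt "$target" ]
}

workdir=$(mktemp -d)
touch -t 202001010000 "$workdir/input"
touch -t 202001020000 "$workdir/fresh"   # newer than input
touch -t 201912310000 "$workdir/stale"   # older than input

for t in fresh stale missing; do
    if needs_rebuild "$workdir/$t" "$workdir/input"; then
        echo "$t: rebuild"
    else
        echo "$t: up to date"
    fi
done
```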
However, consider a "negative" rule, for example for target clean. A usual way to write it seems to be something like this:
clean:
rm -f *.log *.synctex.gz *.aux *.out *.toc
This is clearly not the best way to do it:
rm is executed even when there is nothing to do,
its error messages and exit status need to be suppressed with -f flag, which has other (possibly undesirable) effects, and
the fact that there was nothing to do for target clean is not reported to the user, unlike what is normal for "positive" targets.
My question is: how to write a Makefile rule that shall be processed by make only if certain files are present? (Like what would be useful for make clean.)
how to write a Makefile rule that shall be processed by make only if certain files are present? (Like what would be useful for make clean.)
You can do it like so:
filenames := a b c
files := $(strip $(foreach f,$(filenames),$(wildcard $(f))))
all: $(filenames)
$(filenames):
	touch $@
clean:
ifneq ($(files),)
	rm -f $(files)
endif
Example session:
$ make
touch a
touch b
touch c
$ make clean
rm -f a b c
$ make clean
make: Nothing to be done for 'clean'.
Useful perhaps for some purposes, but it strikes me as a strained refinement for make clean.
This can be easily remedied:
clean:
	for file in *.log *.synctex.gz *.aux *.out *.toc; do \
	if [ -e "$$file" ]; then \
	rm "$$file" || exit 1; \
	else \
	printf 'No such file: %s\n' "$$file"; \
	fi; \
	done
The if statement is necessary unless your shell supports and has enabled nullglob or something similar.
If your printf supports %q you should use that instead of %s to avoid possible corruptions of your terminal when printing weird filenames.
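To see why the existence test matters, the same loop can be run as a shell function against a scratch directory. This is a sketch only; clean_logs is a hypothetical name and it handles just one pattern.

```shell
# Hypothetical helper: remove DIR/*.log and report how many were removed.
clean_logs() {
    dir=$1
    removed=0
    for file in "$dir"/*.log; do
        # Without nullglob an unmatched pattern stays literal, so test first.
        if [ -e "$file" ]; then
            rm "$file" || return 1
            removed=$((removed + 1))
        fi
    done
    echo "removed $removed file(s)"
}

workdir=$(mktemp -d)
touch "$workdir/a.log" "$workdir/b.log"
first=$(clean_logs "$workdir")
second=$(clean_logs "$workdir")
echo "$first"
echo "$second"
```

On the second call the pattern matches nothing, the loop body sees the literal pattern once, and the -e test keeps rm from being called on it.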
A meta-answer is: are you sure you want to do this?
The other answers suggest to me that the cure is worse than the disease, since one involves an extension to POSIX make (ifneq), and the other uses a compound command which spreads over seven lines. Both of these are sometimes necessary expedients – I'm not criticising either answer – but both are things I avoid in a Makefile if I can. If I found myself wanting to do this in a clean rule, perhaps for the reason you mention in your comment to @MikeKinghans' answer, I'd try quite hard to change the rest of the Makefile to avoid needing this.
Reflecting on your three original points in turn:
rm is executed even when there is nothing to do: so what? The alternatives still need to, for example, expand the *.log *.synctex.gz ... so there's only a minuscule efficiency gain from avoiding the rm. Make is a high-level tool which generally does not concern itself with efficiency.
its error messages and exit status need to be suppressed with -f flag: the -f flag doesn't generally suppress errors and the exit status, it merely indicates to rm that a non-existing or non-permissioned file is not to be regarded as an error.
the fact that there were nothing to do for target clean is not reported to the user: should the user really care?
The last point is the most interesting. People asking about make, on Stack Overflow and elsewhere, sometimes make things hard for themselves by trying to use it as a procedural language – make is not Python, or Fortran. Instead, it's a goal programming language (if we want to get fancy about it): you write snippets of rules to achieve sub-goals, so that the user (you, later) doesn't have to care about the details or the directory's current state, but can simply indicate a goal, and the program does whatever's necessary to get there. So whether there is or isn't anything to do, the user ‘shouldn't’ care.
I think the short version of this answer is: it's idiomatic to keep make rules as simple (and thus as readable and robust) as possible, even at the expense of a little crudity or repetition.
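On the second point, the behaviour of -f is easy to check in a scratch directory. A small sketch, assuming a typical rm (GNU or BSD); the variable names are only for the demonstration.

```shell
workdir=$(mktemp -d)

# Plain rm reports a missing file as an error...
if rm "$workdir/missing" 2>/dev/null; then plain=ok; else plain=error; fi
# ...while rm -f treats a missing file as nothing to do.
if rm -f "$workdir/missing"; then forced=ok; else forced=error; fi

# -f does not hide other failures, e.g. removing a directory without -r.
mkdir "$workdir/subdir"
if rm -f "$workdir/subdir" 2>/dev/null; then dir=ok; else dir=error; fi

echo "rm missing: $plain; rm -f missing: $forced; rm -f directory: $dir"
```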
I have a compile job where linking takes a lot of IO work. We have around a dozen cores so we run make -j13, but when it comes to linking the 6 targets, I'd like those to be done in a round robin way. I thought about making one depend on the next but I think this would break the individual targets. Any ideas how to solve this small issue?
make itself doesn't provide a mechanism to request "N of these, but no more than M of those at a time".
You might try using the sem command from the GNU parallel package in the recipe of your linker rules. Its documentation has an example of ensuring only one instance of a tool runs at once. In your example, you would allow make to start up to 13 sems at a time, but only one of those at a time will run the linker, while the others block.
The downside is that you could get into a situation where 5 of your make's 13 job slots are tied up with instances of sem that are all waiting for a linker process to finish. Depending on the structure of your build, that might mean some wasted CPU time. Still beats 6 linkers thrashing the disk at once, though :-)
You should specify that your six targets cannot be built in parallel. Add a line like this to your makefile:
.NOTPARALLEL: target1 target2 target3 target4 target5 target6
For more information look here https://www.gnu.org/software/make/manual/html_node/Parallel-Disable.html.
I've stumbled upon a hacky solution:
For each recipe it runs, Make does two things: it expands variables/functions in the recipe, and then runs the shell commands.
Since the first step can read/write the global variables, it seems to be done synchronously.
So if you run all your shell commands during the first step (using $(shell )), no other recipe will be able to start while they're running.
E.g. consider this makefile:
all: a b
a:
	sleep 1
b:
	sleep 1
time make -j2 reports 1 second.
But if you rewrite it to this:
# A string of all single-letter Make flags, without spaces.
override single_letter_makeflags = $(filter-out -%,$(firstword $(MAKEFLAGS)))
ifneq ($(findstring n,$(single_letter_makeflags)),)
# See below.
override safe_shell = $(info Would run shell command: $1)
else ifeq ($(filter --trace,$(MAKEFLAGS)),)
# Same as `$(shell ...)`, but triggers an error on failure.
override safe_shell = $(shell $1)$(if $(filter-out 0,$(.SHELLSTATUS)),$(error Unable to execute `$1`, exit code $(.SHELLSTATUS)))
else
# Same functions but with logging.
override safe_shell = $(info Shell command: $1)$(shell $1)$(if $(filter-out 0,$(.SHELLSTATUS)),$(error Unable to execute `$1`, exit code $(>
endif
# Same as `safe_shell`, but discards the output and expands to nothing.
override safe_shell_exec = $(call,$(call safe_shell,$1))
all: a b
a:
	$(call safe_shell_exec,sleep 1)
	@true
b:
	$(call safe_shell_exec,sleep 1)
	@true
time make -j2 now reports 2 seconds.
Here, @true does nothing, and suppresses the Nothing to be done for ?? output.
There are some problems with this approach though. One is that all output is discarded unless redirected to file or stderr...
It won't break individual targets.
You can create any number of (:) rules for a target, as long as only one of them has an actual recipe for building it. This appears to be a good use case for that.