GNUMake trouble with implicit rules - makefile

I'm missing something about implicit rules. Here's the Makefile (GNU Make 4.2.1)
heimdall /tmp 1670> cat Makefile
PARTS= a b c
.SECONDEXPANSION:
data/events2: $$(patsubst %,$$(@D)/%.ppd,$(PARTS))
	/bin/ls -l $^
%/events2: $$(patsubst %,$$(@D)/%.ppd,$(PARTS))
	/bin/ls -l $^
Here is some cooked-up data to illustrate the situation:
heimdall /tmp 1671> ls -1 data data1
data:
a.ppd
b.ppd
c.ppd
data1:
a.ppd
b.ppd
c.ppd
Here is make using an explicit rule, which works like I'd expect.
heimdall /tmp 1672> make data/events2
/bin/ls -l data/a.ppd data/b.ppd data/c.ppd
-rw-rw-r-- 1 bennett None 0 Feb 4 12:19 data/a.ppd
-rw-rw-r-- 1 bennett None 0 Feb 4 12:19 data/b.ppd
-rw-rw-r-- 1 bennett None 0 Feb 4 12:19 data/c.ppd
And finally, this:
heimdall /tmp 1673> make data1/events2
make: *** No rule to make target 'data1/events2'. Stop.
Why doesn't the implicit rule match? I feel like I've missed something fundamental.
Thanks.
-E

%/events2: $$(patsubst %,$$(@D)/%.ppd,$(PARTS))
Is not a pattern rule that would match in your sample structure. From the docs:
% in a prerequisite of a pattern rule stands for the same stem that was matched by the % in the target. In order for the pattern rule to apply, its target pattern must match the file name under consideration and all of its prerequisites (after pattern substitution) must name files that exist or can be made. These files become prerequisites of the target.
However, in your target, % would match data1, but there isn't actually any % to match on the prerequisite side: the ones present belong to the patsubst function, and the directory (the stem) is referred to as $(@D).
I tried writing such a rule using the foreach function:
%/events2: $(foreach part,$(PARTS), %/$(part).ppd)
	/bin/ls -l $^
If you wanted to stick with patsubst, this should work as well:
%/events2: $(patsubst %,\%/%.ppd,$(PARTS))
	/bin/ls -l $^
Note that the leading % stands for the directory name, matching the one in the target, and it's escaped with \ to make it through patsubst unscathed.
Either way worked with GNU make, yielding:
$ make data1/events2
/bin/ls -l data1/a.ppd data1/b.ppd data1/c.ppd
-rw-r--r-- 1 ondrej users 0 Feb 4 22:00 data1/a.ppd
-rw-r--r-- 1 ondrej users 0 Feb 4 22:00 data1/b.ppd
-rw-r--r-- 1 ondrej users 0 Feb 4 22:00 data1/c.ppd

Related

How do I benchmark Makefile targets?

Consider the following Makefile:
test:
	@echo "$(shell date) Before test"
	sleep 10
	@echo "$(shell date) After test"
When I run this, the output is the following:
Wed Nov 24 17:00:22 PST 2021 Before test
sleep 10
Wed Nov 24 17:00:22 PST 2021 After test
It looks like make is executing the shell commands before executing the target.
How can I force make to execute the shell commands when they appear in the recipe?
Make recipes are expanded by make prior to passing them to the shell. So when make expands your first recipe, the recipe becomes:
@echo "Wed Nov 24 17:00:22 PST 2021 Before test"
sleep 10
@echo "Wed Nov 24 17:00:22 PST 2021 After test"
because the two date calls happen at almost the same time. Only after the recipe has been expanded is it passed to the shell, and in this case each line is executed by a different shell. The last line runs 10 seconds after the first, but by then the string to echo is already set.
Make expands the recipes before passing them to the shell for very good reasons, for instance to replace automatic make variables like $@ (the rule's target) or $^ (the rule's prerequisites) with their actual values. The shell could not do this.
To obtain the desired effect, as you realized yourself, you need to let the recipe itself call date, for instance with $(date) or with the old-fashioned backticks. Note that you must escape the $ sign to protect it from the make expansion:
@echo foobar."$$(date)"
which, after make expansion, becomes:
@echo foobar."$(date)"
This is why people frequently still use the backticks in make recipes, even if the $(...) syntax is now recommended for command substitution; there is no need to escape them:
@echo foobar."`date`"
Using the $(shell ...) make function in a recipe is useless. Recipes are already shell scripts. Same with most make functions. The only reason to use them in a recipe is when they are more efficient or simpler than their shell equivalent. But we must remember that they are expanded by make itself, not by the shell, and that make expands them before the recipe is passed to the shell. Example: if you want to print the basename of all prerequisites the notdir make function is convenient:
@echo $(notdir $^)
instead of:
@for f in $^; do basename "$$f"; done
But you cannot print the basenames of all C source files in ./project/src with:
@$(notdir find ./project/src -name '*.c')
because make applies its notdir function to each of its (expanded) word arguments and what it passes to the shell is:
@find src -name '*.c'
The find command is then executed with a wrong starting directory (src instead of ./project/src) and, even if this wrong starting directory exists, the directory part of the results is not stripped off by notdir which has already been expanded.
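The expansion order described above can be checked without waiting on sleep. The following is only a sketch, assuming GNU make is on the PATH; the temporary directory, the stamp file, and the make-time/run-time labels are made up for illustration. The first recipe line rewrites stamp, yet $(shell cat stamp) still reports the old content because make evaluated it before running any recipe line, while $$(cat stamp) is evaluated by the shell when the line runs.

```shell
#!/bin/sh
# Sketch: $(shell ...) is expanded by make before the recipe runs;
# $$(...) reaches the shell as $(...) and is evaluated at run time.
set -e
demo_dir=$(mktemp -d)   # hypothetical scratch directory
cd "$demo_dir"
echo initial > stamp

# Recipe lines must start with a tab; turn the 4-space indent into one.
TAB=$(printf '\t')
sed "s/^    /$TAB/" > Makefile <<'EOF'
test:
    @echo changed > stamp
    @echo "make-time: $(shell cat stamp)"
    @echo "run-time: $$(cat stamp)"
EOF

expansion_output=$(make -s test)
echo "$expansion_output"
```

The make-time line should show the content stamp had before the recipe started, and the run-time line the content written by the first recipe line.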
It looks like the following achieves the desired effect.
test:
	@echo "$$(date) Before test"
	sleep 10
	@echo "$$(date) After test"
I'm curious about other answers though if there are less hacky ways to do this.

why does "make" delete target files only if implicit

Suppose I have a Makefile like this
B1.txt: A1.txt
	python big_long_program.py A1.txt > $@
correct1.txt: B1.txt reference.txt
	diff -q B1.txt reference.txt
	touch $@
Then the output when I make correct1.txt is pretty well what I would expect:
python big_long_program.py A1.txt > B1.txt
diff -q B1.txt reference.txt
touch correct1.txt
Now suppose I have lots of files (B1.txt, B2.txt, B3.txt, etc.), so I create an implicit rule:
B%.txt: A%.txt
	python big_long_program.py A$*.txt > $@
correct%.txt: B%.txt reference.txt
	diff -q B$*.txt reference.txt
	touch $@
Instead this happens when I make correct1.txt:
python big_long_program.py A1.txt > B1.txt
diff -q B1.txt reference.txt
touch correct1.txt
rm B1.txt
i.e. the difference is that now the file B1.txt has been deleted, which in many cases is really bad.
So why are implicit rules different? Or am I doing something wrong?
You are not doing anything wrong. The behavior you observe and analyze is documented in 10.4 Chains of Implicit Rules. It states that intermediate files are indeed treated differently.
The second difference is that if make does create b in order to update
something else, it deletes b later on after it is no longer needed.
Therefore, an intermediate file which did not exist before make also
does not exist after make. make reports the deletion to you by
printing a rm -f command showing which file it is deleting.
The documentation does not explicitly explain why it behaves like this. Looking in the file ChangeLog.1, there is a reference to the remove_intermediates function as far back as 1988. At that time, disk space was expensive and at a premium.
If you do not want this behavior, mention the targets you want to keep somewhere in the makefile as an explicit prerequisite or target or use the .PRECIOUS or the .SECONDARY special built-in targets for that.
With thanks to MadScientist for the additional comments, see below.
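As a rough illustration of the .PRECIOUS suggestion, the sketch below reproduces the question's two pattern rules in a scratch directory and builds correct1.txt twice, once as-is and once with the B%.txt pattern marked precious. It assumes GNU make is on the PATH; cp stands in for big_long_program.py, and the directory and variable names are invented for the demo.

```shell
#!/bin/sh
# Sketch: an intermediate file from chained pattern rules is removed
# unless its target pattern is listed under .PRECIOUS.
set -e
keep_dir=$(mktemp -d)   # hypothetical scratch directory
cd "$keep_dir"
echo data > A1.txt
echo data > reference.txt

# Recipe lines must start with a tab; turn the 4-space indent into one.
TAB=$(printf '\t')
sed "s/^    /$TAB/" > Makefile <<'EOF'
B%.txt: A%.txt
    cp A$*.txt $@
correct%.txt: B%.txt reference.txt
    diff -q B$*.txt reference.txt
    touch $@
EOF

# First run: B1.txt is intermediate, so make deletes it when done.
make correct1.txt
if [ -f B1.txt ]; then without_precious=kept; else without_precious=deleted; fi

# Second run: mark the pattern as precious and rebuild from scratch.
rm -f correct1.txt B1.txt
echo '.PRECIOUS: B%.txt' >> Makefile
make correct1.txt
if [ -f B1.txt ]; then with_precious=kept; else with_precious=deleted; fi

echo "without .PRECIOUS: B1.txt $without_precious"
echo "with .PRECIOUS:    B1.txt $with_precious"
```

Naming B1.txt explicitly as a target or prerequisite somewhere in the makefile would be another way to keep it, as the answer notes.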

Using bash extended globs file masks in variables in find and loop

I am trying to match files using a pre-set file mask in a variable.
mat $ ls -lQ /tmp/Mat
total 0
-rw-rw-r-- 1 Mat Mat 0 Mar 3 14:32 "testfile1"
-rw-rw-r-- 1 Mat Mat 0 Mar 3 14:33 "testfile1.gz"
-rw-rw-r-- 1 Mat Mat 0 Mar 3 14:33 "testfile2"
-rw-rw-r-- 1 Mat Mat 0 Mar 3 14:33 "testfile2.gz"
-rw-rw-r-- 1 Mat Mat 0 Mar 3 14:38 "testfile2.gz#id=142"
-rw-rw-r-- 1 Mat Mat 0 Mar 3 14:34 "testfile2test"
-rw-rw-r-- 1 Mat Mat 0 Mar 3 14:34 "testfile2test.gz"
mat $ file_mask=*file2*
mat $ ls /tmp/Mat/$file_mask?(.gz)
testfile2.gz testfile2test.gz
I am trying to get: testfile2 testfile2.gz testfile2test testfile2test.gz
To summarize the outcome:
tl;dr
The OP experienced unexpected behavior due to a bug in 3.x versions of bash relating to certain extended glob patterns, i.e., with shopt -s extglob in effect.
However, even without the bug, the code doesn't work as intended, because the globbing pattern *file2*?(.gz) is effectively the same as *file2*, which would match files with any suffix, not just .gz.
To only match names containing file2 that either have no suffix at all, or, if they have [at least] one, with a [last] suffix of .gz, use *([^.])file2*([^.])?(*.gz) (this works fine in bash 3.x too). Note that, as with the OP's patterns, this requires extended globbing to be activated with shopt -s extglob.
The assumption is that the OP's intent is as follows:
Match only names containing file2 [before the 1st suffix, if any] that either have no suffix at all, or, if they have [at least] one, with a [last] suffix of .gz
E.g., match files file2, file2-a, some-file2, file2.gz, file2-a.gz, file2.tar.gz, but not file2.no (because it has a [last] suffix that is not '.gz').
While there is a bash 3.x bug that affects patterns such as *?(...) - see below - there's no good reason to use *?(...), because it is effectively the same as just *, given that * matches any sequence of characters, including suffixes.
The solution below is not affected by the bug.
You cannot use * for matching only the root of a filename (the part before the [first] suffix), because * matches any string, whether part of a suffix or not.
Thus, extended glob *([^.]) must be used, which matches a string of any length containing any character except . (a period).
Also, to account for the fact that a filename may have multiple suffixes, the optional .gz-matching part of the pattern should be ?(*.gz).
To put it together:
Note: shopt -s extglob must be in effect for the commands to work.
# Create test files; note the addition of "testfile2.tar.gz", which SHOULD
# match, and "testfile2.no", which should NOT match:
$ touch "testfile1" "testfile1.gz" "testfile2" "testfile2.gz" "testfile2.gz#id=142" "testfile2test" "testfile2test.gz" "testfile2.tar.gz" "testfile2.no"
$ ls -1 *([^.])file2*([^.])?(*.gz)
testfile2
testfile2.gz
testfile2.tar.gz
testfile2test
testfile2test.gz
# The same, using a variable:
$ file_mask=*([^.])file2*([^.]) # NO globbing here (no globbing in *assignments*).
$ file_mask+=?(*.gz) # Extend the pattern; still no globbing.
$ ls -1 $file_mask # Globbing happens here, due to unquoted use of the variable.
# Same output as before.
# Using a loop should work equally:
$ for f in *([^.])file2*([^.])?(*.gz); do echo "$f"; done
# Same output as before.
# Loop with a variable:
$ file_mask=*([^.])file2*([^.])
$ file_mask+=?(*.gz)
$ for f in $file_mask; do echo "$f"; done
# Same output as before.
Obscure extended-globbing bug in bash 3.x:
Note that the bug is unrelated to whether or not variables are used.
I don't know in what version the bug was fixed, but it's not present in 4.3.30, for instance.
In short, *?(...) mistakenly acts as if *+(...) had been specified.
In other words: independent simple pattern * followed by extended pattern ?(...) (match zero or 1 ... instance) effectively behaves like * followed by +(...) (match 1 or more ... instances).
Demonstration, observed in bash 3.2.57 (the current version on OSX 10.10.2; the OP uses 3.2.25):
$ touch f f.gz # create test files
$ ls -1 f?(.gz) # OK: finds files with basename root 'f', optionally suffixed with '.gz'
f
f.gz
# Now extend the glob with `*` after the basename root.
# This, in fact, is logically equivalent to `f*` and should
# match *all files starting with 'f'*.
$ ls -1 f*?(.gz)
f.gz
# ^ BUG: only matches the suffixed file.

How to provide parameters to a submake when running in parallel?

I am trying to use make to handle some data processing.
Consider the following simple rule in a makefile, makefile-month:
output_$(YEAR)_$(MONTH): input_$(YEAR)_$(MONTH)
	foo input_$(YEAR)_$(MONTH) output_$(YEAR)_$(MONTH)
This rule can be used to process any required month using, e.g.
make -f makefile-month YEAR=2006 MONTH=2
And this works fine.
What I am really interested in now is using make to process several months in parallel.
However, I cannot find a simple way of achieving this.
Notice that using a shell for loop does not work with parallel make.
Defining a global makefile,
all:
	for year in 2006; do \
	for month in 1 2 3 4 5 6 7 8 9 10 11 12; do \
	$(MAKE) -f makefile-month YEAR=$$year MONTH=$$month; \
	done; \
	done
and running,
make -j 12
does not execute each month in parallel.
Each call to the sub-make is executed in serial.
Any ideas?
There are lots of different ways to handle the details, but the overall solution is to move away from for loops in a single recipe and switch to individual targets. So for example:
YEARS := 2006 2007
MONTHS := 1 2 3 4 5 6 7 8 9 10 11 12
TARGETS := $(foreach Y,$(YEARS),$(foreach M,$(MONTHS),month.$Y.$M))
.PHONY: all $(TARGETS)
all: $(TARGETS)
$(TARGETS):
	$(MAKE) -f makefile-month YEAR=$(word 2,$(subst ., ,$@)) MONTH=$(word 3,$(subst ., ,$@))
(note I didn't test this but hopefully you get the idea).
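For anyone who wants to try the idea in isolation, here is a small sketch of the same target-per-month layout. It assumes GNU make is on the PATH and uses a shortened month list; echo stands in for the real $(MAKE) -f makefile-month call, and the scratch directory and variable names are made up. It shows that each target recovers its own YEAR and MONTH from its name, so make is free to schedule the targets in parallel under -j.

```shell
#!/bin/sh
# Sketch: one phony target per (year, month); the recipe splits the
# target name on "." to recover both values.
set -e
months_dir=$(mktemp -d)   # hypothetical scratch directory
cd "$months_dir"

# Recipe lines must start with a tab; turn the 4-space indent into one.
TAB=$(printf '\t')
sed "s/^    /$TAB/" > Makefile <<'EOF'
YEARS := 2006
MONTHS := 1 2 3
TARGETS := $(foreach Y,$(YEARS),$(foreach M,$(MONTHS),month.$Y.$M))
.PHONY: all $(TARGETS)
all: $(TARGETS)
$(TARGETS):
    @echo "YEAR=$(word 2,$(subst ., ,$@)) MONTH=$(word 3,$(subst ., ,$@))"
EOF

# Under -j the completion order is not fixed, so sort for display.
months_output=$(make -s -j 3 all | sort)
echo "$months_output"
```

Swapping the echo line back to the $(MAKE) invocation from the answer should give the real behavior, with the sub-makes sharing the parent's job slots.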

GNU make: “Nothing to be done for 'target'” vs. “'target' is up to date”

How does GNU make decide which of the messages to emit? The Makefile I am using causes Nothing to be done for 'target' messages to be emitted when the target is up to date. But I think 'target' is up to date would be more appropriate.
The chief difference is in whether gmake has a rule to build the target or not. If there is no rule for the target, but the target exists, then gmake will say, "Nothing to be done", as in, "I don't know how to update this thing, but it already exists, so I guess there's nothing to be done." If there is a rule, but the target is already up-to-date, then gmake will say, "is up to date", as in, "I do have instructions for updating this thing, but it appears to already be up-to-date, so I'm not going to do anything."
Here's a concrete example:
$ echo "upToDate: older ; @echo done" > Makefile
$ touch older ; sleep 2 ; touch upToDate ; touch nothingToDo
$ ls --full-time -l older upToDate nothingToDo
-rw-r--r-- 1 ericm ericm 0 2011-04-20 11:13:04.970243002 -0700 nothingToDo
-rw-r--r-- 1 ericm ericm 0 2011-04-20 11:13:02.960243003 -0700 older
-rw-r--r-- 1 ericm ericm 0 2011-04-20 11:13:04.960243001 -0700 upToDate
$ gmake upToDate
gmake: `upToDate' is up to date.
$ gmake nothingToDo
gmake: Nothing to be done for `nothingToDo'.
Since gmake has no rule for "nothingToDo", but the file already exists, you get the "nothing to be done" message. If "nothingToDo" did not exist, you would instead get the familiar, "No rule to make" message.
In contrast, because gmake has a rule for "upToDate", and the file appears to be up-to-date, you get the "is up to date" message.
I researched this problem and tested many scenarios. The results:
If the target has no recipe and none of its prerequisites' recipes are executed, then Nothing to be done for "top target" is printed.
If the target has a recipe but the recipe is not executed, then is up to date is printed.
If there is no rule for an existing file and that file is given as the target, Nothing to be done for "xxx" is also printed.
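The first two cases can be reproduced with a tiny makefile. This is a sketch only, assuming GNU make is on the PATH; the target names all, upToDate and older, the scratch directory, and the variable names are chosen for the demo. Here all has prerequisites but no recipe, while upToDate has a recipe that does not need to run.

```shell
#!/bin/sh
# Sketch: compare make's message for a target without a recipe against
# a target whose recipe exists but does not need to run.
set -e
LC_ALL=C; export LC_ALL   # keep make's messages in English
msg_dir=$(mktemp -d)      # hypothetical scratch directory
cd "$msg_dir"

# Recipe lines must start with a tab; turn the 4-space indent into one.
TAB=$(printf '\t')
sed "s/^    /$TAB/" > Makefile <<'EOF'
all: upToDate
upToDate: older
    @echo rebuilding
EOF

touch older
touch upToDate   # created after "older", so it is not out of date

msg_no_recipe=$(make all)
msg_with_recipe=$(make upToDate)
echo "$msg_no_recipe"
echo "$msg_with_recipe"
```

The exact quoting around the target name differs between GNU make versions, so only the wording of the two messages should be relied on.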

Resources