Makefiles: using `wildcard` vs. `find` for specifying source files

TL;DR: How can I use find in a Makefile in order to identify the relevant source files (e.g., all .c files)? I know how to use a wildcard but I'm not able to get find to work.
Longer version:
I'm putting together a Makefile as part of an exercise on shared libraries; I noticed that when I use the following lines to specify the source and object files (i.e., .c files) for my shared library, I get an error after running make (gcc fatal error: no input files):
SRC=$(find src/ -maxdepth 1 -type f -regex ".*\.c")
OBJ=$(patsubst %.c,%.o,$(SRC))
*rest-of-makefile*
However, it compiles correctly when I use wildcard instead of find:
SRC=$(wildcard src/*.c)
OBJ=$(patsubst %.c,%.o,$(SRC))
*rest-of-makefile*
(As reference, included below is confirmation that the find command does indeed return the intended file when run from the shell.)
What is the correct syntax for using the find command (in my Makefile) to search for my source files (if it's at all possible)?
(Why would I prefer to use find?: I like the fact that I can quickly double-check the results of a find statement by running the command from the shell; I can't do that with wildcard. Also, I'd like to rely on regexes if possible. )
As reference, below is the relevant tree structure. As you can see (from the second code-block below), running the find command as specified in the Makefile (i.e., from above) does indeed return the intended file (src/libex29.c). In other words, the issue described above isn't because of a syntax problem in the find options or the regex.
.
├── build
├── Makefile
├── src
│   ├── dbg.h
│   ├── libex29.c
│   └── minunit.h
└── tests
    ├── libex29_tests.c
    └── runtests.sh
Results of running find from the . folder above:
~/lchw30$ find src/ -maxdepth 1 -type f -regex ".*\.c"
src/libex29.c
P.S. I know this post technically violates the rule that all posted code must compile - I just thought that including the entire code for both the Makefile and the libex29.c source file would be overkill. Let me know if that's not the case - happy to post the files in their entirety, if folks prefer.

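Make doesn't have a find function; you have to use the shell function to run find. You should also always use := rather than = with shell (and with wildcard, for that matter) for performance reasons: := expands the right-hand side once, at the assignment, whereas = re-evaluates it every time the variable is referenced. And you should put spaces around assignments in make, just for clarity: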
SRC := $(shell find src/ -maxdepth 1 -type f -regex ".*\.c")
Also, I don't see why you want to use find here. find is good if you want to search an entire subdirectory structure that contains more than one level, but wildcard is far more efficient for simple directory lookups.
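For comparison, here is a small sketch with both variants side by side. The `show` target is a hypothetical helper, not part of the original Makefile; it prints what make actually sees, which gives the same kind of quick double-check as running find from the shell.

```makefile
# Option 1: built-in globbing, no subprocess is started.
SRC := $(wildcard src/*.c)

# Option 2: delegate to find(1) through the shell function.
# SRC := $(shell find src/ -maxdepth 1 -type f -regex ".*\.c")

OBJ := $(patsubst %.c,%.o,$(SRC))

# Hypothetical helper: `make show` prints the expanded lists.
.PHONY: show
show:
	@echo SRC = $(SRC)
	@echo OBJ = $(OBJ)
```

Running `make show` with either option should list the same files for the tree in the question.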

Related

Make for identical workflow in separate directories

I'm using make to automate some of my data analysis. I have several directories, each containing a different realization of the data, which consists of several files representing the state of the data at a given time, like so:
├── a
│ ├── time_01.dat
│ ├── time_02.dat
│ ├── time_03.dat
│ └── ...
├── b
│ ├── time_01.dat
│ └── ...
├── c
│ ├── time_01.dat
│ └── ...
├── ...
The number of datafiles in each directory is unknown, and more can be added at any time. The files all have the same naming convention in each directory.
I want to use make to run the exact same set of recipes in each directory (to analyze each dataset separately and uniformly). In particular, there is one script that should run any time a new datafile is added, and creates an output file (analysis_time_XX.txt) for each datafile in the directory. This script does not update any files that have been previously created, but does create all the missing files. Refactoring this script is not a possibility, unfortunately.
So I have one recipe creating many targets, yet it must run separately for each directory. The solutions I've found to create multiple targets with one recipe (e.g. here) do not work in my case, as I need one rule to do this separately for multiple sets of files in separate directories.
These intermediate files are needed for their own sake (as they help validate the data collected), but are also used to create a final comparison plot between the datasets.
My current setup is an ugly combination of functions and .SECONDEXPANSION:
dirs = a b c
datafiles = $(foreach dir,$(dirs),$(wildcard $(dir)/*.dat))
df_to_analysis = $(subst .dat,.txt,$(subst time_,analysis_time_,$(1)))
analysis_to_df = $(subst .txt,.dat,$(subst analysis_time_,time_,$(1)))
analysis_files = $(foreach df,$(datafiles),$(call df_to_analysis,$(df)))
all: final_analysis_plot.png
.SECONDEXPANSION:
$(analysis_files): %: $$(call analysis_to_df,%)
	python script.py $(dir $@)
final_analysis_plot.png: $(analysis_files)
	python make_plot.py $(analysis_files)
Note that script.py creates all of the analysis_time_XX.txt files in the given directory. The flaw with this setup is that make does not know that the first script generates all the targets, and so runs unnecessarily when parallel make is used. For my application parallel make is a necessity, as these scripts have a long runtime, and parallelization saves a lot of time as the setup is "embarrassingly parallel."
Is there an elegant way to fix this issue? Or even an elegant way to clean up the code I have now? I've shown a simple example here, which already requires a good bit of setup, and doing this for several different scripts gets unwieldy quickly.
I think that in your case there's no need to bother with the individual .txt files. If script.py were nicer and could work per file, there would be value in writing individual file rules. As it is, we need to introduce an intermediate per-directory .done file:
DATA_DIRS := a b c
# A directory/.done.analysis file means that `script.py` was run here.
DONE_FILES := $(DATA_DIRS:%=%/.done.analysis)
# .done.analysis depends on all the source data files.
# When a .dat file is added or changes, it will be newer than
# the .done.analysis file, and the analysis is re-run.
# Secondary expansion is needed so that the stem ($$*) is known
# when $(wildcard) is evaluated.
.SECONDEXPANSION:
$(DONE_FILES): %/.done.analysis: $$(wildcard $$*/*.dat)
	python script.py $(@D)
	touch $@
final_analysis_plot.png: $(DONE_FILES)
	python make_plot.py $(wildcard $(addsuffix /analysis_time_*.txt,$(DATA_DIRS)))

How do you make a makefile target depend on a file with the same name as the target file's directory?

Suppose you have the following project structure:
.
├── Makefile
└── src
    └── 1.py
The program 1.py creates multiple (0, 1, or more) files in the directory build/1. This generalizes to arbitrary numbers, i.e. a program x.py where x is some natural number would create multiple files in the directory build/x. The project can consist of many python(3) files.
A makefile for the specific scenario above could look like this:
PYTHON_FILES := $(shell find src -name '*.py')
TXT_FILES := build/1/test.txt
.PHONY: clean all
all: $(TXT_FILES)
build/1/test.txt: src/1.py
	mkdir -p build/1
	touch build/1/test.txt # emulates: python3 src/1.py
	echo "success!"
clean:
	rm -rf build
Running make with the above project structure and makefile results in the following project structure:
.
├── Makefile
├── build
│   └── 1
│       └── test.txt
└── src
    └── 1.py
How do I generalize the rule head build/1/test.txt: src/1.py to handle projects with any number of python programs (or, equivalently, build subdirectories) and any number of output files per python program?
You can generalize the existing rule to work on ANY Python program in src. Use '%' in the pattern rule, and use '$*' to refer to the number in the recipe.
The rule re-runs the test whenever the Python test is modified. It records "success" only if the Python test completes without an error.
Update 2019-11-24: Generalized the test to handle N tests, each generating multiple files. With rebuild.
Note 1: Make needs a way to know whether the Python test passed without ANY failure. The solution assumes that the Python code returns a non-zero exit code on failure, or that there is another way to tell whether all tests have passed.
Note 2: The done file captures the list of files generated in the folder (excluding task.done itself). If needed, a separate target can use this list to verify that NO output file was removed, which compensates for the lack of explicit output files in the rule.
TASK_LIST = 1 2 3 4
all: ${TASK_LIST:%=build/%/task.done}
build/%/task.done: src/%.py
	mkdir -p build/$*
	touch build/$*/test.txt # emulates: python3 src/$*.py
# The script src/%.py should return a non-zero exit code on failure.
	ls build/$* | grep -xv "$(@F)" > $@
	touch $@ # Mark "success!"
GNU Make documentation: https://www.gnu.org/software/make/manual/html_node/Automatic-Variables.html
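To avoid maintaining TASK_LIST by hand, the list could be derived from the scripts themselves. This is a sketch that assumes every src/*.py file is a task; it only replaces the first two lines of the Makefile above and relies on the same build/%/task.done pattern rule.

```makefile
# Assumption: each src/<name>.py is one task, so
# src/1.py src/2.py  ->  TASK_LIST = 1 2
TASK_LIST := $(patsubst src/%.py,%,$(wildcard src/*.py))

.PHONY: all
all: $(TASK_LIST:%=build/%/task.done)
```

Adding a new src/5.py then yields a new build/5/task.done target without any edit to the Makefile.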

Make starts in wrong directory under FreeBSD

I have a very simple Makefile that just shells out to another Makefile:
all:
	cd src && make all
My directory structure (the Makefile is in the top-level directory):
[I] mqudsi@php ~/bbcp> tree -d
.
├── bin
│   └── FreeBSD
├── obj
│   └── FreeBSD
├── src
└── utils
This works just fine under Linux, but under FreeBSD, it gives an error about src not being found.
To debug, I updated the Makefile command to pwd; cd src && make all and I discovered that somehow when I run make in the top-level directory, it is being executed under ./obj instead, meaning it's looking for ./obj/src/ to cd into.
Aside from the fact that I have no clue why it's doing that, I presumed for sure that calling gmake instead of make under FreeBSD would take care of it, but that wasn't the case (and I'm relieved, because I can't believe there is that huge of a difference between BSD make and GNU make in terms of core operation).
The odd thing is, deleting obj makes everything work perfectly. So in the presence of an obj directory, make cds into ./obj first; otherwise it executes as you'd expect it to.
Answering my own question here.
From the FreeBSD make man page:
.OBJDIR A path to the directory where the targets are built. Its
value is determined by trying to chdir(2) to the follow-
ing directories in order and using the first match:
1. ${MAKEOBJDIRPREFIX}${.CURDIR}
(Only if `MAKEOBJDIRPREFIX' is set in the environ-
ment or on the command line.)
2. ${MAKEOBJDIR}
(Only if `MAKEOBJDIR' is set in the environment or
on the command line.)
3. ${.CURDIR}/obj.${MACHINE}
4. ${.CURDIR}/obj
5. /usr/obj/${.CURDIR}
6. ${.CURDIR}
Variable expansion is performed on the value before it's
used, so expressions such as
${.CURDIR:S,^/usr/src,/var/obj,}
may be used. This is especially useful with
`MAKEOBJDIR'.
`.OBJDIR' may be modified in the makefile via the special
target `.OBJDIR'. In all cases, make will chdir(2) to
the specified directory if it exists, and set `.OBJDIR'
and `PWD' to that directory before executing any targets.
The key part being
In all cases, make will chdir(2) to the specified directory if it exists, and set `.OBJDIR' and `PWD' to that directory before executing any targets.
By contrast, the GNU make manual page makes no such reference to any sort of automatic determination of OBJDIR, only that it will be used if it is set.
The solution was to override the OBJDIR variable via the pseudotarget .OBJDIR:
.OBJDIR: ./
all:
	cd src && make
clean:
	cd src && make clean
An alternative solution is to prefix the cd targets with ${.CURDIR}, which isn't modified after the chdir into .OBJDIR.
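As a rough sketch of that alternative, assuming FreeBSD's make, where the starting directory is available as `${.CURDIR}` and `${MAKE}` names the running make binary. This is an illustration, not the Makefile that was actually tested above:

```makefile
# BSD make only: .CURDIR is the directory make was started in,
# even after make has changed into ./obj.
all:
	cd ${.CURDIR}/src && ${MAKE} all

clean:
	cd ${.CURDIR}/src && ${MAKE} clean
```

GNU make spells the corresponding variable `CURDIR` (without the leading dot), so this fragment is not portable between the two as written.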
I don't get why gmake behaved the same way, however. That feels almost like a bug to me.

Is include path relative to current directory or source code location?

I am confused about makefiles searching the include paths.
Let's say I have a file structure:
.
├── hdrMainFolder.h
├── headers
│   └── hdrDifferentPath.h
├── makefile
├── sourceCode.cpp
└── src
    ├── hdrSamePath.h
    └── src1.cpp
I use -I option in the makefile to indicate the paths of the included headers.
Here are the included headers from src1.cpp
#include "hdrSamePath.h"
#include "hdrMainFolder.h"
#include "hdrDifferentPath.h"
Which of the paths should I specify explicitly in the makefile? Which of them are unnecessary? Is the following enough?
INCLUDING = -Isrc -Iheaders
Is it necessary to specify the path of a header if it is only included by a source file in the same directory?
Which of the paths should I specify explicitly in the makefile?
Which of them are unnecessary?
Make itself does not search for headers; it only passes the -I options on to the compiler, so the lookup rules are the compiler's. The directory in which the compile command runs (directly from the shell or from the makefile) is the compiler's current directory, and relative -I paths are resolved against it.
With GCC and Clang, a header included with #include "..." is searched for first in the directory of the file that contains the #include, and then in the -I directories. The current directory (./) is not searched unless you add it with -I.
So, when src/src1.cpp is compiled from the top-level directory:
hdrSamePath.h -> in the same directory as src1.cpp, no need to add -I for this
hdrDifferentPath.h -> need to add -I (-I./headers)
hdrMainFolder.h -> need to add -I (-I.)
[You may omit ./ in the -I options above; I keep it for clarity]
Is it necessary to specify the path of a header if it is only
included by a source file in the same directory?
No. For the quoted form of #include, the directory of the including file is searched first, so no -I option is needed for it.
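A minimal sketch of how the flags might look in the makefile, assuming GCC or Clang is invoked from the top-level directory. The target name `app` is hypothetical, and all three directories are listed explicitly, which is harmless even where one of them turns out to be redundant:

```makefile
# Relative -I paths are resolved against the directory the compiler
# is started in, which here is the directory containing the makefile.
CPPFLAGS := -I. -Isrc -Iheaders

# Hypothetical target name; $(CXX) is make's built-in C++ compiler variable.
app: sourceCode.cpp src/src1.cpp
	$(CXX) $(CPPFLAGS) $^ -o $@
```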

Makefile rule that detects any changed file in a directory (and subdirs)

I want to create a Makefile rule that runs whenever anything is changed inside a directory (which contains multiple source files in different languages, and at different subdirectory levels).
As an example, take this Makefile:
newest: src
	touch newest
with a tree like:
src/
src/a
src/subdir/
src/subdir/c
First time I run make, newest is created all right. But if I now touch src/subdir/b, make does nothing.
Is it possible at all to create such a rule?
I think you would need to use something like FILES := $(shell find src -type f) and a rule of newest: $(FILES) to get the sort of behavior you want.
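Put together, that suggestion could look like the sketch below. It assumes file names without spaces, since make splits the find output on whitespace.

```makefile
# Every regular file under src/, at any depth, becomes a prerequisite.
# := runs find once, when the makefile is read.
FILES := $(shell find src -type f)

newest: $(FILES)
	touch newest
```

A modified or newly added file is newer than newest and triggers the rule. A deleted file does not, because it simply drops out of the prerequisite list.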
