I am working on a book. The chapters will be written in Markdown (.md), and then converted to both html and pdf (via LaTeX) versions using pandoc. Each chapter has a handful of associated Python scripts that generate some images and need to be run before the chapter is built. I am trying to write a makefile that will compile all the chapters to these two formats.
For now, the project is structured as follows:
project
|--- makefile
|--- chapters
     |--- chapter1
     |    |--- main.md
     |    |--- genimage.py
     |    |--- genanotherimage.py
     |--- chapter2
     |    |--- main.md
     |    |--- otherimage.py
     |--- output
          |--- html
          |    |--- chapter1.html
          |    |--- chapter2.html
          |--- pdf
               |--- chapter1.pdf
               |--- chapter2.pdf
I would like to type "make chapter1" (or similar) and have it refresh both output/html/chapter1.html and output/pdf/chapter1.pdf, re-running all the .py scripts in the corresponding directory if they have changed. Ideally I would have one rule that handles all the chapters in parallel rather than a separate rule for each one. (The actual command to generate the html/pdf is "pandoc -o output/html/chapter1.html chapter1/main.md" and so on.)
I am not very familiar with make and my attempts so far have been very unsuccessful. I can't manage to make a target where there are multiple files to update, and I have not managed to use patterns to handle each chapter with a single rule. I am happy to reorganize somewhat if it makes things easier.
Is this workflow possible with a makefile? I am grateful for any hints to get started; I'm at a loss and even just knowing the right things to look up in the manual would be very helpful.
The following is based on the project tree you show and assumes GNU make. It also assumes that you must run pandoc and your python scripts from the top level directory of the project. Pattern rules can probably help:
CHAPTERS := $(notdir $(wildcard chapters/chapter*))
.PHONY: all $(CHAPTERS)
all: $(CHAPTERS)
$(CHAPTERS): %: chapters/output/html/%.html chapters/output/pdf/%.pdf
chapters/output/html/%.html chapters/output/pdf/%.pdf: chapters/%/main.md
	for python_script in $(wildcard $(<D)/*.py); do ./$$python_script; done
	mkdir -p chapters/output/html chapters/output/pdf
	pandoc -o chapters/output/html/$*.html <other-options> $<
	pandoc -o chapters/output/pdf/$*.pdf <other-options> $<
The main subtlety is that when GNU make encounters a pattern rule with several targets it considers that one single execution of the recipe builds all targets. In our case the HTML and PDF outputs are produced by the same execution of the recipe.
Note: since version 4.3, GNU make also supports rules with grouped targets (&:), which have the same effect.
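As a rough sketch of what a grouped-target rule could look like for one chapter (this assumes GNU make 4.3 or later; the paths are taken from the example above and the pandoc options are left out):

```make
# Sketch only: "&:" needs GNU make 4.3 or later.
# It tells make that one run of the recipe produces both files.
chapters/output/html/chapter1.html chapters/output/pdf/chapter1.pdf &: chapters/chapter1/main.md
	mkdir -p chapters/output/html chapters/output/pdf
	pandoc -o chapters/output/html/chapter1.html $<
	pandoc -o chapters/output/pdf/chapter1.pdf $<
```

With an ordinary ":" in an explicit rule like this one, make would instead treat the two targets as independent and could run the recipe once per target.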
This is not 100% perfect because a chapter will not be rebuilt if you modify or add python scripts. If you also need this we will need more sophisticated GNU make features like secondary expansion or eval.
Example with secondary expansion:
CHAPTERS := $(notdir $(wildcard chapters/chapter*))
.PHONY: all $(CHAPTERS)
all: $(CHAPTERS)
$(CHAPTERS): %: chapters/output/html/%.html chapters/output/pdf/%.pdf
.SECONDEXPANSION:
chapters/output/html/%.html chapters/output/pdf/%.pdf: chapters/%/main.md $$(wildcard chapters/$$*/*.py)
	for python_script in $(wildcard $(<D)/*.py); do ./$$python_script; done
	mkdir -p chapters/output/html chapters/output/pdf
	pandoc -o chapters/output/html/$*.html $<
	pandoc -o chapters/output/pdf/$*.pdf $<
To understand why $$ is needed in $$(wildcard chapters/$$*/*.py), see the section on secondary expansion in the GNU make manual.
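A smaller illustration of the same mechanism may help. The directory names notes/ and lists/ below are hypothetical and exist only for this sketch:

```make
.SECONDEXPANSION:

# While the makefile is read, "$$" collapses to "$", so make stores the
# prerequisite text $(wildcard notes/$*/*.txt) without expanding it.
# When the rule is later tried for a target such as lists/foo.list,
# that text is expanded a second time, now with $* set to "foo".
lists/%.list: $$(wildcard notes/$$*/*.txt)
	mkdir -p lists
	printf '%s\n' $^ > $@
```

With a single $ the wildcard call would be expanded while the makefile is read, when $* is still empty, so it would not find the files of any particular stem.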
Example with eval:
CHAPTERS := $(notdir $(wildcard chapters/chapter*))
.PHONY: all $(CHAPTERS)
all: $(CHAPTERS)
$(CHAPTERS): %: chapters/output/html/%.html chapters/output/pdf/%.pdf
# $1: chapter
define CHAPTER_RULE
PYTHON_SCRIPTS_$1 := $$(wildcard chapters/$1/*.py)
chapters/output/html/$1.html chapters/output/pdf/$1.pdf: chapters/$1/main.md $$(PYTHON_SCRIPTS_$1)
	for python_script in $$(PYTHON_SCRIPTS_$1); do ./$$$$python_script; done
	mkdir -p chapters/output/html chapters/output/pdf
	pandoc -o chapters/output/html/$1.html $$<
	pandoc -o chapters/output/pdf/$1.pdf $$<
endef
$(foreach c,$(CHAPTERS),$(eval $(call CHAPTER_RULE,$c)))
To understand why $$ or $$$$ is needed, see the sections on the eval and call functions in the GNU make manual.
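The number of $ signs follows from how many times the text is expanded before it reaches the shell. A stripped-down sketch (DEMO_RULE and the demo-x target are made-up names):

```make
# $1      is replaced by $(call ...)
# $$$$f   becomes $$f after $(call ...), and $f when the recipe is
#         expanded just before it is handed to the shell
define DEMO_RULE
demo-$1:
	@for f in a b; do echo "$1: $$$$f"; done
endef

$(eval $(call DEMO_RULE,x))
```

Running "make demo-x" with this fragment prints "x: a" and then "x: b": the shell variable survives both rounds of make expansion because each round halves the number of $ signs.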
Related
My goal is the following: I have a directory src which contains markdown files (.md). I want to run a command on each of these files so that the comments are removed and the edited files are stored in a separate directory. For this I want to use make.
This is the Makefile I have:
.PHONY: clean all
BUILD_DIR := build
SRC_DIRS := src
SRCS := $(shell find $(SRC_DIRS) -name *.md)
DSTS := $(patsubst $(SRC_DIRS)/%.md,$(BUILD_DIR)/%.md,$(SRCS))
all: $(DSTS)
# The aim of this is to remove all my comments from the final documents
$(DSTS): $(SRCS)
	pandoc --strip-comments -f markdown -i $< -t markdown -o $@
clean:
	rm $(BUILD_DIR)/*.md
While this works in general, I noticed that the command is executed on all files, even though I changed only one single file.
Example: I have 3 Files src/a.md, src/b.md and src/c.md. Now I run make and all files are correctly generated in the build folder. Now I only edit c.md and run make again. I would expect that make only "compiles" src/c.md anew but instead all three files are compiled again. What am I doing wrong?
Your line
$(DSTS): $(SRCS)
is saying ‘All of the DSTS depend on all of the SRCS’, so whenever any one of the $(SRCS) is newer than any of the $(DSTS), this pandoc action will be run.
That's not what you want to express. What you want is something more like
$(BUILD_DIR)/%.md: $(SRC_DIRS)/%.md
	pandoc --strip-comments -f markdown -i $< -t markdown -o $@
all: $(DSTS)
That says that all of the $(DSTS) should be up to date, and the pattern rule teaches Make what each one depends on, and how to build it, if it is out of date.
(As a general point, looking at your original rule, it's rarely the right thing to do to have multiple targets in a rule, as you have with $(DSTS); also note that in your original, $< always refers only to the first of the dependencies in $(SRCS).)
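For completeness, here is one way the whole makefile might look with a per-file rule. This is a sketch under a few assumptions: it uses a static pattern rule instead of a plain pattern rule (either works here), quotes the find pattern so the shell does not expand it, creates the output directory on demand, and passes the input file to pandoc as a positional argument:

```make
BUILD_DIR := build
SRC_DIRS  := src
SRCS := $(shell find $(SRC_DIRS) -name '*.md')
DSTS := $(patsubst $(SRC_DIRS)/%.md,$(BUILD_DIR)/%.md,$(SRCS))

.PHONY: all clean
all: $(DSTS)

# Each file in $(DSTS) depends only on the source file with the same stem.
$(DSTS): $(BUILD_DIR)/%.md: $(SRC_DIRS)/%.md
	mkdir -p $(@D)
	pandoc --strip-comments -f markdown -t markdown -o $@ $<

clean:
	rm -f $(DSTS)
```

After editing only src/c.md, a second run of make should then re-run pandoc for build/c.md alone.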
I have a series of directories organized like this:
foo/
    foo.file1 foo.file2
bar/
    bar.file1 bar.file2
baz/
    baz.file1 baz.file2
Right now I'm processing these files using a script that does all the checking for file existence etc but I thought that perhaps I could use a Makefile for it (since said script is very fragile), to avoid reprocessing files that did not change.
The problem is that each directory is independent, and I'd need to do, for example:
foo.file1.processed: foo.file1
	run_random_program foo.file1 -o foo.file1.processed
for each of the 71 directories that are in total in that path. This looks like being extremely tedious and I wonder if there's something that would prevent me from writing all of this by hand.
Is such a thing possible?
EDIT: Some examples that show what I have in mind, had I a single Makefile for each directory:
file1.cds.callable: file1.callable
	long_script_name -i $< -o $@
file1.rds: file1.cds.callable
	another_long_script_name $< additional_file_in_folder $@
file1.csv: file1.rds
	yet_another_script $< $@
Seems like pattern rules are exactly what you need:
# These are the original source files (based on the example)
CALLABLE := $(wildcard */*.callable)
# These are the final targets
TARGETS := $(CALLABLE:%.callable=%.csv)
all: $(TARGETS)
%.csv : %.rds
	yet_another_script $< $@
%.rds: %.cds.callable
	another_long_script_name $< additional_file_in_folder $@
%.cds.callable: %.callable
	long_script_name -i $< -o $@
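One behaviour worth knowing about with chained pattern rules like these: GNU make treats the .rds and .cds.callable files as intermediate files and deletes them once the .csv has been built. If you would rather keep them (an assumption about your workflow, not something stated in the question), a possible addition is:

```make
# .SECONDARY with no prerequisites marks every target as secondary,
# so make no longer deletes intermediate files automatically.
.SECONDARY:
```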
I would like to copy pdf files from several directories into a build directory, then use pdfunite to compile them into one pdf. The following make recipe works, but I have to run it twice because the first time through, I get an error from pdfunite - no files are found in the build directory (the PDFS variable is empty) even though they were just copied in the previous lines. How can I fix this so it works in one pass? I have simplified the recipe for clarity; I am actually pulling from various folders and making some pdfs on the fly as well, so I can't easily concatenate a full list of files from various subfolders (folder1 and folder2 in the example) to pass to pdfunite.
notebook:
	mkdir -p $(out)
	mkdir -p $(build)/notebook
	$(eval PR := $(sort $(wildcard $(data)/folder1/*.pdf)) )
	cp $(PR) $(build)/notebook
	$(eval SR := $(sort $(wildcard $(data)/folder2/*.pdf)) )
	cp $(SR) $(build)/notebook
	$(eval PDFS := $(sort $(wildcard $(build)/notebook/*.pdf)) )
	pdfunite $(PDFS) $(out)/notebook.pdf
Your Makefile is not in line with make's philosophy. You are using make as another scripting language, while make is more than this. It compares targets and prerequisites dates, based on this decides which must be built or re-built, and passes the recipes to the shell. So, for your particular problem, you should rather try something like:
PR := $(wildcard $(data)/folder1/*.pdf)
SR := $(wildcard $(data)/folder2/*.pdf)
PDFS1 := $(patsubst $(data)/folder1/%.pdf,$(build)/notebook/%.pdf,$(PR))
PDFS2 := $(patsubst $(data)/folder2/%.pdf,$(build)/notebook/%.pdf,$(SR))
PDFS := $(sort $(PDFS1) $(PDFS2))
.PHONY: notebook
notebook: $(out)/notebook.pdf
$(PDFS1): $(build)/notebook/%.pdf: $(data)/folder1/%.pdf | $(build)/notebook
	cp $< $@
$(PDFS2): $(build)/notebook/%.pdf: $(data)/folder2/%.pdf | $(build)/notebook
	cp $< $@
$(build)/notebook $(out):
	mkdir -p $@
$(out)/notebook.pdf: $(PDFS) | $(out)
	pdfunite $(PDFS) $@
The variables definitions are reasonably straightforward: patsubst, as its name says, substitutes strings. The target: pattern: prerequisites is a static pattern rule. And the prerequisites after | are order-only prerequisites.
What this makefile says, basically, is that $(out)/notebook.pdf depends on a set of pdf files in $(build)/notebook/ and that these pdf files depend on source pdf files with the same basenames in $(data)/folder1/ and $(data)/folder2/. It also says that directories must be created before being populated. Thanks to all this only what needs to be done will be done, no more, no less. And it is more in line with make's philosophy.
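The two less familiar features, static pattern rules and order-only prerequisites, can be seen in isolation in the following sketch. The src/ and build/ directories and the .txt files are hypothetical and stand in for the real layout:

```make
# Hypothetical layout: every src/*.txt is copied to build/*.txt.
SRCS := $(wildcard src/*.txt)
DSTS := $(patsubst src/%.txt,build/%.txt,$(SRCS))

.PHONY: all
all: $(DSTS)

# Static pattern rule: applies only to the files listed in $(DSTS).
# "| build" is order-only: build/ must exist before copying, but its
# timestamp (which changes whenever a file is added to it) never
# forces a file to be copied again.
$(DSTS): build/%.txt: src/%.txt | build
	cp $< $@

build:
	mkdir -p $@
```

If build were listed as a normal prerequisite instead, every copy into the directory would update its modification time and could make the other targets look out of date.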
If you have many source folders and do not want to replicate the copying rules, you can use more advanced features like:
FOLDERS := folder1 folder2
.PHONY: notebook
notebook: $(out)/notebook.pdf
define MY_rule
$(1)_SRCS := $$(wildcard $$(data)/$(1)/*.pdf)
$(1)_DSTS := $$(patsubst $$(data)/$(1)/%.pdf,$$(build)/notebook/%.pdf,$$($(1)_SRCS))
PDFS += $$($(1)_DSTS)
$$($(1)_DSTS): $$(build)/notebook/%.pdf: $$(data)/$(1)/%.pdf | $$(build)/notebook
	cp $$< $$@
endef
$(foreach f,$(FOLDERS),$(eval $(call MY_rule,$(f))))
$(build)/notebook $(out):
	mkdir -p $@
$(out)/notebook.pdf: $(PDFS) | $(out)
	pdfunite $(PDFS) $@
I would do it another way.
PR:=$(sort $(wildcard $(data)/folder1/*.pdf))
SR:=$(sort $(wildcard $(data)/folder2/*.pdf))
PDFS=$(sort $(wildcard $(build)/notebook/*.pdf))
all: copy
	pdfunite $(PDFS) $(out)/notebook.pdf
copy:
	mkdir -p $(out)
	mkdir -p $(build)/notebook
	cp $(PR) $(build)/notebook
	cp $(SR) $(build)/notebook
.PHONY: all copy
Please note: it is PDFS= and not PDFS:=. With a plain = the variable's value is computed only when it is needed (not sooner!).
When you run make, it wants to build all. The prerequisite of all is copy, so make first runs the mkdir and cp commands. When it then returns to all, the value of PDFS is needed, so it is evaluated only now, when there are already many PDFs in $(build)/notebook :)
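The difference between the two kinds of assignment can be seen without touching the file system. In this small sketch, X, IMMEDIATE and DEFERRED are names invented for the demonstration:

```make
X := first
IMMEDIATE := $(X)   # expanded right here, while X is still "first"
DEFERRED   = $(X)   # stored unexpanded; expanded each time it is used
X := second

.PHONY: demo
demo:
	@echo "IMMEDIATE=$(IMMEDIATE)"
	@echo "DEFERRED=$(DEFERRED)"
```

Running "make demo" prints IMMEDIATE=first and DEFERRED=second, because the deferred variable is expanded only when the recipe is about to run.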
Good evening,
I'm completely new to makefiles and have worked out a file which fits our needs well, but I'm not completely satisfied. We use bootstrap3 and have around 40 customers with different color settings. That's why we need to compile 40 slightly different css files. Until now, we have the following file structure
less/customer1.less
css/customer1.css
color/customer.less contains bootstraps variables file
customer1.less contains
@variables: 'myCompany/color/customer1'; //this is forwarded to where bootstrap loads the variables template
@import "bootstrap";
@import 'myCompany/modifications';
Our makefile
SOURCES = $(shell ls less/*.less)
# Files we don't want to be build
SOURCES := $(filter-out less/bootstrap.less, $(SOURCES))
SOURCES := $(filter-out less/a11y.less, $(SOURCES))
TARGETS = $(patsubst less/%.less,css/%.css,$(SOURCES))
DEPEND = $(patsubst less/%.less,make/%.d,$(SOURCES))
css/%.css: less/%.less
# First building dependency files
	lessc -M $< $@ > 'make/$*.d'
# Then building CSS and sourcemap
	lessc -s $< > $@ --source-map=map/$*.css.map --source-map-basepath=map --clean-css
-include $(DEPEND)
all: $(TARGETS)
Call
$ make all
Creates Makefiles in make/, CSS in css/ CSS source-maps in map/ and expects LESS being in less/.
This works but we need to create customerX.less for each customer manually even if the only difference is the assigned color scheme/variables file.
Make should look in the color folder if there is a file for this customerX and then create (but not overwrite!) customerX.less in less directory.
Any make guru out here know how to do this with make?
I believe you can do what you want here with an order-only prerequisite.
Something like:
less/customer%.less: | color/customer.less
	[ -f '$@' ] || cp $| $@
I don't think the -f test is strictly necessary there but it shouldn't hurt and is safer.
On a different topic $(shell ls less/*.less) can probably be done better with either $(shell echo less/*.less) (you don't care about what ls does you just want the shell glob expansion) or $(wildcard less/*.less). (Technically shelling out and wildcard are slightly different but I don't know that that will matter for you here.)
Also note that the all target will not create these missing less files for you (as SOURCES will not contain them as the file didn't exist) but make css/customer#.css will create them if necessary.
I am writing a GNUmakefile to create a workflow to analyse some biological sequence data. The data comes in a format called fastq, which then undergoes a number of cleaning and analysis tools. I have attached what I currently have written, which takes me all the way from quality control before cleaning and then quality control afterwards. My problem is that I'm not sure how to get the 'fastqc' commands to run, as its targets are not dependencies for any of the other steps in the workflow.
%_sts_fastqc.html %_sts_fastqc.zip: %_sts.fastq
# perform quality control after cleaning reads
	fastqc $^
%_sts.fastq: %_st.fastq
# trim reads based on quality
	sickle se -f $^ -t illumina -o $@
%_st.fastq: %_s.fastq
# remove contaminated reads
	tagdust -s adapters.fa $^
%_s.fastq: %.fastq
# trim adapters
	scythe -a <adapters.fa> -o $@ $^
%_fastqc.html %_fastqc.zip: %.fastq
# perform quality control before cleaning reads
	fastqc $^
%.fastq: %.sra
# convert .sra to .fastq
	fastq-dump $^
I believe adding these lines to the start of your Makefile will do what you are asking for:
SOURCES:=$(wildcard *.sra)
TARGETS:=$(SOURCES:.sra=_fastqc.html) $(SOURCES:.sra=_fastqc.zip)\
$(SOURCES:.sra=_sts_fastqc.html) $(SOURCES:.sra=_sts_fastqc.zip)
.PHONY: all
all: $(TARGETS)
What this does is grab all .sra files from the file system and build a list of targets by replacing the extension with whatever strings are necessary to produce the targets. (Note that, since the html and zip targets are produced by the same command, I could list one or the other, but I've decided to list both in case the rules change and the html and zip targets are ever produced separately.) Then it sets the phony all target to build all the computed targets. Here is a Makefile I've modified from yours by adding @echo everywhere, which I used to check that things were okay without having to run the actual commands in your Makefile. You could copy and paste it into a file to first check that everything is fine before modifying your own Makefile with the lines above. Here it is:
SOURCES:=$(wildcard *.sra)
TARGETS:=$(SOURCES:.sra=_fastqc.html) $(SOURCES:.sra=_fastqc.zip)\
$(SOURCES:.sra=_sts_fastqc.html) $(SOURCES:.sra=_sts_fastqc.zip)
.PHONY: all
all: $(TARGETS)
%_sts_fastqc.html %_sts_fastqc.zip: %_sts.fastq
# perform quality control after cleaning reads
	@echo fastqc $^
%_sts.fastq: %_st.fastq
# trim reads based on quality
	@echo sickle se -f $^ -t illumina -o $@
%_st.fastq: %_s.fastq
# remove contaminated reads
	@echo tagdust -s adapters.fa $^
%_s.fastq: %.fastq
# trim adapters
	@echo 'scythe -a <adapters.fa> -o $@ $^'
%_fastqc.html %_fastqc.zip: %.fastq
# perform quality control before cleaning reads
	@echo fastqc $^
%.fastq: %.sra
# convert .sra to .fastq
	@echo fastq-dump $^
I tested it here by running touch a.sra b.sra and then running make. It ran the commands for both files.
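The substitution references used to build TARGETS can also be tried on their own. The sketch below lists two hypothetical file names by hand, so it does not depend on any .sra files being present:

```make
# Hypothetical inputs, written out instead of using $(wildcard *.sra).
SOURCES := a.sra b.sra
# $(VAR:old=new) replaces the suffix "old" with "new" in every word.
TARGETS := $(SOURCES:.sra=_fastqc.html) $(SOURCES:.sra=_sts_fastqc.html)

.PHONY: show
show:
	@echo $(TARGETS)
```

Running "make show" prints a_fastqc.html b_fastqc.html a_sts_fastqc.html b_sts_fastqc.html.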
Instead of using patterns, I would use a 'define':
# 'all' is not a file
.PHONY: all
# a list of 4 samples
SAMPLES=S1 S2 S3 S4
# Define a macro named analyzefastq. It takes one argument, $(1). We need to protect the '$' for later expansion by $(eval).
define analyzefastq
# create a .st.fastq from fastq for file $(1)
$(1).st.fastq : $(1).fastq
	tagdust -s adapters.fa $$^
# create a .fastq from sra for file $(1)
$(1).fastq : $(1).sra
	fastq-dump $$^
endef
# all: the final target depends on every sample with the suffix '.st.fastq'
all: $(addsuffix .st.fastq,${SAMPLES})
## Loop over each sample; the loop variable is 'S'. Call and eval the macro above with 'S' as the argument.
$(foreach S,${SAMPLES},$(eval $(call analyzefastq,$(S))))
I also use my tool jsvelocity https://github.com/lindenb/jsvelocity to generate large Makefile for NGS:
https://gist.github.com/lindenb/3c07ca722f793cc5dd60