Make workflow with a lot of intermediary files - makefile

I have some trouble with my Makefile:
# Manage rendering of images
.PHONY: explode
all: explode anime.apng
out.ppm: file.code
	./pgm -f $<
explode: out.ppm
	split -d -a 3 --lines=$(N) --additional-suffix=.ppm $< frame
# Convert to multiple png
%.png: %.ppm
	convert $< $@
	optipng $@
# Assemble in one animated png
anime.apng: %.png
	apngasm $@ frame000.png
My problem is: I don't know how many intermediate files I will have to produce my final target, so I can't specify them in advance. Schematically:
1 file.code -> 1 out.ppm |> LOADS of .ppm |> LOADS of .png -> 1 anime.apng
                         +> …             +> …
                         +> …             +> …
For that I use an implicit rule %.png: %.ppm. But then I cannot specify a prerequisite for my last merge step! Any ideas? With another tool than make? Anything elegant?

I think a simple and rather clean approach would be to have a variable record the list of those 'LOADS' of ppm files; in my example it's the variable STEP2.
Surely you can use the program that gets you from '1 out.ppm' to 'LOADS of .ppm' to list the .ppm files that you will obtain.
With a very trivial example where out.ppm would be a text file listing the names of the .ppm files to produce, you would write something like:
SOURCE = file.code
STEP1 = out.ppm
STEP2 = $(shell cat $(STEP1))
STEP3 = $(STEP2:%.ppm=%.png)
TARGET = anime.apng
Then you'll need to write a rule to get all the files listed in STEP2 from the file $(STEP1). This is done file by file, as if it were an implicit rule with a % pattern, assuming your program is called 'extractor':
$(STEP2): $(STEP1)
	extractor $^ $@
This rule will be applied once for each file listed in STEP2. This assumes that your program only wants the names of the source and output files. Should you prefer to pass the stem of the output file, you can use a static pattern rule instead:
$(STEP2):%.ppm: $(STEP1)
	extractor $^ $*
(The $(STEP2): at the beginning restricts the pattern to the files listed in STEP2, which prevents make from using this rule to generate out.ppm.)
Then everything is as usual when it comes to compilation, and you can adapt the rules used for compiling and linking any C project. The %.ppm -> %.png step is like compiling %.c to %.o:
%.png: %.ppm
	convert $< $@
	optipng $@
To group everything at the end (the equivalent of linking several %.o files into a single binary):
$(TARGET): $(STEP3)
	apngasm $@ $^
I'm assuming here that apngasm can take the list of everything to put together as arguments à la tar.
Hope it's clear and helpful enough.
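One caveat with this approach: $(shell cat $(STEP1)) is expanded while make reads the rules, so out.ppm has to exist before the frame list can be computed. A minimal sketch of one way around that, assembling the pieces above and re-running make once out.ppm is built. 'extractor' and the one-name-per-line content of out.ppm remain the illustrative assumptions made above, and the re-invocation step is an addition, not something the snippets above require:

```makefile
# Sketch only: 'extractor' and the "list of names" format of out.ppm are
# illustrative assumptions, not real tools or formats.
SOURCE = file.code
STEP1  = out.ppm
# Expanded when make reads the rules; empty if out.ppm does not exist yet.
STEP2  = $(shell cat $(STEP1) 2>/dev/null)
STEP3  = $(STEP2:%.ppm=%.png)
TARGET = anime.apng

.PHONY: all
# Build out.ppm first, then re-run make so the frame list is read from it.
all: $(STEP1)
	$(MAKE) $(TARGET)

$(STEP1): $(SOURCE)
	./pgm -f $<

# One extractor call per frame listed in out.ppm.
$(STEP2): $(STEP1)
	extractor $^ $@

%.png: %.ppm
	convert $< $@
	optipng $@

$(TARGET): $(STEP3)
	apngasm $@ $^
```

On the first pass STEP2 may be empty, in which case GNU make ignores the rule that has no targets; the second pass sees the real list.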

A temporary workaround would be to take inspiration from a closely related question and render the images in a subfolder with a sub-make call. Something like:
Makefile:
# Manage rendering of images
.PHONY: explode
all: explode anime.apng
out.ppm: file.code
	./pgm -f $<
explode: out.ppm
	split -d -a 3 --lines=$(N) --additional-suffix=.ppm $< subfolder/frame
ppm2png: explode
	$(MAKE) -C subfolder
# Assemble in one animated png
anime.apng: ppm2png
	apngasm $@ subfolder/frame000.png
subfolder/Makefile:
SOURCES := $(wildcard *.ppm)
OUTPUTS = $(patsubst %.ppm,%.png,$(SOURCES))
.PHONY: all
all: $(OUTPUTS)
# Convert to multiple png
%.png: %.ppm
	convert $< $@
	optipng $@
I'm sure better can be done. With another tool than make?

Related

Makefile the dependency is a list variable but $< is only taking the first dependency

I've some similar files on which I want to do an operation using makefile. So I'm doing this:
INPUT := $(wildcard *.png)
OUTPUT := $(INPUT:.png=.jpeg)
.PHONY: all
all: $(OUTPUT)
$(OUTPUT): $(INPUT)
	convert $< -resize 30x30 $@
I'm getting the correct jpeg file names, but the image is the same (the first dependency) in all the files.
I know that $< only refers to the first dependency in the list, and that $^ gives all the dependencies, but for every output.
Is there any way to use dep1 for output1, dep2 for output2, and so on?
Written this way, the rule declares that each output file depends on every input file. You should be using a static pattern rule instead, i.e.:
$(OUTPUT): %.jpeg: %.png
	convert $< -resize 30x30 $@
I'd be inclined just to use:
%.jpeg: %.png
	convert $< -resize 30x30 $@
In any case, you don't want to have to remake all of your thumbnails any time one of the original images changes -- they each only depend on the single corresponding png file.
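To see why every thumbnail came out identical, it may help to write out what the question's rule expands to. A sketch, assuming the directory holds only two files with the hypothetical names a.png and b.png, and that the wildcard returns them in that order:

```makefile
# What "$(OUTPUT): $(INPUT)" becomes once the variables are expanded
# (a.png and b.png are hypothetical file names):
a.jpeg b.jpeg: a.png b.png
	convert $< -resize 30x30 $@

# $@ differs for each target, but $< is the first prerequisite of the rule,
# so both recipes read a.png. Touching either png also remakes both jpegs.
```

A pattern rule avoids both problems because each target gets exactly one prerequisite, derived from its own stem.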

GNU make: create targets based on specific directory contents (1:1 target-directory mapping)

I have a series of directories organized like this:
foo/
foo.file1 foo.file2
bar/
bar.file1 bar.file2
baz/
baz.file1 baz.file2
Right now I'm processing these files using a script that does all the checking for file existence etc but I thought that perhaps I could use a Makefile for it (since said script is very fragile), to avoid reprocessing files that did not change.
The problem is that each directory is independent, and I'd need to do, for example:
foo.file1.processed: foo.file1
run_random_program foo.file1 -o foo.file1.processed
for each of the 71 directories that are in total in that path. This looks like being extremely tedious and I wonder if there's something that would prevent me from writing all of this by hand.
Is such a thing possible?
EDIT: Some examples that show what I have in mind, had I a single Makefile for each directory:
file1.cds.callable: file1.callable
	long_script_name -i $< -o $@
file1.rds: file1.cds.callable
	another_long_script_name $< additional_file_in_folder $@
file1.csv: file1.rds
	yet_another_script $< $@
Seems like pattern rules are exactly what you need:
# These are the original source files (based on the example)
CALLABLE := $(wildcard */*.callable)
# These are the final targets
TARGETS := $(CALLABLE:%.callable=%.csv)
all: $(TARGETS)
%.csv : %.rds
	yet_another_script $< $@
%.rds: %.cds.callable
	another_long_script_name $< additional_file_in_folder $@
%.cds.callable: %.callable
	long_script_name -i $< -o $@
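One thing to watch for: the glob */*.callable also matches the generated *.cds.callable files once they exist, so a later run would derive extra targets from them. Filtering them out of the source list is one way to avoid that. This is a sketch that assumes none of the original source files are themselves named *.cds.callable; the pattern rules stay unchanged:

```makefile
# Keep only the original sources: drop the generated *.cds.callable files,
# which the */*.callable glob also matches after the first build.
CALLABLE := $(filter-out %.cds.callable,$(wildcard */*.callable))

# Final targets, one .csv per original .callable file.
TARGETS := $(CALLABLE:%.callable=%.csv)

.PHONY: all
all: $(TARGETS)
```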

Makefile for dotfiles (graphviz)

As part of generating a PDF from a latex file, I got a makefile from tex.stackexchange.com.
# You want latexmk to *always* run, because make does not have all the info.
# Also, include non-file targets in .PHONY so they are run regardless of any
# file of the given name existing.
.PHONY: paper-1.pdf all clean
# The first rule in a Makefile is the one executed by default ("make"). It
# should always be the "all" rule, so that "make" and "make all" are identical.
all: paper-1.pdf
# CUSTOM BUILD RULES
# In case you didn't know, '$@' is a variable holding the name of the target,
# and '$<' is a variable holding the (first) dependency of a rule.
# "raw2tex" and "dat2tex" are just placeholders for whatever custom steps
# you might have.
%.tex: %.raw
	./raw2tex $< > $@
%.tex: %.dat
	./dat2tex $< > $@
# MAIN LATEXMK RULE
# -pdf tells latexmk to generate PDF directly (instead of DVI).
# -pdflatex="" tells latexmk to call a specific backend with specific options.
# -use-make tells latexmk to call make for generating missing files.
# -interaction=nonstopmode keeps the pdflatex backend from stopping at a
# missing file reference and interactively asking you for an alternative.
paper-1.pdf: paper-1.tex
latexmk -bibtex -pdf -pdflatex="pdflatex -interaction=nonstopmode" -use-make paper-1.tex
clean:
latexmk -bibtex -CA
My figures are .dot files that I turn into PNG files. I can make the PNGs with some basic shell commands, but it doesn't make sense to use a shell script because you lose the advantages of make.
Here's what I've been trying after reading some documentation.
%.png: %.dot
dot -Tpng $(.SOURCE) -o $(.TARGET)
and
.dot.png:
dot -Tpng $(.SOURCE) -o $(.TARGET)
However, whenever I try to run the target directly, the terminal prints:
dot -Tpng -o
and it hangs, waiting for input on STDIN because no input file was given.
If I try to invoke the rule by running make *.dot I get the output:
make: Nothing to be done for `figure-1a.dot'.
make: Nothing to be done for `figure-1b.dot'.
I'm clearly not understanding what I need to do. How do I get the makefile to take all the .dot files and create .png files every time I run through the creation of the PDF?
UPDATE: Here is another attempt I tried
graphs := $(wildcard *.dot)
.dot.png: $(graphs)
dot -Tpng $(.SOURCE) -o $(.TARGET).png
GNU make uses $< and $@, not .SOURCE and .TARGET. The makefile should be:
.PHONY: all
all: $(patsubst %.dot,%.png,$(wildcard *.dot))
%.png: %.dot
	dot -Tpng $< -o $@
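To have the figures regenerated as part of building the PDF, as the question asks, one option is to list them as prerequisites of the PDF rule. A sketch that reuses the latexmk command from the question; the FIGURES variable name is introduced here for illustration, and it assumes every .dot file in the directory is a figure of the paper:

```makefile
# Every .dot file in the directory is assumed to be a figure of the paper.
FIGURES := $(patsubst %.dot,%.png,$(wildcard *.dot))

.PHONY: all paper-1.pdf
all: paper-1.pdf

# The figures are prerequisites, so make brings them up to date (only the
# ones whose .dot file changed) before latexmk runs.
paper-1.pdf: paper-1.tex $(FIGURES)
	latexmk -bibtex -pdf -pdflatex="pdflatex -interaction=nonstopmode" -use-make paper-1.tex

%.png: %.dot
	dot -Tpng $< -o $@
```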

GNU Make get the list of all files in a directory that were generated by previous rule

I am looking for Makefile macro to get list of all files in a directory that were generated as rule1 processing and using this list for rule2 processing.
Here's what I am trying to achieve :
Rule 1: Generate source .c files (using xml files) and place them in $(MYDIR) directory.
Rule 2: Get the list of all files in $(MYDIR) and create object files and place them in $(OBJDIR).
Problem is, I want to update list of files in Rule2 after Rule 1 has been processed, else list of files in $(MYDIR) will be empty.
all : rule1 rule2
rule1 : $(MYDIR)/generated_source1.c $(MYDIR)/generated_source2.c
$(MYDIR)/generated_source1.c:
	xsltproc generator1.xml style_generator.xsl -o $(MYDIR)/generated_source1.c
$(MYDIR)/generated_source2.c:
	xsltproc generator2.xml style_generator.xsl -o $(MYDIR)/generated_source2.c
#Get list of all $(MYDIR).*c , create corresponding $(OBJDIR)/*.o list.
SOURCES := $(wildcard $(MYDIR)/*.c)
OBJECTS := $(notdir ${SOURCES})
GENERATED_OBJS := $(patsubst %.c,$(OBJDIR)/%.o,$(OBJECTS))
#This rule is compiling of all .c generated in rule1.
rule2 : $(GENERATED_OBJS)
ld -r -o $(OBJDIR)/generated_lib.o $(GENERATED_OBJS)
$(OBJDIR)/%.o: $(MYDIR)/%.c
	gcc $(CFLAGS) -c -o $@ $<
$(SOURCES) turns out empty, but it should contain generated_source1.c and generated_source2.c.
I am not sure how .SECONDEXPANSION would work in my case.
You can't really (and don't really want to) play around with getting make to re-evaluate file existence during the running of the make process.
What you want to do is track the files from start to finish in make and then you have all your lists.
You can start from either end, but starting with the initial sources tends to be easier.
So start with
MYDIR:=dir
OBJDIR:=obj
XML_SOURCES := $(wildcard $(MYDIR)/*.xml)
then translate from there to the generated source files
SOURCES := $(subst generator,generated_source,$(XML_SOURCES:.xml=.c))
and from there to the generated object files
GENERATED_OBJS := $(patsubst $(MYDIR)/%.c,$(OBJDIR)/%.o,$(SOURCES))
At which point you can define the default target
all: $(OBJDIR)/generated_lib.o
and then define the rules for each step
$(MYDIR)/%.c:
	cat $^ > $@
$(OBJDIR)/%.o: $(MYDIR)/%.c
	cat $^ > $@
$(OBJDIR)/generated_lib.o: $(GENERATED_OBJS)
	ld -r -o $@ $^
The $(MYDIR)/%.c rule needs a bit of extra magic to actually work correctly. You need to define the specific input/output pairs so that they are used correctly by that rule.
$(foreach xml,$(XML_SOURCES),$(eval $(subst generator,generated_source,$(xml:.xml=.c)): $(xml)))
This .xml to .c step would be easier if the input and output files shared a basename as you could then just use this and be done.
%.c: %.xml
	cat $^ > $@
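When the names differ only by a fixed prefix, as generator1.xml and generated_source1.c do here, a pattern rule whose stem is the varying part may be enough on its own, without the $(eval) loop. A sketch, keeping the placeholder recipe used above and the same assumption that the .xml files live in $(MYDIR):

```makefile
MYDIR := dir

# The stem % matches whatever follows the fixed prefix ("1", "2", ...), so
# dir/generated_source1.c is made from dir/generator1.xml.
# The recipe is the same placeholder as above, not a real generator call.
$(MYDIR)/generated_source%.c: $(MYDIR)/generator%.xml
	cat $^ > $@
```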

How to get a makefile to run all commands, regardless of targets or dependencies

I am writing a GNUmakefile to create a workflow to analyse some biological sequence data. The data comes in a format called fastq and is then run through a number of cleaning and analysis tools. I have attached what I currently have written, which takes me from quality control before cleaning all the way to quality control afterwards. My problem is that I'm not sure how to get the 'fastqc' commands to run, as their targets are not dependencies of any of the other steps in the workflow.
%_sts_fastqc.html %_sts_fastqc.zip: %_sts.fastq
# perform quality control after cleaning reads
	fastqc $^
%_sts.fastq: %_st.fastq
# trim reads based on quality
	sickle se -f $^ -t illumina -o $@
%_st.fastq: %_s.fastq
# remove contaminated reads
	tagdust -s adapters.fa $^
%_s.fastq: %.fastq
# trim adapters
	scythe -a <adapters.fa> -o $@ $^
%_fastqc.html %_fastqc.zip: %.fastq
# perform quality control before cleaning reads
	fastqc $^
%.fastq: %.sra
# convert .sra to .fastq
	fastq-dump $^
I believe adding these lines to the start of your Makefile will do what you are asking for:
SOURCES:=$(wildcard *.sra)
TARGETS:=$(SOURCES:.sra=_fastqc.html) $(SOURCES:.sra=_fastqc.zip)\
$(SOURCES:.sra=_sts_fastqc.html) $(SOURCES:.sra=_sts_fastqc.zip)
.PHONY: all
all: $(TARGETS)
What this does is grab all .sra files from the file system and build the list of targets by replacing the extension with whatever strings are necessary to produce the target names. (Note that since the html and zip targets are produced by the same command, I could have listed just one or the other, but I've decided to list both in case the rules change and the html and zip targets are ever produced separately.) Then it sets the phony all target to build all the computed targets. Here is a Makefile I've modified from yours by adding @echo everywhere, which I used to check that things were okay without having to run the actual commands in your Makefile. You could copy and paste it into a file to check that everything is fine before modifying your own Makefile with the lines above. Here it is:
SOURCES:=$(wildcard *.sra)
TARGETS:=$(SOURCES:.sra=_fastqc.html) $(SOURCES:.sra=_fastqc.zip)\
$(SOURCES:.sra=_sts_fastqc.html) $(SOURCES:.sra=_sts_fastqc.zip)
.PHONY: all
all: $(TARGETS)
%_sts_fastqc.html %_sts_fastqc.zip: %_sts.fastq
# perform quality control after cleaning reads
	@echo fastqc $^
%_sts.fastq: %_st.fastq
# trim reads based on quality
	@echo sickle se -f $^ -t illumina -o $@
%_st.fastq: %_s.fastq
# remove contaminated reads
	@echo tagdust -s adapters.fa $^
%_s.fastq: %.fastq
# trim adapters
	@echo 'scythe -a <adapters.fa> -o $@ $^'
%_fastqc.html %_fastqc.zip: %.fastq
# perform quality control before cleaning reads
	@echo fastqc $^
%.fastq: %.sra
# convert .sra to .fastq
	@echo fastq-dump $^
I tested it here by running touch a.sra b.sra and then running make. It ran the commands for both files.
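A side effect of chaining pattern rules is worth knowing about when the real commands are put back: files that exist only as links in the chain (here the .fastq, _s.fastq, _st.fastq and _sts.fastq files) are treated by GNU make as intermediate and deleted once the final targets are built. If they should be kept, one option is the .SECONDARY special target. This is a sketch, not something the Makefile above needs in order to run:

```makefile
# With no prerequisites, .SECONDARY marks every target as secondary, so GNU
# make does not delete the intermediate .fastq files after the build.
.SECONDARY:
```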
Instead of using patterns, I would use a 'define':
# 'all' is not a file
.PHONY: all
# a list of 4 samples
SAMPLES=S1 S2 S3 S4
# Define a macro named analyzefastq. It takes one argument, $(1). The '$' of automatic variables is escaped as '$$' so that it survives the expansion done before $(eval).
define analyzefastq
# create a .st.fastq from fastq for file $(1)
$(1).st.fastq : $(1).fastq
	tagdust -s adapters.fa $$^
# create a .fastq from sra for file $(1)
$(1).fastq : $(1).sra
	fastq-dump $$^
endef
# all: the final target depends on every sample name with the suffix '.st.fastq'
all: $(addsuffix .st.fastq,${SAMPLES})
# Loop over the samples: for each value of 'S', call the macro with 'S' as its argument and eval the result
$(foreach S,${SAMPLES},$(eval $(call analyzefastq,$(S))))
I also use my tool jsvelocity https://github.com/lindenb/jsvelocity to generate large Makefiles for NGS:
https://gist.github.com/lindenb/3c07ca722f793cc5dd60

Resources