I have written a makefile that prepares some files. It creates the ORIGINAL directory, and the files inside that directory are then used as input for the other rules.
RDIR=.
RFILES:=$(wildcard $(RDIR)/*.vcf)
OUTDIR=ORIGINAL
OUTFILES=$(patsubst %.vcf,$(OUTDIR)/%.gz,$(RFILES))
BCFTOOLS=bcftools
OUTSOMATIC=SOMATIC
OUTVARDICT=$(patsubst $(OUTDIR)/%vardict.gz,$(OUTSOMATIC)/%.somatic.vcf,$(wildcard $(OUTDIR)/*vardict.gz))
OUTMUTEC2=$(patsubst $(OUTDIR)/%mutect2_all.gz,$(OUTSOMATIC)/%mutect2.somatic.vcf,$(wildcard $(OUTDIR)/*mutect2_all.gz))
OUTVARSCAN2=$(patsubst $(OUTDIR)/%varscan.gz,$(OUTSOMATIC)/%varscan2.somatic.vcf,$(wildcard $(OUTDIR)/*varscan.gz))
.PHONY: all
all: $(OUTDIR) $(OUTFILES) $(OUTSOMATIC) $(OUTVARDICT) $(OUTMUTEC2) $(OUTVARSCAN2)
$(OUTDIR)/%.gz: %.vcf
	bgzip -c $< > $@
$(OUTDIR):
	test -d $@ || mkdir $@
$(OUTSOMATIC):
	test -d $@ || mkdir $@
$(OUTSOMATIC)/%.somatic.vcf: $(OUTDIR)/%vardict.gz
	$(BCFTOOLS) view -f PASS -i 'INFO/STATUS ~ ".*Somatic"' $< > $@
$(OUTSOMATIC)/%mutect2.somatic.vcf: $(OUTDIR)/%mutect2_all.gz
	$(BCFTOOLS) view -f PASS $< > $@
$(OUTSOMATIC)/%varscan2.somatic.vcf: $(OUTDIR)/%varscan.gz
	$(BCFTOOLS) view -f PASS -i 'SS="2"' $< > $@
clean:
	rm -rf $(OUTDIR)
	rm -rf $(OUTSOMATIC)
I need to run make -f Makefile three times before all the rules have been executed. How
can I improve this makefile so that a single run is enough?
What is the right way to do this?
Thanks for any help.
If I understand you correctly, your makefile zips vcf files in one directory into gz files in a second directory, then uses those gz files to build vcf files in a third directory (creating the directories as needed), and those final vcf files are the real goal.
You can do it in one pass, if you modify the variable assignments to derive target names from the planned gz files, not the gz files that already exist:
OUTVARDICT=$(patsubst $(OUTDIR)/%vardict.gz,$(OUTSOMATIC)/%.somatic.vcf,$(filter $(OUTDIR)/%vardict.gz,$(OUTFILES)))
OUTMUTEC2=$(patsubst $(OUTDIR)/%mutect2_all.gz,$(OUTSOMATIC)/%mutect2.somatic.vcf,$(filter $(OUTDIR)/%mutect2_all.gz,$(OUTFILES)))
OUTVARSCAN2=$(patsubst $(OUTDIR)/%varscan.gz,$(OUTSOMATIC)/%varscan2.somatic.vcf,$(filter $(OUTDIR)/%varscan.gz,$(OUTFILES)))
and modify the rules to allow Make to determine which intermediates to build:
all: $(OUTVARDICT) $(OUTMUTEC2) $(OUTVARSCAN2)
$(OUTDIR)/%.gz: %.vcf $(OUTDIR)
	bgzip -c $< > $@
$(OUTDIR):
	test -d $@ || mkdir $@
$(OUTSOMATIC):
	test -d $@ || mkdir $@
$(OUTSOMATIC)/%.somatic.vcf: $(OUTDIR)/%vardict.gz $(OUTSOMATIC)
	$(BCFTOOLS) view -f PASS -i 'INFO/STATUS ~ ".*Somatic"' $< > $@
$(OUTSOMATIC)/%mutect2.somatic.vcf: $(OUTDIR)/%mutect2_all.gz $(OUTSOMATIC)
	$(BCFTOOLS) view -f PASS $< > $@
$(OUTSOMATIC)/%varscan2.somatic.vcf: $(OUTDIR)/%varscan.gz $(OUTSOMATIC)
	$(BCFTOOLS) view -f PASS -i 'SS="2"' $< > $@
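To see why the original needed several runs, it can help to print both lists while the makefile is being parsed. The sketch below is only a debugging aid, not part of the fix. EXISTING and PLANNED are hypothetical variable names, and it assumes the .vcf files sit in the current directory.

```make
RFILES   := $(wildcard *.vcf)
OUTDIR   := ORIGINAL
OUTFILES := $(patsubst %.vcf,$(OUTDIR)/%.gz,$(RFILES))

# Built from files that exist right now: empty until ORIGINAL/ has been populated.
EXISTING := $(wildcard $(OUTDIR)/*vardict.gz)
# Built from the names make plans to create: complete on the very first run.
PLANNED  := $(filter $(OUTDIR)/%vardict.gz,$(OUTFILES))

# $(info ...) prints at parse time, before any recipe runs.
$(info existing: $(EXISTING))
$(info planned:  $(PLANNED))

.PHONY: all
all: ;
```

On a clean tree the "existing" list should come out empty, because $(wildcard) only sees files that are already on disk, while the "planned" list already names every vardict archive.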
Related
Say, in a Makefile, I have the following targets:
EXES=dir1/subdir1/abc dir1/subdir1/def dir1/subdir2/ghi dir1/subdir2/jkl dir2/subdir3/mno dir2/subdir3/pqr
Each item in $(EXES) represents a binary to be created. I want to make sure that the necessary directories (in the example: dir1/subdir1, dir1/subdir2, dir2/subdir3) are created if they are not existent.
How would I achieve this with gnu-make?
Use order-only prerequisites:
target: prerequisites | order-only-prerequisites
order-only-prerequisites:
recipe
Their recipe is executed only if they do not exist yet. Example:
$(BUILDDIR)/foo.o: src/foo.c | $(BUILDDIR)
	$(CC) $(CFLAGS) -o $@ $<
$(BUILDDIR):
	mkdir -p $@
And if you want to extract the list of directories to create from the definition of your EXES variable:
$(EXES): | $(dir $(EXES))
$(dir $(EXES)):
	mkdir -p $@
Or, to instantiate exactly one rule per target:
define DIR_rule
$(1): | $$(dir $(1))
endef
$(foreach e,$(EXES),$(eval $(call DIR_rule,$(e))))
$(dir $(EXES)):
	mkdir -p $@
I finally found a solution with $(dir ...)
EXE_DIRS=$(dir $(EXES))
EXE_DIRS_UNIQUE=$(shell for DIR in $(EXE_DIRS); do echo $$DIR; done | sort | uniq)
$(shell for DIR in $(EXE_DIRS_UNIQUE); do if [ ! -d $$DIR ]; then mkdir -p $$DIR; fi; done)
all: $(EXES)
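The de-duplication step can probably stay inside make: $(sort) sorts its words and drops duplicates, so no shell loop is needed. A minimal sketch, assuming the EXES list from the question; the touch recipe is only a placeholder for the real link command.

```make
EXES := dir1/subdir1/abc dir1/subdir1/def dir1/subdir2/ghi \
        dir1/subdir2/jkl dir2/subdir3/mno dir2/subdir3/pqr

# $(dir) keeps the trailing slash; $(sort) removes the repeated directories.
EXE_DIRS := $(sort $(dir $(EXES)))

.PHONY: all
all: $(EXES)

# Order-only prerequisite: the directories must exist, but their timestamps are ignored.
# Placeholder recipe; replace touch with the real build command.
$(EXES): | $(EXE_DIRS)
	touch $@

$(EXE_DIRS):
	mkdir -p $@
```

Unlike the $(shell ...) version, this creates the directories only when a target that needs them is actually being built.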
I have the following rules in a Makefile to build an executable in 3 stages:
all: build/myexe
build/myexe: output/main_dats.o output/foo_dats.o | build/
	gcc $^ -o $@
output/%.o: output/%.c
	patscc -c $< -o $@
output/%_dats.c: src/%.dats | output/
	patsopt -cc -o $@ -d $<
build/:
	mkdir -p build/
output/:
	mkdir -p output/
An src/%.dats source file is used to generate an output/%_dats.c source file which is compiled to an output/%.o object file and finally they are linked into the executable build/myexe.
Running make the first time will only successfully build the first of the two .o files:
$ make
mkdir -p output/
patsopt -cc -o output/main_dats.c -d src/main.dats
patscc -c output/main_dats.c -o output/main_dats.o
make: *** No rule to make target `output/foo_dats.o', needed by `build/myexe'. Stop.
rm output/main_dats.c
But running again will build the second .o file and successfully link the executable:
$ make
patsopt -cc -o output/foo_dats.c -d src/foo.dats
patscc -c output/foo_dats.c -o output/foo_dats.o
mkdir -p build/
gcc output/main_dats.o output/foo_dats.o -o build/myexe
rm output/foo_dats.c
and note that at the end of each invocation the command rm output/..._dats.c is deleting the generated .c source file.
Here is a Makefile written without pattern matching:
all: build/myexe
build/myexe: output/main_dats.o output/foo_dats.o | build/
	gcc $^ -o $@
output/foo_dats.o: output/foo_dats.c
	patscc -c $< -o $@
output/main_dats.o: output/main_dats.c
	patscc -c $< -o $@
output/foo_dats.c: src/foo.dats | output/
	patsopt -cc -o $@ -d $<
output/main_dats.c: src/main.dats | output/
	patsopt -cc -o $@ -d $<
build/:
	mkdir -p build/
output/:
	mkdir -p output/
which works more predictably:
$ make
mkdir -p output/
patsopt -cc -o output/main_dats.c -d src/main.dats
patscc -c output/main_dats.c -o output/main_dats.o
patsopt -cc -o output/foo_dats.c -d src/foo.dats
patscc -c output/foo_dats.c -o output/foo_dats.o
mkdir -p build/
gcc output/main_dats.o output/foo_dats.o -o build/myexe
and note that the generated .c files are not being removed any more.
Apparently I am misusing the pattern matching mechanism. I know there is some kind of wildcard function but I believe it is intended for file globbing.
To avoid removing intermediate files, you just need to list them as actual targets somewhere. For example you could write a separate rule:
make_srcs: output/main_dats.c output/foo_dats.c
You don't have to list this target make_srcs as a prerequisite, or provide it a recipe, etc. Just listing the _dats.c files as actual targets or prerequisites in the makefile is enough to keep them from being deleted.
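If you would rather not add a dummy target, GNU make's special targets .SECONDARY and .PRECIOUS are another way to keep those files. A minimal sketch, assuming the file names from the question; this fragment would be added to the existing makefile, and either line on its own should be enough.

```make
# Option 1: treat these files as intermediates that are never deleted automatically.
.SECONDARY: output/main_dats.c output/foo_dats.c

# Option 2: protect every file produced by the implicit rule with this target pattern.
# The pattern has to match the rule's target pattern exactly.
.PRECIOUS: output/%_dats.c
```

The second form avoids listing each generated source by name, which may matter once there are more than two .dats files.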
As for your "only building some output" behavior, I don't know: it works fine for me:
$ make --version | head -n1
GNU Make 4.2.1
$ cat Makefile
all: build/myexe
build/myexe: output/main_dats.o output/foo_dats.o | build/
	touch $@
output/%.o: output/%.c
	touch $@
output/%_dats.c: src/%.dats | output/
	touch $@
build/:
	mkdir -p build/
output/:
	mkdir -p output/
make_srcs: output/main_dats.c output/foo_dats.c
$ rm -rf output build && make
mkdir -p output/
touch output/main_dats.c
touch output/main_dats.o
touch output/foo_dats.c
touch output/foo_dats.o
mkdir -p build/
touch build/myexe
So there's something about your setup which hasn't been made clear in your question. As the comment suggested you need to run make -d (I would leave off the -R option, I don't know why you'd add that) and figure out why make throws that error.
Pattern rules should ideally be deprecated. They are prone to over-matching (because, well, patterns), they can be hard to get working, they bring with them the whole "intermediate target" issue (that's the deletion of output/*.c files that you are observing), they need another dubious feature ("secondary expansion") to make them usable in some more involved scenarios, etc.
In short: using pattern rules is not advised, and using multi-level pattern rules is definitely not advised. Just more trouble than it's worth. IMHO, anyway.
(end rant)
So I suggest that you write a simple macro instead, so your makefile ends up looking like this:
all: build/myexe
# $(call dats,basename)
define dats
output/$1_dats.o: output/$1_dats.c
	patscc -c $$< -o $$@
output/$1_dats.c: src/$1.dats | output
	patsopt -cc -o $$@ -d $$<
endef
build/myexe: output/main_dats.o output/foo_dats.o | build
	gcc $^ -o $@
$(eval $(call dats,foo))
$(eval $(call dats,main))
build:
	mkdir -p build
output:
	mkdir -p output
I am writing a pipeline in make to analyse biological data. There are three distinct sections of the pipeline, the first is to check the quality of the data, the second is to clean the data, and third is to align the data. After each section is complete I would like to manually inspect the results before I move on with the next. Therefore, instead of having a single chain of pattern rules, I want to be able to call each section using a phony target (similar to how you call clean). For example:
make analysis.pipeline quality
make analysis.pipeline trim
make analysis.pipeline align
Here is my current makefile:
# analysis pipeline
include analysis.pipeline.config
# align data
all: $(sorted_bam)
$(results)/%.sorted.bam: $(results)/%.bam
	samtools sort $^ $(basename $@)
$(results)/%.bam: $(results)/%.sam
	samtools view -bS $^ > $@
$(results)/%.sam: $(results)/%.adaprm.fastq
	bowtie2 -x $(genome) -U $^ -S $@
# clean data
.PHONY: trim
trim: $(qc_processed) $(trimmed_fastq)
$(results)/%.adaprm_fastqc.html: $(results)/%.adaprm.fastq
	fastqc -o $(@D) $^
$(results)/%.adaprm.fastq: $(data)/%.fastq
	cutadapt -a AACCGGTT $^ > $@
# check data
.PHONY: quality
quality: $(qc_raw)
$(results)/%_fastqc.html: $(data)/%.fastq
	mkdir -p $(@D) && fastqc -o $(@D) $^
The makefile is written to run in a src directory, which is separate to where the targets and dependencies are built, namely data and results directories. Is it possible to call each section of my pipeline the way I intend to, will there be any issue if I run it in parallel, and is this the right way to get around not being able to use implicit pattern rules in phony targets?
Updated makefile
# objects = A.fastq B.fastq
.PHONY: quality
quality: A_fastqc.html B_fastqc.html C_fastqc.html
.PHONY: trim
trim: A.trimmed_fastqc.html B.trimmed_fastqc.html
.PHONY: align
align: A.sorted.bam.bai B.sorted.bam.bai
# ALIGN DATA SECTION
%.sorted.bam.bai: %.sorted.bam
	samtools index $^
%.sorted.bam: %.bam
	samtools sort $^ $@
%.bam: %.sam
	samtools view -bS $^ > $@
%.sam: %.trimmed.fastq %.trimmed_fastqc.html
	bowtie2 -x $(genome) -U $< -S $@
# TRIM DATA SECTION
%.trimmed_fastqc.html: %.trimmed.fastq
	fastqc $^
%.trimmed.fastq: %.adaprm.fastq
	seqtk trimfq $^ > $@
%.adaprm.fastq: %.fastq %_fastqc.html
	cutadapt -a AACCGGTT $< > $@
# CHECK QUALITY SECTION
%_fastqc.html: %.fastq
	fastqc $^
I am using gnu Make 3.82 and have an annoying problem.
I have a rule setting dependencies between directories.
objdir=../obj
$(objdir)/%.o: %.C
	$(COMPILE) -MM -MT$(objdir)/$(notdir $@) $< -o $(DEPDIR)/$(notdir $(basename $<).d )
	$(COMPILE) -o $(objdir)/$(notdir $@ ) -c $<
In order to do this, the obj directory must exist.
I want to mkdir the directory as a prerequisite
$(objdir)/%.o: %.C $(objdir)
	$(COMPILE) -MM -MT$(objdir)/$(notdir $@) $< -o $(DEPDIR)/$(notdir $(basename $<).d )
	$(COMPILE) -o $(objdir)/$(notdir $@ ) -c $<
$(objdir):
	mkdir $(objdir)
This doesn't work: mkdir fails when the directory already exists, and make then stops.
I tried a shell conditional:
if [ ! -d $(objdir) ] ; then \
mkdir $(objdir) \
fi
but obviously I've got something wrong. What's the best way of doing this?
One simple way is to use:
mkdir -p ../obj
It doesn't fail when the directory exists.
I usually create a macro, MKPATH, for this:
MKPATH = mkdir -p
and then reference the macro in the rule:
$(objdir):
$(MKPATH) $(objdir)
That way, I can change the behaviour without changing the makefile if it becomes necessary.
Your shell fragment:
if [ ! -d $(objdir) ] ; then
mkdir $(objdir)
fi
does not work as written because make executes each line separately.
You could write (note the added semi-colon):
if [ ! -d $(objdir) ] ; then \
$(MKPATH) $(objdir) ; \
fi
Or:
if [ ! -d $(objdir) ] ; then $(MKPATH) $(objdir); fi
Or:
[ -d $(objdir) ] || $(MKPATH) $(objdir)
Note that the command line must be successful overall, so do not try:
[ ! -d $(objdir) ] && $(MKPATH) $(objdir)
If the directory exists, the test fails and mkdir is never run, so the shell exits with a non-zero status; make treats that as a failed command and the build fails.
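One further refinement worth considering is to make the directory an order-only prerequisite (the part after the |). A directory's timestamp changes whenever a file is created inside it, so as a normal prerequisite it can cause objects to be rebuilt unnecessarily. This is a sketch that reuses objdir, MKPATH and COMPILE from above and leaves out the dependency-generation line for brevity.

```make
objdir := ../obj
MKPATH := mkdir -p

# $(objdir) must exist before compiling, but its timestamp never triggers a rebuild.
$(objdir)/%.o: %.C | $(objdir)
	$(COMPILE) -o $@ -c $<

# Runs only when the directory is missing; -p also makes it safe if it already exists.
$(objdir):
	$(MKPATH) $@
```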
mkdir
"mkdir -p"
Change:
$(objdir): mkdir $(objdir)
to =>
$(objdir):
mkdir -p $(objdir)
If that particular mkdir does not have -p then:
$(objdir):
test -d $(objdir) || mkdir $(objdir)
Makefiles
Keep the target: and the commands (mkdir, etc.) on separate lines.
Also, in make, to ignore failed commands, prefix command with minus:
$(objdir):
-mkdir $(objdir)
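Commands that span multiple lines (if-then-else, for loops, etc.) need `; \' at the end of each line: the semicolon separates the shell statements, and the backslash joins the lines so that make passes them to a single shell: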
$(objdir):
if [ ! -d $(objdir) ] ; then \
mkdir $(objdir) ; \
fi
This particular usage of if-then-else can also be written as:
$(objdir):
if [ ! -d $(objdir) ] ; then mkdir $(objdir) ; fi
The following Makefile demonstrates each point above:
all: setup dirs report
# Create an interfering dir1
# Remove dir2. It is work to be done later.
setup:
	@mkdir -p dir1
	@if test -d dir2 ; then rmdir dir2 ; fi
# Continue (with dir2), even though dir1 re-creation fails
dirs:
	-mkdir dir1
	mkdir -v dir2
# Show we're still running
report:
	@echo DIRS:
	@for d in dir?; do \
	test -d $$d || break ; \
	echo -n "$$d " ; \
	done
	@echo
Output from running make:
mkdir dir1
mkdir: cannot create directory `dir1': File exists
make: [dirs] Error 1 (ignored)
mkdir -v dir2
mkdir: created directory `dir2'
DIRS:
dir1 dir2
xpi_built := $(build_dir)/$(install_rdf) \
$(build_dir)/$(chrome_manifest) \
$(chrome_jar_file) \
$(default_prefs)
xpi_built_no_dir := $(subst $(build_dir)/,,$(xpi_built))
$(xpi_file): $(build_dir) $(xpi_built)
	@echo "Creating XPI file."
	cd $(build_dir); $(ZIP) ../$(xpi_file) $(xpi_built_no_dir)
	@echo "Creating XPI file. Done!"
$(build_dir)/%: %
	cp -f $< $@
$(build_dir):
	@if [ ! -x $(build_dir) ]; \
	then \
	mkdir $(build_dir); \
	fi
Can anyone explain this part of the makefile to me? I am particularly interested in
$(build_dir)/%: % as well as the $< and $@ variables.
There are two rules that mention $(build_dir). I guess both are executed, but in which order?
$(build_dir)/%: %
	cp -f $< $@
This is a pattern rule which uses automatic variables in its command; $< expands to the first (leftmost) prerequisite, $@ expands to the target. If you try to make $(build_dir)/foo (whatever $(build_dir) is), Make will treat this rule as
$(build_dir)/foo: foo
	cp -f foo $(build_dir)/foo
The next rule,
$(build_dir):
	@if [ ! -x $(build_dir) ]; \
	then \
	mkdir $(build_dir); \
	fi
is for $(build_dir) itself, and is unnecessarily complicated. It says "if $(build_dir) doesn't exist, then mkdir it", and it could be written this way:
$(build_dir):
	mkdir $@
It looks as if your primary target is $(xpi_file):
$(xpi_file): $(build_dir) $(xpi_built)
So Make will first make $(build_dir) (if necessary), then the members of the list $(xpi_built), which includes a couple of things of the form $(build_dir)/%. Once those are done, it will execute the commands of this rule: it will cd into $(build_dir), zip some things up, and echo a couple of messages.
See Pattern Rules and Automatic Variables in the GNU make documentation. The first rule matches files inside $(build_dir), not $(build_dir) itself. $< expands to the first prerequisite of the current rule ($^ is the full list), and $@ is the target of the current rule.
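A small throwaway makefile can make the automatic variables easier to see. This is only an illustrative sketch: out/report.txt, header.txt and body.txt are hypothetical names, not files from the question, and the two input files are assumed to exist.

```make
# Hypothetical file names, for illustration only.
out/report.txt: header.txt body.txt | out
	@echo 'target ($$@): $@'
	@echo 'first prerequisite ($$<): $<'
	@echo 'all normal prerequisites ($$^): $^'
	cat $^ > $@

# Order-only prerequisite; it does not appear in $^.
out:
	mkdir -p $@
```

Running make out/report.txt should echo the target name, then header.txt, then both input files, before concatenating them into the target.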