Makefile: collect a list of files into a new directory

I have a list of file paths in a variable, and I'd like to copy each file in this list to a new location. The problem is that I'm very new to makefiles and I'm struggling to get anything working. My attempt has culminated in the following; although it doesn't work (and is probably totally wrong), I hope it illustrates what I'm trying to do.
FILES = a/b/file c/d/file e/.../file etc...
copyfiles:
    for file in $(FILES); do \
        cp $$file newDir/$(notdir $$file); \
    done

You could do
FILES = a/b/file c/d/file e/.../file etc...
copyfiles:
    cp $(FILES) newDir
I tried it, and it works.
Remember, globbing is done by the shell, not by the commands. cp takes a list of files as arguments and copies all of them to the location specified by the last argument. When you type cp *.cpp, all the cp program sees as its arguments are the files in the current directory that end in .cpp.
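If you do need per-file control, for example to flatten the paths the way the original loop tried to, a working version of that loop might look like the sketch below. It assumes newDir should be created if missing, and it uses the shell's basename instead of $(notdir $$file), because make expands that function before the shell ever sees $file, so it never strips the directory part:
FILES = a/b/file c/d/file

copyfiles:
    mkdir -p newDir
    for file in $(FILES); do \
        cp $$file newDir/$$(basename $$file); \
    done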

Related

How to tar a directory using a makefile

I have a few files in subdirectories; all of them are just text files like FAQs and user guides. There is no C/C++ source code in them. Following is the file and directory structure.
scr
|_ Makefile            # top-level Makefile
|_ other_dirs
|_ some_other_dirs
|_ mydir
   |_ Makefile         # Makefile of mydir, need to put some code here
   |_ dir1
   |  |_ textfile0
   |  |_ textfile1
   |_ dir2
      |_ textfile2
      |_ textfile3
Question: how can I tar the contents of dir1 and dir2 into one tarball? I tried searching the internet for how to use the top Makefile to create the tarball, but with no success yet. I am not very familiar with Makefiles; any starting point will be appreciated. Thanks.
Following is my novice attempt at a very basic Makefile:
-->cat Makefile
mydir.tgz : *
tar -zcvf mydir.tgz mydir/
-->make
Makefile:1: *** missing separator. Stop.
The idea is to run the top Makefile and have the tar file generated for mydir.
You can add all files and directories in mydir, recursively, as prerequisites of mydir.tgz. That way, your tar rule will be executed if, and only if, a change occurs somewhere under mydir. For example:
mydir.tgz: $(shell find mydir)
    tar -zcvf mydir.tgz mydir
The line with the tar command should start with a TAB.
Most of the mechanisms of this answer are also described in this SO question, but it seemed to make sense to me to add it here to concisely answer your specific question.
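If the goal is to have the top-level Makefile in scr produce the tarball, one common pattern is a recursive make invocation. This is only a sketch; the path to mydir and the target name are assumptions based on the tree above:
# top-level Makefile (sketch)
.PHONY: mydir-tarball
mydir-tarball:
    $(MAKE) -C path/to/mydir mydir.tgz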

Run script only on modified file using Makefile

I have a few txt files in a directory. I want to run a shell script only on the files which have been modified. How can I achieve this through a Makefile?
I have written the following, but it builds all the txt files in the directory. It would be great to get some pointers on this.
FILENAME := $(wildcard dir/txts/*/*.txt)

.PHONY: build-txt
build-txt: $(FILENAME)
    sh build-txts.sh $^
I'm guessing you want something like this:
files   := $(wildcard dir/txts/*/*.txt)
dummies := $(addprefix .mod_,$(files))

all: $(dummies)

$(dummies): .mod_%: %
    mkdir -p $(dir $@)    # the markers mirror the source tree, so make sure their directory exists
    sh build-txts.sh $^
    touch $@
For any new text file, it will run the script and create a .mod counterpart. For any non-new text file, it will check if the timestamp is newer than the .mod file's timestamp. If it is, it runs the script and then touches the .mod (making the .mod newer than the text). For any text file that has not been modified since the last make, the .mod file will be newer and the script will not run. Notice that the .mod files are NOT PHONY targets. They are dummy files that exist solely to mark when the text file was last modified. You can stick them in a dummy directory for easy cleaning as well.
If you don't want to rebuild the text files by default on a fresh checkout, or your script's criteria aren't based on timestamps, you need something a bit trickier:
files := $(wildcard dir/txts/*/*.txt)
md5s  := $(addprefix .md5_,$(files))

all: $(md5s)

.PHONY: $(md5s)

$(md5s):
    mkdir -p $(dir $@)    # the checksum files mirror the source tree
    ( [ -e $@ ] && md5sum -c $@ ) || \
    ( sh build-txts.sh $(@:.md5_=) && md5sum $(@:.md5_=) > $@ )
Here, you run the rule for all the text files regardless, and you use the shell to determine whether each file is out of date. If the checksum file does not exist, or the md5 sum no longer matches, it runs the script and then updates the checksum. Because the rules are phony, they always run for all the .md5 files regardless of whether they already exist.
Using this method, you could submit the .md5 files to your repository, and it would only run the script on those files whose md5 sum changed after checkout.
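For reference, each .md5_ marker simply holds the md5sum output for its text file, so md5sum -c exits non-zero as soon as that file's contents change. A hypothetical marker for dir/txts/notes/todo.txt would contain a single line such as:
0bee89b07a248e27c83fc3d5951213c1  dir/txts/notes/todo.txt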

GNU Make - build only out-of-date file in directory

Pretty new to GNU Make. This is a less complex example of something more general I have been trying to get to work.
I have many input files that share the .txt extension, and I have a shell script that takes an input file and generates an output of the same name but with the extension .wc. I have written the following makefile.
# name of dependencies
SRC = $(wildcard *.txt)
# get name of targets (substitute .wc for .txt)
TAR = $(SRC:.txt=.wc)

all: $(TAR)

%.wc: %.txt
    sh word_count.sh $<
This runs fine and will generate all the .wc output files. However, if I modify one of the input (dependency) files, they are all rebuilt. So the question is: what is the best way to get GNU Make to only process the modified .txt files in the directory?
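One likely cause, assuming word_count.sh prints its result to stdout rather than creating a .wc file, is that the targets never actually exist on disk, so make considers every one of them out of date on every run. A minimal sketch of a rule that redirects the output into the target:
%.wc: %.txt
    sh word_count.sh $< > $@
Once the .wc files exist, make's normal timestamp comparison applies and only the .wc files older than their .txt are rebuilt.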

Create directory only once when running makefile in parallel

I'm using make to write a pipeline for biological data analysis. My project directory is:
PROJECT
- DATA
  - SAMPLEA
    - A1.FASTQ A2.FASTQ
  - SAMPLEB
    - B1.FASTQ B2.FASTQ
- RESULTS
- SRC
  - makefile
My current makefile uses a wildcard to list all of the .FASTQ files in the DATA directory. Using pattern rules, each .FASTQ file then goes through a series of recipes, with the final output file written to the RESULTS directory. Instead, I would like to create a directory for each SAMPLE where the final output file is written:
PROJECT/RESULTS/SAMPLEA/A1.out
PROJECT/RESULTS/SAMPLEA/A2.out
PROJECT/RESULTS/SAMPLEB/B1.out
PROJECT/RESULTS/SAMPLEB/B2.out
I can do this by having the first recipe make the directory; however, this throws an error when the second FASTQ file from the same SAMPLE also tries to create the directory. A few posts on Stack Overflow suggest using the -p flag on mkdir to ignore errors, but this apparently causes problems when I run the makefile in parallel using the -j flag. I thought about forcing a shell script at the start of the makefile to check whether the results directories are present and create them if not, but I'd like to try and solve this issue using just make.
Create the directories before any rule executes:
DATADIR := $(shell cd DATA; find * -type d)
create_results_dir := $(shell for i in $(DATADIR); \
    do test -d DATA/$$i && mkdir -p RESULTS/$$i; \
    done)

all:
    @echo do something.
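An alternative that also behaves well under -j is to make each RESULTS subdirectory a real target and list it as an order-only prerequisite of the pattern rule, so make creates it exactly once and its timestamp never retriggers the outputs. This is only a sketch: the final recipe command is a placeholder and the layout is assumed to match the tree above.
FASTQS  := $(wildcard DATA/*/*.FASTQ)
OUTS    := $(patsubst DATA/%.FASTQ,RESULTS/%.out,$(FASTQS))
OUTDIRS := $(sort $(dir $(OUTS)))

all: $(OUTS)

# each directory is an ordinary target, so it is created exactly once even with -j
$(OUTDIRS):
    mkdir -p $@

# order-only prerequisite (after the |): the directory must exist,
# but its timestamp never forces a rebuild of the .out file
.SECONDEXPANSION:
RESULTS/%.out: DATA/%.FASTQ | $$(dir $$@)
    pipeline_step $< > $@    # placeholder for the real command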

Install data directory tree with massive number of files using automake

I have a data directory which I would like automake to generate install and uninstall targets for. Essentially, I just want to copy this directory verbatim to the DATA directory. Normally, I might list all the files individually, like
dist_whatever_DATA=dir/subdir/filea ...
But the problem arises when my directory structure looks like this
*root
*subdir
*~10 files
*subdir
*~10 files
*subdir
*~700 files
*subdir
...
~20 subdirs
I just cannot list all 1000+ files included as part of my Makefile.am. That would be ridiculous.
I need to preserve the directory structure as well. I should note that this data is not generated at all by the build process, and is actually largely short audio recordings. So it's not as if I would want automake to "check" that every file I want to install has actually been created: the files are either there or not, and whatever file is there I want installed, and whatever file is not there should not be installed. I know this is the justification used in other places for not doing wildcard installs, but those reasons don't apply here.
I would use a script to generate a Makefile fragment that lists all the files:
echo 'subdir_files = \' > subfiles.mk
find subdir -type f -print | sed 's/^/ /;$q;s/$/ \\/' >> subfiles.mk
and then include this subfiles.mk from your main Makefile.am:
include $(srcdir)/subfiles.mk
nobase_dist_pkgdata_DATA = $(subdir_files)
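The generated fragment then looks roughly like this (the file names are purely illustrative):
subdir_files = \
 subdir/recordings/clip001.wav \
 subdir/recordings/clip002.wav \
 subdir/docs/readme.txt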
A second option is to use EXTRA_DIST = subdir and then write custom install-data-local and uninstall-local rules.
The problem here is that EXTRA_DIST = subdir will distribute all files in subdir/, including backup files, configuration files (e.g. from your VCS), and other things you would not want to distribute.
Using a script as above lets you filter the files you really want to distribute.
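For instance, the find invocation above could exclude editor backups and VCS metadata before the list is written; the patterns below are just examples:
find subdir -type f ! -name '*~' ! -path '*/.git/*' ! -path '*/.svn/*' \
    | sed 's/^/ /;$q;s/$/ \\/' >> subfiles.mk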
I've found that installing hundreds of files separately makes for a tormentingly long invocation of make install. I had a similar case where I wanted to install hundreds of files, preserving the directory structure, and I did not want to change my Makefile.am every time a file was added to or removed from the collection.
I included an LZMA archive of the files in my distribution and made automake rules like so:
GIANTARCHIVE = My_big_archive.tar.lz

dist_pkgdata_DATA = $(GIANTARCHIVE)

install-data-hook:
    cd $(DESTDIR)$(pkgdatadir); \
    cat $(GIANTARCHIVE) | unlzma | $(TAR) --list > uninstall_manifest.txt; \
    cat $(GIANTARCHIVE) | unlzma | $(TAR) --no-same-owner --extract; \
    rm --force $(GIANTARCHIVE); \
    cat uninstall_manifest.txt | sed --expression='s/^\|$$/"/g' | xargs chmod a=rX,u+w

uninstall-local:
    cd $(DESTDIR)$(pkgdatadir); \
    cat uninstall_manifest.txt | sed --expression='s/ /\\ /g' | xargs rm --force; \
    rm --force uninstall_manifest.txt
This way, automake installs My_big_archive.tar.lz in the $(pkgdatadir) directory and extracts it there, making a list of all the files that were in it so it can uninstall them later. This also runs much faster than listing each file as an install target, even if you were to autogenerate that list.
I would write a script (either as a separate shell script or inline in the Makefile.am) that is run as part of the install-data-hook target.
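A minimal sketch of that approach, assuming the data lives in subdir and is shipped via EXTRA_DIST as described above (cp -R and the MKDIR_P step are assumptions, not part of the original answer):
EXTRA_DIST = subdir

install-data-hook:
    $(MKDIR_P) $(DESTDIR)$(pkgdatadir)
    cp -R $(srcdir)/subdir $(DESTDIR)$(pkgdatadir)/

uninstall-local:
    rm -rf $(DESTDIR)$(pkgdatadir)/subdir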
