How to escape backslash in makefile shell function - makefile

I need to recursively find all the header files in a list of directories. I can't figure out how to escape the command properly. I have searched around and found various information on escaping in makefiles but I have not been able to solve this issue.
In bash the following does what I want:
find path1 path2 path3 -type f \( -name *.hpp -o -name *.h -o -name *.hxx \)
In my makefile I have tried a few combinations of foreach, etc. Currently I have this:
INCLUDE_PATHS ?= path1 path2 path3
MY_HEADERS := $(shell find $(INCLUDE_PATHS) -type f \( -name *.h -o -name *.hpp -o -name *.hxx \))
This produces:
find: paths must precede expression
Usage: find [-H] [-L] [-P] [path...] [expression]
If I just look for one extension such as "*.hpp" it works fine (I assume because the \(...\) is not needed).
I have tried various combinations of $, ', ", and \ to escape the '\' characters in the shell command, without success.
Any help would be greatly appreciated.

Your problem doesn't have anything to do with make or the value of INCLUDE_PATHS or how make interprets backslash characters. The problem is that you're not escaping your globbing, and it's matching some local files. Rewrite your function to escape your glob statements, like this:
MY_HEADERS := $(shell find $(INCLUDE_PATHS) -type f \( -name \*.h -o -name \*.hpp -o -name \*.hxx \))
I would be very surprised if the original command works in bash without quoting those characters, if you run it from the same directory, with the same contents, that make runs in.
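A quick way to see what the shell actually hands to find is to prefix the command with echo (the stray foo.h here is hypothetical):
# with a stray foo.h sitting in the current directory:
echo find path1 path2 -type f \( -name *.h \)
# prints: find path1 path2 -type f ( -name foo.h )   -- find never sees the pattern
echo find path1 path2 -type f \( -name \*.h \)
# prints: find path1 path2 -type f ( -name *.h )     -- the escaped pattern survives
With two or more matching local files, the unquoted version is exactly what produces find's "paths must precede expression" error.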

The variable MY_HEADERS becomes correct, by calling $($INCLUDE_PATHS) -- not $(INCLUDE_PATHS).
So your Makefile would be:
INCLUDE_PATHS ?= path1 path2 path3
MY_HEADERS := $(shell find $($INCLUDE_PATHS) -type f \( -name *.h -o -name *.hpp -o -name *.hxx \))
You can then check the variable's value:
all: printme
printme:
	@echo $(MY_HEADERS)
Running this Makefile with make will show your desired answer.

Although MadScientist already answered the question perfectly, you could use the following to avoid the shell altogether (the first, find-based assignment is kept only for comparison):
INCLUDE_PATHS ?= path1 path2 path3
EXTENSIONS := .h .hpp .hxx
MY_HEADERS := $(shell find $(INCLUDE_PATHS) -type f \( -name \*.h -o -name \*.hpp -o -name \*.hxx \))
$(info $(MY_HEADERS))
MY_HEADERS := $(foreach p,$(INCLUDE_PATHS),$(foreach e,$(EXTENSIONS),$(wildcard $(p)/*$(e))))
$(info $(MY_HEADERS))
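Note that $(wildcard) is not recursive, so the pure-make version only picks up headers sitting directly inside each listed directory, whereas the find version descends into subdirectories. If one extra level of nesting is all you need, something like this sketch would cover it:
MY_HEADERS := $(foreach p,$(INCLUDE_PATHS),$(foreach e,$(EXTENSIONS),$(wildcard $(p)/*$(e) $(p)/*/*$(e))))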

Related

Bash wrapping parts of a variable in quotes when expanded

I'm trying to recursively find C and header files in a script, while preventing the globs from expanding against files that exist in the current directory.
FILE_MATCH_LIST='"*.c","*.cc","*.cpp","*.h","*.hh","*.hpp"'
FILE_MATCH_REGEX=$(echo "$FILE_MATCH_LIST" | sed 's/,/ -o -name /g')
FILE_MATCH_REGEX="-name $FILE_MATCH_REGEX"
This does exactly what I want it to:
+ FILE_MATCH_REGEX='-name "*.c" -o -name "*.cc" -o -name "*.cpp" -o -name "*.h" -o -name "*.hh" -o -name "*.hpp"'
Now, if I call find with that string (in quotes), it maintains the leading and trailing quotes and breaks find:
files=$(find $root_dir "$FILE_MATCH_REGEX" | grep -v $GREP_IGNORE_LIST)
+ find [directory] '-name "*.c" -o -name "*.cc" -o -name "*.cpp" -o -name "*.h" -o -name "*.hh" -o -name "*.hpp"'
This results in an "unknown predicate" error from find, because the entire predicate is single quoted.
If I drop the quotes from the variable in the find command, I get a strange behavior:
files=$(find $root_dir $FILE_MATCH_REGEX | grep -v $GREP_IGNORE_LIST)
+ find [directory] -name '"*.c"' -o -name '"*.cc"' -o -name '"*.cpp"' -o -name '"*.h"' -o -name '"*.hh"' -o -name '"*.hpp"'
Where are these single quotes coming from? They exist if I echo that variable as well, but they aren't there in the command when I'm actually setting $FILE_MATCH_REGEX (as seen at the beginning of the question).
This of course also breaks find, because it's looking for names that literally contain the double quotes, instead of treating *.h etc. as patterns.
How do I get these strings into find without all of these quoting woes?
Fleshing out the array answer:
#!/bin/bash
patterns=( '*.c' '*.cc' '*.h' '*.hh' )
find_args=( "-name" "${patterns[0]}" )
for (( i=1 ; i < ${#patterns[@]} ; i++ )); do
find_args+=( "-o" "-name" "${patterns[i]}" )
done
find [directory] "${find_args[@]}"
Also, to clear up the misconception around quotes, if you echo the last line the output might not be what you expect:
echo find [directory] "${find_args[@]}"
# outputs: find [directory] -name *.c -o -name *.cc -o -name *.h -o -name *.hh
Where are the quotes? Your shell removed them after it was done with them. Quotes are not find syntax, they are shell syntax that tell the shell how to interpret (or perhaps how NOT to interpret) your command line.
The reason for the strange behavior in your debug output is that the quotes in your data are literal quotes, not shell syntax quotes that get removed during command parsing. The xtrace output (the lines starting with +) is just trying to point out the distinction.
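If you want to convince yourself of that, print the array from the answer above one element per line; every word comes out intact and there are no quote characters stored anywhere:
printf '%s\n' "${find_args[@]}"   # prints -name, *.c, -o, -name, *.cc, ... each on its own line, with no quotes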
Some useful resources on the Bash wiki:
BashParser explains how your command line gets parsed and executed
BashFAQ/050 explains why embedding quotes in your data isn't sufficient
If you have GNU find - adjust to your liking:
#!/bin/bash
#FILE_MATCH_LIST='"*.c","*.cc","*.cpp","*.h","*.hh","*.hpp"'
FILE_MATCH_LIST='.*/.*\.(c|cc|cpp|h|hh|hpp)'
find . -type f -regextype posix-egrep -regex "${FILE_MATCH_LIST}"
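Keep in mind that GNU find's -regex is matched against the whole path (e.g. ./src/foo.c), not just the file name, and the match is implicitly anchored at both ends; that is why the pattern above starts with .*/ rather than with the extension alternation alone. For example, on a hypothetical tree:
find . -regextype posix-egrep -regex '\.(c|cc|cpp)'      # matches nothing: no path equals ".c"
find . -regextype posix-egrep -regex '.*\.(c|cc|cpp)'    # matches ./main.c, ./src/util.cc, etc.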

shell find command : Mixing excluding directories and including specific directories

I'm faced with a directory structure like :
./xxx/src
./xxx/src/folder1
./xxx/src/folder2
./xxx/src/folder2/subfolder1
./xxx/UTest
./xxx/Module
./xxx/Itfs
./.hg
./Tools
Instead of having each module define its own include search directory, I want to find all relevant directories with a "simple" shell command and use this in my top-level makefile.
That would look a lot nicer and, as a bonus, force other users to use the folders defined in the coding guidelines.
I got as far as :
MODINCLUDES = \
$(shell find $(MODULE_DIR) \( -name .hg -o -name Tools \) -prune -o \( -name src -o -name Module -o -name Itfs \) -type d -print| while read line; do echo "-I$$line"; done )
But this would result in
MODINCLUDES = -I/xxx/src -I/xxx/Module -I/xxx/Itfs
Obviously I would like to have any subfolders in src included as well:
./xxx/src/folder1
./xxx/src/folder2
./xxx/src/folder2/subfolder1
Can someone explain how to do this?
Thanks!
If I understand your question correctly, just don't restrict the match to the three specific folder names. Take out the
\( -name src -o -name Module -o -name Itfs \)
and maybe avoid the silly while loop if your find supports -printf:
MODINCLUDES = \
$(shell find $(MODULE_DIR) \
\( -name .hg -o -name Tools \) -prune -o \
-type d -printf "-I%p\n")
Finding the sub-directories you want with find can be done with:
find $(MODULE_DIR)/xxx/src $(MODULE_DIR)/xxx/Module $(MODULE_DIR)/xxx/Itfs -type d
And then, putting this in a Makefile and adding the -I prefix:
ROOTDIRS := $(addprefix $(MODULE_DIR)/xxx/,src Module Itfs)
SUBDIRS := $(shell find $(ROOTDIRS) -type d)
MODINCLUDES := $(addprefix -I,$(SUBDIRS))
Or, all at once:
MODINCLUDES := $(addprefix -I,$(shell find $(addprefix $(MODULE_DIR)/xxx/,src Module Itfs) -type d))
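Either way, here is a hedged sketch of how the resulting MODINCLUDES might then be consumed; the pattern rule and the CXX/CXXFLAGS names are illustrative, not taken from the question:
%.o: %.cpp
	$(CXX) $(CXXFLAGS) $(MODINCLUDES) -c $< -o $@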

List of object files to gcc format

I'm trying to link all the object files (*.o) in my directory.
I tried to use:
for i in $(find . -name "*.o" -type f);
do
echo $i >> myFiles
done
Then I need:
gcc -o myFile <myFiles
gcc: fatal error: no input files
compilation terminated.
I see several problems in your approach:
find should not be used in a loop, but rather with -fprint <file>. So in your case:
find . -name "*.o" -type f -fprint myfiles
Secondly, redirecting the file to gcc's stdin will not work as you think, since gcc expects its input files as arguments rather than on standard input (see this question). What you want instead is to expand the list of objects into a list of arguments:
cat myfiles | xargs gcc -o myFile
xargs does it nicely. But as @n.m. mentioned, you could do everything at once with a command substitution:
gcc -o myFile $(find . -type f -name '*.o')
Quote the *.o pattern so the shell doesn't expand it before find runs. If your object files can have whitespace in their names, prefer the xargs form and pair find's -print0 with xargs -0, since a plain command substitution splits on whitespace.
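For completeness, a sketch of that whitespace-safe variant (myFile is simply the output name used in the question; this assumes the whole list fits into a single xargs invocation):
find . -type f -name '*.o' -print0 | xargs -0 gcc -o myFile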
Good Luck

How to escape special characters in a variable to provide commandline arguments in bash

I very often use find to search for files and symbols in a huge source tree. If I don't limit the directories and file types, it takes several minutes to search for a symbol in a file. (I already mounted the source tree on an SSD and that halved the search time.)
I have a few aliases to limit the directories that I want to search, e.g.:
alias findhg='find . -name .hg -prune -o'
alias findhgbld='find . \( -name .hg -o -name bld \) -prune -o'
alias findhgbldins='find . \( -name .hg -o -name bld -o -name install \) -prune -o'
I then also limit the file types as well, e.g.:
findhgbldins \( -name '*.cmake' -o -name '*.txt' -o -name '*.[hc]' -o -name '*.py' -o -name '*.cpp' \)
But sometimes I only want to check for symbols in cmake files:
findhgbldins \( -name '*.cmake' -o -name '*.txt' \) -exec egrep -H 'pattern' {} \;
I could make a whole bunch of aliases for all possible combinations, but it would be a lot easier if I could use variables to select the file types, e.g.:
export SEARCHALL="\( -name '*.cmake' -o -name '*.txt' -o -name '*.[hc]' -o -name '*.py' -o -name '*.cpp' \)"
export SEARCHSRC="\( -name '*.[hc]' -o -name '*.cpp' \)"
and then call:
findhgbldins $SEARCHALL -exec egrep -H 'pattern' {} \;
I tried several variants of escaping \, (, * and ), but no combination worked.
The only way I could make it work was to turn off globbing in Bash (set -f) before calling my find contraption and then turn globbing back on.
One alternative I came up with is to define a set of functions (with the same names as my aliases findhg, findhgbld, and findhgbldins) that take a simple parameter, used in a case structure to select the different file types I am looking for, something like:
findhg() {
case $1 in
'1' )
find <many file arguments> ;;
'2' )
find <other file arguments> ;;
...
esac
}
findhgbld() {
case $1 in
'1' )
find <many file arguments> ;;
'2' )
find <other file arguments> ;;
...
esac
}
etcetera
My question is: Is it at all possible to pass these types of arguments to a command as a variable?
Or is there maybe a different way to achieve the same, i.e. having a combination of a command (findhg, findhgbld, findhgbldins) and a single argument to create a large number of combinations for searching?
It's not really possible to do what you want without unpleasantness. The basic problem is that when you expand a variable without double-quotes around it (e.g. findhgbldins $SEARCHALL), it does word splitting and glob expansion on the variable's value, but does not interpret quotes or escapes, so there's no way to embed something in the variable's value to suppress glob expansion (well, unless you use invalid glob patterns, but that'd keep find from matching them properly too). Putting double-quotes around it (findhgbldins "$SEARCHALL") suppresses glob expansion, but it also suppresses word splitting, which you need to let find interpret the expression properly. You can turn off glob expansion entirely (set -f, as you mentioned), but that turns it off for everything, not just this variable.
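You can see the effect directly by printing how one of those variables splits (this assumes no file in the current directory happens to match the literally-quoted patterns):
SEARCHSRC="\( -name '*.[hc]' -o -name '*.cpp' \)"
printf '<%s> ' $SEARCHSRC; echo
# <\(> <-name> <'*.[hc]'> <-o> <-name> <'*.cpp'> <\)>
The backslashes and single quotes are still part of the words that reach find, which is exactly what it rejects.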
One thing that would work (but would be annoying to use) would be to put the search options in arrays rather than plain variables, e.g.:
SEARCHALL=( \( -name '*.cmake' -o -name '*.txt' -o -name '*.[hc]' -o -name '*.py' -o -name '*.cpp' \) )
findhgbldins "${SEARCHALL[@]}" -exec egrep -H 'pattern' {} \;
but that's a lot of typing to use it (and you do need every quote, bracket, brace, etc to get the array to expand right). Not very helpful.
My preferred option would be to build a function that interprets its first argument as a list of file types to match (e.g. findhgbldins mct -exec egrep -H 'pattern' {} \; might find make/cmake, c/h, and text files). Something like this:
findhgbldins() {
filetypes=()
if [[ $# -ge 1 && "$1" != "-"* ]]; then # if we were passed a type list (not just a find primitive starting with "-")
typestr="$1"
while [[ "${#typestr}" -gt 0 ]]; do
case "${typestr:0:1}" in # this looks at the first char of typestr
c) filetypes+=(-o -name '*.[ch]');;
C) filetypes+=(-o -name '*.cpp');;
m) filetypes+=(-o -name '*.make' -o -name '*.cmake');;
p) filetypes+=(-o -name '*.py');;
t) filetypes+=(-o -name '*.txt');;
?) echo "Usage: $0 [cCmpt] [find options]" >&2
return 1 ;;
esac
typestr="${typestr:1}" # remove first character, so we can process the remainder
done
# Note: at this point filetypes will be something like '-o' -name '*.txt' -o -name '*.[ch]'
# To use it with find, we need to remove the first element (`-o`), and add parens
filetypes=( \( "${filetypes[@]:1}" \) )
shift # and get rid of $1, so it doesn't get passed to `find` later!
fi
# Run `find`
find . \( -name .hg -o -name bld -o -name install \) -prune -o "${filetypes[@]}" "$@"
}
...you could also use a similar approach to building a list of directories to prune, if you wanted to.
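For example, assuming the function above is loaded in your shell, 'ct' would select C sources/headers and text files:
findhgbldins ct -exec egrep -H 'pattern' {} \;
# roughly equivalent to:
# find . \( -name .hg -o -name bld -o -name install \) -prune -o \( -name '*.[ch]' -o -name '*.txt' \) -exec egrep -H 'pattern' {} ';'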
As I said, that'd be my preferred option. But there is a trick (and I do mean trick), if you really want to use the variable approach. It's called a magic alias, and it takes advantage of the fact that aliases are expanded before wildcards, but functions are processed afterward, and does something completely unnatural with the combination. Something like this:
alias findhgbldins='shopts="$SHELLOPTS"; set -f; noglob_helper find . \( -name .hg -o -name bld -o -name install \) -prune -o'
noglob_helper() {
"$@"
case "$shopts" in
*noglob*) ;;
*) set +f ;;
esac
unset shopts
}
export SEARCHALL="( -name *.cmake -o -name *.txt -o -name *.[hc] -o -name *.py -o -name *.cpp )"
Then if you run findhgbldins $SEARCHALL -exec egrep -H 'pattern' {} \;, it expands the alias, records the current shell options, turns off globbing, and passes the find command (including $SEARCHALL, word-split but not glob-expanded) to noglob_helper, which runs the find command with all options, then turns glob expansion back on (if it wasn't disabled in the saved shell options) so it doesn't mess you up later. It's a complete hack, but it should actually work.

Using find on multiple file extensions in combination with grep

I am having problems using find and grep together in msys on Windows. However, I also tried the same command on a Linux machine and it behaved the same. Nevertheless, the syntax below is for Windows, in that the semicolon at the end of the command is not preceded by a backslash.
I am trying to write a find expression to find *.cpp and *.h files and pass the results to grep. If I run this alone, it successfully finds all the .cpp and .h files:
find . -name '*.cpp' -o -name '*.h'
But if I add in an exec grep expression like this:
find . -name '*.cpp' -o -name '*.h' -exec grep -l 'std::deque' {} ;
It only greps the .h files. If I switch the .h and .cpp order in the command, it only greps the .cpp files. Essentially, it appears to only grep the last file extension in the expression. What do I need to do to grep both .h and .cpp?
Since you're using -o, you will need to use parentheses around it:
find . \( -name '*.cpp' -o -name '*.h' \) -exec grep -l 'std::deque' {} \;
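The reason is operator precedence: find's implicit -a (between -name '*.h' and -exec) binds more tightly than -o, so without the parentheses the original command is parsed as if it had been written:
find . -name '*.cpp' -o \( -name '*.h' -exec grep -l 'std::deque' {} \; \)
which is why only the .h files were grepped.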
Or you can do it like this:
bash$> grep '/bin' `find . -name "*.pl" -o -name "*.sh"`
./a.sh:#!/bin/bash
./pop3.pl:#!/usr/bin/perl
./seek.pl:#!/usr/bin/perl -w
./move.sh:#!/bin/bash
bash$>
The above command greps for '/bin' in .sh and .pl files, and it found them.
