Does a '%' in a pattern match an empty string? - makefile

From the docs:
A vpath pattern is a string containing a % character. The string
must match the file name of a prerequisite that is being searched for,
the % character matching any sequence of zero or more characters (as
in pattern rules).
Now, although % does match an empty string (a string of zero length) in a vpath pattern (vpath % foo), this is not true for pattern rules.
So the documentation above is wrong to equate the two when it says:
...the '%' character matching any sequence of zero or more characters (as
in pattern rules).
This is simply not true, as the following Makefile shows:
all ::
al%l :
	@echo '$@'
Executing, we get:
# It is evident that 'all' doesn't match 'al%l'
$ make -r
make: Nothing to be done for 'all'.
# But, 'all' does match 'al%'
$ make -r -f makefile -f <(echo 'al% : ; echo $@')
echo all
all
In fact, this is well documented:
For example, %.c as a pattern matches any file name that ends in
.c. s.%.c as a pattern matches any file name that starts with s.,
ends in .c and is at least five characters long. (There must be at
least one character to match the %.) The substring that the %
matches is called the "stem".
Agree?

Yes, it does.
The problem in your example is that you are mixing single-colon and double-colon rules. This is explicitly not allowed; you need to use one or the other for all matching rules.
Also, rules with different patterns do not count as the same target: the most specific match will usually be run and the others ignored (even if a zero-width match, as in your example, might be present).

Related

bash pattern substitution to remove an arbitrary long sequence of letters

My script deals with filenames which are padded with the letter x to a certain length, so a file may be abcdxxxxxx or fooxxxxxxx. I have the filename stored in a variable fn, and I want to extract just the "stem", i.e. abcd or foo.
I obviously can do this by forking a sed or tr process and feeding the filename into it, but bash also has a feature called pattern substitution for variables, and I was wondering whether this could be used.
From the bash man page:
${parameter/pattern/string}
Pattern substitution. The pattern is expanded to produce a pattern just as in pathname expansion. Parameter is expanded and the longest match of pattern against its value is replaced with string. If pattern begins with /, all matches of pattern are replaced with string. Normally only the first match is replaced.... If pattern begins with %, it must match at the end of the expanded value of parameter.
Now, a pattern denoting the letter x is just x, and since the pattern should match at the end, I need %x:
echo ${fn/%x/}
This does indeed return the filename with the last x removed. But I want to have all x's removed, i.e. all occurrences of the pattern, which according to the man page requires that the pattern start with a slash. I understand this to turn %x into either /%x or %/x. However, neither echo ${fn//%x/} nor echo ${fn/%/x/} produces the expected result.
Did I misunderstand something in the description of pattern substitution?
Regarding the substring replacements (/, //, /#, /%), the linked reference describes them as follows, towards the end:
${var/Pattern/Replacement}
First match of Pattern, within var replaced with Replacement.
${var//Pattern/Replacement}
Global replacement. All matches of Pattern, within var replaced with Replacement.
${var/#Pattern/Replacement}
If prefix of var matches Pattern, then substitute Replacement for Pattern.
${var/%Pattern/Replacement}
If suffix of var matches Pattern, then substitute Replacement for Pattern.
So it's first match, all matches, prefix string or suffix string. As with globbing, you can't use x* in the sense of regular expressions, so you are left with the options described in the other answers.
Try:
echo "${fn%${fn##*[^x]}}"
Examples
$ fn=abcdxxxxxx; echo "${fn%${fn##*[^x]}}"
abcd
$ fn=fooxxxxxxx; echo "${fn%${fn##*[^x]}}"
foo
How it works
For starters, ${parameter##word} is prefix removal. It removes the longest match of word from the beginning of parameter. In our case, ${fn##*[^x]} is the filename with everything removed from the front up to and including the last character that is not x. This leaves only the trailing x's. For example:
$ fn=abcdxxxxxx; echo "${fn##*[^x]}"
xxxxxx
${parameter%word} is suffix removal. It removes word from the end of parameter. In our case, we want to remove the trailing x's (as found above) from $fn. Thus we want ${fn%${fn##*[^x]}}.
Doubling the percent sign will do what you want:
echo "${fn%%x*}"
"Remove, from the end of the string, x and all the characters that follow it"
Or you can use extended globs:
shopt -s extglob
echo "${fn/%+(x)/}"
"Replace, at the end of the string, a sequence of one or more x's with nothing"
Assuming you have the filename in the shell variable fn, then in bash you can do:
if [[ $fn =~ x+$ ]]; then
    # BASH_REMATCH holds the matched trailing run of x's; quote it so it is removed literally
    echo "${fn%"$BASH_REMATCH"}"
fi
This will print the filename with the matched part removed. If you want it to work also when there are no x's at the end of the filename, replace x+$ with x*$ above, in which case it will always match.
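The answers above give different results once the stem itself contains an x. A minimal sketch to compare them, using a made-up filename axbxxx (not from the question):

```shell
#!/usr/bin/env bash
# Hypothetical filename whose stem ("axb") itself contains an x.
fn=axbxxx

# extglob is needed for the +(x) pattern used below.
shopt -s extglob

# Nested expansion: strip exactly the trailing run of x's.
nested="${fn%"${fn##*[^x]}"}"

# %%x* removes everything from the FIRST x onwards.
greedy="${fn%%x*}"

# End-anchored extglob substitution: delete one or more x's at the end.
anchored="${fn/%+(x)/}"

# Regex match on the trailing x's, then remove the matched text.
regex=$fn
if [[ $fn =~ x+$ ]]; then
    regex="${fn%"$BASH_REMATCH"}"
fi

printf '%s\n' "nested=$nested" "greedy=$greedy" "anchored=$anchored" "regex=$regex"
```

In this sketch only ${fn%%x*} cuts into the stem, so that form seems safe only when the stem is known to contain no x.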
As for the pattern substitution, my guess is that it will only attempt to replace a match once at a given location, even if you add the / to replace all matches. So when it matches the last x at the end of the string, it will not go back to an earlier location in the string to see if it matches again. Basically this means you cannot combine % and /. If my guess is correct, that is :)
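To see what the two attempts from the question actually do, it helps to store and print their results. A small sketch, based on my reading of bash's behaviour: after //, a leading % appears to be taken as a literal character rather than as an end anchor, and in ${fn/%/x/} the pattern after % is empty, so the replacement x/ is simply appended.

```shell
#!/usr/bin/env bash
fn=abcdxxxxxx

# Global replacement of the literal two-character string "%x": nothing matches here.
global="${fn//%x/}"

# End-anchored empty pattern with replacement "x/": the replacement is appended.
appended="${fn/%/x/}"

printf '%s\n' "global=$global" "appended=$appended"
```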

What does "output_dir="${1%/}" mean in .sh file?

I have not seen usage like this before. Can anyone provide relevant information? The source code is from im2txt.
See the bash manual:
${parameter%word}
${parameter%%word}
The word is expanded to produce a pattern and matched according to the rules described below (see Pattern Matching). If the pattern matches a trailing portion of the expanded value of parameter, then the result of the expansion is the value of parameter with the shortest matching pattern (the ‘%’ case) or the longest matching pattern (the ‘%%’ case) deleted. [...]
The relevant alternative here is the ‘%’ case (shortest match). The parameter in question is $1, i.e. the first command line argument the script was called with. The pattern is a simple / which will be removed if present. In other words, the expansion removes an optional trailing slash.
Demonstration (the y case shows that it's just a trailing pattern, z demonstrates no match):
$ x=aaa/; y=aaa/bbb; z=aaa; echo "$x -> ${x%/}"; echo "$y -> ${y%/}"; echo "$z -> ${z%/}"
aaa/ -> aaa
aaa/bbb -> aaa/bbb
aaa -> aaa
It basically removes a trailing "/" character from the first argument passed to the script in question.
If you had "/home/users/" as a string, then output_dir would become "/home/users"
You can find more details on string manipulation in bash here.
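Note that ${1%/} removes at most one slash. As an illustrative sketch (the function names strip_one and strip_all are made up here and are not taken from the im2txt script), the single-slash form can be compared with an extglob variant that strips every trailing slash:

```shell
#!/usr/bin/env bash
shopt -s extglob   # needed for the +(/) pattern below

# Remove a single trailing slash, as in output_dir="${1%/}".
strip_one() { printf '%s\n' "${1%/}"; }

# Remove all trailing slashes (longest match of one or more "/").
strip_all() { printf '%s\n' "${1%%+(/)}"; }

one=$(strip_one "/home/users//")
all=$(strip_all "/home/users//")
printf '%s\n' "one=$one" "all=$all"
```

For a well-formed directory argument the two behave the same; the difference only shows up when the caller passes more than one trailing slash.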

How to rename multiple files?

In a folder I have several files with the following name-structure (I write just three examples):
F_001_4837_blabla1.doc
F_045_8987_blabla2.doc
F_168_9092_blabla3.doc
What I would like to do is use a Bash command to rename all the files in my folder by deleting the first underscore and the run of zeros before the first number code, obtaining:
F1_4837_blabla1.doc
F45_8987_blabla2.doc
F168_9092_blabla3.doc
shopt -s extglob
for f in *; do
    echo "$f: ${f/_*(0)/}"
    # mv "$f" "${f/_*(0)/}" # for the actual rename
done
output
F_001_4837_blabla1.doc: F1_4837_blabla1.doc
F_045_8987_blabla2.doc: F45_8987_blabla2.doc
F_168_9092_blabla3.doc: F168_9092_blabla3.doc
Parameter Expansion
Parameter expansion can be used to replace the content of a variable. In this case, we replace the pattern _*(0) with nothing.
${parameter/pattern/string}
Pattern substitution. The pattern is expanded to produce a pattern just as in pathname expansion. Parameter is expanded and the longest match of pattern against its value is replaced with string. If pattern begins with /, all matches of pattern are replaced with string. Normally only the first match is replaced. If pattern begins with #, it must match at the beginning of the expanded value of parameter. If pattern begins with %, it must match at the end of the expanded value of parameter. If string is null, matches of pattern are deleted and the / following pattern may be omitted. If parameter is @ or *, the substitution operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with @ or *, the substitution operation is applied to each member of the array in turn, and the expansion is the resultant list.
Extended pattern matching
Extended pattern matching allows us to use the pattern *(0) to match zero or more 0 characters. It needs to be enabled using the extglob setting.
If the extglob shell option is enabled using the shopt builtin, several extended pattern matching operators are recognized. In the following description, a pattern-list is a list of one or more patterns separated by a |. Composite patterns may be formed using one or more of the following sub-patterns:
?(pattern-list)
    Matches zero or one occurrence of the given patterns
*(pattern-list)
    Matches zero or more occurrences of the given patterns
+(pattern-list)
    Matches one or more occurrences of the given patterns
@(pattern-list)
    Matches one of the given patterns
!(pattern-list)
    Matches anything except one of the given patterns
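To illustrate why the answer uses *(0) rather than +(0), here is a small sketch using two of the filenames from the question (the variable names are only for illustration). With +(0), a name that has no zeros after the first underscore is left unchanged.

```shell
#!/usr/bin/env bash
shopt -s extglob

with_zeros=F_001_4837_blabla1.doc
no_zeros=F_168_9092_blabla3.doc

# *(0): underscore followed by zero or more 0s, so both names match.
star_a="${with_zeros/_*(0)/}"
star_b="${no_zeros/_*(0)/}"

# +(0): underscore followed by at least one 0, so the second name does not match.
plus_a="${with_zeros/_+(0)/}"
plus_b="${no_zeros/_+(0)/}"

printf '%s\n' "$star_a" "$star_b" "$plus_a" "$plus_b"
```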

What is the difference between % and * in a makefile

The GNU make manual does not excel at explaining this part, I could not find the explanation or I could not infer the information elsewhere.
I realize % is a kind of wildcard, but what is the difference between % and * in the context of targets, dependencies and commands? Where can I use it and does it have the same meaning everywhere?
target: dependencies ...
	commands
The wildcard character * is used to simply generate a list of matching files in the current directory. The pattern substitution character % is a placeholder for a file which may or may not exist at the moment.
To expand on the Wildcard pitfall example from the manual which you had already discovered,
objects = *.o
The proper way to phrase that if there are no '.o' files is something like
objects := $(patsubst %.c,%.o,$(wildcard *.c))
make itself performs no wildcard expansion in this context, but of course, if you pass the literal value *.o to the shell, that's when expansion happens (if there are matches), and so this can be slightly hard to debug. make will perform wildcard expansion in the targets and prerequisites of a rule, so you can say
foo: *.o
and have it work exactly like you intended (provided the required files are guaranteed to exist at the time this dependency is evaluated).
By contrast, you can have a rule with a pattern placeholder, which gets filled in with any matching name as make tries to find a recipe which can be used to generate a required dependency. There are built-in rules like
%.o: %.c
	$(CC) $(CCFLAGS) $^ -o $@
(approximating the real thing here) which say "given a file matching %.c, the corresponding file %.o can be generated as follows." Here, the % is a placeholder which can be replaced by anything; so if it is applied against an existing file foo.c it says how foo.o can be generated.
You could rephrase it to say * matches every matching file while % matches any matching file.
Both % and * are ordinary characters in Make recipe lines; they are just passed to the shell.
% denotes a file "stem" in pattern substitutions, as in $(patsubst %.o,%.c,$(OBJS)). The pattern %.o is applied to each element in $(OBJS), and % captures the matching part. Then in the replacement pattern %.c, the captured part is substituted for the %, and a list of the substitutions emerges out of patsubst as the return value.
* is useful in the argument of the $(wildcard ...) operator, where it resembles the action of the shell * glob in matching some paths in the filesystem.
On the left hand side of a patsubst, where % denotes a match, it resembles * in that it matches some characters. However, % carries some restrictions, such as that it can only appear once! For instance whereas we can expand the wildcard */*.c, of course, we cannot have a double stem pattern substitution like $(patsubst %/%.o,%/foo/%.c,...). This restriction could be lifted in some future version of GNU Make, but it currently holds as far as I know.
Also there is a subtle difference between % and * in that % matches a nonempty sequence of characters. The wildcard pattern fo*o.c matches foo.c. The substitution pattern fo%o.c does not match foo.c, because then the stem % would be empty which is not allowed.

Functions "filter" and "filter-out" do not remove newlines

From the docs:
$(filter PATTERN...,TEXT)
Returns all whitespace-separated words in TEXT that do match any
of the PATTERN words, removing any words that do not match. The
patterns are written using %, just like the patterns used in the
patsubst function above.
$(filter-out PATTERN...,TEXT)
Returns all whitespace-separated words in TEXT that do not match
any of the PATTERN words, removing the words that do match one or
more. This is the exact opposite of the filter function.
What does "whitespace-separated words" mean?
Well, we think we know. At least, when assuming a "normal" locale.
So, for the "C" ("POSIX") locale we have:
"space"
Define characters to be classified as white-space characters.
In the POSIX locale, at a minimum, the <space>, <form-feed>, <newline>, <carriage-return>, <tab>, and <vertical-tab> shall be included.
Now, a makefile, like this:
define foo
a
b
endef
all :
	echo '$(filter a b,$(foo))'
Running, I get:
echo ''
Let's try the filter-out case:
define foo
a
b
endef
all :
	-echo '$(filter-out a b,$(foo))'
Running, I get:
echo 'a
/bin/sh: 1: Syntax error: Unterminated quoted string
makefile:8: recipe for target 'all' failed
make: [all] Error 2 (ignored)
b'
/bin/sh: 1: Syntax error: Unterminated quoted string
makefile:8: recipe for target 'all' failed
make: [all] Error 2 (ignored)
So, clearly, Make does not properly handle a legitimate whitespace character (newline) here.
Right?
The thing is that you need to either escape the newline characters in your foo variable or pass its value to a proper place.
Just as when writing any embedded shell script inside a makefile, you need to escape every newline. $(foo) simply pastes in the content of the multi-line variable foo. Hence, for your given foo value, the recipe below will raise a syntax error:
test1:
	echo '$(foo)'
The same applies to your filter-out example. I'm not sure why the filter function gives no syntax error.
1st solution. As mentioned above, escaping a newline character is one of the solutions:
define foo
a\
b
endef
test1:
	echo '$(foo)'
The benefit is that you don't need to change your all recipe.
2nd solution. In most cases, you probably don't want to change, edit or parse your multi-line variable. In that case you need to use the shell function, which invokes a shell command directly instead of pasting a script into the makefile contents and then parsing it. Our test recipe will look like this:
define foo
a
b
endef
test2:
	echo $(shell echo '$(foo)')
Note that newlines in the output are converted to single spaces by the shell function.
