What do ## or // mean in bash shell script?

I have searched a lot, and while I see a couple of examples of these used, specifically from here:
scale=${scale##*[!0-9]*}
[ -z "${scale//[0-9]}" ]
There is no explanation for what these symbols do, how they work, or when to use them in scripting. I have not found them explained elsewhere when special symbols are discussed. They look like they could be useful. Can anyone explain how the ## and // work in the script examples on the page linked above? Thanks.

They're part of shell parameter expansion syntax, used to modify the value of the variable. # and % are used to delete a prefix or suffix of the variable, and // is used to substitute one string for another.
${parameter#word}
${parameter##word}
The word is expanded to produce a pattern just as in filename expansion (see Filename Expansion). If the pattern matches the beginning of the expanded value of parameter, then the result of the expansion is the expanded value of parameter with the shortest matching pattern (the ‘#’ case) or the longest matching pattern (the ‘##’ case) deleted.
So ${scale##*[!0-9]*} means to remove the beginning of the string that matches anything followed by a non-digit followed by anything. So foobar becomes an empty string (because everything is removed), while 123 is left alone because [!0-9] never matches anything.
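As a minimal sketch of the shortest-versus-longest distinction, the snippet below applies # and ## to a made-up path and then repeats the scale pattern from the question. The variable names and values are illustrative assumptions, not taken from the linked script.

```bash
#!/usr/bin/env bash
# Hypothetical value, chosen only to show shortest vs. longest prefix removal.
path="usr/local/lib/libfoo.so"

shortest=${path#*/}    # removes "usr/"           -> local/lib/libfoo.so
longest=${path##*/}    # removes "usr/local/lib/" -> libfoo.so

# The pattern from the question: "anything, a non-digit, anything".
scale="12a4"
bad=${scale##*[!0-9]*}    # the whole value matches, so everything is removed
scale="1234"
good=${scale##*[!0-9]*}   # no non-digit present, so nothing is removed

printf '%s\n' "$shortest" "$longest" "[$bad]" "[$good]"
```

With a pattern that can only match the whole string or nothing, as in the scale case, # and ## give the same result; the difference only shows up when several prefixes of different lengths match.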
${parameter/pattern/string}
The pattern is expanded to produce a pattern just as in filename expansion. Parameter is expanded and the longest match of pattern against its value is replaced with string. If pattern begins with ‘/’, all matches of pattern are replaced with string. Normally only the first match is replaced. If pattern begins with ‘#’, it must match at the beginning of the expanded value of parameter. If pattern begins with ‘%’, it must match at the end of the expanded value of parameter. If string is null, matches of pattern are deleted and the / following pattern may be omitted.
So ${scale//[0-9]} simply removes all digits from the value of the variable, then test -z is used to test if this is an empty string (meaning the original string only had digits).
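The digit check can be wrapped in a small helper, sketched below. The name is_digits is a hypothetical choice, and the extra -n test is an addition: deleting digits from an empty string also yields an empty string, so the bare -z test accepts "" as well. The last lines contrast / (first match) with // (all matches).

```bash
#!/usr/bin/env bash
# is_digits is a hypothetical helper name used only for this sketch.
is_digits() {
    # Require a non-empty argument, then require that nothing is left
    # once every digit has been deleted.
    [ -n "$1" ] && [ -z "${1//[0-9]}" ]
}

version="1.2.3"
first_only=${version/./-}   # replaces the first "." only -> 1-2.3
all=${version//./-}         # replaces every "."          -> 1-2-3

if is_digits "2048"; then
    echo "2048 is all digits"
fi
```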

From: http://tldp.org/LDP/abs/html/string-manipulation.html
${string##substring}
Deletes longest match of $substring from front of $string.
${string//substring/replacement}
Replace all matches of $substring with $replacement.

Related

Simple bash function to find/replace string variable (no files)

I simply want a function (or just a 1-liner) to find/replace a string inside a variable, and not worry if the variables contain crazy characters.
Pseudo-code:
findReplace () {
#what goes here?
}
myLongVar="some long \crazy/ text my_placeholder bla"
replace="my_placeholder"
replaceWith="I like hamburgers/fries"
myFinalVar=$(findReplace $myLongVar $replace $replaceWith)
All similar questions seem complicated and use files
You can define the function like this:
findReplace() {
printf "%s" "${1/"$2"/$3}"
}
And then run it like this:
myFinalVar=$(findReplace "$myLongVar" "$replace" "$replaceWith")
Note the double-quotes -- they're very important, because without them bash will split the variables' values into separate words (e.g. "some long \crazy/ text..." -> "some" "long" "\crazy/" "text...") and also try to expand anything that looks like a wildcard into a list of matching filenames. It's ok to leave them off on the right side of an assignment (myFinalVar=...), but that's one of the few places where it's ok. Also, note that within the function I put double-quotes around $2 -- in that case again it's to keep it from being treated as a wildcard pattern, but here it'd be a string-match wildcard rather than filenames. Oh, and I used printf "%s" instead of echo because some versions of echo do weird things with strings that contain backslashes and/or start with "-".
And, of course, you can just skip the function and do the replacement directly:
myFinalVar=${myLongVar/"$replace"/$replaceWith}
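To see why the quotes around the search string matter, here is a small sketch with a search string that contains a glob character. The text and needle values are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Hypothetical input: the search string contains "*", a glob character.
text='price is 5*3 and 5*3 again'
needle='5*3'

# Unquoted: "*" acts as a wildcard, so the longest match "5*3 and 5*3" is replaced.
unquoted=${text/$needle/X}

# Quoted: the needle is matched literally, so only the first "5*3" is replaced.
quoted=${text/"$needle"/X}

# Quoted with //: every literal "5*3" is replaced.
quoted_all=${text//"$needle"/X}

printf '%s\n' "$unquoted" "$quoted" "$quoted_all"
```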
Try:
myFinalVar=${myLongVar/$replace/$replaceWith}
If you want to replace all occurrences of $replace, not just the first, use:
myFinalVar=${myLongVar//$replace/$replaceWith}
Documentation
From man bash:
${parameter/pattern/string}
Pattern substitution. The pattern is expanded to produce a pattern
just as in pathname expansion. Parameter is expanded and the longest
match of pattern against its value is replaced with
string. If pattern begins with /, all matches of pattern are
replaced with string. Normally only the first match is replaced. If
pattern begins with #, it must match at the beginning of
the expanded value of parameter. If pattern begins with %, it must
match at the end of the expanded value of parameter. If string is
null, matches of pattern are deleted and the / following pattern may
be omitted. If the nocasematch shell option is enabled, the match is
performed without regard to the case of alphabetic
characters. If parameter is @ or *, the substitution operation is
applied to each positional parameter in turn, and the
expansion is the resultant list. If parameter is an array variable
subscripted with @ or *, the substitution operation is applied to each
member of the array in turn, and the expansion is the
resultant list.
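The anchored forms and the per-element behaviour described in that excerpt can be sketched as follows. All names and values here are illustrative assumptions.

```bash
#!/usr/bin/env bash
name="foo.foo.txt"
at_start=${name/#foo/bar}   # pattern must match at the beginning -> bar.foo.txt
at_end=${name/%txt/md}      # pattern must match at the end       -> foo.foo.md

# With an array subscripted by @, the substitution runs on each element.
files=("a.txt" "b.txt" "notes.md")
renamed=("${files[@]/%.txt/.bak}")   # a.bak b.bak notes.md

# With @ as the parameter, it runs on each positional parameter.
set -- "one fish" "two fish"
swapped=("${@/fish/bird}")           # "one bird" "two bird"

printf '%s\n' "$at_start" "$at_end" "${renamed[@]}" "${swapped[@]}"
```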

What does ${img_file%.*} in a shell script mean?

I know that .* means fetch all files regardless of the extensions (I hope I'm not wrong). However, I can't for the life of me figure out what that extra % sign means!
Here are two code snippets that might help describe the situation a bit more:
img_files=${img_files}' '$(ls ${TRAINING_DIR}/*.exp${exposure}.tif)
for img_file in ${img_files}; do
run_command tesseract ${img_file} ${img_file%.*} \
${box_config} ${config} &
For those who need even more details, here's the full script.
The expression ${img_file%.*} will remove the rightmost dot and any character after it in the variable img_file. From man bash:
${parameter%word}
${parameter%%word}
Remove matching suffix pattern. The word is expanded to produce
a pattern just as in pathname expansion. If the pattern matches
a trailing portion of the expanded value of parameter, then the
result of the expansion is the expanded value of parameter with
the shortest matching pattern (the ``%'' case) or the longest
matching pattern (the ``%%'' case) deleted.
Example:
>var="word1 word2"
>echo ${var%word2}
word1
>echo ${var%word1}
word1 word2
% here means removal from the right edge. For example,
consider a variable img_file="racecar"
${img_file%c*} returns race (the shortest suffix matching c* is removed).
${img_file%%c*} returns ra (the longest suffix matching c* is removed).
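Applied to a file path, the same operators give the usual "strip extension" and "strip directory" idioms. The path below is a hypothetical stand-in for one of the .tif files in the question.

```bash
#!/usr/bin/env bash
# Hypothetical path in the style of the question's training files.
img_file="/data/eng.arial.exp0.tif"

no_ext=${img_file%.*}       # shortest ".*" suffix removed -> /data/eng.arial.exp0
first_dot=${img_file%%.*}   # longest ".*" suffix removed  -> /data/eng
dir=${img_file%/*}          # shortest "/*" suffix removed -> /data
base=${img_file##*/}        # longest "*/" prefix removed  -> eng.arial.exp0.tif

printf '%s\n' "$no_ext" "$first_dot" "$dir" "$base"
```

Note that %%.* cuts at the first dot anywhere in the value, so it gives surprising results when a directory name contains a dot.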

What does the POSIX spec mean when it says this is necessary to avoid ambiguity?

When responding to this comment:
Now I got that the two ":"s are independent, and that's why I couldn't find any document about them. Is the first one needed in this case?
I noticed this paragraph in the spec for the first time:
In the parameter expansions shown previously, use of the <colon> in the format shall result in a test for a parameter that is unset or null; omission of the <colon> shall result in a test for a parameter that is only unset. If parameter is '#' and the colon is omitted, the application shall ensure that word is specified (this is necessary to avoid ambiguity with the string length expansion).
I've seen the matching explanation in the bash reference manual:
When not performing substring expansion, using the form described below (e.g., ‘:-’), Bash tests for a parameter that is unset or null. Omitting the colon results in a test only for a parameter that is unset. Put another way, if the colon is included, the operator tests for both parameter’s existence and that its value is not null; if the colon is omitted, the operator tests only for existence.
before and I understand what the difference is with the colon versions of these expansions.
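For reference, the colon and no-colon behaviour can be checked with a short sketch. The variable name v and the word default are arbitrary.

```bash
#!/usr/bin/env bash
unset v
unset_nocolon=${v-default}   # v is unset       -> default
unset_colon=${v:-default}    # v is unset       -> default

v=""
null_nocolon=${v-default}    # v is set but null -> "" (only "unset" is tested)
null_colon=${v:-default}     # v is set but null -> default

printf '[%s]\n' "$unset_nocolon" "$unset_colon" "$null_nocolon" "$null_colon"
```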
What confused me just now is this sentence from the spec:
If parameter is '#' and the colon is omitted, the application shall ensure that word is specified (this is necessary to avoid ambiguity with the string length expansion).
I don't understand what ambiguity is possible here if word is unspecified.
None of the expansion sigils are valid in shell variable names so they cannot possibly start a single-character variable name. If they could then using a parameter of # would always be ambiguous without a colon since you could never tell if ${#+foo} meant the length of the variable foo or an alternate expansion on #, etc.
What am I missing here? What ambiguity requires ensuring that word exist? (I mean not having word in this expansion is clearly not useful but that's not the same thing.)
- is also a shell special parameter, whose value is a string indicating which shell options are currently set. For example,
$ echo $-
himBH
${#parameter} is the syntax for the length of a parameter.
$ foo=bar
$ echo ${#foo}
3
The expression ${#-}, therefore, is ambiguous: is it the length of the value of $-, or does it expand to the empty string if $# is unset? (Unlikely, since $# is always an integer and cannot be unset, but syntactically legal.) I interpret the spec to mean that ${#-} should resolve the ambiguity by expanding to the length of $- (which is what most shells seem to do).
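A quick way to check how a given shell resolves this is sketched below. It assumes bash; other shells may differ, which is the point of the spec's wording. The variable names are made up.

```bash
#!/usr/bin/env bash
# Copy $- into an ordinary variable so its length can be taken unambiguously.
flags=$-
len_via_copy=${#flags}

# In bash this is read as "length of $-", not as a default-value expansion on $#.
len_direct=${#-}

# With a colon and a word there is nothing to confuse: the parameter is "#".
set -- a b c
count=${#:-none}   # $# is set and non-null, so this is just $#

printf '%s\n' "$flags" "$len_via_copy" "$len_direct" "$count"
```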

bash copy file where some of the filename is not known

In a bash script I want to copy a file, but the file name will change over time.
The start and end of the file name will, however, stay the same.
Is there a way to get the file like so:
cp start~end.jar
where ~ can be anything?
The cp command would be run in a bash script on an Ubuntu machine, if this makes any difference.
A glob (start*end) will give you all matching files.
Check out the Expansion > Pathname Expansion > Pattern Matching section of the bash manual for more specific control
* Matches any string, including the null string.
? Matches any single character.
[...] Matches any one of the enclosed characters. A pair of characters separated by a hyphen denotes a range expression; any character that sorts between those two characters, inclusive, using the current locale's collating sequence and character set, is matched. If the first character following the [ is a ! or a ^ then any character not enclosed is matched. The sorting order of characters in range expressions is determined by the current locale and the value of the LC_COLLATE shell variable, if set. A - may be matched by including it as the first or last character in the set. A ] may be matched by including it as the first character in the set.
and if you enable extglob:
?(pattern-list)
Matches zero or one occurrence of the given patterns
*(pattern-list)
Matches zero or more occurrences of the given patterns
+(pattern-list)
Matches one or more occurrences of the given patterns
@(pattern-list)
Matches one of the given patterns
!(pattern-list)
Matches anything except one of the given patterns
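As a rough sketch of how the extglob operators tighten a match, the function below accepts only names whose middle part is one or more digits. The function name and the file names are hypothetical, and the pattern is used inside [[ ]] rather than for filename expansion.

```bash
#!/usr/bin/env bash
shopt -s extglob

# has_numeric_middle is a hypothetical helper: it succeeds only when the text
# between "start" and "end.jar" is one or more digits.
has_numeric_middle() {
    [[ $1 == start+([0-9])end.jar ]]
}

for f in start123end.jar startend.jar startXend.jar; do
    if has_numeric_middle "$f"; then
        echo "match:    $f"
    else
        echo "no match: $f"
    fi
done
```

A plain start*end.jar would accept all three names; +([0-9]) rejects the empty middle and the non-digit middle.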
Use a glob to capture the variable text:
cp start*end.jar
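Because a glob can match zero or several files, a script may want to check the match count before copying. The sketch below does that in a throwaway directory; the file names, the src and dest directories, and the "exactly one match" rule are all assumptions made for the example.

```bash
#!/usr/bin/env bash
# Work in a temporary directory so the sketch has no side effects.
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/dest"
touch "$workdir/src/start-1.4.2-end.jar" "$workdir/src/other.jar"

# nullglob makes an unmatched glob expand to nothing instead of to itself.
shopt -s nullglob
candidates=("$workdir"/src/start*end.jar)

if [ "${#candidates[@]}" -eq 1 ]; then
    cp -- "${candidates[0]}" "$workdir/dest/"
    copied=yes
else
    echo "expected exactly one match, found ${#candidates[@]}" >&2
    copied=no
fi

dest_listing=$(ls "$workdir/dest")
rm -rf "$workdir"
echo "copied=$copied file=$dest_listing"
```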

what does ## mean inside ${}

I am reading a shell script from GitHub: script
Two lines of code in it confused me. I have never seen ## used in bash like this before.
Could anyone explain to me how it works? Thanks.
branch_name=$(git symbolic-ref -q HEAD)
branch_name=${branch_name##refs/heads/}
Note: The first line produces something like 'refs/heads/master',
and the next line removes the leading refs/heads/ so that branch_name becomes master.
From the bash(1) man page, EXPANSION section, Parameter Expansion subsection:
${parameter#word}
${parameter##word}
Remove matching prefix pattern. The word is expanded to produce
a pattern just as in pathname expansion. If the pattern matches
the beginning of the value of parameter, then the result of the
expansion is the expanded value of parameter with the shortest
matching pattern (the ``#'' case) or the longest matching
pattern (the ``##'' case) deleted.
Also available in the manual, of course (but it doesn't seem to support linking to this exact text; search the page for ##).
Have a look here, where a lot of other string manipulation tricks are described. In short:
${string##substring}
Deletes longest match of $substring from front of $string.
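The branch-name case can be tried without a repository by hard-coding a ref string. The value below is a hypothetical example of what git symbolic-ref might print.

```bash
#!/usr/bin/env bash
# Hypothetical ref, standing in for the output of: git symbolic-ref -q HEAD
ref="refs/heads/feature/login"

branch=${ref##refs/heads/}     # literal prefix removed        -> feature/login
leaf=${ref##*/}                # longest "*/" prefix removed   -> login
after_first=${ref#*/}          # shortest "*/" prefix removed  -> heads/feature/login
untouched=${ref##refs/tags/}   # prefix does not match, value is unchanged

printf '%s\n' "$branch" "$leaf" "$after_first" "$untouched"
```

Since refs/heads/ contains no wildcard, # and ## behave identically for it; the two only differ for patterns such as */ that can match prefixes of several lengths.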

Resources