Bash tilde *substring* expansion - undocumented feature? - bash

I was surprised that this expansion:
$ echo "${foo:~abc}"
yielded the empty string when foo was unset. I expected that it would parse like this:
$ echo "${foo:(~abc)}"
and yield the string "~abc". But instead, I found that if I did define
$ foo='abcdefg'
$ echo "${foo:~abc}"
g
In fact, it evaluates "abc" in arithmetic context (where an unset variable counts as 0) and so performs "${foo:~0}". Likewise
$ foo='abcdefg'
$ echo "${foo:~3}"
defg
It gets you the last n+1 characters of the expansion. I looked in the "Parameter Expansion" section of the manpage. I see no mention of tildes there. Bash Hackers Wiki only mentions tildes as (also undocumented) case modifiers.
This behavior goes back to at least 3.2.57.
Am I just missing where this form of substring expansion is documented, or is it not documented at all?

It's not undocumented (you may have been confusing ${foo:~abc} with ${foo-~abc}).
${parameter:offset}
${parameter:offset:length}
Substring Expansion. Expands to up to length characters of the
value of parameter starting at the character specified by offset.
[...] If length is omitted, expands to the substring of the
value of parameter starting at the character specified by offset
and extending to the end of the value. length and offset are
arithmetic expressions (see ARITHMETIC EVALUATION below).
Here, ~abc is the offset field of the expansion, and ~ is the bitwise negation operator in the arithmetic expression. An undefined parameter evaluates to 0 in an arithmetic expression, and ~0 == -1.
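A minimal sketch of that arithmetic, assuming bash and that abc is unset (the names neg, last1, last4 and spaced are only illustrative):

```bash
foo='abcdefg'
unset abc                 # an unset name evaluates to 0 in arithmetic context

neg=$(( ~abc ))           # bitwise NOT of 0 is -1
last1=${foo:~abc}         # offset -1: the last character
last4=${foo:~3}           # ~3 is -4: the last four characters
spaced=${foo: -4}         # the usual spelling; the space keeps it from parsing as ${foo:-4}

echo "$neg $last1 $last4 $spaced"
```

So ${foo:~n} is just another way of writing ${foo: -(n+1)}, which is why it yields the last n+1 characters.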

Related

Order of brace expansion and parameter expansion

A common trope on StackOverflow bash is: "Why doesn't x=99; echo {1..$x} work?"
The answer is "because braces are expanded before parameters/variables".
Therefore, I thought it should be possible to expand multiple variables using a single $ and a brace. I'd expect a=1; b=2; c=3; echo ${{a..c}} to print 1 2 3. First, the inner brace would expand to ${a} ${b} ${c} (which it does when writing echo \${{a..c}}). Then that result would undergo parameter expansion.
However, I got -bash: ${{a..c}}: bad substitution so {a..c} wasn't expanded at all.
Bash's manual is a bit more specific (emphasis mine).
Expansion is performed on the command line after it has been split into tokens [...]
The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and filename expansion.
Note the ; and , in that list. "Left-to-right fashion" seems to apply to the whole (therefore unordered) list before the ;. Just like the mathematical operators * and / have no precedence over each other.
Ok, so brace expansion is not really of higher precedence than parameter expansion. It's just that both {1..$x} and ${{a..c}} are evaluated from left to right, meaning the brace { comes before the parameter $x and the parameter ${ comes before the brace {a..c}.
Or so I thought. However, when using $ instead of ${ then parameters on the left expand after braces on the right:
# in bash 5.0.3(1)
x=nil; x1=one; x2=two
echo ${x{1..2}} # prints `-bash: ${x{1..2}}: bad substitution`
echo $x{1..2} # prints `one two`
Question
Could it be that the bash manual is flawed or did I read it wrong?
If the manual is flawed: What is the exact order of all expansions?
I'm just asking because I'm curious. I don't plan to use things like $x{1..2} anywhere. I'm not interested in better solutions or alternatives to address multiple variables (e.g. array slices ${array[@]:1:2}). I just want to get a deeper understanding.
from: https://www.gnu.org/software/bash/manual/html_node/Brace-Expansion.html
To avoid conflicts with parameter expansion, the string ‘${’ is not
considered eligible for brace expansion, and inhibits brace expansion
until the closing ‘}’.
That said, for echo $x{1..2} , first the brace expansion takes place, and then the parameter expansion, so we have echo $x1 $x2. For echo ${x{1..2}} the brace expansion doesn't happen, because we are after the ${ and haven't reached the closing } of the parameter expansion.
Regarding the bash manual part you have quoted, left-to-right order still exists for the expansions (with respect to allowed nested ones). Things get clearer if you format the list instead of using , and ;:
brace expansion
In a left-to-right fashion:
tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution
word splitting
filename expansion.
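The ordering above can be checked with a small sketch, assuming bash with brace expansion enabled (the default); the variable names after_brace and literal are made up for the illustration:

```bash
x=nil; x1=one; x2=two; n=3

# Braces are expanded first, giving $x1 $x2, and only then the parameters.
after_brace=$(echo $x{1..2})

# When braces are expanded, {1..$n} is not yet a valid sequence expression,
# so it is left alone; parameter expansion then turns it into the text {1..3}.
literal=$(echo {1..$n})

echo "$after_brace"
echo "$literal"
```

The first line printed is one two, the second is the unexpanded {1..3}.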
Read Mo Budlong's 1988 classic Command Line Psychology, which was written for regular Unix, but most of it still applies to bash. The order of evaluation goes:
1 History substitution (except for the Bourne shell)
2 Splitting words, including special characters
3 Updating the history list (except for the Bourne shell)
4 Interpreting single and double quotes
5 Alias substitution (except for the Bourne shell)
6 Redirection of input and output (< > and |)
7 Variable substitution (variables starting with $)
8 Command substitution (commands inside back quotes)
9 File name expansion (file name wild cards)
So what bash does with code like {1..3} happens before step 7 above, and that's why the OP code fails.
But if we must, there's always eval, (which should only be used if the variables are known in advance, or first cautiously type checked):
a=1; b=2; c=3; eval echo \{$a..$c}
Output:
1 2 3
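As a rough sketch of the "cautiously type checked" part, assuming the bounds should be plain non-negative integers (expand_range is a hypothetical helper name, not something bash provides):

```bash
expand_range() {
  local lo=$1 hi=$2
  # Refuse anything that is not a plain non-negative integer before it reaches eval.
  [[ $lo =~ ^[0-9]+$ && $hi =~ ^[0-9]+$ ]] || return 1
  # Inside the double quotes only the parameters expand; eval then sees {1..3}.
  eval "echo {$lo..$hi}"
}

a=1; c=3
expand_range "$a" "$c"
expand_range 1 '3}; echo oops' || echo "rejected"
```

The second call never reaches eval, because the regex check fails first.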

Understanding Bash Script argument expressions

With the help of a number of web searches, SO questions and answers, and trial-and-error I have written the following script to send attachments to an email.
attachments=""
subject=""
args=( "$@" ) # Copy arguments
recipient="${@: -1}" # Last argument
unset "args[${#args[@]}-1]" # Remove last argument
for i in "${args[@]}"; do # Remaining Arguments
attachments="$attachments -a $i"
subject="$subject $i"
done
eval "echo 'See Attached …' | mail -r 'Fred <fred@example.net>' $attachments -s \"Attached: $subject\" $recipient"
It works perfectly using something like
send.sh file1 file2 file3 recipient@example.com
I have omitted some of the refinements in the above code, such as error checking, but the whole thing works as planned.
I have no trouble with the process, and I have good programming skills. However I find that Bash scripting is like medieval Latin to me, and I am having a hard time understanding the four expressions which I have commented.
The idea is that I pop the last argument, which is supposed to be the recipient, and loop through the remaining arguments which will be attached files.
Can anybody detail the meanings of the expressions $@, ${@: -1}, ${args[@]}, and args[${#args[@]}-1], and explain what the hash is doing in the last expression?
No doubt the script could stand some improvement, but I am only trying to understand what is happening so far.
It's all in bash manual shell parameter expansion and some in bash special parameters. So:
Can anybody detail the meanings of the expressions $@
From the manual, important parts:
$@
($@) Expands to the positional parameters, starting from one. [...]
if not within double quotes, these words are subject to word splitting. In contexts where word splitting is not performed, this expands to a single word with each positional parameter separated by a space. When the expansion occurs within double quotes, and word splitting is performed, each parameter expands to a separate word. That is, "$@" is equivalent to "$1" "$2" ...
So "$@" is equal to "$1" "$2" "$3" ... for each parameter passed. Word splitting is what happens when a variable is expanded without quotes: its value is split into separate arguments on whitespace (more precisely, on the characters in IFS). For example, a="arg1 arg2 arg3"; f $a runs f with 3 arguments.
${@: -1}
From shell parameter expansion:
${parameter:offset}
${parameter:offset:length}
It expands to up to length characters of the value of parameter starting at the character specified by offset [...]
If parameter is ‘@’, the result is length positional parameters beginning at offset. A negative offset is taken relative to one greater than the greatest positional parameter, so an offset of -1 evaluates to the last positional parameter.
So ${@: -1} is the last positional argument passed to the script. The additional space is there because ${parameter:-word} means something different.
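To see why the space matters, here is a small sketch, assuming bash; set -- just provides stand-in positional parameters, and the names last, fallback and alt are illustrative:

```bash
set -- one two three      # stand-in positional parameters

last=${@: -1}             # substring form: offset -1, the last parameter
alt=${@:(-1)}             # parentheses also keep the minus away from the colon

unset maybe
fallback=${maybe:-1}      # default-value form: expands to 1 because maybe is unset

echo "$last $alt $fallback"
```

Both last and alt hold three, while the form without a space is the "use default value" expansion and yields 1.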
${args[@]}
From bash manual arrays:
Any element of an array may be referenced using ${name[subscript]}. The braces are required to avoid conflicts with the shell’s filename expansion operators. If the subscript is ‘@’ or ‘*’, the word expands to all members of the array name.
${args[@]} is equal to ${args[0]} ${args[1]} ${args[2]} (bash arrays are indexed from zero). Note that without quotes word splitting is performed. In your code you have for i in "${args[@]}", so each element is preserved as a single word.
args[${#args[@]}-1]
From bash manual shell parameter expansion:
${#parameter}
If parameter is an array name subscripted by ‘*’ or ‘@’, the value substituted is the number of elements in the array.
So ${#args[@]} expands to the count of elements in the array. The count of elements minus 1 is the index of the last element. So args[${#args[@]}-1] is args[<the index of last array element>]. The unset "args[${#args[@]}-1]" is used to remove the last array element.
explain what the hash is doing in the last expression?
In ${#args[@]} the leading hash is the length operator: ${#parameter} gives the length of a value, and for an array subscripted by ‘@’ or ‘*’ it gives the number of elements.
what ( "$@" ) is doing.
From manual:
Arrays are assigned to using compound assignments of the form
name=(value1 value2 … )
The var=("$@") creates an array var with the copy of positional parameters properly expanded with words preserved.
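Putting the four expressions together, a minimal sketch, assuming bash (split_args, files and the sample arguments are hypothetical; nothing is mailed here):

```bash
split_args() {
  local args=( "$@" )            # copy the positional parameters, one element each
  recipient=${@: -1}             # last positional parameter
  unset "args[${#args[@]}-1]"    # element count minus 1 is the last index
  files=( "${args[@]}" )         # what remains are the attachments
}

split_args "file one.txt" file2 user@example.com
printf 'recipient=%s count=%s first=%s\n' "$recipient" "${#files[@]}" "${files[0]}"
```

Because every expansion is quoted, the file name containing a space stays a single element.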
Everything is explained somewhere in the Bash Manual
$@ in Special Parameters
Expands to the positional parameters, starting from one. In contexts where word splitting is performed, this expands each positional parameter to a separate word; if not within double quotes, these words are subject to word splitting.
${@: -1} in Shell Parameter Expansion
${parameter:offset}
${parameter:offset:length}
... If parameter is ‘@’, the result is length positional parameters beginning at offset. A negative offset is taken relative to one greater than the greatest positional parameter, so an offset of -1 evaluates to the last positional parameter.
${args[@]} in Arrays
Any element of an array may be referenced using ${name[subscript]}. The braces are required to avoid conflicts with the shell’s filename expansion operators. If the subscript is ‘@’ or ‘*’, the word expands to all members of the array name.
args[${#args[@]}-1] also in Arrays:
${#name[subscript]} expands to the length of ${name[subscript]}. If subscript is ‘@’ or ‘*’, the expansion is the number of elements in the array.

Nested parameter expansion: Why is ${foo%"$bar"} legal, but ${$bar} not?

I'll start with the two motivating examples, to give proper context for the question, and then ask the question. First consider this example:
$ ext=.mp3
$ fname=file.mp3
$ echo ${fname%"$ext"}
file
Evidently, in parsing ${fname%"$ext"}, bash first expands $ext into .mp3, and then expands ${fname%.mp3} into file — the last step follows trivially from the definition of % expansions. What's confusing me is the expansion of $ext...
In particular, let's compare the above with this similar example:
$ a=value
$ b=a
$ echo ${$b}
-bash: ${$b}: bad substitution
Of course, I know I could use "indirect expansion" here to achieve what I want:
$ echo ${!b}
value
But that's not relevant to my question. I want to understand the specific bash evaluation and parsing rules that explain why ${$b} fails but ${fname%"$ext"} succeeds.
The only relevant passage I've found in man bash is this:
The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and pathname expansion.
But I'm not seeing how the different behaviors result from these rules.
I'd like see an explanation that explains each step of the evaluation process of the two examples, and the rule underlying each step.
If you look up ${parameter%word} expansion in the bash manual you'll see that parameter and word are treated differently. word is subject to pathname expansion while parameter is not.
${parameter%word}
${parameter%%word}
Remove matching suffix pattern. The word is expanded to produce a
pattern just as in pathname expansion. If the pattern matches a
trailing portion of the expanded value of parameter, then the result
of the expansion is the expanded value of parameter with the shortest
matching pattern (the % case) or the longest matching pattern (the
%% case) deleted. If parameter is @ or *, the pattern removal
operation is applied to each positional parameter in turn, and the
expansion is the resultant list. If parameter is an array variable
subscripted with @ or *, the pattern removal operation is applied to
each member of the array in turn, and the expansion is the resultant
list.
That seems like it would explain it. But it doesn't. Pathname expansion only means globbing and pattern matching with *, ?, and the like. It doesn't include variable expansion.
The key is to read up. There's a preamble that applies to the above:
In each of the cases below, word is subject to tilde expansion, parameter expansion, command substitution, and arithmetic expansion.
In totality, word is subject to all of these expansions. Key to this question: $ext is expanded via recursive parameter expansion.
I say "recursive" because it can in fact be nested arbitrarily deep. To wit:
$ echo ${fname%.mp3}
file
$ echo ${fname%"$ext"}
file
$ echo ${fname%"${ext%"$empty"}"}
file
$ echo ${fname%"${ext%"${empty%""}"}"}
file
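The other expansions named in that preamble work inside word too. A small sketch, assuming bash (the names by_param, by_cmd, by_arith and ref are only for illustration):

```bash
fname=file.mp3
ext=.mp3

by_param=${fname%"$ext"}                  # parameter expansion inside word
by_cmd=${fname%"$(printf '%s' .mp3)"}     # command substitution inside word
by_arith=${fname%.mp$((1 + 2))}           # arithmetic expansion inside word

# The parameter slot, by contrast, takes a name, so a level of
# indirection has to be requested explicitly with ${!name}.
ref=fname
indirect=${!ref}

echo "$by_param $by_cmd $by_arith $indirect"
```

All three suffix removals give file, while ${!ref} gives the full value of fname.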

Why this expansion inside expansion pattern doesn't work?

Suppose you have something like:
$ a=(fooa foob foox)
Then you can do:
$ b=(${(M)a:#*(a|b)})
To select a's elements matching the pattern.
So you have:
$ print ${(qq)b}
'fooa' 'foob'
Then you expect to build the pattern in some dynamic way, so you have it in another variable, say:
$ p="*(a|b)"
And you expect this:
$ b=(${(M)a:#$p})
Would work the same as before, as the documentation says, but it doesn't:
$ print ${(qq)b}
''
Why is that?
Because zsh tries to select $p's value literally (a plain string text) in this case:
a=('*(a|b)' fooa foob)
p="*(a|b)"
b=(${(M)a:#$p})
print ${(qq)b}
;#⇒'*(a|b)'
We could tell zsh to treat $p's expansion as patterns rather than literal values explicitly by a ${~spec} form.
${~spec}
Turn on the GLOB_SUBST option for the evaluation of spec; if the ‘~’ is doubled, turn it off. When this option is set, the string resulting from the expansion will be interpreted as a pattern anywhere that is possible, such as in filename expansion and filename generation and pattern-matching contexts like the right hand side of the ‘=’ and ‘!=’ operators in conditions.
-- zshexpn(1): Expansion, Parameter Expansion
In this case, we could use it like this:
a=(fooa foob foox)
p="*(a|b)"
b=(${(M)a:#${~p}}) ;# tell zsh to treat the value of `$p` as a pattern
print ${(qq)b}
;#⇒'fooa' 'foob'
Note: It gives some hints in the parameter expansion flag b for storing patterns in variable values:
b
Quote with backslashes only characters that are special to pattern matching. This is useful when the contents of the variable are to be tested using GLOB_SUBST, including the ${~...} switch.
Quoting using one of the q family of flags does not work for this purpose since quotes are not stripped from non-pattern characters by GLOB_SUBST. In other words,
pattern=${(q)str}
[[ $str = ${~pattern} ]]
works if $str is ‘a*b’ but not if it is ‘a b’, whereas
pattern=${(b)str}
[[ $str = ${~pattern} ]]
is always true for any possible value of $str.
-- zshexpn(1): Expansion, Parameter Expansion, Parameter Expansion Flags

Issue with Log files generation [duplicate]

I have a simple question but I wonder what is the difference between ${varname} and $varname ?
I use both but I don't see any difference which could tell me when to use one or the other.
Using {} in variable names helps get rid of ambiguity while performing variable expansion.
Consider two variables var and varname. Let's say you wanted to append the string name to the variable var. You can't say $varname because that would result in the expansion of the variable varname. However, saying ${var}name would help you achieve the desired result.
$ var="This is var variable."
$ varname="This is varname variable."
$ echo $varname
This is varname variable.
$ echo ${var}name
This is var variable.name
Braces are also required when accessing any element of an array.
$ a=( foo bar baz ) # Declare an array
$ echo $a[0] # Accessing first element -- INCORRECT
foo[0]
$ echo ${a[0]} # Accessing first element -- CORRECT
foo
Quoting from info bash:
Any element of an array may be referenced using ${name[subscript]}.
The braces are required to avoid conflicts with pathname expansion.
They are the same in a basic case, but using ${varname} gives more control and ability to work with the variable. It also skips edge cases in which it can create confusion. And finally, it enables variable expansion as described in Shell Parameter Expansion:
The ‘$’ character introduces parameter expansion, command
substitution, or arithmetic expansion. The parameter name or symbol to
be expanded may be enclosed in braces, which are optional but serve to
protect the variable to be expanded from characters immediately
following it which could be interpreted as part of the name.
When braces are used, the matching ending brace is the first ‘}’ not
escaped by a backslash or within a quoted string, and not within an
embedded arithmetic expansion, command substitution, or parameter
expansion.
The basic form of parameter expansion is ${parameter}. The value of
parameter is substituted. The braces are required when parameter is a
positional parameter with more than one digit, or when parameter is
followed by a character that is not to be interpreted as part of its
name.
Let's see a basic example. Here, the use of ${} allows us to do something that a simple $ does not. Say we want to write $myvar + "blabla":
$ myvar=23
$ echo $myvar
23
$ echo $myvarblabla
<--- the variable $myvarblabla doesn't exist!
$ echo ${myvar}blabla
23blabla
The distinction becomes important when something follows the variable:
text="House"
plural="${text}s"
Without the braces, the shell would read texts as the variable name; that variable is not set, so plural would end up empty.
The braces are also necessary when you use the extended syntax to specify defaults (${name-default}), display errors when undefined (${name?error}), or pattern substitution (see this article for other patterns; it's for BASH but most work for KSH as well)
> echo $name-default
-default
> echo ${name-default}
default
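The cases quoted from the manual can be tried side by side. A minimal sketch, assuming bash; set -- supplies stand-in positional parameters and all variable names are illustrative:

```bash
text="House"
unset texts

with_braces="${text}s"    # the name ends at the closing brace: Houses
without="$texts"          # read as the (unset) variable texts: empty

set -- a b c d e f g h i j
tenth="${10}"             # braces are required for the tenth parameter: j
not_tenth="$10"           # parsed as $1 followed by a literal 0: a0

echo "$with_braces [$without] $tenth $not_tenth"
```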
Related:
Parameter Substitution in Korn-/POSIX-Shell
