Why does field splitting not occur after parameter expansion in an assignment statement in shell? - shell

Consider the following three commands.
$ a="foo bar"
$ b=$a
$ b=foo bar
bash: bar: command not found
Why does the second assignment work fine? How is the second command any different from the third command?
I was expecting the second assignment to fail because
b=$a
would expand to
b=foo bar
Since $a is not within double-quotes, foo bar is not quoted, so field splitting should occur (as I understand it). That would make b=foo an assignment and bar a command that cannot be found.
Summary: I was expecting the 2nd command to fail for the same reason that caused the 3rd command to fail. Why does the 2nd command succeed?
I went through the POSIX specification, but I could not find anything that says field splitting does not occur after a parameter expansion in an assignment.
Anywhere else, field splitting would occur for an unquoted parameter after parameter expansion. For example,
$ a="foo bar"
$ printf "[%s] [%s]\n" $a
[foo] [bar]
See Section 2.6.5.
After parameter expansion (Parameter Expansion), command substitution (Command Substitution), and arithmetic expansion (Arithmetic Expansion), the shell shall scan the results of expansions and substitutions that did not occur in double-quotes for field splitting and multiple fields can result.
So which part of the POSIX standard prevents field splitting when parameter expansion occurs in an assignment statement?

In 2.9.1, "Simple Commands":
The words that are recognized as variable assignments or redirections according to Shell Grammar Rules are saved for processing in steps 3 and 4.
Step 2 -- which is explicitly skipped in this case per the above text -- reiterates that it ignores assignments when performing expansion and field splitting:
The words that are not variable assignments or redirections shall be expanded. If any fields remain following their expansion, the first field shall be considered the command name and remaining fields are the arguments for the command.
Thus, it's step 2 that determines the command to run (based on contents other than variable assignments and redirections), which addresses the b=$a case given in your question.
Step 4 performs other expansions -- "tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal" -- for assignments. Notably, field splitting is not a member of this set. Indeed, it's explicit in 2.6 that none of these create multiple words in and of themselves:
Tilde expansions, parameter expansions, command substitutions, arithmetic expansions, and quote removals that occur within a single word expand to a single field. It is only field splitting or pathname expansion that can create multiple fields from a single word. The single exception to this rule is the expansion of the special parameter '@' within double-quotes, as described in Special Parameters.
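As a minimal sketch of the difference (assuming the default IFS; the variable name count is only for illustration), the same unquoted $a stays one field in an assignment but is split when it appears as an ordinary word:

```shell
a="foo bar"

# Assignment word: parameter expansion and quote removal only, no field splitting.
b=$a

# Ordinary words: the unquoted expansion is split into two fields.
set -- $a
count=$#

printf 'b=[%s] count=%s\n' "$b" "$count"
```

With the default IFS this prints b=[foo bar] count=2.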

Related

Are quotes necessary in bash when declaring local variables based on the command line argument variable expansion? [duplicate]

Are the quotes in the example below necessary or superfluous, and why?
#!/bin/bash
arg1="$1"
arg2="$2"
How do you explain that, when $1 is 123 echo abc, the first assignment is not interpreted as:
arg1=123 echo abc
which is a normal call of a command (echo) with the argument abc and an environment variable (arg1) passed to its execution?
From section 2.9.1 of the POSIX shell syntax specification:
Each variable assignment shall be expanded for tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal prior to assigning the value.
String-splitting and globbing (the steps which double quotes suppress) are not in this list.
Thus, the quotes are superfluous -- not just for assignments where the right-hand side refers to a positional parameter, but for all assignments barring those where (1) the behavior of single-quoted, not double-quoted, strings is desired; or (2) whitespace or other content in the value would otherwise be parsed as syntactic rather than literal.
(Note that the decision on how to parse a command -- thus, whether it is an assignment, a simple command, a compound command, or something else -- takes place before parameter expansions; thus, var=$1 is determined to be an assignment before the value of $1 is ever considered! Were this untrue, such that data could silently become syntax, it would be far more difficult -- if not impossible -- to write secure code handling untrusted data in bash).
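A small sketch of this, using set -- to stand in for the command-line arguments (the argument values are made up for illustration):

```shell
# Simulate a script called with two awkward arguments.
set -- '123 echo abc' '*'

# Both lines are parsed as assignments before $1 and $2 are expanded,
# so the values are neither split into words nor glob-expanded.
arg1=$1
arg2=$2

printf '[%s]\n' "$arg1" "$arg2"
```

This covers plain assignments only. When the assignment is an argument to a builtin such as local or export, bash treats it the same way, but as far as I know some other shells may still split it, so quoting there is the safer habit.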

Bash - Why does $VAR1=FOO or 'VAR=FOO' (with quotes) return command not found?

For each of two examples below I'll try to explain what result I expected and what I got instead. I'm hoping for you to help me understand why I was wrong.
1)
VAR1=VAR2
$VAR1=FOO
result: -bash: VAR2=FOO: command not found
In the second line, $VAR1 gets expanded to VAR2, but why does Bash interpret the resulting VAR2=FOO as a command name rather than a variable assignment?
2)
'VAR=FOO'
result: -bash: VAR=FOO: command not found
Why do the quotes make Bash treat the variable assignment as a command name?
Could you please describe, step by step, how Bash processes my two examples?
How best to indirectly assign variables is adequately answered in other Q&A entries in this knowledgebase. Among those:
Indirect variable assignment in bash
Saving function output into a variable named in an argument
If that's what you actually intend to ask, then this question should be closed as a duplicate. I'm going to make a contrary assumption and focus on the literal question -- why your other approaches failed -- below.
What does the POSIX sh language specify as a valid assignment? Why does $var1=foo or 'var=foo' fail?
Background: On the POSIX sh specification
The POSIX shell command language specification is very specific about what constitutes an assignment, as quoted below:
4.21 Variable Assignment
In the shell command language, a word consisting of the following parts:
varname=value
When used in a context where assignment is defined to occur and at no other time, the value (representing a word or field) shall be assigned as the value of the variable denoted by varname.
The varname and value parts shall meet the requirements for a name and a word, respectively, except that they are delimited by the embedded unquoted equals-sign, in addition to other delimiters.
Also, from section 2.9.1, on Simple Commands, with emphasis added:
 1. The words that are recognized as variable assignments or redirections according to Shell Grammar Rules are saved for processing in steps 3 and 4.
 2. The words that are not variable assignments or redirections shall be expanded. If any fields remain following their expansion, the first field shall be considered the command name and remaining fields are the arguments for the command.
 3. Redirections shall be performed as described in Redirection.
 4. Each variable assignment shall be expanded for tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal prior to assigning the value.
Also, from the grammar:
If all the characters preceding '=' form a valid name (see the Base Definitions volume of IEEE Std 1003.1-2001, Section 3.230, Name), the token ASSIGNMENT_WORD shall be returned. (Quoted characters cannot participate in forming a valid name.)
Note from this:
The command must be recognized as an assignment at the very beginning of the parsing sequence, before any expansions (or quote removal!) have taken place.
The name must be a valid name. Literal quotes are not part of a valid variable name.
The equals sign must be unquoted. In your second example, the entire string was quoted.
Assignments are recognized before tilde expansion, parameter expansion, command substitution, etc.
Why $var1=foo fails to act as an assignment
As given in the grammar, all characters before the = in an assignment must be valid characters within a variable name for an assignment to be recognized. $ is not a valid character in a name. Because assignments are recognized in step 1 of simple command processing, before expansion takes place, the literal text $var1, not the value of that variable, is used for this matching.
Why 'var=foo' fails to act as an assignment
First, all characters before the = must be valid in variable names, and ' is not valid in a variable name.
Second, an assignment is only recognized if the = is not quoted.
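To observe both failures without the error messages, one can capture the exit status instead. This is only a sketch: status1 and status2 are illustrative names, and 127 is the status the shell reports for a command that is not found.

```shell
unset VAR2 VAR
VAR1=VAR2

# Parsed as a command word, because "$VAR1" is not a valid name before the "=".
status1=0
$VAR1=FOO 2>/dev/null || status1=$?

# Parsed as a command word, because the "=" is quoted.
status2=0
'VAR=FOO' 2>/dev/null || status2=$?

printf '%s %s [%s] [%s]\n' "$status1" "$status2" "${VAR2-unset}" "${VAR-unset}"
```

Neither line assigns anything, so VAR2 and VAR are both still unset afterwards.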
1)
VAR1=VAR2
$VAR1=FOO
You want to use a variable name contained in a variable for the assignment. Bash syntax does not allow this. However, there is an easy workaround:
VAR1=VAR2
declare "$VAR1"=FOO
It works with local and export too.
2)
By using single quotes (double quotes would yield the same result), you are telling Bash that what is inside is a string and to treat it as a single entity. Since it is the first item on the line, Bash tries to find an alias, or shell builtin, or an executable file in its PATH, that would be named VAR=FOO. Not finding it, it tells you there is no such command.
An assignment is not a normal command. To perform an assignment contained in a quoted string, you would need to use eval, like so:
eval "$VAR1=FOO" # But please don't do that in real life
Most experienced bash programmers would tell you to avoid eval, as it has serious drawbacks; I give it as an example only to recommend against its use. In the example above it involves no security risk or error potential, because the value of VAR1 is known and safe, but there are many cases where an arbitrary (i.e. user-supplied) value could cause a crash or unexpected behavior. Quoting inside an eval statement is also more difficult and reduces readability.
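If the variable name really does come from outside, one possible alternative to eval is to validate the name first and then assign with printf -v, which is bash-specific. The helper name assign_indirect is made up for this sketch:

```shell
# assign_indirect NAME VALUE  (hypothetical helper; needs bash for printf -v)
assign_indirect() {
    case $1 in
        # Reject empty names, names starting with anything but a letter or
        # underscore, and names containing other characters.
        '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*) return 1 ;;
    esac
    # The value is passed as data and is never parsed as shell code.
    printf -v "$1" '%s' "$2"
}

VAR1=VAR2
assign_indirect "$VAR1" 'FOO; echo oops'
printf '[%s]\n' "$VAR2"

# An invalid name is refused instead of being evaluated.
rc=0
assign_indirect 'x; echo oops' value || rc=$?
```

Here the value FOO; echo oops ends up in VAR2 verbatim, and the second call returns 1 without assigning anything.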
You declare VAR2 earlier in the program, right?
If you are trying to assign the value of VAR2 to VAR1, then you need to make sure to use $ in front of VAR2, like so:
VAR1=$VAR2
That will set VAR1 to the value of VAR2, because the $ means "the value stored in this variable". Without it, VAR2 is just the literal string VAR2.
Basically, a word without a $ in front of it is not treated as a variable reference; at the start of a command line it is interpreted as a command name. That's why we have the $ to say "hey, this is a variable".

Parameterized substitutions (${foo%bar}, ${foo-bar}, etc) without using eval

From envsubst man:
These substitutions are a subset of the substitutions that a shell
performs on unquoted and double-quoted strings. Other kinds of
substitutions done by a shell, such as ${variable-default} or
$(command-list) or `command-list`, are not performed by the envsubst
program, due to security reasons.
I'd like to perform variable substitution on a string, supporting constructs like ${variable-default} or ${variable%suffix}. I don't want to allow running commands.
Apparently it's not possible using envsubst, on the other hand eval has serious security implications.
Is there some other possibility than writing custom interpolation function?
bash 4.4 introduced a new type of parameter expansion which might do what you want. Namely, ${foo@P} expands the value of foo as if it were a prompt string, and a prompt string does undergo a round of expansion just prior to being displayed.
${parameter@operator}
Parameter transformation. The expansion is either a transforma-
tion of the value of parameter or information about parameter
itself, depending on the value of operator. Each operator is a
single letter:
Q The expansion is a string that is the value of parameter
quoted in a format that can be reused as input.
E The expansion is a string that is the value of parameter
with backslash escape sequences expanded as with the
$'...' quoting mechanism.
P The expansion is a string that is the result of expanding
the value of parameter as if it were a prompt string (see
PROMPTING below).
A The expansion is a string in the form of an assignment
statement or declare command that, if evaluated, will
recreate parameter with its attributes and value.
a The expansion is a string consisting of flag values rep-
resenting parameter's attributes.
A quick example:
$ foo='${bar:-9}'
$ echo "$foo"
${bar:-9}
$ echo "${foo@P}"
9
$ bar=3
$ echo "${foo@P}"
3
It does, however, still allow running arbitrary commands via $(...):
$ foo='$(echo hi)'
$ echo "${foo@P}"
hi
Another caveat: it does, of course, also expand prompt escapes, so you may get more expansions than you expected if your string already contains backslashes. There is some conflict between prompt escapes and escapes intended for echo -e.

Bash escaping/expanding order

I'm fairly new to Bash and I'm having trouble working out what is happening to my input as it is interpreted. Specifically, when escaping occurs relative to the other expansion steps.
From what I've read, bash does the following (in order):
brace expansion
tilde expansion
parameter and variable expansion
command substitution
arithmetic expansion
word splitting
filename expansion
But this list doesn't say when escape sequences such as '\\' are converted into their meanings, such as '\'. For example, if I want to print a backslash character, the command to run is
echo \\
not
echo \
So the syntax required for the semantics of a backslash character is two backslashes. This must be converted into a single-backslash representation internally.
It seems to be sometime before command substitution as I found out with a small test program.
So, my question is: When does this step take place? (or a complete list of the bash interpretation loop would be perfect)
Also, are there any other subtleties in the interpreter that are likely to catch me out? (Related to knowing the complete list, I guess.)
From the man page's Expansion section, just before the Redirection section.
Quote Removal
After the preceding expansions, all unquoted occurrences of the characters \, ', and " that did not result from one of the above expansions
are removed.
Quote removal is one final process after the seven expansions you list.
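A quick way to watch quote removal at work (a sketch; the variable names are illustrative). printf '%s' is used instead of echo so that no further escape processing gets in the way:

```shell
# Unquoted \\ : the first backslash quotes the second, and quote removal
# then drops the quoting one, so printf receives a single backslash.
one=$(printf '%s' \\)

# Backslashes that come out of an expansion are not touched by quote removal.
v='a\\b'
two=$(printf '%s' "$v")

# Print the lengths of both results.
printf '%s %s\n' "${#one}" "${#two}"
```

This prints 1 4: one backslash survives from the literal \\, while all four characters of the expanded value are kept.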

Command substitution and field splitting in shell

I understand why the following command fails.
$ a=foo bar
-bash: bar: command not found
It attempts to first execute a=foo and then execute bar which fails because there is no such command called bar.
But I don't understand why this works. I was expecting the following command to fail as well.
$ a=$(echo foo bar)
$ echo "$a"
foo bar
As per http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_05, command substitution happens first, and then field splitting happens.
2.6 Word Expansions
This section describes the various expansions that are performed on
words. Not all expansions are performed on every word, as explained in
the following sections.
Tilde expansions, parameter expansions, command substitutions,
arithmetic expansions, and quote removals that occur within a single
word expand to a single field. It is only field splitting or pathname
expansion that can create multiple fields from a single word. The
single exception to this rule is the expansion of the special
parameter '@' within double-quotes, as described in Special
Parameters.
The order of word expansion shall be as follows:
 1. Tilde expansion (see Tilde Expansion), parameter expansion (see Parameter Expansion), command substitution (see Command Substitution), and arithmetic expansion (see Arithmetic Expansion) shall be performed, beginning to end. See item 5 in Token Recognition.
 2. Field splitting (see Field Splitting) shall be performed on the portions of the fields generated by step 1, unless IFS is null.
 3. Pathname expansion (see Pathname Expansion) shall be performed, unless set -f is in effect.
 4. Quote removal (see Quote Removal) shall always be performed last.
So after command substitution,
a=$(echo foo bar)
becomes
a=foo bar
And then after field splitting, a=foo should be executed first and then bar should be executed, and we should get the same error, i.e. bar: command not found. Why does a=$(echo foo bar) work fine then?
The answer is in 2.9.1 Simple Commands I believe.
Specifically points 1 and 4:
 1. The words that are recognized as variable assignments or redirections according to Shell Grammar Rules are saved for processing in steps 3 and 4.
 4. Each variable assignment shall be expanded for tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal prior to assigning the value.
Or in the bash reference manual in 3.4 Shell Parameters:
A variable may be assigned to by a statement of the form
name=[value]
If value is not given, the variable is assigned the null string. All values undergo tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, and quote removal (detailed below).
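A sketch that makes the missing field splitting visible: the command's output contains a run of three spaces, which survives the assignment but collapses once the value is expanded unquoted as an ordinary word (default IFS assumed; joined is an illustrative name):

```shell
# Assignment: the command substitution result is stored as one field.
a=$(printf 'foo   bar')

# Ordinary word: unquoted $a is split into "foo" and "bar", and echo
# joins its arguments with a single space.
joined=$(echo $a)

printf '[%s] [%s]\n' "$a" "$joined"
```

This prints [foo   bar] [foo bar].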
