Consider this line of code where deps contains a list of dependencies:
IFS=',' printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
I set IFS to , for a single invocation of printf, but strangely printf doesn't seem to respect IFS: it doesn't expand deps to a comma-separated list.
On the other hand, if I set IFS like so:
IFS=','
printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
printf correctly expands deps to a comma-separated list.
Am I missing something here?
In this command line:
IFS=',' printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
printf does not expand "${deps[*]}". The shell does the expansion. In fact, that's pretty well always true. Although printf happens to be a shell builtin, it doesn't do anything special to its arguments, and you would get exactly the same behaviour with an external printf.
The syntax
envvar=value program arg1 arg2 arg3
causes the shell to add envvar=value to the list of environment variables provided to program, and the strings arg1, arg2, and arg3 to be made into an argv list for program. Before all that happens, the shell does its normal expansions of various types, which will cause shell variables referenced in the value and the three arguments to be replaced with their values. But the environment variable setting envvar=value is not part of the shell's execution environment.
Equally,
FOO=World echo "Hello, $FOO"
will not use FOO=World when expanding $FOO in the argument to echo. "Hello, $FOO" is expanded by the shell in the shell's execution environment, and then passed to echo as an argument, and FOO=World is passed to echo as part of its environment.
Putting the variable setting in a separate command is completely different.
IFS=','; printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
first sets the value of IFS in the shell's own execution environment, before the shell expands the arguments of the printf command. When the shell then does its expansions in the arguments it will eventually pass to printf, it uses that value of IFS in order to expand the array deps[*]. In this case, IFS is not included in the environment variables passed to printf, unless IFS has previously been exported.
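For example (the package names here are made up for illustration):
$ deps=(libfoo libbar libbaz)
$ IFS=','
$ printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
setup-x86.exe -q -p='libfoo,libbar,libbaz'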
The use of IFS with the read builtin may seem confusing, but it is entirely consistent with the above. In the command
IFS=, read A B C
IFS=, is passed as part of the list of environment variables to read. read then consumes a line of input, and consults the value of IFS in its environment in order to figure out how to split the input line into words.
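For example:
$ IFS=, read A B C <<< "one,two,three"
$ echo "$B"
two
Only read saw the changed IFS; the shell's own IFS is untouched afterwards.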
In order to change IFS for the purposes of parameter expansion in an argument, the change must be made in the shell's environment, which is a global change. Since you rarely want to globally change the value of IFS, a common idiom is to change it within a subshell created with ():
( IFS=,; printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"; )
probably does what you want.
You can. Just save it first, use it and reset it:
oldifs=$IFS
IFS=','
printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
IFS=$oldifs
Now for your command, you can do it on one line by putting a ; between the commands:
IFS=','; printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
You still need to save and restore IFS though.
A one-shot IFS assignment like this does work for commands that consult IFS themselves, the classic case being read in a while loop. E.g.:
while IFS=$'\n' read -r line; do
    ...
done
The assignment then applies only to each invocation of read, not to the rest of the loop body.
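For instance, to read tab-separated fields without disturbing the global IFS (data.tsv is a hypothetical two-column input file):
while IFS=$'\t' read -r field1 field2; do
    printf '%s -> %s\n' "$field1" "$field2"
done < data.tsv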
Related
Seems that the recommended way of doing indirect variable setting in bash is to use eval:
var=x; val=foo
eval $var=$val
echo $x # --> foo
The problem is the usual one with eval:
var=x; val=1$'\n'pwd
eval $var=$val # bad output here
(and since it is recommended in many places, I wonder just how many scripts are vulnerable because of this...)
In any case, the obvious solution of using (escaped) quotes doesn't really work:
var=x; val=1\"$'\n'pwd\"
eval $var=\"$val\" # fail with the above
The thing is that bash has indirect variable reference baked in (with ${!foo}), but I don't see any such way to do indirect assignment -- is there any sane way to do this?
For the record, I did find a solution, but this is not something that I'd consider "sane"...:
eval "$var='"${val//\'/\'\"\'\"\'}"'"
A slightly better way, avoiding the possible security implications of using eval, is
declare "$var=$val"
Note that declare is a synonym for typeset in bash. The typeset command is more widely supported (ksh and zsh also use it):
typeset "$var=$val"
In modern versions of bash, one should use a nameref.
declare -n ref="$var" # ref is now a nameref to the variable named by $var
ref=$val
It's safer than eval, but still not perfect.
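A quick check of the nameref approach (bash 4.3 or later; ref is just an illustrative name):
var=x; val=foo
declare -n ref="$var"
ref=$val
echo $x # --> foo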
Bash has an extension to printf that saves its result into a variable:
printf -v "${VARNAME}" '%s' "${VALUE}"
This prevents all possible escaping issues.
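For instance, the problematic value from the earlier eval example is stored verbatim, and nothing gets executed:
var=x; val=1$'\n'pwd
printf -v "$var" '%s' "$val"
echo "$x" # prints the two data lines 1 and pwd; no command is run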
If you use an invalid identifier for $VARNAME, the command will fail and return status code 2:
$ printf -v ';;;' '%s' foobar; echo $?
bash: printf: `;;;': not a valid identifier
2
eval "$var=\$val"
The argument to eval should always be a single string enclosed in either single or double quotes. All code that deviates from this pattern has some unintended behavior in edge cases, such as file names with special characters.
When the argument to eval is expanded by the shell, the $var is replaced with the variable name, and the \$ is replaced with a simple dollar. The string that is evaluated therefore becomes:
varname=$val
This is exactly what you want.
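You can check it against the problematic value from the question:
var=x; val=1$'\n'pwd
eval "$var=\$val" # the string that gets evaluated is: x=$val
echo "$x" # prints 1 and pwd as two lines of data; pwd is never executed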
Generally, all expressions of the form $varname should be enclosed in double quotes, to prevent accidental expansion of filename patterns like *.c.
There are only two places where the quotes may be omitted since they are defined to not expand pathnames and split fields: variable assignments and case. POSIX 2018 says:
Each variable assignment shall be expanded for tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal prior to assigning the value.
This list of expansions is missing the pathname expansion and the field splitting. Sure, that's hard to see from reading this sentence alone, but that's the official definition.
Since this is a variable assignment, the quotes are not needed here. They don't hurt, though, so you could also write the original code as:
eval "$var=\"the value is \$val\""
Note that the second dollar is escaped using a backslash, to prevent it from being expanded in the first run. What happens is:
eval "$var=\"the value is \$val\""
The argument to the command eval is sent through parameter expansion and unescaping, resulting in:
varname="the value is $val"
This string is then evaluated as a variable assignment, which assigns the following value to the variable varname:
the value is value
The main point is that the recommended way to do this is:
eval "$var=\$val"
with the RHS deferred too. Since eval runs in the same
environment, it will have $val bound, so deferring the expansion works, and
by the time it is expanded it is just a plain variable. Since the $val variable has a known name,
there are no issues with quoting, and it could have even been written as:
eval $var=\$val
But since it's better to always add quotes, the former is better, or
even this:
eval "$var=\"\$val\""
A better alternative in bash, mentioned above, that avoids eval
completely (and is not as subtle as declare etc.):
printf -v "$var" "%s" "$val"
Though this is not a direct answer to what I originally asked...
Newer versions of bash support something called "parameter transformation", documented in a section of the same name in bash(1).
"${value#Q}" expands to a shell-quoted version of "${value}" that you can re-use as input.
Which means the following is a safe solution:
eval "${varname}=${value@Q}"
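For example, with the hostile value from the question (${value@Q} needs bash 4.4 or later):
varname=x; value=1$'\n'pwd
echo "${value@Q}" # --> $'1\npwd'
eval "${varname}=${value@Q}"
echo "$x" # prints the two data lines; pwd is never executed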
Just for completeness I also want to suggest the possible use of the bash builtin read. I've also made corrections regarding -d '' based on socowi's comments.
But much care needs to be exercised when using read: the input must be sanitized (-d '' reads until a NUL terminator, and printf "...\0" terminates the value with a NUL), and read itself must be executed in the main shell where the variable is needed, not in a sub-shell (hence the < <( ... ) syntax). Note the space in -d '': writing -d'' instead would collapse to a bare -d after quote removal, and read would swallow the next argument as its delimiter.
var=x; val=foo0shouldnotterminateearly
read -d '' -r "$var" < <(printf '%s\0' "$val")
echo $x # --> foo0shouldnotterminateearly
echo ${!var} # --> foo0shouldnotterminateearly
I tested this with \n, \t, \r, spaces, 0, etc.; it worked as expected on my version of bash.
The -r stops read from interpreting backslash escapes, so if your value contains the characters "\" and "n" rather than an actual newline, x will contain those two characters as well.
This method may not be as aesthetically pleasing as the eval or printf solutions, and would be more useful if the value is coming in from a file or other input file descriptor:
read -d '' -r "$var" < <( cat "$file" )
And here are some alternative suggestions for the < <() syntax:
read -d '' -r "$var" <<< "$val"$'\0'
read -d '' -r "$var" < <(printf '%s' "$val") # Apparently the \0 isn't even needed; the printf process ending is enough to trigger the read to finish.
read -d '' -r "$var" <<< "$(printf '%s' "$val")"
read -d '' -r "$var" <<< "$val"
Yet another way to accomplish this, without eval, is to use "read":
INDIRECT=foo
read -d '' -r "${INDIRECT}" <<<"$(( 2 * 2 ))"
echo "${foo}" # outputs "4"
Bash seems to behave unpredictably with regard to temporary, per-command variable assignment, specifically with IFS.
I often assign IFS to a temporary value in conjunction with the read command. I would like to use the same mechanic to tailor output, but currently resort to a function or subshell to contain the variable assignment.
$ while IFS=, read -a A; do
> echo "${A[#]:1:2}" # control (undesirable)
> done <<< alpha,bravo,charlie
bravo charlie
$ while IFS=, read -a A; do
> IFS=, echo "${A[*]:1:2}" # desired solution (failure)
> done <<< alpha,bravo,charlie
bravo charlie
$ perlJoin(){ local IFS="$1"; shift; echo "$*"; }
$ while IFS=, read -a A; do
> perlJoin , "${A[@]:1:2}" # function with local variable (success)
> done <<< alpha,bravo,charlie
bravo,charlie
$ while IFS=, read -a A; do
> (IFS=,; echo "${A[*]:1:2}") # assignment within subshell (success)
> done <<< alpha,bravo,charlie
bravo,charlie
If the second assignment in the following block does not affect the environment of the command, and it does not generate an error, then what is it for?
$ foo=bar
$ foo=qux echo $foo
bar
This is a common bash gotcha -- and https://www.shellcheck.net/ catches it:
foo=qux echo $foo
^-- SC2097: This assignment is only seen by the forked process.
^-- SC2098: This expansion will not see the mentioned assignment.
The issue is that the first foo=bar is setting a bash variable, not an environment variable. Then, the inline foo=qux syntax is used to set an environment variable for echo -- however echo never actually looks at that variable. Instead $foo gets recognized as a bash variable and replaced with bar.
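You can see that the environment assignment itself does work by running a command that actually inspects its environment:
$ foo=bar
$ foo=qux bash -c 'echo "$foo"'
qux
The single quotes keep the parent shell from expanding $foo before the child bash runs.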
So back to your main question, you were basically there with your final attempt using the subshell -- except that you don't actually need the subshell:
while IFS=, read -a A; do
    IFS=,; echo "${A[*]:1:2}"
done <<< alpha,bravo,charlie
outputs:
bravo,charlie
For completeness, here's a final example that reads in multiple lines and uses a different output separator to demonstrate that the different IFS assignments aren't stomping on each other:
while IFS=, read -a A; do
    IFS=:; echo "${A[*]:1:2}"
done < <(echo -e 'alpha,bravo,charlie\nfoo,bar,baz')
outputs:
bravo:charlie
bar:baz
The answer is a bit simpler than the other answers are presenting:
$ foo=bar
$ foo=qux echo $foo
bar
We see "bar" because the shell expands $foo before setting foo=qux
Simple Command Expansion -- there's a lot to get through here, so bear with me...
When a simple command is executed, the shell performs the following expansions, assignments, and redirections, from left to right.
The words that the parser has marked as variable assignments (those preceding the command name) and redirections are saved for later processing.
The words that are not variable assignments or redirections are expanded (see Shell Expansions). If any words remain after expansion, the first word is taken to be the name of the command and the remaining words are the arguments.
Redirections are performed as described above (see Redirections).
The text after the ‘=’ in each variable assignment undergoes tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal before being assigned to the variable.
If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment. If any of the assignments attempts to assign a value to a readonly variable, an error occurs, and the command exits with a non-zero status.
If no command name results, redirections are performed, but do not affect the current shell environment. A redirection error causes the command to exit with a non-zero status.
If there is a command name left after expansion, execution proceeds as described below. Otherwise, the command exits. If one of the expansions contained a command substitution, the exit status of the command is the exit status of the last command substitution performed. If there were no command substitutions, the command exits with a status of zero.
So:
the shell sees foo=qux and saves that for later
the shell sees $foo and expands it to "bar"
then we now have: foo=qux echo bar
Once you really understand the order that bash does things, a lot of the mystery goes away.
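As a check, a command that reads its environment directly (instead of having the shell expand the variable for it) does see the temporary assignment (printenv is assumed to be available, as on most systems):
$ foo=bar
$ foo=qux printenv foo
qux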
Short answer: the effects of changing IFS are complex and hard to understand, and best avoided except for a few well-defined idioms (IFS=, read ... is one of the idioms I consider ok).
Long answer: There are a couple of things you need to keep in mind in order to understand the results you're seeing from changes to IFS:
Using IFS=something as a prefix to a command changes IFS only for that one command's execution. In particular, it does not affect how the shell parses the arguments to be passed to that command; that's controlled by the shell's value of IFS, not the one used for the command's execution.
Some commands pay attention to the value of IFS they're executed with (e.g. read), but others don't (e.g. echo).
Given the above, IFS=, read -a A does what you'd expect, it splits its input on ",":
$ IFS=, read -a A <<<"alpha,bravo,charlie"
$ declare -p A
declare -a A='([0]="alpha" [1]="bravo" [2]="charlie")'
But echo pays no attention; it always puts spaces between the arguments it's passed, so using IFS=something as a prefix to it has no effect at all:
$ echo alpha bravo
alpha bravo
$ IFS=, echo alpha bravo
alpha bravo
So when you use IFS=, echo "${A[*]:1:2}", it's equivalent to just echo "${A[*]:1:2}", and since the shell's definition of IFS starts with space, it puts the elements of A together with spaces between them. So it's equivalent to running IFS=, echo "bravo charlie".
On the other hand, IFS=,; echo "${A[*]:1:2}" changes the shell's definition of IFS, so it does affect how the shell puts the elements together, so it comes out equivalent to echo "bravo,charlie". Unfortunately, it also affects everything else from that point on, so you either have to isolate it to a subshell or set it back to normal afterward.
Just for completeness, here are a couple of other versions that don't work:
$ IFS=,; echo "${A[@]:1:2}"
bravo charlie
In this case, the [@] tells the shell to treat each element of the array as a separate argument, so it's left to echo to merge them, and it ignores IFS and always uses spaces.
So how about this:
$ IFS=,; echo ${A[*]:1:2}
bravo charlie
In this case, the [*] tells the shell to mash all elements together with the first character of IFS between them, giving bravo,charlie. But it's not in double-quotes, so the shell immediately re-splits it on ",", splitting it back into separate arguments again (and then echo joins them with spaces as always).
If you want to change the shell's definition of IFS without having to isolate it to a subshell, there are a few options to change it and set it back afterward. In bash, you can set it back to normal like this:
$ IFS=,
$ while read -a A; do # Note: IFS change not needed here; it's already changed
> echo "${A[*]:1:2}"
> done <<<alpha,bravo,charlie
bravo,charlie
$ IFS=$' \t\n'
But the $'...' syntax isn't available in all shells; if you need portability it's best to use literal characters:
IFS='
' # You can't see it, but there's a literal space and tab after the first '
Some people prefer to use unset IFS, which just forces the shell to its default behavior, which is pretty much the same as with IFS defined in the normal way.
...but if IFS had been changed in some larger context, and you don't want to mess that up, you need to save it and then set it back. If it's been changed normally, this'll work:
saveIFS=$IFS
...
IFS=$saveIFS
...but if someone thought it was a good idea to use unset IFS, this will define it as blank, giving weird results. So you can use this approach or the unset approach, but not both. If you want to make this robust against the unset conflict, you can use something like this in bash:
saveIFS=${IFS:-$' \t\n'}
...or for portability, leave off the $' ' and use literal space+tab+newline:
saveIFS=${IFS:-
} # Again, there's an invisible space and tab at the end of the first line
All in all, it's a lot of mess full of traps for the unwary. I recommend avoiding it whenever possible.
I don't seem to be able to assign a string that begins with "-e" or "-E" to a bash shell variable:
$ options="-e stuff"
$ echo $options
stuff
Other letters work fine:
$ options="-g stuff"
$ echo $options
-g stuff
What is the reason for this?
You should quote your variable:
echo "${options}"
otherwise it's being expanded to
echo -e stuff
which is interpreted by echo as its -e option, which actually exists (see man echo), and that's why -e "did not work" while other letters you tried "did".
First: To reliably determine the value of a variable in bash, use declare -p, not echo. Thus:
declare -p options
will emit something like:
declare -- options="-e stuff"
This tells you much more than echo does:
Because it's declare -- rather than declare -x, you know that the variable is not exported.
Because it's not declare -a, you know it's not giving you an array (echo "$array" will print only the first element of a shell array and ignore the rest).
Because it's not declare -i, you know the value wasn't declared to be an integer... etc.
If you're only worried about the string case, but want to ensure that you get a printable value no matter which version of bash is in use (as some historical releases will not always guarantee printable escaping for values printed with declare -p), consider instead:
printf '%q=%q\n' options "$options"
...which will emit unambiguous output even if there are cursor control characters, newlines, or other non-textual contents in your string.
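For example:
$ options="-e stuff"
$ printf '%q=%q\n' options "$options"
options=-e\ stuff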
Follow the advice of the POSIX specification for echo, and use printf instead. To quote the APPLICATION USAGE section in full, emphasis added, noting that in bash, -e enables XSI-style interpretation of escape sequences:
It is not possible to use echo portably across all POSIX systems
unless both -n (as the first argument) and escape sequences are
omitted.
The printf utility can be used portably to emulate any of the
traditional behaviors of the echo utility as follows (assuming that
IFS has its standard value or is unset):
The historic System V echo and the requirements on XSI implementations
in this volume of POSIX.1-2008 are equivalent to:
printf "%b\n" "$*"
The BSD echo is equivalent to:
if [ "X$1" = "X-n" ]
then
shift
printf "%s" "$*"
else
printf "%s\n" "$*"
fi
New applications are encouraged to use printf instead of echo.
So, how does this apply to you? Since you want -e to be treated as data, not as part of echo's setup, the BSD case without -n applies:
options="-e stuff"
printf '%s\n' "$options"
On my GNU bash, version 4.3.42(1)-release, I am doing some tests to answer a question. The idea is to split a :-separated string and store each of its elements in an array.
For this, I try to set IFS to : in the scope of the command, so that the split is automatic and IFS remains untouched:
$ myvar="a:b:c"
$ IFS=: d=($myvar)
$ printf "%s\n" ${d[#]}
a
b
c
And apparently IFS remains the same:
$ echo $IFS
# empty
The BASH reference says that:
If IFS is unset, the parameters are separated by spaces. If IFS is
null, the parameters are joined without intervening separators.
However, then I notice that the IFS is kind of broken, so that echo $myvar returns a b c instead of a:b:c.
Unsetting the value solves it:
$ unset IFS
$ echo $myvar
a:b:c
But I wonder: what is causing this? Isn't IFS=: command supposed to change IFS just in the scope of the command being executed?
I see in Setting IFS for a single statement that this indeed works:
$ IFS=: eval 'd=($myvar)'
$ echo $myvar
a:b:c
But I don't understand why it does and IFS=: d=($myvar) does not.
I was going to comment on this when I saw you use it, but it didn't occur to me until just now what the problem was. The line
IFS=: d=($myvar)
doesn't temporarily set IFS; it simply sets two variables, IFS and d, in the current shell. (A simple command can be prefixed with temporary environment assignments, but when the line consists only of assignments there is no command name, so they are all ordinary assignments in the current shell.)
When you write
echo $IFS
IFS expands to :, but because : is the first character of IFS, it is removed during word splitting. Using
echo "$IFS"
would show that IFS is still set to :.
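For example, continuing the transcript above:
$ myvar="a:b:c"
$ IFS=: d=($myvar)
$ echo "$IFS"
: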
IFS behaves fine with read on the same line:
myvar="a:b:c"
IFS=: read -ra d <<< "$myvar"
printf "%s\n" "${d[#]}"
a
b
c
Check value of IFS:
declare -p IFS
-bash: declare: IFS: not found
so clearly IFS has not been tampered in current shell.
Now check original input:
echo $myvar
a:b:c
or:
echo "$myvar"
a:b:c
In bash, you can set a variable that is valid for a single statement only if that statement is not itself a variable assignment. For example:
$ foo=one bar=two
$ echo $foo
one
To make the second assignment part of a statement, you need ... some other statement. As you've noticed, eval works. In addition, read should work:
$ foo="one:two:three"
$ IFS=: read -r -a bar <<< "$foo"
$ declare -p bar
declare -a bar='([0]="one" [1]="two" [2]="three")'