I've encountered a strange problem after temporarily changing IFS for the purpose of array building:
$ echo "1 2 3" |while read myVar1 myVar2; do echo "myVar1: ${myVar1}"; echo "myVar2: ${myVar2}"; done
myVar1: 1
myVar2: 2 3
$ IFS=':' myPaths=( ${PATH} ) # this works: I have /home/morgwai/bin on ${myPaths[0]} , /usr/local/sbin on ${myPaths[1]} and so on
$ echo "1 2 3" |while read myVar1 myVar2; do echo "myVar1: ${myVar1}"; echo "myVar2: ${myVar2}"; done
myVar1: 1 2 3
myVar2:
$ echo $IFS
$ echo "1:2:3" |while read myVar1 myVar2; do echo "myVar1: ${myVar1}"; echo "myVar2: ${myVar2}"; done ;
myVar1: 1
myVar2: 2:3
Normally, when I change IFS temporarily for any command other than array building (for example IFS=',' echo whatever), its value is changed only during the execution of that command. Here, however, it seems as if IFS was permanently changed to a colon (although echo $IFS doesn't show this, which is even stranger...).
Is this a bug or somehow an expected behavior that I don't understand?
I'm using bash version 4.4.18 if it matters...
Note: I know that I can build the same array using IFS=':' read -a myPaths <<< ${PATH} and then IFS gets reverted to the default value normally, but that's not the point: I'm trying to understand what actually happens in the example I gave above.
Thanks!
You're just setting variables, not setting a variable and then executing a command (i.e., the way you build the array is a pure variable assignment, not a command, hence both assignments become permanent).
The issue with an IFS of : not showing up in echo $IFS is caused by shell parameter expansion and word splitting.
Consider:
$ IFS=:
$ echo $IFS
$ echo "$IFS"
:
When a parameter expansion is not quoted, it undergoes word splitting afterwards. From the manual:
The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting.
and
The shell treats each character of $IFS as a delimiter, and splits the results of the other expansions into words using these characters as field terminators. If IFS is unset, or its value is exactly <space><tab><newline>, the default, then sequences of <space>, <tab>, and <newline> at the beginning and end of the results of the previous expansions are ignored, and any sequence of IFS characters not at the beginning or end serves to delimit words.
So when IFS is a colon, splitting a word consisting of just a colon results in a (single) empty word. Always quote your variables to prevent unexpected gotchas like this.
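To check the "both assignments become permanent" part yourself, here is a minimal bash sketch. sample_path is a made-up stand-in for $PATH and the other variable names are only illustrative; it assumes IFS starts at its default value.

```bash
sample_path='/usr/local/bin:/usr/bin:/bin'   # hypothetical stand-in for $PATH
IFS=$' \t\n'                                 # start from the default value

# No command name follows, so this line is simply two assignments,
# and both of them persist in the current shell.
IFS=':' parts=( $sample_path )
ifs_after=$IFS                # capture the lingering value
count=${#parts[@]}

IFS=$' \t\n'                  # undo the change by hand

printf 'IFS after the assignment line: %q\n' "$ifs_after"
printf 'number of elements: %s\n' "$count"
```

Printing with a quoted expansion (or %q, as here) avoids the word-splitting effect that makes echo $IFS look empty.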
Bash seems to behave unpredictably in regards to temporary, per-command variable assignment, specifically with IFS.
I often assign IFS to a temporary value in conjunction with the read command. I would like to use the same mechanic to tailor output, but currently resort to a function or subshell to contain the variable assignment.
$ while IFS=, read -a A; do
> echo "${A[@]:1:2}" # control (undesirable)
> done <<< alpha,bravo,charlie
bravo charlie
$ while IFS=, read -a A; do
> IFS=, echo "${A[*]:1:2}" # desired solution (failure)
> done <<< alpha,bravo,charlie
bravo charlie
$ perlJoin(){ local IFS="$1"; shift; echo "$*"; }
$ while IFS=, read -a A; do
> perlJoin , "${A[@]:1:2}" # function with local variable (success)
> done <<< alpha,bravo,charlie
bravo,charlie
$ while IFS=, read -a A; do
> (IFS=,; echo "${A[*]:1:2}") # assignment within subshell (success)
> done <<< alpha,bravo,charlie
bravo,charlie
If the second assignment in the following block does not affect the environment of the command, and it does not generate an error, then what is it for?
$ foo=bar
$ foo=qux echo $foo
bar
This is a common bash gotcha -- and https://www.shellcheck.net/ catches it:
foo=qux echo $foo
^-- SC2097: This assignment is only seen by the forked process.
^-- SC2098: This expansion will not see the mentioned assignment.
The issue is that the first foo=bar is setting a bash variable, not an environment variable. Then, the inline foo=qux syntax is used to set an environment variable for echo -- however echo never actually looks at that variable. Instead $foo gets recognized as a bash variable and replaced with bar.
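A small sketch of that distinction, assuming an sh binary is on the PATH to act as the child process (the variable names are only illustrative):

```bash
foo=bar

# The shell expands "$foo" first (to bar) and only then places foo=qux
# into the environment of the command it is about to run.
expanded_by_shell=$(foo=qux echo "$foo")

# A child process that reads its own environment does see the prefix value.
seen_by_child=$(foo=qux sh -c 'printf "%s\n" "$foo"')

printf 'expanded by the calling shell: %s\n' "$expanded_by_shell"
printf 'seen by the child process:     %s\n' "$seen_by_child"
printf 'calling shell afterwards:      %s\n' "$foo"
```

This should print bar, qux and bar, in that order: the prefix assignment reaches the child's environment but never changes what the calling shell expands.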
So back to your main question, you were basically there with your final attempt using the subshell -- except that you don't actually need the subshell:
while IFS=, read -a A; do
IFS=,; echo "${A[*]:1:2}"
done <<< alpha,bravo,charlie
outputs:
bravo,charlie
For completeness, here's a final example that reads in multiple lines and uses a different output separator to demonstrate that the different IFS assignments aren't stomping on each other:
while IFS=, read -a A; do
IFS=:; echo "${A[*]:1:2}"
done < <(echo -e 'alpha,bravo,charlie\nfoo,bar,baz')
outputs:
bravo:charlie
bar:baz
The answer is a bit simpler than the other answers are presenting:
$ foo=bar
$ foo=qux echo $foo
bar
We see "bar" because the shell expands $foo before it applies foo=qux.
Simple Command Expansion -- there's a lot to get through here, so bear with me...
When a simple command is executed, the shell performs the following expansions, assignments, and redirections, from left to right.
The words that the parser has marked as variable assignments (those preceding the command name) and redirections are saved for later processing.
The words that are not variable assignments or redirections are expanded (see Shell Expansions). If any words remain after expansion, the first word is taken to be the name of the command and the remaining words are the arguments.
Redirections are performed as described above (see Redirections).
The text after the ‘=’ in each variable assignment undergoes tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal before being assigned to the variable.
If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment. If any of the assignments attempts to assign a value to a readonly variable, an error occurs, and the command exits with a non-zero status.
If no command name results, redirections are performed, but do not affect the current shell environment. A redirection error causes the command to exit with a non-zero status.
If there is a command name left after expansion, execution proceeds as described below. Otherwise, the command exits. If one of the expansions contained a command substitution, the exit status of the command is the exit status of the last command substitution performed. If there were no command substitutions, the command exits with a status of zero.
So:
the shell sees foo=qux and saves that for later
the shell sees $foo and expands it to "bar"
then we now have: foo=qux echo bar
Once you really understand the order that bash does things, a lot of the mystery goes away.
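The "if no command name results" rule quoted above can be checked with a sketch like the following. true is used only as a convenient do-nothing command, and the variable names are illustrative:

```bash
foo=bar

foo=qux true          # a command name (true) follows: qux is scoped to it
with_command=$foo

foo=qux               # no command name: the current shell is changed
without_command=$foo

printf 'after prefixing a command: %s\n' "$with_command"
printf 'after a bare assignment:   %s\n' "$without_command"
```

The first line should report bar and the second qux, which is the same rule that makes IFS=':' myPaths=( ... ) stick while IFS=, read ... does not.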
Short answer: the effects of changing IFS are complex and hard to understand, and best avoided except for a few well-defined idioms (IFS=, read ... is one of the idioms I consider ok).
Long answer: There are a couple of things you need to keep in mind in order to understand the results you're seeing from changes to IFS:
Using IFS=something as a prefix to a command changes IFS only for that one command's execution. In particular, it does not affect how the shell parses the arguments to be passed to that command; that's controlled by the shell's value of IFS, not the one used for the command's execution.
Some commands pay attention to the value of IFS they're executed with (e.g. read), but others don't (e.g. echo).
Given the above, IFS=, read -a A does what you'd expect, it splits its input on ",":
$ IFS=, read -a A <<<"alpha,bravo,charlie"
$ declare -p A
declare -a A='([0]="alpha" [1]="bravo" [2]="charlie")'
But echo pays no attention; it always puts spaces between the arguments it's passed, so using IFS=something as a prefix to it has no effect at all:
$ echo alpha bravo
alpha bravo
$ IFS=, echo alpha bravo
alpha bravo
So when you use IFS=, echo "${A[*]:1:2}", it's equivalent to just echo "${A[*]:1:2}", and since the shell's definition of IFS starts with a space, it joins the selected elements of A with spaces between them. So it's equivalent to running IFS=, echo "bravo charlie".
On the other hand, IFS=,; echo "${A[*]:1:2}" changes the shell's definition of IFS, so it does affect how the shell puts the elements together, and it comes out equivalent to IFS=, echo "bravo,charlie". Unfortunately, it also affects everything else from that point on, so you either have to isolate it in a subshell or set it back to normal afterward.
Just for completeness, here are a couple of other versions that don't work:
$ IFS=,; echo "${A[@]:1:2}"
bravo charlie
In this case, the [@] tells the shell to treat each element of the array as a separate argument, so it's left to echo to merge them, and it ignores IFS and always uses spaces.
So how about this:
$ IFS=,; echo ${A[*]:1:2}
bravo charlie
In this case, the [*] tells the shell to mash all elements together with the first character of IFS between them, giving bravo,charlie. But it's not in double-quotes, so the shell immediately re-splits it on ",", splitting it back into separate arguments again (and then echo joins them with spaces as always).
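The three variants can be compared side by side with a sketch like this one (bash). Using printf '[%s]' instead of echo is my own choice here, so that the argument boundaries become visible; the array contents are taken from the question.

```bash
A=(alpha bravo charlie)

# Each variant sets IFS inside a command substitution, which is a subshell,
# so the caller's IFS is left alone.
star_quoted=$(IFS=,; printf '[%s]' "${A[*]:1:2}")   # one argument, joined with ,
at_quoted=$(IFS=,; printf '[%s]' "${A[@]:1:2}")     # two separate arguments
star_unquoted=$(IFS=,; printf '[%s]' ${A[*]:1:2})   # two arguments again

printf '%s\n' "$star_quoted" "$at_quoted" "$star_unquoted"
```

Only the quoted [*] form arrives as a single comma-joined argument; the other two reach the command as two arguments.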
If you want to change the shell's definition of IFS without having to isolate it to a subshell, there are a few options to change it and set it back afterward. In bash, you can set it back to normal like this:
$ IFS=,
$ while read -a A; do # Note: IFS change not needed here; it's already changed
> echo "${A[*]:1:2}"
> done <<<alpha,bravo,charlie
bravo,charlie
$ IFS=$' \t\n'
But the $'...' syntax isn't available in all shells; if you need portability it's best to use literal characters:
IFS='
' # You can't see it, but there's a literal space and tab after the first '
Some people prefer to use unset IFS, which just forces the shell to its default behavior, which is pretty much the same as with IFS defined in the normal way.
...but if IFS had been changed in some larger context, and you don't want to mess that up, you need to save it and then set it back. If it's been changed normally, this'll work:
saveIFS=$IFS
...
IFS=$saveIFS
...but if someone thought it was a good idea to use unset IFS, this will define it as blank, giving weird results. So you can use this approach or the unset approach, but not both. If you want to make this robust against the unset conflict, you can use something like this in bash:
saveIFS=${IFS:-$' \t\n'}
...or for portability, leave off the $' ' and use literal space+tab+newline:
saveIFS=${IFS:-
} # Again, there's an invisible space and tab at the end of the first line
All in all, it's a lot of mess full of traps for the unwary. I recommend avoiding it whenever possible.
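If you do have to change it, one way to make save-and-restore cope with both the "set" and the "unset" case is to record which of the two you started from. This is only a sketch: join_with is a hypothetical helper, it leaves its result in a global variable named joined, and it uses a few other globals (sep, saved_ifs, ifs_was_set) because plain sh has no local.

```bash
# join_with is a hypothetical helper: join_with SEP WORD...
# It leaves the result in the global variable "joined".
join_with() {
    sep=$1
    shift
    # ${IFS+set} expands to "set" only when IFS exists, even if it is empty.
    if [ "${IFS+set}" = set ]; then
        ifs_was_set=1
        saved_ifs=$IFS
    else
        ifs_was_set=0
    fi
    IFS=$sep
    joined="$*"            # "$*" joins with the first character of IFS
    if [ "$ifs_was_set" = 1 ]; then
        IFS=$saved_ifs     # put the previous value back
    else
        unset IFS          # IFS was unset before, so unset it again
    fi
}

unset IFS
join_with , alpha bravo charlie
printf '%s\n' "$joined"
```

Unlike the ${IFS:-...} form, this also preserves an IFS that was deliberately set to the empty string.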
I need to read from a variable line by line, do some operations with every line and then work with the data afterwards. I have to work in sh.
I already tried this, but $VAR is empty since I assume, that I saved into it in a subshell
#!/bin/sh POSIXLY_CORRECT=yes
STRING="a\nb\nc\n"
echo $STRING | while read line; do
*some operations with line*
VAR=$(echo "$VAR$line")
done
echo $VAR
I also tried redirecting a variable...
done <<< $STRING
done < $(echo $STRING)
done < < $(echo $STRING)
...and so on, but only got No such file or Redirection unexpected
Please help.
As you've guessed, variable assignments in a subshell aren't seen by the parent, and the commands in a pipeline are run in subshells.
Having to work in plain sh is a real buzzkill. All right, all right, here are a few ideas. One is to extend the life of the subshell and do your work after the loop ends:
string="a
b
c"
echo "$string" | {
var=
while IFS= read -r line; do
*some operations with line*
var=$var$line
done
echo "$var"
}
Another is to use a heredoc (<<), since sh doesn't have herestrings (<<<).
var=
while IFS= read -r line; do
*some operations with line*
var=$var$line
done <<STR
$string
STR
Other improvements:
Escapes aren't interpreted in string literals, so if you want literal newlines put literal newlines, not \n.
Make sure to empty out $var with var=. You don't want an environment variable leaking into your script.
Quote variable expansions. Not needed in assignments, though.
$(echo foo) is an anti-pattern: you can just write foo.
It's best to keep variables lowercase. All uppercase is reserved for shell variables.
Use IFS= and -r to keep read from stripping leading whitespace or interpreting backslashes.
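To see the difference between the two approaches in one run, here is a sketch in plain sh. The names piped and kept are made up for the comparison, and simple concatenation stands in for "some operations with line".

```sh
string='a
b
c'

# Pipeline: in dash and in bash with default settings, the loop runs in a
# subshell, so this assignment is lost when the loop ends.
piped=
printf '%s\n' "$string" | while IFS= read -r line; do
    piped=$piped$line
done

# Here-document: the loop runs in the current shell, so the value survives.
kept=
while IFS= read -r line; do
    kept=$kept$line
done <<EOF
$string
EOF

printf 'after the pipeline: [%s]\n' "$piped"
printf 'after the heredoc:  [%s]\n' "$kept"
```

A few shells (ksh, for example) run the last pipeline stage in the current shell, so the first result can differ there; the heredoc version does not depend on that.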
Consider this line of code where deps contains a list of dependencies:
IFS=',' printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
I set IFS to be , for a single invocation of printf but strangely printf doesn't seem to respect IFS as it doesn't expand deps to a comma-separated list.
On the other hand, if I set IFS like so:
IFS=','
printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
printf correctly expands deps to a comma-separated list.
Do I miss something here?
In this command line:
IFS=',' printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
printf does not expand "${deps[*]}". The shell does the expansion. In fact, that's pretty well always true. Although printf happens to be a shell builtin, it doesn't do anything special to its arguments, and you would get exactly the same behaviour with an external printf.
The syntax
envvar=value program arg1 arg2 arg3
causes the shell to add envvar=value to the list of environment variables provided to program, and the strings arg1, arg2, and arg3 to be made into an argv list for program. Before all that happens, the shell does its normal expansions of various types, which will cause shell variables referenced in the value and the three arguments to be replaced with their values. But the environment variable setting envvar=value is not part of the shell's execution environment.
Equally,
FOO=World echo "Hello, $FOO"
will not use FOO=World when expanding $FOO in the argument to echo. "Hello, $FOO" is expanded by the shell in the shell's execution environment, and then passed to echo as an argument, and FOO=World is passed to echo as part of its environment.
Putting the variable setting in a separate command is completely different.
IFS=','; printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
first sets the value of IFS in the shell's environment, before the shell parses the printf command. When the shell then does its expansions in the arguments it will eventually pass to printf, it uses the value of IFS in order to expand the array deps[*]. In this case, IFS is not included in the environment variables passed to printf, unless IFS has previously been exported.
The use of IFS with the read builtin may seem confusing, but it is entirely consistent with the above. In the command
IFS=, read A B C
IFS=, is passed as part of the list of environment variables to read. read then consumes a line of input, and consults the value of IFS in its environment in order to figure out how to split the input line into words.
In order to change IFS for the purposes of parameter expansion in an argument, the change must be made in the shell's environment, which is a global change. Since you rarely want to globally change the value of IFS, a common idiom is to change it within a subshell created with ():
( IFS=,; printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"; )
probably does what you want.
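A quick way to compare the prefix form with the subshell form is to capture what each one hands to printf. This is a sketch for bash: the deps contents are invented, and IFS is assumed to start at its default value.

```bash
deps=(gcc make git)       # hypothetical package list
IFS=$' \t\n'              # assume the default IFS to begin with

# Prefix form: "${deps[*]}" is expanded before IFS=, takes effect.
prefixed=$(IFS=',' printf '%s' "${deps[*]}")

# Subshell form: IFS is changed first, then the expansion uses it.
in_subshell=$( (IFS=','; printf '%s' "${deps[*]}") )

printf 'prefix:   %s\n' "$prefixed"
printf 'subshell: %s\n' "$in_subshell"
printf 'IFS afterwards: %q\n' "$IFS"
```

The prefix form should give the space-separated list and the subshell form the comma-separated one, with the caller's IFS untouched in both cases.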
You can. Just save it first, use it and reset it:
oldifs=$IFS
IFS=','
printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
IFS=$oldifs
Now for your command, you can do it on one line by putting a ; between the commands:
IFS=','; printf "setup-x86.exe -q -p='%s'\n" "${deps[*]}"
You still need to save and restore IFS though.
One place where you can set IFS as a one-shot deal is as a prefix to read, for example in the condition of a while loop:
while IFS=$'\n' read line; do
....
done
In that case it applies only to the read command itself, not to the rest of the loop.
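If you want to check how far the prefix reaches, a sketch like this records the IFS value seen inside the loop body. The variable names are only illustrative, and a here-document is used as input so the loop runs in the current shell.

```bash
IFS=$' \t\n'                    # start from the default value
default_ifs=$IFS

while IFS=, read -r first rest; do
    first_field=$first          # fields were split on the comma
    remaining=$rest
    ifs_in_body=$IFS            # value the loop body sees
done <<EOF
a,b,c
EOF

printf 'first=%s rest=%s\n' "$first_field" "$remaining"
printf 'IFS in the body: %q\n' "$ifs_in_body"
```

read splits on the comma, while the body still sees the default IFS, so the prefix is scoped to read alone.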
What does the triple-less-than-sign bash operator, <<<, mean, as inside the following code block?
LINE="7.6.5.4"
IFS=. read -a ARRAY <<< "$LINE"
echo "$IFS"
echo "${ARRAY[@]}"
Also, why does $IFS remain a space, not a period?
It redirects the string to stdin of the command.
Variables assigned directly before the command in this way only take effect for the command process; the shell remains untouched.
From man bash
Here Strings
A variant of here documents, the format is:
<<<word
The word is expanded and supplied to the command on its standard input.
The . on the IFS line is simply the value being assigned to IFS; it is not the . (source) builtin.
Update: More from man bash (Thanks gsklee, sehe)
IFS The Internal Field Separator that is used for word splitting
after expansion and to split lines into words with the read
builtin command. The default value is "<space><tab><newline>".
yet more from man bash
The environment for any simple command or function may be augmented
temporarily by prefixing it with parameter assignments, as described
above in PARAMETERS. These assignment statements affect only the environment seen by that command.
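Putting the two manual excerpts together, a minimal bash sketch of the code from the question might look like this. The name fields is mine, and IFS is assumed to start at its default value.

```bash
line='7.6.5.4'
IFS=$' \t\n'                          # start from the default value

# <<< feeds the expanded word (plus a newline) to read on standard input;
# the IFS=. prefix is visible to read only.
IFS=. read -r -a fields <<< "$line"

printf 'fields: %s\n' "${#fields[@]}"
printf 'joined with the default IFS: %s\n' "${fields[*]}"
printf 'IFS afterwards: %q\n' "$IFS"
```

read sees IFS=. and splits the line into four fields, while the shell's own IFS is still the default afterwards, which is why "${fields[*]}" joins them with spaces.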
The reason that IFS is not being set is that bash isn't seeing the assignment as a separate command... you need to put a line feed or a semicolon after the assignment in order to terminate it:
$ cat /tmp/ifs.sh
LINE="7.6.5.4"
IFS='.' read -a ARRAY <<< "$LINE"
echo "$IFS"
echo "${ARRAY[@]}"
$ bash /tmp/ifs.sh
7 6 5 4
but
$ cat /tmp/ifs.sh
LINE="7.6.5.4"
IFS='.'; read -a ARRAY <<< "$LINE"
echo "$IFS"
echo "${ARRAY[@]}"
$ bash /tmp/ifs.sh
.
7 6 5 4
I'm not sure why doing it the first way wasn't a syntax error though.