Bash seems to behave unpredictably in regards to temporary, per-command variable assignment, specifically with IFS.
I often assign IFS to a temporary value in conjunction with the read command. I would like to use the same mechanic to tailor output, but currently resort to a function or subshell to contain the variable assignment.
$ while IFS=, read -a A; do
> echo "${A[#]:1:2}" # control (undesirable)
> done <<< alpha,bravo,charlie
bravo charlie
$ while IFS=, read -a A; do
> IFS=, echo "${A[*]:1:2}" # desired solution (failure)
> done <<< alpha,bravo,charlie
bravo charlie
$ perlJoin(){ local IFS="$1"; shift; echo "$*"; }
$ while IFS=, read -a A; do
> perlJoin , "${A[#]:1:2}" # function with local variable (success)
> done <<< alpha,bravo,charlie
bravo,charlie
$ while IFS=, read -a A; do
> (IFS=,; echo "${A[*]:1:2}") # assignment within subshell (success)
> done <<< alpha,bravo,charlie
bravo,charlie
If the second assignment in the following block does not affect the environment of the command, and it does not generate an error, then what is it for?
$ foo=bar
$ foo=qux echo $foo
bar
$ foo=bar
$ foo=qux echo $foo
bar
This is a common bash gotcha -- and https://www.shellcheck.net/ catches it:
foo=qux echo $foo
^-- SC2097: This assignment is only seen by the forked process.
^-- SC2098: This expansion will not see the mentioned assignment.
The issue is that the first foo=bar is setting a bash variable, not an environment variable. Then, the inline foo=qux syntax is used to set an environment variable for echo -- however echo never actually looks at that variable. Instead $foo gets recognized as a bash variable and replaced with bar.
So back to your main question, you were basically there with your final attempt using the subshell -- except that you don't actually need the subshell:
while IFS=, read -a A; do
IFS=,; echo "${A[*]:1:2}"
done <<< alpha,bravo,charlie
outputs:
bravo,charlie
For completeness, here's a final example that reads in multiple lines and uses a different output separator to demonstrate that the different IFS assignments aren't stomping on each other:
while IFS=, read -a A; do
IFS=:; echo "${A[*]:1:2}"
done < <(echo -e 'alpha,bravo,charlie\nfoo,bar,baz')
outputs:
bravo:charlie
bar:baz
The answer is a bit simpler than the other answers are presenting:
$ foo=bar
$ foo=qux echo $foo
bar
We see "bar" because the shell expands $foo before setting foo=qux
Simple Command Expansion -- there's a lot to get through here, so bear with me...
When a simple command is executed, the shell performs the following expansions, assignments, and redirections, from left to right.
The words that the parser has marked as variable assignments (those preceding the command name) and redirections are saved for later processing.
The words that are not variable assignments or redirections are expanded (see Shell Expansions). If any words remain after expansion, the first word is taken to be the name of the command and the remaining words are the arguments.
Redirections are performed as described above (see Redirections).
The text after the ‘=’ in each variable assignment undergoes tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal before being assigned to the variable.
If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment. If any of the assignments attempts to assign a value to a readonly variable, an error occurs, and the command exits with a non-zero status.
If no command name results, redirections are performed, but do not affect the current shell environment. A redirection error causes the command to exit with a non-zero status.
If there is a command name left after expansion, execution proceeds as described below. Otherwise, the command exits. If one of the expansions contained a command substitution, the exit status of the command is the exit status of the last command substitution performed. If there were no command substitutions, the command exits with a status of zero.
So:
the shell sees foo=qux and saves that for later
the shell sees $foo and expands it to "bar"
then we now have: foo=qux echo bar
Once you really understand the order that bash does things, a lot of the mystery goes away.
Short answer: the effects of changing IFS are complex and hard to understand, and best avoided except for a few well-defined idioms (IFS=, read ... is one of the idioms I consider ok).
Long answer: There are a couple of things you need to keep in mind in order to understand the results you're seeing from changes to IFS:
Using IFS=something as a prefix to a command changes IFS only for that one command's execution. In particular, it does not affect how the shell parses the arguments to be passed to that command; that's controlled by the shell's value of IFS, not the one used for the command's execution.
Some commands pay attention to the value of IFS they're executed with (e.g. read), but others don't (e.g. echo).
Given the above, IFS=, read -a A does what you'd expect, it splits its input on ",":
$ IFS=, read -a A <<<"alpha,bravo,charlie"
$ declare -p A
declare -a A='([0]="alpha" [1]="bravo" [2]="charlie")'
But echo pays no attention; it always puts spaces between the arguments it's passed, so using IFS=something as a prefix to it has no effect at all:
$ echo alpha bravo
alpha bravo
$ IFS=, echo alpha bravo
alpha bravo
So when you use IFS=, echo "${A[*]:1:2}", it's equivalent to just echo "${A[*]:1:2}", and since the shell's definition of IFS starts with space, it puts the elements of A together with spaces between them. So it's equivalent to running IFS=, echo "alpha bravo".
On the other hand, IFS=,; echo "${A[*]:1:2}" changes the shell's definition of IFS, so it does affect how the shell puts the elements together, so it comes out equivalent to IFS=, echo "alpha,bravo". Unfortunately, it also affects everything else from that point on so you either have to isolate it to a subshell or set it back to normal afterward.
Just for completeness, here are a couple of other versions that don't work:
$ IFS=,; echo "${A[#]:1:2}"
bravo charlie
In this case, the [#] tells the shell to treat each element of the array as a separate argument, so it's left to echo to merge them, and it ignores IFS and always uses spaces.
$ IFS=,; echo "${A[#]:1:2}"
bravo charlie
So how about this:
$ IFS=,; echo ${A[*]:1:2}
bravo charlie
In this case, the [*] tells the shell to mash all elements together with the first character of IFS between them, giving bravo,charlie. But it's not in double-quotes, so the shell immediately re-splits it on ",", splitting it back into separate arguments again (and then echo joins them with spaces as always).
If you want to change the shell's definition of IFS without having to isolate it to a subshell, there are a few options to change it and set it back afterward. In bash, you can set it back to normal like this:
$ IFS=,
$ while read -a A; do # Note: IFS change not needed here; it's already changed
> echo "${A[*]:1:2}"
> done <<<alpha,bravo,charlie
bravo,charlie
$ IFS=$' \t\n'
But the $'...' syntax isn't available in all shells; if you need portability it's best to use literal characters:
IFS='
' # You can't see it, but there's a literal space and tab after the first '
Some people prefer to use unset IFS, which just forces the shell to its default behavior, which is pretty much the same as with IFS defined in the normal way.
...but if IFS had been changed in some larger context, and you don't want to mess that up, you need to save it and then set it back. If it's been changed normally, this'll work:
saveIFS=$IFS
...
IFS=$saveIFS
...but if someone thought it was a good idea to use unset IFS, this will define it as blank, giving weird results. So you can use this approach or the unset approach, but not both. If you want to make this robust against the unset conflict, you can use something like this in bash:
saveIFS=${IFS:-$' \t\n'}
...or for portability, leave off the $' ' and use literal space+tab+newline:
saveIFS=${IFS:-
} # Again, there's an invisible space and tab at the end of the first line
All in all, it's a lot of mess full of traps for the unwary. I recommend avoiding it whenever possible.
Given the following code in bash:
filename=${1}
content1=${#:2}
content2="${#:2}"
content3=${*:2}
content4="${*:2}"
echo "${content1}" > "${filename}"
echo "${content2}" >> "${filename}"
echo "${content3}" >> "${filename}"
echo "${content4}" >> "${filename}"
What is the difference between the "contents" ? When I will see the difference? What is the better way to save the content that I get and why?
In an assignment, the right-hand side doesn't have to be quoted to prevent word splitting and globbing, unless it contains a blank.
These two are the same:
myvar=$var
myvar="$var"
but these are not:
myvar='has space'
myvar=has space
The last one tries to run a command space with the environment variable myvar set to value has.
This means that content1 is the same as content2, and content3 is the same as content4, as they only differ in RHS quoting.
The differences thus boil down to the difference between $# and $*; the fact that subarrays are used doesn't matter. Quoting from the manual:
(for $*):
When the expansion occurs within double quotes, it expands to a single word with the value of each parameter separated by the first character of the IFS special variable.
(for $#):
When the expansion occurs within double quotes, each parameter expands to a separate word.
Since you're using quotes (as you almost always should when using $# or $*), the difference is that "${*:2}" is a single string, separated by the first character of IFS, and "${#:2}" expands to separate words, blank separated.
Example:
$ set -- par1 par2 par3 # Set $1, $2, and $3
$ printf '<%s>\n' "${#:2}" # Separate words
<par2>
<par3>
$ printf '<%s>\n' "${*:2}" # Single word, blank separated (default IFS)
<par2 par3>
$ IFS=, # Change IFS
$ printf '<%s>\n' "${*:2}" # Single word, comma separated (new IFS)
<par2,par3>
As for when to use $# vs. $*: as a rule of thumb, you almost always want "$#" and almost never the unquoted version of either, as the latter is subject to word splitting and globbing.
"$*" is useful if you want to join array elements into a single string as in the example, but the most common case, I guess, is iterating over positional parameters in a script or function, and that's what "$#" is for. See also this Bash Pitfall.
The only difference is that using $* will cause the arguments to be concatenated with the first character of IFS, while $# will cause the arguments to be concatenated with a single space (regardless of the value of IFS):
$ set a b c
$ IFS=-
$ c1="$*"
$ c2="$#"
$ echo "$c1"
a-b-c
$ echo "$c2"
a b c
The quotes aren't particularly important here, as expansions on the right-hand side of an assignment aren't subject to word-splitting or pathname expansion; content1 and content2 should always be identical, as should be content3 and content4.
Which is better depends on what you want.
There is a quite nasty expression that want to echo using bash.
The expression is:
'one two --
Note: There is white space after --.
So I have:
IFS=
echo 'one$IFStwoIFS--$IFS
But the result is:
one$IFStwo$IFS--$IFS
You have few issues with your approach:
Within single quote variables are not expanded in shell
In the string one$IFStwo$IFS--$IFS first instance of $IFS will not be expanded since you have string two next to $IFS so it attempts to expand non-existent variable $IFStwo.
Default value of $IFS is $' \t\n'
You can use:
echo "one${IFS}two$IFS--$IFS"
which will expand to (cat -A output):
one ^I$
two ^I$
-- ^I$
Experiment 1
Here is my first script in file named foo.sh.
IFS=:
for i in foo:bar:baz
do
echo $i
done
This produces the following output.
$ bash foo.sh
foo bar baz
Experiment 2
This is my second script.
IFS=:
for i in foo:bar:baz
do
unset IFS
echo $i
done
This produces the following output.
$ bash foo.sh
foo:bar:baz
Experiment 3
This is my third script.
IFS=:
var=foo:bar:baz
for i in $var
do
echo $i
done
This produces the following output.
$ bash foo.sh
foo
bar
baz
Why is the output different in all three cases? Can you explain the
rules behind the interpretation of IFS and the commands that leads to
this different outputs?
I found this a very interesting experiment.
Thank you for that.
To understand what is going on,
the relevant section from man bash is this:
Word Splitting
The shell scans the results of parameter expansion, command substitu-
tion, and arithmetic expansion that did not occur within double quotes
for word splitting.
The key is the "results of ..." part, and it's very subtle.
That is, word splitting happens on the result of certain operations,
as listed there: the result of parameter expansion,
the result of command substitution, and so on.
Word splitting is not performed on string literals such as foo:bar:baz.
Let's see how this logic plays out in the context of your examples.
Experiment 1
IFS=:
for i in foo:bar:baz
do
echo $i
done
This produces the following output:
foo bar baz
No word splitting is performed on the literal foo:bar:baz,
so it doesn't matter what is the value of IFS,
as far as the for loop is concerned.
Word splitting is performed after parameter expansion on the value of $i,
so foo:bar:baz is split to 3 words,
and passed to echo, so the output is foo bar baz.
Experiment 2
IFS=:
for i in foo:bar:baz
do
unset IFS
echo $i
done
This produces the following output:
foo:bar:baz
Once again,
no word splitting is performed on the literal foo:bar:baz,
so it doesn't matter what is the value of IFS,
as far as the for loop is concerned.
Word splitting is performed after parameter expansion on the value of $i,
but since IFS was unset,
its default value is used to perform the split,
which is <space><tab><newline>.
But since foo:bar:baz doesn't contain any of those,
it remains intact, so the output is foo:bar:baz.
Experiment 3
IFS=:
var=foo:bar:baz
for i in $var
do
echo $i
done
This produces the following output:
foo
bar
baz
After the parameter expansion of $var,
word splitting is performed using the value of IFS,
and so for has 3 values to iterate over, foo, bar, and baz.
The behavior of echo is trivial here,
the output is one word per line.
The bottomline is: word splitting is not performed on literal values.
Word splitting is only performed on the result of certain operations.
This is not all that surprising.
A string literal is much like an expression written enclosed in double-quotes, and you wouldn't expect word splitting on "...".
What does the triple-less-than-sign bash operator, <<<, mean, as inside the following code block?
LINE="7.6.5.4"
IFS=. read -a ARRAY <<< "$LINE"
echo "$IFS"
echo "${ARRAY[#]}"
Also, why does $IFS remain to be a space, not a period?
It redirects the string to stdin of the command.
Variables assigned directly before the command in this way only take effect for the command process; the shell remains untouched.
From man bash
Here Strings
A variant of here documents, the format is:
<<<word
The word is expanded and supplied to the command on its standard input.
The . on the IFS line is equivalent to source in bash.
Update: More from man bash (Thanks gsklee, sehe)
IFS The Internal Field Separator that is used for word splitting
after expansion and to split lines into words with the read
builtin command. The default value is "<space><tab><new‐line>".
yet more from man bash
The environment for any simple command or function may be augmented
temporarily by prefixing it with parameter assignments, as described
above in PARAMETERS. These assignment statements affect only the environment seen by that command.
The reason that IFS is not being set is that bash isn't seeing that as a separate command... you need to put a line feed or a semicolon after the command in order to terminate it:
$ cat /tmp/ifs.sh
LINE="7.6.5.4"
IFS='.' read -a ARRAY <<< "$LINE"
echo "$IFS"
echo "${ARRAY[#]}"
$ bash /tmp/ifs.sh
7 6 5 4
but
$ cat /tmp/ifs.sh
LINE="7.6.5.4"
IFS='.'; read -a ARRAY <<< "$LINE"
echo "$IFS"
echo "${ARRAY[#]}"
$ bash /tmp/ifs.sh
.
7 6 5 4
I'm not sure why doing it the first way wasn't a syntax error though.