Zsh - split string by spaces when using dot operator - shell

Here is my script:
#!/bin/bash
list="a b c"
for i in $list; do
echo $i
done
This works:
➜ ~ ./lol.sh
a
b
c
This doesn't:
➜ ~ . ./lol.sh
a b c
Why does splitting not work with the dot command, and how can I fix it?

Lists should never be represented as strings. Use array syntax.
list=( a b c )
for i in "${list[@]}"; do
echo "$i"
done
There are several reasons this is preferable.
In ZSH:
ZSH departs from POSIX: it does not perform string-splitting on unquoted parameter expansions unless explicitly asked to. You can make this request either by running setopt sh_word_split, or by using the parameter expansions ${=list} or ${(ps: :)list}
In other Bourne-derived shells:
String-splitting is dependent on the value of IFS, which cannot be guaranteed to be at defaults, especially when sourced from a separate script (which may have changed it locally).
Unquoted expansion also performs globbing, which can have different results depending on which files are in the current working directory (for instance, if your list contains hello[world], this will behave in an unexpected manner if your current directory contains files named hellow, helloo, or otherwise matching the glob).
Avoiding the globbing step is not only more correct, but also more efficient.
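To make the IFS point concrete, here is a minimal bash sketch (not zsh). The names count_args, items and flat are made up for illustration; the comma IFS simply simulates a caller that changed it before sourcing your script.

```shell
#!/usr/bin/env bash
# Sketch only: count_args, items and flat are hypothetical names.
count_args() { echo "$#"; }

items=( "a" "b c" "d" )
n_array=$(count_args "${items[@]}")   # 3: each element stays one word

flat="a b c"
IFS=,                                 # simulate a caller that changed IFS
n_flat=$(count_args $flat)            # 1: no comma in the string, so no split
unset IFS                             # back to default splitting behaviour
```

The array expansion gives the same count whatever IFS is set to, which is the main argument for it.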

Whilst I note the comment regarding lists by Charles Duffy, this was my solution/test.
#!/bin/zsh
function three()
{
first=$1
second=$2
third=$3
echo "1: $first 2: $second 3:$third"
}
setopt sh_word_split
set "1 A 2" "2 B 3" "3 C 4" "4 D 5"
for i;do
three $i;
done
This will output
1: 1 2: A 3:2
1: 2 2: B 3:3
1: 3 2: C 3:4
1: 4 2: D 3:5

Related

why there is different output in for-loop

Linux bash: why do the following two shell scripts produce different results?
[root@yumserver ~]# data="a,b,c";IFS=",";for i in $data;do echo $i;done
a
b
c
[root@yumserver ~]# IFS=",";for i in a,b,c;do echo $i;done
a b c
Expected output: the second script should also print:
a
b
c
I think I now understand what @M.NejatAydin means. Thanks also to @EdMorton and @HaimCohen!
[root@k8smaster01 ~]# set -x;data="a,b,c";IFS=",";echo $data;echo "$data";for i in $data;do echo $i;done
+ data=a,b,c
+ IFS=,
+ echo a b c
a b c
+ echo a,b,c
a,b,c
+ for i in '$data'
+ echo a
a
+ for i in '$data'
+ echo b
b
+ for i in '$data'
+ echo c
c
[root@k8smaster01 ~]# IFS=",";for i in a,b,c;do echo $i;done
+ IFS=,
+ for i in a,b,c
+ echo a b c
a b c
Word splitting is performed on the results of unquoted expansions (specifically, parameter expansions, command substitutions, and arithmetic expansions, with a few exceptions which are not relevant here). The literal string a,b,c in the second for loop is not an expansion at all. Thus, word splitting is not performed on that literal string. But note that, in the second example, word splitting is still performed on $i (an unquoted expansion) in the command echo $i.
It seems the point of confusion is where and when the IFS is used. It is used in the word splitting phase following an (unquoted) expansion. It is not used when the shell reads its input and breaks the input into words, which is an earlier phase.
Note: IFS is also used in other contexts (eg, by the read builtin command) which are not relevant to this question.
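A small bash sketch of the distinction, assuming a helper count_args (a made-up name) that just reports how many arguments it received:

```shell
#!/usr/bin/env bash
# Sketch only: count_args and old_ifs are hypothetical names.
count_args() { echo "$#"; }

data="a,b,c"
old_ifs=$IFS
IFS=,
n_expanded=$(count_args $data)    # 3: result of an unquoted expansion is split on IFS
n_literal=$(count_args a,b,c)     # 1: a literal word is never split on IFS
IFS=$old_ifs
```

The literal a,b,c was already tokenized as one word when the shell read the command line, and that step does not consult IFS.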
@HaimCohen explained in detail why you get a different result with those two approaches, which is what you asked. His answer is correct and should be upvoted and accepted.
Just a trivial addition from my side: you can easily make the second of your approaches work if you define the variable on the fly:
IFS=",";for i in ${var="a,b,c"};do echo $i;done

Parse filename string and extract parent at specific level using shell

I have a filename as a string, say filename="a/b/c/d.png".
Is there a general method to extract the parent directory at a given level using ONLY shell parameter expansion?
I.e. I would like to extract "level 1" and return c or "level 2" and return b.
Explicitly, I DO NOT want to get the entire parent path (i.e. a/b/c/, which is the result of ${filename%/*}).
Using just shell parameter expansion, assuming bash, you can first transform the path into an array (splitting on /) and then ask for specific array indexes:
filename=a/b/c/d.png
IFS=/
filename_array=( $filename )
unset IFS
echo "0 = ${filename_array[0]}"
echo "1 = ${filename_array[1]}"
echo "2 = ${filename_array[2]}"
echo "3 = ${filename_array[3]}"
Running the above produces:
0 = a
1 = b
2 = c
3 = d.png
These indexes are the reverse of what you want, but a little arithmetic should fix that.
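One way that arithmetic might look in bash, as a sketch: parent_at_level is a made-up function name, and it uses read -a with a per-command IFS (a substitution on my part, so the global IFS is never touched) rather than the IFS=/ and unset IFS pair above.

```shell
#!/usr/bin/env bash
# Sketch only: parent_at_level is a hypothetical helper.
parent_at_level() {
  local path=$1 level=$2 parts idx
  IFS=/ read -r -a parts <<< "$path"      # split on / for this one command only
  idx=$(( ${#parts[@]} - 1 - level ))     # count back from the last component
  (( idx >= 0 )) || return 1              # level is deeper than the path
  printf '%s\n' "${parts[idx]}"
}

parent_at_level a/b/c/d.png 1   # prints c
parent_at_level a/b/c/d.png 2   # prints b
```

Level 0 returns the file name itself, and the function returns non-zero when the path has fewer components than requested.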
Using zsh, the :h modifier trims the final component off a path in variable expansion.
The (s:...:) parameter expansion flag can be used to split the contents of a variable. Combine those with normal array indexing where a negative index goes from the end of the array, and...
$ filename=a/b/c/d.png
$ print $filename:h
a/b/c
$ level=1
$ print ${${(s:/:)filename:h}[-level]}
c
$ level=2
$ print ${${(s:/:)filename:h}[-level]}
b
You could also use array subscript flags instead to avoid the nested expansion:
$ level=1
$ print ${filename[(ws:/:)-level-1]}
c
$ level=2
$ print ${filename[(ws:/:)-level-1]}
b
w makes the index of a scalar split on words instead of by character, and s:...: has the same meaning as before: it says what to split on. You have to subtract one more from the index to skip over the trailing d.png, since it has not already been stripped off as in the first approach.
The :h (head) and :t (tail) expansion modifiers in zsh accept digits to specify a level; they can be combined to get a subset of the path:
> filname="a/b/c/d.png"
> print ${filname:t2}
c/d.png
> print ${filname:t2:h1}
c
> print ${filname:t3:h1}
b
If the level is in a variable, then the F modifier can be used to repeat the h modifier a specific number of times:
> for i in 1 2 3; printf '%s: %s\n' $i ${filname:F(i)h:t}
1: c
2: b
3: a
If using printf (a shell builtin) is allowed then this will do the trick in bash:
filename='a/b/c/d.png'
level=2
printf -v spaces '%*s' $level
pattern=${spaces//?/'/*'}
component=${filename%$pattern}
component=${component##*/}
echo $component
prints out
b
You can assign different values to the variable level.
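If a loop is acceptable, the same result can be had with nothing but the % and ## expansions. This is a different technique from the printf pattern trick above, offered only as a sketch; nth_parent is a made-up name.

```shell
#!/usr/bin/env bash
# Sketch only: nth_parent is a hypothetical helper.
nth_parent() {
  local path=$1 level=$2
  while (( level-- > 0 )); do
    path=${path%/*}              # drop the last component, once per level
  done
  printf '%s\n' "${path##*/}"    # keep only what is now the last component
}

nth_parent a/b/c/d.png 1   # prints c
nth_parent a/b/c/d.png 2   # prints b
```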

Looping through variable with spaces

This piece of code works as expected:
for var in a 'b c' d;
do
echo $var;
done
The bash script loops through 3 arguments printing
a
b c
d
However, if this string is read in via jq , and then looped over like so:
JSON_FILE=path/to/jsonfile.json
ARGUMENTS=$(jq -r '.arguments' "${JSON_FILE}")
for var in ${ARGUMENTS};
do
echo $var;
done
The result is 4 arguments as follows:
a
'b
c'
d
Example json file for reference:
{
"arguments" : "a 'b c' d"
}
What is the reason for this? I tried putting quotes around the variable like suggested in other SO answers but that caused everything to just be handled as 1 argument.
What can I do to get the behavior of the first case (3 arguments)?
What is the reason for this?
Word splitting is applied to the unquoted results of other expansions. Because the ${ARGUMENTS} expansion in for var in ${ARGUMENTS}; is unquoted, word splitting is performed on it. Word splitting does not interpret quote characters that come from a variable expansion; it only looks at the characters in IFS (whitespace by default), so the single quotes are treated as ordinary data.
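A quick bash sketch of the three cases, using a made-up helper count_args that reports its argument count:

```shell
#!/usr/bin/env bash
# Sketch only: count_args is a hypothetical helper.
count_args() { echo "$#"; }

arguments="a 'b c' d"
n_literal=$(count_args a 'b c' d)       # 3: quotes typed in the script are syntax
n_expanded=$(count_args $arguments)     # 4: quotes inside the value are plain data
n_quoted=$(count_args "$arguments")     # 1: no splitting at all
```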
What can I do to get the behavior of the first case (3 arguments)?
The good way™ would be to write your own parser, to parse the quotes inside the strings and split the argument depending on the quotes.
I advise using xargs: by default it parses quotes in its input (a behavior that is usually confusing, but is useful here):
$ arguments="a 'b c' d"
$ echo "${arguments}" | xargs -n1 echo
a
b c
d
# convert to array
$ readarray -d '' arr < <(<<<"${arguments}" xargs printf "%s\0")
As presented in the other answer, you may use eval, but please do not, eval is evil and will run expansions over the input string.
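Change IFS to a newline so that splitting happens only at line breaks. Note that IFS='\n' would set IFS to a backslash and the letter n, so use $'\n'; this helps only when each argument is on its own line:
...
IFS=$'\n'; for var in $ARGUMENTS;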
do
echo $var;
done

The semantics of arrays in bash

Check out the following transcript. With all possible rigor and formality, what is going on at each step?
$> ls -1 #This command prints 3 items. no explanation required.
a
b
c
$> X=$(ls -1) #Capture the output (as what? a string?)
$> Y=($(ls -1)) #Capture it again (as an array now?)
$> echo ${#X[@]} #Why is the length 1?
1
$> echo ${#Y[@]} #This works because Y is an array of the 3 items?
3
$> echo $X #Why are the linefeeds now spaces?
a b c
$> echo $Y #Why does the array echo as its first element
a
$> for x in $X;do echo $x; done #iterate over $X
a
b
c
$> for y in $Y;do echo $y; done #iterating over y doesn't work
a
$> echo ${X[2]} #I can loop over $X but not index into it?
$> echo ${Y[2]} #Why does this work if I can't loop over $Y?
c
I assume bash has well established semantics about how arrays and text variables (if that's even what they're called) work, but the user manual is not organized in an optimal fashion for someone who wants to reason about scripts based on whatever small set of underlying principles the language designer intended.
Let me preface the following with the very strong suggestion that you never use ls to populate an array. The correct code would be
Z=( * )
to create an array with each (non-hidden) file in the current directory as a distinct array element.
$> ls -1 #This command prints 3 items. no explanation required.
a
b
c
Correct. Each file name is printed on a separate line (although, beware of file names containing newlines; the parts before and after each newline would appear as separate file names.)
$> X=$(ls -1) #Capture the output (as what? a string?)
Yes. The output of ls is captured by the command substitution as a single string. Trailing newlines are stripped, but the newlines between the lines are kept. (The command substitution would be subject to word-splitting if it weren't the right-hand side of an assignment; word-splitting will come up below.)
$> Y=($(ls -1)) #Capture it again (as an array now?)
Same as with X, but now each of the words in the result of the command substitution is treated as a separate array element. As long as none of the output lines contain any characters in the value of IFS, each file name is one word and will be treated as a separate array element.
$> echo ${#X[@]} #Why is the length 1?
1
X, not being a real array, is treated as an array with a single element, namely the value of $X.
$> echo ${#Y[@]} #This works because Y is an array of the 3 items?
3
Correct.
$> echo $X #Why are the linefeeds now spaces?
a b c
When $X is unquoted, the resulting expansion is subject to word-splitting. In this case, the newlines are simply treated the same as any other whitespace, separating the result into a sequence of words that are passed to echo as distinct arguments, which are then displayed separated by a single space each.
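The difference is easy to see in a small bash sketch, with printf standing in for ls so that it does not depend on the current directory; out, unquoted, quoted and len are made-up names.

```shell
#!/usr/bin/env bash
# Sketch only: printf stands in for `ls -1`; variable names are hypothetical.
out=$(printf 'a\nb\nc\n')    # trailing newline stripped, inner newlines kept
len=${#out}                  # 5: a, newline, b, newline, c

unquoted=$(echo $out)        # split into three words, echo joins them with spaces
quoted=$(echo "$out")        # one word, newlines preserved
```

So the variable itself still holds newlines; the spaces only appear because echo received three separate arguments.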
$> echo $Y #Why does the array echo as its first element
a
For a true array, $Y is equivalent to ${Y[0]}.
$> for x in $X;do echo $x; done #iterate over $X
a
b
c
This works, but has caveats.
$> for y in $Y;do echo $y; done #iterating over y doesn't work
a
See above; $Y only expands to the first element. You want for y in "${Y[@]}"; do to iterate over all the elements.
$> echo ${X[2]} #I can loop over $X but not index into it?
Correct. X is not an array, so it behaves like an array with only element 0 and ${X[2]} expands to nothing. The unquoted $X, however, was word-split into a list of words that the for loop could iterate over.
$> echo ${Y[2]} #Why does this work if I can't loop over $Y?
c
Indexing and iteration are two completely different things in shell. You don't actually iterate over an array; you iterate over the resulting sequence of words of a properly expanded array.
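A compact bash sketch of those rules, with made-up names (files, first, count, seen, joined) and an element containing a space to show why the quoted form matters:

```shell
#!/usr/bin/env bash
# Sketch only: all variable names here are hypothetical.
files=( "a" "b" "c d" )

first=$files                 # same as ${files[0]}
count=${#files[@]}           # number of elements: 3

seen=()
for f in "${files[@]}"; do   # one word per element, spaces inside kept intact
  seen+=( "<$f>" )
done
joined="${seen[*]}"          # <a> <b> <c d>
```

The loop never sees the array itself, only the words that "${files[@]}" expands to.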

How to separate string into shell arguments?

I have this test variable in ZSH:
test_str='echo "a \" b c"'
I'd like to parse this into an array of two strings ("echo" "a \" b c").
i.e. Read test_str as the shell itself would and give me back an array of
arguments.
Please note that I'm not looking to split on white space or anything like that. This is really about parsing arbitrarily complex strings into shell arguments.
Zsh has the (z) parameter expansion flag:
ARGS=( ${(z)test_str} )
But this produces echo and "a \" b c"; it does not unquote the string. To unquote, you have to add the Q flag:
ARGS=( ${(Q)${(z)test_str}} )
This leaves echo and a " b c in the $ARGS array. Neither flag executes code in `…` or $(…), but (z) keeps $(false true) together as one argument.
that is to say:
% testfoo=${(z):-'blah $(false true)'}; echo $testfoo[2]
$(false true)
A simpler (?) answer is hinted at by the wording of the question. To set shell argument, use set:
#!/bin/sh
test_str='echo "a \" b"'
eval "set -- $test_str"
for i; do echo "$i"; done
This sets $1 to echo and $2 to a " b. eval certainly has risks, but this is portable sh. It does not assign to an array, of course, but you can use "$@" in the normal way.
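In bash the same idea can be wrapped in a function so the caller's positional parameters are left alone. This is only a sketch: split_like_shell and parsed are made-up names, and because eval runs any command substitution in the input, it is suitable only for strings you trust.

```shell
#!/usr/bin/env bash
# Sketch only: split_like_shell and parsed are hypothetical names.
# WARNING: eval executes $(...) and backticks found in the input string.
split_like_shell() {
  eval "set -- $1"       # let the shell parse the quoting for us
  parsed=( "$@" )        # copy the function's positional parameters out
}

split_like_shell 'echo "a \" b c"'
printf '[%s]\n' "${parsed[@]}"   # prints [echo] then [a " b c]
```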
