Why does IFS not affect the length of an array in bash? - bash

I have two specific questions about the IFS. I'm aware that changing the internal field separator, IFS, changes what the bash script iterates over.
So, why is it that the length of the array doesn't change?
Here's my example:
delimiter=$1
strings_to_find=$2
OIFS=$IFS
IFS=$delimiter
echo "the internal field separator is $IFS"
echo "length of strings_to_find is ${#strings_to_find[@]}"
for string in ${strings_to_find[@]}
do
echo "string is separated correctly and is $string"
done
IFS=$OIFS
But why does the length not get affected by the new IFS?
The second thing that I don't understand is how to make the IFS affect the input arguments.
Let's say I'm expecting my input arguments to look like this:
./executable_shell_script.sh first_arg:second_arg:third_arg
And I want to parse the input arguments by setting the IFS to :. How do I do this? Setting the IFS doesn't seem to do anything. I must be doing this wrong....?
Thank you.

Bash arrays are, in fact, arrays. They are not strings which are parsed on demand. Once you create an array, the elements are whatever they are, and they won't change retroactively.
However, nothing in your example creates an array. If you wanted to create an array out of argument 2, you would need to use a different syntax:
strings_to_find=($2)
Although your strings_to_find is not an array, bash allows you to refer to it as though it were an array of one element. So ${#strings_to_find[@]} will always be one, regardless of the contents of strings_to_find. Also, your line:
for string in ${strings_to_find[@]}
is really no different from
for string in $strings_to_find
Since that expansion is not quoted, it will be word-split, using the current value of IFS.
If you use an array, most of the time you will not want to write for string in ${strings_to_find[@]}, because that just reassembles the elements of an array into a string and then word-splits them again, which loses the original array structure. Normally you will avoid the word-splitting by using double quotes:
strings_to_find=(...)
for string in "${strings_to_find[@]}"
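A small sketch of the difference between the two forms. The array contents and the count_args helper are made up for illustration, and the default IFS is assumed:

```bash
#!/bin/bash
# Hypothetical sample data: two elements, each containing a space.
strings_to_find=("first item" "second item")

# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the elements are word-split again on IFS (default whitespace).
unquoted_count=$(count_args ${strings_to_find[@]})

# Quoted: each array element is passed through as exactly one word.
quoted_count=$(count_args "${strings_to_find[@]}")

echo "unquoted: $unquoted_count, quoted: $quoted_count"
# prints "unquoted: 4, quoted: 2"
```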
As for your second question, the value of IFS does not alter the shell grammar. Regardless of the value of IFS, words in a command are separated by unquoted whitespace. After the line is parsed, the shell performs parameter and other expansions on each word. As mentioned above, if the expansion is not quoted, the expanded text is then word-split using the value of IFS.
If the word does not contain any expansions, no word-splitting is performed. And even if the word does contain expansions, word-splitting is only performed on the expansion itself. So, if you write:
IFS=:
my_function a:b:c
my_function will be called with a single argument; no expansion takes place, so no word-splitting occurs. However, if you use $1 unquoted inside the function, the expansion of $1 will be word-split (if it is expanded in a context in which word-splitting occurs).
On the other hand,
IFS=:
args=a:b:c
my_function $args
will cause my_function to be invoked with three arguments.
And finally,
IFS=:
args=c
my_function a:b:$args
is exactly the same as the first invocation, because there is no : in the expansion.
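For the original goal of parsing an argument such as first_arg:second_arg:third_arg, one possible sketch is to split the expansion of $1 explicitly with read. The function name split_on_colon is hypothetical, and the IFS=: assignment is scoped to the read command only, so the global IFS is left alone:

```bash
#!/bin/bash
# Hypothetical helper: split one colon-separated argument into an array
# and print the element count followed by each element on its own line.
split_on_colon() {
    local -a parts
    # IFS=: applies only to this read command.
    IFS=: read -r -a parts <<< "$1"
    printf '%s\n' "${#parts[@]}" "${parts[@]}"
}

split_on_colon "first_arg:second_arg:third_arg"
# prints 3, then first_arg, second_arg, third_arg, one per line
```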

This is an example script based on @rici's answer:
#!/bin/bash
fun()
{
echo "Total Params : " $#
}
fun2()
{
array1=($1) # Word splitting occurs here based on the IFS ':'
echo "Total elements in array1 : "${#array1[@]}
# Here '#' before array counts the length of the array
array2=("$1") # No word splitting because we have enclosed $1 in double quotes
echo "Total elements in array2 : "${#array2[@]}
}
IFS_OLD="$IFS"
IFS=$':' #Changing the IFS
fun a:b:c #Nothing to expand here, so no use of IFS at all. See fun2 at last
fun a b c
fun abc
args="a:b:c"
fun $args # Expansion! Word splitting occurs with the current IFS ':' here
fun "$args" # preventing word splitting by enclosing the string in double quotes
fun2 a:b:c
IFS="$IFS_OLD"
Output
Total Params : 1
Total Params : 3
Total Params : 1
Total Params : 3
Total Params : 1
Total elements in array1 : 3
Total elements in array2 : 1
Bash manpage says :
The shell treats each character of IFS as a delimiter, and splits the
results of the other expansions into words on these characters.

Related

How do bash variable types work and how to work around automatic interpretation?

I am trying to set up a variable that contains a string representation of a value with leading zeroes. I know I can printf to terminal the value, and I can pass the string output of printf to a variable. It seems however that assigning the value string to a new variable reinterprets the value and if I then print it, the value has lost its leading zeroes.
How do we work with variables in bash scripts to avoid implicit type inferences and ultimately how do I get to the solution I'm looking for. FYI I'm looking to concatenate a large fixed length string numeric, something like a part number, and build it from smaller prepared strings.
Update:
Turns out exactly how variables are assigned changes their interpretation in some way, see below:
Example:
#!/bin/bash
a=3
b=4
aStr=$(printf %03d $a)
bStr=$(printf %03d $b)
echo $aStr$bStr
output
$ ./test.sh
003004
$
Alternate form:
#!/bin/bash
((a = 3))
((b = 4))
((aStr = $(printf %03d $a)))
((bStr = $(printf %03d $b)))
echo $aStr$bStr
output
$ ./test.sh
34
$
How do bash variable types
There are no variable types. All variables are strings. A variable stores a value (a string), but variables can also have some additional magic attributes associated with them.
There are Bash arrays, but I think being an array is itself an attribute of the variable. In any case, every array element holds a string. There is a "numeric" variable, declare -i var, but that is an attribute of the variable: in memory the variable is still a string, and only when it is assigned does Bash evaluate the string (still a string!) as an arithmetic expression.
assigning the value string to a new variable reinterprets the value
Bash does not "interpret" the value on assignment.
How do we work with variables in bash scripts to avoid implicit type inferences
There are no "type inferences". The type of variable does not change - it holds a string.
The value of the variable undergoes different expansions and conversions depending on the context where it is used. For example $(...) removes trailing newlines. Most notably unquoted variable expansions undergo word splitting and filename expansion.
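A short sketch of how the integer attribute changes what happens on assignment while the stored value stays a string. The variable names are hypothetical:

```bash
#!/bin/bash
plain="007"          # ordinary variable: stored and returned as the string 007
declare -i counted   # give a hypothetical variable the integer attribute
counted="007"        # assignment is evaluated arithmetically (octal 007 is 7)
declare -i total
total="2+3"          # the string 2+3 is evaluated as an expression
echo "$plain $counted $total"
# prints "007 7 5"
```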
Example:
Posting your code to shellcheck results in:
Line 2:
a = 3
^-- SC2283 (error): Remove spaces around = to assign (or use [ ] to compare, or quote '=' if literal).
Line 3:
b = 4
^-- SC2283 (error): Remove spaces around = to assign (or use [ ] to compare, or quote '=' if literal).
Line 4:
aStr = $(printf %03d $a)
^-- SC2283 (error): Remove spaces around = to assign (or use [ ] to compare, or quote '=' if literal).
^-- SC2046 (warning): Quote this to prevent word splitting.
^-- SC2154 (warning): a is referenced but not assigned.
^-- SC2086 (info): Double quote to prevent globbing and word splitting.
Did you mean: (apply this, apply all SC2086)
aStr = $(printf %03d "$a")
Line 5:
bStr = $(printf %03d $b)
^-- SC2283 (error): Remove spaces around = to assign (or use [ ] to compare, or quote '=' if literal).
^-- SC2046 (warning): Quote this to prevent word splitting.
^-- SC2154 (warning): b is referenced but not assigned.
^-- SC2086 (info): Double quote to prevent globbing and word splitting.
Did you mean: (apply this, apply all SC2086)
bStr = $(printf %03d "$b")
Line 7:
echo $aStr$bStr
^-- SC2154 (warning): aStr is referenced but not assigned.
^-- SC2086 (info): Double quote to prevent globbing and word splitting.
^-- SC2154 (warning): bStr is referenced but not assigned.
^-- SC2086 (info): Double quote to prevent globbing and word splitting.
Did you mean: (apply this, apply all SC2086)
echo "$aStr""$bStr"
Shellcheck tells you what is wrong. After fixing the problems:
#!/bin/bash
a=3
b=4
aStr=$(printf %03d "$a")
bStr=$(printf %03d "$b")
echo "$aStr$bStr"
Which upon execution outputs your expected output:
003004
By doing ((aStr = $(printf %03d $a))), you destroy the careful formatting done by printf again, because the arithmetic context stores a plain number. You would see the same effect with
(( x = 005 ))
echo $x
which outputs 5.
Actually the zeroes inserted by printf could do harm to your number, as you see by the following example:
(( x = 015 ))
echo $x
which outputs 13, because the ((....)) interprets the leading zero as an indication for octal numbers.
Hence, if you have a string representing a formatted (pretty-printed) number, don't use this string in numeric context anymore.
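If a zero-padded string does have to go back into arithmetic, one option (offered here as an aside, not part of the answer above) is bash's base#number notation, which forces base 10. A minimal sketch with a made-up value:

```bash
#!/bin/bash
padded=$(printf '%03d' 15)     # "015", a string meant for display
as_octal=$(( padded ))         # leading zero, so parsed as octal: 13
as_decimal=$(( 10#$padded ))   # 10# forces base 10: 15
echo "$padded $as_octal $as_decimal"
# prints "015 13 15"
```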

Storing a variable string with special characters into an array in bash

I need to store a string that may include special characters (to be exact: *) into an array as individual strings. The string is returned by a function so at the point of the array declaration I do not know its contents
foo(){
in="my * string"
echo "$in"
}
arr=($(foo))
What I've already tried was:
arr=("$(foo)")
where * doesn't get expanded but the array consists of 1 string, and:
arr=($(foo | sed -r "s/[\*]/'*'/g"))
that replaces each occurrence of * with the string: '*'. Which is not what I want to achieve. What I aim for is just storing each * from the returned string as *.
Storing an array this way does not expand the "*"
ins="my * string"
read -r -a array <<< "$ins"
echo "${array[*]}"
Short answer:
read -a arr <<< "$(foo)"
To elaborate -
Your function is correctly returning the single string "my * string".
Your assignment to an array executes the function in unquoted context, so the asterisk is evaluated and parsed to the names of everything in the directory.
Putting quotes around the outer parens makes the whole assignment into the string "(my * string)" - also not what you want. You need something that preserves the asterisk unexpanded into directory contents but parses the elements of the string into separate items in your array, yes?
read -a arr <<< "$(foo)"
This passes back the string properly quoted, and then reads it into the array after splitting with $IFS, so each item becomes an unexpanded string in the array.
$: echo "${#arr[@]}"
3
$: printf "%s\n" "${arr[@]}"
my
*
string
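Another possibility, sketched here as an alternative rather than as part of the answer above, is to keep the arr=($(foo)) form but switch off filename expansion around it with set -f. This assumes the default IFS, so the string is still split on whitespace:

```bash
#!/bin/bash
foo() {
    in="my * string"
    echo "$in"
}

set -f          # disable filename expansion (noglob) so * stays literal
arr=($(foo))    # still word-split on IFS, but no globbing
set +f          # restore filename expansion

printf '%s\n' "${#arr[@]}" "${arr[@]}"
# prints 3, then my, *, string, one per line
```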

return array from perl to bash

I'm trying to get back an array from perl to bash.
My Perl script has an array, and I then use return(@arr).
From my bash script I use
VAR = `perl....
When I echo VAR,
I get the array as one long string, with all the array values joined together with no spaces.
Thanks
In the shell (and in Perl), backticks (``) capture the output of a command. However, Perl's return is normally for returning variables from subroutines - it does not produce output, so you probably want print instead. Also, in bash, array variables are declared with parentheses. So this works for me:
$ ARRAY=(`perl -wMstrict -le 'my @array = qw/foo bar baz/; print "@array"'`); \
echo "<${ARRAY[*]}> 0=${ARRAY[0]} 1=${ARRAY[1]} 2=${ARRAY[2]}"
<foo bar baz> 0=foo 1=bar 2=baz
In Perl, interpolating an array into a string (like "@array") will join the array with the special variable $" in between elements; that variable defaults to a single space. If you simply print @array, then the array elements will be joined by the variable $,, which is undef by default, meaning no space between the elements. This probably explains the behavior you mentioned ("the array vars connected with no spaces").
Note that the above will not work the way you expect if the elements of the array contain whitespace, because bash will split them into separate array elements. If your array does contain whitespace, then please provide an MCVE with sample data so we can perhaps make an alternative suggestion of how to return that back to bash. For example:
( # subshell so IFS is only affected locally
IFS=$'\n'
ARRAY=(`perl -wMstrict -e 'my @array = ("foo","bar","quz baz"); print join "\n", @array'`)
echo "0=<${ARRAY[0]}> 1=<${ARRAY[1]}> 2=<${ARRAY[2]}>"
)
Outputs: 0=<foo> 1=<bar> 2=<quz baz>
Here is one way using Bash word splitting, it will split the string on white space into the new array array:
array_str=$(perl -E '@a = 1..5; say "@a"')
array=( $array_str )
for item in ${array[@]} ; do
echo ": $item"
done
Output:
: 1
: 2
: 3
: 4
: 5
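When the elements may contain spaces and the producer prints one element per line, mapfile (bash 4 and later) is another way to fill the array without touching IFS. In this sketch the produce_items function is a hypothetical stand-in for the Perl command:

```bash
#!/bin/bash
# Hypothetical stand-in for the Perl command: prints one element per line.
produce_items() { printf '%s\n' "foo" "bar" "quz baz"; }

# mapfile reads one line per array element; -t strips the trailing newline.
mapfile -t items < <(produce_items)

echo "${#items[@]}"
printf '<%s>\n' "${items[@]}"
# prints 3, then <foo>, <bar>, <quz baz>, one per line
```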

Why does # not start a comment but instead give the number of elements in the following code?

#!/bin/bash
#Declare array with 4 elements
ARRAY=( 'Debian Linux' 'Redhat Linux' Ubuntu Linux )
# get number of elements in the array
ELEMENTS=${#ARRAY[@]}
# echo each element in array
# for loop
for (( i=0;i<$ELEMENTS;i++)); do
echo ${ARRAY[${i}]}
done
The 5th line (ELEMENTS=${#ARRAY[@]}) gets the number of elements. How does this happen? Please explain.
It's because of the ${...} expansion. Inside one of them, the # character is not treated as an indicator of a comment. I wanted to know exactly, so I searched the source code of bash. First the part with normal comments in parse.y:
if MBTEST(character == '#' && (!interactive || interactive_comments))
{
/* A comment. Discard until EOL or EOF, and then return a newline. */
discard_until ('\n');
shell_getc (0);
character = '\n'; /* this will take the next if statement and return. */
}
If the character is a # the rest of the line is ignored. So far so good.
Now, if we're inside an opened ${...} expansion and the next character is #, the rest of the content until the closing } is interpreted as a variable name. See the relevant part in subst.c:
/* ${#var} doesn't have any of the other parameter expansions on it. */
if (string[t_index] == '#' && legal_variable_starter (string[t_index+1]))
name = string_extract (string, &t_index, "}", SX_VARNAME);
else
From man bash:
COMMENTS
In a non-interactive shell, or an interactive shell in which the interactive_comments option to the shopt builtin is enabled (see SHELL BUILTIN COMMANDS below), a word beginning with # causes that word and all remaining characters on that line to be ignored. An interactive shell without the interactive_comments option enabled does not allow comments. The interactive_comments option is on by default in interactive shells.
If a word begins with #, that indicates start of the comment. If it is in between the word, it is not.
As @choroba mentioned, read the paragraph Parameter Expansion in the bash manual pages:
${#parameter}
The length in characters of the expanded value of parameter is
substituted. If parameter is ‘*’ or ‘@’, the value substituted is the
number of positional parameters. If parameter is an array name
subscripted by ‘*’ or ‘@’, the value substituted is the number of
elements in the array.
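The different meanings of # can be put side by side in a short sketch. It reuses the array from the question; the other variable names are made up for illustration:

```bash
#!/bin/bash
ARRAY=( 'Debian Linux' 'Redhat Linux' Ubuntu Linux )
word="Ubuntu"

element_count=${#ARRAY[@]}   # number of elements in the array: 4
first_length=${#ARRAY[0]}    # length of the string 'Debian Linux': 12
word_length=${#word}         # length of the string 'Ubuntu': 6

set -- a b c                 # set three positional parameters
param_count=$#               # number of positional parameters: 3

# A '#' that begins a word, as on this line, starts a comment.
echo "$element_count $first_length $word_length $param_count"
# prints "4 12 6 3"
```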

confusing statement in shell script

while sizes=`sizes $pgid`
do
set -- $sizes
sample=$((${@/#/+}))
let peak="sample > peak ? sample : peak"
sleep 0.1
done
i am confused about the below statement:
sample=$((${@/#/+}))
could anybody explain this?
The '${@/#/+}' part is a pattern substitution, which is one form of parameter expansion:
${parameter/pattern/string}
The pattern is expanded to produce a pattern just as in filename expansion.
Parameter is expanded and the longest match of pattern against its value is
replaced with string. If pattern begins with '/', all matches of pattern are replaced
with string. Normally only the first match is replaced. If pattern begins
with '#', it must match at the beginning of the expanded value of parameter.
If pattern begins with '%', it must match at the end of the expanded value of
parameter. If string is null, matches of pattern are deleted and the / following
pattern may be omitted. If parameter is '@' or '*', the substitution operation
is applied to each positional parameter in turn, and the expansion is the resultant
list. If parameter is an array variable subscripted with '@' or '*', the
substitution operation is applied to each member of the array in turn, and the
expansion is the resultant list.
So, it looks like it replaces the empty string at the start of each value in the argument list '$@' with a '+'. Its key merit is that it prefixes each argument in one fell swoop; otherwise, it is similar to "+$var".
The '$(( ... ))' part is an arithmetic expansion. It evaluates the expression between the parentheses. So, in context, it adds up the values in the argument list, assuming they are all numeric. Given the expansion, it might yield:
set -- 2 3 5 7 11
sample=$((${@/#/+}))
sample1=$((+2 +3 +5 +7 +11))
echo $sample = $sample1
and hence '28 = 28'.
Let's take the line from the inside out.
${@/#/+}
This is a parameter expansion, which expands the $@ parameter (which in this case will be the list of all of the items in $sizes), and then does a pattern match on each item, replacing each matched sequence with +. The # in the pattern matches the beginning of each item in the input; it doesn't actually consume anything, so the replacement by + will just add a + before each item. You can see this in action with a simple test function:
$ function tst() { echo ${@/#/+}; }
$ tst 1 2 3
+1 +2 +3
The result of this is then substituted into $(( )), which performs arithmetic expansion, evaluating the expression within it. The end result is that the variable $sample is set to the sum of all of the numbers in $sizes.
It's an arithmetic expansion of a string replacement.
$(( )) is arithmetic expansion - eg echo $((1 + 2)).
${var/x/y} is a pattern substitution; in this case the pattern # is not a literal character but an anchor for the start of the value, so a + is inserted in front of each value. $@ holds the positional parameters, which here were set from $sizes, so the result is an expression that adds those values together.
${var/old/new} expands $var, changing any "old" to "new".
${var/#old/new} insists that the match start at the start of the value
${var/#/new} inserts "new" at the start of the value
${@/#/new} (and the same with *) applies to each parameter
$(( 1 + 3 )) replaces with the arithmetic result.
$(( ${@/#/+} ))
Expands $@, the arguments from set -- $sizes, prepends a "+" to each parameter and runs the result through an arithmetic evaluation. It looks like it is adding all values on each line.
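To check that reading, the compact form can be compared against an explicit loop. This is a minimal sketch with made-up numbers standing in for the output of sizes:

```bash
#!/bin/bash
set -- 2 3 5 7 11             # hypothetical values in place of $sizes

compact=$(( ${@/#/+} ))       # expands to $(( +2 +3 +5 +7 +11 ))

explicit=0
for value in "$@"; do         # the same sum, written out step by step
    explicit=$(( explicit + value ))
done

echo "$compact $explicit"
# prints "28 28"
```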

Resources