Bash looping through array - get index [duplicate] - bash

A script takes a URL, parses it for the required fields, and redirects its output to be saved in a file, file.txt. The output is saved on a new line each time a field has been found.
file.txt
A Cat
A Dog
A Mouse
etc...
I want to take file.txt and create an array from it in a new script, where every line gets to be its own string variable in the array. So far I have tried:
#!/bin/bash
filename=file.txt
declare -a myArray
myArray=(`cat "$filename"`)
for (( i = 0 ; i < 9 ; i++))
do
echo "Element [$i]: ${myArray[$i]}"
done
When I run this script, whitespace results in words getting split and instead of getting
Desired output
Element [0]: A Cat
Element [1]: A Dog
etc...
I end up getting this:
Actual output
Element [0]: A
Element [1]: Cat
Element [2]: A
Element [3]: Dog
etc...
How can I adjust the loop above such that the entire string on each line will correspond one-to-one with each variable in the array?

Use the mapfile command:
mapfile -t myArray < file.txt
The error is using for -- the idiomatic way to loop over lines of a file is:
while IFS= read -r line; do echo ">>$line<<"; done < file.txt
See BashFAQ/005 for more details.
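As a minimal sketch of how this fits the original question (the sample data and the temporary file are assumptions standing in for file.txt): "${!myArray[@]}" expands to the indices of the array, so the loop no longer needs a hard-coded upper bound.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for file.txt.
tmpfile=$(mktemp)
printf '%s\n' 'A Cat' 'A Dog' 'A Mouse' > "$tmpfile"

# Read every line into its own array element (-t strips the newline).
mapfile -t myArray < "$tmpfile"

# "${!myArray[@]}" yields the indices 0, 1, 2.
for i in "${!myArray[@]}"; do
    echo "Element [$i]: ${myArray[$i]}"
done
# Element [0]: A Cat
# Element [1]: A Dog
# Element [2]: A Mouse

rm -f "$tmpfile"
```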

mapfile and readarray (which are synonymous) are available in Bash version 4 and above. If you have an older version of Bash, you can use a loop to read the file into an array:
arr=()
while IFS= read -r line; do
arr+=("$line")
done < file
In case the file has an incomplete (missing newline) last line, you could use this alternative:
arr=()
while IFS= read -r line || [[ "$line" ]]; do
arr+=("$line")
done < file
Related:
Need alternative to readarray/mapfile for script on older version of Bash
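To see the difference between the two loops, here is a small sketch that runs both against the same input. The temporary file and the array names plain and safe are made up for the demonstration; the file deliberately has no newline after its last line.

```shell
#!/usr/bin/env bash
# Three lines, but no newline after the last one.
tmpfile=$(mktemp)
printf 'A Cat\nA Dog\nA Mouse' > "$tmpfile"

# Plain loop: read fails on the unterminated last line, so it is dropped.
plain=()
while IFS= read -r line; do
    plain+=("$line")
done < "$tmpfile"

# Loop with the extra test: read still fails, but $line is non-empty,
# so the body runs one more time and the last line is kept.
safe=()
while IFS= read -r line || [[ -n "$line" ]]; do
    safe+=("$line")
done < "$tmpfile"

echo "plain loop: ${#plain[@]} elements"   # plain loop: 2 elements
echo "safe loop:  ${#safe[@]} elements"    # safe loop:  3 elements

rm -f "$tmpfile"
```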

You can do this too:
oldIFS="$IFS"
IFS=$'\n' arr=($(<file))
IFS="$oldIFS"
echo "${arr[1]}" # It will print `A Dog`.
Note:
Filename expansion still occurs. For example, if there's a line with a literal * it will expand to all the files in current folder. So use it only if your file is free of this kind of scenario.
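If you still want this approach, one possible workaround is to switch filename expansion off with set -f around the assignment. The sketch below assumes a scratch directory and a made-up file called list; it is only meant to show the effect, not to replace mapfile.

```shell
#!/usr/bin/env bash
# Scratch directory with two files and a list that contains a literal '*'.
# All names here are made up for the demonstration.
startdir=$PWD
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch one.txt two.txt
printf '%s\n' 'A Cat' '*' 'A Dog' > list

oldIFS=$IFS
IFS=$'\n'
globbed=( $(<list) )    # the '*' line expands to list, one.txt and two.txt
set -f                  # turn filename expansion off
literal=( $(<list) )    # the '*' line is kept as it is
set +f                  # turn filename expansion back on
IFS=$oldIFS

echo "with globbing:    ${#globbed[@]} elements"   # 5
echo "without globbing: ${#literal[@]} elements"   # 3

cd "$startdir" || exit 1
rm -rf "$tmpdir"
```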

Use mapfile or read -a
Always check your code using shellcheck. It will often give you the correct answer. In this case SC2207 covers reading a file that either has space separated or newline separated values into an array.
Don't do this
array=( $(mycommand) )
Files with values separated by newlines
mapfile -t array < <(mycommand)
Files with values separated by spaces
IFS=" " read -r -a array <<< "$(mycommand)"
The shellcheck page will give you the rationale why this is considered best practice.
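A small self-contained sketch of both recommended forms. The functions lines_command and words_command are hypothetical stand-ins for mycommand, one producing newline-separated values and one producing space-separated values.

```shell
#!/usr/bin/env bash
# Hypothetical commands standing in for mycommand.
lines_command() { printf '%s\n' 'A Cat' 'A Dog' 'A Mouse'; }
words_command() { echo 'alpha beta gamma'; }

# Newline-separated output: one array element per line.
mapfile -t by_line < <(lines_command)

# Space-separated output: one array element per word.
IFS=" " read -r -a by_word <<< "$(words_command)"

# Show the resulting arrays.
declare -p by_line by_word
```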

You can simply read each line from the file and assign it to an array.
#!/bin/bash
i=0
while IFS= read -r line
do
arr[$i]="$line"
i=$((i+1))
done < file.txt

This answer says to use
mapfile -t myArray < file.txt
I made a shim for mapfile if you want to use mapfile on bash < 4.x for whatever reason. It uses the existing mapfile command if you are on bash >= 4.x
Currently, only options -d and -t work. But that should be enough for that command above. I've only tested on macOS. On macOS Sierra 10.12.6, the system bash is 3.2.57(1)-release. So the shim can come in handy. You can also just update your bash with homebrew, build bash yourself, etc.
It uses this technique to set variables up one call stack.

Make sure to set the Internal Field Separator (IFS)
variable to $'\n' so that it does not put each word
into a new array entry.
#!/bin/bash
# move all 2020 - 2022 movies to /backup/movies
# put list into file 1 line per dir
# dirs are "movie name (year)/"
ls | grep -E '202[0-2]' > 2020_movies.txt
OLDIFS=${IFS}
IFS=$'\n' #fix separator
declare -a MOVIES # array for dir names
MOVIES=( $( cat "${1}" ) ) # load into array
for M in "${MOVIES[@]}" ; do
echo "[${M}]"
if [ -d "${M}" ] ; then # if dir name
mv -v "$M" /backup/movies/
fi
done
IFS=${OLDIFS} # restore standard separators
# not essential as IFS reverts when script ends
#END

Related

Why does splitting my $PATH with `read -r -a line` work but not with `while read -r line`?

Just noticed something strange which I can't quite explain:
When I split my $PATH variable using read -a everything works fine
IFS=: read -r -a lines <<< "$PATH"
for line in "${lines[@]}"; do echo "$line"; done
But when I try to do the same using while ... read loop, only the first line is printed
while IFS=: read -r line; do echo "$line"; done <<< "$PATH"
You can make this work; switch from using IFS=: to using -d:, and append a : to the end of your input stream:
while IFS= read -r -d: line; do echo "$line"; done <<< "$PATH:"
The difference is that IFS is used to find boundaries between words, but read -r line reads into exactly one variable, line, so it's not looking for multiple words at all. By contrast, -d tells each invocation of read which character to stop at; by default that's a newline, but you can replace it with any other single character. (If that character isn't found, read exits with a nonzero status; that's why the idiomatic while read loop skips the last line of your file if it isn't correctly terminated by a newline, and why we use $PATH: as our input here.)
If you ran IFS=: read -r first second rest, on the other hand, it would put your first PATH entry into $first, the second one into $second, and the remainder of the line into $rest; whereas with IFS=: read -r line, it's as if you only had a single item, $rest.
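Here is a short sketch of the two loops side by side. It uses a fixed, made-up string (sample) instead of the real $PATH so the result is predictable, and collects what each loop sees into the hypothetical arrays whole and parts.

```shell
#!/usr/bin/env bash
# Made-up value standing in for $PATH.
sample='/usr/local/bin:/usr/bin:/bin'

# IFS=: with a single variable: the one input line lands in $line unsplit.
whole=()
while IFS=: read -r line; do
    whole+=("$line")
done <<< "$sample"

# -d: makes each read stop at a colon; the appended ':' terminates
# the last entry so that read succeeds for it as well.
parts=()
while IFS= read -r -d: line; do
    parts+=("$line")
done <<< "$sample:"

echo "IFS=: loop ran ${#whole[@]} time(s)"   # IFS=: loop ran 1 time(s)
echo "-d:   loop ran ${#parts[@]} time(s)"   # -d:   loop ran 3 time(s)
```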
Your while loop receives only one line of input, so its body runs once and the complete path is stored in the variable line.
If you had given more variables, the path would have been divided among them (with the last one getting the remainder):
while IFS=: read -r line field2 field3 otherfields; do echo "$line"; done <<< "$PATH"
When you want to avoid an array, you can use
while read -r line; do echo "$line"; done <<< "${PATH//:/$'\n'}"
It works fine.
Splitting into an array gives an open-ended number of elements, so does what you expect.
Splitting into a single variable does the same thing, but when it runs out of supplied variable names into which to put the data, it stops splitting and puts the rest into the last one.
Try this:
$: IFS=: read -r a b c <<< "$PATH"
$: printf "[%s]\n" "$a" "$b" "$c"
You'll get the first PATH element in $a, the second in $b, and the rest ALL in $c.
Does that make it clearer?
c.f. this guide
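For a predictable result, the same experiment can be run against a fixed string. The value of sample below is made up; it has four entries and only three variables to receive them.

```shell
#!/usr/bin/env bash
# Made-up value standing in for $PATH: four entries, three variables.
sample='/usr/local/bin:/usr/bin:/bin:/sbin'

IFS=: read -r a b c <<< "$sample"

printf '[%s]\n' "$a" "$b" "$c"
# [/usr/local/bin]
# [/usr/bin]
# [/bin:/sbin]
```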
Why does splitting my $PATH with read -r -a line work but not with while read -r line?
Because read -r line first reads one whole line, and only afterwards is that line split on IFS. Because you provided only one variable to read, the whole line ends up in that one variable. You could, for example, split the line into its first element and the rest of the elements:
IFS=: read -r part1 rest_of_parts <<<"$line"
See the read(1p) man page, in particular the part beginning "If there are fewer vars than fields". Note that IFS=: read -r -a lines <<< "$PATH" will still fail when PATH contains a newline, like so:
$ export PATH=/usr/bin # reset PATH to something short
$ cd /tmp/
$ mkdir temp$'\n'dir # create a directory with a newline in the name
$ ls -d tem*
'temp'$'\n''dir'/
$ cd temp$'\n'dir
$ printf "%s\n" '#!/bin/bash' 'echo hello world' > script.sh
$ chmod +x ./script.sh # add a script in that directory
$ export PATH="$PATH:$PWD" # add that directory to path
$ ./script.sh # yes. yes, it works
hello world
$ IFS=: read -r -a lines <<< "$PATH"
$ declare -p lines
declare -a lines=([0]="/usr/bin" [1]="/tmp/temp")
# ^^^^ newline and 'dir' is missing
# That is because `read` reads _one line_ and one line only
# _after_ reading that one line that _one line_ is split on IFS
# so any more lines are ignored.
You could use a bash extension to read -d that makes read not read the whole line, but up until a character (but I needed to ignore read exit status, dunno why):
$ while IFS= read -r -d':' line || [[ -n "$line" ]]; do declare -p line; done < <(printf "%s" "$PATH")
declare -- line="/usr/bin"
declare -- line="/tmp/temp
dir"
Note that <<< adds a trailing newline, so using that will result in the last element of PATH having a newline - as a workaround, in bash you may use process substitution < <(printf "%s" "$PATH").
The truly safe solution in bash is to use mapfile/readarray (the -d option requires bash 4.4 or newer):
$ mapfile -d: -t lines < <(printf "%s" "$PATH")
$ declare -p lines
declare -a lines=([0]="/usr/bin" [1]=$'/tmp/temp\ndir')

Assign each line of file to be a variable

I am looking to assign each line of a file, read through stdin, to a specific variable that can be used to refer to that exact line, such as line1, line2.
example:
cat Testfile
Sample 1 -line1
Sample 2 -line2
Sample 3 -line3
The wrong way to do this, but exactly what you asked for, using discrete variables:
while IFS= read -r line; do
printf -v "line$(( ++i ))" '%s' "$line"
done <Testfile
echo "$line1" # to demonstrate use of the variables
echo "$line2"
The right way, using an array, for bash 4.0 or newer:
mapfile -t array <Testfile
echo "${array[0]}" # to demonstrate use of array values
echo "${array[1]}"
The right way, using an array, for bash 3.x:
declare -a array
while read -r; do
array+=( "$REPLY" )
done <Testfile
See BashFAQ #6 for more in-depth discussion.
bash has a builtin command to do that. readarray reads lines from stdin (which can be your file) and assigns them to the elements of an array:
declare -a lines
readarray -t lines <Testfile
Thereafter, you can refer to the lines by number. The first line is "${lines[0]}" and the second is "${lines[1]}", etc.
readarray requires bash version 4 (released in 2009) or newer, and is available on many modern Linux systems. Debian stable, for example, currently provides bash 4.2 while RHEL6 provides 4.1. Mac OS X, though, is still using bash 3.x.
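Since the question counts lines from 1 while array indices start at 0, a tiny helper can hide the offset. This is only a sketch: get_line is a hypothetical function name, and the temporary file stands in for Testfile.

```shell
#!/usr/bin/env bash
# Temporary file standing in for Testfile.
tmpfile=$(mktemp)
printf '%s\n' 'Sample 1 -line1' 'Sample 2 -line2' 'Sample 3 -line3' > "$tmpfile"

readarray -t lines < "$tmpfile"

# Hypothetical helper: print line N, counting from 1 like line1, line2, ...
get_line() {
    printf '%s\n' "${lines[$1 - 1]}"
}

get_line 1   # Sample 1 -line1
get_line 3   # Sample 3 -line3

rm -f "$tmpfile"
```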

list in script shell bash

I did this script
#!/bin/bash
liste=`ls -l`
for i in $liste
do
echo $i
done
The problem is that I want the script to display each result line by line, but it displays word by word:
I have :
my_name
etud
4096
Oct
8
10:13
and I want to have :
my_name etud 4096 Oct 8 10:13
The final aim of the script is to analyze each line ; it is the reason I want to be able to recover the entire line. Maybe the list is not the best solution but I don't know how to recover the lines.
To start, we'll assume that none of your filenames ever contain newlines:
ls -l | while IFS= read -r line; do
echo "$line"
# Do whatever else you want with $line
done
If your filenames could contain newlines, things get tricky. In this case, it's better (although slower) to use stat to retrieve the desired metadata from each file individually. Consult man stat for details about how your local variety of stat works, as it is unfortunately not very standardized.
for f in *; do
line=$(stat -c "%U %n %s %y" "$f") # One possibility
# Work with $line as if it came from ls -l
done
You can replace
echo $i
with
echo -n "$i "
echo -n outputs to console without newline.
Another way to do it, with a while loop and without a pipe:
#!/bin/bash
while read line
do
echo "line: $line"
done < <(ls -l)
First, I hope that you aren't genuinely using ls in your real code, but only using it as an example. If you want a list of files, ls is the wrong tool; see http://mywiki.wooledge.org/ParsingLs for details.
Second, modern versions of bash have a builtin called readarray.
Try this:
readarray -t my_array < <(ls -l)
for entry in "${my_array[@]}"; do
read -r -a pieces <<<"$entry"
printf '<%s> ' "${pieces[@]}"; echo
done
First, it creates an array (called my_array) with all the output from the command being run.
Then, for each line in that output, it creates an array called pieces and prints each piece with angle brackets around it.
If you want to read a line at a time, rather than reading the entire file at once, see http://mywiki.wooledge.org/BashFAQ/001 ("How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?")
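Combining the previous answers with the need to store the list of files in a variable, you can do this (printf adds the trailing newline that read needs in order to return the last line):
printf '%s\n' "$list" | while IFS= read -r lin
do
echo "$lin"
done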

assign a value to a variable in a loop

There are 2 pieces of code here, and the value in $1 is the name of a file which contains 3 lines of text.
Now, I have a problem. In the first piece of the code, I can't get the "right" value out of the loop, but in the second piece of the code, I can get the right result. I don't know why.
How can I make the first piece of the code get the right result?
#!/bin/bash
count=0
cat "$1" | while read line
do
count=$[ $count + 1 ]
done
echo "$count line(s) in all."
#-----------------------------------------
count2=0
for var in a b c
do
count2=$[ $count2 + 1 ]
done
echo "$count2 line(s) in all."
This happens because of the pipe before the while loop. It creates a sub-shell, and thus the changes in the variables are not passed to the main script. To overcome this, use process substitution instead:
while read -r line
do
# do some stuff
done < <( some command )
In Bash 4.2 or later, you can also set the lastpipe option, so that the last command
in the pipeline runs in the current shell rather than in a subshell (this only takes effect when job control is off, as it is in a script).
shopt -s lastpipe
some command | while read -r line; do
# do some stuff
done
In this case, since you are just using the contents of the file, you can use input redirection:
while read -r line
do
# do some stuff
done < "$file"
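The effect is easy to reproduce. The sketch below counts the lines of a made-up three-line temporary file twice, once through a pipe and once through input redirection; the variable names piped and redirected are only for illustration, and it assumes lastpipe has not been enabled.

```shell
#!/usr/bin/env bash
# Made-up three-line input file.
tmpfile=$(mktemp)
printf '%s\n' one two three > "$tmpfile"

# Pipe: the loop runs in a subshell, so the increments are lost.
piped=0
cat "$tmpfile" | while read -r line; do
    piped=$((piped + 1))
done

# Redirection: the loop runs in the current shell, so the count survives.
redirected=0
while read -r line; do
    redirected=$((redirected + 1))
done < "$tmpfile"

echo "piped=$piped redirected=$redirected"   # piped=0 redirected=3

rm -f "$tmpfile"
```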
