A script takes a URL, parses it for the required fields, and redirects its output to be saved in a file, file.txt. The output is saved on a new line each time a field has been found.
file.txt
A Cat
A Dog
A Mouse
etc...
I want to take file.txt and create an array from it in a new script, where every line gets to be its own string variable in the array. So far I have tried:
#!/bin/bash
filename=file.txt
declare -a myArray
myArray=(`cat "$filename"`)
for (( i = 0 ; i < 9 ; i++))
do
echo "Element [$i]: ${myArray[$i]}"
done
When I run this script, whitespace results in words getting split and instead of getting
Desired output
Element [0]: A Cat
Element [1]: A Dog
etc...
I end up getting this:
Actual output
Element [0]: A
Element [1]: Cat
Element [2]: A
Element [3]: Dog
etc...
How can I adjust the loop above such that the entire string on each line will correspond one-to-one with each variable in the array?
Use the mapfile command:
mapfile -t myArray < file.txt
The error is using for -- the idiomatic way to loop over lines of a file is:
while IFS= read -r line; do echo ">>$line<<"; done < file.txt
See BashFAQ/005 for more details.
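As a minimal sketch of how the mapfile approach slots into the script from the question (assuming bash 4 or newer; the temporary file simply stands in for file.txt):

```shell
#!/usr/bin/env bash
# Sketch: read a file into an array with mapfile, then loop by index.
# The temporary file and its contents are stand-ins for file.txt.
tmpfile=$(mktemp)
printf '%s\n' 'A Cat' 'A Dog' 'A Mouse' > "$tmpfile"

# -t strips the trailing newline from every line before storing it.
mapfile -t myArray < "$tmpfile"

# ${#myArray[@]} is the element count, so no hard-coded upper bound is needed.
for (( i = 0; i < ${#myArray[@]}; i++ )); do
    echo "Element [$i]: ${myArray[$i]}"
done

rm -f "$tmpfile"
```

Using the element count as the loop bound also avoids printing empty entries when the file has fewer than 9 lines.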
mapfile and readarray (which are synonymous) are available in Bash version 4 and above. If you have an older version of Bash, you can use a loop to read the file into an array:
arr=()
while IFS= read -r line; do
arr+=("$line")
done < file
In case the file has an incomplete (missing newline) last line, you could use this alternative:
arr=()
while IFS= read -r line || [[ "$line" ]]; do
arr+=("$line")
done < file
Related:
Need alternative to readarray/mapfile for script on older version of Bash
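To see the difference between the two loops, here is a small sketch that runs both against a file whose last line has no trailing newline. The array names plain and safe are made up for the demo:

```shell
#!/usr/bin/env bash
# Sketch: compare the plain loop with the guarded loop on a file
# whose last line is not newline-terminated.
tmpfile=$(mktemp)
printf 'A Cat\nA Dog\nA Mouse' > "$tmpfile"   # note: no final newline

plain=()
while IFS= read -r line; do
    plain+=("$line")
done < "$tmpfile"

safe=()
# read fails on the unterminated last line but still fills $line,
# so the extra test lets the loop body run one more time.
while IFS= read -r line || [[ -n "$line" ]]; do
    safe+=("$line")
done < "$tmpfile"

echo "plain loop:   ${#plain[@]} elements"   # 2
echo "guarded loop: ${#safe[@]} elements"    # 3
rm -f "$tmpfile"
```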
You can do this too:
oldIFS="$IFS"
IFS=$'\n' arr=($(<file))
IFS="$oldIFS"
echo "${arr[1]}" # It will print `A Dog`.
Note:
Filename expansion still occurs. For example, if there's a line with a literal * it will expand to all the files in current folder. So use it only if your file is free of this kind of scenario.
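If you want to keep this approach anyway, one option is to switch filename expansion off with set -f around the assignment. A rough sketch, using a scratch directory and file names invented for the demo:

```shell
#!/usr/bin/env bash
# Sketch: the same IFS-based split with and without filename expansion.
# The scratch directory and the file names in it are made up for this demo.
orig_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch one.txt two.txt
printf '%s\n' 'A Cat' '*' 'A Dog' > list

oldIFS=$IFS
IFS=$'\n'
globbed=( $(< list) )   # the * line expands to: list one.txt two.txt
set -f                  # turn filename expansion off
arr=( $(< list) )       # the * line stays a literal *
set +f                  # turn it back on
IFS=$oldIFS

echo "with globbing: ${#globbed[@]} elements"   # 5
echo "with set -f:   ${#arr[@]} elements"       # 3

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

Empty lines are still dropped by the word splitting, so mapfile or a read loop remains the safer choice.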
Use mapfile or read -a
Always check your code using shellcheck. It will often give you the correct answer. In this case SC2207 covers reading a file that either has space separated or newline separated values into an array.
Don't do this
array=( $(mycommand) )
Files with values separated by newlines
mapfile -t array < <(mycommand)
Files with values separated by spaces
IFS=" " read -r -a array <<< "$(mycommand)"
The shellcheck page will give you the rationale why this is considered best practice.
You can simply read each line from the file and assign it to an array.
#!/bin/bash
i=0
while IFS= read -r line
do
arr[$i]="$line"
i=$((i+1))
done < file.txt
This answer says to use
mapfile -t myArray < file.txt
I made a shim for mapfile if you want to use mapfile on bash < 4.x for whatever reason. It uses the existing mapfile command if you are on bash >= 4.x
Currently, only options -d and -t work. But that should be enough for that command above. I've only tested on macOS. On macOS Sierra 10.12.6, the system bash is 3.2.57(1)-release. So the shim can come in handy. You can also just update your bash with homebrew, build bash yourself, etc.
It uses this technique to set variables up one call stack.
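For illustration only, here is a rough sketch of what a bash 3.2 compatible stand-in could look like. This is not the shim mentioned above: the function name read_lines_into is hypothetical, it supports no options at all, and it uses eval to assign to the caller's array by name.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the shim mentioned above: fills the array named
# by $1 with the lines of stdin, using only features present in bash 3.2.
read_lines_into() {
    local _name=$1 _line _i=0
    # Only accept a plain identifier, since the name is passed to eval.
    case $_name in
        ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*) return 2 ;;
    esac
    eval "$_name=()"
    while IFS= read -r _line || [[ -n $_line ]]; do
        # Expands to e.g. myArray[$_i]=$_line, an assignment with no word splitting.
        eval "$_name[\$_i]=\$_line"
        _i=$((_i + 1))
    done
}

read_lines_into myArray < <(printf '%s\n' 'A Cat' 'A Dog' 'A Mouse')
echo "${myArray[1]}"   # A Dog
```

The target array must not be called _name, _line or _i, because those names are local to the function.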
Make sure to set the Internal Field Separator (IFS) variable to $'\n' so that it does not put each word into a new array entry.
#!/bin/bash
# move all 2020 - 2022 movies to /backup/movies
# put list into file 1 line per dir
# dirs are "movie name (year)/"
ls | grep -E '202[0-2]' > 2020_movies.txt
OLDIFS=${IFS}
IFS=$'\n' #fix separator
declare -a MOVIES # array for dir names
MOVIES=( $( cat "${1}" ) ) # load into array
for M in "${MOVIES[@]}" ; do
echo "[${M}]"
if [ -d "${M}" ] ; then # if dir name
mv -v "$M" /backup/movies/
fi
done
IFS=${OLDIFS} # restore standard separators
# not essential as IFS reverts when script ends
#END
Related
Just noticed something strange which I can't quite explain:
When I split my $PATH variable using read -a everything works fine
IFS=: read -r -a lines <<< "$PATH"
for line in "${lines[@]}"; do echo "$line"; done
But when I try to do the same using while ... read loop, only the first line is printed
while IFS=: read -r line; do echo "$line"; done <<< "$PATH"
You can make this work; switch from using IFS=: to using -d:, and append a : to the end of your input stream:
while IFS= read -r -d: line; do echo "$line"; done <<< "$PATH:"
The difference is that IFS is used to find boundaries between words, but read -r line reads into exactly one variable, line, so it's not looking for multiple words at all. By contrast, -d tells each invocation of read which character to stop at; by default that's a newline, but you can replace it with any other single character. (If that character isn't found, read exits with a nonzero status; that's why the standard/idiomatic while read loop idiom skips the last line of your file if it isn't correctly terminated by a newline, and why we use $PATH: as our input here).
If you ran IFS=: read -r first second rest, on the other hand, it would put your first PATH entry into $first, the second one into $second, and the remainder of the line into $rest; whereas with IFS=: read -r line, it's as if you only had a single item, $rest.
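A short sketch that puts the two behaviours side by side. The sample string is a made-up stand-in for $PATH so the result does not depend on your environment:

```shell
#!/usr/bin/env bash
# Sketch: IFS splits one line into variables, -d changes where read stops.
sample='/usr/bin:/bin:/usr/local/bin:/opt/bin'   # stand-in for $PATH

# One read call, one line: the last variable receives the unsplit remainder.
IFS=: read -r first second rest <<< "$sample"
echo "first=$first second=$second rest=$rest"

# With -d: each read call stops at the next colon, so the loop iterates
# once per entry. The appended colon terminates the last entry.
entries=()
while IFS= read -r -d: entry; do
    entries+=("$entry")
done <<< "$sample:"
echo "${#entries[@]} entries"   # 4 entries
```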
Your while loop processes 1 line, it is not a loop. So the complete path is stored in the field line.
If you had given more fields, the path would be divided among those fields (and the last field would get the remainder):
while IFS=: read -r line field2 field3 otherfields; do echo "$line"; done <<< "$PATH"
When you want to avoid an array, you can use
while read -r line; do echo "$line"; done <<< "${PATH//:/$'\n'}"
It works fine.
Splitting into an array gives an open-ended number of elements, so does what you expect.
Splitting into a single variable does the same thing, but when it runs out of supplied variable names into which to put the data, it stops splitting and puts the rest into the last one.
Try this:
$: IFS=: read -r a b c <<< "$PATH"
$: printf "[%s]\n" "$a" "$b" "$c"
You'll get the first PATH element in $a, the second in $b, and the rest ALL in $c.
Does that make it clearer?
c.f. this guide
Why does splitting my $PATH with read -r -a line work but not with while read -r line?
Because read -r line first reads one whole line, and only then is that line split on IFS. Because you provided only one variable to read, the whole line ends up in that one variable. You could, for example, split the line into its first element and the rest of the elements:
IFS=: read -r part1 rest_of_parts <<<"$line"
See the read(1p) manual page, in particular the part beginning "If there are fewer vars than fields". Note that IFS=: read -r -a lines <<< "$PATH" will still fail when PATH contains a newline, like so:
$ export PATH=/usr/bin # reset PATH to something short
$ cd /tmp/
$ mkdir temp$'\n'dir # create a directory with a newline in the name
$ ls -d tem*
'temp'$'\n''dir'/
$ cd temp$'\n'dir
$ printf "%s\n" '#!/bin/bash' 'echo hello world' > script.sh
$ chmod +x ./script.sh # add a script in that directory
$ export PATH="$PATH:$PWD" # add that directory to path
$ ./script.sh # yes. yes, it works
hello world
$ IFS=: read -r -a lines <<< "$PATH"
$ declare -p lines
declare -a lines=([0]="/usr/bin" [1]="/tmp/temp")
# ^^^^ newline and 'dir' is missing
# That is because `read` reads _one line_ and one line only
# _after_ reading that one line that _one line_ is split on IFS
# so any more lines are ignored.
You could use a bash extension to read -d that makes read not read the whole line, but up until a character (but I needed to ignore read exit status, dunno why):
$ while IFS= read -r -d':' line || [[ -n "$line" ]]; do declare -p line; done < <(printf "%s" "$PATH")
declare -- line="/usr/bin"
declare -- line="/tmp/temp
dir"
Note that <<< adds a trailing newline, so using that will result in the last element of PATH having a newline - as a workaround, in bash you may use process substitution < <(printf "%s" "$PATH").
The really safe solution, if you are using bash, is just to use mapfile/readarray (the -d option needs bash 4.4 or newer):
$ mapfile -d: -t lines < <(printf "%s" "$PATH")
$ declare -p lines
declare -a lines=([0]="/usr/bin" [1]=$'/tmp/temp\ndir')
I'm new to UNIX and have this really simple problem:
I have a text-file (input.txt) containing a string in each line. It looks like this:
House
Monkey
Car
And inside my shell script I need to read this input file line by line to get to a variable like this:
things="House,Monkey,Car"
I know this sounds easy, but I just couldn't find any simple solution for this. My closest attempt so far:
#!/bin/sh
things=""
addToString() {
things="${things},$1"
}
while read line; do addToString $line ;done <input.txt
echo $things
But this won't work. Based on my Google research I thought the while loop would create a new subshell, but I was wrong there (see the comment section). Nevertheless, the variable "things" was still not available in the echo later on. (I cannot just write the echo inside the while loop, because I need to work with that string later on.)
Could you please help me out here? Any help will be appreciated, thank you!
What you proposed works fine! I've only made two changes here: Adding missing quotes, and handling the empty-string case.
things=""
addToString() {
if [ -n "$things" ]; then
things="${things},$1"
else
things="$1"
fi
}
while read -r line; do addToString "$line"; done <input.txt
echo "$things"
If you were piping into while read, this would create a subshell, and that would eat your variables. You aren't piping -- you're doing a <input.txt redirection. No subshell, code works without changes.
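Here is a small sketch of that difference, assuming bash with default settings (in particular, without shopt -s lastpipe). The counter names are made up for the demo:

```shell
#!/usr/bin/env bash
# Sketch: a variable changed inside a piped loop is lost,
# while the same loop fed by a redirection keeps it.
tmpfile=$(mktemp)
printf '%s\n' House Monkey Car > "$tmpfile"

piped=0
cat "$tmpfile" | while read -r line; do
    piped=$((piped + 1))          # runs in a subshell
done
echo "after pipe: $piped"         # still 0 in the parent shell

redirected=0
while read -r line; do
    redirected=$((redirected + 1))   # runs in the current shell
done < "$tmpfile"
echo "after redirection: $redirected"   # 3

rm -f "$tmpfile"
```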
That said, there are better ways to read lists of items into shell variables. On any version of bash after 3.0:
IFS=$'\n' read -r -d '' -a things <input.txt # read into an array
printf -v things_str '%s,' "${things[@]}" # write array to a comma-separated string
echo "${things_str%,}" # print that string w/o trailing comma
...on bash 4, that first line can be:
readarray -t things <input.txt # read into an array
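Once the lines are in an array, another way to join them (a different technique from the printf -v call above) is to rely on "${things[*]}", which joins the elements with the first character of IFS. A minimal sketch, with the array filled in by hand as a stand-in for the contents of input.txt:

```shell
#!/usr/bin/env bash
# Sketch: join array elements with a comma via IFS and "${array[*]}".
things=(House Monkey Car)   # stand-in for the array read from input.txt

# The command substitution runs in a subshell, so the IFS change
# does not leak into the rest of the script.
things_str=$(IFS=,; printf '%s' "${things[*]}")
echo "$things_str"   # House,Monkey,Car
```

This only works for a single-character separator, since only the first character of IFS is used.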
This is not a shell solution, but the truth is that solutions in pure shell are often excessively long and verbose. So e.g. to do string processing it is better to use special tools that are part of the “default” Unix environment.
sed ':b;N;$!bb;s/\n/,/g' < input.txt
If you want to omit empty lines, then:
sed ':b;N;$!bb;s/\n\n*/,/g' < input.txt
Speaking about your solution, it should work, but you should really always use quotes where applicable. E.g. this works for me:
things=""
while read line; do things="$things,$line"; done < input.txt
echo "$things"
(Of course, there is an issue with this code, as it outputs a leading comma. If you want to skip empty lines, just add an if check.)
This might/might not work, depending on the shell you are using. On my Ubuntu 14.04/x64, it works with both bash and dash.
To make it more reliable and independent from the shell's behavior, you can try to put the whole block into a subshell explicitly, using the (). For example:
(
things=""
addToString() {
things="${things},$1"
}
while read -r line; do addToString "$line"; done
echo "$things"
) < input.txt
P.S. You can use something like this to avoid the initial comma. Without bash extensions (using short-circuit logical operators instead of the if for shortness):
test -z "$things" && things="$1" || things="${things},${1}"
Or, more compactly, with the ${parameter:+word} expansion (which is also POSIX, not a bash extension):
things="${things}${things:+,}${1}"
P.P.S. How I would have done it:
tr '\n' ',' < input.txt | sed 's!,$!\n!'
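To make the ${things:+,} trick from the P.S. concrete, here is a minimal sketch. The for loop over fixed words is just a stand-in for reading input.txt:

```shell
#!/usr/bin/env bash
# Sketch: ${things:+,} expands to a comma only when things is non-empty,
# so the first item is added without a leading comma.
things=""
for item in House Monkey Car; do
    things="${things}${things:+,}${item}"
done
echo "$things"   # House,Monkey,Car
```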
You can do this too:
#!/bin/bash
while read -r i
do
[[ $things == "" ]] && things="$i" || things="$things","$i"
done < <(grep . input.txt)
echo "$things"
Output:
House,Monkey,Car
N.B:
Used grep to deal with empty lines and with the possibility of a missing newline at the end of the file. (A plain while read loop will skip the last line if the file does not end with a newline.)
I am looking to assign each line of a file, read through stdin, to a specific variable that can be used to refer to that exact line, such as line1, line2.
example:
cat Testfile
Sample 1 -line1
Sample 2 -line2
Sample 3 -line3
The wrong way to do this, but exactly what you asked for, using discrete variables:
while IFS= read -r line; do
printf -v "line$(( ++i ))" '%s' "$line"
done <Testfile
echo "$line1" # to demonstrate use of the variables
echo "$line2"
The right way, using an array, for bash 4.0 or newer:
mapfile -t array <Testfile
echo "${array[0]}" # to demonstrate use of array values
echo "${array[1]}"
The right way, using an array, for bash 3.x:
declare -a array
while read -r; do
array+=( "$REPLY" )
done <Testfile
See BashFAQ #6 for more in-depth discussion.
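One way to see why the array is the better fit: as soon as the line number is itself held in a variable, the discrete-variable version needs indirect expansion, while the array just takes an index. A small sketch with made-up values:

```shell
#!/usr/bin/env bash
# Sketch: looking up "line number n" with discrete variables vs. an array.
line1='Sample 1 -line1'
line2='Sample 2 -line2'
array=('Sample 1 -line1' 'Sample 2 -line2')

n=2

# Discrete variables: build the variable name, then dereference it with ${!name}.
name="line$n"
from_vars=${!name}

# Array: index directly (arrays are zero-based, hence n-1).
from_array=${array[n-1]}

echo "$from_vars"
echo "$from_array"
```

Both echo commands print Sample 2 -line2.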
bash has a builtin function to do that. readarray reads lines from stdin (which can be your file) and assigns them to elements of an array:
declare -a lines
readarray -t lines <Testfile
Thereafter, you can refer to the lines by number. The first line is "${lines[0]}" and the second is "${lines[1]}", etc.
readarray requires bash version 4 (released in 2009) or better, and is available on many modern Linux systems. Debian stable, for example, currently provides bash 4.2 while RHEL6 provides 4.1. Mac OSX, though, is still using bash 3.x.
I did this script
#!/bin/bash
liste=`ls -l`
for i in $liste
do
echo $i
done
The problem is that I want the script to display each result line by line, but it displays it word by word:
I have :
my_name
etud
4096
Oct
8
10:13
and I want to have :
my_name etud 4096 Oct 8 10:13
The final aim of the script is to analyze each line ; it is the reason I want to be able to recover the entire line. Maybe the list is not the best solution but I don't know how to recover the lines.
To start, we'll assume that none of your filenames ever contain newlines:
ls -l | while IFS= read -r line; do
echo "$line"
# Do whatever else you want with $line
done
If your filenames could contain newlines, things get tricky. In this case, it's better (although slower) to use stat to retrieve the desired metadata from each file individually. Consult man stat for details about how your local variety of stat works, as it is unfortunately not very standardized.
for f in *; do
line=$(stat -c "%U %n %s %y" "$f") # One possibility
# Work with $line as if it came from ls -l
done
You can replace
echo $i
with
echo -n "$i "
echo -n outputs to console without newline.
Another way to do it, with a while loop and without a pipe:
#!/bin/bash
while read line
do
echo "line: $line"
done < <(ls -l)
First, I hope that you aren't genuinely using ls in your real code, but only using it as an example. If you want a list of files, ls is the wrong tool; see http://mywiki.wooledge.org/ParsingLs for details.
Second, modern versions of bash have a builtin called readarray.
Try this:
readarray -t my_array < <(ls -l)
for entry in "${my_array[@]}"; do
read -r -a pieces <<<"$entry"
printf '<%s> ' "${pieces[@]}"; echo
done
First, it creates an array (called my_array) with all the output from the command being run.
Then, for each line in that output, it creates an array called pieces, and emits each piece with angle brackets around it.
If you want to read a line at a time, rather than reading the entire file at once, see http://mywiki.wooledge.org/BashFAQ/001 ("How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?")
Combining the previous answers with the need to store the list of files in a variable, you can do this:
printf '%s\n' "$list" | while IFS= read -r lin
do
echo "$lin"
done
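If you need variables set inside the loop to survive after it, a here-string can feed the loop instead of a pipe. A minimal sketch, assuming bash; the contents of list and the counter are made up for the demo:

```shell
#!/usr/bin/env bash
# Sketch: iterate over the lines of a variable without a pipe,
# so the loop runs in the current shell and count keeps its value.
list=$'first line\nsecond line\nthird line'   # stand-in for the stored list

count=0
while IFS= read -r lin; do
    count=$((count + 1))
    echo "$lin"
done <<< "$list"

echo "$count lines"   # 3 lines
```

The here-string appends a newline of its own, so the last line is read even though $list does not end with one.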