Reading input character-by-character appears to be skipping newlines

I wanted to write a simple function to find all the newlines in an input string in Bash, and this is what I came up with:
find_newlines() {
while IFS= read -r -n 1 c; do
if [[ "$c" == $'\n' ]]; then
echo 'FOUND NEWLINE'
fi
done
}
I thought I did everything I needed to. I set IFS to nothing, I passed the -r flag, and I forced read to read a single character at a time with -n 1. However, to my dismay, the following did nothing at all:
test_data="this is a
test"
printf "$test_data" | find_newlines
I received no output whatsoever. I'm testing under Mac OS X, and I ran this on both Bash version 3.2.57 (the one Apple provides) and 4.3.33 (installed via Homebrew). Both gave the same result.
What do I need to do in order to include newlines when I loop?

Replace read's -n option with -N.
See: https://unix.stackexchange.com/a/27424/74329

The problem here is that read even in -n 1 mode is still reading delimited lines. So when it sees the newline it still considers that a line delimiter and removes it (and leaves you with an empty variable).
$ find_newlines() {
while IFS= read -r -n 1 c; do
declare -p c;
done
}
$ printf 'a\nb' | find_newlines
declare -- c="a"
declare -- c=""
declare -- c="b"
As indicated in the answer by Cyrus, with bash 4.1+ you can use read's -N flag to avoid this problem.
The -N option is documented as (from the Bash Reference Manual entry for read):
-N nchars
read returns after reading exactly nchars characters rather than waiting for a complete line of input, unless EOF is encountered or read times out. Delimiter characters encountered in the input are not treated specially and do not cause read to return until nchars characters are read.
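A minimal sketch of the question's function rewritten with -N, assuming bash 4.1 or later (so it will not run on the bash 3.2 that Apple ships). The name find_newlines comes from the question; everything else is illustrative.

```shell
#!/usr/bin/env bash
# Requires bash 4.1+ because of read -N.
find_newlines() {
    local c
    # -N 1 returns after exactly one character and does not treat
    # the newline as a delimiter, so it arrives in $c as data.
    while IFS= read -r -N 1 c; do
        if [[ $c == $'\n' ]]; then
            echo 'FOUND NEWLINE'
        fi
    done
}

# printf '%s' avoids using the data itself as a format string.
test_data="this is a
test"
printf '%s' "$test_data" | find_newlines
```

The loop ends when read hits end of input and returns a nonzero status.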
For bash versions before 4.1 (2.04+, it looks like) you can use read's -d flag to specify an alternate delimiter to work around this problem. Any delimiter that will not occur in your input will work. For many input streams the safest choice is the NUL (\0) character, which you specify to read as -d ''.
$ find_newlines() {
while IFS= read -d '' -r -n 1 c; do
declare -p c;
done
}
$ printf 'a\nb' | find_newlines
declare -- c="a"
declare -- c="
"
declare -- c="b"
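As a rough sketch of how the -d '' workaround could be turned into something reusable, the function below counts newline characters instead of printing each one. The name count_newlines is made up for this example, and it assumes the input contains no NUL bytes.

```shell
#!/usr/bin/env bash
# Count newline characters on standard input, one character at a time.
# count_newlines is a hypothetical helper name, not from the original post.
count_newlines() {
    local c count=0
    # -d '' makes NUL the delimiter, so a newline is returned as data;
    # -n 1 limits each read to a single character.
    while IFS= read -r -d '' -n 1 c; do
        if [[ $c == $'\n' ]]; then
            count=$((count + 1))
        fi
    done
    printf '%s\n' "$count"
}

printf 'this is a\ntest\n' | count_newlines
```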

Maybe you can use sed to replace the newline character with some unique symbol that you can trace later. This works regardless of the bash version.
sed ':a;N;$!ba;s/\n/#/g' <<< "bla
bla
bla"
bla#bla#bla

Related

Why does splitting my $PATH with `read -r -a line` work but not with `while read -r line`?

Just noticed something strange which I can't quite explain:
When I split my $PATH variable using read -a everything works fine
IFS=: read -r -a lines <<< "$PATH"
for line in "${lines[@]}"; do echo "$line"; done
But when I try to do the same using while ... read loop, only the first line is printed
while IFS=: read -r line; do echo "$line"; done <<< "$PATH"

You can make this work; switch from using IFS=: to using -d:, and append a : to the end of your input stream:
while IFS= read -r -d: line; do echo "$line"; done <<< "$PATH:"
The difference is that IFS is used to find boundaries between words, but read -r line reads into exactly one variable, line, so it's not looking for multiple words at all. By contrast, -d tells each invocation of read which character to stop at; by default that's a newline, but you can replace it with any other single character. (If that character isn't found, read exits with a nonzero status; that's why the idiomatic while read loop skips the last line of your file if it isn't terminated by a newline, and why we use $PATH: as our input here.)
If you ran IFS=: read -r first second rest, on the other hand, it would put your first PATH entry into $first, the second one into $second, and the remainder of the line into $rest; whereas with IFS=: read -r line, it's as if you only had a single item, $rest.
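A small sketch of the -d: approach wrapped in a function, assuming bash. The function name split_on_colon and the sample value are invented for illustration. It feeds the loop with printf instead of a here-string, so no extra newline is appended after the final delimiter.

```shell
#!/usr/bin/env bash
# split_on_colon is a hypothetical helper: print each colon-separated
# entry of its first argument on its own line, wrapped in brackets.
split_on_colon() {
    local entry
    # printf '%s:' appends the terminating ':' so that the last entry
    # is also followed by a delimiter and read returns success for it.
    while IFS= read -r -d: entry; do
        printf '[%s]\n' "$entry"
    done < <(printf '%s:' "$1")
}

# Sample value standing in for $PATH.
split_on_colon '/usr/local/bin:/usr/bin:/bin'
```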

Your while loop processes only one line, so it never really loops. The complete path is stored in the single field line.
If you had given more fields, the path would be divided among those fields (with the last field getting the remainder):
while IFS=: read -r line field2 field3 otherfields; do echo "$line"; done <<< "$PATH"
If you want to avoid an array, you can use
while read -r line; do echo "$line"; done <<< "${PATH//:/$'\n'}"

It works fine.
Splitting into an array gives an open-ended number of elements, so does what you expect.
Splitting into a single variable does the same thing, but when it runs out of supplied variable names into which to put the data, it stops splitting and puts the rest into the last one.
Try this:
$: IFS=: read -r a b c <<< "$PATH"
$: printf "[%s]\n" "$a" "$b" "$c"
You'll get the first PATH element in $a, the second in $b, and the rest ALL in $c.
Does that make it clearer?
c.f. this guide

Why does splitting my $PATH with read -r -a line work but not with while read -r line?
Because read -r line first reads one whole line, and only then splits that line on IFS. Since you gave read only one variable, the entire line ends up in that variable. You could, for example, split the line into its first element and the rest:
IFS=: read -r part1 rest_of_parts <<<"$line"
See the POSIX specification of read, in the part beginning "If there are fewer vars than fields". Note that IFS=: read -r -a lines <<< "$PATH" will still fail when PATH contains a newline, like so:
$ export PATH=/usr/bin # reset PATH to something short
$ cd /tmp/
$ mkdir temp$'\n'dir # create a directory with a newline in the name
$ ls -d tem*
'temp'$'\n''dir'/
$ cd temp$'\n'dir
$ printf "%s\n" '#!/bin/bash' 'echo hello world' > script.sh
$ chmod +x ./script.sh # add a script in that directory
$ export PATH="$PATH:$PWD" # add that directory to path
$ ./script.sh # yes. yes, it works
hello world
$ IFS=: read -r -a lines <<< "$PATH"
$ declare -p lines
declare -a lines=([0]="/usr/bin" [1]="/tmp/temp")
# ^^^^ newline and 'dir' is missing
# That is because `read` reads _one line_ and one line only
# _after_ reading that one line that _one line_ is split on IFS
# so any more lines are ignored.
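You could use read -d, a bash extension that makes read stop at a given character instead of at the end of the line. The || [[ -n "$line" ]] test is needed because read returns a nonzero status for the last element, where it reaches end of input before finding another delimiter: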
$ while IFS= read -r -d':' line || [[ -n "$line" ]]; do declare -p line; done < <(printf "%s" "$PATH")
declare -- line="/usr/bin"
declare -- line="/tmp/temp
dir"
Note that <<< adds a trailing newline, so using that will result in the last element of PATH having a newline - as a workaround, in bash you may use process substitution < <(printf "%s" "$PATH").
The really safe solution, if you are using bash 4.4 or later (where mapfile accepts -d), is to use mapfile/readarray:
$ mapfile -d: -t lines < <(printf "%s" "$PATH")
$ declare -p lines
declare -a lines=([0]="/usr/bin" [1]=$'/tmp/temp\ndir')

for loop to be used with readpst library [duplicate]

How do I iterate through each line of a text file with Bash?
With this script:
echo "Start!"
for p in (peptides.txt)
do
echo "${p}"
done
I get this output on the screen:
Start!
./runPep.sh: line 3: syntax error near unexpected token `('
./runPep.sh: line 3: `for p in (peptides.txt)'
(Later I want to do something more complicated with $p than just output to the screen.)
The environment variable SHELL is (from env):
SHELL=/bin/bash
/bin/bash --version output:
GNU bash, version 3.1.17(1)-release (x86_64-suse-linux-gnu)
Copyright (C) 2005 Free Software Foundation, Inc.
cat /proc/version output:
Linux version 2.6.18.2-34-default (geeko@buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
The file peptides.txt contains:
RKEKNVQ
IPKKLLQK
QYFHQLEKMNVK
IPKKLLQK
GDLSTALEVAIDCYEK
QYFHQLEKMNVKIPENIYR
RKEKNVQ
VLAKHGKLQDAIN
ILGFMK
LEDVALQILL

One way to do it is:
while read p; do
echo "$p"
done <peptides.txt
As pointed out in the comments, this has the side effects of trimming leading whitespace, interpreting backslash sequences, and skipping the last line if it's missing a terminating linefeed. If these are concerns, you can do:
while IFS="" read -r p || [ -n "$p" ]
do
printf '%s\n' "$p"
done < peptides.txt
Exceptionally, if the loop body may read from standard input, you can open the file using a different file descriptor:
while read -u 10 p; do
...
done 10<peptides.txt
Here, 10 is just an arbitrary number (different from 0, 1, 2).
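To illustrate why a separate file descriptor matters, here is a sketch in which the loop body runs a command that consumes standard input. The function name process_file and the use of fd 3 are arbitrary choices for this example; it assumes bash, since read -u is a bash extension.

```shell
#!/usr/bin/env bash
# process_file is a hypothetical helper: print every line of the file
# named by $1, even though the loop body reads standard input.
process_file() {
    local line
    while IFS= read -r -u 3 line; do
        # cat drains standard input. Because the file is attached to
        # fd 3 rather than fd 0, cat cannot swallow the remaining lines.
        cat >/dev/null
        printf 'got: %s\n' "$line"
    done 3< "$1"
}

tmp=$(mktemp)
printf 'one\ntwo\n' > "$tmp"
process_file "$tmp" </dev/null
rm -f "$tmp"
```

With a plain done < "$1" redirection, the cat in the body would read the rest of the file itself and the loop would stop after the first line.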

cat peptides.txt | while read line
do
# do something with $line here
done
and the one-liner variant:
cat peptides.txt | while read line; do something_with_$line_here; done
These options will skip the last line of the file if there is no trailing line feed.
You can avoid this by the following:
cat peptides.txt | while read line || [[ -n $line ]];
do
# do something with $line here
done

Option 1a: While loop: Single line at a time: Input redirection
#!/bin/bash
filename='peptides.txt'
echo Start
while read p; do
echo "$p"
done < "$filename"
Option 1b: While loop: Single line at a time:
Open the file, read from a file descriptor (in this case file descriptor #4).
#!/bin/bash
filename='peptides.txt'
exec 4<"$filename"
echo Start
while read -u4 p ; do
echo "$p"
done

This is no better than other answers, but is one more way to get the job done in a file without spaces (see comments). I find that I often need one-liners to dig through lists in text files without the extra step of using separate script files.
for word in $(cat peptides.txt); do echo $word; done
This format allows me to put it all in one command-line. Change the "echo $word" portion to whatever you want and you can issue multiple commands separated by semicolons. The following example uses the file's contents as arguments into two other scripts you may have written.
for word in $(cat peptides.txt); do cmd_a.sh $word; cmd_b.py $word; done
Or if you intend to use this like a stream editor (learn sed) you can dump the output to another file as follows.
for word in $(cat peptides.txt); do cmd_a.sh $word; cmd_b.py $word; done > outfile.txt
I've used these as written above because I have used text files where I've created them with one word per line. (See comments) If you have spaces that you don't want splitting your words/lines, it gets a little uglier, but the same command still works as follows:
OLDIFS=$IFS; IFS=$'\n'; for line in $(cat peptides.txt); do cmd_a.sh $line; cmd_b.py $line; done > outfile.txt; IFS=$OLDIFS
This just tells the shell to split on newlines only, not spaces, then returns the environment back to what it was previously. At this point, you may want to consider putting it all into a shell script rather than squeezing it all into a single line, though.
Best of luck!

A few more things not covered by other answers:
Reading from a delimited file
# ':' is the delimiter here, and there are three fields on each line in the file
# IFS set below is restricted to the context of `read`, it doesn't affect any other code
while IFS=: read -r field1 field2 field3; do
# process the fields
# if the line has less than three fields, the missing fields will be set to an empty string
# if the line has more than three fields, `field3` will get all the values, including the third field plus the delimiter(s)
done < input.txt
Reading from the output of another command, using process substitution
while read -r line; do
# process the line
done < <(command ...)
This approach is better than command ... | while read -r line; do ... because the while loop here runs in the current shell rather than a subshell as in the case of the latter. See the related post A variable modified inside a while loop is not remembered.
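The difference can be seen with a short experiment. This is a sketch that assumes bash with its default settings (in particular, the lastpipe option is off); the variable names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Count lines in a loop fed by a pipeline: the loop runs in a subshell,
# so the increments are lost when the pipeline finishes.
count_pipe=0
printf 'a\nb\nc\n' | while read -r line; do
    count_pipe=$((count_pipe + 1))
done

# Count lines in a loop fed by process substitution: the loop runs in
# the current shell, so the variable keeps its final value.
count_procsub=0
while read -r line; do
    count_procsub=$((count_procsub + 1))
done < <(printf 'a\nb\nc\n')

echo "pipe: $count_pipe, process substitution: $count_procsub"
```

Under those assumptions the pipeline version leaves count_pipe at 0, while count_procsub ends up at 3.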
Reading from a null delimited input, for example find ... -print0
while read -r -d '' line; do
# logic
# use a second 'read ... <<< "$line"' if we need to tokenize the line
done < <(find /path/to/dir -print0)
Related read: BashFAQ/020 - How can I find and safely handle file names containing newlines, spaces or both?
Reading from more than one file at a time
while read -u 3 -r line1 && read -u 4 -r line2; do
# process the lines
# note that the loop will end when we reach EOF on either of the files, because of the `&&`
done 3< input1.txt 4< input2.txt
Based on @chepner's answer here:
-u is a bash extension. For POSIX compatibility, each call would look something like read -r X <&3.
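A sketch of the two-file loop using the <&3 form mentioned above instead of -u. The function name pair_lines and the tab-separated output are assumptions made for the example.

```shell
#!/usr/bin/env bash
# pair_lines is a hypothetical helper: print line N of file $1 next to
# line N of file $2, separated by a tab.
pair_lines() {
    local left right
    # Each read takes its input from its own descriptor. The loop stops
    # as soon as either file runs out of lines, because of the &&.
    while IFS= read -r left <&3 && IFS= read -r right <&4; do
        printf '%s\t%s\n' "$left" "$right"
    done 3< "$1" 4< "$2"
}

f1=$(mktemp); f2=$(mktemp)
printf 'a\nb\nc\n' > "$f1"
printf '1\n2\n' > "$f2"
pair_lines "$f1" "$f2"
rm -f "$f1" "$f2"
```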
Reading a whole file into an array (Bash versions earlier than 4)
while read -r line; do
my_array+=("$line")
done < my_file
If the file ends with an incomplete line (newline missing at the end), then:
while read -r line || [[ $line ]]; do
my_array+=("$line")
done < my_file
Reading a whole file into an array (Bash versions 4.x and later)
readarray -t my_array < my_file
or
mapfile -t my_array < my_file
And then
for line in "${my_array[@]}"; do
# process the lines
done
More about the shell builtins read and readarray commands - GNU
More about IFS - Wikipedia
BashFAQ/001 - How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
Related posts:
Creating an array from a text file in Bash
What is the difference between thee approaches to reading a file that has just one line?
Bash while read loop extremely slow compared to cat, why?

Use a while loop, like this:
while IFS= read -r line; do
echo "$line"
done <file
Notes:
If you don't set the IFS properly, you will lose indentation.
You should almost always use the -r option with read.
Don't read lines with for

If you don't want read to drop a last line that lacks a trailing newline character, use:
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "$line"
done < "$1"
Then run the script with file name as parameter.

Suppose you have this file:
$ cat /tmp/test.txt
Line 1
Line 2 has leading space
Line 3 followed by blank line
Line 5 (follows a blank line) and has trailing space
Line 6 has no ending CR
There are four elements that will alter the meaning of the file output read by many Bash solutions:
The blank line 4;
Leading or trailing spaces on two lines;
Maintaining the meaning of individual lines (i.e., each line is a record);
The line 6 not terminated with a CR.
If you want the text file line by line including blank lines and terminating lines without CR, you must use a while loop and you must have an alternate test for the final line.
Here are the methods that may change the file (in comparison to what cat returns):
1) Lose the last line and leading and trailing spaces:
$ while read -r p; do printf "%s\n" "'$p'"; done </tmp/test.txt
'Line 1'
'Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space'
(If you do while IFS= read -r p; do printf "%s\n" "'$p'"; done </tmp/test.txt instead, you preserve the leading and trailing spaces but still lose the last line if it is not terminated with CR)
2) Using command substitution with cat reads the entire file in one gulp and loses the meaning of individual lines:
$ for p in "$(cat /tmp/test.txt)"; do printf "%s\n" "'$p'"; done
'Line 1
Line 2 has leading space
Line 3 followed by blank line
Line 5 (follows a blank line) and has trailing space
Line 6 has no ending CR'
(If you remove the " from $(cat /tmp/test.txt) you read the file word by word rather than one gulp. Also probably not what is intended...)
The most robust and simplest way to read a file line-by-line and preserve all spacing is:
$ while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done </tmp/test.txt
'Line 1'
' Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space '
'Line 6 has no ending CR'
If you want to strip leading and trailing spaces, remove the IFS= part:
$ while read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done </tmp/test.txt
'Line 1'
'Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space'
'Line 6 has no ending CR'
(A text file without a terminating \n, while fairly common, is considered broken under POSIX. If you can count on the trailing \n you do not need || [[ -n $line ]] in the while loop.)
More at the BASH FAQ

I like to use xargs instead of while. xargs is powerful and command line friendly
cat peptides.txt | xargs -I % sh -c "echo %"
With xargs, you can also add verbosity with -t and validation with -p

This might be the simplest answer, and maybe it doesn't work in all cases, but it is working great for me:
while read line;do echo "$line";done<peptides.txt
If you need to enclose each line in double quotes (for lines with spaces):
while read line;do echo \"$line\";done<peptides.txt
Ahhh, this is pretty much the same as the answer that got upvoted most, but it's all on one line.

#!/bin/bash
#
# Change the file name from "test" to desired input file
# (The comments in bash are prefixed with #'s)
for x in $(cat test.txt)
do
echo $x
done

Here is my real-life example of how to loop over the lines of another program's output, check for substrings, drop double quotes from a variable, and use that variable outside of the loop. I guess quite a few people ask these questions sooner or later.
##Parse FPS from first video stream, drop quotes from fps variable
## streams.stream.0.codec_type="video"
## streams.stream.0.r_frame_rate="24000/1001"
## streams.stream.0.avg_frame_rate="24000/1001"
FPS=unknown
while read -r line; do
if [[ $FPS == "unknown" ]] && [[ $line == *".codec_type=\"video\""* ]]; then
echo ParseFPS $line
FPS=parse
fi
if [[ $FPS == "parse" ]] && [[ $line == *".r_frame_rate="* ]]; then
echo ParseFPS $line
FPS=${line##*=}
FPS="${FPS%\"}"
FPS="${FPS#\"}"
fi
done <<< "$(ffprobe -v quiet -print_format flat -show_format -show_streams -i "$input")"
if [ "$FPS" == "unknown" ] || [ "$FPS" == "parse" ]; then
echo ParseFPS Unknown frame rate
fi
echo Found $FPS
Declaring the variable outside of the loop, setting its value inside, and using it after the loop requires the done <<< "$(...)" syntax, so that the loop runs in the context of the current shell. The quotes around the command substitution keep the newlines of the output stream.
The loop matches substrings, then reads the name=value pair, takes the part to the right of the last = character, and drops the surrounding quotes, leaving a clean value to be used elsewhere.
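The quote-stripping step can be tried in isolation. This sketch uses one of the sample lines from the comments above; the variable names line and value are only for illustration.

```shell
#!/usr/bin/env bash
# Sample input line, taken from the ffprobe output format shown above.
line='streams.stream.0.r_frame_rate="24000/1001"'

value=${line##*=}   # remove the longest prefix ending in '=': keeps "24000/1001" with quotes
value=${value%\"}   # remove one trailing double quote, if present
value=${value#\"}   # remove one leading double quote, if present

printf '%s\n' "$value"
```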

This is coming rather late, but I am adding the answer with the thought that it may help someone. It may not be the best way. The head command can be used with the -n argument to read n lines from the start of a file, and likewise the tail command can be used to read from the bottom. To fetch the nth line of a file, we take the first n lines with head and pipe them to tail, which keeps only the last line.
# Redirecting the file into wc prints the count without the file name
TOTAL_LINES=$(wc -l < "$USER_FILE")
echo "$TOTAL_LINES" # To validate total lines in the file
for (( i=1 ; i <= TOTAL_LINES; i++ ))
do
LINE=$(head -n "$i" "$USER_FILE" | tail -n 1)
echo "$LINE"
done
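Note that this loop rereads the file from the top for every line, so it gets slow on large files. If the goal is only to fetch a single line by number, one sed call is enough. The sketch below shows that variant; the function name nth_line is made up, and this swaps in sed for the head and tail pipeline rather than reproducing the method above.

```shell
#!/usr/bin/env bash
# nth_line is a hypothetical helper.
# $1 = line number (a positive integer), $2 = file name.
nth_line() {
    # -n suppresses automatic printing; on line $1, p prints the line
    # and q quits so the rest of the file is not read.
    sed -n "${1}{p;q;}" "$2"
}

tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"
nth_line 2 "$tmp"
rm -f "$tmp"
```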

@Peter: This could work out for you-
echo "Start!";for p in $(cat ./pep); do
echo $p
done
This would return the output-
Start!
RKEKNVQ
IPKKLLQK
QYFHQLEKMNVK
IPKKLLQK
GDLSTALEVAIDCYEK
QYFHQLEKMNVKIPENIYR
RKEKNVQ
VLAKHGKLQDAIN
ILGFMK
LEDVALQILL

Another way to go about using xargs
<file_name xargs -I {} echo {}
echo can be replaced with other commands or piped further.

for p in `cat peptides.txt`
do
echo "${p}"
done

bash read mystery: `read -d'' -s -n 1` eats hyphens

I'm trying to use the read builtin in bash to read one character at a time. This actually works flawlessly when I use the -N 1 argument to read, but I had some OSX users report to me that their bash does not have that option.
So now I'm using something along the lines of:
$ while IFS= read -r -d'' -s -n 1 char; do echo -n "${char}"; done < filename
This echoes back every character in filename one at a time except, mysteriously, hyphens (-). E.g. if I have
$ cat blah
uh-oh
The result is:
$ while IFS= read -r -d'' -s -n 1 char; do echo -n "${char}"; done < blah
uhoh
Nothing in the documentation says anything that would indicate this. If I replace ${char} in the echo with ${#char} it prints 0 where it should have read the hyphen. It just gets completely eaten.
If I drop the -d'' it instead eats newlines, but does not eat hyphen, so that at least makes sense since newline is the default delimiter. It almost seems like a bug that -d'' is treating hyphen as a delimiter.
FWIW I have
$ bash --version
GNU bash, version 4.3.42(4)-release (x86_64-unknown-cygwin)
but this was first reported to me by an OSX user.

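You did not actually set the delimiter option for read correctly.
Notice the extra space:
while IFS= read -r -d '' -s -n 1 char; do echo -n "${char}"; done < filename
This works fine.
In your code, the shell strips the empty quotes from -d'' and passes a bare -d, so read took the next word, -s, as the delimiter argument. Only its first character is used, so the delimiter was set to - and every hyphen was consumed.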
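The way the shell splits those two spellings into arguments can be checked directly. In this sketch, show_args is a throwaway helper invented for the demonstration; it prints each argument it receives inside angle brackets.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical helper that makes argument boundaries visible.
show_args() {
    printf '<%s>' "$@"
    printf '\n'
}

# Without a space, the empty quotes vanish: two arguments, -d and -s.
show_args -d'' -s

# With a space, the empty string survives as its own argument.
show_args -d '' -s
```

The first call shows that a command given -d'' -s receives the same words as -d -s, so the option that expects a value picks up -s.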

echo stdin line by line into file [duplicate]

How do I iterate through each line of a text file with Bash?
With this script:
echo "Start!"
for p in (peptides.txt)
do
echo "${p}"
done
I get this output on the screen:
Start!
./runPep.sh: line 3: syntax error near unexpected token `('
./runPep.sh: line 3: `for p in (peptides.txt)'
(Later I want to do something more complicated with $p than just output to the screen.)
The environment variable SHELL is (from env):
SHELL=/bin/bash
/bin/bash --version output:
GNU bash, version 3.1.17(1)-release (x86_64-suse-linux-gnu)
Copyright (C) 2005 Free Software Foundation, Inc.
cat /proc/version output:
Linux version 2.6.18.2-34-default (geeko#buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
The file peptides.txt contains:
RKEKNVQ
IPKKLLQK
QYFHQLEKMNVK
IPKKLLQK
GDLSTALEVAIDCYEK
QYFHQLEKMNVKIPENIYR
RKEKNVQ
VLAKHGKLQDAIN
ILGFMK
LEDVALQILL

One way to do it is:
while read p; do
echo "$p"
done <peptides.txt
As pointed out in the comments, this has the side effects of trimming leading whitespace, interpreting backslash sequences, and skipping the last line if it's missing a terminating linefeed. If these are concerns, you can do:
while IFS="" read -r p || [ -n "$p" ]
do
printf '%s\n' "$p"
done < peptides.txt
Exceptionally, if the loop body may read from standard input, you can open the file using a different file descriptor:
while read -u 10 p; do
...
done 10<peptides.txt
Here, 10 is just an arbitrary number (different from 0, 1, 2).

cat peptides.txt | while read line
do
# do something with $line here
done
and the one-liner variant:
cat peptides.txt | while read line; do something_with_$line_here; done
These options will skip the last line of the file if there is no trailing line feed.
You can avoid this by the following:
cat peptides.txt | while read line || [[ -n $line ]];
do
# do something with $line here
done

Option 1a: While loop: Single line at a time: Input redirection
#!/bin/bash
filename='peptides.txt'
echo Start
while read p; do
echo "$p"
done < "$filename"
Option 1b: While loop: Single line at a time:
Open the file, read from a file descriptor (in this case file descriptor #4).
#!/bin/bash
filename='peptides.txt'
exec 4<"$filename"
echo Start
while read -u4 p ; do
echo "$p"
done

This is no better than other answers, but is one more way to get the job done in a file without spaces (see comments). I find that I often need one-liners to dig through lists in text files without the extra step of using separate script files.
for word in $(cat peptides.txt); do echo $word; done
This format allows me to put it all in one command-line. Change the "echo $word" portion to whatever you want and you can issue multiple commands separated by semicolons. The following example uses the file's contents as arguments into two other scripts you may have written.
for word in $(cat peptides.txt); do cmd_a.sh $word; cmd_b.py $word; done
Or if you intend to use this like a stream editor (learn sed) you can dump the output to another file as follows.
for word in $(cat peptides.txt); do cmd_a.sh $word; cmd_b.py $word; done > outfile.txt
I've used these as written above because I have used text files where I've created them with one word per line. (See comments) If you have spaces that you don't want splitting your words/lines, it gets a little uglier, but the same command still works as follows:
OLDIFS=$IFS; IFS=$'\n'; for line in $(cat peptides.txt); do cmd_a.sh $line; cmd_b.py $line; done > outfile.txt; IFS=$OLDIFS
This just tells the shell to split on newlines only, not spaces, then returns the environment back to what it was previously. At this point, you may want to consider putting it all into a shell script rather than squeezing it all into a single line, though.
Best of luck!

A few more things not covered by other answers:
Reading from a delimited file
# ':' is the delimiter here, and there are three fields on each line in the file
# IFS set below is restricted to the context of `read`, it doesn't affect any other code
while IFS=: read -r field1 field2 field3; do
# process the fields
# if the line has less than three fields, the missing fields will be set to an empty string
# if the line has more than three fields, `field3` will get all the values, including the third field plus the delimiter(s)
done < input.txt
Reading from the output of another command, using process substitution
while read -r line; do
# process the line
done < <(command ...)
This approach is better than command ... | while read -r line; do ... because the while loop here runs in the current shell rather than a subshell as in the case of the latter. See the related post A variable modified inside a while loop is not remembered.
Reading from a null delimited input, for example find ... -print0
while read -r -d '' line; do
# logic
# use a second 'read ... <<< "$line"' if we need to tokenize the line
done < <(find /path/to/dir -print0)
Related read: BashFAQ/020 - How can I find and safely handle file names containing newlines, spaces or both?
Reading from more than one file at a time
while read -u 3 -r line1 && read -u 4 -r line2; do
# process the lines
# note that the loop will end when we reach EOF on either of the files, because of the `&&`
done 3< input1.txt 4< input2.txt
Based on #chepner's answer here:
-u is a bash extension. For POSIX compatibility, each call would look something like read -r X <&3.
Reading a whole file into an array (Bash versions earlier to 4)
while read -r line; do
my_array+=("$line")
done < my_file
If the file ends with an incomplete line (newline missing at the end), then:
while read -r line || [[ $line ]]; do
my_array+=("$line")
done < my_file
Reading a whole file into an array (Bash versions 4x and later)
readarray -t my_array < my_file
or
mapfile -t my_array < my_file
And then
for line in "${my_array[#]}"; do
# process the lines
done
More about the shell builtins read and readarray commands - GNU
More about IFS - Wikipedia
BashFAQ/001 - How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
Related posts:
Creating an array from a text file in Bash
What is the difference between thee approaches to reading a file that has just one line?
Bash while read loop extremely slow compared to cat, why?

Use a while loop, like this:
while IFS= read -r line; do
echo "$line"
done <file
Notes:
If you don't set the IFS properly, you will lose indentation.
You should almost always use the -r option with read.
Don't read lines with for

If you don't want your read to be broken by newline character, use -
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "$line"
done < "$1"
Then run the script with file name as parameter.

Suppose you have this file:
$ cat /tmp/test.txt
Line 1
Line 2 has leading space
Line 3 followed by blank line
Line 5 (follows a blank line) and has trailing space
Line 6 has no ending CR
There are four elements that will alter the meaning of the file output read by many Bash solutions:
The blank line 4;
Leading or trailing spaces on two lines;
Maintaining the meaning of individual lines (i.e., each line is a record);
The line 6 not terminated with a CR.
If you want the text file line by line including blank lines and terminating lines without CR, you must use a while loop and you must have an alternate test for the final line.
Here are the methods that may change the file (in comparison to what cat returns):
1) Lose the last line and leading and trailing spaces:
$ while read -r p; do printf "%s\n" "'$p'"; done </tmp/test.txt
'Line 1'
'Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space'
(If you do while IFS= read -r p; do printf "%s\n" "'$p'"; done </tmp/test.txt instead, you preserve the leading and trailing spaces but still lose the last line if it is not terminated with CR)
2) Using process substitution with cat will reads the entire file in one gulp and loses the meaning of individual lines:
$ for p in "$(cat /tmp/test.txt)"; do printf "%s\n" "'$p'"; done
'Line 1
Line 2 has leading space
Line 3 followed by blank line
Line 5 (follows a blank line) and has trailing space
Line 6 has no ending CR'
(If you remove the " from $(cat /tmp/test.txt) you read the file word by word rather than one gulp. Also probably not what is intended...)
The most robust and simplest way to read a file line-by-line and preserve all spacing is:
$ while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done </tmp/test.txt
'Line 1'
' Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space '
'Line 6 has no ending CR'
If you want to strip leading and trading spaces, remove the IFS= part:
$ while read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done </tmp/test.txt
'Line 1'
'Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space'
'Line 6 has no ending CR'
(A text file without a terminating \n, while fairly common, is considered broken under POSIX. If you can count on the trailing \n you do not need || [[ -n $line ]] in the while loop.)
More at the BASH FAQ

I like to use xargs instead of while. xargs is powerful and command line friendly
cat peptides.txt | xargs -I % sh -c "echo %"
With xargs, you can also add verbosity with -t and validation with -p

This might be the simplest answer and maybe it don't work in all cases, but it is working great for me:
while read line;do echo "$line";done<peptides.txt
if you need to enclose in parenthesis for spaces:
while read line;do echo \"$line\";done<peptides.txt
Ahhh this is pretty much the same as the answer that got upvoted most, but its all on one line.

#!/bin/bash
#
# Change the file name from "test" to desired input file
# (The comments in bash are prefixed with #'s)
for x in $(cat test.txt)
do
echo $x
done

Here is my real life example how to loop lines of another program output, check for substrings, drop double quotes from variable, use that variable outside of the loop. I guess quite many is asking these questions sooner or later.
##Parse FPS from first video stream, drop quotes from fps variable
## streams.stream.0.codec_type="video"
## streams.stream.0.r_frame_rate="24000/1001"
## streams.stream.0.avg_frame_rate="24000/1001"
FPS=unknown
while read -r line; do
if [[ $FPS == "unknown" ]] && [[ $line == *".codec_type=\"video\""* ]]; then
echo ParseFPS $line
FPS=parse
fi
if [[ $FPS == "parse" ]] && [[ $line == *".r_frame_rate="* ]]; then
echo ParseFPS $line
FPS=${line##*=}
FPS="${FPS%\"}"
FPS="${FPS#\"}"
fi
done <<< "$(ffprobe -v quiet -print_format flat -show_format -show_streams -i "$input")"
if [ "$FPS" == "unknown" ] || [ "$FPS" == "parse" ]; then
echo ParseFPS Unknown frame rate
fi
echo Found $FPS
Declare variable outside of the loop, set value and use it outside of loop requires done <<< "$(...)" syntax. Application need to be run within a context of current console. Quotes around the command keeps newlines of output stream.
Loop match for substrings then reads name=value pair, splits right-side part of last = character, drops first quote, drops last quote, we have a clean value to be used elsewhere.

This is coming rather very late, but with the thought that it may help someone, i am adding the answer. Also this may not be the best way. head command can be used with -n argument to read n lines from start of file and likewise tail command can be used to read from bottom. Now, to fetch nth line from file, we head n lines, pipe the data to tail only 1 line from the piped data.
# Redirecting the file into wc prints only the count, without the file name
TOTAL_LINES=$(wc -l < "$USER_FILE")
echo "$TOTAL_LINES" # To validate total lines in the file
for (( i=1; i <= TOTAL_LINES; i++ ))
do
LINE=$(head -n "$i" "$USER_FILE" | tail -n 1)
echo "$LINE"
done
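The head/tail pair re-reads the file from the top for every line. If only a single line is needed, sed can print it directly. A minimal sketch follows; nth_line is a hypothetical helper name, not something from the answer above.

```shell
#!/bin/bash
# nth_line is a hypothetical helper: print line number $1 of file $2.
nth_line() {
  local n=$1 file=$2
  sed -n "${n}p" "$file"
}

# Hypothetical sample data.
sample=$(mktemp)
printf 'first\nsecond\nthird\n' > "$sample"

second=$(nth_line 2 "$sample")
echo "$second"
rm -f "$sample"
```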

@Peter: This could work out for you:
echo "Start!";for p in $(cat ./pep); do
echo $p
done
This would return the output:
Start!
RKEKNVQ
IPKKLLQK
QYFHQLEKMNVK
IPKKLLQK
GDLSTALEVAIDCYEK
QYFHQLEKMNVKIPENIYR
RKEKNVQ
VLAKHGKLQDAIN
ILGFMK
LEDVALQILL

Another way to go about it, using xargs:
xargs -I {} echo {} < file_name
echo can be replaced with other commands or piped further.

for p in `cat peptides.txt`
do
echo "${p}"
done

Read an input file in shell script and store its lines in a variable

I'm new to UNIX and have this really simple problem:
I have a text-file (input.txt) containing a string in each line. It looks like this:
House
Monkey
Car
And inside my shell script I need to read this input file line by line to get to a variable like this:
things="House,Monkey,Car"
I know this sounds easy, but I just couldn't find any simple solution for this. My closest attempt so far:
#!/bin/sh
things=""
addToString() {
things="${things},$1"
}
while read line; do addToString $line ;done <input.txt
echo $things
But this doesn't work. Based on my Google research, I thought the while loop would create a new subshell, but I was wrong there (see the comment section). Nevertheless, the variable "things" was still not available in the echo later on. (I cannot just put the echo inside the while loop, because I need to work with that string later on.)
Could you please help me out here? Any help will be appreciated, thank you!

What you proposed works fine! I've only made two changes here: Adding missing quotes, and handling the empty-string case.
things=""
addToString() {
if [ -n "$things" ]; then
things="${things},$1"
else
things="$1"
fi
}
while read -r line; do addToString "$line"; done <input.txt
echo "$things"
If you were piping into while read, this would create a subshell, and that would eat your variables. You aren't piping -- you're doing a <input.txt redirection. No subshell, code works without changes.
That said, there are better ways to read lists of items into shell variables. On any version of bash after 3.0:
IFS=$'\n' read -r -d '' -a things <input.txt # read into an array
printf -v things_str '%s,' "${things[@]}" # write array to a comma-separated string
echo "${things_str%,}" # print that string w/o trailing comma
...on bash 4, that first line can be:
readarray -t things <input.txt # read into an array
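The difference between piping into the loop and redirecting into it can be checked with a small sketch. It assumes bash with default settings (the lastpipe option is off), and the counter names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample data matching the question's input.txt.
sample=$(mktemp)
printf 'House\nMonkey\nCar\n' > "$sample"

# Pipeline: the loop runs in a subshell, so its changes are lost afterwards.
piped=0
cat "$sample" | while read -r line; do piped=$((piped + 1)); done

# Redirection: the loop runs in the current shell, so the count survives.
redirected=0
while read -r line; do redirected=$((redirected + 1)); done < "$sample"

echo "piped=$piped redirected=$redirected"
rm -f "$sample"
```

Under those assumptions this prints piped=0 redirected=3.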

This is not a shell solution, but the truth is that solutions in pure shell are often excessively long and verbose. So e.g. to do string processing it is better to use special tools that are part of the “default” Unix environment.
sed ':b;N;$!bb;s/\n/,/g' < input.txt
If you want to omit empty lines, then:
sed ':b;N;$!bb;s/\n\n*/,/g' < input.txt
Speaking about your solution, it should work, but you should really always use quotes where applicable. E.g. this works for me:
things=""
while read line; do things="$things,$line"; done < input.txt
echo "$things"
(Of course, there is an issue with this code, as it outputs a leading comma. If you want to skip empty lines, just add an if check.)
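In the same spirit of using standard tools, paste can also join lines with a delimiter. This is an alternative to the sed commands above, not what the answer itself uses. A minimal sketch with made-up sample data:

```shell
#!/bin/bash
# Hypothetical sample data matching the question's input.txt.
sample=$(mktemp)
printf 'House\nMonkey\nCar\n' > "$sample"

# -s joins all lines of the file into one; -d, uses a comma as the separator.
things=$(paste -s -d, "$sample")
echo "$things"
rm -f "$sample"
```

This prints House,Monkey,Car with no leading or trailing comma.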

This may or may not work, depending on the shell you are using. On my Ubuntu 14.04/x64, it works with both bash and dash.
To make it more reliable and independent of the shell's behavior, you can put the whole block into an explicit subshell using (). For example:
(
things=""
addToString() {
things="${things},$1"
}
while read line; do addToString $line ;done
echo $things
) < input.txt
P.S. You can use something like this to avoid the initial comma. Without bash extensions (using short-circuit logical operators instead of the if for shortness):
test -z "$things" && things="$1" || things="${things},${1}"
Or, more compactly, with the ${parameter:+word} expansion (which is part of POSIX sh rather than a bash-only extension):
things="${things}${things:+,}${1}"
P.P.S. How I would have done it:
tr '\n' ',' < input.txt | sed 's!,$!\n!'
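The ${things:+,} form expands to a comma only when things is already non-empty, which is what suppresses the leading comma. A minimal sketch with hard-coded items in place of the file:

```shell
#!/bin/bash
things=""
for item in House Monkey Car; do
  # ${things:+,} is empty on the first pass and "," on every later pass.
  things="${things}${things:+,}${item}"
done
echo "$things"
```

This prints House,Monkey,Car.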

You can do this too:
#!/bin/bash
while read -r i
do
[[ $things == "" ]] && things="$i" || things="$things","$i"
done < <(grep . input.txt)
echo "$things"
Output:
House,Monkey,Car
N.B.:
grep is used here to handle empty lines and the possibility of a missing newline at the end of the file. (A plain while read loop fails to process the last line if the file does not end with a newline.)
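The same two cases can also be handled without grep, by testing the variable after a failed read. A minimal sketch, assuming a made-up file with one empty line and no trailing newline:

```shell
#!/bin/bash
# Hypothetical sample: contains an empty line and lacks a final newline.
sample=$(mktemp)
printf 'House\n\nMonkey\nCar' > "$sample"

things=""
# read fails on the unterminated last line but still fills $line,
# so the extra test lets the loop body run one more time.
while IFS= read -r line || [ -n "$line" ]; do
  [ -n "$line" ] || continue   # skip empty lines
  things="${things}${things:+,}${line}"
done < "$sample"

echo "$things"
rm -f "$sample"
```

This prints House,Monkey,Car.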


Reading input character-by-character appears to be skipping newlines - macos


Maybe you can use sed to replace the newline character with some unique symbol that you can trace later. This works regardless of the bash version.
sed ':a;N;$!ba;s/\n/#/g' <<< "bla
bla
bla"
bla#bla#bla
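For comparison, newlines can also be detected in bash itself by giving read a delimiter that does not occur in the input, so that a newline is delivered as an ordinary character. A minimal sketch, assuming the input contains no NUL bytes; count_newlines is a hypothetical function name.

```shell
#!/bin/bash
# count_newlines is a hypothetical helper: count newlines on standard input.
count_newlines() {
  local c count=0
  # -d '' makes NUL the delimiter, so a newline is read like any other
  # character; -n 1 returns after each single character.
  while IFS= read -r -d '' -n 1 c; do
    if [[ $c == $'\n' ]]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

found=$(printf 'this is a\ntest' | count_newlines)
echo "$found"
```

For the two-line test string from the question this prints 1.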
