How to read a file into a variable in shell?

I want to read a file and save it in a variable, but I need to keep the variable and not just print out the file.
How can I do this? I have written this script, but it isn't quite what I needed:
#!/bin/sh
while read LINE
do
echo $LINE
done <$1
echo 11111-----------
echo $LINE
In my script, I can give the file name as a parameter, so, if the file contains "aaaa", for example, it would print out this:
aaaa
11111-----
But this just prints out the file onto the screen, and I want to save it into a variable!
Is there an easy way to do this?

In cross-platform, lowest-common-denominator sh you use:
#!/bin/sh
value=`cat config.txt`
echo "$value"
In bash or zsh, to read a whole file into a variable without invoking cat:
#!/bin/bash
value=$(<config.txt)
echo "$value"
Invoking cat in bash or zsh to slurp a file would be considered a Useless Use of Cat.
Note that it is not necessary to quote the command substitution to preserve newlines.
See: Bash Hacker's Wiki - Command substitution - Specialities.
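A quick demonstration of where quoting does and does not matter (a minimal sketch, assuming a throwaway config.txt):
printf 'line one\nline two\n' > config.txt
value=$(<config.txt)   # no quotes needed: the right side of an assignment is not word-split
echo "$value"          # prints both lines; the embedded newline survives
echo $value            # unquoted: word splitting collapses the newline to a space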

If you want to read the whole file into a variable:
#!/bin/bash
value=`cat sources.xml`
echo "$value"
If you want to read it line-by-line:
while read -r line; do
echo "$line"
done < file.txt

Two important pitfalls,
which were ignored by other answers so far:
Trailing newline removal from command substitution
NUL character removal
Trailing newline removal from command substitution
This is a problem for the:
value="$(cat config.txt)"
type solutions, but not for read-based solutions.
Command substitution removes trailing newlines:
S="$(printf "a\n")"
printf "$S" | od -tx1
Outputs:
0000000 61
0000001
This breaks the naive method of reading from files:
FILE="$(mktemp)"
printf "a\n\n" > "$FILE"
S="$(<"$FILE")"
printf "$S" | od -tx1
rm "$FILE"
POSIX workaround: append an extra char to the command substitution and remove it later:
S="$(cat "$FILE"; printf a)"
S="${S%a}"
printf "$S" | od -tx1
Outputs:
0000000 61 0a 0a
0000003
Almost POSIX workaround: ASCII encode. See below.
NUL character removal
There is no sane Bash way to store NUL characters in variables.
This affects both expansion and read solutions, and I don't know any good workaround for it.
Example:
printf "a\0b" | od -tx1
S="$(printf "a\0b")"
printf "$S" | od -tx1
Outputs:
0000000 61 00 62
0000003
0000000 61 62
0000002
Ha, our NUL is gone!
Workarounds:
ASCII encode. See below.
use the bash extension $'' literals:
S=$'a\0b'
printf "$S" | od -tx1
Only works for literals, so not useful for reading from files.
Workaround for the pitfalls
Store a base64-encoded version of the file (uuencode -m) in the variable, and decode it before every use:
FILE="$(mktemp)"
printf "a\0\n" > "$FILE"
S="$(uuencode -m "$FILE" /dev/stdout)"
uudecode -o /dev/stdout <(printf "$S") | od -tx1
rm "$FILE"
Output:
0000000 61 00 0a
0000003
uuencode and uudecode are POSIX 7 but not installed by default on Ubuntu 12.04 (sharutils package)... I don't see a POSIX 7 alternative to the bash process substitution extension <() except writing to another file...
Of course, this is slow and inconvenient, so I guess the real answer is: don't use Bash if the input file may contain NUL characters.

This works for me (<file_path> is a placeholder for the actual path):
v=$(cat <file_path>)
echo "$v"

With bash you may use read like this:
#!/usr/bin/env bash
{ IFS= read -rd '' value <config.txt;} 2>/dev/null
printf '%s' "$value"
Notice that:
The last newline is preserved.
Stderr is silenced by redirecting the whole command block to /dev/null, but the return status of the read command is preserved, in case you need to handle read error conditions.
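To verify that the trailing newline really is kept (a quick check, assuming a throwaway config.txt):
printf 'a\n' > config.txt
{ IFS= read -rd '' value <config.txt;} 2>/dev/null
printf '%s' "$value" | od -c   # shows "a \n": the final newline survives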

As Ciro Santilli notes, using command substitution will drop trailing newlines. Their workaround of appending a trailing character is great, but after using it for quite some time I decided I needed a solution that didn't use command substitution at all.
My approach now uses read along with the printf builtin's -v flag in order to read the contents of stdin directly into a variable.
# Reads stdin into a variable, accounting for trailing newlines. Avoids
# needing a subshell or command substitution.
# Note that NUL bytes are still unsupported, as Bash variables don't allow NULs.
# See https://stackoverflow.com/a/22607352/113632
read_input() {
# Use unusual variable names to avoid colliding with a variable name
# the user might pass in (notably "contents")
: "${1:?Must provide a variable to read into}"
if [[ "$1" == '_line' || "$1" == '_contents' ]]; then
echo "Cannot store contents to $1, use a different name." >&2
return 1
fi
local _line _contents=()
while IFS='' read -r _line; do
_contents+=("$_line"$'\n')
done
# include $_line once more to capture any content after the last newline
printf -v "$1" '%s' "${_contents[@]}" "$_line"
}
This supports inputs with or without trailing newlines.
Example usage:
$ read_input file_contents < /tmp/file
# $file_contents now contains the contents of /tmp/file

All the given solutions are quite slow, so:
mapfile -d '' content </etc/passwd # Read file into an array
content="${content[*]%$'\n'}" # Remove trailing newline
It would be nice to optimise it even more, but I can't think of much.
Update: Found a faster way
read -rd '' content </etc/passwd
This will return an exit code of 1, so if you need it to always be 0:
read -rd '' content </etc/passwd || :

I use:
NGINX_PID=`cat -s "/sdcard/server/nginx/logs/nginx.pid" 2>/dev/null`
if [ "$NGINX_PID" = "" ]; then
echo "..."
exit
fi
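A slightly hardened variant of the same pattern (a sketch; the kill -0 probe merely checks that the process is alive, it sends no signal):
NGINX_PID=`cat -s "/sdcard/server/nginx/logs/nginx.pid" 2>/dev/null`
if [ -z "$NGINX_PID" ] || ! kill -0 "$NGINX_PID" 2>/dev/null; then
echo "nginx is not running"
exit 1
fi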

You can access one line at a time with a for loop (note that this splits on whitespace, so it is only safe when lines contain no spaces):
#!/bin/bash -eu
#This script prints contents of /etc/passwd line by line
FILENAME='/etc/passwd'
I=0
for LN in $(cat "$FILENAME")
do
echo "Line number $((I++)) --> $LN"
done
Copy the entire content to a file (say line.sh), then execute:
chmod +x line.sh
./line.sh

Related

for loop to be used with readpst library [duplicate]

How do I iterate through each line of a text file with Bash?
With this script:
echo "Start!"
for p in (peptides.txt)
do
echo "${p}"
done
I get this output on the screen:
Start!
./runPep.sh: line 3: syntax error near unexpected token `('
./runPep.sh: line 3: `for p in (peptides.txt)'
(Later I want to do something more complicated with $p than just output to the screen.)
The environment variable SHELL is (from env):
SHELL=/bin/bash
/bin/bash --version output:
GNU bash, version 3.1.17(1)-release (x86_64-suse-linux-gnu)
Copyright (C) 2005 Free Software Foundation, Inc.
cat /proc/version output:
Linux version 2.6.18.2-34-default (geeko@buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
The file peptides.txt contains:
RKEKNVQ
IPKKLLQK
QYFHQLEKMNVK
IPKKLLQK
GDLSTALEVAIDCYEK
QYFHQLEKMNVKIPENIYR
RKEKNVQ
VLAKHGKLQDAIN
ILGFMK
LEDVALQILL
One way to do it is:
while read p; do
echo "$p"
done <peptides.txt
As pointed out in the comments, this has the side effects of trimming leading whitespace, interpreting backslash sequences, and skipping the last line if it's missing a terminating linefeed. If these are concerns, you can do:
while IFS="" read -r p || [ -n "$p" ]
do
printf '%s\n' "$p"
done < peptides.txt
Exceptionally, if the loop body may read from standard input, you can open the file using a different file descriptor:
while read -u 10 p; do
...
done 10<peptides.txt
Here, 10 is just an arbitrary number (different from 0, 1, 2).
cat peptides.txt | while read line
do
# do something with $line here
done
and the one-liner variant:
cat peptides.txt | while read line; do something_with_$line_here; done
These options will skip the last line of the file if there is no trailing line feed.
You can avoid this by the following:
cat peptides.txt | while read line || [[ -n $line ]];
do
# do something with $line here
done
Option 1a: While loop: Single line at a time: Input redirection
#!/bin/bash
filename='peptides.txt'
echo Start
while read p; do
echo "$p"
done < "$filename"
Option 1b: While loop: Single line at a time:
Open the file, read from a file descriptor (in this case file descriptor #4).
#!/bin/bash
filename='peptides.txt'
exec 4<"$filename"
echo Start
while read -u4 p ; do
echo "$p"
done
This is no better than other answers, but is one more way to get the job done in a file without spaces (see comments). I find that I often need one-liners to dig through lists in text files without the extra step of using separate script files.
for word in $(cat peptides.txt); do echo $word; done
This format allows me to put it all in one command-line. Change the "echo $word" portion to whatever you want and you can issue multiple commands separated by semicolons. The following example uses the file's contents as arguments into two other scripts you may have written.
for word in $(cat peptides.txt); do cmd_a.sh $word; cmd_b.py $word; done
Or if you intend to use this like a stream editor (learn sed) you can dump the output to another file as follows.
for word in $(cat peptides.txt); do cmd_a.sh $word; cmd_b.py $word; done > outfile.txt
I've used these as written above because I have used text files where I've created them with one word per line. (See comments) If you have spaces that you don't want splitting your words/lines, it gets a little uglier, but the same command still works as follows:
OLDIFS=$IFS; IFS=$'\n'; for line in $(cat peptides.txt); do cmd_a.sh $line; cmd_b.py $line; done > outfile.txt; IFS=$OLDIFS
This just tells the shell to split on newlines only, not spaces, then returns the environment back to what it was previously. At this point, you may want to consider putting it all into a shell script rather than squeezing it all into a single line, though.
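Alternatively, running the loop in a subshell keeps the IFS change from leaking into the rest of the session, with no save/restore needed (a sketch under the same assumptions, reusing the hypothetical cmd_a.sh and cmd_b.py from above):
( IFS=$'\n'; for line in $(cat peptides.txt); do cmd_a.sh "$line"; cmd_b.py "$line"; done ) > outfile.txt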
Best of luck!
A few more things not covered by other answers:
Reading from a delimited file
# ':' is the delimiter here, and there are three fields on each line in the file
# IFS set below is restricted to the context of `read`, it doesn't affect any other code
while IFS=: read -r field1 field2 field3; do
# process the fields
# if the line has less than three fields, the missing fields will be set to an empty string
# if the line has more than three fields, `field3` will get all the values, including the third field plus the delimiter(s)
done < input.txt
Reading from the output of another command, using process substitution
while read -r line; do
# process the line
done < <(command ...)
This approach is better than command ... | while read -r line; do ... because the while loop here runs in the current shell rather than a subshell as in the case of the latter. See the related post A variable modified inside a while loop is not remembered.
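The difference is easy to see with a counter (a minimal sketch):
count=0
while read -r line; do
count=$((count + 1))
done < <(printf 'a\nb\n')
echo "$count"   # prints 2; with 'printf ... | while ...' the counter would still be 0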
Reading from a null delimited input, for example find ... -print0
while read -r -d '' line; do
# logic
# use a second 'read ... <<< "$line"' if we need to tokenize the line
done < <(find /path/to/dir -print0)
Related read: BashFAQ/020 - How can I find and safely handle file names containing newlines, spaces or both?
Reading from more than one file at a time
while read -u 3 -r line1 && read -u 4 -r line2; do
# process the lines
# note that the loop will end when we reach EOF on either of the files, because of the `&&`
done 3< input1.txt 4< input2.txt
Based on @chepner's answer here:
-u is a bash extension. For POSIX compatibility, each call would look something like read -r X <&3.
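For example, the two-file loop above rewritten for plain POSIX sh (a sketch):
while read -r line1 <&3 && read -r line2 <&4; do
printf '%s %s\n' "$line1" "$line2"
done 3< input1.txt 4< input2.txt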
Reading a whole file into an array (Bash versions earlier than 4)
while read -r line; do
my_array+=("$line")
done < my_file
If the file ends with an incomplete line (newline missing at the end), then:
while read -r line || [[ $line ]]; do
my_array+=("$line")
done < my_file
Reading a whole file into an array (Bash versions 4.x and later)
readarray -t my_array < my_file
or
mapfile -t my_array < my_file
And then
for line in "${my_array[@]}"; do
# process the lines
done
More about the shell builtins read and readarray commands - GNU
More about IFS - Wikipedia
BashFAQ/001 - How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
Related posts:
Creating an array from a text file in Bash
What is the difference between these approaches to reading a file that has just one line?
Bash while read loop extremely slow compared to cat, why?
Use a while loop, like this:
while IFS= read -r line; do
echo "$line"
done <file
Notes:
If you don't set the IFS properly, you will lose indentation.
You should almost always use the -r option with read.
Don't read lines with for (see the demo below).
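To see why for is the wrong tool for lines, compare it with read (a quick demo, assuming a throwaway demo.txt):
printf 'one line\nanother line\n' > demo.txt
for w in $(cat demo.txt); do echo "[$w]"; done          # 4 iterations: words, not lines
while IFS= read -r l; do echo "[$l]"; done < demo.txt   # 2 iterations: lines kept intact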
If you don't want your read to drop leading whitespace, mangle backslashes, or miss a final line that lacks a trailing newline, use:
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "$line"
done < "$1"
Then run the script with file name as parameter.
Suppose you have this file:
$ cat /tmp/test.txt
Line 1
Line 2 has leading space
Line 3 followed by blank line
Line 5 (follows a blank line) and has trailing space
Line 6 has no ending CR
There are four elements that will alter the meaning of the file output read by many Bash solutions:
The blank line 4;
Leading or trailing spaces on two lines;
Maintaining the meaning of individual lines (i.e., each line is a record);
Line 6 not terminated with a newline.
If you want to read the text file line by line, including blank lines and final lines without a terminating newline, you must use a while loop and you must have an alternate test for the final line.
Here are the methods that may change the file (in comparison to what cat returns):
1) Lose the last line and leading and trailing spaces:
$ while read -r p; do printf "%s\n" "'$p'"; done </tmp/test.txt
'Line 1'
'Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space'
(If you do while IFS= read -r p; do printf "%s\n" "'$p'"; done </tmp/test.txt instead, you preserve the leading and trailing spaces but still lose the last line if it is not terminated with CR)
2) Using command substitution with cat reads the entire file in one gulp and loses the meaning of individual lines:
$ for p in "$(cat /tmp/test.txt)"; do printf "%s\n" "'$p'"; done
'Line 1
Line 2 has leading space
Line 3 followed by blank line
Line 5 (follows a blank line) and has trailing space
Line 6 has no ending CR'
(If you remove the " from $(cat /tmp/test.txt) you read the file word by word rather than one gulp. Also probably not what is intended...)
The most robust and simplest way to read a file line-by-line and preserve all spacing is:
$ while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done </tmp/test.txt
'Line 1'
' Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space '
'Line 6 has no ending CR'
If you want to strip leading and trailing spaces, remove the IFS= part:
$ while read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done </tmp/test.txt
'Line 1'
'Line 2 has leading space'
'Line 3 followed by blank line'
''
'Line 5 (follows a blank line) and has trailing space'
'Line 6 has no ending CR'
(A text file without a terminating \n, while fairly common, is considered broken under POSIX. If you can count on the trailing \n you do not need || [[ -n $line ]] in the while loop.)
More at the BASH FAQ
I like to use xargs instead of while. xargs is powerful and command-line friendly.
cat peptides.txt | xargs -I % sh -c "echo %"
With xargs, you can also add verbosity with -t and validation with -p
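For example (same pipeline as above; -t and -p are standard xargs flags):
cat peptides.txt | xargs -t -I % sh -c "echo %"   # -t prints each command to stderr before running it
cat peptides.txt | xargs -p -I % sh -c "echo %"   # -p prompts for confirmation before each command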
This might be the simplest answer, and maybe it doesn't work in all cases, but it is working great for me:
while read line;do echo "$line";done<peptides.txt
If you need to wrap each line in quotes because of spaces:
while read line;do echo \"$line\";done<peptides.txt
Ahhh, this is pretty much the same as the answer that got upvoted most, but it's all on one line.
#!/bin/bash
#
# Change the file name from "test" to desired input file
# (The comments in bash are prefixed with #'s)
for x in $(cat test.txt)
do
echo $x
done
Here is a real-life example of how to loop over the lines of another program's output, check for substrings, drop the double quotes from a variable, and use that variable outside of the loop. I guess quite a few people ask these questions sooner or later.
##Parse FPS from first video stream, drop quotes from fps variable
## streams.stream.0.codec_type="video"
## streams.stream.0.r_frame_rate="24000/1001"
## streams.stream.0.avg_frame_rate="24000/1001"
FPS=unknown
while read -r line; do
if [[ $FPS == "unknown" ]] && [[ $line == *".codec_type=\"video\""* ]]; then
echo ParseFPS $line
FPS=parse
fi
if [[ $FPS == "parse" ]] && [[ $line == *".r_frame_rate="* ]]; then
echo ParseFPS $line
FPS=${line##*=}
FPS="${FPS%\"}"
FPS="${FPS#\"}"
fi
done <<< "$(ffprobe -v quiet -print_format flat -show_format -show_streams -i "$input")"
if [ "$FPS" == "unknown" ] || [ "$FPS" == "parse" ]; then
echo ParseFPS Unknown frame rate
fi
echo Found $FPS
Declaring the variable outside of the loop, setting its value inside, and using it after the loop requires the done <<< "$(...)" syntax. The application needs to be run within the context of the current console, and the quotes around the command preserve the newlines of the output stream.
The loop matches for substrings, then reads the name=value pair, splits off everything after the last = character, drops the first quote, drops the last quote, and we have a clean value to be used elsewhere.
This is coming rather late, but in the hope that it may help someone, I am adding this answer. Also, it may not be the best way. The head command can be used with the -n argument to read n lines from the start of a file, and likewise the tail command can be used to read from the bottom. Now, to fetch the nth line from a file, we head n lines and pipe the data to tail, keeping only 1 line from the piped data.
TOTAL_LINES=`wc -l $USER_FILE | cut -d " " -f1 `
echo $TOTAL_LINES # To validate total lines in the file
for (( i=1 ; i <= $TOTAL_LINES; i++ ))
do
LINE=`head -n$i $USER_FILE | tail -n1`
echo $LINE
done
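For comparison, each nth-line fetch can also be done with a single sed command instead of a head/tail pipeline (a hedged alternative, not the answer's original approach):
for (( i=1 ; i <= $TOTAL_LINES; i++ ))
do
LINE=`sed -n "${i}p" "$USER_FILE"`   # print only line i
echo "$LINE"
done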
@Peter: This could work out for you:
echo "Start!";for p in $(cat ./pep); do
echo $p
done
This would return the output-
Start!
RKEKNVQ
IPKKLLQK
QYFHQLEKMNVK
IPKKLLQK
GDLSTALEVAIDCYEK
QYFHQLEKMNVKIPENIYR
RKEKNVQ
VLAKHGKLQDAIN
ILGFMK
LEDVALQILL
Another way, using xargs:
xargs -I {} echo {} <file_name
echo can be replaced with other commands or piped further.
for p in `cat peptides.txt`
do
echo "${p}"
done

How to store the entire XML into a variable using shell script [duplicate]


Shell POSIX two nested while read and read from stdin not working

I have that sample script:
#!/bin/sh
while read ll </dev/fd/4; do
echo "1 "$ll
while read line; do
echo $line
read input </dev/fd/3
echo "$input"
done 3<&0 <notify-finished
done 4<output_file
Currently, the first loop does not iterate; it just stays on line 1. How do I fix that without bashisms? It has to be highly portable. Thanks.
Your code already has bashisms. Here, I'm taking them out (and simplifying the FD handling for better readability):
#!/bin/sh
while read ll <&4; do # read from output_file
printf '%s\n' "1 $ll"
while read line <&3; do # read from notify-finished
printf '%s\n' "$line"
read input # read from stdin
printf '%s\n' "$input"
done 3<notify-finished
done 4<output_file
Run the script as follows:
echo "output_file" >output_file
echo "notify-finished" >notify-finished
echo "stdout" | ./yourscript
...and it correctly exits with the following output:
1 output_file
notify-finished
stdout
Notes:
echo's behavior is wildly nonportable across POSIX platforms. See the APPLICATION USAGE section of the POSIX spec for echo, which advises using printf instead.
/dev/fd/## is not specified by POSIX; it is an extension made available both by Linux distributions (creating a symlink to /proc/self/fd -- /proc being itself an unspecified extension) and by bash itself. Use <&4 in place of </dev/fd/4.
You probably want to use the -r argument to read -- which is POSIX-specified, and prevents the default behavior of treating backslashes as escape sequences for newlines and characters in IFS. Without it, foo\bar is read as foobar, thus not reading your data as it truly exists in its input sources.
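A quick illustration of what -r changes (a minimal demo):
printf 'foo\\bar\n' | { read x; printf '%s\n' "$x"; }     # prints foobar: the backslash is consumed
printf 'foo\\bar\n' | { read -r x; printf '%s\n' "$x"; }  # prints foo\bar: the data is preserved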

echo stdin line by line into file [duplicate]


Removing colors from output

I have some script that produces output with colors and I need to remove the ANSI codes.
#!/bin/bash
exec > >(tee log) # redirect the output to a file but keep it on stdout
exec 2>&1
./somescript
The output is (in log file):
java (pid 12321) is running...#[60G[#[0;32m OK #[0;39m]
I didn't know how to put the ESC character here, so I put # in its place.
I changed the script into:
#!/bin/bash
exec > >(tee log) # redirect the output to a file but keep it on stdout
exec 2>&1
./somescript | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]//g"
But now it gives me (in log file):
java (pid 12321) is running...#[60G[ OK ]
How can I also remove this '#[60G'?
Maybe there is a way to completely disable coloring for the entire script?
According to Wikipedia, the [m|K] in the sed command you're using is specifically designed to handle m (the color command) and K (the "erase part of line" command). Your script is trying to set absolute cursor position to 60 (^[[60G) to get all the OKs in a line, which your sed line doesn't cover.
(Properly, [m|K] should probably be (m|K) or [mK], because you're not trying to match a pipe character. But that's not important right now.)
If you switch that final match in your command to [mGK] or (m|G|K), you should be able to catch that extra control sequence.
./somescript | sed -r "s/\x1B\[([0-9]{1,3}(;[0-9]{1,2};?)?)?[mGK]//g"
IMHO, most of these answers try too hard to restrict what is inside the escape code. As a result, they end up missing common codes like [38;5;60m (foreground ANSI color 60 from 256-color mode).
They also require the -r option which enables GNU extensions. These are not required; they just make the regex read better.
Here is a simpler answer that handles the 256-color escapes and works on systems with non-GNU sed:
./somescript | sed 's/\x1B\[[0-9;]\{1,\}[A-Za-z]//g'
This will catch anything that starts with [, has any number of decimals and semicolons, and ends with a letter. This should catch any of the common ANSI escape sequences.
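For example, it handles the 256-color sequences that the narrower patterns miss (a quick check):
printf 'plain \033[38;5;60mcolored\033[0m plain\n' | sed 's/\x1B\[[0-9;]\{1,\}[A-Za-z]//g'
# prints: plain colored plain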
For funsies, here's a larger and more general (but minimally tested) solution for all conceivable ANSI escape sequences:
./somescript | sed 's/\x1B[#A-Z\\\]^_]\|\x1B\[[0-9:;<=>?]*[-!"#$%&'"'"'()*+,.\/]*[][\\#A-Z^_`a-z{|}~]//g'
(and if you have @edi9999's SI problem, add | sed "s/\x0f//g" to the end; this works for any control char by replacing 0f with the hex of the undesired char)
I couldn't get decent results from any of the other answers, but the following worked for me:
somescript | sed -r "s/[[:cntrl:]]\[[0-9]{1,3}m//g"
If I only removed the control char "^[", it left the rest of the color data, e.g., "33m". Including the color code and "m" did the trick. I'm puzzled why s/\x1B//g doesn't work, because \x1B[31m certainly works with echo.
I came across the ansi2txt tool from the colorized-logs package in Debian. The tool drops ANSI control codes from STDIN.
Usage example:
./somescript | ansi2txt
Source code http://github.com/kilobyte/colorized-logs
For macOS or BSD, use:
./somescript | sed $'s,\x1b\\[[0-9;]*[a-zA-Z],,g'
The regular expression below will miss some ANSI escape code sequences, as well as 3-digit colors. Example and fix on regex101.com.
Use this instead:
./somescript | sed -r 's/\x1B\[(;?[0-9]{1,3})+[mGK]//g'
I also had the problem that sometimes the SI character appeared.
It happened, for example, with this input: echo "$(tput setaf 1)foo$(tput sgr0) bar"
Here's a way to also strip the SI (shift in) character (0x0f):
./somescript | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" | sed "s/\x0f//g"
Much simpler function in pure Bash to filter-out common ANSI codes from a text stream:
# Strips common ANSI codes from a text stream
shopt -s extglob # Enable Bash Extended Globbing expressions
ansi_filter() {
local line
local IFS=
while read -r line || [[ "$line" ]]; do
printf '%s\n' "${line//$'\e'[\[(]*([0-9;])[#-n]/}"
done
}
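Usage (a quick check with hand-made colored input):
printf '\033[1;32mOK\033[0m done\n' | ansi_filter   # prints: OK done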
See:
linuxjournal.com: Extended Globbing
gnu.org: Bash Parameter Expansion
I had a similar problem. All solutions I found did work well for the color codes but did not remove the characters added by "$(tput sgr0)" (resetting attributes).
Taking, for example, the solution from the comment by davemyron, the length of the resulting string in the example below is 9, not 6:
#!/usr/bin/env bash
string="$(tput setaf 9)foobar$(tput sgr0)"
string_sed="$( sed -r "s/\x1B\[[0-9;]*[JKmsu]//g" <<< "${string}" )"
echo ${#string_sed}
In order to work properly, the regex had to be extended to also match the sequence added by sgr0 ("\E(B"):
string_sed="$( sed -r "s/\x1B(\[[0-9;]*[JKmsu]|\(B)//g" <<< "${string}" )"
Not sure what's in ./somescript, but if the escape sequences are not hardcoded, you can set the terminal type to avoid them:
TERM=dumb ./somescript
For example, if you try
TERM=dumb tput sgr0 | xxd
you'll see it produces no output while
tput sgr0 | xxd
00000000: 1b28 421b 5b6d .(B.[m
does (for xterm-256color).
Hmm, not sure if this will work for you, but tr will 'strip' (delete) control codes - try:
./somescript | tr -d '[:cntrl:]'
Be aware that this deletes all control characters, including tabs and newlines, and leaves the printable remainder of each escape sequence (such as [0m) behind, since only the ESC byte itself is a control character.
There's also a dedicated tool to handle ANSI escape sequences: ansifilter. Use the default --text output format to strip all ANSI escape sequences (note: not just coloring).
ref: https://stackoverflow.com/a/6534712
Here's a pure Bash solution.
Save as strip-escape-codes.sh, make executable and then run <command-producing-colorful-output> | ./strip-escape-codes.sh.
Note that this strips all ANSI escape codes/sequences. If you want to strip colors only, replace [a-zA-Z] with "m".
Bash >= 4.0:
#!/usr/bin/env bash
# Strip ANSI escape codes/sequences [$1: input string, $2: target variable]
function strip_escape_codes() {
local _input="$1" _i _char _escape=0
local -n _output="$2"; _output=""
for (( _i=0; _i < ${#_input}; _i++ )); do
_char="${_input:_i:1}"
if (( ${_escape} == 1 )); then
if [[ "${_char}" == [a-zA-Z] ]]; then
_escape=0
fi
continue
fi
if [[ "${_char}" == $'\e' ]]; then
_escape=1
continue
fi
_output+="${_char}"
done
}
while read -r line; do
strip_escape_codes "${line}" line_stripped
echo "${line_stripped}"
done
Bash < 4.0:
#!/usr/bin/env bash
# Strip ANSI escape codes/sequences [$1: input string, $2: target variable]
function strip_escape_codes() {
local input="${1//\"/\\\"}" output="" i char escape=0
for (( i=0; i < ${#input}; ++i )); do # process all characters of input string
char="${input:i:1}" # get current character from input string
if (( ${escape} == 1 )); then # if we're currently within an escape sequence, check if
if [[ "${char}" == [a-zA-Z] ]]; then # end is reached, i.e. if current character is a letter
escape=0 # end reached, we're no longer within an escape sequence
fi
continue # skip current character, i.e. do not add to output
fi
if [[ "${char}" == $'\e' ]]; then # if current character is '\e', we've reached the start
escape=1 # of an escape sequence -> set flag
continue # skip current character, i.e. do not add to output
fi
output+="${char}" # add current character to output
done
eval "$2=\"${output}\"" # assign output to target variable
}
while read -r line; do
strip_escape_codes "${line}" line_stripped
echo "${line_stripped}"
done
@jeff-bowman's solution helped me get rid of SOME of the color codes.
I added another small portion to the regex in order to remove some more:
sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" # Original. Removed Red ([31;40m[1m[error][0m)
sed -r "s/\x1B\[([0-9];)?([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" # With an addition, removed yellow and green ([1;33;40m[1m[warning][0m and [1;32;40m[1m[ok][0m)
The extra ([0-9];)? group removes Yellow and Green (and maybe more colors).
A controversial idea would be to reconfigure the terminal settings for this process's environment to let the process know that the terminal does not support colors.
Something like TERM=xterm-mono ./somescript comes to mind. YMMV with your specific OS and your script's ability to understand terminal color settings.
I had some issues with colorized output which the other solutions here didn't process correctly, so I built this perl one-liner. It looks for escape \e followed by an opening bracket \[ followed by one or more color codes \d+ separated by semicolons, ending on m.
perl -ple 's/\e\[\d+(;\d+)*m//g'
It seems to work really well for colorized compiler output.
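A quick check against typical compiler-style coloring (hedged demo):
printf '\033[1;31merror:\033[0m build failed\n' | perl -ple 's/\e\[\d+(;\d+)*m//g'
# prints: error: build failed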
I came across this question and its answers while trying to do something similar to the OP. I found some other useful resources and came up with a log script based on those. Posting it here in case it can help others.
Digging into the links helps in understanding some of the redirection, which I won't try to explain because I'm just starting to understand it myself.
In use, it renders the colorized output to the console while stripping the color codes out of the text going to the log file. It also includes stderr in the logfile for any commands that fail.
Edit: adding more usage at bottom to show how to log in different ways
#!/bin/bash
set -e
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
. $DIR/dev.conf
. $DIR/colors.cfg
filename=$(basename ${BASH_SOURCE[0]})
# remove extension
# filename=`echo $filename | grep -oP '.*?(?=\.)'`
filename=`echo $filename | awk -F\. '{print $1}'`
log=$DIR/logs/$filename-$target
if [ -f $log ]; then
cp $log "$log.bak"
fi
exec 3>&1 4>&2
trap 'exec 2>&4 1>&3' 0 1 2 3
exec 1>$log 2>&1
# log message
log(){
local m="$@"
echo -e "*** ${m} ***" >&3
echo "=================================================================================" >&3
local r="$@"
echo "================================================================================="
echo -e "*** $r ***" | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g"
echo "================================================================================="
}
echo "=================================================================================" >&3
log "${Cyan}The ${Yellow}${COMPOSE_PROJECT_NAME} ${filename} ${Cyan}script has been executed${NC}"
log $(ls) #log $(<command>)
log "${Green}Apply tag to image $source with version $version${NC}"
# log $(exec docker tag $source $target 3>&2) #prints error only to console
# log $(docker tag $source $target 2>&1) #prints error to both but doesn't exit on fail
log $(docker tag $source $target 2>&1) && exit $? #prints error to both AND exits on fail
# docker tag $source $target 2>&1 | tee $log # prints gibberish to log
echo $? # prints 0 because log function was successful
log "${Purple}Push $target to acr${NC}"
Here are the other links that helped:
Can I use sed to manipulate a variable in bash?
https://www.cyberciti.biz/faq/redirecting-stderr-to-stdout/
https://unix.stackexchange.com/questions/42728/what-does-31-12-23-do-in-a-script
https://serverfault.com/questions/103501/how-can-i-fully-log-all-bash-scripts-actions
https://www.gnu.org/software/bash/manual/bash.html#Redirections
I used perl, as I have to do this frequently on many files. This will go through all files matching filename*.txt and remove any formatting. This works for my use case and may be useful for someone else too, so I thought of posting it here. Replace filename*.txt with whatever your file names are, or set the FILENAME variable below to a space-separated list of file names.
$ FILENAME=$(ls filename*.txt) ; for file in $(echo $FILENAME); do echo $file; cat $file | perl -pe 's/\e([^\[\]]|\[.*?[a-zA-Z]|\].*?\a)//g' | col -b > $file-new; mv $file-new $file; done
my contribution:
./somescript | sed -r "s/\\x1B[\\x5d\[]([0-9]{1,3}(;[0-9]{1,3})?(;[0-9]{1,3})?)?[mGK]?//g"
This works for me:
./somescript | cat
(This only helps if the script checks whether its output is a terminal and skips the colors when it is piped.)
