here-document gives 'unexpected end of file' error - bash

I need my script to send an email from terminal. Based on what I've seen here and many other places online, I formatted it like this:
/var/mail -s "$SUBJECT" "$EMAIL" << EOF
Here's a line of my message!
And here's another line!
Last line of the message here!
EOF
However, when I run this I get this warning:
myfile.sh: line x: warning: here-document at line y delimited by end-of-file (wanted 'EOF')
myfile.sh: line x+1: syntax error: unexpected end of file
...where line x is the last written line of code in the program, and line y is the line with /var/mail in it. I've tried replacing EOF with other things (ENDOFMESSAGE, FINISH, etc.) but to no avail. Nearly everything I've found online has it done this way, and I'm really new at bash so I'm having a hard time figuring it out on my own. Could anyone offer any help?

The EOF token must be at the beginning of the line; you can't indent it along with the block of code it goes with.
If you write <<-EOF you may indent it, but it must be indented with Tab characters, not spaces. So it still might not end up even with the block of code.
Also make sure you have no whitespace after the EOF token on the line.
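To make the two forms concrete, here is a minimal sketch. The message text and the names print_plain, plain_output and tabbed_output are made up for illustration, and cat stands in for the mail command. The <<- case is generated with printf so that the leading Tab characters are explicit and cannot be turned into spaces by copy and paste.

```shell
# Form 1: plain <<. The terminator sits at column 0 with nothing after it.
print_plain() {
cat << EOF
A line of my message
Last line of the message
EOF
}
plain_output=$(print_plain)

# Form 2: <<- strips leading TABs (never spaces) from the body and from the
# terminator line. printf writes real tab characters via \t.
tabbed_output=$(printf 'cat <<-EOF\n\tindented with a tab\n\tEOF\n' | bash)

printf '%s\n' "$plain_output" "$tabbed_output"
```

If the terminator in the second form were indented with spaces instead of tabs, the shell would keep reading to the end of the script and report the "delimited by end-of-file" warning from the question.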

The line that starts or ends the here-doc probably has some non-printable or whitespace characters (for example, carriage return) which means that the second "EOF" does not match the first, and doesn't end the here-doc like it should. This is a very common error, and difficult to detect with just a text editor. You can make non-printable characters visible for example with cat:
cat -A myfile.sh
Once you see the output from cat -A the solution will be obvious: remove the offending characters.
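As a rough illustration of the carriage-return case, the sketch below builds a small script with Windows line endings, counts the affected lines, and strips the carriage returns. The file and variable names are hypothetical. Note that cat -A is a GNU option; on BSD or macOS, cat -vet should show much the same information.

```shell
cr=$(printf '\r')              # a literal carriage return
broken_script=$(mktemp)
fixed_script=$(mktemp)

# A here-document whose lines all end in CR LF: the terminator line is
# really "EOF\r", which does not match the delimiter "EOF".
printf 'cat << EOF\r\nhello\r\nEOF\r\n' > "$broken_script"

# Count the lines that end with a carriage return.
cr_lines=$(grep -c "${cr}\$" "$broken_script")

# Remove every carriage return, then run the repaired script.
tr -d '\r' < "$broken_script" > "$fixed_script"
fixed_output=$(bash "$fixed_script")

rm -f "$broken_script" "$fixed_script"
echo "lines ending in CR: $cr_lines; output after the fix: $fixed_output"
```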

Please try removing the preceding spaces before EOF:
/var/mail -s "$SUBJECT" "$EMAIL" <<-EOF
Using <tab> instead of <spaces> for indentation AND using <<-EOF works fine.
The "-" removes the <tabs>, not <spaces>, but at least this works.

Note that you can also get this error if you do this:
while read line; do
echo $line
done << somefile
Because << somefile should read < somefile in this case.
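For comparison, a minimal sketch of the corrected loop, reading from a temporary file that stands in for somefile (the file contents and the lines_seen variable are only for illustration):

```shell
somefile=$(mktemp)
printf 'alpha\nbeta\n' > "$somefile"

lines_seen=""
# A single < redirects the loop's standard input from the file.
# With <<, the shell would treat "somefile" as a here-document delimiter.
while IFS= read -r line; do
  lines_seen="${lines_seen}[${line}]"
done < "$somefile"

rm -f "$somefile"
echo "$lines_seen"
```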

May be old but I had a space after the ending EOF
<< EOF
blah
blah
EOF <-- this was the issue. Had it for years, finally looked it up here

For anyone stumbling here who googled "bash warning: here-document delimited by end-of-file", it may be that you are getting the
warning: here-document at line 74 delimited by end-of-file
...type warning because you accidentally used a here document symbol (<<) when you meant to use a here string symbol (<<<). That was my case.
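A short side-by-side sketch of the two operators, assuming bash (here strings are not available in plain POSIX sh). The variable names are made up for the example.

```shell
# Here string: the word after <<< is the input itself (a newline is appended).
read -r first_word remainder <<< "hello here string"

# Here document: the word after << is only a delimiter; the input is every
# following line up to a line consisting of exactly that delimiter.
read -r doc_line << EOF
hello here document
EOF

echo "$first_word | $remainder | $doc_line"
```

Writing << "hello here string" by mistake makes the shell look for a terminator line reading "hello here string", which is never found, hence the warning.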

Here is a flexible way to deal with multiple indented lines without using a heredoc.
echo 'Hello!'
sed -e 's:^\s*::' < <(echo '
    Some indented text here.
    Some indented text here.
')
if [[ true ]]; then
    sed -e 's:^\s\{4,4\}::' < <(echo '
        Some indented text here.
            Some extra indented text here.
        Some indented text here.
    ')
fi
Some notes on this solution:
if the content is expected to contain single quotes, either write each one as '\'' (a backslash alone cannot escape a single quote inside a single-quoted string) or replace the string delimiters with double quotes. In the latter case, be careful that constructions like $(command) and $variable will be expanded. If the string contains both single and double quotes, you'll have to escape at least one kind.
the given example prints a trailing empty line; there are numerous ways to get rid of it, not included here to keep the proposal uncluttered.
the flexibility comes from the ease with which you can control how much leading space should stay or go, provided that you know some sed regular expressions, of course.
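A hedged variation on the same idea: \s is a GNU sed extension, so the sketch below uses the POSIX class [[:space:]] instead and feeds sed through a pipe. The helper name dedent4 and the sample text are hypothetical.

```shell
# Strip exactly four leading whitespace characters from each line,
# leaving any deeper indentation in place.
dedent4() {
  sed -e 's/^[[:space:]]\{4\}//'
}

dedented=$(printf '    if true; then\n        echo hi\n    fi\n' | dedent4)
printf '%s\n' "$dedented"
```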

When I want to have docstrings for my bash functions, I use a solution similar to the suggestion of user12205 in a duplicate of this question.
See how I define USAGE for a solution that:
auto-formats well for me in my IDE of choice (sublime)
is multi-line
can use spaces or tabs as indentation
preserves indentations within the comment.
function foo {
    # Docstring
    read -r -d '' USAGE <<'    END'
        # This method prints foo to the terminal.
        #
        # Enter `foo -h` to see the docstring.
        # It has indentations and multiple lines.
        #
        # Change the delimiter if you need hashtag for some reason.
        # This can include $$ and = and eval, but won't be evaluated
    END
    if [ "$1" = "-h" ]
    then
        echo "$USAGE" | cut -d "#" -f 2 | cut -c 2-
        return
    fi
    echo "foo"
}
So foo -h yields:
This method prints foo to the terminal.
Enter `foo -h` to see the docstring.
It has indentations and multiple lines.
Change the delimiter if you need hashtag for some reason.
This can include $$ and = and eval, but won't be evaluated
Explanation
cut -d "#" -f 2: Retrieve the second portion of the # delimited lines. (Think a csv with "#" as the delimiter, empty first column).
cut -c 2-: Retrieve the 2nd to end character of the resultant string
Also note that if [ "$1" = "-h" ] evaluates as false, without error, when there is no first argument, since "$1" then expands to an empty string.

Make sure the ending EOF is at the very beginning of a new line.

Along with the other answers mentioned by Barmar and Joni, I've noticed that I sometimes have to leave a blank line before and after my EOF when using <<-EOF.

Related

Bash while read loop with IFS prints only the first line [duplicate]

I have an ... odd issue with a bash shell script that I was hoping to get some insight on.
My team is working on a script that iterates through lines in a file and checks for content in each one. We had a bug where, when run via the automated process that sequences different scripts together, the last line wasn't being seen.
The code used to iterate over the lines in the file (name stored in DATAFILE) was
cat "$DATAFILE" | while read line
We could run the script from the command line and it would see every line in the file, including the last one, just fine. However, when run by the automated process (which runs the script that generates the DATAFILE just prior to the script in question), the last line is never seen.
We updated the code to use the following to iterate over the lines, and the problem cleared up:
for line in `cat "$DATAFILE"`
Note: DATAFILE has no newline ever written at the end of the file.
My question is in two parts: why would the last line not be seen by the original code, and why would this change make a difference?
The only thought I could come up with as to why the last line would not be seen was:
The previous process, which writes the file, was relying on the process ending to close the file descriptor.
The problem script was starting up and opening the file fast enough that, while the previous process had "ended", it hadn't "shut down/cleaned up" enough for the system to close the file descriptor automatically for it.
That being said, it seems like, if you have 2 commands in a shell script, the first one should be completely shut down by the time the script runs the second one.
Any insight into the questions, especially the first one, would be very much appreciated.
The C standard says that text files must end with a newline or the data after the last newline may not be read properly.
ISO/IEC 9899:2011 §7.21.2 Streams
A text stream is an ordered sequence of characters composed into lines, each line
consisting of zero or more characters plus a terminating new-line character. Whether the
last line requires a terminating new-line character is implementation-defined. Characters
may have to be added, altered, or deleted on input and output to conform to differing
conventions for representing text in the host environment. Thus, there need not be a one-to-
one correspondence between the characters in a stream and those in the external
representation. Data read in from a text stream will necessarily compare equal to the data
that were earlier written out to that stream only if: the data consist only of printing
characters and the control characters horizontal tab and new-line; no new-line character is
immediately preceded by space characters; and the last character is a new-line character.
Whether space characters that are written out immediately before a new-line character
appear when read in is implementation-defined.
I would not have expected a missing newline at the end of file to cause trouble in bash (or any Unix shell), but that does seem to be the problem reproducibly ($ is the prompt in this output):
$ echo xxx\\c
xxx$ { echo abc; echo def; echo ghi; echo xxx\\c; } > y
$ cat y
abc
def
ghi
xxx$
$ while read line; do echo $line; done < y
abc
def
ghi
$ bash -c 'while read line; do echo $line; done < y'
abc
def
ghi
$ ksh -c 'while read line; do echo $line; done < y'
abc
def
ghi
$ zsh -c 'while read line; do echo $line; done < y'
abc
def
ghi
$ for line in $(<y); do echo $line; done # Preferred notation in bash
abc
def
ghi
xxx
$ for line in $(cat y); do echo $line; done # UUOC Award pending
abc
def
ghi
xxx
$
It is also not limited to bash — Korn shell (ksh) and zsh behave like that too. I live, I learn; thanks for raising the issue.
As demonstrated in the code above, the cat command reads the whole file. The for line in `cat $DATAFILE` technique collects all the output and replaces arbitrary sequences of white space with a single blank (I conclude that each line in the file contains no blanks).
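To see the whitespace splitting in isolation, here is a small sketch with a file whose lines do contain blanks. The file contents and variable names are invented for the example.

```shell
datafile=$(mktemp)
printf 'first line\nsecond line' > "$datafile"   # no trailing newline

for_items=""
# The unquoted command substitution is split on any run of whitespace,
# so the loop iterates over words rather than lines.
for item in $(cat "$datafile"); do
  for_items="${for_items}[${item}]"
done

rm -f "$datafile"
echo "$for_items"
```

The last word is still seen even though the file has no final newline, which is why the for form appeared to fix the problem, but each line has been broken into two items.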
Tested on Mac OS X 10.7.5.
What does POSIX say?
The POSIX read command specification says:
The read utility shall read a single line from standard input.
By default, unless the -r option is specified, <backslash> shall act as an escape character. An unescaped <backslash> shall preserve the literal value of the following character, with the exception of a <newline>. If a <newline> follows the <backslash>, the read utility shall interpret this as line continuation. The <backslash> and <newline> shall be removed before splitting the input into fields. All other unescaped <backslash> characters shall be removed after splitting the input into fields.
If standard input is a terminal device and the invoking shell is interactive, read shall prompt for a continuation line when it reads an input line ending with a <backslash> <newline>, unless the -r option is specified.
The terminating <newline> (if any) shall be removed from the input and the results shall be split into fields as in the shell for the results of parameter expansion (see Field Splitting); [...]
Note that '(if any)' (emphasis added in quote)! It seems to me that if there is no newline, it should still read the result. On the other hand, it also says:
STDIN
The standard input shall be a text file.
and then you get back to the debate about whether a file that does not end with a newline is a text file or not.
However, the rationale on the same page documents:
Although the standard input is required to be a text file, and therefore will always end with a <newline> (unless it is an empty file), the processing of continuation lines when the -r option is not used can result in the input not ending with a <newline>. This occurs if the last line of the input file ends with a <backslash> <newline>. It is for this reason that "if any" is used in "The terminating <newline> (if any) shall be removed from the input" in the description. It is not a relaxation of the requirement for standard input to be a text file.
That rationale must mean that the text file is supposed to end with a newline.
The POSIX definition of a text file is:
3.395 Text File
A file that contains characters organized into zero or more lines. The lines do not contain NUL characters and none can exceed {LINE_MAX} bytes in length, including the <newline> character. Although POSIX.1-2008 does not distinguish between text files and binary files (see the ISO C standard), many utilities only produce predictable or meaningful output when operating on text files. The standard utilities that have such restrictions always specify "text files" in their STDIN or INPUT FILES sections.
This does not stipulate 'ends with a <newline>' directly, but does defer to the C standard and it does say "A file that contains characters organized into zero or more lines" and when we look at the POSIX definition of a "Line" it says:
3.206 Line
A sequence of zero or more non- <newline> characters plus a
terminating <newline> character.
so per the POSIX definition a file must end in a terminating newline because it's made up of lines and each line must end in a terminating newline.
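One quick way to see this definition at work is wc -l, which counts newline characters. The following is only an illustrative sketch; the variable names are arbitrary, and the arithmetic expansion is there to drop the padding some wc implementations print.

```shell
# Two complete lines: two newline characters.
complete_count=$(( $(printf 'abc\nxxx\n' | wc -l) ))

# One complete line plus an unterminated trailing segment: one newline.
partial_count=$(( $(printf 'abc\nxxx' | wc -l) ))

echo "with final newline: $complete_count, without: $partial_count"
```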
A solution to the 'no terminal newline' problem
Note Gordon Davisson's answer. A simple test shows that his observation is accurate:
$ while read line; do echo $line; done < y; echo $line
abc
def
ghi
xxx
$
Therefore, his technique of:
while read line || [ -n "$line" ]; do echo $line; done < y
or:
cat y | while read line || [ -n "$line" ]; do echo $line; done
will work for files without a newline at the end (at least on my machine).
I'm still surprised to find that the shells drop the last segment (it can't be called a line because it doesn't end with a newline) of the input, but there might be sufficient justification in POSIX to do so. And clearly it is best to ensure that your text files really are text files ending with a newline.
According to the POSIX spec for the read command, it should return a nonzero status if "End-of-file was detected or an error occurred." Since EOF is detected as it reads the last "line", it sets $line and then returns an error status, and the error status prevents the loop from executing on that last "line". The solution is easy: make the loop execute if the read command succeeds OR if anything was read into $line.
while read line || [ -n "$line" ]; do
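The behaviour described above can be checked directly by calling read twice on a two-line input whose second line is unterminated. This is a minimal sketch with made-up names; it only records the exit statuses.

```shell
partial=$(mktemp)
printf 'abc\nxxx' > "$partial"    # "xxx" has no terminating newline

{
  read -r first;  first_status=$?
  read -r second; second_status=$?
} < "$partial"

rm -f "$partial"
echo "first='$first' status=$first_status"
echo "second='$second' status=$second_status"
```

The second read still stores xxx in the variable but reports a non-zero status, which is exactly the case the || [ -n "$line" ] test catches.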
Adding some additional info:
There's no need to use cat with while loop. while ...;do something;done<file is enough.
Don't read lines with for.
When using while loop to read lines:
Set the IFS properly (you may lose indentation otherwise).
You should almost always use the -r option with read.
A while loop that meets the above requirements looks like this:
while IFS= read -r line; do
...
done <file
And to make it work with files without a newline at end (reposting my solution from here):
while IFS= read -r line || [ -n "$line" ]; do
echo "$line"
done <file
Or using grep with while loop:
while IFS= read -r line; do
echo "$line"
done < <(grep "" file)
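To compare the variants side by side, the sketch below counts how many lines each loop sees for a three-line file with no final newline. It assumes bash (for the process substitution), and the counter names are arbitrary.

```shell
sample=$(mktemp)
printf 'abc\ndef\nxxx' > "$sample"   # three lines, last one unterminated

# Plain loop: stops as soon as read reports end-of-file.
naive_count=0
while IFS= read -r line; do
  naive_count=$((naive_count + 1))
done < "$sample"

# Guarded loop: also runs the body when read failed but left data in $line.
guarded_count=0
while IFS= read -r line || [ -n "$line" ]; do
  guarded_count=$((guarded_count + 1))
done < "$sample"

# grep "" re-emits every line with a terminating newline.
grep_count=0
while IFS= read -r line; do
  grep_count=$((grep_count + 1))
done < <(grep "" "$sample")

rm -f "$sample"
echo "naive=$naive_count guarded=$guarded_count grep=$grep_count"
```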
As a workaround, a newline can be appended to the file before reading from it.
echo >> "$file_path"
This ensures that all the lines that were previously in the file will be read. Plain echo already writes one newline; echo -e "\n" would write two and leave an extra blank line at the end. Note that if the file already ends with a newline, this adds an empty line.
https://superuser.com/questions/313938/shell-script-echo-new-line-to-file
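Appending unconditionally adds an empty line whenever the file already ends with a newline. A hedged sketch of appending only when needed follows; ensure_trailing_newline is a hypothetical helper name, and the temporary file stands in for the real data file.

```shell
ensure_trailing_newline() {
  # Command substitution drops a trailing newline, so the result is empty
  # when the last byte is a newline (or when the file is empty).
  if [ -n "$(tail -c 1 "$1")" ]; then
    echo >> "$1"
  fi
}

file_path=$(mktemp)
printf 'abc\nxxx' > "$file_path"

ensure_trailing_newline "$file_path"
ensure_trailing_newline "$file_path"   # second call changes nothing

line_total=$(( $(wc -l < "$file_path") ))
rm -f "$file_path"
echo "newline-terminated lines: $line_total"
```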
I tested this on the command line.
# create dummy file. last line doesn't end with newline
printf "%i\n%i\nNo-newline-here" >testing
Test with your first form (piping to while-loop)
cat testing | while read line; do echo $line; done
This misses the last line, which makes sense since read only reports success for input that ends with a newline.
Test with your second form (command substitution)
for line in `cat testing` ; do echo $line; done
This gets the last line as well.
read only returns success if the line is terminated by a newline; at end-of-file it still fills the variable but returns a non-zero status, so the loop exits and you miss the last line.
On the other hand, in the second form
`cat testing`
expands to the form of
line1\nline2\n...lineM
which is separated by the shell into multiple fields using IFS, so you get
line1 line2 line3 ... lineM
That's why you still get the last line.
p/s: What I don't understand is how you get the first form working...
Use sed to match the last line of the file and append a newline if one does not already exist, editing the file in place:
sed -i '' -e '$a\' file
The code is from this stackexchange link
Note: I have added empty single quotes to -i '' because, at least in OS X, -i was using -e as a file extension for the backup file. I would have gladly commented on the original post but lacked 50 points. Perhaps this will gain me a few in this thread, thanks.
I had a similar issue.
I was doing a cat of a file, piping it to a sort and then piping the result to a 'while read var1 var2 var3'.
ie:
cat $FILE|sort -k3|while read Count IP Name
do
The work under the "do" was an if statement that identified changing data in the $Name field and based on change or no change did sums of $Count or printed the summed line to the report.
I also ran into the issue where I couldn't get the last line to print to the report.
I went with the simple expedient of redirecting the cat/sort to a new file, echoing a newline to that new file and THEN ran my "while read Count IP Name" on the new file with successful results.
ie:
cat $FILE|sort -k3 > NEWFILE
echo >> NEWFILE
cat NEWFILE |while read Count IP Name
do
Sometimes the simple, inelegant way is the best way to go.



$ while read line; do echo $line; done < y; echo $line
abc
def
ghi
xxx
$
Therefore, his technique of:
while read line || [ -n "$line" ]; do echo $line; done < y
or:
cat y | while read line || [ -n "$line" ]; do echo $line; done
will work for files without a newline at the end (at least on my machine).
I'm still surprised to find that the shells drop the last segment (it can't be called a line because it doesn't end with a newline) of the input, but there might be sufficient justification in POSIX to do so. And clearly it is best to ensure that your text files really are text files ending with a newline.
According to the POSIX spec for the read command, it should return a nonzero status if "End-of-file was detected or an error occurred." Since EOF is detected as it reads the last "line", it sets $line and then returns an error status, and the error status prevents the loop from executing on that last "line". The solution is easy: make the loop execute if the read command succeeds OR if anything was read into $line.
while read line || [ -n "$line" ]; do
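The difference can be seen with a small sketch. It assumes a scratch file created with mktemp, and the counter names are made up for illustration; it simply counts how many times each loop body runs on input whose last segment has no trailing newline.

```bash
# Scratch file whose last segment ("xxx") has no trailing newline.
tmp=$(mktemp)
printf 'abc\ndef\nxxx' > "$tmp"

# Plain loop: read fails at EOF on the unterminated segment,
# so the body is skipped for it.
plain=0
while read -r line; do
    plain=$((plain + 1))
done < "$tmp"

# Guarded loop: the body also runs when read failed but still
# stored some text in $line.
guarded=0
while read -r line || [ -n "$line" ]; do
    guarded=$((guarded + 1))
done < "$tmp"

echo "plain loop ran $plain times, guarded loop ran $guarded times"
rm -f "$tmp"
```

With the three-segment input above, the plain loop runs twice and the guarded loop three times.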
Adding some additional info:
There's no need to use cat with while loop. while ...;do something;done<file is enough.
Don't read lines with for.
When using while loop to read lines:
Set the IFS properly (you may lose indentation otherwise).
You should almost always use the -r option with read.
With the above requirements met, a proper while loop looks like this:
while IFS= read -r line; do
...
done <file
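To see why both IFS= and -r matter, here is a rough sketch. The scratch file and the variable names naive and exact are assumptions for the demo; it reads the same line twice, once with a bare read and once with the recommended form.

```bash
# One line with two leading spaces and a literal backslash.
tmp=$(mktemp)
printf '%s\n' '  C:\temp' > "$tmp"

# Bare read: default IFS strips the leading spaces, and without -r
# the backslash is treated as an escape character and removed.
read naive < "$tmp"

# Recommended form: empty IFS keeps the indentation, -r keeps the backslash.
IFS= read -r exact < "$tmp"

printf 'naive: [%s]\nexact: [%s]\n' "$naive" "$exact"
rm -f "$tmp"
```

The bare read yields C:temp, while the IFS= read -r form returns the line exactly as it is stored in the file.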
And to make it work with files without a newline at end (reposting my solution from here):
while IFS= read -r line || [ -n "$line" ]; do
echo "$line"
done <file
Or using grep with while loop:
while IFS= read -r line; do
echo "$line"
done < <(grep "" file)
As a workaround, a newline can be appended to the file before reading from it.
echo >> "$file_path"
This ensures that the last line previously in the file is terminated and will be read. A plain echo already writes exactly one newline; echo -e "\n" would write two and leave an extra empty line at the end of the file.
https://superuser.com/questions/313938/shell-script-echo-new-line-to-file
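Appending unconditionally adds an empty line when the file already ends with a newline. A possible refinement, sketched here with a hypothetical helper named ensure_final_newline and a scratch file, is to look at the last byte first.

```bash
tmp=$(mktemp)
printf 'abc\ndef' > "$tmp"          # last line is unterminated

# Hypothetical helper: append a newline only when one is missing.
ensure_final_newline() {
    # $(tail -c 1 FILE) is empty when the last byte is a newline
    # (command substitution strips it) or when the file is empty.
    if [ -n "$(tail -c 1 "$1")" ]; then
        echo >> "$1"
    fi
}

ensure_final_newline "$tmp"
ensure_final_newline "$tmp"         # second call changes nothing

# Arithmetic expansion normalises any padding in the wc output.
count=$(( $(wc -l < "$tmp") ))
echo "newline-terminated lines: $count"
rm -f "$tmp"
```

Calling the helper twice still leaves exactly two newline-terminated lines, so it is safe to run before every read loop.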
I tested this in command line
# create dummy file. last line doesn't end with newline
printf "%i\n%i\nNo-newline-here" >testing
Test with your first form (piping to while-loop)
cat testing | while read line; do echo $line; done
This misses the last line: read reports failure when it reaches end-of-file before finding a newline, so the loop body never runs for that final segment.
Test with your second form (command substitution)
for line in `cat testing` ; do echo $line; done
This gets the last line as well
read only reports success when the input is terminated by a newline; that's why you miss the last line.
On the other hand, in the second form
`cat testing`
expands to the form of
line1\nline2\n...lineM
which is separated by the shell into multiple fields using IFS, so you get
line1 line2 line3 ... lineM
That's why you still get the last line.
p/s: What I don't understand is how you get the first form working...
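The field splitting described above also means the for form iterates over words, not lines. A small sketch, assuming a scratch file whose lines contain blanks (the counter names are made up for the demo):

```bash
tmp=$(mktemp)
printf 'first line\nsecond line' > "$tmp"   # two lines, no final newline

# for + command substitution: the output is split on IFS,
# so every blank-separated word becomes its own iteration.
words=0
for item in $(cat "$tmp"); do
    words=$((words + 1))
done

# while + read: one iteration per line, including the unterminated one.
lines=0
while IFS= read -r item || [ -n "$item" ]; do
    lines=$((lines + 1))
done < "$tmp"

echo "for: $words iterations, while: $lines iterations"
rm -f "$tmp"
```

The for loop iterates four times and the while loop twice, so the for form only looks line-oriented when no line contains blanks.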
Use sed to match the last line of the file and append a newline if one does not already exist, editing the file in place:
sed -i '' -e '$a\' file
The code is from this stackexchange link
Note: I have added empty single quotes to -i '' because, at least in OS X, -i was using -e as a file extension for the backup file. I would have gladly commented on the original post but lacked 50 points. Perhaps this will gain me a few in this thread, thanks.
I had a similar issue.
I was doing a cat of a file, piping it to a sort and then piping the result to a 'while read var1 var2 var3'.
ie:
cat $FILE|sort -k3|while read Count IP Name
do
The work under the "do" was an if statement that identified changing data in the $Name field and based on change or no change did sums of $Count or printed the summed line to the report.
I also ran into the issue where I couldn't get the last line to print to the report.
I went with the simple expedient of redirecting the cat/sort to a new file, echoing a newline to that new file and THEN ran my "while read Count IP Name" on the new file with successful results.
ie:
cat $FILE|sort -k3 > NEWFILE
echo "\n" >> NEWFILE
cat NEWFILE |while read Count IP Name
do
Sometimes the simple, inelegant way is the best way to go.

How to split strings over multiple lines in Bash?

How can I split my long string constant over multiple lines?
I realize that you can do this:
echo "continuation \
lines"
>continuation lines
However, if you have indented code, it doesn't work out so well:
    echo "continuation \
    lines"
    >continuation     lines
This is what you may want
$ echo "continuation"\
>      "lines"
continuation lines
If this creates two arguments to echo and you only want one, then let's look at string concatenation. In bash, placing two strings next to each other concatenates them:
$ echo "continuation""lines"
continuationlines
So a continuation line without an indent is one way to break up a string:
$ echo "continuation"\
> "lines"
continuationlines
But when an indent is used:
$ echo "continuation"\
>      "lines"
continuation lines
You get two arguments because this is no longer a concatenation.
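One way to check how many arguments a command actually receives is a tiny helper that prints $#. The function name count_args is made up for this sketch.

```bash
# Hypothetical helper: print the number of arguments it was given.
count_args() { echo "$#"; }

# Adjacent quoted strings form a single word.
joined=$(count_args "continuation""lines")

# A continuation with no indent also keeps the strings adjacent.
unindented=$(count_args "continuation"\
"lines")

# Indenting the second string puts whitespace between the words.
indented=$(count_args "continuation" \
                      "lines")

echo "joined=$joined unindented=$unindented indented=$indented"
```

The first two calls report one argument each; the indented call reports two.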
If you would like a single string which crosses lines, while indenting but not getting all those spaces, one approach you can try is to ditch the continuation line and use variables:
$ a="continuation"
$ b="lines"
$ echo $a$b
continuationlines
This will allow you to have cleanly indented code at the expense of additional variables. If you make the variables local it should not be too bad.
Here documents with the <<-HERE terminator work well for indented multi-line text strings. It will remove any leading tabs from the here document. (Line terminators will still remain, though.)
cat <<-____HERE
continuation
lines
____HERE
See also http://ss64.com/bash/syntax-here.html
If you need to preserve some, but not all, leading whitespace, you might use something like
sed 's/^  //' <<____HERE
    This has four leading spaces.
    Two of them will be removed by sed.
____HERE
or maybe use tr to get rid of newlines:
tr -d '\012' <<-____
continuation
lines
____
(The second line has a tab and a space up front; the tab will be removed by the dash operator before the heredoc terminator, whereas the space will be preserved.)
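Because literal tabs are easy to lose when copying text, the following sketch generates a small script with printf (so every \t is unambiguous) and then runs it. The scratch file is an assumption for the demo; the here document itself mirrors the tab-plus-space case described above.

```bash
tmp=$(mktemp)

# Write a script whose here-document lines start with tab characters.
# The second body line is a tab followed by one space; the closing
# delimiter is itself indented with a tab.
printf 'cat <<-EOF\n\tcontinuation\n\t lines\n\tEOF\n' > "$tmp"

# <<- strips the leading tabs (including the one before the delimiter)
# but leaves the space in front of "lines" alone.
result=$(bash "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

The output keeps the single space before "lines" and drops every leading tab.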
For wrapping long complex strings over many lines, I like printf:
printf '%s' \
"This will all be printed on a " \
"single line (because the format string " \
"doesn't specify any newline)"
It also works well in contexts where you want to embed nontrivial pieces of shell script in another language where the host language's syntax won't let you use a here document, such as in a Makefile or Dockerfile.
printf '%s\n' >./myscript \
'#!/bin/sh' \
"echo \"G'day, World\"" \
'date +%F\ %T' && \
chmod a+x ./myscript && \
./myscript
You can use bash arrays
$ str_array=("continuation"
"lines")
then
$ echo "${str_array[*]}"
continuation lines
There is an extra space because, according to the Bash manual:
If the word is double-quoted, ${name[*]} expands to a single word with
the value of each array member separated by the first character of the
IFS variable
So set IFS='' to get rid of extra space
$ IFS=''
$ echo "${str_array[*]}"
continuationlines
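Setting IFS at the top level affects every later word split in the script. One possible way to limit the change, sketched with a hypothetical function named join_parts, is to make IFS local to a helper and join with "$*", which uses the first character of IFS in the same way as "${name[*]}".

```bash
str_array=("continuation"
           "lines")

# Hypothetical helper: join all arguments with no separator.
join_parts() {
    local IFS=''        # the change is confined to this function
    printf '%s' "$*"    # "$*" joins with the first character of IFS (none here)
}

joined=$(join_parts "${str_array[@]}")
echo "$joined"
```

This prints continuationlines, and IFS keeps its previous value once the function returns.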
In certain scenarios utilizing Bash's concatenation ability might be appropriate.
Example:
temp='this string is very long '
temp+='so I will separate it onto multiple lines'
echo $temp
this string is very long so I will separate it onto multiple lines
From the PARAMETERS section of the Bash Man page:
name=[value]...
...In the context where an assignment statement is assigning a value to a shell variable or array index, the += operator can be used to append to or add to the variable's previous value. When += is applied to a variable for which the integer attribute has been set, value is evaluated as an arithmetic expression and added to the variable's current value, which is also evaluated. When += is applied to an array variable using compound assignment (see Arrays below), the variable's value is not unset (as it is when using =), and new values are appended to the array beginning at one greater than the array's maximum index (for indexed arrays) or added as additional key-value pairs in an associative array. When applied to a string-valued variable, value is expanded and appended to the variable's value.
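The three behaviours of += described in that passage can be tried side by side. This is only a sketch and the variable names are made up for illustration.

```bash
# String variable: the value is appended as text.
text='this string is very long '
text+='so it is built in pieces'

# Integer attribute: the value is added arithmetically.
declare -i total=5
total+=3                # 8, not "53"

# Indexed array with compound assignment: elements are appended.
parts=(a b)
parts+=(c)              # stored at index 2

echo "$text"
echo "total=$total last=${parts[2]} count=${#parts[@]}"
```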
You could simply separate it with newlines (without using a backslash) as required within the indentation, as follows, and just strip off the newlines.
Example:
echo "continuation
of
lines" | tr '\n' ' '
Or, if it is a variable definition, the newlines are converted to spaces by word splitting when the variable is expanded unquoted. So strip the extra spaces only if applicable.
x="continuation
of multiple
lines"
y="red|blue|
green|yellow"
echo $x # This will do as the converted space actually is meaningful
echo $y | tr -d ' ' # Stripping of space may be preferable in this case
This isn't exactly what the user asked, but another way to create a long string that spans multiple lines is by incrementally building it up, like so:
$ greeting="Hello"
$ greeting="$greeting, World"
$ echo $greeting
Hello, World
Obviously in this case it would have been simpler to build it in one go, but this style can be very lightweight and understandable when dealing with longer strings.
Line continuations also can be achieved through clever use of syntax.
In the case of echo:
# echo '-n' flag prevents the trailing newline
echo -n "This is my one-line statement" ;
echo -n " that I would like to make."
This is my one-line statement that I would like to make.
In the case of vars:
outp="This is my one-line statement" ;
outp+=" that I would like to make." ;
echo -n "${outp}"
This is my one-line statement that I would like to make.
Another approach in the case of vars:
outp="This is my one-line statement" ;
outp="${outp} that I would like to make." ;
echo -n "${outp}"
This is my one-line statement that I would like to make.
Voila!
I came across a situation in which I had to send a long message as part of a command argument and had to adhere to the line length limitation. The command looks something like this:
somecommand --message="I am a long message" args
I solved this by moving the message out into a here document (like #tripleee suggested). But a here document is fed to a command's stdin, so it needs to be read back in; I went with the approach below:
message=$(
tr "\n" " " <<-END
This is a
long message
END
)
somecommand --message="$message" args
This has the advantage that $message can be used exactly as the string constant with no extra whitespace or line breaks.
Note that the actual message lines above are prefixed with a tab character each, which is stripped by here document itself (because of the use of <<-). There are still line breaks at the end, which are then replaced by tr with spaces.
Note also that if you don't remove newlines, they will appear as is when "$message" is expanded. In some cases, you may be able to work around this by removing the double quotes around $message, but the message will no longer be a single argument.
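Since tr turns the final line break into a space as well, the captured value ends with one trailing space. If that matters, a small follow-up step can trim it. The sketch below uses an unindented here document (plain << instead of <<-) only to avoid depending on literal tabs.

```bash
message=$(tr '\n' ' ' <<END
This is a
long message
END
)

# Drop the single trailing space left behind by the last newline.
message=${message% }

printf '[%s]\n' "$message"
```

This prints [This is a long message], with no space before the closing bracket.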
Depending on what sort of risks you will accept and how well you know and trust the data, you can use simplistic variable interpolation.
$: x="
this
is
variably indented
stuff
"
$: echo "$x" # preserves the newlines and spacing
this
is
variably indented
stuff
$: echo $x # no quotes, stacks it "neatly" with minimal spacing
this is variably indented stuff
Following #tripleee 's printf example (+1):
LONG_STRING=$( printf '%s' \
'This is the string that never ends.' \
' Yes, it goes on and on, my friends.' \
' My brother started typing it not knowing what it was;' \
" and he'll continue typing it forever just because..." \
' (REPEAT)' )
echo $LONG_STRING
This is the string that never ends. Yes, it goes on and on, my friends. My brother started typing it not knowing what it was; and he'll continue typing it forever just because... (REPEAT)
And we have included explicit spaces between the sentences, e.g. "' Yes...". Also, if we can do without the variable:
echo "$( printf '%s' \
'This is the string that never ends.' \
' Yes, it goes on and on, my friends.' \
' My brother started typing it not knowing what it was;' \
" and he'll continue typing it forever just because..." \
' (REPEAT)' )"
This is the string that never ends. Yes, it goes on and on, my friends. My brother started typing it not knowing what it was; and he'll continue typing it forever just because... (REPEAT)
Acknowledgement for the song that never ends
However, if you have indented code, it doesn't work out so well:
    echo "continuation \
    lines"
    >continuation     lines
Try with single quotes and concatenating the strings:
echo 'continuation' \
'lines'
>continuation lines
Note: echo receives two separate arguments here and prints them joined by a single space.
This probably doesn't really answer your question but you might find it useful anyway.
The first command creates the script that's displayed by the second command.
The third command makes that script executable.
The fourth command provides a usage example.
john#malkovich:~/tmp/so$ echo $'#!/usr/bin/env python\nimport textwrap, sys\n\ndef bash_dedent(text):\n    """Dedent all but the first line in the passed `text`."""\n    try:\n        first, rest = text.split("\\n", 1)\n        return "\\n".join([first, textwrap.dedent(rest)])\n    except ValueError:\n        return text  # single-line string\n\nprint(bash_dedent(sys.argv[1]))' > bash_dedent
john#malkovich:~/tmp/so$ cat bash_dedent
#!/usr/bin/env python
import textwrap, sys

def bash_dedent(text):
    """Dedent all but the first line in the passed `text`."""
    try:
        first, rest = text.split("\n", 1)
        return "\n".join([first, textwrap.dedent(rest)])
    except ValueError:
        return text  # single-line string

print(bash_dedent(sys.argv[1]))
john#malkovich:~/tmp/so$ chmod a+x bash_dedent
john#malkovich:~/tmp/so$ echo "$(./bash_dedent "first line
>     second line
>     third line")"
first line
second line
third line
Note that if you really want to use this script, it makes more sense to move the executable script into ~/bin so that it will be in your path.
Check the python reference for details on how textwrap.dedent works.
If the usage of $'...' or "$(...)" is confusing to you, ask another question (one per construct) if there's not already one up. It might be nice to provide a link to the question you find/ask so that other people will have a linked reference.

Resources