Cannot read lines from a file using bash read command [duplicate] - bash

I created a text file named hello_world.rtf containing only the following two lines:
Hello
World
and I am trying to read that file with the following bash loop, typed in the terminal:
while test= read -r line; do
    echo "The text read from file is: $line"
done < hello_world.rtf
It prints the following:
The text read from file is: {\rtf1\ansi\ansicpg1252\cocoartf1671\cocoasubrtf500
The text read from file is: {\fonttbl\f0\fswiss\fcharset0 Helvetica;}
The text read from file is: {\colortbl;\red255\green255\blue255;}
The text read from file is: {\*\expandedcolortbl;;}
The text read from file is: \paperw12240\paperh15840\margl1440\margr1440\vieww10800\viewh8400\viewkind0
The text read from file is: \pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
The text read from file is:
The text read from file is: \f0\fs24 \cf0 Hello\
Any suggestions as to what is wrong here and how I can get clean output?

RTF stands for Rich Text Format. It is a text formatting language, developed and used mostly by Microsoft, that has not been actively developed for some time.
The actual content of the file is what your code printed. It contains the words "Hello" and "World", but also formatting instructions.
Save the file as plain text, not RTF, and it will contain only the text you typed in it.
test= in front of read does not have any effect in this context. You can remove it.
Make sure the last line of the file ends with a newline character. read returns a non-zero exit status (which means false) when it reaches the end of the file, so your code exits the while loop without displaying the last value read by read. If the file ends with a newline character, that last value (read but not printed by the code) is empty, so nothing is lost.
It is a recommended practice for text files to always end with a newline character.
Alternatively, you can print the value of line again after the loop. It contains the last line of the file (from the last end-of-line character until the end of the file).
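A common way to fold that "print it again after the loop" step into the loop itself is to keep looping while read either succeeds or leaves something in the variable. The following is a minimal sketch, not the only way to do it; read_all_lines is a hypothetical helper name chosen for illustration.

```shell
#!/bin/bash
# read_all_lines is a hypothetical helper name used for illustration.
read_all_lines() {
    local line
    # read returns non-zero at end of file, but it still stores a partial
    # last line (one without a trailing newline) in $line. The -n test lets
    # the loop body run one more time for that partial line.
    while IFS= read -r line || [[ -n $line ]]; do
        printf 'The text read from file is: %s\n' "$line"
    done < "$1"
}
```

Called as read_all_lines hello_world.txt, this should print every line, including a last line that has no newline after it. IFS= keeps leading and trailing whitespace in each line intact.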

Related

Is there a way to use bash to get specific text content of a .eml?

Total noob here with both bash and working with .eml files, so bear with me...
I have a folder with many saved .eml files, and I want a bash script (if this is not possible with bash, I'm willing to use python, or zsh, or maybe perl--never used perl before, but it may be good to learn) that will print the email content after a line containing a specific textual phrase, and before the next empty line.
I also want this script to combine consecutive lines ending in "=". (Lines that do not end with an "=" sign should continue printing on a new line.)
All of my testing with .txt files that I create manually works fine, but when I use an actual .eml file, things stop working.
Here is a portion of a sample .eml file:
(.eml file continues above)
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable
testing
StartLine (This is where stuff begins)
This is a line that should be printed.
This is a long line that should be printed. Soooooooooooooooooooooooooooooo=
Loooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo L=
oooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo Loo=
oooooooooooooooooooooonnnnnnnnnggggg.
This is where things should stop (no more printing)
Don=92t print me please!
Don=92t print me please!
Don=92t print me please!
[This message is from an external sender.]
(.eml file continues below)
I want the script to output:
This is a line that should be printed.
This is a long line that should be printed. Soooooooooooooooooooooooooooooo Loooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo Loooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo Loooooooooooooooooooooooonnnnnnnnnggggg.
Here is my script so far:
#!/bin/bash
files="/Users/username/Desktop/emails/*"
specifictext="StartLine"
for f in $files
do
    begin=false
    previous=""
    while read -r line
    do
        if [[ -z "$line" ]] #this doesn't seem to be working right
        then
            begin=false
        fi
        if [[ "$begin" = true ]]
        then
            if [[ "${line:0-1}" = "=" ]] #this also doesn't appear to be working
            then
                previous=$previous"${line::${#line}-1}"
            else
                echo $previous$line
            fi
        fi
        if [[ $line = "$specifictext"* ]]
        then
            begin=true
        fi
    done < "$f"
done
This will successfully skip everything up to and including the line containing $specifictext, but then it will print off the entire remainder of each email instead of stopping at the next empty line. Like this:
$ ./printeml.sh
This is a line that should be printed.
This is a long line that should be printed. Soooooooooooooooooooooooooooooo=
Loooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo L=
oooooooooooooooooooooooonnnnnnnnnggggg. Soooooooooooooooooooooooooooooo Loo=
oooooooooooooooooooooonnnnnnnnnggggg.
This is where things should stop (no more printing)
Don=92t print me please!
Don=92t print me please!
Don=92t print me please!
[This message is from an external sender.]
(continues printing remainder of .eml)
As you can see above, the other issue I'm having is that I want to combine lines that end in "=", but that is not working. All the testing I do with test files works fine; it only fails when I use an actual .eml file. I think this is an issue with hidden characters in .eml files, but I'm not really sure how that works.
I'm using bash version 3.2.57(1) on MacOS 12.4.
Both of your problems stem from the fact that the .eml file is using Windows line endings (really, MIME line endings; the specification is designed for transmission over the TELNET protocol and thus dictates the use of CRLF instead of bare LF). Bash doesn't understand those, and sees the carriage return as an ordinary character that happens to be the last character of every line. So the blank lines are really single-character lines containing a carriage return, and the lines ending in an = really end in = followed by a carriage return ($'=\r'). When you check the last character, you're getting the carriage return, which of course is never =.
But that's just part of the problem. You could convert the file to UNIX line endings (though it wouldn't be a valid .eml file at that point) or account for the CRs in your code. However, the trailing equal sign for continued lines is just one part of the "quoted-printable" encoding scheme that the Content-Transfer-Encoding header tells you the message body is using. Another thing you may run into is that quoted-printable bodies cannot legally contain any characters outside the ASCII range; such characters must be written as =XX, where XX is two hex digits. Any Windows-1252 character whose code is above 127 is replaced by =XX with its code in hexadecimal (like the =92 in your sample), as is any literal equal sign, which becomes =3D.
So you should ideally be using some library that understands MIME messages rather than trying to roll your own code to do bits and pieces of the decoding. Perhaps a Perl script using the MIME::Parser module would be appropriate? Or you could use the Python answers given to this question.
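If you still want to stay in bash, the "account for the CRs in your code" route could look roughly like the sketch below. It only strips the trailing carriage return and joins lines that end in a soft line break (=); it does not decode =XX escapes, so a MIME library remains the more robust choice. extract_block is a hypothetical helper name, and the sketch assumes the block of interest ends at the first empty line after the marker.

```shell
#!/bin/bash
# extract_block is a hypothetical helper: print the lines between the first
# line starting with MARKER and the next empty line, tolerating CRLF endings.
# It joins quoted-printable soft line breaks but does NOT decode =XX escapes.
extract_block() {
    local marker=$1 file=$2 begin=false previous="" line
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line%$'\r'}                  # drop a trailing carriage return, if any
        if [[ $begin == true ]]; then
            [[ -z $line ]] && break         # an empty line ends the block
            if [[ $line == *= ]]; then
                previous=$previous${line%=} # soft line break: hold text, drop the "="
            else
                printf '%s\n' "$previous$line"
                previous=""
            fi
        elif [[ $line == "$marker"* ]]; then
            begin=true
        fi
    done < "$file"
}
```

It could be called as extract_block "StartLine" message.eml for each file. Continued lines are joined with nothing inserted between them, which is how quoted-printable soft line breaks are defined; any space between the words has to come from the message itself.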

Weird txt behavior

I have a CentOS server on which I cloned a GitHub repository. The repository contains a .txt file with a single line. For some reason it behaves like this:
[root#0-0-0-0 Some]# cat some.txt
some text[root#0-0-0-0 Some]#
Also, while read i; do echo "$i"; done < some.txt doesn't see that line. What could cause that, and how can I avoid it? If I edit the file with vim, adding a new line and then deleting it (so the file still contains only one line), it starts to work properly.
The text file has no newline character at the end of it. Some programs will treat it as a valid text file whose last line doesn't happen to end in a newline. Others (apparently including bash's built-in read command, at least by default) will treat it as invalid, and perhaps ignore the last line (which isn't considered a "line" because it's not marked as one).
vim's default behavior is to quietly add a newline to the end of a file if you modify and save it.
You can add a newline to a file that lacks one by editing it with vim (or another editor that behaves similarly), or by adding it from the shell:
echo '' >> some.txt
In general, it's a good idea to ensure that text files end in a newline character in the first place, at least if they're intended to be used on UNIX-like systems.
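Running echo '' >> some.txt unconditionally adds an extra empty line when the file already ends in a newline. A minimal sketch of a guarded version follows; ensure_trailing_newline is a hypothetical helper name, and the check relies on command substitution stripping a trailing newline.

```shell
#!/bin/bash
# ensure_trailing_newline is a hypothetical helper name used for illustration.
ensure_trailing_newline() {
    local file=$1
    # $(...) strips a trailing newline, so the result is non-empty only when
    # the file has a last byte and that byte is not a newline.
    if [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
}
```

Calling ensure_trailing_newline some.txt twice should change the file at most once, and an empty file is left untouched.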

How to read a txt file in BBx4

I have a BBx program that is more than 30 years old and needs to read something outside its own database. It should be something very simple, like:
txt$ = read (message.txt)
print txt$
However, there isn't any documentation available. So my question is: how can I read a plain text file into BBx4?
Simply open the file and read it with READ RECORD:
open (1,err=linenr) "message.txt"
read record (1,siz=1,end=linenr) txt$
This opens the file on channel 1; err=linenr is the line to go to when there is an error.
siz=1 reads 1 character, siz=100 reads 100, and so on. end=linenr is the line to go to when the end of the file is detected.
You can read an ASCII file line by line in a loop, as follows:
ch=unt
open (ch)file$
while 1
read (ch,end=*break)line$
if line$="" then
continue
rem if the line is empty, skip it
fi
print line$
wend
close (ch)
If you know that the content of the file will fit in the memory, you can read it in one:
ch=unt
open (ch)file$
read record (ch,siz=dec(fin(ch)(1,4)))content$
close (ch)
print content$
The fin(ch) is the file information string, bytes 1-4 are the actual file length in bytes (for an ASCII file).

Reading in output as one line, not each word

Basically what I am having trouble with is when I type: file *
I will get:
AdvDataStructures.text.ref: ASCII text
makefile: ASCII make commands text
makelib: ASCII English text
README.txt: ASCII Pascal program text
shell3_2016.sh: ASCII text
shell3_2016.sh~: ASCII text
smallTestDir: directory
smallTestDir.text.out: empty
smallTestDir.text.ref: ASCII text
testarg0.text.ref: ASCII text
testarg1.text.ref: ASCII text
testbaddir.text.ref: ASCII text
When I use
for i in `file *`
it reads each space-separated word into i. I need it to read each whole line, such as "AdvDataStructures.text.ref: ASCII text", so I can search it for a pattern.
Also, I have no clue how to do the next step: once I have read a line, I need to count the lines in the file it names. Is there a way to pick out the first word of the output so the script knows which file to read?
For example, I have to read one line at a time (AdvDataStructures.text.ref: ASCII text); if a pattern matches it (I know how to do this with egrep), the script should then count the number of lines in that file (AdvDataStructures.text.ref).
The usual way is to use a while read loop:
file * | while read -r filename description; do
    filename=${filename%:} # remove : after filename
    ...
done
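To make the loop body concrete, here is a minimal sketch that filters on the description and counts lines in the matching files. count_lines_for is a hypothetical helper name; like the loop above, it assumes file names contain no whitespace, and it reads file-style output on standard input.

```shell
#!/bin/bash
# count_lines_for is a hypothetical helper: it reads `file`-style output
# ("name: description") on stdin and prints "name line-count" for every
# entry whose description matches the extended regular expression PATTERN.
count_lines_for() {
    local pattern=$1 filename description
    while read -r filename description; do
        filename=${filename%:}          # remove the ":" after the file name
        if printf '%s\n' "$description" | grep -Eq -- "$pattern"; then
            # $(( ... )) trims the padding some wc implementations print
            printf '%s %s\n' "$filename" "$(( $(wc -l < "$filename") ))"
        fi
    done
}
```

It would be used as file * | count_lines_for 'ASCII text'. Reading the whole line with read -r, rather than iterating over words with for, is what keeps each name and its description together.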

Why the loop of "While ... do ... done" can't read a text file? [closed]

I want to read a line from a text file (currently the file contains only one line; more will be added later) with a "while ... do ... done" loop. The weird thing is that it can only read some text files. My code is:
...(previous commands to create "myfile.txt")...
while read -r line
do
echo "flag"
done < "myfile.txt"
I have tried a few cases. If I replace "myfile.txt" with another file, "test.txt", created by hand in the current directory (it also contains one line), my script prints "flag".
Similarly, if I modify and save "myfile.txt" by hand after it has been created and then run my script, it also prints "flag".
In every other case, my script does not print "flag".
I also tried running chmod and touch on the text file in my script, as shown below, but that does not work either.
Obviously, I want my script to read the line(s) of the text file. Can anybody please tell me the reason and suggest a solution?
BTW, the file can be read by the cat command.
...(previous commands to create "myfile.txt")...
chmod 777 "myfile.txt"
touch "myfile.txt"
cat "myfile.txt" #(I can see the results of this line)
while read -r line
do
echo "flag"
done < "myfile.txt"
Thanks!
The whole program that creates the text file is around 800 lines, so I am posting only the lines that create the file. Here they are:
for (i = 1; i <= 6; ++i) {
    ...
    ofstream myfile("myfile.txt", std::ios_base::app);
    ...
    if (myfile.is_open()) {
        myfile << "rms_" << std::setprecision(3) << RMS_values;
        myfile.close();
    }
}
**************** Beginning of my solution ****************************************
Thanks for the replies above.
I solved it myself with the help of this link: https://unix.stackexchange.com/questions/31807/what-does-the-noeol-indicator-at-the-bottom-of-a-vim-edit-session-mean
The reason is that the code producing the text file does not write a "\n" at the end. So the text file shows a "[noeol]" indicator after the filename when opened in vi.
According to the above link, "[noeol]" means the last line has no end-of-line character, and in that case UNIX/Linux tools such as read may not process that last line.
The solution is rather simple (in hindsight): just add << "\n" at the end of the output statement. The line becomes:
myfile << "rms_" << std::setprecision(3) << RMS_values << "\n";
**************** End of my solution ****************************************
$ cat test.sh
#!/bin/bash
echo "content" > "myfile.txt"
cat "myfile.txt" #(I can see the results of this line)
while read -r line
do
echo "flag"
done < "myfile.txt"
$ bash test.sh
content
flag
$
It works; there is no problem with it. The script is an exact copy of what you posted, except that the touch is replaced with a command that writes some content. The while loop prints one message per line in the file, so if there are no lines (and touch won't add any), it will obviously print nothing.
I'm taking a guess here:
In Unix, two assumptions are made about text files:
All lines end in a <LF> character. If you edit your file on an old, old Mac that used <CR>, Unix won't see the line endings. If you edit a file with a Windows program like Notepad.exe, your lines will end in <CR><LF> and Unix will assume the <CR> is part of the line.
All lines must end in a <LF>, including the last line. If you write the file from a C program, the last line may not end in a <LF> unless you specifically write it out.
Unix utilities like awk and grep, and the shells, live and breathe on these assumptions. Usually, when someone tells me something doesn't quite work when reading a file with a shell script, I tell them to open that file in Vim and save it (thus forcing a final <LF> character). In Vim, you need to run :set ff=unix and then save. That usually takes care of the issue.
My guess is that your file you're reading in doesn't have the correct line endings, and/or that the last line doesn't have that <LF> character on the end.
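Both guesses can be checked from the shell before editing anything. The sketch below is one way to do it, not a standard tool; check_text_file is a hypothetical helper name and the messages are made up for illustration.

```shell
#!/bin/bash
# check_text_file is a hypothetical helper that reports the two problems
# described above: carriage returns in the file, and a missing final <LF>.
check_text_file() {
    local file=$1
    if grep -q $'\r' "$file"; then
        echo "contains carriage returns (CR)"
    fi
    # Non-empty file whose last byte is not a newline: $(...) strips a
    # trailing newline, so the substitution is empty only for a proper ending.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        echo "last line has no trailing newline"
    fi
}
```

Running check_text_file myfile.txt prints nothing for a file with plain <LF> endings and a final newline, and one message per problem otherwise.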
I don't really understand your question - can you show us more code/how you create the file?
Here is a working example:
$ cat readfile.sh
#!/bin/bash
{
cat <<EOT
this
is
a
test
file
EOT
} > ./test.txt
while read -r line; do
echo "line = [${line}]"
done <./test.txt
$ ./readfile.sh
line = [this]
line = [is]
line = [a]
line = [test]
line = [file]
