Remove additional newlines from file output - ruby

I have a script that dumps data from a serial port to both a terminal and the hard drive. The output to the terminal looks fine; however, the file gets a ^M written after each line, resulting in an extra newline after every line.
The offending code:
# Run and dump to file.
loop {
  # Read data from the serial port and log it.
  data = sp.read
  data.delete!("\C-M") # Removes carriage returns (^M).
  if data != ""
    puts data
    File.open($log_file, 'a') { |f| f.write(data) }
  end
}
Example output:
On the terminal:
1
2
3
In the file
1
2
3
Edit: The solution is to run data.delete!("\C-M") after the read.

Try reading back the data written to the file in Ruby with read. I suspect the problem is the carriage return characters, which sometimes cause trouble when transferring a file from Windows to Linux or when downloading files via some mail clients.

I don't know what your serial data looks like, but you can always do a chomp on the data variable before writing. Try it and see how it goes.
Edit: If you want to remove the ^M, maybe you can try sp.read.tr("\r","")
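A minimal sketch comparing the approaches mentioned above. It assumes the port delivers lines terminated by "\r\n"; the sample string stands in for sp.read, and the helper name strip_carriage_returns is made up for illustration.

```ruby
# Hypothetical helper: removes every carriage return ("\r", shown as ^M)
# from a chunk of serial data and leaves the "\n" line endings alone.
def strip_carriage_returns(data)
  data.delete("\r")
end

raw = "1\r\n2\r\n3\r\n" # assumed stand-in for what sp.read might return

cleaned = strip_carriage_returns(raw)
same_with_tr = raw.tr("\r", "") # tr with an empty replacement also deletes
only_last = raw.chomp           # chomp trims just the final line ending

p cleaned
p same_with_tr
p only_last
```

Note that chomp only touches the end of the string, so it does not help when one read returns several lines. Also, delete! returns nil when nothing was removed, so rely on its side effect rather than its return value.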


InstallScript GetLine() can not read text file contains result from command prompt

My Installation needs to check the result of a command from cmd.exe. Thus, I redirect the result of the command to a text file and then try to read the file to get the result as follows:
// send command to cmd to execute and redirect the result to a text file
// try to read the file
szDir = "D:\\";
szFileName = "MyFile.txt";
if Is(FILEEXISTS, szDir ^ szFileName) then
    listID = ListCreate(STRINGLIST);
    if listID != LIST_NULL then
        if OpenFileMode(FILE_MODE_NORMAL) = 0 then
            if OpenFile(nFileHandle, szDir, szFileName) = 0 then
                // I run into problem here
                while (GetLine(nFileHandle, szCurLine) = 0 )
                    ListAddString(listID, szCurLine, AFTER);
                endwhile;
                CloseFile(nFileHandle);
            endif;
        endif;
    endif;
endif;
The problem is that right after the command prompt is executed and the result is redirected to MyFile.txt, I can set the open file mode and open the file, but I cannot read any text into my list. ListReadFromFile() does not help. If I open the file, edit and save it manually, my script works.
After debugging, I figured that GetLine() returns an error code (-1) which means the file pointer must be at the end of file or other errors. However, FILE_MODE_NORMAL sets the file as read only and SET THE FILE POINTER AT THE BEGINNING OF THE FILE.
What did I possibly do wrong? Is this something to do with read/write access of the file? I tried this command without result:
icacls D:\MyFile.txt /grant Administrator:(R,W)
I am using InstallShield 2018 and Windows 10 64-bit btw. Your help is much appreciated.
EDIT 1: I suspected the encoding and tried a few things:
1. After running "wslconfig /l", the content of MyFile.txt opened in Notepad++ shows no encoding, but it still appears normal and readable. I tried to convert the content to UTF-8 but it did not work.
2. If I add something to the file (echo This line is appended >> MyFile.txt), the encoding changes to UTF-8, but the content from step 1 changes as well: NULL (\0) is added between every character and even replaces the newline characters. Maybe this is why GetLine() failed to read the file.
Workaround: after step 1, I run "find "my_desired_content" MyFile.txt" > TempFile.txt and read TempFile.txt (which is encoded in UTF-8).
My ultimate goal is to check whether "my_desired_content" appears in the result of "wslconfig /l", so this is fine. However, what I don't understand is why MyFile.txt and TempFile.txt, both created from cmd commands, are encoded differently.
The problem is due to the contents of the file. Assuming this is the file generated by your linked question, you can examine its contents in a hex editor to find out the following facts:
Its contents are encoded in UTF-16 (LE) without a BOM
Its newlines are encoded as CR or CR CR instead of CR LF
I thought the newlines would be more important than the text encoding, but it turns out I had it backwards. If I change each of these things independently, GetLine seems to function correctly for either CR, CR CR, or CR LF, but only handles UTF-16 when the BOM is present. (That is, in a hex editor, the file starts with FF FE 57 00 instead of 57 00 for a file starting with the character W.)
I'm at a bit of a loss for the best way to address this. If you're up for a challenge, you could read the file with FILE_MODE_BINARYREADONLY, and can use your extra knowledge about what should be in the file to ensure you interpret its encoding correctly. Note that for most of UTF-16, you can create a single code unit by combining two bytes in the following manner:
szResult[i] = (nHigh << 8) + nLow;
where nHigh and nLow are probably values like szBuffer[2*i + 1] and szBuffer[2*i], assuming you filled a STRING szBuffer by calling ReadBytes.
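To make the byte arithmetic concrete, here is a rough sketch in Ruby rather than InstallScript, purely as an illustration. It assumes the data is UTF-16LE without a BOM and contains only characters outside the surrogate range; the function name and the sample bytes are invented for the example.

```ruby
# Combines little-endian byte pairs into UTF-16 code units, then into text.
# Assumption: no BOM and no surrogate pairs (basic multilingual plane only).
def decode_utf16le_without_bom(bytes)
  code_units = bytes.each_slice(2).map do |low, high|
    ((high || 0) << 8) + low # same formula as (nHigh << 8) + nLow
  end
  code_units.pack("U*") # build a UTF-8 string from the code points
end

sample = [0x57, 0x00, 0x53, 0x00, 0x4C, 0x00] # "WSL" as UTF-16LE bytes
puts decode_utf16le_without_bom(sample)

# Ruby's own transcoding gives the same result for this input.
puts sample.pack("C*").force_encoding("UTF-16LE").encode("UTF-8")
```

The manual loop mirrors what an InstallScript version would have to do by hand; the last line is only there as a cross-check.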
Other unproven ideas include editing it in binary to ensure the BOM (FF FE) is present, figuring out ways to ensure the file is originally created with the BOM, figuring out ways to create it in an alternate encoding, finding another command you can invoke to "fix" the file, or lodging a request with the vendor (my employer) and hoping the development team changes something to better handle this case.
Here's an easier workaround. If you can safely assume that the command will append UTF-16 characters without a signature, you can append this output to a file that has just a signature. How do you get such a file?
You could create a file with just the BOM in your development environment, and add it to your Support Files. If you need to use it multiple times, copy it around first.
You could create it with code. Just call the following (error checking omitted for clarity)
OpenFileMode(FILE_MODE_APPEND_UNICODE);
CreateFile(nFileHandle, szDir, szFileName);
CloseFile(nFileHandle);
and if szDir ^ szFileName didn't exist, it will now be a file with just the UTF-16 signature.
Assuming this file is called sig.txt, you can then invoke the command
wslconfig /l >> sig.txt to write to that file. Note the doubled >> for append. The resulting file will include the Unicode signature you created ahead of time, plus the Unicode data output from wslconfig, and GetLine should interpret things correctly.
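The signature-first idea can be simulated outside InstallShield. The following Ruby sketch is only an illustration under stated assumptions: the appended text "Ubuntu" is a made-up stand-in for the wslconfig output, and both helper names are hypothetical.

```ruby
require "tmpdir"

UTF16LE_BOM = "\xFF\xFE".b

# Creates a file that holds only the UTF-16LE signature (BOM).
def write_signature_file(path)
  File.binwrite(path, UTF16LE_BOM)
end

# Reads a UTF-16LE file, drops the BOM if present, and returns UTF-8 text.
def read_utf16le(path)
  bytes = File.binread(path)
  bytes = bytes.byteslice(2..) if bytes.start_with?(UTF16LE_BOM)
  bytes.force_encoding("UTF-16LE").encode("UTF-8")
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "sig.txt")
  write_signature_file(path)
  # Stand-in for `wslconfig /l >> sig.txt`: append BOM-less UTF-16LE bytes.
  File.open(path, "ab") { |f| f.write("Ubuntu\r\n".encode("UTF-16LE")) }
  puts File.binread(path).bytes.first(4).map { |b| format("%02X", b) }.join(" ")
  puts read_utf16le(path)
end
```

The hex dump shows the file now starting with FF FE, which is the condition the answer above identifies for GetLine to handle UTF-16.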
The biggest problem here is that this hardcodes around the behavior of wslconfig, and that behavior may change at any point. This is why Christopher alludes to recommending an API, and I agree completely. In the meantime, you could try to make this more robust by invoking it in a cmd /U (but my understanding of what that does or guarantees is fuzzy at best), or by trying the original way and then with the BOM.
This whole WSL thing is pretty new. I don't see any APIs for it, but rather than screen-scraping command output you might want to look at this registry key:
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Lxss
It seems to have the list of installed distros that come from the store. Coming from the store probably explains why this is HKCU and not HKLM.
A brave new world.... sigh.

How to read a txt file in BBx4

I have a >30 year old program in BBx that needs to read something outside its own database. Actually it must be something very simple like
txt$ = read (message.txt)
print txt$
However, there isn't any documentation available. So my question is: how can I read a plain txt file into BBx4?
Simply open the file and read it with READ RECORD:
open (1,err=linenr) "message.txt"
read record (1,siz=1,end=linenr) txt$
This opens the file on channel 1; linenr is the line to go to when there is an error.
siz=1 reads 1 character, siz=100 reads 100, and so on. end= gives the line to go to when end of file is detected.
You can read an ASCII file line by line in a loop, as follows:
ch=unt
open (ch)file$
while 1
read (ch,end=*break)line$
if line$="" then
continue
rem if the line is empty, skip it
fi
print line$
wend
close (ch)
If you know that the content of the file will fit in the memory, you can read it in one:
ch=unt
open (ch)file$
read record (ch,siz=dec(fin(ch)(1,4)))content$
close (ch)
print content$
The fin(ch) is the file information string, bytes 1-4 are the actual file length in bytes (for an ASCII file).

Why must I .read() a file I wrote before being able to actually output the content to the terminal?

I am learning Ruby and am messing with reading/writing files right now. When I create the file, 'filename', I can write to it with the .write() method. However, I cannot output the content to the terminal without reopening it after running .read() on it (see line 8: puts write_txt.read()). I have tried running line 8 multiple times, but all that does is output more blank lines. Without line 8, puts txt.read() simply outputs a blank line. The following code also works without the puts in line 8 (simply write_txt.read())
# Unpacks first argument to 'filename'
filename = ARGV.first
# Let's try writing to a file
write_txt = File.new(filename, 'w+')
write_txt.write("OMG I wrote this file!\nHow cool is that?")
# This outputs a blank line THIS IS THE LINE IN QUESTION
puts write_txt.read()
txt = File.open(filename)
# This actually outputs the text that I wrote
puts txt.read()
Why is this necessary? Why is the file that has clearly been written to being read as blank until it is reopened after being read as blank at least once?
When you read or write to a file, there's an internal pointer called a "cursor" that keeps track of where in the file you currently are. When you write a file, the cursor is set to the point after the last byte you wrote, so that if you perform additional writes, they happen after your previous write (rather than on top of it). When you perform a read, you are reading from the current position to the end of the file, which contains...nothing!
You can open a file (cursor position 0), then write the string "Hello" (cursor position 5), and attempting to read from the cursor will cause Ruby to say "Oh hey, there's no more content in this file past cursor position 5", and will simply return a blank string.
You can rewind the file cursor with IO#rewind to reset the cursor to the beginning of the file. You may then read the file (which will read from the cursor to the end of the file) normally.
Note that if you perform any writes after rewinding, you will overwrite your previously-written content.
# Unpacks first argument to 'filename'
filename = ARGV.first
# Let's try writing to a file
write_txt = File.new(filename, 'w+')
write_txt.write("OMG I wrote this file!\nHow cool is that?")
write_txt.rewind
puts write_txt.read()
Note, however, that it is generally considered bad practice to both read from and write to the same file handle. You would generally open one file handle for reading and one for writing, as mixing the two can have nasty consequences (such as accidentally overwriting existing content by rewinding the cursor for a read, and then performing a write!)
The output is not necessarily written to the file immediately. Also, the pointer is at the end of the file, if you want to read while in read-write mode you have to reset it. You can simply close if you want to reopen it for reading. Try:
write_txt.write("OMG I wrote this file!\nHow cool is that?")
# This outputs a blank line THIS IS THE LINE IN QUESTION
write_txt.close
txt = File.open(filename)
puts txt.read()
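The cursor behaviour described in these answers can be observed directly with IO#pos. This is a small sketch using a temporary file; the helper name write_then_read is invented for the example.

```ruby
require "tempfile"

# Returns [position after the write, read at that position, read after rewind].
def write_then_read(io, text)
  io.write(text)
  position = io.pos # cursor now sits just past the written bytes
  at_end = io.read  # nothing lies between the cursor and end of file
  io.rewind         # move the cursor back to byte 0
  [position, at_end, io.read]
end

Tempfile.create("cursor_demo") do |file|
  p write_then_read(file, "Hello") # => [5, "", "Hello"]
end
```

The empty string in the middle is the "blank line" from the question: the read started at the end of the file, so there was nothing left to return until the cursor was rewound.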

What changes when a file is saved in Kedit for windows that the unix2dos command doesn't do?

So I have a strange question. I have written a script that re-formats data files. I basically create new files with the right column order, spacing, and such. I then unix2dos these files (the program I am formatting these files for is DIPS for windows, and I assume that the files should be ansi). When I go to open the files in the DIPS Program however an error occurs and the file won't open.
When I create the same kind of data file through the DIPS program and open it in Notepad, it matches exactly with the data files I have created with my script.
On the other hand if I open the data files that I have created with my script in Kedit first, save them, and then open them in the DIPS program everything works.
My question is what could saving in Kedit possibly do that unix2dos does not?
(Also, if I try using Notepad or WordPad to save instead of Kedit, the file doesn't open in DIPS.)
Here is what was created using the diff command in unix
"
1,16c1,16
* This file is generated by Dips for Windows.
* The following 2 lines are the Title of this file.
Cobre Panama
Drill Hole B11106-GT
Number of Traverses: 0
Global Orientation is:
DIP/DIPDIRECTION
0.000000 (Declination)
NO QUANTITY
Number of extra columns are: 0
--
* This file is generated by Dips for Windows.
* The following 2 lines are the Title of this file.
Cobre Panama
Drill Hole B11106-GT
Number of Traverses: 0
Global Orientation is:
DIP/DIPDIRECTION
0.000000 (Declination)
NO QUANTITY
Number of extra columns are: 0
18c18
--
440c440
--
442c442
-1
-1
"
Any help would be appreciated! Thanks!
Okay! Figured it out.
Simply put, unix2dos does not strip any space characters between the last letter on a line and the line-break character. Saving in Kedit does strip the spaces between the last letter on a line and the line-break character.
In my script I had a poor programming practice in which I was writing a string like this:
echo "This is an example string       " >> outfile.txt
The character count is 32, and if you could see the break line character (chr(10)) the line would read;
This is an example string
If you unix2dos outfile.txt the line looks the same as above but with a different break line character. However when you place the file into Kedit and save it, now the character count is 25 and the line looks like this;
This is an example string
This occurs because Kedit does not preserve spaces at the end of a line. It places the return or line break character at the last letter or "non space" character in a line.
So programs that read literal input, like DIPS (I'm guessing) or the more widely used AutoCAD scripting, will have a real problem with extra spaces before the return character. Basically, in AutoCAD scripting a space in a line is treated as a return character. So if you have ten extra spaces at the end of a line, it's treated the same as ten returns instead of the one you probably intended.
Oh, and if this helped you out or you thought it was good, please give me a vote up!
unix2dos converts the line-break characters at the end of each line, from unix line breaks (10) to dos line breaks (13, 10)
Kedit could possibly change the encoding of the file (like from ANSI to UTF-8)
You can change the encoding of a file with the iconv utility (on a linux box)
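If the goal is to reproduce both effects in one pass (the line-ending conversion from unix2dos plus the trailing-space stripping that Kedit was found to do), something like the following Ruby sketch could work. The function name is hypothetical, and it assumes only spaces and tabs need trimming.

```ruby
# Strips trailing spaces and tabs from each line and ends every line with
# CRLF, regardless of whether the input used "\n" or "\r\n".
def to_dos_without_trailing_blanks(text)
  text.each_line.map { |line| line.chomp.sub(/[ \t]+\z/, "") + "\r\n" }.join
end

input = "This is an example string       \nNO QUANTITY\n"
p to_dos_without_trailing_blanks(input)
```

Note that a final line without a line break also gets a CRLF appended, which may or may not be what the consuming program expects.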

Writing over previously output lines in the command prompt with ruby

I've run command line programs that output a line, and then update that line a moment later. But with ruby I can only seem to output a line and then another line.
What is being output now:
Downloading file:
11MB 294K/s
12MB 307K/s
14MB 294K/s
15MB 301K/s
16MB 300K/s
Done!
And instead, I want to see this:
Downloading file:
11MB 294K/s
Followed a moment later by this:
Downloading file:
16MB 300K/s
Done!
The line my ruby script outputs that shows the downloaded filesize and transfer speed would be overwritten each time instead of listing the updated values as a whole new line.
I'm currently using puts to generate output, which clearly isn't designed for this case. Is there a different output method that can achieve this result?
Use \r to move the cursor to the beginning of the line. And you should not be using puts as it adds \n, use print instead. Like this:
print "11MB 294K/s"
print "\r"
print "12MB 307K/s"
One thing to keep in mind though: \r doesn't delete anything, it just moves the cursor back, so you would need to pad the output with spaces to overwrite the previous output (in case it was longer).
By default, when \n is printed to the standard output the buffer is flushed. Since print does not add \n, you might need to call STDOUT.flush after print to make sure the text gets printed right away.
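Putting the points above together (print instead of puts, \r, padding, and flushing), a minimal sketch might look like this. The progress values are the ones from the question, the sleep only simulates a download, and progress_line is a made-up helper name.

```ruby
# Builds one in-place update: return to column 0, then the text padded
# with spaces so leftovers from a longer previous line are overwritten.
def progress_line(text, width)
  "\r" + text.ljust(width)
end

updates = ["11MB 294K/s", "12MB 307K/s", "16MB 300K/s"]
width = updates.map(&:length).max

puts "Downloading file:"
updates.each do |update|
  print progress_line(update, width)
  $stdout.flush # show the update now rather than when the buffer fills
  sleep 0.2     # stand-in for real download work
end
puts
puts "Done!"
```

Padding to the widest expected line is one simple choice; tracking the length of the previously printed line would work as well.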
