Ruby file writes in Windows returning wrong file sizes? - ruby

I'm still learning Ruby, so I'm sure I'm doing something wrong here, but using Ruby 1.9.3 on Windows, I'm having a problem writing a file of random ASCII garbage at a specific size. I need to be able to write these files for a test on an application I'm QAing. On Mac and on *nix, the file size is written correctly every time. But on Windows, it generates files of random size, generally between 1,024 bytes and 1,031 bytes.
I'm sure the problem is that one of the characters rstr generates is being counted as two characters, but it seems like this shouldn't happen.
Here is my code:
num = 10
k = 1
for i in 1..num
  fname = "f#{i}.txt"
  f = File.new(fname, "w")
  for k in 1..size
    rstr = "#{(1..1024).map{rand(255).chr}.join}"
    f.write rstr
    print " #{rstr.size} " # this returns 1024 every time.
    rstr = ""
  end
  f.close
end
Also tried:
opts = {}
opts[:encoding] = "UTF-8"
fname = "f#{i}.txt"
f = File.new(fname, "w", opts)

By default, files opened on Windows are opened in text mode, meaning that line endings and other details are adjusted.
If you want the file written byte for byte exactly as you intend, you need to open it in binary mode:
File.open("foo", "wb") do |f|
  # ...
end
The b does not change how line endings are handled on POSIX operating systems, so your scripts are now cross-platform compatible.
Note: I used File.open with block syntax so the file is properly closed and its handle released once the block finishes. You no longer need to worry about closing the file ;-)
Hope this helps.
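As a rough sketch of the advice above, the random data can be built from raw byte values and written in binary mode. The write_garbage helper and its parameters are made up for illustration and are not part of the original question.

```ruby
# Hypothetical helper: write `kilobytes` KiB of random bytes to `path`
# and return the resulting size on disk.
def write_garbage(path, kilobytes)
  # "wb" turns off newline translation on Windows, so each byte is written as-is.
  File.open(path, "wb") do |f|
    kilobytes.times do
      # rand(256) covers every byte value from 0 to 255;
      # pack("C*") turns the integers into one binary string.
      chunk = Array.new(1024) { rand(256) }.pack("C*")
      f.write(chunk)
    end
  end
  File.size(path)
end
```

Calling write_garbage("f1.txt", 4) should report 4096 bytes regardless of platform, because no byte is rewritten on the way to disk.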

There is no ASCII character 255; rand(255) only produces values from 0 to 254.
If you try to print 255.chr, you'll get a multibyte character.
As Windows does not use UTF-8 by default, you'll get incorrect values. Hence the problem you're facing!
Try adding #coding: utf-8 at the top of your file. It should get things working.

Related

Ruby tempfile corruption of binary files

After a lot of digging around, I've found that RubyZip can corrupt binary files. After taking a closer look, it seems the Tempfile class can't correctly re-open binary files. To demonstrate the effect, take the following script:
require 'tempfile'
tmp = Tempfile.new('test.bin', Dir.getwd)
File.open('test.bin', 'rb') { |h| IO.copy_stream(h, tmp) } # => 2
# 2 is the expected number of bytes
tmp.close
# temporary file (looking in OS) now really IS 2 bytes in size
tmp.open
# temporary file (looking in OS) now is 1 byte in size
tmp.binmode
# temporary file (looking in OS) still has the wrong number of bytes (1)
tmp.read.length # => 1
# And here is the problem I keep bumping into
The test.bin file I'm using contains only two bytes: 00 1a. After corruption, the temporary file contains 1 byte: 00. If it matters, I'm running Windows.
Is there something that I'm missing? Is this intentional behavior? And if so, is there a way to prevent this behavior?
Thank you
The instance open method is documented as:
Opens or reopens the file with mode r+.
This means you can't depend on that method to open it in the correct mode. That's not a big deal since the normal use of Tempfile is different:
tmp = Tempfile.new('test.bin', Dir.getwd)
File.open('test.bin', 'rb') { |h| IO.copy_stream(h, tmp) } # => 2
tmp.rewind
Now once it's been "rewound" you can read any data you want from it starting from the beginning.
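A minimal sketch of that pattern, with one assumption added on top of the answer: binmode is called before any data is written, so the temporary file never passes through text mode. The payload below stands in for the two-byte test.bin from the question.

```ruby
require 'tempfile'

# Stand-in for test.bin: 0x00 followed by 0x1A.
payload = "\x00\x1A".b

tmp = Tempfile.new('test.bin')
tmp.binmode        # switch to binary mode before writing anything
tmp.write(payload)
tmp.rewind         # go back to the start instead of close + open
data = tmp.read
tmp.close!         # close and delete the temporary file

puts data.bytesize
```

Because the file is never reopened through Tempfile#open, the r+ text-mode reopen described above does not come into play.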

Ruby writing zip file works on Mac but not on Windows / How to receive zip file in Net::HTTP

I'm writing a Ruby script which accesses an API based on HTTP POST calls.
The API returns a zip file containing text documents when I call it with specific POST parameters. At the moment I'm doing that with the Net::HTTP package.
Now my problem:
As far as I can tell, it returns the zip file as a string. I can see "PK" (which I suppose is part of the PK header of zip files) and the text from the documents.
And the Content-Type header is telling me "application/x-zip-compressed; name="somename.zip"".
When i save the zip file like so:
result = comodo.get_cert("<somenumber>")
puts result['Content-Type']
puts result.inspect
puts result.body
File.open("test.zip", "w") do |file|
  file.write result.body
end
I can unzip it on my MacBook without further problems. But when I run the same code on my Win10 PC, it tells me that the file is corrupt or not a ZIP file.
Does it have something to do with the encoding? Can I change it so it works on both?
Or is this a completely wrong approach to receiving a zip file from a POST request?
PS:
My ruby-version on Mac:
ruby 2.2.3p173
My ruby-version on Windows:
ruby 2.2.4p230
Many thanks in advance!
The problem is due to the way Windows handles line endings (\r\n for Windows, whereas OS X and other Unix based operating systems use just \n). When using File.open, using the mode of just w makes the file subject to line ending changes, so any occurrences of byte 0x0A (or \n) are converted into bytes 0x0D 0x0A (or \r\n), which effectively breaks the zip.
When opening the file for write, use the mode wb instead, as this will suppress any line ending changes.
http://ruby-doc.org/core-2.2.0/IO.html#method-c-new-label-IO+Open+Mode
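A small sketch of the difference in practice. The body string is a made-up stand-in for result.body (a few bytes that include 0x0A, as compressed data often does), and the temporary directory is only there to keep the example self-contained.

```ruby
require 'tmpdir'

# Stand-in for result.body: binary data that happens to contain 0x0A bytes.
body = "PK\x03\x04\n\x00\n\x00".b

written = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "test.zip")
  # "wb" writes the bytes untouched; File.binwrite(path, body) is an equivalent one-liner.
  File.open(path, "wb") { |file| file.write(body) }
  written = File.size(path)
end

puts "body: #{body.bytesize} bytes, on disk: #{written} bytes"
```

With mode "wb" the two numbers should match on every platform; with plain "w" on Windows the file would grow by one byte for each 0x0A in the body.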
Many thanks! Just as you posted the solution, I found it out myself.
So much trouble because of one missing 'b' :/
Thank you very much!
The solution (see Ben Y's answer):
result = comodo.get_cert("<somenumber>")
puts result['Content-Type']
puts result.inspect
puts result.body
File.open("test.zip", "wb") do |file|
  file.write result.body
end

How can I securely erase a file?

Is there a Gem or means of securely erasing a file in Ruby? I'd like to avoid external programs that may not be present on the system.
By "secure erase" I'm referring to overwriting the file contents.
If you are on *nix, a pretty good way would be to just call shred using exec/open3/open4:
`shred -fxuz #{filename}`
http://www.gnu.org/s/coreutils/manual/html_node/shred-invocation.html
Check this similar post:
Writing a file shredder in python or ruby?
Something like this will get you started:
#!/usr/bin/env ruby
abort "Missing filename" if ARGV.empty?
ARGV.each do |filename|
  filesize = File.size(filename)
  [0x00, 0xff].each do |byte|
    # 'r+b' opens the existing file without truncating it first, unlike 'wb'
    File.open(filename, 'r+b') do |fo|
      filesize.times { fo.print(byte.chr) }
    end
  end
end
It should get you close.
For more thoroughness, you could also use 0xaa and 0x55 for alternating 0 and 1 bits in the byte. Random.rand(0x100) will give you a random value from 0 to 255.
Just:
1. open the file
2. write some garbage, at least as much as the current file size
3. flush() and close()
4. repeat N times, mixing garbage with zeroes and 0xff's on different passes
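The steps above could be sketched roughly as follows. The overwrite_passes helper, its default pass list, and the 64 KiB chunk size are all assumptions made for illustration. Note also that overwriting at the file level does not guarantee the old blocks are gone on journaling file systems or SSDs.

```ruby
require 'securerandom'

# Hypothetical helper: overwrite a file in place, one pass per pattern.
# A pattern is either :random or a byte value such as 0x00 or 0xff.
def overwrite_passes(path, patterns = [:random, 0x00, 0xff])
  size = File.size(path)
  patterns.each do |pattern|
    # 'r+b' opens the existing file for update in binary mode without truncating it.
    File.open(path, 'r+b') do |f|
      remaining = size
      while remaining > 0
        chunk = [remaining, 64 * 1024].min
        data = pattern == :random ? SecureRandom.random_bytes(chunk) : pattern.chr * chunk
        f.write(data)
        remaining -= chunk
      end
      f.flush   # push Ruby's buffer to the OS
      f.fsync   # ask the OS to push its buffers to the device
    end
  end
end
```

After the last pass the file still has its original length, so it can then be removed with File.delete.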

Why won't gsub! change my files?

I am trying to do a simple find/replace on all text files in a directory, modifying any instance of [RAVEN_START: by inserting a string (in this case 'raven was here') before the line.
Here is the entire ruby program:
#!/usr/bin/env ruby
require 'rubygems'
require 'fileutils' #for FileUtils.mv('your file', 'new location')
class RavenParser
  rawDir = Dir.glob("*.txt")
  count = 0
  rawDir.each do |ravFile|
    #we have selected every text file, so now we have to search through the file
    #and make the needed changes.
    rav = File.open(ravFile, "r+") do |modRav|
      #Now we've opened the file, and we need to do the operations.
      if modRav
        lines = File.open(modRav).readlines
        lines.each { |line|
          if line.match /\[RAVEN_START:.*\]/
            line.gsub!(/\[RAVEN_START:/, 'raven was here '+line)
            count = count + 1
          end
        }
        printf("Total Changed: %d\n",count)
      else
        printf("No txt files found. \n")
      end
    end
    #end of file replacing instructions.
  end
  # S
end
The program runs and compiles fine, but when I open up the text file, there has been no change to any of the text within the file. count increments properly (that is, it is equal to the number of instances of [RAVEN_START: across all the files), but the actual substitution is failing to take place (or at least not saving the changes).
Is my syntax on the gsub! incorrect? Am I doing something else wrong?
You're reading the data, updating it, and then neglecting to write it back to the file. You need something like:
# And save the modified lines (each one already ends in "\n", so a plain join is enough).
File.open(modRav, 'w') { |f| f.write lines.join }
immediately before or after this:
printf("Total Changed: %d\n",count)
As DMG notes below, just overwriting the file isn't properly paranoid, as you could be interrupted in the middle of the write and lose data. If you want to be paranoid (which all of us should be because they really are out to get us), then you want to write to a temporary file and then do an atomic rename to replace the original file with the new one. A rename generally only works within a single file system, and there is no guarantee that the OS's temp directory (which Tempfile uses by default) is on the same file system as modRav, so File.rename might not even be an option with a Tempfile unless precautions are taken. But the Tempfile constructor takes a tmpdir parameter, so we're saved:
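require 'tempfile'
modRavDir = File.dirname(File.realpath(modRav))
# Create the temp file next to the original so the rename stays on one file system.
tmp = Tempfile.new(File.basename(modRav), modRavDir)
tmp.write(lines.join) # lines from readlines already end in "\n"
tmp.close
File.rename(tmp.path, modRav)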
You might want to stick that in a separate method (safe_save(modRav, lines) perhaps) to avoid further cluttering your block.
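One possible shape for such a method, as a sketch only. It assumes path is a plain file name (not an open File object) and that lines came from readlines, so each entry already carries its newline.

```ruby
require 'tempfile'

# Hypothetical helper, named after the safe_save(modRav, lines) suggestion above.
def safe_save(path, lines)
  dir = File.dirname(File.realpath(path))
  # Create the temp file next to the target so the rename stays on one file system.
  tmp = Tempfile.new(File.basename(path), dir)
  tmp.write(lines.join)
  tmp.close
  # Swap the finished file into place under the original name.
  File.rename(tmp.path, path)
end
```

One caveat worth checking before relying on this: Tempfile creates its file with restrictive permissions, so the replaced file may not keep the mode of the original.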
There is no gsub! in the post (except the title and question). I would actually recommend not using gsub!, but rather use the result of gsub -- avoiding mutability can help reduce a number of subtle bugs.
The line read from the file stream into a String is a copy and modifying it will not affect the contents of the file. (The general approach is to read a line, process the line, and write the line. Or do it all at once: read all lines, process all lines, write all processed lines. In either case, nothing is being written back to the file in the code in the post ;-)
Happy coding.
You're not using gsub!, you're using gsub. gsub! and gsub are different methods: the first does the replacement on the object itself, while the second does the replacement and returns the result as a new string.
Change this
line.gsub(/\[RAVEN_START:/, 'raven was here '+line)
to this :
line.gsub!(/\[RAVEN_START:/, 'raven was here '+line)
or this:
line = line.gsub(/\[RAVEN_START:/, 'raven was here '+line)
See String#gsub for more info
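To make the difference concrete, here is a small sketch on a made-up sample line. The replacement text is simplified compared with the question's code. Either way only the in-memory string changes, so the result still has to be written back to the file.

```ruby
pattern = /\[RAVEN_START:/
line = "[RAVEN_START: intro]\n".dup

# gsub returns a modified copy and leaves the receiver alone.
copy = line.gsub(pattern, 'raven was here [RAVEN_START:')
untouched = line.dup

# gsub! modifies the receiver in place and returns it...
changed = line.gsub!(pattern, 'raven was here [RAVEN_START:')

# ...or returns nil when nothing matched.
unmatched = "plain text".dup.gsub!(pattern, 'x')
```

The nil return is the usual trap with gsub!: chaining another method onto its result fails whenever the pattern does not match.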

Read binary file as string in Ruby

I need an easy way to take a tar file and convert it into a string (and vice versa). Is there a way to do this in Ruby? My best attempt was this:
file = File.open("path-to-file.tar.gz")
contents = ""
file.each {|line|
contents << line
}
I thought that would be enough to convert it to a string, but then when I try to write it back out like this...
newFile = File.open("test.tar.gz", "w")
newFile.write(contents)
It isn't the same file. Doing ls -l shows the files are of different sizes, although they are pretty close (and opening the file reveals most of the contents intact). Is there a small mistake I'm making or an entirely different (but workable) way to accomplish this?
First, you should open the file as a binary file. Then you can read the entire file in, in one command.
file = File.open("path-to-file.tar.gz", "rb")
contents = file.read
That will get you the entire file in a string.
After that, you probably want to file.close. If you don’t do that, file won’t be closed until it is garbage-collected, so it would be a slight waste of system resources while it is open.
If you need binary mode, you'll need to do it the hard way:
s = File.open(filename, 'rb') { |f| f.read }
If not, shorter and sweeter is:
s = IO.read(filename)
To avoid leaving the file open, it is best to pass a block to File.open. This way, the file will be closed after the block executes.
contents = File.open('path-to-file.tar.gz', 'rb') { |f| f.read }
How about some open/close safety:
string = File.open('file.txt', 'rb') { |file| file.read }
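Putting the read and the write together, a round trip through a string might look like the sketch below. The copy_via_string name is made up for this example.

```ruby
# Hypothetical helper: read src fully into a string, then write it back out to dst.
def copy_via_string(src, dst)
  # "rb" and "wb" keep every byte intact on both the read and the write side.
  contents = File.open(src, "rb") { |f| f.read }
  File.open(dst, "wb") { |f| f.write(contents) }
  contents
end
```

After copy_via_string("path-to-file.tar.gz", "test.tar.gz"), the two files should have the same size, because neither side translates line endings.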
Ruby has a method for binary reading:
data = IO.binread("path/filename")
or, on Ruby versions older than 1.9.2:
data = IO.read("path/file")
On OS X these are the same for me... could this maybe be an extra "\r" on Windows?
In any case you may be better off with:
contents = File.read("e.tgz")
newFile = File.open("ee.tgz", "w")
newFile.write(contents)
You can probably encode the tar file in Base64. Base64 will give you a pure ASCII representation of the file that you can store in a plain text file. Then you can retrieve the tar file by decoding the text back.
You do something like:
require 'base64'
file_contents = Base64.encode64(tar_file_data)
Have a look at the Base64 Rubydocs to get a better idea.
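A short round-trip sketch of that idea. The tar_file_data bytes below are a made-up stand-in for real file contents, and strict_encode64 is used instead of encode64 only so that the result is a single line.

```ruby
require 'base64'

# Stand-in for the bytes of a real tar file.
tar_file_data = "\x1F\x8B\x08\x00\r\n\x1A\xFF".b

encoded = Base64.strict_encode64(tar_file_data)  # plain ASCII text, no line breaks
decoded = Base64.strict_decode64(encoded)        # back to the original bytes

puts encoded
```

The encoded form is roughly a third larger than the input, which is the price of being safe to store as text.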
Ruby 1.9+ has IO.binread (see @bardzo's answer) and also supports passing the encoding as an option to IO.read:
Ruby 1.9
data = File.read(name, {:encoding => 'BINARY'})
Ruby 2+
data = File.read(name, encoding: 'BINARY')
(Note in both cases that 'BINARY' is an alias for 'ASCII-8BIT'.)
If you can encode the tar file with Base64 (and store it in a plain text file), you can use
File.open("my_tar.txt").each {|line| puts line}
or
File.new("name_file.txt", "r").each {|line| puts line}
to print each (text) line in the cmd.
