Ruby File IO: Can't open url as File object

I have a function in my code that takes a string representing the url of an image and creates a File object from that string, to be attached to a Tweet. This seems to work about 90% of the time, but occasionally fails.
require 'open-uri'
attachment_url = "https://s3.amazonaws.com/FirmPlay/photos/images/000/002/443/medium/applying_too_many_jobs_-_daniel.jpg?1448392757"
image = File.new(open(attachment_url))
If I run the above code it returns TypeError: no implicit conversion of StringIO into String. If I change open(attachment_url) to open(attachment_url).read I get ArgumentError: string contains null byte. I also tried stripping out the null bytes from the file like so, but that also made no difference.
image = File.new(open(attachment_url).read.gsub("\u0000", ''))
Now if I try the original code with a different image, such as the one below, it works fine. It returns a File object as expected:
attachment_url = "https://s3.amazonaws.com/FirmPlay/photos/images/000/002/157/medium/mike_4.jpg"
I thought maybe it had something to do with the params in the original url, so I stripped those out, but it made no difference. If I open the images in Chrome they appear to be fine.
I'm not sure what I'm missing here. How can I resolve this issue?
Thanks!
Update
Here is the working code I have in my app:
filename = self.attachment_url.split(/[\/]/)[-1].split('?')[0]
stream = open(self.attachment_url)
image = File.open(filename, 'w+b') do |file|
  stream.respond_to?(:read) ? IO.copy_stream(stream, file) : file.write(stream)
  open(file)
end
Jordan's answer works except that calling File.new returns an empty File object, whereas File.open returns a File object containing the image data from stream.

The reason you're getting TypeError: no implicit conversion of StringIO into String is that open-uri's open sometimes returns a Tempfile object and sometimes returns a StringIO object, which is unfortunate and confusing. Which one you get depends on the size of the response: bodies of up to about 10 KB come back as a StringIO, larger ones as a Tempfile. File.new can treat a Tempfile as a path but cannot do the same with a StringIO, which is why only some images fail. See this answer for more information: open-uri returning ASCII-8BIT from webpage encoded in iso-8859 (Although I don't recommend using the ensure-encoding gem mentioned therein, since it hasn't been updated since 2010 and Ruby has had significant encoding-related changes since then.)
The reason you're getting ArgumentError: string contains null byte is that you're trying to pass the image data as the first argument to File.new:
image = File.new(open(attachment_url).read)
The first argument of File.new should be a filename, and null bytes aren't allowed in filenames on most systems. Try this instead:
image_data = open(attachment_url) # on Ruby 3.0+ use URI.open(attachment_url)
filename = 'some-filename.jpg'
# File.open, unlike File.new, runs the block and closes the file afterwards
File.open(filename, 'wb') do |file|
  if image_data.respond_to?(:read)
    IO.copy_stream(image_data, file)
  else
    file.write(image_data)
  end
end
The above opens the file (creating it if it doesn't exist; the b in 'wb' tells Ruby that you're going to write binary data), then writes the data from image_data to it using IO.copy_stream if it's an IO-like object such as StringIO, or File#write otherwise, then closes the file again.
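As a minimal sketch of the approach above, the copy step can be wrapped in a helper and tried without a network connection. The helper name save_download and the sample bytes are assumptions for illustration; a StringIO stands in for what open-uri returns for a small response.

```ruby
require 'stringio'
require 'tmpdir'

# Hypothetical helper (not part of open-uri): copies an IO-like source to
# `path` in binary mode and returns the number of bytes copied.
def save_download(source, path)
  File.open(path, 'wb') do |file|
    IO.copy_stream(source, file)
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'some-filename.jpg')
  # Made-up bytes, including a null byte, standing in for image data
  fake_image = StringIO.new("\xFF\xD8\x00\x10fake".b)
  bytes = save_download(fake_image, path)
  puts "#{bytes} bytes copied, #{File.size(path)} on disk"
end
```

Because the data is passed as file contents rather than as a filename, the null bytes in it are harmless. On Ruby 3.0 and later the source would come from URI.open(attachment_url), since Kernel#open no longer handles URLs.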

If you use Paperclip, it has a method to copy the attachment to disk.
def raw_image_data
  attachment.copy_to_local_file.read
end
Change attachment to whatever variable you used, of course.

Related

How to save an openssl key/iv to a file and then recover it?

I'm trying to encrypt a file, save the key/iv to a file, then recover the key/iv from the file.
For some reason, after I read the file, the data has changed in some way that I cannot fathom.
See below for a MWE:
require 'openssl'
cipher = OpenSSL::Cipher.new('aes-256-gcm')
cipher.encrypt
original = cipher.random_key
File.open("foo", "w") {|f| f.write(original) }
readfromfile = File.read("foo")
if readfromfile != original
  puts "The information has changed, but why?"
end
I am expecting the data to be unchanged after I read it from the file, but Ruby always reports them as different.
When I print original and readfromfile they always look identical. When I compare original and cat of the file they look identical.
Both values are of class String.
If I save any other string into the file and read it back it stays the same.
I get the same result whether I generate a key or iv.
What is happening?
Secondary question: is there a way in Ruby to run a comparison that returns what the difference is? Something like diff?
In the end I could not get to the bottom of why the data was changing, but I solved the problem by encoding the iv and key to base64 (using the standard library's Base64 module: https://ruby-doc.org/stdlib-2.5.3/libdoc/base64/rdoc/Base64.html) before saving them to file, and then decoding them after reading them back.
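A likely explanation, offered as a sketch rather than a confirmed diagnosis: cipher.random_key returns a binary (ASCII-8BIT) string, while File.read tags what it reads with the default external encoding, usually UTF-8. Ruby's String#== treats two strings with different encodings as unequal when they contain non-ASCII bytes, even if the bytes match. The example below uses fixed bytes in place of a random key so the result is reproducible; round_trip is a hypothetical helper name.

```ruby
require 'tmpdir'

# Hypothetical helper: writes `bytes` in binary mode, then reads them back
# once as UTF-8 text and once as binary.
def round_trip(bytes)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'key.bin')
    File.binwrite(path, bytes)
    { text: File.read(path, encoding: 'UTF-8'), binary: File.binread(path) }
  end
end

key = "\xDE\xAD\xBE\xEF\x00\x01".b # fixed bytes standing in for cipher.random_key
result = round_trip(key)

puts result[:text] == key              # false: same bytes, different encoding
puts result[:text].bytes == key.bytes  # true: comparing raw bytes shows no difference
puts result[:binary] == key            # true: binread keeps the binary encoding
puts result[:text].encoding            # UTF-8
puts key.encoding                      # ASCII-8BIT
```

Comparing .bytes and printing .encoding is also one way to see what actually differs between two strings that look identical.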

How to read a large file into a string

I'm trying to save and load the states of Matrices (using Matrix) during the execution of my program with the functions dump and load from Marshal. I can serialize the matrix and get a ~275 KB file, but when I try to load it back as a string to deserialize it into an object, Ruby gives me only the beginning of it.
# when I want to save
mat_dump = Marshal.dump(@mat) # serialize object - OK
File.open('mat_save', 'w') {|f| f.write(mat_dump)} # write String to file - OK
# somewhere else in the code
mat_dump = File.read('mat_save') # read String from file - only reads like 5%
@mat = Marshal.load(mat_dump) # deserialize object - "ArgumentError: marshal data too short"
I tried to change the arguments for load but didn't find anything yet that doesn't cause an error.
How can I load the entire file into memory? If I could read the file chunk by chunk, then loop to store it in the String and then deserialize, it would work too. The file has basically one big line so I can't even say I'll read it line by line, the problem stays the same.
I saw some questions about the topic:
"Ruby serialize array and deserialize back"
"What's a reasonable way to read an entire text file as a single string?"
"How to read whole file in Ruby?"
but none of them seem to have the answers I'm looking for.
Marshal is a binary format, so you need to read and write in binary mode. The easiest way is to use IO.binread and IO.binwrite.
...
IO.binwrite('mat_save', mat_dump)
...
mat_dump = IO.binread('mat_save')
@mat = Marshal.load(mat_dump)
Remember that marshaling is Ruby version dependent. It's only compatible with other Ruby versions under specific circumstances. So keep this in mind:
In normal use, marshaling can only load data written with the same major version number and an equal or lower minor version number.
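A self-contained sketch of the binary round trip described above. The helper names save_object and load_object are made up for this example, and a nested array stands in for the matrix so that nothing beyond the core library is needed.

```ruby
require 'tmpdir'

# Hypothetical helpers: persist any marshalable object using binary I/O.
def save_object(obj, path)
  File.binwrite(path, Marshal.dump(obj))
end

def load_object(path)
  Marshal.load(File.binread(path))
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'mat_save')
  # 200x200 nested array as a stand-in for the matrix in the question
  rows = Array.new(200) { |i| Array.new(200) { |j| i * j } }
  save_object(rows, path)
  puts "file size: #{File.size(path)} bytes"
  puts "round trip intact: #{load_object(path) == rows}"
end
```

Only load marshaled data from files you wrote yourself, since Marshal.load can instantiate arbitrary objects.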

Parse PNG string into Ruby file

How do I parse a PNG in string format (like below) into a file in Ruby?
\x89PNG\r\n\x1A\n\x00\x00\x00\rIHDR\x00\x00\x01,\x00\x00\
Adding more details I left in the comments.
If the PNG existed on the file system, I could open the file with File.open. I want the same object File.open would create, but I need to create it from the string, not the file system.
Ultimately, I want to assign this to a Paperclip attachment and have it recognize the object as a png.
File is just an implementation of IO. Ruby has another IO implementation that can read/write strings called, obviously, StringIO.
file = StringIO.new("\x89PNG\r\n\x1A\n\x00\x00\x00\rIHDR\x00\x00\x01,\x00\x00")
Your comment suggests you need this to work with Paperclip. In that case Paperclip will usually (depending on the version) want a file name and MIME type, so add them before assigning the file to your Paperclip attachment attribute. A plain StringIO has no setters for these, so define them on the object first.
file.singleton_class.send(:attr_accessor, :content_type, :original_filename)
file.content_type = "image/png"
file.original_filename = "image.png"
object.attachment = file
The above works for the most recent paperclip. Still better than writing to a temp file.
You can use StringIO:
s = "\x89PNG\r\n..."
file = StringIO.new(s)
Alternatively, you can use Tempfile (if you want a real file object):
require 'tempfile'
file = Tempfile.new('png')
file.binmode # PNG data is binary
file.write "\x89PNG\r\n..."
file.rewind # move position pointer to the beginning of the file

Uploading and parsing text document in Rails

In my application, the user must upload a text document, the contents of which are then parsed by the receiving controller action. I've gotten the document to upload successfully, but I'm having trouble reading its contents.
There are several threads on this issue. I've tried more or less everything recommended on these threads, and I'm still unable to resolve the problem.
Here is my code:
file_data = params[:file]
contents = ""
if file_data.respond_to?(:read)
  contents = file_data.read
else
  if file_data.respond_to?(:path)
    File.open(file_data, 'r').each_line do |line|
      elts = line.split
      #
      #
    end
  end
end
So here are my problems:
file_data doesn't 'respond_to?' either :read or :path. According to some other threads on the topic, if the uploaded file is less than a certain size, it's interpreted as a string and will respond to :read. Otherwise, it should respond to :path. But in my code, it responds to neither.
If I try to take out the if statements and straight away attempt File.open(file_data, 'r'), I get an error saying that the file wasn't found.
Can someone please help me find out what's wrong?
PS, I'm really sorry that this is a redundant question, but I found the other threads unhelpful.
Are you actually storing the file? Because if you are not, of course it can't be found.
First, find out what you're actually getting for file_data by adding debug output of file_data.inspect. It may be something you don't expect, especially if the form isn't set up correctly (i.e. :multipart => true).
Rails should wrap the uploaded file in a special object that provides a uniform interface, so something as simple as this should work:
file_data.read.each_line do |line|
  elts = line.split
  #
  #
end
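The debugging step suggested above can be sketched without Rails. describe_upload is a hypothetical helper; the String case assumes the form was submitted without multipart encoding, in which case the parameter typically arrives as just the file name.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper: reports what kind of object arrived and which of the
# two interfaces checked in the question it supports.
def describe_upload(obj)
  { class: obj.class.name, read: obj.respond_to?(:read), path: obj.respond_to?(:path) }
end

# A bare String responds to neither :read nor :path, matching the symptom
# described in the question.
p describe_upload('notes.txt')

# A real file object responds to both.
Tempfile.create('upload') { |f| p describe_upload(f) }
```

If the first form is what shows up in the controller, the fix belongs in the form markup rather than in the parsing code.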

StringScanner scanning IO instead of a string

I've got a parser written using ruby's standard StringScanner. It would be nice if I could use it on streaming files. Is there an equivalent to StringScanner that doesn't require me to load the whole string into memory?
You might have to rework your parser a bit, but you can feed lines from a file to a scanner like this:
require 'strscan'
File.open('filepath.txt', 'r') do |file|
  scanner = StringScanner.new(file.readline)
  until file.eof?
    scanner.scan(/whatever/)
    scanner << file.readline # append the next line to the scanned string
  end
end
StringScanner was designed for exactly that: loading a big string and moving back and forth over it with an internal pointer. If you turn it into a stream, those references get lost and you cannot use unscan, check_until, pre_match or post_match.
Well, you can, but for that you need to buffer all the previous input.
If you are concerned about the buffer size, just load the data in chunks and use a simple regexp or a gem called Parser.
The simplest way is to read a fixed size of data.
# iterate over fixed length records
File.open("fixed-record-file") do |f|
  while (record = f.read(1024))
    # parse the record here using a regexp or parser
  end
end
[Updated]
Even with this loop you can use StringScanner; you just need to update the string with each new chunk of data:
string=(str)
Changes the string being scanned to str and resets the scanner.
Returns str
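Putting the chunked read together with string= and <<, one possible shape for a streaming tokenizer is sketched below. The method name each_word, the whitespace-separated token format and the chunk size are assumptions for illustration; a real parser would substitute its own patterns. The lookahead in the pattern keeps a token that touches the end of the buffer from being emitted before the next chunk arrives.

```ruby
require 'strscan'
require 'stringio'

# Hypothetical helper: yields whitespace-separated words from an IO while
# keeping only the unconsumed tail of the input in memory.
def each_word(io, chunk_size: 1024)
  scanner = StringScanner.new(String.new) # empty binary buffer
  while (chunk = io.read(chunk_size))
    scanner << chunk
    loop do
      scanner.skip(/\s+/)
      # Match a word only if whitespace follows, so a word cut off at the
      # chunk boundary waits for the rest of its characters.
      word = scanner.scan(/\S+(?=\s)/)
      break unless word
      yield word
    end
    scanner.string = scanner.rest # drop consumed input and reset the pointer
  end
  yield scanner.rest unless scanner.eos? # final word without trailing whitespace
end

words = []
each_word(StringIO.new("alpha beta  gamma\ndelta"), chunk_size: 4) { |w| words << w }
p words
```

The trade-off noted earlier still applies: after each reset, unscan and pre_match can only see the current buffer, not input that has already been dropped.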
There is StringIO.
Sorry, I misread your question. Take a look at this; it seems to have streaming options.
