Ruby: Download zip file and extract - ruby

I have a Ruby script that downloads a remote ZIP file from a server using Ruby's open command. When I look at the downloaded content, it shows something like this:
PK\x03\x04\x14\x00\b\x00\b\x00\x9B\x84PG\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\n\x00\x10\x00foobar.txtUX\f\x00\x86\v!V\x85\v!V\xF6\x01\x14\x00K\xCB\xCFOJ,RH\x03S\\\x00PK\a\b\xC1\xC0\x1F\xE8\f\x00\x00\x00\x0E\x00\x00\x00PK\x01\x02\x15\x03\x14\x00\b\x00\b\x00\x9B\x84PG\xC1\xC0\x1F\xE8\f\x00\x00\x00\x0E\x00\x00\x00\n\x00\f\x00\x00\x00\x00\x00\x00\x00\x00#\xA4\x81\x00\x00\x00\x00foobar.txtUX\b\x00\x86\v!V\x85\v!VPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x00D\x00\x00\x00T\x00\x00\x00\x00\x00
I tried using the Rubyzip gem (https://github.com/rubyzip/rubyzip) along with its class Zip::ZipInputStream like this:
stream = open("http://localhost:3000/foobar.zip").read # this outputs the zip content from above
zip = Zip::ZipInputStream.new stream
Unfortunately, this throws an error:
Failure/Error: zip = Zip::ZipInputStream.new stream
ArgumentError:
string contains null byte
My questions are:
Is it possible, in general, to download a ZIP file and extract its content in-memory?
Is Rubyzip the right library for it?
If so, how can I extract the content?
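A likely cause of the error above, assuming the older rubyzip API in which Zip::ZipInputStream.new takes a file name: the raw ZIP bytes are treated as a path, and Ruby rejects any path that contains a NUL byte. The sketch below reproduces the same message with the standard library only. The byte string and the helper name open_as_path are hypothetical stand-ins, not part of the original script.

```ruby
# Hypothetical stand-in for the downloaded ZIP bytes (note the NUL bytes).
zip_bytes = "PK\x03\x04\x14\x00\b\x00".b

# Hypothetical helper: try to use the data as if it were a file name.
def open_as_path(data)
  # File.open interprets a String argument as a path, not as content.
  File.open(data, "rb") { |f| f.read }
rescue ArgumentError => e
  e.message
end

puts open_as_path(zip_bytes)
```

Wrapping the bytes in StringIO.new, as the answers below do, gives the library an IO-like object instead of something it would take for a path.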

I found the solution myself, and then found the same approach on Stack Overflow (How to iterate through an in-memory zip file in Ruby):
input = HTTParty.get("http://example.com/somedata.zip").body
Zip::InputStream.open(StringIO.new(input)) do |io|
  while (entry = io.get_next_entry)
    puts entry.name
    parse_zip_content io.read
  end
end
Download your ZIP file. I'm using HTTParty for this, but you could also use Ruby's open command (require 'open-uri').
Convert it into a StringIO stream using StringIO.new(input).
Iterate over every entry inside the ZIP archive using io.get_next_entry (it returns an instance of Entry).
With io.read you get the content, and with entry.name you get the filename.

Like I commented in https://stackoverflow.com/a/43303222/4196440, we can just use Zip::File.open_buffer:
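require 'open-uri'
require 'zip'
# On Ruby 3.0 and later, open-uri is called via URI.open rather than Kernel#open.
content = URI.open('http://localhost:3000/foobar.zip')
Zip::File.open_buffer(content) do |zip|
  zip.each do |entry|
    puts entry.name
    # Do whatever you want with the content files.
  end
end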

Related

Load gemspec from stdin

I'm trying to adapt some existing code to also handle gems. This existing code needs the version number of the thing in question (here: the gem) and does some git stuff to get the relevant file (here I take the gemspec) in the right version, and then passes it on stdin to another script that extracts the version number (and other stuff).
To avoid having to write code to parse a gemspec, I was trying to do:
spec = Gem::Specification::load('-')
puts spec.name
puts spec.version
But I can't make it read from stdin (it works fine if I hardcode a file name, but that won't work in my use case). Can I do that, or is there another (easy) way to do it?
Gem::Specification.load expects either a File instance or a path to a file as the first argument so the easiest way to solve this would be to simply create a Tempfile instance and write the data from stdin to it.
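require 'tempfile'
data_from_stdin = $stdin.read
file = Tempfile.new
begin
  file.write(data_from_stdin)
  file.rewind
  # Pass the path, which Gem::Specification.load accepts as a String.
  spec = Gem::Specification.load(file.path)
  puts spec.name
  puts spec.version
ensure
  file.close
  file.unlink
end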
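The Tempfile approach can also be wrapped in a small function. This is a minimal sketch, assuming the gemspec text is already in a String; the helper name load_gemspec_from_string and the 'stdin' file prefix are hypothetical. It uses Tempfile.create with a block so that the temporary file is removed automatically.

```ruby
require 'tempfile'
require 'rubygems'

# Hypothetical helper: load a gemspec from its source text.
def load_gemspec_from_string(source)
  Tempfile.create(['stdin', '.gemspec']) do |file|
    file.write(source)
    file.flush                          # make sure the text is on disk
    Gem::Specification.load(file.path)  # load by path; the block returns the spec
  end
end
```

It could then be called as spec = load_gemspec_from_string($stdin.read), after which spec.name and spec.version are available as in the answer above.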

In Ruby, open an IO object and pass each line to another object

I need to download a large zipped file, unzip it, and modify each string before saving it to an array.
I would prefer to read the downloaded zipped file one line (entry) at a time and manipulate each line (entry) as it loads, rather than load the whole file into memory.
I experimented with many IO methods for opening a file this way, but I struggle to pass a line (entry) to a Zip::InputStream object. This is what I have:
require 'tempfile'
require 'zip'
require 'open-uri'
f = open(FILE_URL) #FILE_URL contains download path to .zip file
Zip::InputStream.open(f) do |io| #io is a String
  while (io.get_next_entry)
    io.each do |line|
      # manipulate the line and push it to an array
    end
  end
end
If I use open(FILE_URL).each do |zip_entry|, I cannot figure out how to pass zip_entry to Zip::InputStream. Simply calling Zip::InputStream.open(zip_entry) does not work...
Is this scenario possible, or do I have to download the zipped file's content into a Tempfile completely? Any pointers toward a solution would be helpful.

Uploading Images through Sinatra

I'm using the example code from this page:
http://www.wooptoot.com/file-upload-with-sinatra
When I try to upload an image file (png or jpg), it uploads successfully and I can see the file in the proper directory, but it gets corrupted in the process. I cannot open the image. Doing a diff with the original files, I see several newlines that are missing in the uploaded version.
I'm running Ruby 1.9.3p392 on Windows.
Edit:
I tried a test outside the context of Sinatra
File.open('57-new.jpg', "wb") do |f|
  f.write(File.open('57.jpg', 'rb').read)
end
That works. The only difference is the addition of the binary flags. When using Sinatra I can set the binary flag on the write operation, but I'm not sure how I can set it on the read since I seem to be passed a file object by the request.
File.open('uploads/' + params['myfile'][:filename], "wb") do |f|
  f.write(params['myfile'][:tempfile].read)
end
Okay, so it looks like all I needed was the binary flag when opening the new file.
File.open('uploads/' + params['myfile'][:filename], "wb") do |f|
  f.write(params['myfile'][:tempfile].read)
end
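The same idea can be written so that both sides of the copy are explicitly binary. This is a minimal sketch rather than the code from the linked page; the helper name save_upload is hypothetical, and it assumes the upload is an IO-like object such as the :tempfile value that Sinatra passes in params.

```ruby
# Hypothetical helper: copy an uploaded IO to dest_path without altering bytes.
def save_upload(upload_io, dest_path)
  # Put the source in binary mode so no newline translation happens on read.
  upload_io.binmode if upload_io.respond_to?(:binmode)
  upload_io.rewind
  # "wb" keeps Windows from rewriting line endings on write.
  File.open(dest_path, 'wb') do |out|
    IO.copy_stream(upload_io, out)
  end
end
```

In the route it might be called as save_upload(params['myfile'][:tempfile], 'uploads/' + params['myfile'][:filename]). IO.copy_stream also avoids reading the whole upload into memory at once.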

My file is getting shorter and I don't know why

I need to edit part of an XML file and save it, but my code does not save the whole file. I want to change <mtn:ttl>4</mtn:ttl> to <mtn:ttl>9</mtn:ttl>. The code below makes that change, but on writing, either only part of the file is saved or its formatting changes. Can anyone tell me how to solve this? The original XML file is 79 KB, but after editing and saving it becomes 78 KB.
require "rexml/text"
require "rexml/document"
include REXML
File.open("c://conf//cad-mtn-config.xml") do |config_file|
  # Open the document and edit the file
  config = Document.new(config_file)
  if testField.to_s.match(/<mtn:ttl>/)
    config.root.elements[4].elements[11].elements[1].elements[1].elements[1].elements[8].text="9"
    # Write the result to a new file.
    formatter = REXML::Formatters::Default.new
    File.open("c://mtn-3//mtn-2.2//conf//cad-mtn-config.xml", 'w') do |result|
      formatter.write(config, result)
    end
  end
end
It looks like you're trying to use regular expressions; why not just use REXML? The only requirement is that you know the URL of the namespace. Note that if the element were just ttl rather than mtn:ttl, you would not need the namespace.
require 'rexml/document'
file_path = "path to file"
contents = File.new(file_path).read
xml_doc = REXML::Document.new(contents)
xml_doc.add_namespace('mtn', "http://url to mtn namespace")
xml_doc.root.elements.each('mtn:ttl') do |element|
  element.text = "9"
end
File.open(file_path, "w") do |data|
  data << xml_doc
end
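Because the element in the question is nested several levels deep, an XPath search may be more convenient than walking fixed element indexes. This is a minimal sketch, assuming the document binds the mtn prefix to a known namespace URI; the helper name set_ttl and the URI used in the usage note are hypothetical.

```ruby
require 'rexml/document'

# Hypothetical helper: set the text of every <mtn:ttl> element, at any depth.
# ns_uri is the namespace URI bound to the "mtn" prefix (an assumption here).
def set_ttl(xml_string, ns_uri, new_value)
  doc = REXML::Document.new(xml_string)
  # "//" searches the whole tree; the hash maps the prefix to its URI.
  REXML::XPath.each(doc, '//mtn:ttl', { 'mtn' => ns_uri }) do |element|
    element.text = new_value.to_s
  end
  doc.to_s
end
```

For example, set_ttl(File.read(file_path), 'http://example.com/mtn', 9) would return the updated XML as a String, which can then be written back with File.write.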

Ruby Unzip String

I have to work with a zipped (regular Zip) string in Ruby.
Apparently I can't save a temporary file with Ruby-Zip or Zip-Ruby.
Is there any practicable way to unzip this string?
rubyzip has supported StringIO since version 1.1.0:
require "zip"
# zip is the string with the zipped contents
Zip::InputStream.open(StringIO.new(zip)) do |io|
  while (entry = io.get_next_entry)
    puts "#{entry.name}: '#{io.read}'"
  end
end
See Zip/Ruby Zip::Archive.open_buffer(...):
require 'zipruby'
Zip::Archive.open_buffer(str) do |archive|
  archive.each do |entry|
    entry.name
    entry.read
  end
end
As Ruby-Zip seems to lack support for reading from and writing to IO objects, you can fake File.
What you can do is the following:
Create a class called File under Zip module which inherits from StringIO, e.g. class Zip::File < StringIO
Create the exists? class method (returns true)
Create the open class method (yields the StringIO to the block)
Stub close instance method (if needed)
Perhaps it'll need more fake methods
As @Roman mentions, rubyzip currently lacks reading and writing of IO objects (including StringIO.new(s)). Try using zipruby instead, like this:
gem install zipruby
require 'zipruby'
# Given a string in zip format, return a hash where
# each key is a zip archive entry name and each
# value is the unzipped contents of the entry
def unzip(zipfile)
  {}.tap do |h|
    Zip::Archive.open_buffer(zipfile) do |archive|
      archive.each { |entry| h[entry.name] = entry.read }
    end
  end
end
The zlib library. Works fine with StringIO.
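Note that zlib on its own only inflates the compressed data; it does not parse the ZIP container. The following is a minimal sketch of how the two could be combined with StringIO, assuming a plain archive (no ZIP64, no encryption, stored or deflated entries only). The helper name unzip_with_zlib is hypothetical, and a maintained gem is usually the safer choice.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: map entry names to contents for a ZIP held in a String.
def unzip_with_zlib(zip_string)
  data = zip_string.b
  # The end-of-central-directory record starts with "PK\x05\x06".
  eocd = data.rindex("PK\x05\x06".b)
  raise ArgumentError, 'end of central directory not found' unless eocd

  # Total entry count, central directory size, central directory offset.
  count, _size, offset = data[eocd + 10, 10].unpack('vVV')
  io = StringIO.new(data)
  io.seek(offset)

  count.times.each_with_object({}) do |_, entries|
    # Each central directory header has 46 fixed bytes before the name.
    fields = io.read(46).unpack('a4v6V3v5V2')
    compression = fields[4]
    compressed_size = fields[8]
    name_len, extra_len, comment_len = fields[10, 3]
    local_offset = fields[16]
    name = io.read(name_len)
    io.seek(extra_len + comment_len, IO::SEEK_CUR)

    # The local header has its own name/extra lengths; the data follows them.
    local_name_len, local_extra_len = data[local_offset + 26, 4].unpack('vv')
    start = local_offset + 30 + local_name_len + local_extra_len
    raw = data[start, compressed_size]

    entries[name] =
      case compression
      when 0 then raw # stored, no compression
      when 8 then Zlib::Inflate.new(-Zlib::MAX_WBITS).inflate(raw) # raw deflate
      else raise NotImplementedError, "compression method #{compression}"
      end
  end
end
```

Sizes are taken from the central directory rather than the local headers because archives written in streaming mode, like the one shown in the first question, leave the local size fields at zero.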
