Download and write .tar.gz files without corruption - ruby

How do you download files, specifically .zip and .tar.gz, with Ruby and write them to the disk?
(This question was originally specific to a bug in MacRuby, but the answers are relevant to the general question above.)
Using MacRuby, I've found that the downloaded file appears to be the same size as the reference, but the archives refuse to extract. What I'm attempting now is at: https://gist.github.com/arbales/8203385
Thanks!

I've successfully downloaded and extracted GZip files with this code:
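require 'open-uri'
require 'zlib'
# Open the local file in binary mode so the bytes are written unmodified.
File.open('tarball.tar', 'wb') do |local_file|
  # URI.open (Ruby 2.5+) replaces the bare open(url), which Ruby 3.0 no longer supports.
  URI.open('http://github.com/jashkenas/coffee-script/tarball/master/tarball.tar.gz') do |remote_file|
    # Decompress the gzip stream and write the plain tarball.
    local_file.write(Zlib::GzipReader.new(remote_file).read)
  end
end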

I'd recommend using open-uri in Ruby's stdlib.
require 'open-uri'
# 'wb' writes the bytes unmodified; URI.open replaces the bare open(url) on Ruby 3.0+.
File.open(out_file, 'wb') do |out|
  out.write(URI.open(url).read)
end
http://ruby-doc.org/stdlib/libdoc/open-uri/rdoc/classes/OpenURI/OpenRead.html#M000832
Make sure you look at the :progress_proc option to open as it looks like you want a progress hook.
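A rough sketch of such a progress hook, assuming Ruby 2.5 or later (for URI.open). open-uri calls :content_length_proc once with the expected size (or nil if the server does not report one) and :progress_proc repeatedly with the number of bytes received so far. The helper names progress_options and download_with_progress are made up for this example.

```ruby
require 'open-uri'

# Build the two open-uri callbacks. `progress_options` is a hypothetical
# helper name; `out` is any object that responds to #print.
def progress_options(out = $stderr)
  total = nil
  {
    # Called once after the response headers arrive; length may be nil.
    content_length_proc: ->(length) { total = length },
    # Called with the accumulated number of bytes transferred so far.
    progress_proc: lambda do |received|
      if total && total.positive?
        out.print "\r#{received * 100 / total}% (#{received}/#{total} bytes)"
      else
        out.print "\r#{received} bytes"
      end
    end
  }
end

# Download `url` to `path` in binary mode while reporting progress.
# `download_with_progress` is likewise a hypothetical name.
def download_with_progress(url, path, out = $stderr)
  URI.open(url, progress_options(out)) do |remote|
    File.open(path, 'wb') { |local| IO.copy_stream(remote, local) }
  end
  out.puts
  path
end

# Usage (not run here, needs network access):
#   download_with_progress('https://example.com/remote.tar.gz', 'remote.tar.gz')
```

Keeping the callbacks in their own method means the reporting logic can be exercised without touching the network.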

The last time I got corrupted files with Ruby was when I forgot to call file.binmode right after File.open. It took me hours to find out what was wrong. Does it help with your issue?
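For reference, a minimal sketch of that fix. write_binary is a hypothetical helper name; passing 'wb' as the mode to File.open is the usual shorthand for the same thing.

```ruby
# Write raw bytes to `path` without newline or encoding translation.
# `write_binary` is a hypothetical helper name used for illustration.
def write_binary(path, bytes)
  File.open(path, 'w') do |file|
    file.binmode        # switch to binary mode before writing anything
    file.write(bytes)   # returns the number of bytes written
  end
end

# Equivalent shorthand: File.open(path, 'wb') { |file| file.write(bytes) }
```

The difference matters most on platforms that translate line endings in text mode, where any "\n" byte inside the archive would otherwise be rewritten.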

When downloading a .tar.gz with open-uri via a simple open() call, I was also getting errors uncompressing the file on disk. I eventually noticed that the file size was much larger than expected.
Inspecting the file download.tar.gz on disk, what it actually contained was download.tar uncompressed; and that could be untarred. This seems to be due to an implicit Accept-encoding: gzip header on the open() call which makes sense for web content, but is not what I wanted when retrieving a gzipped tarball. I was able to work around it and defeat that behavior by sending a blank Accept-encoding header in the optional hash argument to the remote open():
File.open('/local/path/to/download.tar.gz', 'wb') do |file|
  # Send a blank Accept-encoding header
  # (URI.open replaces the bare open(url) on Ruby 3.0+)
  file.write URI.open('https://example.com/remote.tar.gz', {'Accept-encoding'=>''}).read
end
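One way to spot this situation quickly is to check whether the saved file still starts with the gzip magic bytes (0x1f 0x8b) and whether it decompresses cleanly. The sketch below uses only Zlib from the standard library; gzip_file? and gzip_intact? are hypothetical helper names.

```ruby
require 'zlib'

GZIP_MAGIC = "\x1F\x8B".b

# True if the file begins with the two-byte gzip signature.
def gzip_file?(path)
  File.binread(path, 2) == GZIP_MAGIC
end

# True if the whole file decompresses without a Zlib error.
# Reads in chunks so a large tarball is not held in memory.
def gzip_intact?(path)
  Zlib::GzipReader.open(path) do |gz|
    loop { break unless gz.read(64 * 1024) }
  end
  true
rescue Zlib::Error
  false
end

# Usage:
#   gzip_file?('/local/path/to/download.tar.gz')
#   gzip_intact?('/local/path/to/download.tar.gz')
```

If gzip_file? returns false for a file named .tar.gz, the download was most likely decompressed in transit, as described above.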

Related

(Multipart) zip upload in ruby webmachine handled by rack

I'm making an upload form for zips in a Ruby Webmachine app. My idea is to route the upload through my backend, where I can add some extra params, and then upload it to Amazon's S3 service with RestClient.
I successfully created a direct upload (a web-based form post) to an S3 bucket, but that way I can't handle the variables needed in the request the way I want.
I've tried several things, but I can't figure out how to handle the request once it reaches my backend. I've created a resource and I'm debugging directly in the process_post method.
My @request variable is a Webmachine::Request, with a Webmachine::Adapters::Rack::RequestBody and a Rack::Request, but I can't get the file out of it to use as input for my RestClient request.
I think @request.body.to_s and @request.body.to_io represent the uploaded file in some way, and I tried to use them as input for Rack::Multipart methods, but that doesn't give me the file.
I also tried the rack-raw-upload gem, but I can't get the MIME type to be anything other than "application/x-www-form-urlencoded" or multipart, even though I explicitly set it to application/octet-stream.
Things like File.new(filename, 'rb') gave me `Errno::ENOENT: No such file or directory @ rb_sysopen'. For filename I just used 'example.zip'.
I guess I'm missing something to do with the Rack::Request call(env) method.
Does somebody have an idea how to handle the Rack uploads, or any hints for a new direction? Thanks.
I've created a gist which shows how to retrieve the multipart stream. You'll need further parsing in order to get the uploaded file.
https://gist.github.com/jewilmeer/eb40abd665b70f53e6eb60801de24342

How to download a file in parts

I'm writing a program that downloads files up to 1 GB in size. Right now I'm using the requests package to download files, and although it works (I think it times out sometimes), it is very slow. I've seen some multi-part download examples using urllib2, but I'm looking for a way to use urllib3 or requests, if that package has the ability.
How closely have you looked at requests' documentation?
In the Quickstart documentation the following is described
r = requests.get(url, stream=True)
r.raw.read(amount)
The better way, however, to do this is:
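r = requests.get(url, stream=True)
# The with block closes the file even if the download fails part-way.
with open(filename, 'wb') as fd:
    # amount is the chunk size in bytes
    for chunk in r.iter_content(amount):
        fd.write(chunk)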
(Assuming you are saving the downloaded content to a file.)

How do I load files into Ruby?

I was wondering how to load a file from my System Libraries, but I'm not sure how. I was thinking something along the lines of:
require 'gosu'
require "Picture.jpg"
and then having the rest of my code, but every time I try that, I get the error:
No such file to load
I'm not sure if I'm doing something wrong, or there just isn't a way to load a file from my system library into Ruby?
I would suggest taking a close look at require, load, extend and include.
Understanding the differences will help you use them in your application.
http://ionrails.com/2009/09/19/ruby_require-vs-load-vs-include-vs-extend/
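As a small illustration of those four: require and load both run Ruby source files (require only once per file, load every time), while include and extend mix a module's methods into instances or into a single object such as a class. None of them reads data files, which is why require "Picture.jpg" fails; an image has to be opened through File or the image-loading call of whatever library you use. The names below (Describable, Sprite, Registry, count_executions) are invented for this sketch.

```ruby
require 'tmpdir'

module Describable
  def describe
    "I am #{self}"
  end
end

class Sprite
  include Describable   # adds #describe to every Sprite instance
end

class Registry
  extend Describable    # adds .describe to the Registry class itself
end

# Run a one-line Ruby file twice through :require or :load and report
# how many times its top-level code actually executed.
def count_executions(method_name)
  $executions = 0
  Dir.mktmpdir do |dir|
    path = File.join(dir, "counter_#{method_name}.rb")
    File.write(path, "$executions += 1\n")
    2.times { send(method_name, path) }
  end
  $executions
end

# Usage:
#   Registry.describe           # class-level method via extend
#   Sprite.new.describe         # instance method via include
#   count_executions(:require)  # file body runs once
#   count_executions(:load)     # file body runs on every call
```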

How do I generate multiple images and put them into one zip file for download using Rails3?

I am generating QR codes from Google's charting API on my website as a URL with some params passed in.
I have around 100 of these codes which are generated from a URL, something like this:
$= image_tag("http://chart.apis.google.com/chart?cht=qr&chl=#{qr.code}&chs=120x120&choe=UTF-8", :size => "120x120")
I want to create a method that loops through my array and generates the png files, then places them inside a zip file which I can download from one click.
I tried using send_data "url", :disposition => "attachment", :type => "image/png"
This only saved the URL, not the image generated. Putting the URL into the browser opened a window with the image.
Other than that I was not able to add all the files into a zip file. Does Rails have its own built-in compression methods?
First, for each QR code, use Net::HTTP to download the QR code's image from Google's API into a temp file. Then use rubyzip to compress the temp files into a zip file. An example of how to use the zip library is here. Finally, use send_data (or send_file if you've written it to disk) to send this generated zip file to the client's web browser.
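The first step might look roughly like the sketch below, which uses only the standard library. The URL follows the pattern from the question; qr_chart_url and download_to_tempfile are hypothetical helper names, and the zipping and send_data steps are left out because they depend on the rubyzip version and the Rails controller in use.

```ruby
require 'net/http'
require 'tempfile'
require 'uri'

# Build the chart URL from the question for one QR payload,
# escaping the payload so special characters survive the query string.
def qr_chart_url(code, size = '120x120')
  query = URI.encode_www_form(cht: 'qr', chl: code, chs: size, choe: 'UTF-8')
  URI("http://chart.apis.google.com/chart?#{query}")
end

# Fetch one image into a binary-mode temp file and return it rewound.
def download_to_tempfile(uri)
  response = Net::HTTP.get_response(uri)
  raise "HTTP #{response.code} for #{uri}" unless response.is_a?(Net::HTTPSuccess)

  file = Tempfile.new(['qr', '.png'])
  file.binmode
  file.write(response.body)
  file.rewind
  file
end

# Usage (not run here, needs network access):
#   files = codes.map { |code| download_to_tempfile(qr_chart_url(code)) }
```

Writing the response body rather than the URL string is what was missing from the send_data attempt in the question.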
Check out my mod_zip module for Nginx:
https://github.com/evanmiller/mod_zip
You can dynamically stream a ZIP file to the client without any temporary files.

Ruby - Working with Mechanize::File response without saving to disk

I'm working on my first ORM project and am using Mechanize. Here's the situation:
I'm downloading a zip file from my website into a Mechanize::File object. Inside the zip is a file buried three folders deep (folder_1/folder_2/file.txt). I'd like to pull file.txt out of the zip file and return that instead of the zip file itself.
My first thought was to use zip/zipfilesystem. I can do this fine if I save the file to disk first and use Zip::ZipFile.open(src), but can anyone tell me how, or whether, it is possible to read it straight from the Mechanize::File body?
My gut says this has to be possible and I'm just missing something basic. I tried...
zipfile = Mechanize::File.body
Zip::ZipFile.open(zipfile)
...but from what I can tell Zip::ZipFile is only set up to locate a source from a filesystem.
Any direction would be very appreciated and let me know if there are any questions
Thanks in advance
Rob
It seems what you want to do is not possible with rubyzip. From rubyzip library's TODO file:
SUggestion: ZipInputStream/ZipOutputStream should accept an IO object in addition to a filename.
