Download image with Ruby RIO gem - ruby

My code:
require 'rio'
rio('nice.jpg') < rio('http://farm4.static.flickr.com/3134/3160515898_59354c9733.jpg?v=0')
But the image downloaded is corrupted. What is wrong with this solution?

pjb3 is correct. You must call binmode on the left-hand term:
rio('nice.jpg').binmode < rio('http://...')
If this still does not work (notably with large JPEG files, where rio uses an intermediate temp file when retrieving from the URL you have provided), apply the binmode modifier to both terms:
rio('nice.jpg').binmode < rio('http://...').binmode
2011 UPDATE
According to Luke C., the above answer no longer applies to more recent versions of the gem:
Neither of these work. On Linux having .binmode set on the destination causes a Errno::ENOENT exception. Doing: rio('nice.jpg') < rio('http://...').binmode works

It works for me. Are you on Windows? It might be because the file isn't being opened with the binary flag.

I had similar problems downloading images on Linux, I found that this worked for me:
rio(source_url).binmode > rio(filename)

Here is some simple Ruby code to download an image:
require 'net/http'

url = URI.parse("http://www.somedomain.com/image.jpg")
Net::HTTP.start(url.host, url.port) do |http|
  resp = http.get(url.path)
  open(File.join(File.dirname(__FILE__), "image.jpg"), "wb") { |file| file.write(resp.body) }
end
This can even be extended to follow redirects:
require 'net/http'

url = URI.parse("http://www.somedomain.com/image.jpg")
Net::HTTP.start(url.host, url.port) do |http|
  resp = http.get(url.path)
  prev_redirect = ''
  while resp.header['location']
    raise "Recursive redirect: #{resp.header['location']}" if prev_redirect == resp.header['location']
    prev_redirect = resp.header['location']
    url = URI.parse(resp.header['location'])
    host = url.host if url.host
    port = url.port if url.port
    http = Net::HTTP.new(host, port)
    resp = http.get(url.path)
  end
  open(File.join(File.dirname(__FILE__), "image.jpg"), "wb") { |file| file.write(resp.body) }
end
It can probably be prettied up some, but it gets the job done, and is not dependent on any 3rd party gems! :)
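As an aside, the same stdlib-only goal can be met with open-uri, which wraps Net::HTTP and follows redirects by itself. This is a sketch, not the original answer's code; the URL and filename in the commented call are made up:

```ruby
require 'open-uri'

# Download `url` to `dest` using only the standard library.
# open-uri follows HTTP redirects automatically, so no manual
# Location-header loop is needed.
def download(url, dest)
  File.open(dest, "wb") do |file|
    # URI.open yields an IO-like object; IO.copy_stream streams it
    # to disk without holding the whole body in memory.
    URI.open(url) { |remote| IO.copy_stream(remote, file) }
  end
end

# download("http://www.somedomain.com/image.jpg", "image.jpg")
```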

I guess this is a bug. On Windows, every 0x0A is replaced with 0x0D 0x0A. So it makes sense that, used properly (with .binmode), it works on Linux.
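To make the text-mode/binary-mode distinction concrete, here is a small sketch (the byte string is invented): in binary mode the 0x0A byte is written verbatim, whereas Windows text mode would expand it to 0x0D 0x0A and corrupt binary data such as a JPEG.

```ruby
require 'tempfile'

f = Tempfile.new('img')
f.binmode                    # equivalent to opening with the "b" flag
f.write("\x00\x0A\x00".b)    # bytes written verbatim, no newline translation
f.flush
File.binread(f.path)         # => "\x00\n\x00"
f.close!
```

On Linux, binmode is a harmless no-op, which is why forgetting it only shows up as corruption on Windows.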

For downloading pictures from a web page, you can use the ruby gem image_downloader.


How to download an image file via HTTP into a temp file?

I've found good examples of NET::HTTP for downloading an image file, and I've found good examples of creating a temp file. But I don't see how I can use these libraries together. I.e., how would the creation of the temp file be worked into this code for downloading a binary file?
require 'net/http'

Net::HTTP.start("somedomain.net") do |http|
  resp = http.get("/flv/sample/sample.flv")
  open("sample.flv", "wb") do |file|
    file.write(resp.body)
  end
end
puts "Done."
There are more API-friendly libraries than Net::HTTP, for example httparty:
require "httparty"

url = "https://upload.wikimedia.org/wikipedia/commons/thumb/9/91/DahliaDahlstarSunsetPink.jpg/250px-DahliaDahlstarSunsetPink.jpg"
File.open("/tmp/my_file.jpg", "wb") do |f|
  f.write HTTParty.get(url).body
end
require 'net/http'
require 'tempfile'
require 'uri'

def save_to_tempfile(url)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port) do |http|
    resp = http.get(uri.path)
    file = Tempfile.new('foo')
    file.binmode
    file.write(resp.body)
    file.flush
    file
  end
end

tf = save_to_tempfile('http://a.fsdn.com/sd/topics/transportation_64.png')
tf # => #<File:/var/folders/sj/2d7czhyn0ql5n3_2tqryq3f00000gn/T/foo20130827-58194-7a9j19>
I like to use RestClient:
require 'rest-client'

File.open("/tmp/image.jpg", "wb") do |output|
  output.write RestClient.get("http://image_url/file.jpg")
end
Though the answers above work totally fine, I thought I would mention that it is also possible to use the good ol' curl command to download the file to a temporary location. This was the use case I needed. Here's a rough idea of the code:
require 'tempfile'

# Set up the temp file:
file = Tempfile.new(['filename', '.jpeg'])

# Make the curl request:
url = "http://example.com/image.jpeg"
curl_command = "curl --silent -X GET \"#{url}\" -o \"#{file.path}\""
`#{curl_command}`
If you'd like to download a file using HTTParty, you can use the following code:
require "httparty"
require "tempfile"

resp = HTTParty.get("https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_92x30dp.png")
file = Tempfile.new
file.binmode
file.write(resp.body)
file.rewind
Further, if you want to store the file in Active Storage, see the code below:
object.images.attach(io: file, filename: "Test.png")

use ruby to get content length of URLs

I am trying to write a ruby script that gets some details about files on a website using net/http. My code looks like this:
require 'open-uri'
require 'net/http'
url = URI.parse asset
res = Net::HTTP.start(url.host, url.port) {|http|
  http.get(asset)
}
headers = res.to_hash
p headers
I would like to get two pieces of information from this request: the total length of the content inflated, and (as appropriate) the length of the content deflated.
Sometimes, the headers will include a content-length parameter, which appears to be the gzipped length of the content. I can also approximate the inflated size of the content using res.body.length, but this has not been foolproof by any stretch of the imagination. The documentation on net/http says that gzip headers are removed from the list automatically (to help me, gee thanks) so I cannot seem to get a reliable handle on this information.
Any help is appreciated (including other gems if they will do this more easily).
Got it! The "magic" behavior here only occurs if you don't specify your own accept-encoding header. Amended code as follows:
require 'open-uri'
require 'net/http'
require 'date'
require 'zlib'
require 'stringio'

headers = { "accept-encoding" => "gzip;q=1.0,deflate;q=0.6,identity;q=0.3" }
url = URI.parse asset
res = Net::HTTP.start(url.host, url.port) {|http|
  http.get(asset, headers)
}
headers = res.to_hash
gzipped = headers['content-encoding'] && headers['content-encoding'][0] == "gzip"
content = gzipped ? Zlib::GzipReader.new(StringIO.new(res.body)).read : res.body
full_length = content.length
compressed_length = (headers["content-length"] && headers["content-length"][0] || res.body.length)
You can try using sockets to send a HEAD request to the server, which is faster (no content is transferred); if you don't send "Accept-Encoding: gzip", the response will not be gzipped.
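Sticking with Net::HTTP rather than raw sockets, a HEAD request can be built like this. The URL is a placeholder, and the request is shown unsent (the send is commented out) so the sketch stays self-contained:

```ruby
require 'net/http'
require 'uri'

uri = URI.parse("http://www.example.com/somefile.css")  # placeholder URL
head = Net::HTTP::Head.new(uri.path)
# Ask for the identity encoding so Content-Length refers to the
# uncompressed bytes rather than a gzipped transfer.
head['Accept-Encoding'] = 'identity'

# Sending it (network access required) would look like:
# res = Net::HTTP.start(uri.host, uri.port) { |http| http.request(head) }
# puts res['content-length']
```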

Converting python script to ruby (downloading part of a file)

I've been at this for a couple of days and am having no luck at all. Despite reading over these two posts, I can't seem to rewrite this little Python script in Ruby.
clean_link = link['href'].replace(' ', '%20')
mp3file = urllib2.urlopen(clean_link)
output = open('temp.mp3','wb')
output.write(mp3file.read(2000))
output.close()
I've been looking at using open-uri and net/http to do the same in ruby, but keep hitting a url redirect issue. So far I have
clean_link = link.attributes['href'].gsub(' ', '%20')
link_pieces = clean_link.scan(/http:\/\/(?:www\.)?([^\/]+?)(\/.*?\.mp3)/)
host = link_pieces[0][0]
path = link_pieces[0][1]
Net::HTTP.start(host) do |http|
  resp = http.get(path)
  open("temp.mp3", "wb") do |file|
    file.write(resp.body)
  end
end
Is there a simpler way to do this in ruby? Also, as with the python script, is there a way to only download part of the file?
EDIT: progress updated
see here & here
http.request_get('/index.html') {|res|
  size = 0
  res.read_body do |chunk|
    size += chunk.size
    # do some processing
    break if size >= 2000
  end
}
but you can't control chunk sizes here
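Another option, closer to mp3file.read(2000) than breaking out of read_body, is a standard HTTP Range header asking the server for only part of the file. A sketch with a placeholder URL; servers that honor the header reply 206 Partial Content, others send the whole file:

```ruby
require 'net/http'
require 'uri'

uri = URI.parse("http://www.example.com/song.mp3")  # placeholder URL
req = Net::HTTP::Get.new(uri.path)
req['Range'] = 'bytes=0-1999'  # first 2000 bytes

# Sending it (network access required) would look like:
# res = Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
# if res.is_a?(Net::HTTPPartialContent)
#   File.open('temp.mp3', 'wb') { |f| f.write(res.body) }
# end
```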

How to decompress Gzip string in ruby?

Zlib::GzipReader can take "an IO, or IO-like, object" as its input, as stated in the docs.
Zlib::GzipReader.open('hoge.gz') {|gz|
  print gz.read
}
File.open('hoge.gz') do |f|
  gz = Zlib::GzipReader.new(f)
  print gz.read
  gz.close
end
How should I ungzip a string?
The above method didn't work for me.
I kept getting incorrect header check (Zlib::DataError) error. Apparently it assumes you have a header by default, which may not always be the case.
The work around that I implemented was:
require 'zlib'
require 'stringio'
gz = Zlib::GzipReader.new(StringIO.new(resp.body.to_s))
uncompressed_string = gz.read
Zlib by default assumes that your compressed data contains a header.
If your data does NOT contain a header it will fail by raising a Zlib::DataError.
You can tell Zlib to assume the data has no header via the following workaround:
def inflate(string)
  zstream = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  buf = zstream.inflate(string)
  zstream.finish
  zstream.close
  buf
end
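A minimal round trip of the headerless variant, using an invented sample string; the negative window-bits value must match on both the deflate and inflate side:

```ruby
require 'zlib'

# Produce a headerless ("raw") deflate stream...
deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
raw = deflater.deflate("hello hello hello", Zlib::FINISH)
deflater.close

# ...then inflate it with the matching negative window bits.
inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
data = inflater.inflate(raw)
inflater.close
data  # => "hello hello hello"
```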
You need Zlib::Inflate for decompression of a string and Zlib::Deflate for compression
def inflate(string)
  zstream = Zlib::Inflate.new
  buf = zstream.inflate(string)
  zstream.finish
  zstream.close
  buf
end
In Rails you can use:
ActiveSupport::Gzip.compress("my string")
ActiveSupport::Gzip.decompress(compressed)
zstream = Zlib::Inflate.new(16+Zlib::MAX_WBITS)
Using (-Zlib::MAX_WBITS), I got ERROR: invalid code lengths set and ERROR: invalid block type.
Only the following works for me, too.
Zlib::GzipReader.new(StringIO.new(response_body)).read
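For completeness, a self-contained round trip of the 16 + Zlib::MAX_WBITS variant (the sample string is invented): the writer produces a real gzip stream in memory, and the window-bits value tells the inflater to expect a gzip header.

```ruby
require 'zlib'
require 'stringio'

# Build a gzip-wrapped string in memory...
io = StringIO.new
gz = Zlib::GzipWriter.new(io)
gz.write("gzip me")
gz.close
gzipped = io.string

# ...then decode it with 16 + MAX_WBITS, the gzip-header setting.
Zlib::Inflate.new(16 + Zlib::MAX_WBITS).inflate(gzipped)  # => "gzip me"
```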
I used the answer above with Zlib::Deflate.
I kept getting broken files (for small files), and it took many hours to figure out that the problem can be fixed using:
buf = zstream.deflate(string, Zlib::FINISH)
without the zstream.finish line!
def self.deflate(string)
  zstream = Zlib::Deflate.new
  buf = zstream.deflate(string, Zlib::FINISH)
  zstream.close
  buf
end
To gunzip content, use the following code (tested on 1.9.2):
Zlib::GzipReader.new(StringIO.new(content), :external_encoding => content.encoding).read
Beware of encoding problems.
We don't need any extra parameters these days. There are deflate and inflate class methods which allow for quick one-liners like these:
>> data = "Hello, Zlib!"
>> compressed = Zlib::Deflate.deflate(data)
=> "x\234\363H\315\311\311\327Q\210\312\311LR\004\000\032\305\003\363"
>> uncompressed = Zlib::Inflate.inflate(compressed)
=> "Hello, Zlib!"
I think it answers the question "How should I ungzip a string?" the best. :)

Calling a file from a website in Ruby

I'm trying to call resources (images, for example) from my website to avoid constant updates. Thus far, I've tried just using this:
#sprite.bitmap = Bitmap.new("http://www.minscandboo.com/minscgame/001-Title01.jpg")
But this just gives a "File not found" error. What is the correct method for achieving this?
Try using Net::HTTP to get a local file first:
require 'net/http'
Net::HTTP.start("minscandboo.com") { |http|
  resp = http.get("/miscgame/001-Title01.jpg")
  open("local-game-image.jpg", "wb") { |file|
    file.write(resp.body)
  }
}
# ...
#sprite.bitmap = Bitmap.new("local-game-image.jpg")
