I am trying to download a binary file via HTTP using the following Ruby script.
#!/usr/bin/env ruby
require 'net/http'
require 'uri'
def http_download(resource, filename, debug = false)
uri = URI.parse(resource)
puts "Starting HTTP download for: #{uri}"
http_object = Net::HTTP.new(uri.host, uri.port)
http_object.use_ssl = true if uri.scheme == 'https'
begin
http_object.start do |http|
request = Net::HTTP::Get.new uri.request_uri
Net::HTTP.get_print(uri) if debug
http.read_timeout = 500
http.request request do |response|
open filename, 'w' do |io|
response.read_body do |chunk|
io.write chunk
end
end
end
end
rescue Exception => e
puts "=> Exception: '#{e}'. Skipping download."
return
end
puts "Stored download as #{filename}."
end
However it downloads the HTML source instead of the binary. When I enter the URL in the browser the binary file is downloaded. Here is a URL with which the script fails:
http://dcatlas.dcgis.dc.gov/catalog/download.asp?downloadID=2175&downloadTYPE=KML
I execute the script as follows
pry> require 'myscript'
pry> resource = "http://dcatlas.dcgis.dc.gov/catalog/download.asp?downloadID=2175&downloadTYPE=KML"
pry> http_download(resource,"StreetTreePt.KML", true)
How can I download the binary?
Redirection experiments
I found this redirection check which looks quite reasonable. When I integrate in the response block it fails with the following error:
Exception: 'undefined method `host' for "save_download.asp?filename=StreetTreePt.KML":String'. Skipping download.
The exception does not occur in the "original" function posted above.
The documentation for Net::HTTP shows how to handle redirects:
Following Redirection
Each Net::HTTPResponse object belongs to a class for its response code.
For example, all 2XX responses are instances of a Net::HTTPSuccess subclass, a 3XX response is an instance of a Net::HTTPRedirection subclass and a 200 response is an instance of the Net::HTTPOK class. For details of response classes, see the section “HTTP Response Classes” below.
Using a case statement you can handle various types of responses properly:
def fetch(uri_str, limit = 10)
# You should choose a better exception.
raise ArgumentError, 'too many HTTP redirects' if limit == 0
response = Net::HTTP.get_response(URI(uri_str))
case response
when Net::HTTPSuccess then
response
when Net::HTTPRedirection then
location = response['location']
warn "redirected to #{location}"
fetch(location, limit - 1)
else
response.value
end
end
print fetch('http://www.ruby-lang.org')
Or, you can use Ruby's OpenURI, which handles it automatically. Or, the Curb gem will do it. Probably Typhoeus and HTTPClient too.
According to the code you show in your question, the exception you are getting can only come from:
http_object = Net::HTTP.new(uri.host, uri.port)
which is hardly likely since uri is a URI object. You need to show the complete code if you want help with that problem.
Related
I'm using Ruby version 2.3.0.
I want to check when my application is up, and I wrote this method for my "deployer".
At runtime http.request_get(uri) raises
EOFError: end of file reached
when I pass http://localhost as a first argument into the method:
require 'net/http'
def check_start_application(address, port)
success_codes = [200, 301]
attempts = 200
uri = URI.parse("#{address}:#{port}")
http = Net::HTTP.new(uri.host, uri.port)
attempts.times do |attempt|
# it raises EOFError: end of file reached
http.request_get(uri) do |response|
if success_codes.include?(response.code.to_i)
return true
elsif attempt == attempts - 1
return false
end
end
end
end
But, when I test this method separately from a context with irb, this code works pretty well for two cases:
check_start_application('http://example.com', '80')
check_start_application('http://localhost', any_port)
In an app's context this code works for only one case:
check_start_application('http://example.com', '80')
What I tried:
using 'rest-client' instead of 'net/http'
using 'net/https' with http.use_ssl = false
remove times from the method
call sleep before a request
Who faced a similar problem? I believe I'm not the only one.
It may be that on the deploy your app is running on SSL. It's hard to help debug without access to it, but try with:
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
Im using Net::HTTP in my ruby code to make http requests. For example to make a post request i do
require 'net/http'
Net::HTTP.post_form(url,{'email' => email,'password' => password})
This works. But im unable to make a delete request, i.e.
require 'net/http'
Net::HTTP::Delete(url)
gives the following error
NoMethodError: undefined method `Delete' for Net::HTTP:Class
The documentation at http://ruby-doc.org/stdlib-1.9.3/libdoc/net/http/rdoc/Net/HTTP.html shows Delete is available. So why is it not working in my case ?
Thank You
The documentation tells you that Net::HTTP::Delete is a class, not a method.
Try Net::HTTP.new('www.server.com').delete('/path') instead.
uri = URI('http://localhost:8080/customer/johndoe')
http = Net::HTTP.new(uri.host, uri.port)
req = Net::HTTP::Delete.new(uri.path)
res = http.request(req)
puts "deleted #{res}"
Simple post and delete requests, see docs for more:
puts Net::HTTP.new("httpbin.org").post("/post", "a=1").body
puts Net::HTTP.new("httpbin.org").delete("/delete").body
This works for me:
uri = URI(YOUR_URL)
req = Net::HTTP::Delete.new(uri, {}) # params on second place
response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
http.request req
end
I'm working on rails 3.0.4. I intend to send out sms to a particular number after saving students record. The code that I'm gonna mention below worked well in rails 2.X, but on rails 3.0.4, I get an error:
NameError in StudentsController#create
uninitialized constant Student::Net
Code:
def send_welcome_sms
url=URI.parse("http://webaddress.com");
#error occuring at this point
request = Net::HTTP::Post.new(url.path)
message = "message goes here"
request.set_form_data({'username'=>"abc", 'password'=>"xyz", 'to'=> "some number", 'text'=> "#{message}", 'from'=> "someone"})
response = Net::HTTP.new(url.host, url.port).start {|http| http.request(request) }
# If U are Behind The Proxy Comment Above Line And Uncomment Below Line, Give The Proxy Ip & Port
#response = Net::HTTP::Proxy("PROXY IP", PROXYPORT).new(url.host, url.port).start {|http| http.request(request) }
case response
when Net::HTTPSuccess
puts response.body
else
response.body
response.error!
end
end
Make sure that you have the appropriate require statement somewhere, either in your controller, or preferably, in your environment.rb file or an initializer:
require 'net/http'
How do I take this URL http://t.co/yjgxz5Y and get the destination URL which is http://nickstraffictricks.com/4856_how-to-rank-1-in-google/
require 'net/http'
require 'uri'
Net::HTTP.get_response(URI.parse('http://t.co/yjgxz5Y'))['location']
# => "http://nickstraffictricks.com/4856_how-to-rank-1-in-google/"
I've used open-uri for this, because it's nice and simple. It will retrieve the page, but will also follow multiple redirects:
require 'open-uri'
final_uri = ''
open('http://t.co/yjgxz5Y') do |h|
final_uri = h.base_uri
end
final_uri # => #<URI::HTTP:0x00000100851050 URL:http://nickstraffictricks.com/4856_how-to-rank-1-in-google/>
The docs show a nice example for using the lower-level Net::HTTP to handle redirects.
require 'net/http'
require 'uri'
def fetch(uri_str, limit = 10)
# You should choose better exception.
raise ArgumentError, 'HTTP redirect too deep' if limit == 0
response = Net::HTTP.get_response(URI.parse(uri_str))
case response
when Net::HTTPSuccess then response
when Net::HTTPRedirection then fetch(response['location'], limit - 1)
else
response.error!
end
end
puts fetch('http://www.ruby-lang.org')
Of course this all breaks down if the page isn't using a HTTP redirect. A lot of sites use meta-redirects, which you have to handle by retrieving the URL from the meta tag, but that's a different question.
For resolving redirects you should use a HEAD request to avoid downloading the whole response body (imagine resolving a URL to an audio or video file).
Working example using the Faraday gem:
require 'faraday'
require 'faraday_middleware'
def resolve_redirects(url)
response = fetch_response(url, method: :head)
if response
return response.to_hash[:url].to_s
else
return nil
end
end
def fetch_response(url, method: :get)
conn = Faraday.new do |b|
b.use FaradayMiddleware::FollowRedirects;
b.adapter :net_http
end
return conn.send method, url
rescue Faraday::Error, Faraday::Error::ConnectionFailed => e
return nil
end
puts resolve_redirects("http://cre.fm/feed/m4a") # http://feeds.feedburner.com/cre-podcast
You would have to follow the redirect. I think that would help :
http://shadow-file.blogspot.com/2009/03/handling-http-redirection-in-ruby.html
I only need to download the first few kilobytes of a file via HTTP.
I tried
require 'open-uri'
url = 'http://example.com/big-file.dat'
file = open(url)
content = file.read(limit)
But it actually downloads the full file.
This seems to work when using sockets:
require 'socket'
host = "download.thinkbroadband.com"
path = "/1GB.zip" # get 1gb sample file
request = "GET #{path} HTTP/1.0\r\n\r\n"
socket = TCPSocket.open(host,80)
socket.print(request)
# find beginning of response body
buffer = ""
while !buffer.match("\r\n\r\n") do
buffer += socket.read(1)
end
response = socket.read(100) #read first 100 bytes of body
puts response
I'm curious if there is a "ruby way".
This is an old thread, but it's still a question that seems mostly unanswered according to my research. Here's a solution I came up with by monkey-patching Net::HTTP a bit:
require 'net/http'
# provide access to the actual socket
class Net::HTTPResponse
attr_reader :socket
end
uri = URI("http://www.example.com/path/to/file")
begin
Net::HTTP.start(uri.host, uri.port) do |http|
request = Net::HTTP::Get.new(uri.request_uri)
# calling request with a block prevents body from being read
http.request(request) do |response|
# do whatever limited reading you want to do with the socket
x = response.socket.read(100);
end
end
rescue IOError
# ignore
end
The rescue catches the IOError that's thrown when you call HTTP.finish prematurely.
FYI, the socket within the HTTPResponse object isn't a true IO object (it's an internal class called BufferedIO), but it's pretty easy to monkey-patch that, too, to mimic the IO methods you need. For example, another library I was using (exifr) needed the readchar method, which was easy to add:
class Net::BufferedIO
def readchar
read(1)[0].ord
end
end
Check out "OpenURI returns two different objects". You might be able to abuse the methods in there to interrupt downloading/throw away the rest of the result after a preset limit.