Uncaught exception with net/http request wrapper in Ruby

I have a script that calls the Facebook test API to automate test account creation. At seemingly random intervals, anywhere from 50 to 6,000 requests in, I get an exception that isn't caught. I'm at a loss for how to figure out what the error is, so I'll start with the relevant code here.
I'm not using the URI library because Facebook keys contain the pipe character, which breaks URI.parse on Ruby 1.8.7.
require 'rubygems'
require 'net/https'
require 'json'

http = Net::HTTP.new(domain, 443)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_NONE

def request_wrapper(http, request)
  retry_count = 5
  begin
    return http.request(request)
  rescue Exception => e
    retry_count--
    if retry_count < 0
      raise e
    end
    retry
  end
end

for i in 0..500 do
  request = Net::HTTP::Get.new('https://' + domain + path)
  response = request_wrapper(http, request)
end
The code will work for some time, but inevitably reports the following:
in `rescue in request_wrapper': undefined method `-@' for nil:NilClass (NoMethodError)
Has anyone ever seen this undefined method `-@' before? Again, this happens very intermittently, but it's been a real thorn in my side. It always points to the line in the code where I call the request wrapper.
Thanks for taking a look.

The problem is the line retry_count--. Ruby parses it together with the following if as retry_count - (-(if retry_count < 0 ... end)); when the condition is false, the if expression evaluates to nil, and calling unary minus (the -@ method) on nil raises the NoMethodError you're seeing. The line is only reached when a failed HTTP request raises an exception, which explains why the error occurs intermittently.
Ruby does not have unary decrement (--) or increment (++) operators. Matz has outlined the philosophical reasons behind this here. Also see this thread and this one for more information.
Instead, retry_count -= 1 should do the job.
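For reference, a minimal corrected sketch of the wrapper; besides the decrement, it rescues StandardError instead of Exception (a judgment call, since a bare Exception rescue also swallows interrupts):
def request_wrapper(http, request)
  retry_count = 5
  begin
    http.request(request)
  rescue StandardError => e
    retry_count -= 1          # Ruby's way of decrementing
    raise e if retry_count < 0
    retry
  end
end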

Related

How to view the X-Shopify-Shop-Api-Call-Limit

I want to view this specific header: X_SHOPIFY_SHOP_API_CALL_LIMIT
When I do this:
CreateShopifyClientService.call(shop)
begin
  response = ShopifyAPI::InventoryLevel.find(:all, params: { inventory_item_ids: product.inventory_item_id })
  byebug
rescue Exception => e
  byebug
end
I've tried just about everything I can think of, including looking through the tests in the gem, and cannot find the mysterious header.
Here are some of my futile attempts:
(byebug) ap response.header
*** NoMethodError Exception: undefined method `header' for #<ShopifyAPI::PaginatedCollection:0x000055b39213c9c8>
(byebug) ap response.headers
*** NoMethodError Exception: undefined method `headers' for #<ShopifyAPI::PaginatedCollection:0x000055b39213c9c8>
I just did this in my terminal and it worked a treat...
shop = Shop.first
session = ShopifyAPI::Auth::Session.new(
  shop: shop.shopify_domain,
  access_token: shop.shopify_token
)
client = ShopifyAPI::Clients::Rest::Admin.new(
  session: session
)
response = client.get(path: "products")
p response.headers["x-shopify-shop-api-call-limit"]
["1/40"]
=> ["1/40"]
So it's not too terribly hard to access the response headers. I see you're getting the return value of the resource call rather than the HTTP response itself, so it seems you need a different approach for those older API calls. Maybe try updating to the more modern client and see if that helps.
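If you move to the REST client, the same inventory lookup from your question can be issued through client.get so the HTTP headers become available; a sketch, assuming the inventory_levels path and the inventory_item_ids query parameter of the Admin REST API:
# Hypothetical rewrite of the InventoryLevel call via the REST client,
# so the raw HTTP response (headers included) is returned directly.
response = client.get(
  path: "inventory_levels",
  query: { inventory_item_ids: product.inventory_item_id }
)
p response.headers["x-shopify-shop-api-call-limit"]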

In RoR, how do I catch an exception if I get no response from a server?

I’m using Rails 4.2.3 and Nokogiri to get data from a web site. I want to perform an action when I don’t get any response from the server, so I have:
begin
  content = open(url).read
  if content.lstrip[0] == '<'
    doc = Nokogiri::HTML(content)
  else
    begin
      json = JSON.parse(content)
    rescue JSON::ParserError => e
      content
    end
  end
rescue Net::OpenTimeout => e
  attempts = attempts + 1
  if attempts <= max_attempts
    sleep(3)
    retry
  end
end
Note that this is different from getting a 500 from the server. I only want to retry when I get no response at all, either because I get no TCP connection or because the server fails to respond (or some other reason that causes me not to get any response). Is there a more generic way to handle this situation than what I have? I feel like there are a lot of other exception types I'm not thinking of.
Here is a generic example of how you can define timeout durations for the HTTP connection and perform several retries in case of any error while fetching content (edited):
require 'open-uri'
require 'nokogiri'

url = "http://localhost:3000/r503"

openuri_params = {
  # set timeout durations for the HTTP connection
  # the default value for both open_timeout and read_timeout is 60 seconds
  :open_timeout => 1,
  :read_timeout => 1,
}

attempt_count = 0
max_attempts  = 3

begin
  attempt_count += 1
  puts "attempt ##{attempt_count}"
  content = open(url, openuri_params).read
rescue OpenURI::HTTPError => e
  # it's a 404, etc. (do nothing)
rescue SocketError, Net::ReadTimeout => e
  # the server can't be reached or doesn't send any response
  puts "error: #{e}"
  sleep 3
  retry if attempt_count < max_attempts
else
  # the connection was successful and the content was fetched,
  # so here we can parse the content with Nokogiri,
  # call a helper method, etc.
  doc = Nokogiri::HTML(content)
  p doc
end
When it comes to rescuing exceptions, you should aim to have a clear understanding of:
Which lines in your system can raise exceptions
What is going on under the hood when those lines of code run
What specific exceptions could be raised by the underlying code
In your code, the line that's fetching the content is also the one that could see network errors:
content = open(url).read
If you go to the documentation for the OpenURI module you'll see that it uses Net::HTTP & friends to get the content of arbitrary URIs.
Figuring out what Net::HTTP can raise is actually very complicated but, thankfully, others have already done this work for you. Thoughtbot's suspenders project has lists of common network errors that you can use. Notice that some of those errors have to do with different network conditions than what you had in mind, like the connection being reset. I think it's worth rescuing those as well, but feel free to trim the list down to your specific needs.
So here's what your code should look like (skipping the Nokogiri and JSON parts to simplify things a bit):
require 'net/http'
require 'open-uri'

HTTP_ERRORS = [
  EOFError,
  Errno::ECONNRESET,
  Errno::EINVAL,
  Net::HTTPBadResponse,
  Net::HTTPHeaderSyntaxError,
  Net::ProtocolError,
  Timeout::Error,
]

MAX_RETRIES = 3
attempts = 0

begin
  content = open(url).read
rescue *HTTP_ERRORS => e
  if attempts < MAX_RETRIES
    attempts += 1
    sleep(2)
    retry
  else
    raise e
  end
end
I would think about using a Timeout that raises an exception after a short period:
require 'timeout'
require 'open-uri'

MAX_RESPONSE_TIME = 2 # seconds
max_attempts = 3
attempts = 0

begin
  content = nil # needs to be defined before the following block
  Timeout.timeout(MAX_RESPONSE_TIME) do
    content = open(url).read
  end
  # parsing `content`
rescue Timeout::Error => e
  attempts += 1
  if attempts <= max_attempts
    sleep(3)
    retry
  end
end

Parse html GET via open() with nokogiri - redirect exception

I'm trying to learn Ruby, so I'm following an exercise from Google's developer classes. I'm trying to parse some links. In the case of a successful redirection (and I know it's only possible to get redirected once), I get "redirection forbidden". I noticed that I go from an http link to an https link. Any concrete idea of how I could implement this in Ruby, given that Google's exercise is for Python?
error:
ruby fix.rb
redirection forbidden: http://code.google.com/edu/languages/google-python-class/images/puzzle/p-bija-baei.jpg -> https://developers.google.com/edu/python/images/puzzle/p-bija-baei.jpg?csw=1
code that should achieve what I'm looking for:
# urls: list of valid urls (already checked)
# imgs: list of the imgs I'll download afterwards
def acquireData(urls, imgs)
  begin
    urls.each do |url|
      page = Nokogiri::HTML(open(url))
      puts page.body
    end
  rescue Exception => e
    puts e
  end
end
Ruby's OpenURI will automatically handle redirects for you, as long as they're not "meta-refresh" that occur inside the HTML itself.
For instance, this follows a redirect automatically:
irb(main):008:0> page = open('http://www.example.org')
#<StringIO:0x00000002ae2de0>
irb(main):009:0> page.base_uri.to_s
"http://www.iana.org/domains/example"
In other words, the request to "www.example.org" got redirected to "www.iana.org" and OpenURI tracked it correctly.
If you are trying to learn HOW to handle redirects, read the Net::HTTP documentation. Here is the example from that documentation:
Following Redirection
Each Net::HTTPResponse object belongs to a class for its response code.
For example, all 2XX responses are instances of a Net::HTTPSuccess subclass, a 3XX response is an instance of a Net::HTTPRedirection subclass and a 200 response is an instance of the Net::HTTPOK class. For details of response classes, see the section “HTTP Response Classes” below.
Using a case statement you can handle various types of responses properly:
def fetch(uri_str, limit = 10)
  # You should choose a better exception.
  raise ArgumentError, 'too many HTTP redirects' if limit == 0

  response = Net::HTTP.get_response(URI(uri_str))

  case response
  when Net::HTTPSuccess then
    response
  when Net::HTTPRedirection then
    location = response['location']
    warn "redirected to #{location}"
    fetch(location, limit - 1)
  else
    response.value
  end
end

print fetch('http://www.ruby-lang.org')
If you want to handle meta-refresh statements, consider this:
require 'nokogiri'

doc = Nokogiri::HTML(%[<meta http-equiv="refresh" content="5;URL='http://example.com/'">])
meta_refresh = doc.at('meta[http-equiv="refresh"]')
if meta_refresh
  puts meta_refresh['content'][/URL=(.+)/, 1].gsub(/['"]/, '')
end
Which outputs:
http://example.com/
Basically, the URL on code.google.com that you're trying to open redirects to an https URL. You can see that for yourself if you paste http://code.google.com/edu/languages/google-python-class/images/puzzle/p-bija-baei.jpg into your browser.
Check the bug report that explains why open-uri can't redirect from http to https.
So the solution to your problem is simply: use a different set of urls (that don't redirect to https)
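Alternatively, since OpenURI reports the blocked redirect in the RuntimeError message (the "redirection forbidden" error above), you can rescue it and re-open the https target yourself. A rough sketch that depends on that message format, so treat it as a workaround rather than a robust solution:
require 'open-uri'
require 'nokogiri'

def open_following_scheme_change(url)
  Nokogiri::HTML(open(url))
rescue RuntimeError => e
  # older open-uri raises "redirection forbidden: http://... -> https://..."
  https_url = e.message[/redirection forbidden: \S+ -> (\S+)/, 1]
  raise unless https_url
  Nokogiri::HTML(open(https_url))
end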

Ruby Net::HTTP - following 301 redirects

My users submit urls (to mixes on mixcloud.com) and my app uses them to perform web requests.
A good url returns a 200 status code:
uri = URI.parse("http://www.mixcloud.com/ErolAlkan/hard-summer-mix/")
request = Net::HTTP.get_response(uri)
#<Net::HTTPOK 200 OK readbody=true>
But if you forget the trailing slash then our otherwise good url returns a 301:
uri = "http://www.mixcloud.com/ErolAlkan/hard-summer-mix"
#<Net::HTTPMovedPermanently 301 MOVED PERMANENTLY readbody=true>
The same thing happens with 404's:
# bad path returns a 404
"http://www.mixcloud.com/bad/path/"
# bad path minus trailing slash returns a 301
"http://www.mixcloud.com/bad/path"
How can I 'drill down' into the 301 to see if it takes us on to a valid resource or an error page?
Is there a tool that provides a comprehensive overview of the rules that a particular domain might apply to their urls?
301 redirects are fairly common if you do not type the URL exactly as the web server expects it. They happen much more frequently than you'd think; you just don't normally notice them while browsing, because the browser handles them all automatically for you.
Two alternatives come to mind:
1: Use open-uri
open-uri handles redirects automatically. So all you'd need to do is:
require 'open-uri'
...
response = open('http://xyz...').read
If you have trouble redirecting between HTTP and HTTPS, then have a look here for a solution:
Ruby open-uri redirect forbidden
2: Handle redirects with Net::HTTP
def get_response_with_redirect(uri)
  r = Net::HTTP.get_response(uri)
  if r.code == "301"
    r = Net::HTTP.get_response(URI.parse(r['location']))
  end
  r
end
If you want to be even smarter, you could try adding or removing the trailing slash when you get a 404 response. You could do that by creating a method like get_response_smart, which handles this URL fiddling in addition to the redirects.
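A sketch of what such a method could look like, building on get_response_with_redirect above (the name get_response_smart and the slash-toggling heuristic are just the hypothetical ones suggested here):
def get_response_smart(uri)
  response = get_response_with_redirect(uri)
  if response.code == "404"
    # toggle the trailing slash and retry once (hypothetical heuristic)
    url = uri.to_s
    alt = url.end_with?("/") ? url.chomp("/") : url + "/"
    response = get_response_with_redirect(URI.parse(alt))
  end
  response
end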
I can't figure out how to comment on the accepted answer (this question might be closed), but I should note that r.header is now obsolete, so r.header['location'] should be replaced by r['location'] (per https://stackoverflow.com/a/6934503/1084675).
rest-client follows redirections for GET and HEAD requests without any additional configuration. It works very nicely.
for result codes between 200 and 207, a RestClient::Response will be returned
for result codes 301, 302 or 307, the redirection will be followed if the request is a GET or a HEAD
for result code 303, the redirection will be followed and the request transformed into a GET
example of usage:
require 'rest-client'
RestClient.get 'http://example.com/resource'
The rest-client README also gives an example of following redirects with POST requests:
begin
  RestClient.post('http://example.com/redirect', 'body')
rescue RestClient::MovedPermanently,
       RestClient::Found,
       RestClient::TemporaryRedirect => err
  err.response.follow_redirection
end
Here is the code I came up with (derived from different examples) which will bail out if there are too many redirects (note that ensure_success is optional):
require "net/http"
require "uri"
class Net::HTTPResponse
def ensure_success
unless kind_of? Net::HTTPSuccess
warn "Request failed with HTTP #{#code}"
each_header do |h,v|
warn "#{h} => #{v}"
end
abort
end
end
end
def do_request(uri_string)
response = nil
tries = 0
loop do
uri = URI.parse(uri_string)
http = Net::HTTP.new(uri.host, uri.port)
request = Net::HTTP::Get.new(uri.request_uri)
response = http.request(request)
uri_string = response['location'] if response['location']
unless response.kind_of? Net::HTTPRedirection
response.ensure_success
break
end
if tries == 10
puts "Timing out after 10 tries"
break
end
tries += 1
end
response
end
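Usage is then just (the example URL is taken from the question; any URL that answers with a redirect will exercise the loop):
response = do_request("http://www.mixcloud.com/ErolAlkan/hard-summer-mix")
puts response.code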
Not sure if anyone is looking for this exact solution, but if you are trying to download an image over http/https and store it in a variable:
require 'open_uri_redirections'
require 'net/https'
web_contents = open('file_url_goes_here', :ssl_verify_mode => OpenSSL::SSL::VERIFY_NONE, :allow_redirections => :all) {|f| f.read }
puts web_contents

Ruby throws Timeout::Error when calling Net::HTTP.get on an HTTPS URL

I've tried this on a few machines on different networks, all running ruby 1.8.7 and I get the same result after a long wait.
Net::HTTP.get(URI.parse('https://encrypted.google.com/'))
Timeout::Error: execution expired
but HTTP works fine
Net::HTTP.get(URI.parse('http://www.google.com/'))
After the initial timeout I get an EOFError instead:
EOFError: end of file reached
It's really got me stumped. If you have any ideas or you can let me know if you get the same results I'd really appreciate it.
I think you need to set use_ssl to true...
example:
uri = URI.parse("https://www.google.com/")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
request = Net::HTTP::Get.new(uri.request_uri)
response = http.request(request)
puts response.body
This is cannibalized from the following Ruby Inside post.
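On more recent Rubies (the question targets 1.8.7), the block form of Net::HTTP.start does the same thing more compactly and closes the connection for you:
require 'net/http'
require 'uri'

uri = URI.parse("https://www.google.com/")
response = Net::HTTP.start(uri.host, uri.port, :use_ssl => true) do |http|
  http.get(uri.request_uri)
end
puts response.body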
