Can I make asynchronous requests with Ruby's Typhoeus? - ruby

I am using Typhoeus and would like to make a single request without blocking for the response. Later on, I might check the response, or I might not. The point is I don't want the code execution to wait for the response.
Is there a way to do this built-in to Typhoeus?
Otherwise I guess I have to use threads and do it myself?

You could try using a thread:
response = nil
request_thread = Thread.new {
  # Set up the request object here
  response = request.run # blocks inside the thread only and returns the response
}
From there you can check response == nil to see whether the request has completed yet, and you can call request_thread.join to block until the thread is done executing.
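A slightly fuller sketch of the thread approach, using only the standard library. The helper names request_in_background and result_if_ready are hypothetical, and the block stands in for the real HTTP call (for example, running a Typhoeus request). Thread#value joins the thread and returns the block's result, so no shared response variable is needed.

```ruby
# Hypothetical helper: runs the given block on a background thread.
# The block stands in for the real HTTP call.
def request_in_background(&fetch)
  Thread.new(&fetch)
end

# Non-blocking check: returns the result if the thread has finished, else nil.
def result_if_ready(thread)
  thread.alive? ? nil : thread.value
end

pending = request_in_background do
  sleep 0.1           # simulates network latency
  'response body'     # simulated response
end

# ... other work happens here while the "request" is in flight ...

puts pending.value    # blocks until the thread finishes, then returns its result
```

If the response is never needed, the thread can simply be left alone; if the block raised an exception, Thread#value re-raises it in the calling thread.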

I would suggest looking into the 'unirest' gem for Ruby.
As far as I am aware, Typhoeus blocks on the 'hydra.run' call.
With Unirest, it does not block on the get / post / put / etc call, but continues to run. If you want, you can store the 'object' in a hash or an array with an identifier to retrieve later, like so:
identifier_requests['id'] = Unirest.post(url,headers: headers, parameters: param, auth: auth)
Then to block, or retrieve responses, use one of the calls on the response object:
response_code = identifier_requests['id'].code
response_body = identifier_requests['id'].body
http://unirest.io/ruby.html

Typhoeus has non-blocking calls built-in. From their docs:
request = Typhoeus::Request.new("www.example.com", followlocation: true)
request.on_complete do |response|
  if response.success?
    # hell yeah
  elsif response.timed_out?
    # aw hell no
    log("got a time out")
  elsif response.code == 0
    # Could not get an http response, something's wrong.
    log(response.return_message)
  else
    # Received a non-successful http response.
    log("HTTP request failed: " + response.code.to_s)
  end
end
request.run
This is from their docs at https://github.com/typhoeus/typhoeus

Related

How do I get a redirected status code using NET:HTTP?

Similar to "getting the status code of a HTTP redirected page", but with Net::HTTP instead of curb. I am making a GET request to a page that will redirect:
response = Net::HTTP.get_response(URI.parse("http://www.wikipedia.org/wiki/URL_redirection"))
puts response.code
puts response['location']
=> 301
en.wikipedia.org/wiki/URL_redirection
The problem is that I want to know the status code of the redirected page. In this case it is 200, but in my app I want to check if it is 200 or something else.
The solution I've seen is to just call get_response(response['location']), but that won't work in my application because the way the redirect is designed makes it so that the redirect can only be followed once. Since the first GET consumes that one redirect, I can't then follow it again.
Is there some way to get the last status code that is a result of a GET?
EDIT: Further clarification of the situation:
The application that I'm sending GET to has a single sign-on authentication mechanism where, if I want to access 'myapp/mypage', I have to first send a post:
postResponse = Net::HTTP.post_form(URI.parse("http://myapp.com/trusted"), {"username" => @username})
Then make the GET request to:
"http://myapp.com/trusted/#{postResponse.body}/mypage"
*The postResponse.body is a 'ticket' which can be redeemed once.
That GET verifies that the ticket is valid and then redirects to:
myapp.com/mypage
So whether that ticket is valid or not, I get a 301.
I want to check the status code of the final get to myapp.com/mypage.
If I manually try to follow the redirect, whether it's a HEAD request or a GET, the original redirect will have already consumed the ticket, so I will get an error that the ticket is expired even if the original redirect was a 200.
The Net::HTTP documentation has example code showing how to deal with redirects. Have you tried it? It should make it easy to get inside the redirect mechanism and grab statuses for later.
Here's their example:
Following Redirection
Each Net::HTTPResponse object belongs to a class for its response code.
For example, all 2XX responses are instances of a Net::HTTPSuccess subclass, a 3XX response is an instance of a Net::HTTPRedirection subclass and a 200 response is an instance of the Net::HTTPOK class. For details of response classes, see the section “HTTP Response Classes” below.
Using a case statement you can handle various types of responses properly:
def fetch(uri_str, limit = 10)
  # You should choose a better exception.
  raise ArgumentError, 'too many HTTP redirects' if limit == 0
  response = Net::HTTP.get_response(URI(uri_str))
  case response
  when Net::HTTPSuccess then
    response
  when Net::HTTPRedirection then
    location = response['location']
    warn "redirected to #{location}"
    fetch(location, limit - 1)
  else
    response.value
  end
end
print fetch('http://www.ruby-lang.org')
A minor change like this should help:
require 'net/http'
RESPONSES = []
def fetch(uri_str, limit = 10)
  # You should choose a better exception.
  raise ArgumentError, 'too many HTTP redirects' if limit == 0
  response = Net::HTTP.get_response(URI(uri_str))
  RESPONSES << response
  case response
  when Net::HTTPSuccess then
    response
  when Net::HTTPRedirection then
    location = response['location']
    warn "redirected to #{location}"
    fetch(location, limit - 1)
  else
    response.value
  end
end
print fetch('http://jigsaw.w3.org/HTTP/300/302.html')
puts RESPONSES.join("\n") # =>
I see this when I run it:
redirected to http://jigsaw.w3.org/HTTP/300/Overview.html
#<Net::HTTPOK:0x007f9e82a1e050>#<Net::HTTPFound:0x007f9e82a2daa0>
#<Net::HTTPOK:0x007f9e82a1e050>
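As a variation on the same idea, here is a minimal sketch that avoids the global constant and returns the status codes alongside the final response. The method name fetch_with_history and the fetcher: keyword are assumptions introduced for illustration (they are not part of Net::HTTP); the fetcher defaults to Net::HTTP.get_response and exists only so the logic can be exercised without a network. Unlike the documentation example, this version returns non-2XX final responses instead of raising.

```ruby
require 'net/http'
require 'uri'

# Follows redirects and returns [final_response, status_codes_seen].
# `fetcher` is a hypothetical injection point; by default it performs a real GET.
def fetch_with_history(uri_str, limit: 10,
                       fetcher: ->(uri) { Net::HTTP.get_response(uri) })
  codes = []
  loop do
    raise ArgumentError, 'too many HTTP redirects' if limit.zero?

    response = fetcher.call(URI(uri_str))
    codes << response.code # status code as a String, e.g. "302"
    return [response, codes] unless response.is_a?(Net::HTTPRedirection)

    # Resolve relative Location headers against the current URL.
    uri_str = URI.join(uri_str, response['location']).to_s
    limit -= 1
  end
end
```

A call such as `response, codes = fetch_with_history('http://jigsaw.w3.org/HTTP/300/302.html')` would then give the final response plus every intermediate code, with `codes.last` being the status of the page that was finally served.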
If it's enough just to make an HTTP HEAD request without 'consuming' your URL (this would be the usual expectation for a HEAD request), you can do it like this:
2.0.0-p195 :143 > result = Net::HTTP.start('www.google.com') { |http| http.head '/' }
=> #<Net::HTTPFound 302 Found readbody=true>
So in your example you'd do this:
...
result = Net::HTTP.start(response.uri.host) { |http| http.head response.uri.path }
If you want to preserve a history of response codes, you could try this. This retains the last 5 response codes from calls to get_response and exposes them through a Net::HTTP.history method.
module Net
  class << HTTP
    alias_method :_get_response, :get_response
    def get_response(*args, &block)
      resp = _get_response(*args, &block)
      # Keep only the five most recent status codes.
      @history = (@history || []).push(resp.code).last 5
      resp
    end
    def history
      @history || []
    end
  end
end
(I don't entirely get the usage scenario, so adapt to your needs)

Ruby do-block, and RestClient

I'm new to Ruby.
I noticed that if I do (assume "request" has been defined):
RestClient::Request.execute(request) do |response|
print response
end
Then response is empty. But if I do
response = RestClient::Request.execute(request)
print response
Then response has something.
What's the reason why the second one works and the first one doesn't?
The documentation for RestClient::Request.execute doesn't show it takes a block:
def self.execute(args)
new(args).execute
end
It only returns the value returned by calling execute on a new RestClient::Request instance.
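The general Ruby behaviour behind this can be shown without RestClient. Ruby accepts a block on any method call, but the block only runs if the method yields to it; otherwise it is silently ignored. The method names below are hypothetical and only illustrate the two cases.

```ruby
# Never yields, so any block passed to it is silently ignored.
def execute_without_yield(args)
  "result for #{args}"
end

# Yields its result to the block when one is given, otherwise returns it.
def execute_with_yield(args)
  result = "result for #{args}"
  block_given? ? yield(result) : result
end

seen = []
execute_without_yield(:request) { |response| seen << response } # block never runs
execute_with_yield(:request)    { |response| seen << response } # block runs once
p seen
```

Only the second call reaches the block, so when a method does not yield, the return value is the only way to get at the response.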

Retrieving full request string using Ruby curl

I intend to send a request like the following:
c = Curl::Easy.http_post("https://example.com", json_string) do |curl|
  curl.headers['Accept'] = 'application/json'
  curl.headers['Content-Type'] = 'application/json'
  curl.headers['Api-Version'] = '2.2'
end
I want to log the exact http request that is being made. Is there a way to get the actual request that was made (base path, query parameters, headers and body)?
The on_debug handler has helped me before. In your example you could try:
curl.on_debug do |type, data|
puts type, data
end
You can approach this in different ways:
Inside your block you can put:
curl.verbose = true # that prints a detailed output of the connection
Or outside the block:
c.url # return the url with queries
c.total_time # retrieve the total time for the prev transfer (name resolving, TCP,...)
c.header_str # return the response header
c.headers # return your call header
c.body_str # return the body of the response
Remember to call c.perform (if it has not been performed yet) before calling these methods.
Many more options can be found here: http://curb.rubyforge.org/classes/Curl/Easy.html#M000001

User-Agent in HTTP requests, Ruby

I'm pretty new to Ruby. I've tried looking over the online documentation, but I haven't found anything that quite works. I'd like to include a User-Agent in the following HTTP requests, both get_response() and get(). Can someone point me in the right direction?
# Preliminary check that Proggit is up
check = Net::HTTP.get_response(URI.parse(proggit_url))
if check.code != "200"
  puts "Error contacting Proggit"
  return
end
# Attempt to get the json
response = Net::HTTP.get(URI.parse(proggit_url))
if response.nil?
  puts "Bad response when fetching Proggit json"
  return
end
Amir F is correct that you may enjoy using another HTTP client such as RestClient or Faraday, but if you want to stick with the standard Ruby library, you can set your user agent like this:
url = URI.parse(proggit_url)
req = Net::HTTP::Get.new(proggit_url)
req.add_field('User-Agent', 'My User Agent Dawg')
res = Net::HTTP.start(url.host, url.port) {|http| http.request(req) }
res.body
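The same idea can be wrapped in two small helpers so that one request setup replaces both the get_response and get calls. This is a sketch using only Net::HTTP; the method names build_get and get_with_user_agent and the example User-Agent string are assumptions chosen for illustration.

```ruby
require 'net/http'
require 'uri'

# Builds a GET request carrying the given User-Agent header (nothing is sent yet).
def build_get(url, user_agent)
  request = Net::HTTP::Get.new(URI.parse(url))
  request['User-Agent'] = user_agent
  request
end

# Sends the request and returns the Net::HTTPResponse.
def get_with_user_agent(url, user_agent)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_get(url, user_agent))
  end
end

# Usage (performs a real request, so it is left commented out):
# res = get_with_user_agent(proggit_url, 'my-script/1.0')
# puts res.code   # status check, as with get_response
# puts res.body   # body, as with get
```

A single response object carries both the status code and the body, so the preliminary check and the fetch do not need to be separate requests.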
Net::HTTP is very low level. I would recommend using the rest-client gem: it also follows redirects automatically and is easier to work with, e.g.:
require 'rest_client'
response = RestClient.get proggit_url
if response.code != 200
# do something
end

Ruby - Validate and update URL

I've been trying to modify this method so that, instead of following redirects and returning the contents of the URL, it returns the new valid URL.
After reading up on the Net::HTTP object, I'm still not sure how exactly the get_response method works. Is this what's downloading the page? Is there another method I could call that would just ping the URL instead of downloading it?
require 'net/http'
def validate(url)
  uri = URI.parse(url)
  response = Net::HTTP.get_response(uri)
  case response
  when Net::HTTPSuccess
    return response
  when Net::HTTPRedirection
    return validate(response['location'])
  else
    return nil
  end
end
puts validate('http://somesite.com/somedir/mypage.html')
You are correct that get_response sends an HTTP GET request to the server, which requests the whole page.
You want to use a HEAD request instead of GET. This requests the same HTTP response header that a GET request would get, including the status code (200, 404, etc.), but without downloading the whole page.
See the request_head and head methods of Net::HTTP. For example
url = URI.parse('http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/index.html')
res = Net::HTTP.start(url.host, url.port) {|http|
http.head(url.path)
}
puts res.class
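Putting the HEAD request together with the original validate method, a minimal sketch that returns the final valid URL (or nil) might look like the following. The names head_request and resolve_url, the redirect limit of 5, and the requester: keyword are assumptions added for illustration; the requester exists only so the logic can be tried without a network.

```ruby
require 'net/http'
require 'uri'

# Issues a HEAD request and returns the Net::HTTPResponse (no body is downloaded).
def head_request(uri)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.head(uri.request_uri)
  end
end

# Returns the URL that finally answers with a 2XX status, or nil otherwise.
# `requester` is a hypothetical injection point defaulting to a real HEAD request.
def resolve_url(url, limit = 5, requester: method(:head_request))
  return nil if limit.zero? # give up on redirect loops

  response = requester.call(URI.parse(url))
  case response
  when Net::HTTPSuccess
    url
  when Net::HTTPRedirection
    # Location may be relative, so resolve it against the current URL.
    next_url = URI.join(url, response['location']).to_s
    resolve_url(next_url, limit - 1, requester: requester)
  end
end
```

Calling `resolve_url('http://somesite.com/somedir/mypage.html')` would then return the updated URL as a string rather than the page contents. Some servers answer HEAD differently from GET, so treat the result as a best guess.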
Do you mean, by 'ping the url', you want to know whether the url request returns an HTTP 200 response?
I haven't looked at the implementation of get_response, but I think it just sends out an HTTP GET request, by the looks of it.
If you want to check for the HTTP 200 response, I guess you could just keep doing get_response until you get HTTPSuccess && HTTPOK.

Resources