How to make an HTTP GET with modified headers? - ruby

What is the best way to make an HTTP GET request in Ruby with modified headers?
I want to get a range of bytes from the end of a log file and have been toying with the following code, but the server is throwing back a response saying that "it is a request that the server could not understand" (the server is Apache).
require 'net/http'
require 'uri'
# with @address, @port, @path all defined elsewhere
httpcall = Net::HTTP.new(@address, @port)
headers = {
  'Range' => 'bytes=1000-'
}
resp, data = httpcall.get2(@path, headers)
Is there a better way to define headers in Ruby?
Does anyone know why this would be failing against Apache? If I do a get in a browser to http://[address]:[port]/[path] I get the data I am seeking without issue.

I ended up with a solution that works very well for me. This example requests a byte range starting at an offset:
require 'uri'
require 'net/http'
size = 1000 # the last offset (for the Range header)
uri = URI("http://localhost:80/index.html")
http = Net::HTTP.new(uri.host, uri.port)
headers = {
  'Range' => "bytes=#{size}-"
}
path = uri.path.empty? ? "/" : uri.path
# test to ensure that the request will be valid - first get the head
code = http.head(path, headers).code.to_i
if code >= 200 && code < 300
  # the data is available...
  # use the normalized path here too, since uri.path may be empty
  http.get(path, headers) do |chunk|
    # provided the data is good, print it...
    print chunk unless chunk =~ />416.+Range/
  end
end

If you have access to the server logs, try comparing the request from the browser with the one from Ruby and see if that tells you anything. If this isn't practical, fire up WEBrick as a mock of the file server. Don't worry about the results; just compare the requests to see what they are doing differently.
As for Ruby style, you could move the headers inline, like so:
httpcall = Net::HTTP.new(@address, @port)
resp, data = httpcall.get2(@path, 'Range' => 'bytes=1000-')
Also note that in Ruby 1.8+, which you are almost certainly running, Net::HTTP#get2 returns a single HTTPResponse object, not a resp, data pair.
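Since the goal is the end of a log file, a suffix range (`bytes=-N`, meaning the last N bytes) may fit better than an offset range. The following is a minimal sketch, assuming the server honors Range requests; `tail_request` and `fetch_tail` are hypothetical helper names, not part of Net::HTTP, and the URL in the usage note is a placeholder.

```ruby
require 'net/http'
require 'uri'

# Build a GET request that asks only for the last +byte_count+ bytes.
# "bytes=-N" is the suffix form of the Range header.
def tail_request(url, byte_count)
  Net::HTTP::Get.new(URI(url), 'Range' => "bytes=-#{byte_count}")
end

# Send the request and return the body, or nil on a non-2xx response.
# A 206 status means the range was honored; a plain 200 means the
# server ignored the header and sent the whole file.
def fetch_tail(url, byte_count)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    res = http.request(tail_request(url, byte_count))
    res.is_a?(Net::HTTPSuccess) ? res.body : nil
  end
end
```

Usage would be something like `fetch_tail('http://localhost/app.log', 1000)`. Building the request separately from sending it also makes it easy to print the headers and compare them with what the browser sends.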

Related

Ruby HTTP post with session cookie

I'm trying to write a Ruby script that uses the API of the image gallery site Piwigo. This requires you to log in first with one HTTP POST and then upload an image with another POST.
This is what I've got so far, but it doesn't work and just returns a 401 error. Can anyone see where I am going wrong?
require 'net/http'
require 'pp'
http = Net::HTTP.new('mydomain.com',80)
path = '/piwigo/ws.php'
data = 'method=pwg.session.login&username=admin&password=password'
resp, data = http.post(path, data, {})
if (resp.code == '200')
cookie = resp.response['set-cookie']
data = 'method=pwg.images.addSimple&image=image.jpg&category=7'
headers = { "Cookie" => cookie }
resp, data = http.post(path, data, headers)
puts resp.code
puts resp.message
end
This gives the following response when run:
$ ruby piwigo.rb
401
Unauthorized
There is a Perl example on their API page which I was trying to convert to Ruby http://piwigo.org/doc/doku.php?id=dev:webapi:pwg.images.addsimple
You can use the nice_http gem: https://github.com/MarioRuiz/nice_http
NiceHttp takes care of cookies for you, so you don't have to do anything.
require 'nice_http'
path = '/piwigo/ws.php'
data = '?method=pwg.session.login&username=admin&password=password'
http = NiceHttp.new('http://example.com')
resp = http.get(path+data)
if resp.code == 200
resp = http.post(path)
puts resp.code
puts resp.message
end
If you want, you can also add your own cookies through http.cookies.
You can use a gem called mechanize. It handles cookies transparently.
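With plain Net::HTTP, a common pattern is to log in, keep only the `name=value` part of each Set-Cookie header, and send those pairs back in a Cookie header. Below is a rough sketch under that assumption. The helper names are made up for illustration, the port default is an assumption, and whether this alone resolves the 401 from Piwigo has not been verified.

```ruby
require 'net/http'
require 'uri'

# Build a form-encoded POST (application/x-www-form-urlencoded),
# optionally carrying a Cookie header.
def form_post(path, params, cookie = nil)
  req = Net::HTTP::Post.new(path)
  req.set_form_data(params)
  req['Cookie'] = cookie if cookie && !cookie.empty?
  req
end

# Reduce Set-Cookie values to "name=value" pairs for a Cookie header.
# Attributes such as Path or HttpOnly are dropped.
def cookie_header(set_cookie_values)
  Array(set_cookie_values).map { |c| c.split(';').first.strip }.join('; ')
end

# Log in, then send a second request on the same connection with the
# cookies the server handed out. Returns nil if the login failed.
def login_and_post(host, path, credentials, params, port = 80)
  Net::HTTP.start(host, port) do |http|
    login = http.request(form_post(path, credentials))
    if login.is_a?(Net::HTTPSuccess)
      cookie = cookie_header(login.get_fields('set-cookie'))
      http.request(form_post(path, params, cookie))
    end
  end
end
```

Using `get_fields('set-cookie')` rather than `['set-cookie']` keeps multiple cookies separate instead of joining them into one comma-separated string.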

Ruby equivalent for setting HTTP GET headers

In C# it was fairly simple and didn't take more than a couple minutes to google:
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(@"http://www.example.com?q=someValue");
request.Headers.Add("Authorization: OAuth realm=\"example.com\" oauth_consumer_key=\"BCqrstoO\" ... so on and so forth");
string resultString = "";
using (StreamReader read = new StreamReader(request.GetResponse().GetResponseStream(), true))
{
resultString = read.ReadToEnd();
}
Trying to do it in Ruby hasn't been quite as straightforward (or it is just something stupid that I'm missing).
I have been looking and the closest things I've come to finding my answer are How to make an HTTP GET with modified headers? and Send Custom Headers in Ruby.
So my problem, I suppose, boils down to:
How do I set the headers as just a straightforward string?
Why do these two examples show headers formatted the way they are?
Is what I'm asking for even good convention and, if not, how do I format what I'm trying to do in the convention these Ruby methods expect?
So far I tried the two examples and here's my most recent non-working attempt:
headers = "Authorization: OAuth realm=\"example.com\" oauth_consumer_key=\"BCqrstoO\" ... so on and so forth"
uri = URI("www.example.com")
http = Net::HTTP.new(uri.host, uri.port)
http.get(uri.path, headers) do |chunk|
puts chunk
end
Use open-uri. Example:
require 'open-uri'
# On Ruby 2.5+ call URI.open; a bare open(url) no longer works on Ruby 3.0+.
URI.open("http://www.ruby-lang.org/en/",
  "User-Agent" => "Ruby/#{RUBY_VERSION}",
  "From" => "foo@bar.invalid",
  "Referer" => "http://www.ruby-lang.org/") {|f|
  # ...
}
Just in case you are checking this at a later point in time: a Net::HTTPRequest object lets you add headers easily.
Net::HTTP.start(uri.host, uri.port) do |http|
request = Net::HTTP::Get.new uri
request['my-header'] = '1'
http.request request do |response|
puts response
end
end
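Net::HTTP takes headers as a hash of name/value pairs rather than as a raw "Name: value" line, which is why the examples above are formatted the way they are. If a header is only available as a single string, it can be split at the first colon. A small sketch follows; `header_line_to_hash` is a hypothetical helper, and the OAuth value is a shortened placeholder based on the question.

```ruby
require 'net/http'
require 'uri'

# Split a raw "Name: value" line into the hash form Net::HTTP expects.
# Only the first colon separates name from value, so colons inside
# the value are preserved.
def header_line_to_hash(line)
  name, value = line.split(':', 2)
  raise ArgumentError, "not a header line: #{line.inspect}" if value.nil?
  { name.strip => value.strip }
end

raw = 'Authorization: OAuth realm="example.com" oauth_consumer_key="BCqrstoO"'
headers = header_line_to_hash(raw)

# The URI needs a scheme; URI("www.example.com") has no host.
uri = URI('http://www.example.com/?q=someValue')
request = Net::HTTP::Get.new(uri, headers)
puts request['Authorization']
```

The request can then be sent with `Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }`.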

Accessing Headers for Net::HTTP::Post in ruby

I have the following bit of code:
uri = URI.parse("https://rs.xxx-travel.com/wbsapi/RequestListenerServlet")
https = Net::HTTP.new(uri.host,uri.port)
https.use_ssl = true
req = Net::HTTP::Post.new(uri.path)
req.body = searchxml
req["Accept-Encoding"] ='gzip'
res = https.request(req)
This normally works fine but the server at the other side is complaining about something in my XML and the techies there need the xml message AND the headers that are being sent.
I've got the xml message, but I can't work out how to get at the Headers that are being sent with the above.
To access headers use the each_header method:
# Header being sent (the request object):
req.each_header do |header_name, header_value|
puts "#{header_name} : #{header_value}"
end
# Works with the response object as well:
res.each_header do |header_name, header_value|
puts "#{header_name} : #{header_value}"
end
You can add:
https.set_debug_output $stderr
before the request, and you will see in the console the actual HTTP request sent to the server.
This is very useful for debugging this kind of scenario.
Take a look at the docs for Net::HTTP's post method. It takes the path of the URI, the data (XML) you want to post, and then the headers you want to set. In current Ruby versions it returns a single Net::HTTPResponse object, with the body available through its body method; older versions returned the response and the body as a two-element array.
I can't test this because you've obscured the host, and odds are good it takes a registered account, but the code looks correct from what I remember when using Net::HTTP.
require 'net/http'
require 'uri'
uri = URI.parse("https://rs.xxx-travel.com/wbsapi/RequestListenerServlet")
https = Net::HTTP.new(uri.host, uri.port)
https.use_ssl = true
res = https.post(uri.path, '<xml><blah></blah></xml>', {"Accept-Encoding" => 'gzip'})
puts "#{res.body.size} bytes received."
# prints the headers of the response
res.each{ |h,v| puts "#{h}: #{v}" }
Look at Typhoeus as an alternate, and, in my opinion, easier to use gem, especially the "Making Quick Requests" section.
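To hand over everything that will be sent, the request headers can be collected into a hash before the request goes out. This is a minimal sketch: `request_headers` is a hypothetical helper, and the path and body are placeholders taken from the thread. Net::HTTP stores header names in lowercase and may add a few headers (such as Host and Connection) only at send time, so the `set_debug_output` trace remains the more complete view.

```ruby
require 'net/http'

# Collect the headers currently set on a request (or response) object.
def request_headers(req)
  headers = {}
  req.each_header { |name, value| headers[name] = value }
  headers
end

req = Net::HTTP::Post.new('/wbsapi/RequestListenerServlet')
req.body = '<xml><blah></blah></xml>'
req['Accept-Encoding'] = 'gzip'
req['Content-Type'] = 'application/xml'

# Print one "name: value" line per header, ready to paste into an email.
request_headers(req).each { |name, value| puts "#{name}: #{value}" }
```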

use ruby to get content length of URLs

I am trying to write a ruby script that gets some details about files on a website using net/http. My code looks like this:
require 'open-uri'
require 'net/http'
url = URI.parse asset
res = Net::HTTP.start(url.host, url.port) {|http|
http.get(asset)
}
headers = res.to_hash
p headers
I would like to get two pieces of information from this request: the total length of the content inflated, and (as appropriate) the length of the content deflated.
Sometimes, the headers will include a content-length parameter, which appears to be the gzipped length of the content. I can also approximate the inflated size of the content using res.body.length, but this has not been foolproof by any stretch of the imagination. The documentation on net/http says that gzip headers are removed from the list automatically (to help me, gee thanks) so I cannot seem to get a reliable handle on this information.
Any help is appreciated (including other gems if they will do this more easily).
Got it! The "magic" behavior here only occurs if you don't specify your own accept-encoding header. Amended code as follows:
require 'open-uri'
require 'net/http'
require 'date'
require 'zlib'
headers = { "accept-encoding" => "gzip;q=1.0,deflate;q=0.6,identity;q=0.3" }
url = URI.parse asset
res = Net::HTTP.start(url.host, url.port) {|http|
http.get(asset, headers)
}
headers = res.to_hash
gzipped = headers['content-encoding'] && headers['content-encoding'][0] == "gzip"
content = gzipped ? Zlib::GzipReader.new(StringIO.new(res.body)).read : res.body
full_length = content.length
compressed_length = (headers["content-length"] && headers["content-length"][0] || res.body.length)
You can try using sockets to send a HEAD request to the server, which is faster (no content), and leave out "Accept-Encoding: gzip" so that the response will not be gzipped.
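The two lengths can also be computed by a small function that works on a body and its Content-Encoding value, which makes the logic easy to check without a network connection. This is a sketch only: `body_lengths` is a hypothetical name, and it assumes the body was fetched with an explicit Accept-Encoding header so that Net::HTTP left it compressed.

```ruby
require 'zlib'
require 'stringio'

# Return the size of the body as transferred and after inflating.
# Only gzip is handled; any other encoding is treated as uncompressed.
def body_lengths(body, content_encoding)
  inflated =
    if content_encoding.to_s.downcase == 'gzip'
      Zlib::GzipReader.new(StringIO.new(body)).read
    else
      body
    end
  { transferred: body.bytesize, inflated: inflated.bytesize }
end

# Stand-in for a gzipped response body.
sample = Zlib.gzip('log line' * 500)
p body_lengths(sample, 'gzip')
```

With a real response, the call would look like `body_lengths(res.body, res['content-encoding'])`. `bytesize` is used instead of `length` so that multibyte characters do not skew the count.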

How to implement cookie support in ruby net/http?

I'd like to add cookie support to a Ruby class that uses net/http to browse the web. Cookies have to be stored in a file so that they survive after the script has ended. Of course I can read the specs and write some kind of handler, use some cookie.txt format and so on, but that seems like reinventing the wheel. Is there a better way to accomplish this task? Maybe some kind of cookie jar class to take care of cookies?
The accepted answer will not work if your server returns and expects multiple cookies. This could happen, for example, if the server returns a set of FedAuth[n] cookies. If this affects you, you might want to look into using something along the lines of the following instead:
# the host is given without a scheme; SSL is enabled separately
http = Net::HTTP.new('example.com', 443)
http.use_ssl = true
path1 = '/index.html'
path2 = '/index2.html'
# make a request to get the server's cookies
response = http.get(path1)
if (response.code == '200')
  all_cookies = response.get_fields('set-cookie')
  cookies_array = Array.new
  all_cookies.each { | cookie |
    cookies_array.push(cookie.split('; ')[0])
  }
  cookies = cookies_array.join('; ')
  # now make a request using the cookies
  response = http.get(path2, { 'Cookie' => cookies })
end
Taken from DZone Snippets
http = Net::HTTP.new('profil.wp.pl', 443)
http.use_ssl = true
path = '/login.html'
# GET request -> so the host can set its cookies
resp = http.get(path)
cookie = resp['set-cookie'].split('; ')[0]
# POST request -> logging in
data = 'serwis=wp.pl&url=profil.html&tryLogin=1&countTest=1&logowaniessl=1&login_username=blah&login_password=blah'
headers = {
  'Cookie' => cookie,
  'Referer' => 'http://profil.wp.pl/login.html',
  'Content-Type' => 'application/x-www-form-urlencoded'
}
resp = http.post(path, data, headers)
# Output on the screen -> we should get either a 302 redirect (after a successful login) or an error page
puts 'Code = ' + resp.code
puts 'Message = ' + resp.message
resp.each {|key, val| puts key + ' = ' + val}
puts resp.body
update
# To save the cookies, you can use PStore
require 'pstore'
cookies = PStore.new("cookies.pstore")
# Save the cookie
cookies.transaction do
cookies[:some_identifier] = cookie
end
# Retrieve the cookie back
cookies.transaction do
cookie = cookies[:some_identifier]
end
The accepted answer does not work. You need to access the internal representation of the response headers, where the multiple set-cookie values are stored separately, then remove everything after the first semicolon from each of these strings and join them together. Here is code that works:
r = http.get(path)
cookie = {'Cookie'=>r.to_hash['set-cookie'].collect{|ea|ea[/^.*?;/]}.join}
r = http.get(next_path,cookie)
Use http-cookie, which implements RFC-compliant parsing and rendering, plus a jar.
A crude example that happens to follow a redirect post-login:
require 'uri'
require 'net/http'
require 'http-cookie'
uri = URI('...')
jar = HTTP::CookieJar.new
Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
req = Net::HTTP::Post.new uri
req.form_data = { ... }
res = http.request req
res.get_fields('Set-Cookie').each do |value|
jar.parse(value, req.uri)
end
fail unless res.code == '302'
req = Net::HTTP::Get.new(uri + res['Location'])
req['Cookie'] = HTTP::Cookie.cookie_value(jar.cookies(uri))
res = http.request req
end
Why do this? Because the answers above are incredibly insufficient and flat out don't work in many RFC-compliant scenarios (happened to me), so relying on the very lib implementing just what's needed is infinitely more robust if you want to handle more than one particular case.
I've used Curb and Mechanize for a similar project.
Just enable cookies support and save the cookies to a temp cookiejar...
If you're using net/http or packages without built-in cookie support, you will need to write your own cookie handling.
You can send and receive cookies using headers.
You can store the header in any persistence layer, whether that is some sort of database or plain files.
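Persisting a cookie between runs with PStore, as suggested earlier in this thread, might look like the following. It is a minimal sketch that assumes a single cookie string per site is enough; `save_cookie`, `load_cookie`, the file name and the `:example_site` key are placeholders. Anything that needs expiry, domain or path handling is better served by a real cookie jar such as the http-cookie gem mentioned above.

```ruby
require 'pstore'

# Write the cookie string under +key+ in a PStore file.
def save_cookie(store_path, key, cookie)
  store = PStore.new(store_path)
  store.transaction { store[key] = cookie }
end

# Read the cookie string back; returns nil if nothing was saved yet.
def load_cookie(store_path, key)
  store = PStore.new(store_path)
  store.transaction(true) { store[key] } # true = read-only transaction
end
```

A script would call `save_cookie('cookies.pstore', :example_site, cookie)` after logging in, and on the next run pass `load_cookie('cookies.pstore', :example_site)` as the Cookie header if it is not nil.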

Resources