Ruby's open-uri and cookies - ruby

I would like to store the cookies from one open-uri call and pass them to the next one. I can't seem to find the right docs for doing this. I'd appreciate it if you could tell me the right way to do this.
NOTES: w3.org is not the actual url, but it's shorter; pretend cookies matter here.
h1 = open("http://www.w3.org/")
h2 = open("http://www.w3.org/People/Berners-Lee/", "Cookie" => h1.FixThisSpot)
Update after 2 nays: While this wasn't intended as a rhetorical question, I guarantee that it's possible.
Update after tumbleweeds: See (the answer), it's possible. Took me a good while, but it works.

I thought someone would just know, but I guess it's not commonly done with open-uri.
Here's the ugly version that neither checks for privacy, expiration, the correct domain, nor the correct path:
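require 'open-uri'
# URI.open replaces the bare open(url) form, which Ruby 3.0 no longer supports for URLs.
h1 = URI.open("http://www.w3.org/")
h2 = URI.open("http://www.w3.org/People/Berners-Lee/",
"Cookie" => h1.meta['set-cookie'].split('; ',2)[0])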
Yes, it works. No, it's not pretty, nor fully compliant with recommendations, nor does it handle multiple cookies (as is).
Clearly, HTTP is a very straightforward protocol, and open-uri lets you at most of it. I guess what I really needed to know was how to get the cookie from the h1 request so that it could be passed to the h2 request (that part I already knew and showed). The surprising thing here is how many people basically felt like answering by telling me not to use open-uri, and only one of those showed how to get a cookie set in one request passed to the next request.
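The one-cookie version can be stretched to several cookies. A minimal sketch, assuming the Set-Cookie values are already available as an array of strings; cookie_header is a made-up helper name, and like the ugly version it ignores expiration, domain, and path:

```ruby
# Build a "Cookie" request header from a list of Set-Cookie response values.
# Keeps only the leading name=value pair of each cookie and drops attributes
# such as Path, Expires, or HttpOnly. No expiry, domain, or path checks.
def cookie_header(set_cookie_values)
  set_cookie_values
    .map { |value| value.split(';', 2).first.strip }
    .reject(&:empty?)
    .join('; ')
end

puts cookie_header([
  "session=abc123; Path=/; HttpOnly",
  "theme=dark; Expires=Wed, 09 Jun 2032 10:18:14 GMT"
])
```

With Net::HTTP, response.get_fields('set-cookie') returns such an array (or nil when the header is absent), so that is probably the easier place to collect the values from.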

You need to add a "Cookie" header.
I'm not sure if open-uri can do this or not, but it can be done using Net::HTTP.
# Create a new connection object.
conn = Net::HTTP.new(site, port)
# Get the response when we login, to set the cookie.
# body is the encoded arguments to log in.
resp, data = conn.post(login_path, body, {})
cookie = resp.response['set-cookie']
# Headers need to be in a hash.
headers = { "Cookie" => cookie }
# On a get, we don't need a body.
resp, data = conn.get(path, headers)

Thanks, Matthew Schinckel, your answer was really useful. Using Net::HTTP I was successful:
# Create a new connection object.
site = "google.com"
port = 80
conn = Net::HTTP.new(site, port)
# Get the response when we login, to set the cookie.
# body is the encoded arguments to log in.
resp, data = conn.post(login_path, body, {})
cookie = resp.response['set-cookie']
# Headers need to be in a hash.
headers = { "Cookie" => cookie }
# On a get, we don't need a body.
resp, data = conn.get(path, headers)
puts resp.body

Depending on what you are trying to accomplish, check out webrat. I know it is usually used for testing, but it can also hit live sites, and it does a lot of the stuff that your web browser would do for you, like store cookies between requests and follow redirects.

If you are using open-uri, you would have to roll your own cookie support by parsing the meta headers when reading a response and adding a Cookie header when submitting a request. Consider using httpclient (http://raa.ruby-lang.org/project/httpclient/) or something like mechanize (http://mechanize.rubyforge.org/) instead, as they have cookie support built in.

There is an RFC 2109 and RFC 2965 cookie jar implementation to be found here for those who want standards-compliant cookie handling:
https://github.com/dwaite/cookiejar

Related

Using Ruby Script to perform a login

My goal is to use a Ruby script to perform a login.
The website uses JavaScript to render the login form, therefore I cannot use mechanize. I want to avoid using Selenium.
If I log in with false data, I can see under the network section that an action URL is requested ->
Request URL: https://www.example.com/admin/bocontroller/bocontroller.cfm?action=dologin
further down I can see the Form Data
->
username: Sample
password: 12345678
Based on this I tried to write several scripts (this being the closest, I hope...)
require "net/http"
require "uri"
uri = URI.parse("https://www.example.com/admin/bocontroller/bocontroller.cfm?action=dologin")
http = Net::HTTP.new(uri.host, uri.port)
request = Net::HTTP::Post.new(uri.request_uri)
request.set_form_data({'username' => 'Sample', 'password' => '12345678'})
request["Content-Type"] = "application/json"
response = http.request(request)
Unfortunately, my script just stops running... and I am kind of lost. Can anyone give me some hints to lead me in the right direction? Is this the right approach?
As it seems to have gained some traction as a comment, I thought I'd move it to an answer.
There's a good chance the request is timing out because of the site's CSRF protection. The Rails docs explain the idea: https://guides.rubyonrails.org/security.html#csrf-countermeasures.
In a nutshell, sites will send (and require) an authenticity token along with any potentially transformative request (POST, PUT, DELETE, etc.), in order to prevent people from sending such requests from outside the domain - as you're doing.
I'm not suggesting you have ill intent; the mechanism simply stops anyone from gaining access to something they shouldn't by sending requests in a way the site doesn't intend.
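If the site does use such a token, the script would have to fetch the login page first, read the token out of it, and send it back with the credentials and the session cookie. A rough sketch under stated assumptions: the hidden field name authenticity_token follows the Rails convention and may well differ on the target site, the HTML snippet and cookie value are placeholders, and both helper names are hypothetical. Nothing is sent over the network here.

```ruby
require "net/http"
require "uri"

# Pull the value of a hidden input out of an HTML string.
# Assumes name= comes before value= in the tag; a real page may need an HTML parser.
def hidden_field_value(html, field_name)
  pattern = /<input[^>]*name=["']#{Regexp.escape(field_name)}["'][^>]*value=["']([^"']*)["']/i
  match = html.match(pattern)
  match && match[1]
end

# Build (but do not send) a form-encoded login POST carrying the token and cookie.
def build_login_request(uri, username, password, token, cookie)
  request = Net::HTTP::Post.new(uri.request_uri)
  # set_form_data also sets Content-Type to application/x-www-form-urlencoded.
  request.set_form_data("username" => username, "password" => password,
                        "authenticity_token" => token)
  request["Cookie"] = cookie
  request
end

uri = URI.parse("https://www.example.com/admin/bocontroller/bocontroller.cfm?action=dologin")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = (uri.scheme == "https") # Net::HTTP.new does not turn on TLS by itself

page = '<form><input type="hidden" name="authenticity_token" value="tok123"></form>'
token = hidden_field_value(page, "authenticity_token")
request = build_login_request(uri, "Sample", "12345678", token, "session=abc")
# response = http.request(request) # left commented out: this would contact the server
```

The sketch keeps the form-encoded content type that set_form_data sets instead of overriding it with application/json, since the body is not JSON.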

Ruby basic syntax and Net::HTTP

I am completely new to ruby. I have the following code:
body = "hello"
site = "api.mysite.net"
port = 80
conn = Net::HTTP.new(site, port)
resp, data = conn.post("/v1/profile", body, {})
puts body
My questions are:
1. Where should I go for documentation on how Net::HTTP.new, conn.post, etc. work?
2. What does the comma between resp and data mean?
3. How come puts body gives me nothing even though I have hello defined initially? When it is passed through post(), I figured it would be assigned a value, but instead puts resp.body actually gives me the HTTP response.
This is all so new to me, just trying to get a handle on things.
1. Read the docs I guess, but you will need background knowledge on HTTP to really understand it.
2. That's shorthand for assigning two variables at the same time, assuming the right-hand side returns an array of 2 (or more) items.
3. You've posted the body in your request, resp.body is the body in the response. I don't know why body should be empty though. I would double-check that, but it sounds like a side effect of conn.post if anything.
BTW there are several nice 3rd-party gems which make HTTP client development much easier than dealing with Net::HTTP, e.g. REST Client, Excon, HTTParty. Check these out. Or if you want to use the standard Ruby library, also look at open-uri as a higher-level API.
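The comma on the left-hand side is Ruby's multiple assignment. A small illustration of how it behaves with an array and with a single non-array value (the variable names and strings are arbitrary):

```ruby
# An array on the right-hand side is spread across the variables.
first, second = ["response", "payload"]

# A single non-array value goes to the first variable; the rest become nil.
only, missing = "response"

p first, second, only, missing
```

As far as I know, conn.post in current Ruby versions returns a single Net::HTTPResponse object, so in resp, data = conn.post(...) the data variable ends up nil and the response body is read from resp.body.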

Maintaining session and cookies over a 302 redirect

I am trying to fetch a PDF file that gets generated on-demand behind an auth wall. Based on my testing, the flow is as follows:
I make a GET request with several parameters (including auth credentials) to the appropriate page. That page validates my credentials and then processes my request. When the request is finished processing (nearly instantly), I am sent a 302 response that redirects me to the location of the generated PDF. This PDF can then only be accessed by that session.
Using a browser, there's really nothing strange that happens. I attempted to do the same via curl and wget without any optional parameters, but those both failed. I was able to get curl working by adding -L -b /tmp/cookie.txt as options, though (to follow redirects and store cookies).
According to the ruby-doc, using Net::HTTP.start should get me close to what I want. After playing around with it, I was indeed fairly close. I believe the only issue, however, was that my Set-Cookie values were different between requests, even though they were using the same http object in the same start block.
I tried keeping it as simple as possible and then expanding once I got the results I was looking for:
url = URI.parse("http://dev.example.com:8888/path/to/page.jsp?option1=test1&option2=test2&username=user1&password=password1")
Net::HTTP.start(url.host, url.port) do |http|
# Request the first URL
first_req = Net::HTTP::Get.new url
first_res = http.request first_req
# Grab the 302 redirect location (it will always be relative like "../servlet/sendfile/result/543675843657843965743895642865273847328.pdf")
redirect_loc = URI.parse(first_res['Location'])
# Request the PDF
second_req = Net::HTTP::Get.new redirect_loc
second_res = http.request second_req
end
I also attempted to use http.get instead of creating a new request each time, but still no luck.
The problem is with the cookie: it should be passed along with the second request, and it has to be read from the first response. Something like:
second_req = Net::HTTP::Get.new(redirect_loc.path, {'Cookie' => first_res['Set-Cookie']})
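Two details are easy to miss here: the Location header is relative and has to be resolved against the original URL, and the cookie comes from the first response, not the first request. A sketch under those assumptions; redirect_request is an illustrative helper name, the sample values are made up, and only the name=value part of each cookie is kept:

```ruby
require "net/http"
require "uri"

# Build the follow-up GET for a redirect, carrying the cookies from the first response.
# Returns the resolved target URI together with the prepared request.
def redirect_request(original_url, location, set_cookie_values)
  target = URI.join(original_url, location) # resolves "../..." against the original URL
  cookie = set_cookie_values.map { |v| v.split(";", 2).first.strip }.join("; ")
  [target, Net::HTTP::Get.new(target.request_uri, { "Cookie" => cookie })]
end

target, request = redirect_request(
  "http://dev.example.com:8888/path/to/page.jsp?option1=test1",
  "../servlet/sendfile/result/report.pdf",
  ["JSESSIONID=abc123; Path=/"]
)
puts target
puts request["Cookie"]
```

Inside the Net::HTTP.start block, first_res.get_fields('Set-Cookie') would supply the array of cookie values (it is nil when the header is missing), and http.request(request) would send the prepared request over the same connection.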

How to properly close a HTTP connection in Ruby

I've been looking for a proper way to close a HTTP connection and have found nothing yet.
require 'net/http'
require 'uri'
uri = URI.parse("http://www.google.com")
http = Net::HTTP.new(uri.host, uri.port)
request = Net::HTTP::Get.new("/")
http.start
resp = http.request request
p resp.body
So far so good, but now when I try to close the connection guessing which method to use:
http.shutdown # error
http.close # error
When I check ri Net::HTTP.start, the documentation explicitly says:
the caller is responsible for closing it (the connection) upon completion
I don't want to use the block form.
The method to close the connection is http.finish.
The net/http API is particularly confusing to use. It is generally easier to use a higher-level library instead, such as HTTParty or REST Client, which will provide a more intuitive API and take care of the lower level details for you.
If the optional block is given, the newly created Net::HTTP object is passed to it and closed when the block finishes.
from ruby-doc.org
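Without the block form, one way to make sure finish always runs is an ensure clause. A minimal sketch; fetch_body is a hypothetical helper, not part of net/http. Since finish raises IOError on a connection that was never started, the sketch checks started? first:

```ruby
require "net/http"

# Start the connection if needed, fetch one path, and always close afterwards.
def fetch_body(http, path)
  http.start unless http.started?
  http.request(Net::HTTP::Get.new(path)).body
ensure
  # finish raises IOError if the connection was never opened, so check first.
  http.finish if http.started?
end

# Usage (contacts the network, so it is left commented out):
# http = Net::HTTP.new("www.google.com", 80)
# puts fetch_body(http, "/")
```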

Post request with body_stream and parameters

I'm building some kind of proxy.
When I call some URL in a Rack application, I forward that request to another URL.
The request I forward is a POST with a file and some parameters.
I want to add more parameters.
But the file can be quite big. So I send it with Net::HTTP#body_stream instead of Net::HTTP#body.
I get my request as a Rack::Request object and I create my Net::HTTP object with that.
req = Net::HTTP::Post.new(request.path_info)
req.body_stream = request.body
req.content_type = request.content_type
req.content_length = request.content_length
http = Net::HTTP.new(@host, @port)
res = http.request(req)
I've tried several ways to add the proxy's parameters, but it seems nothing in Net::HTTP allows adding parameters to a body_stream request, only to a body one.
Is there a simpler way to proxy a Rack request like that? Or a clean way to add my parameters to my request?
Well, as I see it, this is normal behaviour. I'll explain why. If you only have access to a Rack::Request, then (I guess) your middleware does not parse the request (you do not include something like ActionController::ParamsParser), so you don't have access to a hash of parameters, only to a StringIO. This StringIO corresponds to a stream like:
Content-Type: multipart/form-data; boundary=AaB03x
--AaB03x
Content-Disposition: form-data; name="param1"
value1
--AaB03x
Content-Disposition: form-data; name="files"; filename="file1.txt"
Content-Type: text/plain
... contents of file1.txt ...
--AaB03x--
What you are trying to do with the Net::HTTP class is to: (1) parse the request into a hash of parameters; (2) merge the parameters hash with your own parameters; (3) recreate the request. The problem is that the Net::HTTP library can't do (1), since it is a client library, not a server one.
Therefore, you cannot avoid parsing your request somehow before adding the new parameters.
Possible solutions:
Insert ActionController::ParamsParser before your middleware. After that, you may use the excellent rest-client lib to do something like:
RestClient.post ('http://your_server' + request.path_info), :params => params.merge(your_params)
You can attempt to make a wrapper around the StringIO object and add your own parameters at the end of the stream. However, this is neither trivial nor advisable.
Might be one year too late, but I had the same issue verifying PayPal IPNs. I wanted to forward the IPN request back to PayPal for verification but needed to add :cmd => '_notify-validate'.
Instead of modifying the body stream, or body, I appended it as part of the URL path, like so:
reply_request = Net::HTTP::Post.new(url.path + '?cmd=_notify-validate')
It seems a bit of a hack, but I think it's worth it if you aren't going to use it for anything else.
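A slightly more general form of the same idea, sketched under the assumption that the receiving side accepts the extra parameters in the query string; path_with_params is a hypothetical helper. The streamed body is left untouched:

```ruby
require "uri"

# Append extra parameters to a request path, keeping any existing query string.
def path_with_params(path, extra_params)
  return path if extra_params.empty?
  separator = path.include?("?") ? "&" : "?"
  path + separator + URI.encode_www_form(extra_params)
end

puts path_with_params("/cgi-bin/webscr", { "cmd" => "_notify-validate" })
puts path_with_params("/upload?token=abc", { "source" => "proxy", "note" => "a b" })
```

In the proxy from the question, the result would go into Net::HTTP::Post.new(path_with_params(request.path_info, your_params)), with body_stream, content_type, and content_length set as before.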

Resources