Net::HTTP Proxy list - ruby

I understand that you can use a proxy with Ruby's Net::HTTP. However, I have no idea how to do this with a bunch of proxies. I need Net::HTTP to switch to another proxy and send another POST request after every POST request. Also, is it possible to make Net::HTTP switch to another proxy if the previous proxy is not working? If so, how?
Code I'm trying to implement the script in:
require 'net/http'
sleep(8)
http = Net::HTTP.new('URLHERE', 80)
http.read_timeout = 5000
http.use_ssl = false
path = 'PATHHERE'
data = '(DATAHERE)'
headers = {
  'Referer' => 'REFERER HERE',
  'Content-Type' => 'application/x-www-form-urlencoded; charset=UTF-8',
  'User-Agent' => '(USERAGENTHERE)'}
resp, data = http.post(path, data, headers)
# Output on the screen -> we should get either a 302 redirect (after a successful login) or an error page
puts 'Code = ' + resp.code
puts 'Message = ' + resp.message
resp.each {|key, val| puts key + ' = ' + val}
puts data

Given an array of proxies, the following example will make a request through each proxy in the array until it receives a "302 Found" response. (This isn't actually a working example because Google doesn't accept POST requests, but it should work if you insert your own destination and working proxies.)
require 'net/http'

destination = URI.parse "http://www.google.com/search"
proxies = [
  "http://proxy-example-1.net:8080",
  "http://proxy-example-2.net:8080",
  "http://proxy-example-3.net:8080"
]

# Create your POST request_object once
request_object = Net::HTTP::Post.new(destination.request_uri)
request_object.set_form_data({"q" => "stack overflow"})

proxies.each do |raw_proxy|
  proxy = URI.parse raw_proxy

  # Create a new http_object for each new proxy
  http_object = Net::HTTP.new(destination.host, destination.port, proxy.host, proxy.port)

  # Make the request
  response = http_object.request(request_object)

  # If we get a 302, report it and break
  if response.code == "302"
    puts "#{proxy.host}:#{proxy.port} responded with #{response.code} #{response.message}"
    break
  end
end
You should also probably do some error checking with begin ... rescue ... end each time you make a request. If you don't do any error checking and a proxy is down, control will never reach the line that checks for response.code == "302" -- the program will just fail with some type of connection timeout error.
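For example, a minimal sketch of that error handling, assuming the same proxies array and request_object as above (the rescued exception classes are just the common connection failures; adjust them to your situation):

proxies.each do |raw_proxy|
  proxy = URI.parse raw_proxy
  http_object = Net::HTTP.new(destination.host, destination.port, proxy.host, proxy.port)
  http_object.open_timeout = 5

  begin
    response = http_object.request(request_object)
  rescue Net::OpenTimeout, Net::ReadTimeout, Errno::ECONNREFUSED, SocketError => e
    # This proxy is down or unreachable -- try the next one
    puts "#{proxy.host}:#{proxy.port} failed: #{e.class}"
    next
  end

  if response.code == "302"
    puts "#{proxy.host}:#{proxy.port} responded with #{response.code} #{response.message}"
    break
  end
end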
See the Net::HTTPHeader docs for other methods that can be used to customize the Net::HTTP::Post object.
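For instance, headers can be set on the request object like a hash before it is sent (these particular header values are only illustrative):

request_object['User-Agent'] = 'Mozilla/5.0 (compatible; MyScript/1.0)'
request_object['Referer'] = 'http://example.com/'
request_object.content_type = 'application/x-www-form-urlencoded'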

Related

Cannot make HTTP Delete request with Ruby's net/http library

I've been trying to make an API call to my server to delete a user record on a dev database. When I use Fiddler to call the URL with the DELETE operation, I am able to immediately delete the user record. When I call that same URL, again with the DELETE operation, from my script below, I get this error:
{"Message":"The requested resource does not support http method 'DELETE'."}
I have changed the url in my script below. The url I am using is definitely correct. I suspect that there is a logical error in my code that I haven't caught. My script:
require 'net/http'
require 'json'
require 'pp'
require 'uri'
def deleteUserRole
  # prepare request
  url = "http://my.database.5002143.access" # dev
  uri = URI.parse(url)
  request = Net::HTTP::Delete.new(uri.path)
  http = Net::HTTP.new(uri.host, uri.port)

  # send the request
  response = http.request(request)
  puts "response: \n"
  puts response.body
  puts "response code: " + response.code + "\n \n"

  # parse response
  buffer = response.body
  result = JSON.parse(buffer)
  status = result["Success"]
  if status == true
    puts "passed"
  else
    puts "failed"
  end
end
deleteUserRole
It turns out that I was typing in the wrong command. I needed to change this line:
request = Net::HTTP::Delete.new(uri.path)
to this line:
request = Net::HTTP::Delete.new(uri)
By typing uri.path I was excluding part of the URL from the API call. When I was debugging, I would type puts uri and that would show me the full URL, so I was certain the URL was right. The URL was right, but I was not including the full URL in my DELETE call.
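A quick way to see the difference (using a made-up URL with a query string):

uri = URI.parse("http://example.com/api/users?id=42")
puts uri              # http://example.com/api/users?id=42
puts uri.path         # /api/users        (scheme, host and query are gone)
puts uri.request_uri  # /api/users?id=42  (path plus query)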
If you omit the parameters that need to be passed when making the DELETE request, it won't work. You can do it like this:
uri = URI.parse('http://localhost/test')
http = Net::HTTP.new(uri.host, uri.port)
attribute_url = '?'
attribute_url << body.map{|k,v| "#{k}=#{v}"}.join('&')
request = Net::HTTP::Delete.new(uri.request_uri+attribute_url)
response = http.request(request)
where body is a hash in which you define the query params; the code above joins them into the URL before sending the request. For example:
body = { :resname => 'res', :bucket_name => 'bucket', :uploaded_by => 'upload' }
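Note that the snippet above does not URL-encode the values; a variant using URI.encode_www_form from the standard uri library handles the escaping (localhost/test is just the placeholder URL from above):

require 'uri'
require 'net/http'

body = { :resname => 'res', :bucket_name => 'bucket', :uploaded_by => 'upload' }
uri = URI.parse('http://localhost/test')
http = Net::HTTP.new(uri.host, uri.port)
request = Net::HTTP::Delete.new(uri.request_uri + '?' + URI.encode_www_form(body))
response = http.request(request)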

Ruby HTTP post with session cookie

I'm trying to write a Ruby script to use the API on the image gallery site Piwigo. This requires you to log in first with one HTTP POST and upload an image with another POST.
This is what I've got so far, but it doesn't work; it just returns a 401 error. Can anyone see where I'm going wrong?
require 'net/http'
require 'pp'

http = Net::HTTP.new('mydomain.com', 80)
path = '/piwigo/ws.php'
data = 'method=pwg.session.login&username=admin&password=password'
resp, data = http.post(path, data, {})
if (resp.code == '200')
  cookie = resp.response['set-cookie']
  data = 'method=pwg.images.addSimple&image=image.jpg&category=7'
  headers = { "Cookie" => cookie }
  resp, data = http.post(path, data, headers)
  puts resp.code
  puts resp.message
end
Which gives this response when run:
$ ruby piwigo.rb
401
Unauthorized
There is a Perl example on their API page which I was trying to convert to Ruby http://piwigo.org/doc/doku.php?id=dev:webapi:pwg.images.addsimple
By using the nice_http gem (https://github.com/MarioRuiz/nice_http), NiceHttp will take care of your cookies so you don't have to do anything:
require 'nice_http'

path = '/piwigo/ws.php'
data = '?method=pwg.session.login&username=admin&password=password'
http = NiceHttp.new('http://example.com')
resp = http.get(path + data)
if resp.code == 200
  resp = http.post(path)
  puts resp.code
  puts resp.message
end
Also, if you want, you can add your own cookies by using http.cookies.
You can use a gem called mechanize. It handles cookies transparently.
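A rough sketch of the same login-then-upload flow with mechanize (the host and form fields are taken from the question; pwg.images.addSimple normally also expects the image uploaded as multipart data, so treat this only as an outline of the cookie handling):

require 'mechanize'

agent = Mechanize.new
# The session cookie from the login response is kept in agent.cookie_jar
# and sent automatically with the next request.
agent.post('http://mydomain.com/piwigo/ws.php',
           'method'   => 'pwg.session.login',
           'username' => 'admin',
           'password' => 'password')

page = agent.post('http://mydomain.com/piwigo/ws.php',
                  'method'   => 'pwg.images.addSimple',
                  'image'    => 'image.jpg',
                  'category' => '7')
puts page.code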

Ruby - net/http - following redirects

I've got a URL, and I'm using HTTP GET to pass a query along to a page. With my most recent attempt (using net/http), the script doesn't go beyond the 302 response. I've tried several different solutions: HTTPClient, net/http, Rest-Client, Patron...
I need a way to continue to the final page in order to validate an attribute tag in that page's HTML. The redirection happens because a mobile user agent hits a page that redirects to a mobile view, hence the mobile user agent in the header. Here is my code as it is today:
require 'uri'
require 'net/http'

class Check_Get_Page
  def more_http
    url = URI.parse('my_url')
    req = Net::HTTP::Get.new(url.path, {
      'User-Agent' => 'Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_2 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8H7 Safari/6533.18.5'
    })
    res = Net::HTTP.start(url.host, url.port) {|http|
      http.request(req)
    }
    cookie = res.response['set-cookie']
    puts 'Body = ' + res.body
    puts 'Message = ' + res.message
    puts 'Code = ' + res.code
    puts "Cookie \n" + cookie
  end
end

m = Check_Get_Page.new
m.more_http
Any suggestions would be greatly appreciated!
To follow redirects, you can do something like this (taken from ruby-doc)
Following Redirection
require 'net/http'
require 'uri'
def fetch(uri_str, limit = 10)
  # You should choose a better exception.
  raise ArgumentError, 'HTTP redirect too deep' if limit == 0

  url = URI.parse(uri_str)
  req = Net::HTTP::Get.new(url.path, { 'User-Agent' => 'Mozilla/5.0 (etc...)' })
  response = Net::HTTP.start(url.host, url.port, use_ssl: url.scheme == 'https') { |http| http.request(req) }
  case response
  when Net::HTTPSuccess     then response
  when Net::HTTPRedirection then fetch(response['location'], limit - 1)
  else
    response.error!
  end
end
print fetch('http://www.ruby-lang.org/')
Given a URL that redirects
url = 'http://httpbin.org/redirect-to?url=http%3A%2F%2Fhttpbin.org%2Fredirect-to%3Furl%3Dhttp%3A%2F%2Fexample.org'
A. Net::HTTP
begin
response = Net::HTTP.get_response(URI.parse(url))
url = response['location']
end while response.is_a?(Net::HTTPRedirection)
Make sure that you handle the case when there are too many redirects.
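A minimal way to add that guard to the loop above (the cap of 10 is arbitrary):

require 'net/http'
require 'uri'

url = 'http://httpbin.org/redirect-to?url=http%3A%2F%2Fexample.org'
limit = 10

begin
  raise 'Too many HTTP redirects' if (limit -= 1) < 0
  response = Net::HTTP.get_response(URI.parse(url))
  url = response['location']
end while response.is_a?(Net::HTTPRedirection)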
B. OpenURI
require 'open-uri'
open(url).read
OpenURI::OpenRead#open follows redirects by default, but it doesn't limit the number of redirects.
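In newer Rubies the same call is spelled URI.open, and you can pass request headers such as the mobile User-Agent from the question (a sketch, with the header value shortened):

require 'open-uri'

html = URI.open('http://www.ruby-lang.org/',
                'User-Agent' => 'Mozilla/5.0 (iPhone; ...)').read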
I wrote another class for this based on examples given here, thank you very much everybody. I added cookies, parameters and exceptions and finally got what I need: https://gist.github.com/sekrett/7dd4177d6c87cf8265cd
require 'uri'
require 'net/http'
require 'openssl'

class UrlResolver
  def self.resolve(uri_str, agent = 'curl/7.43.0', max_attempts = 10, timeout = 10)
    attempts = 0
    cookie = nil

    until attempts >= max_attempts
      attempts += 1

      url = URI.parse(uri_str)
      http = Net::HTTP.new(url.host, url.port)
      http.open_timeout = timeout
      http.read_timeout = timeout
      path = url.path
      path = '/' if path == ''
      path += '?' + url.query unless url.query.nil?

      params = { 'User-Agent' => agent, 'Accept' => '*/*' }
      params['Cookie'] = cookie unless cookie.nil?

      request = Net::HTTP::Get.new(path, params)
      if url.instance_of?(URI::HTTPS)
        http.use_ssl = true
        # Note: this disables certificate verification
        http.verify_mode = OpenSSL::SSL::VERIFY_NONE
      end
      response = http.request(request)

      case response
      when Net::HTTPSuccess then
        break
      when Net::HTTPRedirection then
        location = response['Location']
        cookie = response['Set-Cookie']
        new_uri = URI.parse(location)
        uri_str = if new_uri.relative?
          (url + location).to_s
        else
          new_uri.to_s
        end
      else
        raise 'Unexpected response: ' + response.inspect
      end
    end
    raise 'Too many http redirects' if attempts == max_attempts

    uri_str
    # response.body
  end
end
puts UrlResolver.resolve('http://www.ruby-lang.org')
The reference that worked for me is here: http://shadow-file.blogspot.co.uk/2009/03/handling-http-redirection-in-ruby.html
Compared to most examples (including the accepted answer here), it's more robust as it handles URLs which are just a domain (http://example.com - needs to add a /), handles SSL specifically, and also relative URLs.
Of course you would be better off using a library like RESTClient in most cases, but sometimes the low-level detail is necessary.
Maybe you can use the curb-fu gem (https://github.com/gdi/curb-fu); the only thing is that it takes some extra code to make it follow redirects. I've used the following before. Hope it helps.
require 'rubygems'
require 'curb-fu'

module CurbFu
  class Request
    module Base
      def new_meth(url_params, query_params = {})
        curb = old_meth url_params, query_params
        curb.follow_location = true
        curb
      end

      alias :old_meth :build
      alias :build :new_meth
    end
  end
end

# this should follow the redirect because we instruct
# Curb.follow_location = true
print CurbFu.get('http://<your path>/').body
If you do not need to care about the details at each redirection, you can use the library Mechanize
require 'mechanize'

agent = Mechanize.new
begin
  response = agent.get(url)
rescue Mechanize::ResponseCodeError
  # response codes other than 200, 301, or 302
rescue Timeout::Error
rescue Mechanize::RedirectLimitReachedError
rescue StandardError
end
It will return the destination page.
Or you can turn off redirection like this:
agent.redirect_ok = false
Or you can optionally change some settings on the request:
agent.user_agent = "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.106 Mobile Safari/537.36"

Accessing Headers for Net::HTTP::Post in ruby

I have the following bit of code:
uri = URI.parse("https://rs.xxx-travel.com/wbsapi/RequestListenerServlet")
https = Net::HTTP.new(uri.host,uri.port)
https.use_ssl = true
req = Net::HTTP::Post.new(uri.path)
req.body = searchxml
req["Accept-Encoding"] ='gzip'
res = https.request(req)
This normally works fine, but the server at the other end is complaining about something in my XML, and the techies there need the XML message AND the headers that are being sent.
I've got the XML message, but I can't work out how to get at the headers that are being sent with the above.
To access headers, use the each_header method:
# Headers being sent (the request object):
req.each_header do |header_name, header_value|
  puts "#{header_name} : #{header_value}"
end

# Works with the response object as well:
res.each_header do |header_name, header_value|
  puts "#{header_name} : #{header_value}"
end
You can add:
https.set_debug_output $stderr
before the request, and you will see the actual HTTP request sent to the server in the console.
This is very useful for debugging this kind of scenario.
Take a look at the docs for Net::HTTP's post method. It takes the path of the uri value, the data (XML) you want to post, and then the headers you want to set; it returns the response, whose body you can read.
I can't test this because you've obscured the host, and odds are good it takes a registered account, but the code looks correct from what I remember when using Net::HTTP.
require 'net/http'
require 'uri'

uri = URI.parse("https://rs.xxx-travel.com/wbsapi/RequestListenerServlet")
https = Net::HTTP.new(uri.host, uri.port)
https.use_ssl = true
res = https.post(uri.path, '<xml><blah></blah></xml>', {"Accept-Encoding" => 'gzip'})
puts "#{res.body.size} bytes received."
res.each{ |h,v| puts "#{h}: #{v}" }
Look at Typhoeus as an alternative and, in my opinion, easier-to-use gem, especially the "Making Quick Requests" section of its documentation.
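A rough equivalent of the request above with Typhoeus (searchxml stands for the same XML payload as before; this is a sketch, not tested against that server):

require 'typhoeus'

response = Typhoeus.post(
  "https://rs.xxx-travel.com/wbsapi/RequestListenerServlet",
  body: searchxml,
  headers: { 'Accept-Encoding' => 'gzip' }
)
puts response.code
puts response.body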

How to implement cookie support in ruby net/http?

I'd like to add cookie support to a Ruby class that uses net/http to browse the web. Cookies have to be stored in a file to survive after the script has ended. Of course I can read the specs, write some kind of handler, use some cookie.txt format, and so on, but that seems to mean reinventing the wheel. Is there a better way to accomplish this task? Maybe some kind of cookie jar class to take care of the cookies?
The accepted answer will not work if your server returns and expects multiple cookies. This could happen, for example, if the server returns a set of FedAuth[n] cookies. If this affects you, you might want to look into using something along the lines of the following instead:
http = Net::HTTP.new('example.com', 443)
http.use_ssl = true
path1 = '/index.html'
path2 = '/index2.html'

# make a request to get the server's cookies
response = http.get(path1)
if (response.code == '200')
  all_cookies = response.get_fields('set-cookie')
  cookies_array = Array.new
  all_cookies.each { | cookie |
    cookies_array.push(cookie.split('; ')[0])
  }
  cookies = cookies_array.join('; ')

  # now make a request using the cookies
  response = http.get(path2, { 'Cookie' => cookies })
end
Taken from DZone Snippets
http = Net::HTTP.new('profil.wp.pl', 443)
http.use_ssl = true
path = '/login.html'
# GET request -> so the host can set his cookies
resp, data = http.get(path, nil)
cookie = resp.response['set-cookie'].split('; ')[0]
# POST request -> logging in
data = 'serwis=wp.pl&url=profil.html&tryLogin=1&countTest=1&logowaniessl=1&login_username=blah&login_password=blah'
headers = {
  'Cookie' => cookie,
  'Referer' => 'http://profil.wp.pl/login.html',
  'Content-Type' => 'application/x-www-form-urlencoded'
}
resp, data = http.post(path, data, headers)
# Output on the screen -> we should get either a 302 redirect (after a successful login) or an error page
puts 'Code = ' + resp.code
puts 'Message = ' + resp.message
resp.each {|key, val| puts key + ' = ' + val}
puts data
Update
To save the cookies, you can use PStore:
require 'pstore'

cookies = PStore.new("cookies.pstore")

# Save the cookie
cookies.transaction do
  cookies[:some_identifier] = cookie
end

# Retrieve the cookie back
cookies.transaction do
  cookie = cookies[:some_identifier]
end
The accepted answer does not work. You need to access the internal representation of the response header, where the multiple set-cookie values are stored separately, then remove everything after the first semicolon from each of these strings and join them together. Here is code that works:
r = http.get(path)
cookie = { 'Cookie' => r.to_hash['set-cookie'].collect { |ea| ea[/^.*?;/] }.join }
r = http.get(next_path, cookie)
Use http-cookie, which implements RFC-compliant parsing and rendering, plus a jar.
A crude example that happens to follow a redirect post-login:
require 'uri'
require 'net/http'
require 'http-cookie'

uri = URI('...')
jar = HTTP::CookieJar.new

Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
  req = Net::HTTP::Post.new uri
  req.form_data = { ... }
  res = http.request req

  res.get_fields('Set-Cookie').each do |value|
    jar.parse(value, req.uri)
  end

  fail unless res.code == '302'

  req = Net::HTTP::Get.new(uri + res['Location'])
  req['Cookie'] = HTTP::Cookie.cookie_value(jar.cookies(uri))
  res = http.request req
end
Why do this? Because the answers above are insufficient and flat out don't work in many RFC-compliant scenarios (it happened to me). Relying on the very lib that implements just what's needed is far more robust if you want to handle more than one particular case.
I've used Curb and Mechanize for a similar project. Just enable cookie support and save the cookies to a temporary cookie jar...
If you're using net/http or packages without cookie support built in, you will need to write your own cookie handling. You can send and receive cookies using headers.
You can store the header in any persistence framework, whether that is some sort of database or files.
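For example, a bare-bones sketch of persisting a single cookie header between runs in a plain text file (the file name and host are arbitrary):

require 'net/http'

COOKIE_FILE = 'cookie.txt'

# Reload a previously saved cookie, if any
cookie = File.exist?(COOKIE_FILE) ? File.read(COOKIE_FILE) : nil

http = Net::HTTP.new('example.com', 80)
headers = cookie ? { 'Cookie' => cookie } : {}
response = http.get('/', headers)

# Save whatever the server sets, for the next run
if (set_cookie = response['set-cookie'])
  File.write(COOKIE_FILE, set_cookie.split('; ').first)
end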
