I'm writing a plain Ruby script to scrape a website using Nokogiri. Whenever I pass proxy options to the open-uri open() method, it returns 407 Proxy Authentication Required, even though my options do include the authentication details. Here's my code:
proxy_url = URI.parse("http://12.34.567.89:PORT")
session = Nokogiri::HTML(open("http://google.com", :proxy_http_basic_authentication => [proxy_url, "username", "password"]))
Note: As my proxy is a premium one, I have replaced the real proxy credentials with fake ones.
I have a restrictive proxy at work, but the following works.
Try the code with your proxy credentials.
I used Nokogiri here for parsing, but you don't really need it for getting the HTML.
require 'net/http'
require 'uri'
require 'nokogiri'
url = 'http://stackoverflow.com/questions/32818853/ruby-open-uri-proxy-authentication-fails'
proxy_host, proxy_port, proxy_user, proxy_pass = '****', 8080, "*****", "*****"
uri = URI.parse(url)
Net::HTTP::Proxy(proxy_host, proxy_port, proxy_user, proxy_pass).start(uri.host, uri.port) do |http|
  http.get(uri.path) do |str|
    puts Nokogiri::HTML(str).text
  end
end
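As a variation on the answer above, here is a minimal sketch that passes the proxy details straight to Net::HTTP.new and then inspects the resulting configuration without sending anything. The host, port, and credentials are hypothetical placeholders, and proxied_http is just an illustrative helper name, not part of any library.

```ruby
require 'net/http'
require 'uri'

# Hypothetical placeholders; substitute your own proxy details.
PROXY_HOST = 'proxy.example.com'
PROXY_PORT = 8080
PROXY_USER = 'username'
PROXY_PASS = 'password'

# Build (but do not open) a connection that is routed through the proxy.
def proxied_http(uri)
  http = Net::HTTP.new(uri.host, uri.port,
                       PROXY_HOST, PROXY_PORT, PROXY_USER, PROXY_PASS)
  http.use_ssl = (uri.scheme == 'https')
  http
end

uri  = URI.parse('http://example.com/')
http = proxied_http(uri)

# Nothing has been sent yet; these calls only read the configuration.
puts http.proxy?
puts http.proxy_address
puts http.proxy_port
```

If the values printed here are not the ones you expect, the 407 is probably coming from the way the credentials are passed rather than from the proxy itself.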
Related
I'm trying to write some scripts in Ruby to interface with Guild Wars 2's API (https://api.guildwars2.com/v2)
At the bottom of that page it has this info:
APIs which require authentication need to be passed an API key belonging to
the account to be accessed. The API key must have the appropriate permissions
associated with it (/v2/tokeninfo can be used to inspect key permissions). Keys
can be generated on the ArenaNet account site.
Keys can be passed either via query parameter or HTTP header. Our servers do
not support preflighted CORS requests, so if your application is running
in the user's browser you'll need to user the query parameter.
To pass via query parameter, include "?access_token=" in your request.
To pass via HTTP header, include "Authentication: Bearer (API key)".
The code I'm working with right now is as follows:
class Gw2
  attr_reader :response, :uri, :http
  def initialize
    @uri = URI.parse('https://api.guildwars2.com/v2')
    @http = Net::HTTP.new(@uri.host, @uri.port)
    @http.use_ssl = true
    @http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  end
  def wallet
    path = "/v2/account/wallet"
    @response = @http.get(path).body
  end
end
I'm not sure how to go about setting that up.
Here is a little example:
require 'net/http'
require 'uri'
url = URI.parse('http://some.url')
req = Net::HTTP::Get.new(url.path)
req.add_field('X-Forwarded-For', '0.0.0.0')
# For content type, you could also use content_type=(type, params={})
# req.set_form_data({'query' => 'search me'})
# req['X-Forwarded-For'] = '0.0.0.0'
res = Net::HTTP.new(url.host, url.port).start do |http|
  http.request(req)
end
puts res.body
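Applied to the Guild Wars 2 question, the same header technique might look like the sketch below. It assumes the key goes in the standard Authorization header with a Bearer prefix (the quoted text calls the header "Authentication", so check the API docs if this is rejected). API_KEY and the helper names are hypothetical; the path is the one from the question.

```ruby
require 'net/http'
require 'uri'

API_KEY = 'YOUR-API-KEY' # hypothetical placeholder

# Option 1: send the key in a request header (header name is an assumption).
def wallet_request_with_header(api_key)
  uri = URI.parse('https://api.guildwars2.com/v2/account/wallet')
  req = Net::HTTP::Get.new(uri.request_uri)
  req['Authorization'] = "Bearer #{api_key}"
  [uri, req]
end

# Option 2: send the key as the access_token query parameter.
def wallet_uri_with_query(api_key)
  uri = URI.parse('https://api.guildwars2.com/v2/account/wallet')
  uri.query = URI.encode_www_form('access_token' => api_key)
  uri
end

# Sends the request; needs network access and a real key, so it is not
# called in this sketch.
def fetch(uri, req)
  Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
end

uri, req = wallet_request_with_header(API_KEY)
puts req['Authorization']
puts wallet_uri_with_query(API_KEY)
```

With a real key, `fetch(uri, req).body` would return the response body for the header variant.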
I'm using Net::HTTP in my Ruby code to make HTTP requests. For example, to make a POST request I do
require 'net/http'
Net::HTTP.post_form(url,{'email' => email,'password' => password})
This works. But I'm unable to make a DELETE request, i.e.
require 'net/http'
Net::HTTP::Delete(url)
gives the following error
NoMethodError: undefined method `Delete' for Net::HTTP:Class
The documentation at http://ruby-doc.org/stdlib-1.9.3/libdoc/net/http/rdoc/Net/HTTP.html shows that Delete is available. So why is it not working in my case?
Thank You
The documentation tells you that Net::HTTP::Delete is a class, not a method.
Try Net::HTTP.new('www.server.com').delete('/path') instead.
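To make the class-versus-method point concrete, here is a small sketch that only inspects objects and sends nothing. The localhost URL is a placeholder, and send_delete is a hypothetical helper name.

```ruby
require 'net/http'
require 'uri'

# Net::HTTP::Delete is a request class, so it is instantiated with .new
# rather than called like a method.
puts Net::HTTP::Delete.class

# post_form really is a class method on Net::HTTP; Delete is not.
puts Net::HTTP.respond_to?(:post_form)
puts Net::HTTP.respond_to?(:Delete)

uri = URI('http://localhost:8080/customer/johndoe') # placeholder URL
req = Net::HTTP::Delete.new(uri.path)
puts req.method
puts req.path

# Sending the request needs a running server, so it is not called here.
def send_delete(uri, req)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
end
```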
uri = URI('http://localhost:8080/customer/johndoe')
http = Net::HTTP.new(uri.host, uri.port)
req = Net::HTTP::Delete.new(uri.path)
res = http.request(req)
puts "deleted #{res}"
Simple post and delete requests, see docs for more:
puts Net::HTTP.new("httpbin.org").post("/post", "a=1").body
puts Net::HTTP.new("httpbin.org").delete("/delete").body
This works for me:
uri = URI(YOUR_URL)
req = Net::HTTP::Delete.new(uri, {}) # the second argument is an optional hash of request headers
response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
  http.request req
end
So far as I can tell the Google::Reader API is working fine, in that it returns an sid successfully. However, lower level interactions with gmail won't run properly:
warning: peer certificate won't be verified in this SSL session
#<Google::Reader::Base:0xb76efa0c
@email="hawat.thufir",
@password="pword",
@sid=
"DQAAAL4AAACq-Wrm1V_anY1sV4r_3kA4EuRax9oTt5z7upD6NNfT0e7bsN-8WA7cQOTt7zypI5fymS9Ux8QTtyu-7xal9c6szb2ZoeBR5dwPH_m7OrBe6ICkKY-dPus0_g5DFW6tckpCZmJIyrP9zfUQKJzGYjnYKJzJEJYFEdvMu756Hl68qeD6AuGKDdFWbyBEvgQGR2oFjkxHYGqwTQ9oHJBfBkMH9hrDl2Q9C_cVE5A-_Bb9RiUy6WuwIbS-pPN56z3XtpA">
#<URI::HTTPS:0xb76e7988 URL:https://hawat.thufir:pword@gmail.com>
#<Net::HTTP gmail.com:443 open=false>
#<Net::HTTP::Get GET>
["Basic aGF3YXQudGh1ZmlyOmRldm90Y2hrYQ=="]
#<Net::HTTP::Get GET>
/usr/lib/ruby/1.8/net/http.rb:1060:in `request': undefined method `closed?' for nil:NilClass (NoMethodError)
from ./req_uri.rb:23
code:
#!/usr/bin/ruby -w
require 'rubygems'
require 'google/reader'
require 'pp'
require 'net/http'
require 'net/https'
require 'uri'
require 'yaml'
yml = YAML.load_file 'login.yml'
user = yml["user"]
pword = yml["pword"]
pp Google::Reader::Base.establish_connection(user, pword)
uri = URI.parse "https://#{user}:#{pword}@gmail.com"
pp uri
pp http = Net::HTTP.new(uri.host, uri.port)
pp request = Net::HTTP::Get.new(uri.request_uri)
pp request.basic_auth(user, pword)
pp request
response = http.request(request)
So, the question is, should the request be basically empty when printed? What's wrong with sending the request to the response? That seems to be correct so far as I can ascertain. What am I missing?
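Two observations on the code above, as a hedged sketch rather than a confirmed diagnosis. First, a request object prints as just its class and verb, so the "empty" output is expected; the headers are still there. Second, the connection is created for port 443 but use_ssl is never set, which may be what leads to the error. The credentials below are placeholders.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

uri  = URI.parse('https://gmail.com/')
http = Net::HTTP.new(uri.host, uri.port)

# Assumption: the missing piece is enabling TLS for the HTTPS port.
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_PEER

request = Net::HTTP::Get.new(uri.request_uri)
request.basic_auth('user', 'pword') # placeholder credentials

# inspect shows only the class and verb; headers are read with [].
puts request.inspect
puts request['Authorization']
puts http.use_ssl?

# Sending needs network access, so it is left commented out:
# response = http.request(request)
```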
Could someone tell me how I can fetch (GET) a URL (with params) using Ruby? I found a bunch of examples online but I couldn't find one that explained how I can also pass the parameters.
require 'net/http'
require 'uri'
uri = URI.parse("http://www.example.com/?test=1")
response = Net::HTTP.get_response uri
p response.body
There are also some other good HTTP clients or wrappers, such as HTTParty.
require 'rubygems'
require 'httparty'
response = HTTParty.get("http://www.example.com/?test=1")
p response.body
I use something like the following, it's pretty simple and doesn't make you build your own query string:
require 'net/http'
response = nil
Net::HTTP.start "example.com", 80 do |http|
  request = Net::HTTP::Get.new "/endpoint"
  request.form_data = {:q => "123"}
  response = http.request(request)
end
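If you would rather have the parameters in the URL itself, one option is to build the query string with URI.encode_www_form, which handles escaping for you. A minimal sketch follows; build_uri is a hypothetical helper and the endpoint is a placeholder.

```ruby
require 'net/http'
require 'uri'

# Returns a URI whose query string is built from a hash of parameters.
def build_uri(base, params)
  uri = URI.parse(base)
  uri.query = URI.encode_www_form(params)
  uri
end

uri = build_uri('http://www.example.com/endpoint', { 'q' => '123', 'test' => 1 })
puts uri

# Special characters are escaped for you.
puts build_uri('http://www.example.com/', { 'q' => 'a b&c' }).query

# Fetching needs network access, so it is left commented out:
# response = Net::HTTP.get_response(uri)
```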
I missed this one. The solutions are here.
Parametrized get request in Ruby?
I would like to fetch information from another website. So I probably need to make a request to that website (in my case an HTTP GET request) and receive the response.
How can I do this in Ruby on Rails?
If it is possible, is it a correct approach to do this in my controllers?
You can use Ruby's Net::HTTP class:
require 'net/http'
url = URI.parse('http://www.example.com/index.html')
req = Net::HTTP::Get.new(url.to_s)
res = Net::HTTP.start(url.host, url.port) {|http|
  http.request(req)
}
puts res.body
Net::HTTP is built into Ruby, but let's face it, often it's easier not to use its cumbersome 1980s style and try a higher level alternative:
HTTP Gem
HTTParty
RestClient
Excon
Feedjira (RSS only)
OpenURI is the best; it's as simple as
require 'open-uri'
response = URI.open('http://example.com').read # plain open() no longer accepts URLs as of Ruby 3.0
require 'net/http'
result = Net::HTTP.get(URI.parse('http://www.example.com/about.html'))
# or
result = Net::HTTP.get(URI.parse('http://www.example.com'), '/about.html')
I prefer httpclient over Net::HTTP.
require 'httpclient'
client = HTTPClient.new
puts client.get_content('http://www.example.com/index.html')
HTTParty is a good choice if you're making a class that's a client for a service. It's a convenient mixin that gives you 90% of what you need. See how short the Google and Twitter clients are in the examples.
And to answer your second question: no, I wouldn't put this functionality in a controller--I'd use a model instead if possible to encapsulate the particulars (perhaps using HTTParty) and simply call it from the controller.
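To illustrate keeping the HTTP call out of the controller, here is a rough sketch of a plain Ruby class that wraps the fetch. RemotePage, its fetcher: option, and the title method are all hypothetical names made up for this example. It uses Net::HTTP rather than HTTParty so it runs without extra gems, and the usage at the bottom swaps in a stub so nothing is actually requested.

```ruby
require 'net/http'
require 'uri'

# Hypothetical model-style wrapper around a remote page.
class RemotePage
  # fetcher is anything that responds to call(uri) and returns the body.
  def initialize(url, fetcher: Net::HTTP.method(:get))
    @uri = URI.parse(url)
    @fetcher = fetcher
  end

  # Fetches the body once and caches it.
  def body
    @body ||= @fetcher.call(@uri)
  end

  # Pulls the <title> text out of the body, or nil if there is none.
  def title
    body[/<title>(.*?)<\/title>/mi, 1]
  end
end

# Stubbed fetcher so the example runs offline.
stub = ->(_uri) { '<html><head><title>Example Domain</title></head></html>' }
page = RemotePage.new('http://www.example.com/index.html', fetcher: stub)
puts page.title
```

A controller action would then only need something like `@title = RemotePage.new(url).title`, which also makes the fetch easy to stub in tests.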
Here is the code that works if you are making a REST api call behind a proxy:
require "uri"
require 'net/http'
proxy_host = '<proxy addr>'
proxy_port = '<proxy_port>'
proxy_user = '<username>'
proxy_pass = '<password>'
uri = URI.parse("https://saucelabs.com:80/rest/v1/users/<username>")
proxy = Net::HTTP::Proxy(proxy_host, proxy_port, proxy_user, proxy_pass)
req = Net::HTTP::Get.new(uri.path)
req.basic_auth(<sauce_username>,<sauce_password>)
result = proxy.start(uri.host,uri.port) do |http|
  http.request(req)
end
puts result.body
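Note that the snippet above involves two separate sets of credentials: one for the proxy and one for the site. As a small sketch of the difference, basic_auth sets the Authorization header for the target server, while proxy_basic_auth sets Proxy-Authorization. The usernames and passwords here are placeholders, and nothing is sent.

```ruby
require 'net/http'

req = Net::HTTP::Get.new('/rest/v1/users/someone') # placeholder path

# Credentials for the target site go in the Authorization header.
req.basic_auth('site_user', 'site_pass')

# Credentials for a proxy go in a separate Proxy-Authorization header.
# (Passing them to Net::HTTP::Proxy, as above, normally does this for you.)
req.proxy_basic_auth('proxy_user', 'proxy_pass')

puts req['Authorization']
puts req['Proxy-Authorization']
```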
My favorite two ways to grab the contents of URLs are either OpenURI or Typhoeus.
OpenURI because it's everywhere, and Typhoeus because it's very flexible and powerful.