Ruby 2.0 Open Uri always uses proxy, even with proxy = nil flag - ruby

The following call is made in our code base:
require 'open-uri'
open(url, :proxy => nil)
However, when the call is made, Open uri uses the http_proxy environment variable to make the call, which is effectively blocked by our firewall. According to the docs when passing nil or false to the proxy option, the environment variables should be ignored as well, but this doesn't seem to be the case.
If we either switch to Ruby version 1.8.7 or unset the environment variable, the call is made successfully without the proxy. The problem is that we need to use a proxy for calls to the internet, but we cannot use them for internal calls (which is the case here) due to our firewall configuration.
Any help would be greatly appreciated!

I can reproduce what you described
http_proxy=http://127.0.0.1 ruby -e "require 'open-uri'; open('http://google.com', proxy: nil)"
~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:878:in `initialize': Connection refused - connect(2) (Errno::ECONNREFUSED)
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:878:in `open'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:878:in `block in connect'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/timeout.rb:52:in `timeout'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:877:in `connect'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:862:in `do_start'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:851:in `start'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/open-uri.rb:313:in `open_http'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/open-uri.rb:708:in `buffer_open'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/open-uri.rb:210:in `block in open_loop'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/open-uri.rb:208:in `catch'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/open-uri.rb:208:in `open_loop'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/open-uri.rb:149:in `open_uri'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/open-uri.rb:688:in `open'
from ~/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/open-uri.rb:34:in `open'
from -e:1:in `<main>'
I've digged the code, and I think I've found the problem.
So far open-uri manage correctly the proxy. For instance, in OpenURI.open_loop, it correctly considers that the proxy is set to nil (see open-uri.rb#195
case opt_proxy
when true
find_proxy = lambda {|u| pxy = u.find_proxy; pxy ? [pxy, nil, nil] : nil}
when nil, false
find_proxy = lambda {|u| nil} # <-- THIS LINE
But when it creates a Net::HTTP instance into OpenURI.open_http
http = klass.new(target_host, target_port)
it doesn't pass p_addr = nil. According the source code and doc for HTTP.new
def HTTP.new(address, port = nil, p_addr = :ENV, p_port = nil, p_user = nil, p_pass = nil)
http = super address, port
if proxy_class? then # from Net::HTTP::Proxy()
http.proxy_from_env = #proxy_from_env
http.proxy_address = #proxy_address
http.proxy_port = #proxy_port
http.proxy_user = #proxy_user
http.proxy_pass = #proxy_pass
elsif p_addr == :ENV then # <---- THIS LINE
http.proxy_from_env = true
else
http.proxy_address = p_addr
http.proxy_port = p_port || default_port
http.proxy_user = p_user
http.proxy_pass = p_pass
end
http
end
Above the doc says
To disable proxy detection set +p_addr+ to nil.
I've patched open-uri.rb by replacing the line 291 by
http = proxy ? klass.new(target_host, target_port) : klass.new(target_host, target_port, nil)
Now, it behaves as expected.
just filled an issue in Ruby bug tracker about that.

Related

Net::HTTP and Nokogiri - undefined method `body' for nil:NilClass (NoMethodError)

Thanks for your time. Somewhat new to OOP and Ruby and after synthesizing solutions from a few different stack overflow answers I've got myself turned around.
My goal is to write a script that parses a CSV of URLs using Nokogiri library. After trying and failing to use open-uri and the open-uri-redirections plugin to follow redirects, I settled on Net::HTTP and that got me moving...until I ran into URLs that have a 302 redirect specifically.
Here's the method I'm using to engage the URL:
require 'Nokogiri'
require 'Net/http'
require 'csv'
def fetch(uri_str, limit = 10)
# You should choose better exception.
raise ArgumentError, 'HTTP redirect too deep' if limit == 0
url = URI.parse(uri_str)
#puts "The value of uri_str is: #{ uri_str}"
#puts "The value of URI.parse(uri_str) is #{ url }"
req = Net::HTTP::Get.new(url.path, { 'User-Agent' => 'Mozilla/5.0 (etc...)' })
# puts "THE URL IS #{url.scheme + ":" + url.host + url.path}" # just a reporter so I can see if it's mangled
response = Net::HTTP.start(url.host, url.port, :use_ssl => url.scheme == 'https') { |http| http.request(req) }
case response
when Net::HTTPSuccess then response
when Net::HTTPRedirection then fetch(response['location'], limit - 1)
else
#puts "Problem clause!"
response.error!
end
end
Further down in my script I take an ARGV with the URL csv filename, do CSV.read, encode the URL to a string, then use Nokogiri::HTML.parse to turn it all into something I can use xpath selectors to examine and then write to an output CSV.
Works beautifully...so long as I encounter a 200 response, which unfortunately is not every website. When I run into a 302 I'm getting this:
C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:1570:in `addr_port': undefined method `+' for nil:NilClass (NoMethodError)
from C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:1503:in `begin_transport'
from C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:1442:in `transport_request'
from C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:1416:in `request'
from httpcsv.rb:14:in `block in fetch'
from C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:877:in `start'
from C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:608:in `start'
from httpcsv.rb:14:in `fetch'
from httpcsv.rb:17:in `fetch'
from httpcsv.rb:42:in `block in <main>'
from C:/Ruby24-x64/lib/ruby/2.4.0/csv.rb:866:in `each'
from C:/Ruby24-x64/lib/ruby/2.4.0/csv.rb:866:in `each'
from httpcsv.rb:38:in `<main>'
I know I'm missing something right in front of me but I can't tell what I should puts to see if it is nil. Any help is appreciated, thanks in advance.

Ruby NoMethodError // Telegram Bot

I'm using this Ruby code for running Telegram messenger bot. After code is runned with /start command, it crashes with the following error:
./bot.rb:26:in `block (3 levels) in <main>': undefined method `[]' for nil:NilClass (NoMethodError)
from ./bot.rb:20:in `loop'
from ./bot.rb:20:in `block (2 levels) in <main>'
from /var/lib/gems/2.1.0/gems/telegram-bot-ruby-0.3.11/lib/telegram/bot/client.rb:41:in `block in fetch_updates'
from /var/lib/gems/2.1.0/gems/telegram-bot-ruby-0.3.11/lib/telegram/bot/client.rb:37:in `each'
from /var/lib/gems/2.1.0/gems/telegram-bot-ruby-0.3.11/lib/telegram/bot/client.rb:37:in `fetch_updates'
from /var/lib/gems/2.1.0/gems/telegram-bot-ruby-0.3.11/lib/telegram/bot/client.rb:29:in `listen'
from ./bot.rb:17:in `block in <main>'
from /var/lib/gems/2.1.0/gems/telegram-bot-ruby-0.3.11/lib/telegram/bot/client.rb:22:in `run'
from /var/lib/gems/2.1.0/gems/telegram-bot-ruby-0.3.11/lib/telegram/bot/client.rb:10:in `run'
from ./bot.rb:16:in `<main>'
Bot code is here:
require 'telegram/bot'
token = '...'
require 'net/http'
url = URI.parse('http://api.wotblitz.ru/wotb/clans/info/?application_id=...&clan_id=8')
req = Net::HTTP::Get.new(url.to_s)
res = Net::HTTP.start(url.host, url.port) {|http|
http.request(req)
}
require 'json'
str = JSON.parse(res.body)
ids = str["data"]["8"]["members_ids"]
n = 0
m = ids.size - 1
players = ''
Telegram::Bot::Client.run(token) do |bot|
bot.listen do |message|
case message.text
when '/start'
loop do
url = URI.parse('http://api.wotblitz.ru/wotb/account/info/?application_id=...&account_id=' + ids[n].to_s)
req = Net::HTTP::Get.new(url.to_s)
res = Net::HTTP.start(url.host, url.port) {|http| http.request(req)
}
str1 = JSON.parse(res.body)
nick = str1["data"][ids[n].to_s]["nickname"]
players = players + nick + "\n"
n += 1
break if n == m
end
bot.api.sendMessage(chat_id: message.chat.id, text:"#{players}")
end
end
end
If anyone is facing this same issue, I found a hacky fix for it. When running this code from my macOS terminal, I get the undefined method 'run' message. When I run it inside my vscode terminal on the other hand, it works correctly. I had this same issue on a Linux machine aswell. It probably has something to do with how my environment is setup.
Hope this helps someone out.

Ruby and Https: A socket operation was attempted to an unreachable network

I'm trying to download all of my class notes from coursera. I figured that since I'm learning ruby this would be a good practice exercise, downloading all the PDFs they have for future use. Unfortunately though, I'm getting an exception saying ruby can't connect for some reason. Here is my code:
require 'net/http'
module Coursera
class Downloader
attr_accessor :page_url
attr_accessor :destination_directory
attr_accessor :cookie
def initialize(page_url,dest,cookie)
#page_url=page_url
#destination_directory = dest
#cookie=cookie
end
def download
puts #page_url
request = Net::HTTP::Get.new(#page_url)
puts #cookie.encoding
request['Cookie']=#cookie
# the line below is where the exception is thrown
res = Net::HTTP.start(#page_url.hostname, use_ssl=true,#page_url.port) {|http|
http.request(request)
}
html_page = res.body
pattern = /http[^\"]+\.pdf/
i=0
while (match = pattern.match(html_page,i)) != nil do
# 0 is the entire string.
url_string = match[0]
# make sure that 'i' is updated
i = match.begin(0)+1
# we want just the name of the file.
j = url_string.rindex("/")
filename = url_string[j+1..url_string.length]
destination = #destination_directory+"\\"+filename
# I want to download that resource to that file.
uri = URI(url_string)
res = Net::HTTP.get_response(uri)
# write that body to the file
f=File.new(destination,mode="w")
f.print(res.body)
end
end
end
end
page_url_string = 'https://class.coursera.org/datasci-002/lecture'
puts page_url_string.encoding
dest='C:\\Users\\michael\\training material\\data_science'
page_url=URI(page_url_string)
# I copied this from my browsers developer tools, I'm omitting it since
# it's long and has my session key in it
cookie="..."
downloader = Coursera::Downloader.new(page_url,dest,cookie)
downloader.download
At runtime the following is written to console:
Fast Debugger (ruby-debug-ide 0.4.22, debase 0.0.9) listens on 127.0.0.1:65485
UTF-8
https://class.coursera.org/datasci-002/lecture
UTF-8
Uncaught exception: A socket operation was attempted to an unreachable network. - connect(2)
C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `initialize'
C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `open'
C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `block in connect'
C:/Ruby200-x64/lib/ruby/2.0.0/timeout.rb:52:in `timeout'
C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:877:in `connect'
C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:862:in `do_start'
C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:851:in `start'
C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:582:in `start'
C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:20:in `download'
C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:52:in `<top (required)>'
C:/Ruby200-x64/bin/rdebug-ide:23:in `load'
C:/Ruby200-x64/bin/rdebug-ide:23:in `<main>'
C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `initialize': A socket operation was attempted to an unreachable network. - connect(2) (Errno::ENETUNREACH)
from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `open'
from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `block in connect'
from C:/Ruby200-x64/lib/ruby/2.0.0/timeout.rb:52:in `timeout'
from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:877:in `connect'
from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:862:in `do_start'
from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:851:in `start'
from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:582:in `start'
from C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:20:in `download'
from C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:52:in `<top (required)>'
from C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/ruby-debug-ide-0.4.22/lib/ruby-debug-ide.rb:86:in `debug_load'
from C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/ruby-debug-ide-0.4.22/lib/ruby-debug-ide.rb:86:in `debug_program'
from C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/ruby-debug-ide-0.4.22/bin/rdebug-ide:110:in `<top (required)>'
from C:/Ruby200-x64/bin/rdebug-ide:23:in `load'
from C:/Ruby200-x64/bin/rdebug-ide:23:in `<main>'
I was following instructions here to write all the HTTP code. As far as I can see I'm following them ver-batim.
I'm using Windows 7, ruby 2.0.0p481, and Aptana Studio 3. When I copy the url into my browser it goes straight to the page without a problem. When I look at the request headers in my browser for that url, I don't see anything else I think I'm missing. I also tried setting the Host and Referer request headers, it made no difference.
I am out of ideas, and have already searched Stack Overflow for similar questions but that didn't help. Please let me know what I'm missing.
So, I had this same error message with a different project and the problem was that my machine literally couldn't connect to the IP / Port. Have you tried connecting with curl? If it works in your browser, it could be using a proxy or something to actually get there. Testing the URL with curl solved the problem for me.

Need to implement Watirgrid,

I want to implement watirgrid, but I'm not able to do that, every time I'm getting errors related with controller and provider starting process, Also all the example over internet, none of them are working.
Could any one please help me to implement this, a full running example with steps will be a great help.
I'm trying this code:
require 'rubygems'
require 'watirgrid'
require 'watir'
require 'watir-webdriver'
# setup a controller on port 12351 for your new grid
controller = Controller.new(:ring_server_port => 12351, :loglevel => Logger::ERROR)
controller.start
# add a provider to your grid
# :browser_type => 'webdriver' if using webdriver or
# :browser_type => 'ie' if using watir...
provider = Provider.new(:controller_uri => 'druby://127.0.0.1:11235',
:ring_server_port => 12351,
:loglevel => Logger::ERROR,
:browser_type => 'webdriver')
provider.start
# connect to the grid and take all providers from it (this time only one)
grid = Watir::Grid.new(:ring_server_port => 12351, :ring_server_host => '127.0.0.1')
grid.start(:take_all => true)
# for each provider on the grid, launch a new thread to start multiple browsers
threads = []
grid.browsers.each do |browser|
threads << Thread.new do
p browser[:hostname]
p browser[:architecture]
p browser[:browser_type]
# in this case we are starting a new IE browser
b = browser[:object].new_browser(:ie)
b.goto("http://www.google.com")
b.text_field(:name, 'q').set("watirgrid")
b.button(:name, "btnI").click
end
end
threads.each {|thread| thread.join}
And Errors I'm getting is
DRb::DRbConnError: druby://127.0.0.1:11235 - #<Errno::ECONNREFUSED: No connection could
be made because the target machine actively refused it. - connect(2)>
from C:/Ruby192/lib/ruby/1.9.1/drb/drb.rb:736:in `rescue in block in open'
from C:/Ruby192/lib/ruby/1.9.1/drb/drb.rb:730:in `block in open'
from C:/Ruby192/lib/ruby/1.9.1/drb/drb.rb:729:in `each'
from C:/Ruby192/lib/ruby/1.9.1/drb/drb.rb:729:in `open'
from C:/Ruby192/lib/ruby/1.9.1/drb/drb.rb:1191:in `initialize'
from C:/Ruby192/lib/ruby/1.9.1/drb/drb.rb:1171:in `new'
from C:/Ruby192/lib/ruby/1.9.1/drb/drb.rb:1171:in `open'
from C:/Ruby192/lib/ruby/1.9.1/drb/drb.rb:1087:in `block in method_missing'
from C:/Ruby192/lib/ruby/1.9.1/drb/drb.rb:1105:in `with_friend'
from C:/Ruby192/lib/ruby/1.9.1/drb/drb.rb:1086:in `method_missing'
from C:/Ruby192/lib/ruby/gems/1.9.1/gems/watirgrid-1.1.5/lib/provider.rb:141:in `start'
from (irb):44
from C:/Ruby192/bin/irb:12:in `<main>'
irb(main):045:0>
I got the same problem, I got the provider started successfully by changing the
controller_uri
to
controller_uri => 'druby://machineIPAddress:11235'

Profiling Sinatra with rack-perftools_profiler (using net/http)

One of my Sinatra actions performs another request using net/http and computes something based on the response:
api_response = Net::HTTP.get_response(uri)
It works like a charm, but when I try to profile this:
configure do
use ::Rack::PerftoolsProfiler, default_printer: 'pdf', frequency: 4000, mode: :walltime
end
(and put ?profile=true into the params of the request I get this:
Errno::EADDRINUSE - Address already in use - connect(2):
/Users/tomek/.rvm/rubies/ruby-1.9.3-p286/lib/ruby/1.9.1/net/http.rb:762:in `initialize'
/Users/tomek/.rvm/rubies/ruby-1.9.3-p286/lib/ruby/1.9.1/net/http.rb:762:in `open'
/Users/tomek/.rvm/rubies/ruby-1.9.3-p286/lib/ruby/1.9.1/net/http.rb:762:in `block in connect'
/Users/tomek/.rvm/rubies/ruby-1.9.3-p286/lib/ruby/1.9.1/timeout.rb:54:in `timeout'
/Users/tomek/.rvm/rubies/ruby-1.9.3-p286/lib/ruby/1.9.1/timeout.rb:99:in `timeout'
/Users/tomek/.rvm/rubies/ruby-1.9.3-p286/lib/ruby/1.9.1/net/http.rb:762:in `connect'
/Users/tomek/.rvm/rubies/ruby-1.9.3-p286/lib/ruby/1.9.1/net/http.rb:755:in `do_start'
/Users/tomek/.rvm/rubies/ruby-1.9.3-p286/lib/ruby/1.9.1/net/http.rb:744:in `start'
/Users/tomek/.rvm/rubies/ruby-1.9.3-p286/lib/ruby/1.9.1/net/http.rb:454:in `get_response'
Any idea what's happening?
I try to setup these the same way as you.
require 'sinatra'
require 'rack/perftools_profiler'
require 'net/http'
configure do
use ::Rack::PerftoolsProfiler, default_printer: 'pdf', frequency: 4000, mode: :walltime
end
get "/" do
uri = URI('http://google.com')
api_response = Net::HTTP.get_response(uri)
end
And it run with out a problem on ruby 1.9.3-p286 and with rack-perftools_profiler (0.6.1).
Two recommendations:
update your ruby version
update your gems
If these don't help check which program has bound your port with sudo netstat -nlp
If these all don't help please tell us more about your setup.

Resources