open_http: 403 Forbidden (OpenURI::HTTPError) - ruby

I am trying to pull data from my Google+ API, using this script:
require 'open-uri'
require 'json'
google_api_key = 'put your google api key here'
page_id = '105672627985088123672'
data = open("https://www.googleapis.com/plus/v1/people/#{page_id}?key=#{google_api_key}").read
obj = JSON.parse(data)
puts obj['plusOneCount'].to_i
However, I keep getting this error:
/Users/xng/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/open-uri.rb:346:in `open_http': 403 Forbidden (OpenURI::HTTPError)
from /Users/xng/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/open-uri.rb:769:in `buffer_open'
from /Users/xng/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/open-uri.rb:203:in `block in open_loop'
from /Users/xng/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/open-uri.rb:201:in `catch'
from /Users/xng/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/open-uri.rb:201:in `open_loop'
from /Users/xng/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/open-uri.rb:146:in `open_uri'
from /Users/xng/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/open-uri.rb:671:in `open'
from /Users/xng/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/open-uri.rb:33:in `open'
from gplus.rb:8:in `<main>'
I am not sure what is wrong here, any help would be great.

The problem looks like your google API key doesn't match the one that google have in their servers. So you need to make sure that you are using the right key. is it a private or free service ?

Have to regenerate the API key.

Related

How do I use ruby to read the source a uri that im hosting on my own computer?

I trying to get ruby to read the source of a url thats being hosted on my own computer. I've tried using open-uri gem with:
source = open('http://127.0.0.1:8000/wikipedia_en_all_nopic_01_2012/A/Mick%20Jagger.html', &:read)
With normal external urls this works fine but it raises multiple errors when i try to access the url im hosting on my computer. Does anyone have any idea how to this? Heres the command line error report:
/Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:357:in `finish': incorrect header check (Zlib::DataError)
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:357:in `finish'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:262:in `ensure in inflater'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:262:in `inflater'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:274:in `read_body_0'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:201:in `read_body'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:328:in `block (2 levels) in open_http'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:1415:in `block (2 levels) in transport_request'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:162:in `reading_body'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:1414:in `block in transport_request'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:1405:in `catch'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:1405:in `transport_request'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:1378:in `request'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:319:in `block in open_http'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:853:in `start'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:313:in `open_http'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:723:in `buffer_open'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:210:in `block in open_loop'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:208:in `catch'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:208:in `open_loop'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:149:in `open_uri'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:703:in `open'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:34:in `open'
from testurl.rb:6:in `<main>'
UPDATE: I'm using kiwix server to host the URL
Try using net/http instead.
require 'net/http'
source = Net::HTTP.get URI.parse('http://127.0.0.1:8000/wikipedia_en_all_nopic_01_2012/A/Mick%20Jagger.html')
Why do you want you use a URL if it's on the local machine? Why not just give the path?
That error sounds like there's something wrong with the actual file you're trying to parse, or the way it's being served by the way. From reading about Kiwix server it sounds like the latter...On the Kiwix site it says that it uses some type compression method called openzim which is most likely why open-uri can't find a way to parse it.
You could try nokogiri and see if it has a problem parsing it. But since it seems like you're trying to open/manipulate a zim file in ruby, I'd look for a zim library for ruby instead of trying to serve it.
Here:
https://github.com/chrisistuff/zim-ruby
I've never dealt with kiwix/zim, so I don't know if this one works but it was the only one the a google search for 'zim ruby' came back with.
I had the same problem with kiwix. I extracted all the URLs in a file called hrefs.txt (in my case it was the german project gutenberg) and used wget to download each of them:
f = File.open("hrefs.txt", "r")
f.each_line do |url|
#filename = url.split("/").last.gsub!(/[^A-Za-z]/, '')[0..-4]
system "wget #{url}"
end
f.close

Ruby / Sinatra - Route any HTTP request to HTTPS (using Rack::SSL) - !! Invalid request

I'm playing with Ruby / Sinatra at present and attempting to get HTTPS working.
I've taken a look at the rack:ssl gem here: https://github.com/josh/rack-ssl
It seems to be working when I run the application (as in redirecting to HTTPS), but nothing is displayed in the browser and the log comes up with the following error:
!! Invalid request
#!/usr/bin/env ruby
require 'rack/ssl'
require 'sinatra'
use Rack::SSL
get '/' do
'Hello World'
end
I'm not sure what to do from here.
Update:
Turned Thin Logging on in the sinatra app and got the following in the log:
!! Invalid request
Invalid HTTP format, parsing fails.
/Users/ashleycox/.rvm/gems/ruby-2.0.0-p247/gems/thin-1.5.1/lib/thin/request.rb:82:in `execute'
/Users/ashleycox/.rvm/gems/ruby-2.0.0-p247/gems/thin-1.5.1/lib/thin/request.rb:82:in `parse'
/Users/ashleycox/.rvm/gems/ruby-2.0.0-p247/gems/thin-1.5.1/lib/thin/connection.rb:39:in `receive_data'
/Users/ashleycox/.rvm/gems/ruby-2.0.0-p247/gems/eventmachine-1.0.3/lib/eventmachine.rb:187:in `run_machine'
/Users/ashleycox/.rvm/gems/ruby-2.0.0-p247/gems/eventmachine-1.0.3/lib/eventmachine.rb:187:in `run'
/Users/ashleycox/.rvm/gems/ruby-2.0.0-p247/gems/thin-1.5.1/lib/thin/backends/base.rb:63:in `start'
/Users/ashleycox/.rvm/gems/ruby-2.0.0-p247/gems/thin-1.5.1/lib/thin/server.rb:159:in `start'
/Users/ashleycox/.rvm/gems/ruby-2.0.0-p247/gems/rack-1.5.2/lib/rack/handler/thin.rb:16:in `run'
/Users/ashleycox/.rvm/gems/ruby-2.0.0-p247/gems/sinatra-1.4.3/lib/sinatra/base.rb:1408:in `run!'
/Users/ashleycox/.rvm/gems/ruby-2.0.0-p247/gems/sinatra-1.4.3/lib/sinatra/main.rb:25:in `block in <module:Sinatra>'
Any help would be greatly appreciated, thank you.

RestClient failing to GET resource using SSL client certificate

I'm trying to use RestClient to retrieve a page that's secured using an SSL client certificate. My code is as follows:
require 'restclient'
p12 = OpenSSL::PKCS12.new(File.read('client.p12'), 'password')
client = RestClient::Resource.new('https://example.com/',
:ssl_client_key => p12.key,
:verify_ssl => OpenSSL::SSL::VERIFY_NONE)
client.get
When I run it, I see the following failure:
1.9.3-p374 :007 > client.get
RestClient::BadRequest: 400 Bad Request
from /home/duncan/.rvm/gems/ruby-1.9.3-p374/gems/rest-client-1.6.7/lib/restclient/abstract_response.rb:48:in `return!'
from /home/duncan/.rvm/gems/ruby-1.9.3-p374/gems/rest-client-1.6.7/lib/restclient/request.rb:230:in `process_result'
from /home/duncan/.rvm/gems/ruby-1.9.3-p374/gems/rest-client-1.6.7/lib/restclient/request.rb:178:in `block in transmit'
from /home/duncan/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/net/http.rb:745:in `start'
from /home/duncan/.rvm/gems/ruby-1.9.3-p374/gems/rest-client-1.6.7/lib/restclient/request.rb:172:in `transmit'
from /home/duncan/.rvm/gems/ruby-1.9.3-p374/gems/rest-client-1.6.7/lib/restclient/request.rb:64:in `execute'
from /home/duncan/.rvm/gems/ruby-1.9.3-p374/gems/rest-client-1.6.7/lib/restclient/request.rb:33:in `execute'
from /home/duncan/.rvm/gems/ruby-1.9.3-p374/gems/rest-client-1.6.7/lib/restclient/resource.rb:51:in `get'
from (irb):7
from /home/duncan/.rvm/rubies/ruby-1.9.3-p374/bin/irb:13:in `<main>'
I'm fairly sure this is a failure to authenticate, as I get the same error in a browser if I don't install the client certificate.
I'm using OpenSSL::SSL::VERIFY_NONE because the server has a self-signed certificate, and I believe this is the correct value to pass to ignore that.
Any suggestions on how to get this working would be greatly appreciated - even a pointer to some detailed documentation, or a suggestion of a different Gem could work. I've not had much luck with either the Gem docs or Google :(
Your HTTPS request is going to need the client certificate as well as the key. Try:
client = RestClient::Resource.new('https://example.com/',
:ssl_client_cert => p12.certificate,
:ssl_client_key => p12.key,
:verify_ssl => OpenSSL::SSL::VERIFY_NONE)
If that doesn't work you can try capturing the handshake packets (e.g. with WireShark) to verify that the API is offering the certificate.

OpenUri causing 401 Unauthorized error with HTTPS URL

I am adding functionality that scrapes an XML page from a source that requires the use of an HTTPS connection with authentication. I am trying to use Ryan Bates' Railscast #190 solution but I'm running into a 401 Authentication error.
Here is my test Ruby script:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
url = "https://biblesearch.americanbible.org/passages.xml?q[]=john+3:1-5&version=KJV"
doc = Nokogiri::XML(open(url, :http_basic_authentication => ['username' ,'password']))
puts doc.xpath("//text_preview")
Here is the output of the console after I run my script:
/usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:799:in `connect': SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (OpenSSL::SSL::SSLError)
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:799:in `block in connect'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/timeout.rb:54:in `timeout'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/timeout.rb:99:in `timeout'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:799:in `connect'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:755:in `do_start'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:744:in `start'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:306:in `open_http'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:775:in `buffer_open'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:203:in `block in open_loop'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:201:in `catch'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:201:in `open_loop'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:146:in `open_uri'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:677:in `open'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:33:in `open'
from scrape.rb:6:in `<main>'
In my research, I saw one post in which it was suggested that in 1.9.3 the following option could be used:
doc = Nokogiri::XML(open(url, :http_basic_authentication => ['username' ,'password'], :ssl_verify_mode => OpenSSL::SSL::VERIFY_NONE))
However, this did not work either. I would appreciate some insight into addressing this challenge.
The given URL will be redirected to /v1/KJV/passages.xml?q[]=john+3%3A1-5 with HTTP status code 302 Found. OpenURI understands the redirection, but automatically deletes authentication header (maybe) for security reason. (*)
If you access "http://biblesearch.americanbible.org/v1/KJV/passages.xml?q[]=john+3%3A1-5" directly, you will get the expected result. :-)
(*) You can find in open-uri.rb:
if redirect
### snip ###
if options.include? :http_basic_authentication
# send authentication only for the URI directly specified.
options = options.dup
options.delete :http_basic_authentication
end
You can do this and it should work too:
open(url, :http_basic_authentication => [user, pass] )
doc = Nokogiri::HTML(open(url, :http_basic_authentication => [user, pass] ))
You can then parse the doc anyway you want.
By passing the http_basic_authentication in the header again in the second request, you will make up for the deleted header in the first request.
hope this works for you.
http://http-basic-authentication-nokogiri.blogspot.com/2014/08/http-basic-authentication-using-nokogiri.html
You say you need to use HTTPS, but you're using the HTTP protocol:
url = "http://biblesearch...."
OpenURI understands both HTTP and HTTPS. If you want to connect using HTTPS, change the protocol in the URL to HTTPS, then make the connection:
url = "https://biblesearch...."

OpenURI - 500 Internal Server Error on XML feed

I'm trying to read Stanford ecorner XML:
open("http://ecorner.stanford.edu/RecentlyAdded.xml")
but am running into the following error message:
OpenURI::HTTPError: 500 Internal Server Error
from /usr/local/lib/ruby/1.8/open-uri.rb:277:in `open_http'
from /usr/local/lib/ruby/1.8/open-uri.rb:616:in `buffer_open'
from /usr/local/lib/ruby/1.8/open-uri.rb:164:in `open_loop'
from /usr/local/lib/ruby/1.8/open-uri.rb:162:in `catch'
from /usr/local/lib/ruby/1.8/open-uri.rb:162:in `open_loop'
from /usr/local/lib/ruby/1.8/open-uri.rb:132:in `open_uri'
from /usr/local/lib/ruby/1.8/open-uri.rb:518:in `open'
from /usr/local/lib/ruby/1.8/open-uri.rb:30:in `open'
from (irb):65
from :0
I believe, but I could be wrong, it's because I would need to be logged in to use the feed.
Any workaround I could use?
In case of not being logged in you should get an HTTP response code of 401 Unauthorized and not 500. I tried to open the site in the browser, which works. Turns out their web server doesn't like missing user agents, so if you add that open-uri works:
>> require 'open-uri'
#=> true
>> open("http://ecorner.stanford.edu/RecentlyAdded.xml", 'User-Agent' => 'ruby')
#=> #<File:/var/folders/H9/H9qnar1yGZqBrWFGuTE0RU+++TI/-Tmp-/open-uri20110505-25566-zsc3pd-0>
This is working for me:
require 'open-uri'
require 'nokogiri'
doc = Nokogiri::XML(open('http://ecorner.stanford.edu/RecentlyAdded.xml'))
puts doc.search('title').map{ |n| n.text }
>> Recently Added STVP Entrepreneurship Corner Materials
>> STVP Entrepreneurship Corner
>> Podcast: Developing Products that Save Lives - Richard Scheller (Genentech)
>> Podcast: How to Build Instant Connections - Ori Brafman (Author)
>> Podcast: A New Vision for Capital Markets - Barry Silbert (SecondMarket)
>> Podcast: Effective Models for Sustainable Growth - Jennifer Morris (Conservation International)
Note that you got a 500-range error. That means their server is acting up, but is functional enough to admit the problem. If you got a 400-range error they'd be refusing you access to the content for some reason, so I doubt the problem is authentication or anything on your side.

Resources