Convert JSON key values to CSV - ruby

My goal is to read a CSV file, take the ID from each record, insert each ID into the Meetup API URL, and then create a new CSV file with selected values from the JSON response.
Here's what I have so far:
require "net/https"
require "uri"
require 'csv'
require 'json'
membersCSV = CSV.foreach('id-members-meetup.csv') do |row|
  id = row[1]
  uri = URI.parse("https://api.meetup.com/2/members?order=name&member_id=" + id + "&format=json&key=MY_KEY")
  http = Net::HTTP.new(uri.host, uri.port)
  request = Net::HTTP::Get.new(uri.request_uri)
  response = http.request(request)
  CSV.open("ghmeetup.csv", "w", {:col_sep => ";"}) do |csv|
    JSON.parse(response.body)["other_services"].each do |single|
      csv << [single["twitter"]["identifier"], single["facebook"]["identifier"], single["linkedin"]["identifier"]]
    end
  end
end
And this is the error I get:
/Library/Ruby/Gems/2.0.0/gems/json-1.8.2/lib/json/common.rb:155:in `parse': 757: (JSON::ParserError) '<html>
<head><title>400 The plain HTTP request was sent to HTTPS port</title></head>
<body bgcolor="white">
<center><h1>400 Bad Request</h1></center>
<center>The plain HTTP request was sent to HTTPS port</center>
<hr><center>cloudflare-nginx</center>
</body>
</html>
'
from /Library/Ruby/Gems/2.0.0/gems/json-1.8.2/lib/json/common.rb:155:in `parse'
from ghmeetup.rb:13:in `block (2 levels) in <main>'
from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/csv.rb:1266:in `open'
from ghmeetup.rb:12:in `block in <main>'
from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/csv.rb:1716:in `each'
from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/csv.rb:1120:in `block in foreach'
from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/csv.rb:1266:in `open'
from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/csv.rb:1119:in `foreach'
from ghmeetup.rb:6:in `<main>'
What do you think?
EDIT
require "uri"
require 'csv'
require 'json'
require 'net/http'
ghCSV = CSV.foreach('id-gh-meetup.csv') do |row|
  id = row[1]
  key="KEY"
  uri = URI.parse("https://api.meetup.com/2/members?order=name&member_id=#{id}&format=json&key=#{key}")
  Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
    request = Net::HTTP::Get.new uri
    response = http.request request
    parseResponse = JSON.parse(response.body)['results'][0]
    p "working"
    CSV.open("ghmeetup.csv", "w") do |csv|
      p "working 2"
      parseResponse.each do |single|
        p "working 3"
        csv << single
      end
    end
  end
end
So it works if I keep only JSON.parse(response.body) but when I add ['results'][0] in parseResponse I get this error:
ghmeetup.rb:15:in `block (2 levels) in <main>': undefined method `[]' for nil:NilClass (NoMethodError)
This is the JSON structure, I want to target [results][0].other_services.twitter.identifier
{
  results: [
    {
      other_services: {
        twitter: {
          identifier: "@HugoAmsellem"
Any idea?

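HTTPS is enabled on a Net::HTTP connection with #use_ssl= (or with use_ssl: true in Net::HTTP.start). Without it, a plain HTTP request is sent to the HTTPS port, which is exactly what the 400 error in your output reports.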
This code gets a successful response on my system using Ruby 2.2.0:
require 'net/http' # Not HTTPS
key="..." # Get your personal API key from Meetup
uri = URI.parse("https://api.meetup.com/2/members?order=name&member_id=1&format=json&key=#{key}")
Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
  request = Net::HTTP::Get.new uri
  response = http.request request
  p response.body
end
In previous versions of Ruby you would need to require 'net/https' to use HTTPS. This is no longer true.
Can you try the code above on your system?
If it works, great. If it doesn't work, then you can simplify your question code, such as omitting the CSV, the loop, the JSON, etc.
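Once the HTTPS request succeeds, the second error in the question (undefined method `[]' for nil:NilClass) suggests that `results` was missing or empty for at least one response. Below is a minimal sketch of nil-safe extraction, assuming Ruby 2.3 or later for `Hash#dig` and assuming the response shape shown in the question. The helper name `service_identifiers` and the sample payload are hypothetical.

```ruby
require 'json'
require 'csv'

# Services whose identifiers the question wants in the output CSV.
SERVICES = %w[twitter facebook linkedin].freeze

# Hypothetical helper: pull the identifiers out of one parsed response.
# Returns nil when the response has no first result.
def service_identifiers(parsed)
  member = parsed.is_a?(Hash) ? Array(parsed['results']).first : nil
  return nil unless member.is_a?(Hash)

  # dig returns nil instead of raising when an intermediate key is absent.
  SERVICES.map { |name| member.dig('other_services', name, 'identifier') }
end

# Made-up payload shaped like the structure in the question.
body = '{"results":[{"other_services":{"twitter":{"identifier":"@example"}}}]}'
row = service_identifiers(JSON.parse(body))

# Build the CSV once, outside the per-ID loop.
csv_text = CSV.generate(col_sep: ';') do |csv|
  csv << row if row
end
puts csv_text
```

Note that the question's code opens ghmeetup.csv in "w" mode inside the loop, so each iteration truncates the file. Opening the output once and appending one row per ID avoids losing earlier rows.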

Related

400 Bad Request for Ruby RSS gem

I can't seem to get this RSS feed to work properly. I've tried Nokogiri and now RSS::Parser, and neither works:
a = 'https://phys.org/rss-feed/biology-news/biology-other/'
URI.open(a) do |rss|
  feed = RSS::Parser.parse(rss)
  puts "Title: #{feed.channel.title}"
  feed.items.each do |item|
    puts "Item: #{item.title}"
  end
end
The code is taken directly out of the docs: https://github.com/ruby/rss
The feed is valid, so I'm confused as to why there's a 400 error code.
What am I doing wrong? Anybody have insight as to how to get this RSS parsed?
Here is the error:
/Users/user3/.rbenv/versions/3.1.2/lib/ruby/3.1.0/open-uri.rb:364:in `open_http': 400 Bad request (OpenURI::HTTPError)
from /Users/user3/.rbenv/versions/3.1.2/lib/ruby/3.1.0/open-uri.rb:741:in `buffer_open'
from /Users/user3/.rbenv/versions/3.1.2/lib/ruby/3.1.0/open-uri.rb:212:in `block in open_loop'
from /Users/user3/.rbenv/versions/3.1.2/lib/ruby/3.1.0/open-uri.rb:210:in `catch'
from /Users/user3/.rbenv/versions/3.1.2/lib/ruby/3.1.0/open-uri.rb:210:in `open_loop'
from /Users/user3/.rbenv/versions/3.1.2/lib/ruby/3.1.0/open-uri.rb:151:in `open_uri'
from /Users/user3/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/open_uri_redirections-0.2.1/lib/open-uri/redirections_patch.rb:55:in `open_uri'
from /Users/user3/.rbenv/versions/3.1.2/lib/ruby/3.1.0/open-uri.rb:721:in `open'
from /Users/user3/.rbenv/versions/3.1.2/lib/ruby/3.1.0/open-uri.rb:29:in `open'
from /users/user3/app.rb:1856:in `<main>'
The web server requires the request to have a User-Agent set in the headers. Without such a User-Agent header it returns the 400 error message.
require 'uri'
require 'open-uri'
require 'rss'
uri = URI.parse("https://phys.org/rss-feed/biology-news/biology-other/")
uri.open("User-Agent" => "Ruby/#{RUBY_VERSION}") do |rss|
  feed = RSS::Parser.parse(rss)
  puts "Title: #{feed.channel.title}"
  feed.items.each do |item|
    puts "Item: #{item.title}"
  end
end
This code works for me.
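The same header can be set when using Net::HTTP directly instead of open-uri. This is a minimal sketch, not part of the original answer; the helper name `build_request` is hypothetical, and the actual network call is left commented out.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build a GET request that carries an explicit
# User-Agent header, and return it together with the parsed URI.
def build_request(url, user_agent)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri.request_uri)
  request['User-Agent'] = user_agent
  [uri, request]
end

uri, request = build_request(
  'https://phys.org/rss-feed/biology-news/biology-other/',
  "Ruby/#{RUBY_VERSION}"
)
puts request['User-Agent']

# Sending it requires network access, so it is not executed here:
# response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
#   http.request(request)
# end
```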

Net::HTTP and Nokogiri - undefined method `body' for nil:NilClass (NoMethodError)

Thanks for your time. I'm somewhat new to OOP and Ruby, and after synthesizing solutions from a few different Stack Overflow answers I've gotten myself turned around.
My goal is to write a script that parses a CSV of URLs using Nokogiri library. After trying and failing to use open-uri and the open-uri-redirections plugin to follow redirects, I settled on Net::HTTP and that got me moving...until I ran into URLs that have a 302 redirect specifically.
Here's the method I'm using to engage the URL:
require 'Nokogiri'
require 'Net/http'
require 'csv'
def fetch(uri_str, limit = 10)
  # You should choose better exception.
  raise ArgumentError, 'HTTP redirect too deep' if limit == 0
  url = URI.parse(uri_str)
  #puts "The value of uri_str is: #{ uri_str}"
  #puts "The value of URI.parse(uri_str) is #{ url }"
  req = Net::HTTP::Get.new(url.path, { 'User-Agent' => 'Mozilla/5.0 (etc...)' })
  # puts "THE URL IS #{url.scheme + ":" + url.host + url.path}" # just a reporter so I can see if it's mangled
  response = Net::HTTP.start(url.host, url.port, :use_ssl => url.scheme == 'https') { |http| http.request(req) }
  case response
  when Net::HTTPSuccess then response
  when Net::HTTPRedirection then fetch(response['location'], limit - 1)
  else
    #puts "Problem clause!"
    response.error!
  end
end
Further down in my script I take an ARGV with the URL csv filename, do CSV.read, encode the URL to a string, then use Nokogiri::HTML.parse to turn it all into something I can use xpath selectors to examine and then write to an output CSV.
Works beautifully...so long as I encounter a 200 response, which unfortunately is not every website. When I run into a 302 I'm getting this:
C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:1570:in `addr_port': undefined method `+' for nil:NilClass (NoMethodError)
from C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:1503:in `begin_transport'
from C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:1442:in `transport_request'
from C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:1416:in `request'
from httpcsv.rb:14:in `block in fetch'
from C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:877:in `start'
from C:/Ruby24-x64/lib/ruby/2.4.0/Net/http.rb:608:in `start'
from httpcsv.rb:14:in `fetch'
from httpcsv.rb:17:in `fetch'
from httpcsv.rb:42:in `block in <main>'
from C:/Ruby24-x64/lib/ruby/2.4.0/csv.rb:866:in `each'
from C:/Ruby24-x64/lib/ruby/2.4.0/csv.rb:866:in `each'
from httpcsv.rb:38:in `<main>'
I know I'm missing something right in front of me but I can't tell what I should puts to see if it is nil. Any help is appreciated, thanks in advance.
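The trace ends in `addr_port` with a nil host, which is consistent with a Location header that holds a relative path: `URI.parse('/login')` has no host. That is an assumption about the failing sites, not something the question confirms. Here is a minimal sketch of resolving the redirect target against the URL that was just requested; `resolve_redirect` is a hypothetical helper and the URLs are placeholders.

```ruby
require 'uri'

# Hypothetical helper: turn a Location header value into an absolute URL.
# Relative values are resolved against the URL that produced the redirect;
# absolute values are returned unchanged.
def resolve_redirect(current_url, location)
  URI.join(current_url, location).to_s
end

puts resolve_redirect('http://example.com/a/b', '/login')
puts resolve_redirect('http://example.com/a/b', 'https://other.example/x')
```

Inside `fetch`, the recursive call would then become `fetch(resolve_redirect(uri_str, response['location']), limit - 1)`. Building the request from `url.request_uri` rather than `url.path` also keeps the query string and yields "/" when the path is empty.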

Opening a non-HTTP proxy URI on https domain using OpenURI

I'm behind a proxy and I must get an HTTPS webpage to collect some information, but OpenURI returns an error: "Non-HTTP proxy URI".
This is the issue:
yadayada@ubuntu:~/Desktop/test/lib$ ruby JenkinsTest.rb
/home/yadayada/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/open-uri.rb:257:in `open_http': Non-HTTP proxy URI: https://web-proxy.yadayada:8088 (RuntimeError)
from /home/yadayada/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/open-uri.rb:736:in `buffer_open'
from /home/yadayada/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/open-uri.rb:211:in `block in open_loop'
from /home/yadayada/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/open-uri.rb:209:in `catch'
from /home/yadayada/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/open-uri.rb:209:in `open_loop'
from /home/yadayada/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/open-uri.rb:150:in `open_uri'
from /home/yadayada/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/open-uri.rb:716:in `open'
from /home/yadayada/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/open-uri.rb:34:in `open'
from JenkinsTest.rb:6:in `<main>'
yadayada@ubuntu:~/Desktop/test/lib$
This is the code I'm running:
1 require 'rubygems'
2 require 'nokogiri'
3 require 'open-uri'
4
5 # Request the Jenkins webpage
6 @jenkinsWebPage = Nokogiri::HTML(open("https://yadayad.yada.yada.com:8443"))
7
8 # Prints the received page
9 puts @jenkinsWebPage
The proxy has no login/password.
Any ideas?
Okay, so I've found out how to get the page. I had to switch from open-uri to net/https, and I set the OpenSSL verify mode to VERIFY_NONE, since the company server uses a self-signed certificate:
require 'rubygems'
require 'nokogiri'
require 'net/https'
require 'openssl'
class JenkinsTest
  # Request the Jenkins webpage
  def request_jenkins_webpage
    uri = URI.parse("https://yadayad.yada.yada.com:8443")
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = true
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE
    request = Net::HTTP::Get.new(uri.request_uri)
    response = http.request(request)
    @@page = Nokogiri::HTML(response.body)
  end

  def print_jenkins_webpage
    puts @@page
  end
end
It looks ugly; if anybody finds a better way to write this, please edit this post. As of now, it's working fine.
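For context, open-uri raises "Non-HTTP proxy URI" when the proxy URL it picks up has a scheme other than http, as with the https://web-proxy... value in the trace. With Net::HTTP the proxy can be passed as a plain host and port instead. The following is a sketch only, assuming the proxy accepts connections on that port; `build_client` is a hypothetical helper and the host names are placeholders.

```ruby
require 'net/http'
require 'openssl'

# Hypothetical helper: build an HTTPS client that goes through an
# explicitly specified proxy. No connection is opened until a request is made.
def build_client(host, port, proxy_host, proxy_port)
  http = Net::HTTP.new(host, port, proxy_host, proxy_port)
  http.use_ssl = true
  http
end

http = build_client('jenkins.example.com', 8443, 'web-proxy.example.com', 8088)
puts http.proxy_address
puts http.proxy_port
```

VERIFY_NONE disables certificate checking entirely. Where possible, pointing `http.ca_file` at the company's CA certificate keeps verification on while still accepting the self-signed chain.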

How do I send a GET with query parameters in ruby's net/http 1.9.3-p194? [duplicate]

This question already has answers here:
Parametrized get request in Ruby?
(7 answers)
Closed 5 years ago.
I'm trying to send a GET to http://www.hello.com/sup?a=b in ruby 1.9.3-p194 (can't update the version due to legacy code)
uri = URI.parse("http://www.hello.com/sup?a=b")
uri.query = "a=b"
req = Net::HTTP::Get.new(uri)
response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
case response
when Net::HTTPSuccess then response
else
puts "Error"
end
I'm actually using ruby 1.9.3-p194 but I'm getting this error:
/Users/hithere/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:1860:in `initialize': undefined method `empty?' for #<URI::HTTP:0x007f938d9051c8> (NoMethodError)
from /Users/hithere/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:2093:in `initialize'
from send_to_hg_given_place_id.rb:101:in `new'
from send_to_hg_given_place_id.rb:101:in `block in fetch'
from /Users/hithere/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/timeout.rb:68:in `timeout'
from send_to_hg_given_place_id.rb:100:in `fetch'
from send_to_hg_given_place_id.rb:141:in `block in <main>'
from send_to_hg_given_place_id.rb:133:in `each'
from send_to_hg_given_place_id.rb:133:in `<main>'
For some reason it is trying to use http.rb 1.9.1, and 1.9.1 requires the parameter in #new to be a String instead of URI. I'd like to either fix it so it uses 1.9.3, or obtain a solution that works for 1.9.1 http.rb
You can refer to the examples for Net::HTTP. You need to pass a string to Net::HTTP::Get.new.
Here is an example from the documentation (note uri.request_uri):
uri = URI('http://example.com/some_path?query=string')
Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new uri.request_uri
  response = http.request request # Net::HTTPResponse object
end
You can append GET parameters to the URL; just pay attention to URL encoding. See for example this SO question.
instead of
uri = URI.parse(http://www.hello.com/sup?a=b)
it would be
uri = URI.parse("http://www.hello.com/sup?a=b")
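As a small illustration of the encoding point, here is a sketch that builds the query string with `URI.encode_www_form` and passes a String to `Net::HTTP::Get.new`, as the older http.rb expects. The second parameter (`q`) is made up for the example, and nothing is sent over the network.

```ruby
require 'net/http'
require 'uri'

uri = URI.parse('http://www.hello.com/sup')

# encode_www_form escapes each key and value (spaces become "+").
uri.query = URI.encode_www_form('a' => 'b', 'q' => 'two words')

# request_uri is a String holding the path plus the query string.
request = Net::HTTP::Get.new(uri.request_uri)
puts request.path
```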

Making HEAD request in Ruby

I am fairly new to Ruby and come from a Python background.
I want to make a HEAD request to a URL and check some information, such as whether the file exists on the server, its timestamp, its ETag, etc. I have not been able to get this done in Ruby.
In Python:
import httplib2
print httplib2.Http().request('url.com/file.xml','HEAD')
In Ruby, I tried this, and it throws an error:
require 'net/http'
Net::HTTP.start('url.com'){|http|
  response = http.head('/file.xml')
}
puts response
SocketError: getaddrinfo: nodename nor servname provided, or not known
from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `initialize'
from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `open'
from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `block in connect'
from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/timeout.rb:51:in `timeout'
from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:876:in `connect'
from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:861:in `do_start'
from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:850:in `start'
from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:582:in `start'
from (irb):2
from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/bin/irb:16:in `<main>'
I realize this has been answered but I had to go through some hoops, too. Here's something more concrete to start with:
#!/usr/bin/env ruby
require 'net/http'
require 'net/https' # for openssl
uri = URI('http://stackoverflow.com')
path = '/questions/16325918/making-head-request-in-ruby'
response=nil
http = Net::HTTP.new(uri.host, uri.port)
# http.use_ssl = true # if using SSL
# http.verify_mode = OpenSSL::SSL::VERIFY_NONE # for example, when using self-signed certs
response = http.head(path)
response.each { |key, value| puts key.ljust(40) + " : " + value }
I don't think that passing in a string to :start is enough; in the docs it looks like it requires a URI object's host and port for a correct address:
uri = URI('http://example.com/some_path?query=string')
Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new uri
  response = http.request request # Net::HTTPResponse object
end
You can try this:
require 'net/http'
url = URI('http://yoururl.com') # the scheme is required, otherwise url.host is nil
Net::HTTP.start(url.host, url.port){|http|
  response = http.head('/file.xml')
  puts response
}
One thing I noticed - your puts response needs to be inside the block! Otherwise, the variable response is not in scope.
Edit: You can also treat the response as a hash to get the values of the headers:
response.each_value { |value| puts value }
headers = nil
url = URI('http://my-bucket.amazonaws.com/filename.mp4')
Net::HTTP.start(url.host, url.port) do |http|
  headers = http.head(url.path).to_hash
end
And now you have a hash of headers in headers
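To show what that hash looks like and how to answer the original "does the file exist" question, here is a sketch that uses a hand-built response object as a stand-in for the result of `http.head`, so it runs without network access. The header values are made up.

```ruby
require 'net/http'

# Stand-in for the object returned by http.head(path); built by hand so the
# example needs no network access. Header values are invented.
response = Net::HTTPOK.new('1.1', '200', 'OK')
response['ETag'] = '"abc123"'
response['Last-Modified'] = 'Wed, 01 May 2013 00:00:00 GMT'

# to_hash lowercases the header names and wraps each value in an Array.
headers = response.to_hash
puts headers['etag'].first

# Hash-style access on the response is case-insensitive and returns a String.
puts response['last-modified']

# A 2xx status means the resource exists; a 404 would be Net::HTTPNotFound.
puts response.is_a?(Net::HTTPSuccess)
```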

Resources