JSON to CSV File Ruby - ruby

I am trying to convert the following JSON to CSV via Ruby, but am having trouble with my code. I am learning as I go, so any help is appreciated.
require 'json'
require 'net/http'
require 'uri'
require 'csv'
uri = 'https://www.mapquestapi.com/search/v2/radius?key=Imjtd%7Clu6t200zn0,bw=o5-layg1&radius=3000&callback=processPOIs&maxMatches=4000&origin=40.7686973%2C-73.9918181&hostedData=mqap.33882_stores_prod%7Copen_status%20=%20?%20OR%20open_status%20=%20?%20OR%20open_status%20=%20?%7CExisting,Coming%20Soon,New%7C'
response = Net::HTTP.get_response(URI.parse(uri))
struct = JSON.parse(response.body.scan(/processPOIs\((.*)\);/).first.first)
CSV.open("output.csv", "w") do |csv|
JSON.parse(struct).read.each do |hash|
csv << hash.values
end
end
The error I receive is:
from c:/RailsInstaller/Ruby2.2.0/lib/ruby/gems/2.2.0/gems/json-1.8.3/lib/json/common.rb:155:in `new'
from c:/RailsInstaller/Ruby2.2.0/lib/ruby/gems/2.2.0/gems/json-1.8.3/lib/json/common.rb:155:in `parse'
from test.rb:14:in `block in <main>'
from c:/RailsInstaller/Ruby2.2.0/lib/ruby/2.2.0/csv.rb:1273:in `open'
from test.rb:13:in `<main>'
I am trying to get all the data off of the following link and put it into a CSV file that I can analyse later. https://www.mapquestapi.com/search/v2/radius?key=Imjtd%7Clu6t200zn0,bw=o5-layg1&radius=3000&callback=processPOIs&maxMatches=4000&origin=40.7686973%2C-73.9918181&hostedData=mqap.33882_stores_prod%7Copen_status%20=%20?%20OR%20open_status%20=%20?%20OR%20open_status%20=%20?%7CExisting,Coming%20Soon,New%7C

You have several problems here, the most significant of which is that you're calling JSON.parse twice. The second time you call it on struct, which was the result of calling JSON.parse the first time. You're basically doing JSON.parse(JSON.parse(string)). Oops.
There's another problem on the line where you call JSON.parse a second time: You call read on the value it returns. As far as I know JSON.parse does not ordinarily return anything that responds to read.
Fixing those two errors, your code looks something like this:
struct = JSON.parse(response.body.scan(/processPOIs\((.*)\);/).first.first)
CSV.open("output.csv", "w") do |csv|
struct.each do |hash|
csv << hash.values
end
end
This ought to work iif struct is an object that responds to each (like an array) and the values yielded by each all respond to values (like a hash). In other words, this code assumes that JSON.parse will return an array of hashes, or something similar. If it doesn't—well, that's beyond the scope of this question.
As an aside, this is not great:
response.body.scan(/processPOIs\((.*)\);/).first.first
The purpose of String#scan is to find every substring in a string that matches a regular expression. But you're only concerned with the first match, so scan is the wrong choice.
An alternative is to use String#match:
matches = response.body.match(/processPOIs\((.*)\)/)
json = matches[1]
struct = JSON.parse(json)
However, that's overkill. Since this is a JSONP response, we know that it will look like this:
processPOIs(...);
...give or take a trailing semicolon or newline. We don't need a regular expression to find the parts inside the parentheses, because we already know where it is: It starts 13 characters from the start (i.e. index 12) and ends two characters before the end ("index" -3). That makes it easy work with String#slice, a.k.a. String#[]:
json = response.body[12..-3]
struct = JSON.parse(json)
Like I said, "give or take a trailing semicolon or newline," so you might need to tweak that ending index depending on what the API returns. And with that, no more ugly .first.first, and it's faster, too.

Thank you everybody for the help. I was able to get everything into a CSV and then just used some VBA to organize it the way I wanted.
require 'json'
require 'net/http'
require 'uri'
require 'csv'
uri = 'https://www.mapquestapi.com/search/v2/radius?key=Imjtd%7Clu6t200zn0,bw=o5-layg1&radius=3000&callback=processPOIs&maxMatches=4000&origin=40.7686973%2C-73.9918181&hostedData=mqap.33882_stores_prod%7Copen_status%20=%20?%20OR%20open_status%20=%20?%20OR%20open_status%20=%20?%7CExisting,Coming%20Soon,New%7C'
response = Net::HTTP.get_response(URI.parse(uri))
matches = response.body.match(/processPOIs\((.*)\)/)
json = response.body[12..-3]
struct = JSON.parse(json)
CSV.open("output.csv", "w") do |csv|
csv << struct['searchResults'].map { |result| result['fields']}
end

Related

How to save an array of objects into a file in ruby?

I have an array of objects, where the objects are instances of a class. I would like to save this array into a file in such format that I could read the file back to an array and the objects and its' instance variable values would be as they were before saving. Does someone know how this could be achieved?
The class instance objects that I would like to save to a file are fairly complex containing tens of instance variables that are often other class instance variables themselves.
WHAT I HAVE TRIED:
According to this post I tried the following:
TRIAL1:
Save file:
require 'pp'
$stdout = File.open('path/to/file.txt', 'w')
pp myArray
Load file:
require 'rubygems'
require 'json'
buffer = File.open('path/to/file.txt', 'r').read
myArray = JSON.parse(buffer)
but I got a JSON::ParserError
TRIAL2:
Save file
serialized_array = Marshal.dump(myArray)
File.open('./myArray.txt', 'w') {|f| f.write(serialized_array) }
received Encoding::UndefinedConversionError
TRIAL1 doesn't work because pp "prints arguments in pretty form" and that's not necessarily JSON.
TRIAL2 probably isn't working because Marshal produces binary data (not text) and you're not working with your file in binary mode, that could lead to encoding and EOL problems. Besides, Marshal isn't a great format for persistence since the format is tied to the version of Ruby you're using.
A modification of TRIAL1 to write JSON is probably the best solution these days:
require 'json'
File.open('path/to/file.json', 'w') { |f| JSON.dump(myArray, f) }
Finally managed to find a solution that worked!
dump = Marshal.dump(myArray)
File.write('./myarray', myArray, mode: 'r+b')
dump = File.read('./myarray')
user = Marshal.restore(dump)
Marshall was able to do the trick after changing the encoding to binary mode

How can I convert this CSV to JSON with Ruby?

I am trying to convert a CSV file to JSON using Ruby. I am very, very, green when it comes to working with Ruby (or any language for that matter) so the answers may need to be dumbed down for me. Putting it in JSON seems like the most reasonable solution to me because I understand how to work with JSON when assigning variables equal to the attributes that come in the response. If there is a better way to do it, feel free to teach me.
My CSV is in the following format:
Header1,Header,Header3
ValueX,ValueY,ValueZ
I would like to be able to use the data to say something along the lines of this:
For each ValueX in Row 1 after the headers, check if valueZ is > ValueY. If yes, do this, if no do that. I understand how to do the if statement, just now how to parse out my information into variables/arrays.
Any ideas here?
require 'csv'
require 'json'
rows = []
CSV.foreach('a.csv', headers: true, converters: :all) do |row|
rows << row.to_hash
end
puts rows.to_json
# => [{"Header1":"ValueX","Header":"ValueY","Header3":"ValueZ"}]
Here is a first pointer:
require 'csv'
data = CSV.read('your_file.csv', { :col_sep => ',' }
Now you should have the data in data; you can test in irb.
I don't entirely understand the question:
if z > y
# do this
else
# do that
end
For JSON, you should be able to do JSON.parse().
I am not sure what target format JSON requires, probably a Hash.
You can populate your hash with the dataset from the CVS:
hash = Hash.new
hash[key_goes_here] = value_here

Convert JSON to string or hash in ruby

I have tried:
require 'net/http'
require 'json'
require 'pp'
require 'uri'
url = "http://xyz.com"
resp = Net::HTTP.get_response(URI.parse(url))
buffer = resp.body
result = JSON.parse(buffer)
#result.to_hash
#pp result
puts result
And got the output as:
{"id"=>"ABC", "account_id"=>"123", "first_name"=> "PEUS" }
in JSON format but I only need the value of id to be printed as ABC.
Your incoming string in JSON would look like:
{"id":"ABC","account_id":"123","first_name":"PEUS"}
After parsing with JSON it's the hash:
{"id"=>"ABC", "account_id"=>"123", "first_name"=> "PEUS" }
So, I'd use:
hash = {"id"=>"ABC", "account_id"=>"123", "first_name"=> "PEUS" }
hash['id'] # => "ABC"
Here's a more compact version:
require 'json'
json = '{"id":"ABC","account_id":"123","first_name":"PEUS"}'
hash = JSON[json]
hash['id'] # => "ABC"
Note I'm using JSON[json]. The JSON [] class method is smart enough to sense what the parameter being passed in is. If it's a string it'll parse the string. If it's an Array or Hash it'll serialize it. I find that handy because it allows me to write JSON[...] instead of having to remember whether I'm parsing or using to_json or something. Using it is an example of the first virtue of programmers.

How do I check if a string is valid YAML?

I'd like to check if a string is valid YAML. I'd like to do this from within my Ruby code with a gem or library. I only have this begin/rescue clause, but it doesn't get rescued properly:
def valid_yaml_string?(config_text)
require 'open-uri'
file = open("https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration")
hard_failing_bad_yaml = file.read
config_text = hard_failing_bad_yaml
begin
YAML.load config_text
return true
rescue
return false
end
end
I am unfortunately getting the terrible error of:
irb(main):089:0> valid_yaml_string?("b")
Psych::SyntaxError: (<unknown>): mapping values are not allowed in this context at line 6 column 19
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:203:in `parse'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:203:in `parse_stream'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:151:in `parse'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:127:in `load'
from (irb):83:in `valid_yaml_string?'
from (irb):89
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/bin/irb:12:in `<main>'
Using a cleaned-up version of your code:
require 'yaml'
require 'open-uri'
URL = "https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration"
def valid_yaml_string?(yaml)
!!YAML.load(yaml)
rescue Exception => e
STDERR.puts e.message
return false
end
puts valid_yaml_string?(open(URL).read)
I get:
(<unknown>): mapping values are not allowed in this context at line 6 column 19
false
when I run it.
The reason is, the data you are getting from that URL isn't YAML at all, it's HTML:
open('https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration').read[0, 100]
=> " \n\n\n<!DOCTYPE html>\n<html>\n <head prefix=\"og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# githubog:"
If you only want a true/false response whether it's parsable YAML, remove this line:
STDERR.puts e.message
Unfortunately, going beyond that and determining if the string is a YAML string gets harder. You can do some sniffing, looking for some hints:
yaml[/^---/m]
will search for the YAML "document" marker, but a YAML file doesn't have to use those, nor do they have to be at the start of the file. We can add that in to tighten up the test:
!!YAML.load(yaml) && !!yaml[/^---/m]
But, even that leaves some holes, so adding in a test to see what the parser returns can help even more. YAML could return an Fixnum, String, an Array or a Hash, but if you already know what to expect, you can check to see what YAML wants to return. For instance:
YAML.load(({}).to_yaml).class
=> Hash
YAML.load(({}).to_yaml).instance_of?(Hash)
=> true
So, you could look for a Hash:
parsed_yaml = YAML.load(yaml)
!!yaml[/^---/m] && parsed_yaml.instance_of(Hash)
Replace Hash with whatever type you think you should get.
There might be even better ways to sniff it out, but those are what I'd try first.

Ruby: how can use the dump method to output data to a csv file?

I try to use the ruby standard csv lib to dump out the arr of object to a csv.file , called 'a.csv'
http://ruby-doc.org/stdlib-1.9.3/libdoc/csv/rdoc/CSV.html#method-c-dump
dump(ary_of_objs, io = "", options = Hash.new)
but in this method, how can i dump into a file?
there is no such examples exists and help. I google it no example to do for me...
Also, the docs said that...
The next method you can provide is an instance method called
csv_headers(). This method is expected to return the second line of
the document (again as an Array), which is to be used to give each
column a header. By default, ::load will set an instance variable if
the field header starts with an # character or call send() passing the
header as the method name and the field value as an argument. This
method is only called on the first object of the Array.
Anyone knows how to pass the instance method csv_headers() to this dump function?
I haven't tested this out yet, but it looks like io should be set to a file. According to the doc you linked "The io parameter can be used to serialize to a File"
Something like:
f = File.open("filename")
dump(ary_of_objs, io = f, options = Hash.new)
The accepted answer doesn't really answer the question so I thought I'd give a useful example.
First of all if you look at the docs at http://ruby-doc.org/stdlib-1.9.3/libdoc/csv/rdoc/CSV.html, if you hover over the method name for dump you see you can click to show source. If you do that you'll see that the dump method attempts to call csv_headers on the first object you pass in from ary_of_objs:
obj_template = ary_of_objs.first
...snip...
headers = obj_template.csv_headers
Then later you see that the method will call csv_dump on each object in ary_of_objs and pass in the headers:
ary_of_objs.each do |obj|
begin
csv << obj.csv_dump(headers)
rescue NoMethodError
csv << headers.map do |var|
if var[0] == #
obj.instance_variable_get(var)
else
obj[var[0..-2]]
end
end
end
end
So we need to augment each entry in array_of_objs to respond to those two methods. Here's an example wrapper class that would take a Hash, and return the hash keys as the CSV headers and then be able to dump each row based on the headers.
class CsvRowDump
def initialize(row_hash)
#row = row_hash
end
def csv_headers
#row.keys
end
def csv_dump(headers)
headers.map { |h| #row[h] }
end
end
There's one more catch though. This dump method wants to write an extra line at the top of the CSV file before the headers, and there's no way to skip that if you call this method due to this code at the top:
# write meta information
begin
csv << obj_template.class.csv_meta
rescue NoMethodError
csv << [:class, obj_template.class]
end
Even if you return '' from CsvRowDump.csv_meta that will still be a blank line where a parse expects the headers. So instead lets let dump write that line and then remove it afterwards when we call dump. This example assumes you have an array of hashes that all have the same keys (which will be the CSV header).
#rows = #hashes.map { |h| CsvRowDump.new(h) }
File.open(#filename, "wb") do |f|
str = CSV::dump(#rows)
f.write(str.split(/\n/)[1..-1].join("\n"))
end

Resources