Taking json data and converting it to a CSV file - ruby

Okay... so new to Ruby here but loving it so far. My problem is I cannot get the data to go into the CSV files.
#!/usr/bin/env ruby
require 'date'
require_relative 'amf'
require 'json'
require 'csv'
amf = Amf.new
#This makes it go out 3 days
apps = amf.post( 'Appointments.getBetweenDates',
{ 'startDate' => Date.today, 'endDate' => Date.today + 4 }
)
apps.each do |app|
cor_md_params = { 'appId' => app['appID'], 'relId' => 7 }
cor_md = amf.post( 'Clinicians.getByAppIdAndRelId', cor_md_params ).first
#this is where it breaks ----->
CSV.open("ile.csv", "wb") do |csv|
csv << ["column1", "column2", "etc.", "etc.."]
csv << ([
# if added puts ([ I can display the info and then make a csv...
app['patFirstName'],
app['patMiddleName'],
app['patLastName'],
app['patBirthdate'],
app['patHin'],
app['patPhone'],
app['patCellPhone'],
app['patBusinessPhone'],
app['appTime'],
app['appID'],
app['patPostalCode'],
app['patProvince'],
app['locName'],
# note that this is not exactly accurate for follow-ups,
# where you have to replace the "1" with the actual value
# in weeks, days, months, etc
#app[ 'bookName' ], => not sure this is needed
cor_md['id'],
cor_md['providerCode'],
cor_md['firstName'],
cor_md['lastName']
].join(', '))
end
end
Now, if I remove the attempt to make the ile.csv file and just output it with a puts, all the data shows. But I don't want to have to go into the terminal and create a CSV file... I would rather just run the .rb program and have it created. Also, hopefully I am making the columns correctly as well...
The thought occurred to me that I could just add another puts above the output.
Or, better, insert a row into the array before I output it...
Really not sure what is best practice here and standards.
This is what I have done and attempted. How can I get it to cleanly output to a CSV file, since my attempts are not working?
Also, to clarify where it breaks: it does add the column names, just not the JSON info that is parsed. I could also be doing this completely the wrong way, or in a way that isn't possible. I just do not know.

What kind of error do you get? Is it this one?
`<<': undefined method `map' for "something":String (NoMethodError)
I think you should remove the .join(', ').
The << method of CSV accepts an array, not a String:
http://ruby-doc.org/stdlib-1.9.2/libdoc/csv/rdoc/CSV.html#method-i-3C-3C
So instead of:
cor_md['lastName']
].join(', '))
rather:
cor_md['lastName']
])
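To see why passing the array matters beyond avoiding the error, here is a minimal sketch. The sample values are made up for illustration. CSV quotes a field that contains a comma, while a hand-joined string leaves that comma ambiguous:

```ruby
require 'csv'

# Hypothetical row; the first field contains a comma.
row = ['Smith, John', '555-1234']

# CSV.generate_line takes an array and quotes fields where needed.
safe_line = CSV.generate_line(row)

# Joining by hand yields three comma-separated pieces instead of two fields.
naive_line = row.join(', ')

puts safe_line   # "Smith, John",555-1234
puts naive_line  # Smith, John, 555-1234
```

Parsing safe_line gives back the original two fields; parsing naive_line would give three.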
The problem with the loop (why it writes only one row of data)
In the body of your loop, you reopen the file on every iteration. Opening with "wb" truncates the file, so each pass overwrites what the previous one wrote. What you probably want is to open the file once and loop inside the block:
CSV.open("ile3.csv", "wb") do |csv|
csv << ["column1", "column2", "etc.", "etc.."]
apps.each do |app|
cor_md_params = { 'appId' => app['appID'], 'relId' => 7 }
cor_md = amf.post( 'Clinicians.getByAppIdAndRelId', cor_md_params ).first
#csv << your long array
end
end
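Put together, the shape looks roughly like this. It is only a sketch: the apps array below is a made-up stand-in for whatever amf.post returns, the column names are placeholders, and CSV.generate is used so the result can be inspected as a string.

```ruby
require 'csv'

# Hypothetical stand-in for the records returned by amf.post.
apps = [
  { 'patFirstName' => 'Ann', 'patLastName' => 'Lee', 'appID' => 1 },
  { 'patFirstName' => 'Bob', 'patLastName' => 'Ray', 'appID' => 2 }
]

# Build the CSV once: header row first, then one array per record.
def appointments_csv(apps)
  CSV.generate do |csv|
    csv << %w[first_name last_name app_id]
    apps.each do |app|
      csv << [app['patFirstName'], app['patLastName'], app['appID']]
    end
  end
end

puts appointments_csv(apps)
```

Swapping CSV.generate for CSV.open("ile3.csv", "wb") writes the same rows to a file instead.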

Related

Difficulty processing json with ruby

I have the following json...
{
"NumPages":"17",
"Page":"1",
"PageSize":"50",
"Total":"808",
"Start":"1",
"End":"50",
"FirstPageUri":"/v3/results?PAGE=1",
"LastPageUri":"/v3/results?PAGE=17",
"PreviousPageUri":"",
"NextPageUri":"/v3/results?PAGE=2",
"User":[
{
"RowNumber":"1",
"UserId":"86938",
"InternalId":"",
"CompletionPercentage":"100",
"DateTimeTaken":"2014-06-18T01:43:25Z",
"DateTimeLastUpdated":"2014-06-18T01:58:11Z",
"DateTimeCompleted":"2014-06-18T01:58:11Z",
"Account":{
"Id":"655",
"Name":"Technical Community College"
},
"FirstName":"Matthew",
"LastName":"Knice",
"EmailAddress":"knice#gmail.com",
"AssessmentResults":[
{
"Title":"Life Factors",
"Code":"LifeFactors",
"IsComplete":"1",
"AttemptNumber":"1",
"Percent":"58",
"Readiness":"fail",
"DateTimeCompleted":"2014-06-18T01:46:00Z"
},
{
"Title":"Learning Styles",
"Code":"LearnStyles",
"IsComplete":"0"
},
{
"Title":"Personal Attributes",
"Code":"PersonalAttributes",
"IsComplete":"1",
"AttemptNumber":"1",
"Percent":"52.08",
"Readiness":"fail",
"DateTimeCompleted":"2014-06-18T01:49:00Z"
},
{
"Title":"Technical Competency",
"Code":"TechComp",
"IsComplete":"1",
"AttemptNumber":"1",
"Percent":"100",
"Readiness":"pass",
"DateTimeCompleted":"2014-06-18T01:51:00Z"
},
{
"Title":"Technical Knowledge",
"Code":"TechKnowledge",
"IsComplete":"1",
"AttemptNumber":"1",
"Percent":"73.44",
"Readiness":"question",
"DateTimeCompleted":"2014-06-18T01:58:00Z"
},
{
"Title":"Reading Rate & Recall",
"Code":"Reading",
"IsComplete":"0"
},
{
"Title":"Typing Speed & Accuracy",
"Code":"Typing",
"IsComplete":"0"
}
]
},
{
"RowNumber":"2",
"UserId":"8654723",
"InternalId":"",
"CompletionPercentage":"100",
"DateTimeTaken":"2014-06-13T14:37:59Z",
"DateTimeLastUpdated":"2014-06-13T15:00:12Z",
"DateTimeCompleted":"2014-06-13T15:00:12Z",
"Account":{
"Id":"655",
"Name":"Technical Community College"
},
"FirstName":"Virginia",
"LastName":"Bustas",
"EmailAddress":"bigBusta#students.college.edu",
"AssessmentResults":[
{
...
I need to start processing where you see "User". The stuff at the beginning (NumPages, Page, etc.) I want to ignore. Here is the processing script I am working on...
require 'csv'
require 'json'
CSV.open("your_csv.csv", "w") do |csv| #open new file for write
JSON.parse(File.open("sample.json").read).each do |hash| #open json to parse
csv << hash.values
end
end
Right now this fails with the error:
convert.rb:6:in `block (2 levels) in <main>': undefined method `values' for ["NumPages", "17"]:Array (NoMethodError)
I have run the JSON through a parser, and it seems to be valid. What is the best way to only process the "User" data?
You have to look at the structure of the JSON object being created. Here's a very small subset of your document being parsed, which makes it easier to see and understand:
require 'json'
foo = '{"NumPages":17,"User":[{"UserId":12345}]}'
bar = JSON[foo]
# => {"NumPages"=>17, "User"=>[{"UserId"=>12345}]}
bar['User'].first['UserId'] # => 12345
foo contains the JSON for a hash. bar contains the Ruby object created by the JSON parser after it reads foo.
User is the key pointing to an array of hashes. Because it's an array, you have to specify which of the hashes in the array you want to look at, which is what bar['User'].first does.
An alternate way to access that sub-hash is:
bar['User'][0]['UserId'] # => 12345
If there were multiple hashes inside the array, you could access them by using the appropriate index value. For example, if there are two hashes, and I want the second one:
foo = '{"NumPages":17,"User":[{"UserId":12345},{"UserId":12346}]}'
bar = JSON[foo]
# => {"NumPages"=>17, "User"=>[{"UserId"=>12345}, {"UserId"=>12346}]}
bar['User'].first['UserId'] # => 12345
bar['User'][0]['UserId'] # => 12345
bar['User'][1]['UserId'] # => 12346
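If you are on Ruby 2.3 or later, Hash#dig is another way to walk the same structure, and map collects a value from every hash in the array. A small sketch using the same sample document:

```ruby
require 'json'

foo = '{"NumPages":17,"User":[{"UserId":12345},{"UserId":12346}]}'
bar = JSON.parse(foo)

# dig walks nested keys and indexes, returning nil if a step is missing.
second_id = bar.dig('User', 1, 'UserId')   # => 12346
missing   = bar.dig('User', 5, 'UserId')   # => nil

# map visits every hash in the array rather than one index at a time.
all_ids = bar['User'].map { |user| user['UserId'] }   # => [12345, 12346]
```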
I'm wondering if I am going down the wrong road with the JSON.parse(File.open("sample.json").read).each do |hash|?
Yes, you are. You need to understand what you're doing, and break your code into digestible pieces so they make sense to you. Consider this:
require 'csv'
require 'json'
json_object = JSON.parse(File.read("sample.json"))
CSV.open("your_csv.csv", "w") do |csv| #open new file for write
csv << %w[RowNumber UserID AccountID AccountName FirstName LastName EmailAddress]
json_object['User'].each do |user_hash|
puts 'RowNumber: %s' % user_hash['RowNumber']
puts 'UserID: %s' % user_hash['UserId']
account = user_hash['Account']
puts 'Account->Id: %s' % account['Id']
puts 'Account->Name: %s' % account['Name']
puts 'FirstName: %s' % user_hash['FirstName']
puts 'LastName: %s' % user_hash['LastName']
puts 'EmailAddress: %s' % user_hash['EmailAddress']
csv << [
user_hash['RowNumber'],
user_hash['UserId'],
account['Id'],
account['Name'],
user_hash['FirstName'],
user_hash['LastName'],
user_hash['EmailAddress']
]
end
end
This reads the JSON file and parses it into a Ruby object immediately. Nothing special happens with the file: it's opened, read, and closed, and its content is passed to the JSON parser and assigned to json_object.
Once parsed, the CSV file is opened and a header row is written. It could have been written as part of the open statement but this is clearer for explaining what's going on.
json_object is a hash, so to access the 'User' data you have to use a normal hash access json_object['User']. The value for the User key is an array of hashes, so those need to be iterated over, which is what json_object['User'].each does, passing the hash elements of that array into the block as user_hash.
Inside that block, it's pretty much the same thing as accessing the value for 'User': each element is a key/value pair, except 'Account', which is an embedded hash.
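For reference, writing the header row as part of the open call would look something like the following sketch. The two column names and the single row are placeholders, and CSV.generate stands in for CSV.open so the output is easy to see:

```ruby
require 'csv'

# write_headers and headers make CSV emit the header row itself.
output = CSV.generate(write_headers: true, headers: %w[UserId FirstName]) do |csv|
  csv << %w[86938 Matthew]
end

puts output
```

The same two options can be passed to CSV.open after the file mode.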
Read the error message. each called on a hash is giving you a sequence of arrays with two members (the key and value together). There is no values method on an array. And in any case if what you have is a hash there seems little point cycling through it with each; if you want the "User" entry in the hash, why don't you ask for it up front?
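A tiny sketch of that point, using a cut-down version of the document (the values are taken from the sample above):

```ruby
require 'json'

parsed = JSON.parse('{"NumPages":"17","User":[{"UserId":"86938"}]}')

# Iterating a Hash yields [key, value] pairs, which is where the
# ["NumPages", "17"] array in the error message came from.
pairs = parsed.map { |pair| pair }

# Asking for the key up front returns the array of user hashes,
# and each of those does respond to values.
user_rows = parsed['User'].map(&:values)
```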
Just for posterity and context, this is the script I ended up using in its entirety. I needed to pull from a URL, process the results, and move them to a simple CSV. I needed to write the student ID, first name, last name, and the score from each of 4 assessments to the CSV.
require 'csv'
require 'json'
require 'curb'
c = Curl::Easy.new('myURL/m/v3/results')
c.http_auth_types = :basic
c.username = 'myusername'
c.password = 'mypassword'
c.perform
json_object = JSON.parse(c.body_str)
CSV.open("your_csv.csv", "w") do |csv| #open new file for write
csv << %w[UserID FirstName LastName LifeFactors PersonalAttributes TechComp TechKnowledge]
json_object['User'].each do |user_hash|
csv << [
user_hash['UserId'],
user_hash['FirstName'],
user_hash['LastName'],
user_hash['AssessmentResults'][0]['Percent'],
user_hash['AssessmentResults'][2]['Percent'],
user_hash['AssessmentResults'][3]['Percent'],
user_hash['AssessmentResults'][4]['Percent']
]
end
end
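One caveat with the script above: it relies on the assessments always arriving in the same order. If that ever changes, looking each result up by its "Code" is a bit more robust. A rough sketch, where percent_for is a hypothetical helper name and the user hash is a trimmed copy of the sample data:

```ruby
# Hypothetical helper: find an assessment by its Code and return its Percent,
# or nil when the assessment is absent or has no Percent yet.
def percent_for(user_hash, code)
  result = user_hash.fetch('AssessmentResults', []).find { |r| r['Code'] == code }
  result && result['Percent']
end

# Trimmed-down sample user, for illustration only.
user = {
  'UserId' => '86938',
  'AssessmentResults' => [
    { 'Code' => 'LifeFactors', 'IsComplete' => '1', 'Percent' => '58' },
    { 'Code' => 'LearnStyles', 'IsComplete' => '0' },
    { 'Code' => 'TechComp', 'IsComplete' => '1', 'Percent' => '100' }
  ]
}

codes = %w[LifeFactors PersonalAttributes TechComp TechKnowledge]
row = [user['UserId']] + codes.map { |code| percent_for(user, code) }
```

Missing assessments come out as nil, which CSV writes as an empty field.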

CSV.generate and converters?

I'm trying to create a converter to remove newline characters from CSV output.
I've got:
nonewline=lambda do |s|
s.gsub(/(\r?\n)+/,' ')
end
I've verified that this works properly IF I load a variable and then run something like:
csv=CSV(variable,:converters=>[nonewline])
However, I'm attempting to use this code to update a bunch of preexisting code using CSV.generate, and it does not appear to work at all.
CSV.generate(:converters=>[nonewline]) do |csv|
csv << ["hello\ngoodbye"]
end
returns:
"\"hello\ngoodbye\"\n"
I've tried quite a few things as well as trying other examples I've found online, and it appears as though :converters has no effect when used with CSV.generate.
Is this correct, or is there something I'm missing?
You need to write your converter as below:
CSV::Converters[:nonewline] = lambda do |s|
s.gsub(/(\r?\n)+/,' ')
end
Then do:
CSV.generate(:converters => [:nonewline]) do |csv|
csv << ["hello\ngoodbye"]
end
Read the Converters documentation.
I have left the part above in place to show how to write a custom CSV converter. The way you wrote it is incorrect.
Read the documentation of CSV::generate:
This method wraps a String you provide, or an empty default String, in a CSV object which is passed to the provided block. You can use the block to append CSV rows to the String and when the block exits, the final String will be returned.
After reading the docs, it is quite clear that this method is for writing CSV output, not for reading it. The converter options (such as :converters and :header_converters) are applied when you are reading CSV data, but not when you are writing it.
Let me show you 2 examples to illustrate this more clearly.
require 'csv'
string = <<_
foo,bar
baz,quack
_
File.write('a',string)
CSV::Converters[:upcase] = lambda do |s|
s.upcase
end
I am reading from a CSV file, so :converters option is applied to it.
CSV.open('a','r',:converters => :upcase) do |csv|
puts csv.read
end
output
# >> FOO
# >> BAR
# >> BAZ
# >> QUACK
Now I am writing into the CSV file, so the :converters option is not applied.
CSV.open('a','w',:converters => :upcase) do |csv|
csv << ['dog','cat']
end
CSV.read('a') # => [["dog", "cat"]]
Attempting to remove newlines using :converters did not work.
I had to override the << method from csv.rb adding the following code to it:
# Change all CR/NL's into one space
row.map! { |element|
if element.is_a?(String)
element.gsub(/(\r?\n)+/,' ')
else
element
end
}
Placed right before
output = row.map(&@quote).join(@col_sep) + @row_sep # quote and separate
at line 21.
I would think this would be a good patch to CSV, as newlines will always produce bad CSV output.
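For anyone who would rather not edit csv.rb, the same cleanup can be done on each row before it is appended. This is only a sketch: without_newlines is a made-up helper name, and it uses the same regular expression as above. Some newer releases of the csv library may offer a write-side converter option; that is worth checking in the documentation for your version, and it is not relied on here.

```ruby
require 'csv'

# Hypothetical helper: replace runs of CR/LF with one space in String
# fields, leaving non-String fields (numbers, nil) untouched.
def without_newlines(row)
  row.map { |field| field.is_a?(String) ? field.gsub(/(\r?\n)+/, ' ') : field }
end

output = CSV.generate do |csv|
  csv << without_newlines(["hello\ngoodbye", 42, nil])
end

puts output
```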

How do I make an array of arrays out of a CSV?

I have a CSV file that looks like this:
Jenny, jenny@example.com ,
Ricky, ricky@example.com ,
Josefina josefina@example.com ,
I'm trying to get this output:
users_array = [
['Jenny', 'jenny@example.com'], ['Ricky', 'ricky@example.com'], ['Josefina', 'josefina@example.com']
]
I've tried this:
users_array = Array.new
file = File.new('csv_file.csv', 'r')
file.each_line("\n") do |row|
puts row + "\n"
columns = row.split(",")
users_array.push columns
puts users_array
end
Unfortunately, in Terminal, this returns:
Jenny
jenny@example.com
Ricky
ricky@example.com
Josefina
josefina@example.com
Which I don't think will work for this:
users_array.each_with_index do |user|
add_page.form_with(:id => 'new_user') do |f|
f.field_with(:id => "user_email").value = user[0]
f.field_with(:id => "user_name").value = user[1]
end.click_button
end
What do I need to change? Or is there a better way to solve this problem?
Ruby's standard library has a CSV class with an API similar to File's, plus a number of useful methods for working with tabular data. To get the output you want, all you need to do is this:
require 'csv'
users_array = CSV.read('csv_file.csv')
PS - I think you are getting the output you expected with your file parsing as well, but maybe you're thrown off by how it is printing to the terminal. puts behaves differently with arrays, printing each member object on a new line instead of as a single array. If you want to view it as an array, use puts my_array.inspect.
Assuming that your CSV file actually has a comma between the name and email address on the third line:
require 'csv'
users_array = []
CSV.foreach('csv_file.csv') do |row|
users_array.push row.delete_if(&:nil?).map(&:strip)
end
users_array
# => [["Jenny", "jenny@example.com"],
# ["Ricky", "ricky@example.com"],
# ["Josefina", "josefina@example.com"]]
There may be a simpler way, but what I'm doing there is discarding the nil field created by the trailing comma and stripping the spaces around the email addresses.
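Here is the same idea as a self-contained sketch that parses an inline string instead of a file, so it can be tried without creating csv_file.csv. Only two sample rows are included, and compact is used in place of delete_if(&:nil?):

```ruby
require 'csv'

data = <<~'TEXT'
  Jenny, jenny@example.com ,
  Ricky, ricky@example.com ,
TEXT

# The trailing comma produces a nil third field; compact drops it,
# and strip removes the spaces around each remaining value.
users = CSV.parse(data).map { |row| row.compact.map(&:strip) }

# inspect shows the nested structure that a plain puts would flatten.
puts users.inspect
```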

Removing whitespaces in a CSV file

I have a string with extra whitespace:
First,Last,Email ,Mobile Phone ,Company,Title ,Street,City,State,Zip,Country, Birthday,Gender ,Contact Type
I want to parse this line and remove the whitespaces.
My code looks like:
namespace :db do
task :populate_contacts_csv => :environment do
require 'csv'
csv_text = File.read('file_upload_example.csv')
csv = CSV.parse(csv_text, :headers => true)
csv.each do |row|
puts "First Name: #{row['First']} \nLast Name: #{row['Last']} \nEmail: #{row['Email']}"
end
end
end
@prices = CSV.parse(IO.read('prices.csv'), :headers=>true,
:header_converters=> lambda {|f| f.strip},
:converters=> lambda {|f| f ? f.strip : nil})
The nil test is added to the row but not header converters assuming that the headers are never nil, while the data might be, and nil doesn't have a strip method. I'm really surprised that, AFAIK, :strip is not a pre-defined converter!
You can strip your hash first:
csv.each do |unstripped_row|
row = {}
# v is nil for empty fields, so guard before calling strip
unstripped_row.each{|k, v| row[k.strip] = v && v.strip}
puts "First Name: #{row['First']} \nLast Name: #{row['Last']} \nEmail: #{row['Email']}"
end
Edited to strip hash keys too
CSV supports "converters" for the headers and fields, which let you get inside the data before it's passed to your each loop.
Writing a sample CSV file:
csv = "First,Last,Email ,Mobile Phone ,Company,Title ,Street,City,State,Zip,Country, Birthday,Gender ,Contact Type
first,last,email ,mobile phone ,company,title ,street,city,state,zip,country, birthday,gender ,contact type
"
File.write('file_upload_example.csv', csv)
Here's how I'd do it:
require 'csv'
csv = CSV.open('file_upload_example.csv', :headers => true)
[:convert, :header_convert].each { |c| csv.send(c) { |f| f.strip } }
csv.each do |row|
puts "First Name: #{row['First']} \nLast Name: #{row['Last']} \nEmail: #{row['Email']}"
end
Which outputs:
First Name: 'first'
Last Name: 'last'
Email: 'email'
The converters simply strip leading and trailing whitespace from each header and each field as they're read from the file.
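The same converters can also be passed as options when parsing. A minimal sketch against an in-memory string follows; the sample row is invented, and strip is just a local lambda name:

```ruby
require 'csv'

# Shortened sample with trailing spaces in some headers and fields.
data = "First,Last,Email ,Mobile Phone \njohn,doe,jd@example.com ,555 1234 \n"

# One lambda serves both header and field conversion.
strip = ->(field) { field.respond_to?(:strip) ? field.strip : field }

table = CSV.parse(data, headers: true, header_converters: [strip], converters: [strip])

puts table.headers.inspect
puts table.first['Email']
```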
Also, as a programming design choice, don't read your file into memory using:
csv_text = File.read('file_upload_example.csv')
Then parse it:
csv = CSV.parse(csv_text, :headers => true)
Then loop over it:
csv.each do |row|
Ruby's IO system supports enumerating over a file line by line. Once my code calls CSV.open, the file is readable, and each reads one line at a time. The entire file never has to be in memory at once, which matters because slurping doesn't scale (though on newer machines it's becoming more reasonable). If you test, you'll find that reading a file using each is extremely fast, probably as fast as reading it all, parsing it, and then iterating over the parsed result.
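A small sketch of the streaming style, using CSV.foreach and a temporary file so it is self-contained. The file contents and the count_rows helper are made up for illustration:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: count data rows while holding only one row at a time.
def count_rows(path)
  count = 0
  CSV.foreach(path, headers: true) { |_row| count += 1 }
  count
end

rows_seen = Tempfile.create(['contacts', '.csv']) do |file|
  file.write("First,Last\nann,lee\nbob,ray\n")
  file.flush
  count_rows(file.path)
end

puts rows_seen
```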

Parse CSV file with header fields as attributes for each row

I would like to parse a CSV file so that each row is treated like an object with the header-row being the names of the attributes in the object. I could write this, but I'm sure its already out there.
Here is my CSV input:
"foo","bar","baz"
1,2,3
"blah",7,"blam"
4,5,6
The code would look something like this:
CSV.open('my_file.csv','r') do |csv_obj|
puts csv_obj.foo #prints 1 the 1st time, "blah" 2nd time, etc
puts csv_obj.bar #prints 2 the first time, 7 the 2nd time, etc
end
With Ruby's CSV module I believe I can only access the fields by index. I think the above code would be a bit more readable. Any ideas?
Using Ruby 1.9 and above, you can get an indexable object:
CSV.foreach('my_file.csv', :headers => true) do |row|
puts row['foo'] # prints 1 the 1st time, "blah" 2nd time, etc
puts row['bar'] # prints 2 the first time, 7 the 2nd time, etc
end
It's not dot syntax but it is much nicer to work with than numeric indexes.
As an aside, for Ruby 1.8.x FasterCSV is what you need to use the above syntax.
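If you really do want the dot syntax from the question, one option is to build a Struct from the header row. This is only a sketch: rows_as_structs is a made-up helper, and the :numeric converter is added so that 1 and 7 come back as integers rather than strings.

```ruby
require 'csv'

# Hypothetical helper: turn each CSV row into a Struct whose members
# are named after the (symbolized) headers.
def rows_as_structs(csv_text)
  table = CSV.parse(csv_text, headers: true,
                    header_converters: :symbol, converters: :numeric)
  row_class = Struct.new(*table.headers)
  table.map { |row| row_class.new(*row.fields) }
end

csv_text = <<~TEXT
  "foo","bar","baz"
  1,2,3
  "blah",7,"blam"
  4,5,6
TEXT

objects = rows_as_structs(csv_text)
puts objects[0].foo   # 1
puts objects[1].bar   # 7
```

This assumes the headers are valid Ruby identifiers once symbolized.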
Here is an example of the symbolic syntax using Ruby 1.9. In the examples below, the code reads a CSV file named data.csv from Rails db directory.
:headers => true treats the first row as a header instead of a data row. The :header_converters => :symbol parameter then converts each cell in the header row into a Ruby symbol.
CSV.foreach("#{Rails.root}/db/data.csv", {:headers => true, :header_converters => :symbol}) do |row|
puts "#{row[:foo]},#{row[:bar]},#{row[:baz]}"
end
In Ruby 1.8:
require 'fastercsv'
FasterCSV.foreach("#{Rails.root}/db/data.csv", {:headers => true, :header_converters => :symbol}) do |row|
puts "#{row[:foo]},#{row[:bar]},#{row[:baz]}"
end
Based on the CSV provided by Poul (the asker), the output from the example code above will be:
1,2,3
blah,7,blam
4,5,6
Depending on the characters used in the headers of the CSV file, it may be necessary to output the headers in order to see how CSV (FasterCSV) converted the string headers to symbols. You can output the array of headers from within the CSV.foreach.
row.headers
Easy to get a hash in Ruby 2.3:
CSV.foreach('my_file.csv', headers: true, header_converters: :symbol) do |row|
puts row.to_h[:foo]
puts row.to_h[:bar]
end
Although I am pretty late to the discussion, a few months ago I started a "CSV to object mapper" at https://github.com/vicentereig/virgola.
Given your CSV contents, mapping them to an array of FooBar objects is pretty straightforward:
"foo","bar","baz"
1,2,3
"blah",7,"blam"
4,5,6
require 'virgola'
class FooBar
include Virgola
attribute :foo
attribute :bar
attribute :baz
end
csv = <<CSV
"foo","bar","baz"
1,2,3
"blah",7,"blam"
4,5,6
CSV
foo_bars = FooBar.parse(csv).all
foo_bars.each { |foo_bar| puts foo_bar.foo, foo_bar.bar, foo_bar.baz }
Since I hit this question with some frequency:
array_of_hashmaps = CSV.read("path/to/file.csv", headers: true)
puts array_of_hashmaps.first["foo"] # 1
This is the non-block version, when you want to slurp the whole file.
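Strictly speaking, what comes back with headers enabled is a CSV::Table of CSV::Row objects rather than an array of hashes. If plain hashes are needed, to_h on each row does the conversion. A short sketch using an inline string in place of the file:

```ruby
require 'csv'

csv_text = "foo,bar\n1,2\nblah,7\n"

table = CSV.parse(csv_text, headers: true)   # CSV::Table
hashes = table.map(&:to_h)                   # plain Array of Hashes

puts table.class
puts hashes.first['foo']
```

Without a converter the values stay strings, so the first 'foo' is "1", not the integer 1.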

Resources