Ruby difference between CSV.read() and CSV.new() - ruby

Given a link that downloads a CSV (clicking the link downloads the CSV instead of opening it in a browser), can I read it using CSV.read()? I know that I can do it using:
CSV.new(open(params[:ad_csv]), headers: true).each do |row|
puts row # ad dict with header value as keys
end
However, I can't read the CSV like this: CSV.read(open(params[:ad_csv]), headers: true, read_timeout: 600)
I read the documentation but it didn't clear things up for me. Hence my question: what is the difference between CSV.read() and CSV.new()?

CSV.new just initializes an instance of CSV that can be assigned to a variable and used to read from or write to a String or IO object.
CSV.read, by contrast, opens the file at the given path, initializes an instance of CSV, and immediately reads its content into an array. From the docs:
Use read to slurp a CSV file into an Array of Arrays. Pass the path to the file and any options ::new understands.
Simplified (very simplified) CSV.read is implemented like this:
def self.read(path, *options)
# CSV.open opens the file at path and yields a CSV instance
open(path, *options) { |csv| csv.read }
end
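To make the distinction concrete, here is a minimal sketch, assuming a temporary local file stands in for the downloaded CSV (the helper names, the file name, and the sample data are made up for illustration). CSV.read is given a path, while CSV.new is given an IO that is already open.

```ruby
require 'csv'
require 'tempfile'

# CSV.read takes a *path*, loads the whole file, and returns a CSV::Table
# when headers: true is given.
def titles_via_read(path)
  CSV.read(path, headers: true).map { |row| row['title'] }
end

# CSV.new wraps an already opened IO (or a String) and parses rows as you iterate.
def titles_via_new(io)
  CSV.new(io, headers: true).map { |row| row['title'] }
end

# Hypothetical sample data standing in for the downloaded file.
Tempfile.create(['ads', '.csv']) do |file|
  file.write("id,title\n1,First ad\n2,Second ad\n")
  file.flush

  p titles_via_read(file.path)
  File.open(file.path) { |io| p titles_via_new(io) }
end
```

Both helpers return the same titles; the difference is only in what they accept and when the data is loaded.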

Related

How to save an array of objects into a file in ruby?

I have an array of objects, where the objects are instances of a class. I would like to save this array into a file in such a format that I could read the file back into an array, and the objects and their instance variable values would be as they were before saving. Does someone know how this could be achieved?
The class instance objects that I would like to save to a file are fairly complex containing tens of instance variables that are often other class instance variables themselves.
WHAT I HAVE TRIED:
According to this post I tried the following:
TRIAL1:
Save file:
require 'pp'
$stdout = File.open('path/to/file.txt', 'w')
pp myArray
Load file:
require 'rubygems'
require 'json'
buffer = File.open('path/to/file.txt', 'r').read
myArray = JSON.parse(buffer)
but I got a JSON::ParserError
TRIAL2:
Save file
serialized_array = Marshal.dump(myArray)
File.open('./myArray.txt', 'w') {|f| f.write(serialized_array) }
received Encoding::UndefinedConversionError
TRIAL1 doesn't work because pp "prints arguments in pretty form" and that's not necessarily JSON.
TRIAL2 probably isn't working because Marshal produces binary data (not text) and you're not working with your file in binary mode, which can lead to encoding and EOL problems. Besides, Marshal isn't a great format for persistence since the format is tied to the version of Ruby you're using.
A modification of TRIAL1 to write JSON is probably the best solution these days:
require 'json'
File.open('path/to/file.json', 'w') { |f| JSON.dump(myArray, f) }
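One caveat with the JSON route: instances of custom classes are not turned into structured JSON unless they are converted to plain data first. The following is a minimal sketch of a round trip, assuming a hypothetical Player class with hypothetical to_h and from_h helpers (none of these names come from the question).

```ruby
require 'json'

# Hypothetical class standing in for the asker's objects.
class Player
  attr_reader :name, :score

  def initialize(name, score)
    @name = name
    @score = score
  end

  # Plain-data view of the object that JSON can represent.
  def to_h
    { 'name' => name, 'score' => score }
  end

  # Rebuild an object from a parsed hash.
  def self.from_h(hash)
    new(hash['name'], hash['score'])
  end
end

# Serialize the array as a JSON array of hashes.
def save_players(path, players)
  File.write(path, JSON.generate(players.map(&:to_h)))
end

# Parse the file and turn each hash back into an object.
def load_players(path)
  JSON.parse(File.read(path)).map { |hash| Player.from_h(hash) }
end
```

Nested objects would need the same treatment in their own to_h and from_h methods, which is the price of a format that is readable and independent of the Ruby version.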
Finally managed to find a solution that worked!
dump = Marshal.dump(myArray)
File.write('./myarray', dump, mode: 'wb')
dump = File.binread('./myarray')
myArray = Marshal.restore(dump)
Marshal was able to do the trick after switching the file to binary mode.

How to add/read rows from Ruby CSV instance

Though it seems far more common for people to use the Ruby CSV class methods, I have an occasion to use a CSV instance, but it seems completely uncooperative.
What I'd like to do is create a CSV instance, add some rows to it, then be able to retrieve all those rows and write them to a file. Sadly, the following code doesn't work as I would like at all.
require 'csv'
csv = CSV.new('', headers: ['name', 'age'])
csv.read # Apparently I need to do this so that the headers are actually read in.
csv.add_row(['john', '22'])
csv.add_row(['jane', '24'])
csv.read
csv.to_a
csv.to_s
All I want is to be able to retrieve the information I put into the csv and then write that to a file, but I can't seem to do that :/
What am I doing wrong?
You need to use CSV#rewind
Here is the sample:
require 'csv'
csv = CSV.new(File.new("data1.csv", "w+"), headers: ['name', 'age'], write_headers: true) # "w+" creates the file if it is missing
csv.add_row(['john', '22'])
csv.add_row(['jane', '24'])
p csv.to_a # Empty array
csv.rewind
p csv.to_a # Array with three CSV::Row objects (including header)
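If the goal is only to collect rows through a CSV instance and then write them out, another option is to back the instance with an in-memory buffer and take the text from the buffer. This is a minimal sketch, assuming a hypothetical build_csv helper and a StringIO in place of a real file.

```ruby
require 'csv'
require 'stringio'

# Build CSV text through a CSV instance backed by an in-memory buffer.
def build_csv(rows)
  io = StringIO.new
  csv = CSV.new(io, headers: %w[name age], write_headers: true)
  rows.each { |row| csv << row }
  io.string # everything the instance has written so far
end

text = build_csv([%w[john 22], %w[jane 24]])
puts text
# File.write('people.csv', text) # hypothetical output path
```

Reading the buffer directly sidesteps the need to rewind, because nothing is parsed back through the CSV instance.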

Ruby equivalent to Python's DictWriter?

I have a Ruby script that goes through a CSV, determines some information, and then puts out a resulting CSV file. In Python, I'm able to open both my source file and my results file with DictReader and DictWriter respectively and write rows as dictionaries, where keys are the file header values. It doesn't appear that there is a manageable way to do this in Ruby, but I'm hoping somebody can point me to a better solution than storing all of my result hashes in an array and writing them after the fact.
The standard library "CSV" gives rows hash-like behavior when headers are enabled.
require 'csv'
CSV.open("file.csv", "wb") do |csv_out|
CSV.foreach("test.csv", headers: true) do |row|
row["header2"].upcase! # hashlike behaviour
row["new_header"] = 12 # add a new column
csv_out << row
end
end
(test.csv has a header1, a header2 and some random comma separated string lines.)
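For the DictWriter side specifically, a CSV instance that was given headers also accepts a Hash per row and orders the values by those headers. Below is a minimal sketch, assuming a recent version of the csv library; the transform_csv helper, the column names, and the derived "adult" column are invented for illustration, and strings are used instead of files to keep it self-contained.

```ruby
require 'csv'

# Read rows as hashes, derive a result hash per row, and stream it straight out.
def transform_csv(input_text)
  CSV.generate(headers: %w[name age adult], write_headers: true) do |out|
    CSV.parse(input_text, headers: true) do |row|
      record = row.to_h # e.g. {"name"=>"john", "age"=>"22"}
      record['adult'] = record['age'].to_i >= 18
      out << record # a Hash row is written in header order
    end
  end
end

puts transform_csv("name,age\njohn,22\nkid,9\n")
```

CSV.open and CSV.foreach take the same headers options, so the same pattern applies when both ends are files.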

CSV.generate and converters?

I'm trying to create a converter to remove newline characters from CSV output.
I've got:
nonewline=lambda do |s|
s.gsub(/(\r?\n)+/,' ')
end
I've verified that this works properly IF I load a variable and then run something like:
csv=CSV(variable,:converters=>[nonewline])
However, I'm attempting to use this code to update a bunch of preexisting code using CSV.generate, and it does not appear to work at all.
CSV.generate(:converters=>[nonewline]) do |csv|
csv << ["hello\ngoodbye"]
end
returns:
"\"hello\ngoodbye\"\n"
I've tried quite a few things as well as trying other examples I've found online, and it appears as though :converters has no effect when used with CSV.generate.
Is this correct, or is there something I'm missing?
You need to write your converter as below:
CSV::Converters[:nonewline] = lambda do |s|
s.gsub(/(\r?\n)+/,' ')
end
Then do :
CSV.generate(:converters => [:nonewline]) do |csv|
csv << ["hello\ngoodbye"]
end
See the documentation for CSV::Converters.
I have left the part above in place to show how to register a named custom converter. Passing a lambda directly in :converters is also accepted, so that was not the actual problem.
Now read the documentation of CSV::generate:
This method wraps a String you provide, or an empty default String, in a CSV object which is passed to the provided block. You can use the block to append CSV rows to the String and when the block exits, the final String will be returned.
After reading the docs, it is quite clear that this method is for writing CSV, not for reading it. All the converter options (such as :converters and :header_converters) are applied when you are reading CSV data, but not when you are writing it.
Let me show you 2 examples to illustrate this more clearly.
require 'csv'
string = <<_
foo,bar
baz,quack
_
File.write('a',string)
CSV::Converters[:upcase] = lambda do |s|
s.upcase
end
I am reading from a CSV file, so the :converters option is applied to it.
CSV.open('a','r',:converters => :upcase) do |csv|
puts csv.read
end
output
# >> FOO
# >> BAR
# >> BAZ
# >> QUACK
Now I am writing into the CSV file, so the :converters option is not applied.
CSV.open('a','w',:converters => :upcase) do |csv|
csv << ['dog','cat']
end
CSV.read('a') # => [["dog", "cat"]]
Attempting to remove newlines using :converters did not work.
I had to override the << method from csv.rb adding the following code to it:
# Change all CR/NL's into one space
row.map! { |element|
if element.is_a?(String)
element.gsub(/(\r?\n)+/,' ')
else
element
end
}
Placed right before
output = row.map(&@quote).join(@col_sep) + @row_sep # quote and separate
at line 21.
I would think this would be a good patch to CSV, as newlines will always produce bad CSV output.
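A less invasive alternative to editing csv.rb is to clean each row before appending it. This is a minimal sketch using a hypothetical strip_newlines helper that applies the same substitution as the converter in the question.

```ruby
require 'csv'

# Replace runs of CR/LF with a single space in String fields; leave other values alone.
def strip_newlines(row)
  row.map { |field| field.is_a?(String) ? field.gsub(/(\r?\n)+/, ' ') : field }
end

output = CSV.generate do |csv|
  csv << strip_newlines(["hello\ngoodbye", 42])
end
puts output
```

Because the helper runs before the row reaches the CSV writer, it works the same way with CSV.generate, CSV.open, or a CSV instance.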

Ruby: how can use the dump method to output data to a csv file?

I am trying to use the Ruby standard CSV library to dump an array of objects to a CSV file called 'a.csv'.
http://ruby-doc.org/stdlib-1.9.3/libdoc/csv/rdoc/CSV.html#method-c-dump
dump(ary_of_objs, io = "", options = Hash.new)
but with this method, how can I dump into a file?
There are no examples of this in the docs, and I could not find one by searching either.
Also, the docs said that...
The next method you can provide is an instance method called
csv_headers(). This method is expected to return the second line of
the document (again as an Array), which is to be used to give each
column a header. By default, ::load will set an instance variable if
the field header starts with an @ character or call send() passing the
header as the method name and the field value as an argument. This
method is only called on the first object of the Array.
Does anyone know how to provide the instance method csv_headers() to this dump function?
I haven't tested this out yet, but it looks like io should be set to a file. According to the doc you linked "The io parameter can be used to serialize to a File"
Something like:
File.open("filename", "w") do |f|
CSV.dump(ary_of_objs, f)
end
The accepted answer doesn't really answer the question so I thought I'd give a useful example.
First of all if you look at the docs at http://ruby-doc.org/stdlib-1.9.3/libdoc/csv/rdoc/CSV.html, if you hover over the method name for dump you see you can click to show source. If you do that you'll see that the dump method attempts to call csv_headers on the first object you pass in from ary_of_objs:
obj_template = ary_of_objs.first
...snip...
headers = obj_template.csv_headers
Then later you see that the method will call csv_dump on each object in ary_of_objs and pass in the headers:
ary_of_objs.each do |obj|
begin
csv << obj.csv_dump(headers)
rescue NoMethodError
csv << headers.map do |var|
if var[0] == ?@
obj.instance_variable_get(var)
else
obj[var[0..-2]]
end
end
end
end
So we need to augment each entry in ary_of_objs to respond to those two methods. Here's an example wrapper class that takes a Hash, returns the hash keys as the CSV headers, and can then dump each row based on the headers.
class CsvRowDump
def initialize(row_hash)
@row = row_hash
end
def csv_headers
@row.keys
end
def csv_dump(headers)
headers.map { |h| @row[h] }
end
end
There's one more catch though. This dump method wants to write an extra line at the top of the CSV file before the headers, and there's no way to skip that if you call this method due to this code at the top:
# write meta information
begin
csv << obj_template.class.csv_meta
rescue NoMethodError
csv << [:class, obj_template.class]
end
Even if you return '' from CsvRowDump.csv_meta, that will still be a blank line where a parser expects the headers. So instead let's let dump write that line and then remove it afterwards when we call dump. This example assumes you have an array of hashes that all have the same keys (which will be the CSV headers).
@rows = @hashes.map { |h| CsvRowDump.new(h) }
File.open(@filename, "wb") do |f|
str = CSV::dump(@rows)
f.write(str.split(/\n/)[1..-1].join("\n"))
end
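If the meta line is not wanted at all, the same result can be produced without dump by writing the header and rows directly. The following is a minimal sketch, assuming a hypothetical Item class and objects_to_csv helper; like the fallback branch of dump shown above, it reads the values through instance_variable_get.

```ruby
require 'csv'

# Hypothetical class standing in for the objects to export.
class Item
  def initialize(name, price)
    @name = name
    @price = price
  end
end

# Use the first object's instance variables as headers, then emit one row per object.
def objects_to_csv(objects)
  ivars = objects.first.instance_variables
  CSV.generate do |csv|
    csv << ivars.map { |iv| iv.to_s.sub(/\A@/, '') } # header row without the "@"
    objects.each do |obj|
      csv << ivars.map { |iv| obj.instance_variable_get(iv) }
    end
  end
end

puts objects_to_csv([Item.new('apple', 1.5), Item.new('pear', 2)])
```

The resulting string can be written with File.write('a.csv', ...), and no line has to be stripped afterwards.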
