Ruby - Merge CSV duplicate columns with same SKU

I have created a CSV file about my eshop that contains multiple items with different SKUs. Some SKUs appear more than once because they can be in more than one category (but the Title and Price will always be the same for a given SKU). Example:
SKU,Title,Category,Price
001,Soap,Bathroom,0.5
001,Soap,Kitchen,0.5
002,Water,Kitchen,0.4
002,Water,Garage,0.4
003,Juice,Kitchen,0.8
From that file I now wish to create another CSV file that has no duplicate SKUs and aggregates the "Category" values as follows:
SKU,Title,Category,Price
001,Soap,Bathroom/Kitchen,0.5
002,Water,Kitchen/Garage,0.4
003,Juice,Kitchen,0.8
How can I do that?

It's my understanding that you wish to read a CSV file, perform some operations on the data and then write the result to a new CSV file. You could do that as follows.
Code
require 'csv'
def convert(csv_file_in, csv_file_out, group_field, aggregate_field)
csv = CSV.read(csv_file_in, headers: true)
headers = csv.headers
arr = csv.group_by { |row| row[group_field] }.
map do |_,a|
headers.map { |h| h==aggregate_field ?
(a.map { |row| row[aggregate_field] }.join('/')) : a.first[h] }
end
CSV.open(csv_file_out, "wb") do |csv|
csv << headers
arr.each { |row| csv << row }
end
end
Example
Let's create a CSV file with the following data:
s =<<_
SKU,Title,Category,Price
001,Soap,Bathroom,0.5
001,Soap,Kitchen,0.5
002,Water,Kitchen,0.4
002,Water,Garage,0.4
003,Juice,Kitchen,0.8
_
FNameIn = 'testin.csv'
FNameOut = 'testout.csv'
IO.write(FNameIn, s)
#=> 135
Now execute the method with these values:
convert(FNameIn, FNameOut, "SKU", "Category")
and confirm FNameOut was written correctly:
puts IO.read(FNameOut)
SKU,Title,Category,Price
001,Soap,Bathroom/Kitchen,0.5
002,Water,Kitchen/Garage,0.4
003,Juice,Kitchen,0.8
Explanation
The steps are as follows:
csv_file_in = FNameIn
csv_file_out = FNameOut
group_field = "SKU"
aggregate_field = "Category"
csv = CSV.read(FNameIn, headers: true)
See CSV::read.
headers = csv.headers
#=> ["SKU", "Title", "Category", "Price"]
h = csv.group_by { |row| row[group_field] }
#=> {"001"=>[
#<CSV::Row "SKU":"001" "Title":"Soap" "Category":"Bathroom" "Price":"0.5">,
# #<CSV::Row "SKU":"001" "Title":"Soap" "Category":"Kitchen" "Price":"0.5">
# ],
# "002"=>[
# #<CSV::Row "SKU":"002" "Title":"Water" "Category":"Kitchen" "Price":"0.4">,
# #<CSV::Row "SKU":"002" "Title":"Water" "Category":"Garage" "Price":"0.4">
# ],
# "003"=>[
# #<CSV::Row "SKU":"003" "Title":"Juice" "Category":"Kitchen" "Price":"0.8">
# ]
# }
arr = h.map do |_,a|
headers.map { |h| h==aggregate_field ?
(a.map { |row| row[aggregate_field] }.join('/')) : a.first[h] }
end
#=> [["001", "Soap", "Bathroom/Kitchen", "0.5"],
# ["002", "Water", "Kitchen/Garage", "0.4"],
# ["003", "Juice", "Kitchen", "0.8"]]
See CSV#headers and Enumerable#group_by, an oft-used method. Lastly, write the output file:
CSV.open(FNameOut, "wb") do |csv|
csv << headers
arr.each { |row| csv << row }
end
See CSV::open. Now let's return to the calculation of arr. This is most easily explained by inserting some puts statements and executing the code.
arr = h.map do |_,a|
puts " _=#{_}"
puts " a=#{a}"
headers.map do |h|
puts " header=#{h}"
if h==aggregate_field
a.map { |row| row[aggregate_field] }.join('/')
else
a.first[h]
end.
tap { |s| puts " mapped to #{s}" }
end
end
See Object#tap. The following is displayed.
_=001
a=[#<CSV::Row "SKU":"001" "Title":"Soap" "Category":"Bathroom" "Price":"0.5">,
#<CSV::Row "SKU":"001" "Title":"Soap" "Category":"Kitchen" "Price":"0.5">]
header=SKU
mapped to 001
header=Title
mapped to Soap
header=Category
mapped to Bathroom/Kitchen
header=Price
mapped to 0.5
_=002
a=[#<CSV::Row "SKU":"002" "Title":"Water" "Category":"Kitchen" "Price":"0.4">,
#<CSV::Row "SKU":"002" "Title":"Water" "Category":"Garage" "Price":"0.4">]
header=SKU
mapped to 002
header=Title
mapped to Water
header=Category
mapped to Kitchen/Garage
header=Price
mapped to 0.4
_=003
a=[#<CSV::Row "SKU":"003" "Title":"Juice" "Category":"Kitchen" "Price":"0.8">]
header=SKU
mapped to 003
header=Title
mapped to Juice
header=Category
mapped to Kitchen
header=Price
mapped to 0.8
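The same grouping can be tried without touching the file system. What follows is only a minimal sketch, not part of the answer above: it parses an in-memory string with CSV.parse, the helper name merge_rows is hypothetical, and the call to uniq reflects an added assumption that a category repeated for the same SKU should appear only once.

```ruby
require 'csv'

# Hypothetical helper: collapses rows that share group_field and joins the
# distinct values of aggregate_field with '/'.
def merge_rows(csv_text, group_field, aggregate_field)
  table = CSV.parse(csv_text, headers: true)
  CSV.generate do |out|
    out << table.headers
    table.group_by { |row| row[group_field] }.each_value do |rows|
      merged = rows.first.to_h
      merged[aggregate_field] = rows.map { |r| r[aggregate_field] }.uniq.join('/')
      # Emit the fields in the original header order.
      out << table.headers.map { |h| merged[h] }
    end
  end
end

input = <<~DATA
  SKU,Title,Category,Price
  001,Soap,Bathroom,0.5
  001,Soap,Kitchen,0.5
  002,Water,Kitchen,0.4
  002,Water,Garage,0.4
  003,Juice,Kitchen,0.8
DATA

merged_csv = merge_rows(input, 'SKU', 'Category')
puts merged_csv
```

Returning a string from CSV.generate keeps the helper easy to test; writing it to disk is then a single File.write call.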

For this to be correct, we must assume that the Title and Price are always the same for a given SKU. Since Category is the only field whose values need to be merged, here is how you can do it.
Assuming this is your test.csv in the same path as the ruby script:
# test.csv
SKU,Title,Category,Price
001,Soap,Bathroom,0.5
001,Soap,Kitchen,0.5
002,Water,Kitchen,0.4
002,Water,Garage,0.4
003,Juice,Kitchen,0.8
Ruby script in same directory as your test.csv file
# fix_csv.rb
require 'csv'
rows = CSV.read 'test.csv', :headers => true
skews = rows.group_by{|row| row['SKU']}.keys.uniq
values = rows.group_by{|row| row['SKU']}
merged = skews.map do |key|
group = values.select{|k,v| k == key}.values.flatten.map(&:to_h)
category = group.map{|k,v| k['Category']}.join('/')
new_data = group[0]
new_data['Category'] = category
new_data
end
CSV.open('merged_data.csv', 'w') do |csv|
csv << merged.first.keys # writes the header row
merged.each do |hash|
csv << hash.values
end
end
puts 'see contents of merged_data.csv'

Related

Writing data into a CSV file by two different CSV files

I'm learning Ruby, I've been stuck on this for a long time, and I need some help.
I need to write one CSV file from two different CSV files. I have code that does it, but in two separate functions, and I need the data from both files together in one output.
Here is the code:
require 'csv'
class Plantas <
Struct.new( :code)
end
class Especies <
Struct.new(:id, :type, :code, :name_es, :name_ca, :name_en, :latin_name, :customer_id )
end
def ecode
f_inECODE = File.open("pflname.csv", "r") #get EPPOCODE
f_out=CSV.open("plantas.csv", "w+", :headers => true) #outputfile
f_inECODE.each_line do |line|
fields = line.split(',')
newPlant = Plantas.new
newPlant.code = fields[2].tr_s('"', '').strip #eppocode
plant = [newPlant.code] #linies a imprimir
f_out << plant
end
end
def data
f_dataspices=File.open("spices.csv", "r")
f_out=CSV.open("plantas.csv", "w+", :headers => true) #outputfile
f_dataspices.each_line do |line|
fields = line.split(',')
newEspecies = Especies.new
newEspecies.id = fields[0].tr_s('"', '').strip
newEspecies.type = fields[1].tr_s('"', '').strip
newEspecies.code = fields[2].tr_s('"', '').strip
newEspecies.name_es = fields[3].tr_s('"', '').strip
newEspecies.name_ca = fields[4].tr_s('"', '').strip
newEspecies.name_en = fields[5].tr_s('"', '').strip
newEspecies.latin_name = fields[6].tr_s('"', '').strip
newEspecies.customer_id = fields[7].tr_s('"', '').strip
especia = [newEspecies.id,newEspecies.type,newEspecies.code,newEspecies.name_es,newEspecies.name_ca,newEspecies.name_en,newEspecies.latin_name,newEspecies.customer_id]
f_out << especia
end
end
data
ecode
The desired output would look like this (species.csv + ecode.csv):
"id","type","code","name_es","name_ca","name_en","latin_name","customer_id","ecode"
7205,"DunSpecies",NULL,"0","0","0","",11630,LEECO
7437,"DunSpecies",NULL,"0","Xicoira","0","",5273,LEE3O
7204,"DunSpecies",NULL,"0","0","0","",11630,L4ECO
The actual output is this:
"id","type","code","name_es","name_ca","name_en","latin_name","customer_id"
7205,"DunSpecies",NULL,"0","0","0","",11630
7437,"DunSpecies",NULL,"0","Xicoira","0","",5273
7204,"DunSpecies",NULL,"0","0","0","",11630
(without ecode)
On one side I have the ecode and on the other the rest of the data; I just need to put them together.
I'd like to put everything in the same file (plantas.csv).
I used two different functions because I don't know how to combine everything in a single loop. I would like to do it all in one function, but I don't know how.
If someone could help me get this code into one function that writes the results to the same file, I would be very grateful.
An example of the input of the file ecode.csv (in which I just want the ecode field) is this:
"""identifier"",""datatype"",""code"",""lang"",""langno"",""preferred"",""status"",""creation"",""modification"",""country"",""fullname"",""authority"",""shortname"""
"""N1952"",""PFL"",""LEECO"",""la"",""1"",""0"",""N"",""06/06/2000"",""09/03/2010"","""",""Leea coccinea non"",""Planchon"",""Leea coccinea non"""
"""N2974"",""PFL"",""LEECO"",""en"",""1"",""0"",""N"",""06/06/2000"",""21/02/2011"","""",""west Indian holly"","""",""West Indian holly"""
An example of the input of the file data.csv (in which I want all the fields) is this:
"id","type","code","name_es","name_ca","name_en","latin_name","customer_id"
7205,"DunSpecies",NULL,"0","0","0","",11630
7437,"DunSpecies",NULL,"0","Xicoira","0","",5273
My way of linking both files is to create a third file and write everything into it.
At least this is my idea; I don't know if there is a simpler way to do it.
Thanks!
Cleaning up ecode.csv made it more challenging, but here is what I came up with:
In case data.csv and ecode.csv are matched by row number:
require 'csv'
data = CSV.read('data.csv', headers: true).to_a
headers = data.shift << 'eppocode'
double_quoted_ecode = CSV.read('ecode.csv')
ecodeIO = StringIO.new
ecodeIO.puts double_quoted_ecode.to_a
ecodeIO.rewind
ecode = CSV.parse(ecodeIO, headers: true)
CSV.open('plantas.csv', 'w+') do |plantas|
plantas << headers
data.each.with_index do |row, idx|
planta = row + [ecode['code'][idx]]
plantas << planta
end
end
Using your example files, this gives you the following plantas.csv:
id,type,code,name_es,name_ca,name_en,latin_name,customer_id,eppocode
7205,DunSpecies,NULL,0,0,0,"",11630,LEECO
7437,DunSpecies,NULL,0,Xicoira,0,"",5273,LEECO
In case entries are matched by data.csv's id and ecode.csv's identifier:
require 'csv'
data = CSV.read('data.csv', headers: true)
headers = data.headers << 'eppocode'
double_quoted_ecode = CSV.read('ecode.csv')
ecodeIO = StringIO.new
ecodeIO.puts double_quoted_ecode.to_a
ecodeIO.rewind
ecode = CSV.parse(ecodeIO, headers: true)
CSV.open('plantas.csv', 'w+') do |plantas|
plantas << headers
data.each do |row|
id = row['id']
ecode_row = ecode.find { |entry| entry['identifier'] == id } || {}
planta = row << ecode_row['code']
plantas << planta
end
end
I hope you find this helpful.
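For larger files, calling find for every data row rescans ecode each time. The following is a rough sketch of an alternative that builds a hash lookup once. The sample strings are made up for illustration (in particular, the assumption that identifier values equal data ids is mine, not something the question confirms), and the variable names are hypothetical.

```ruby
require 'csv'

# Assumed sample data: every line of ecode is itself one quoted CSV field,
# as in the question's ecode.csv.
ecode_text = <<~'ECODE'
  """identifier"",""code"""
  """7205"",""LEECO"""
  """7437"",""LEE3O"""
ECODE

data_text = <<~DATA
  id,customer_id
  7205,11630
  7437,5273
  9999,1
DATA

# The first pass unwraps the outer quoting; the second parses the real columns.
unwrapped = CSV.parse(ecode_text).map(&:first).join("\n")
ecode = CSV.parse(unwrapped, headers: true)

# Build the lookup once; the first code seen for an identifier wins.
code_by_id = ecode.each_with_object({}) do |row, index|
  index[row['identifier']] ||= row['code']
end

merged = CSV.generate do |out|
  data = CSV.parse(data_text, headers: true)
  out << data.headers + ['eppocode']
  # Rows without a matching identifier get an empty eppocode field.
  data.each { |row| out << row.fields + [code_by_id[row['id']]] }
end
puts merged
```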
Data
Let's begin by creating the two CSV files. To make the results easier to follow I have arbitrarily removed some of the fields in each file, and changed one field value.
ecode.csv
ecode = '"""identifier"",""datatype"",""code"",""lang"",""langno"",""preferred"",""status"",""creation"",""modification"",""country"",""fullname"",""authority"",""shortname"""
"""N1952"",""PFL"",""LEECO"",""la"",""1"",""0"",""N"",""06/06/2000"",""09/03/2010"","""",""Leea coccinea non"",""Planchon"",""Leea coccinea non"""
"""N2974"",""PFL"",""LEEC1"",""en"",""1"",""0"",""N"",""06/06/2000"",""21/02/2011"","""",""west Indian holly"","""",""West Indian holly"""'
File.write('ecode.csv', ecode)
#=> 452
data.csv
data = '"id","type","code","customer_id"
7205,"DunSpecies",NULL,11630
7437,"DunSpecies",NULL,,5273'
File.write('data.csv', data)
#=> 90
Code
CSV.open('plantas.csv', 'w') do |csv_out|
converter = ->(s) { s.delete('"') }
epposcode = CSV.foreach('ecode.csv',
headers:true,
header_converters: [converter],
converters: [converter]
).map { |csv| csv["code"] }
headers = CSV.open('data.csv', &:readline) << 'epposcode'
csv_out << headers
CSV.foreach('data.csv', headers:true) do |row|
csv_out << (row << epposcode.shift)
end
end
#=> 90
Result
Let's see what was written.
puts File.read('plantas.csv')
id,type,code,customer_id,epposcode
7205,DunSpecies,NULL,11630,LEECO
7437,DunSpecies,NULL,,5273,LEEC1
Explanation
The structure we want is the following.
CSV.open('plantas.csv', 'w') do |csv_out|
epposcode = <array of 'code' field values from 'ecode.csv'>
headers = <headers from 'data.csv' to which 'epposcode' is appended>
csv_out << headers
CSV.foreach('data.csv', headers:true) do |row|
csv_out << <row of 'data.csv' to which an element of epposcode is appended>
end
end
CSV::open is the main CSV method for writing files and CSV::foreach is generally my method-of-choice for reading CSV files. I could have instead written the following.
csv_out = CSV.open('plantas.csv', 'w')
epposcode = <array of 'code' field values from 'ecode.csv'>
headers = <headers from 'data.csv' to which 'epposcode' is appended>
csv_out << headers
CSV.foreach('data.csv', headers:true) do |row|
csv_out << <row of 'data.csv' to which an element of epposcode is appended>
end
csv_out.close
but using a block is convenient because the file is closed before returning from the block.
It is convenient to use a converter for both the header fields and the row fields:
converter = ->(s) { s.delete('"') }
This is a proc (I've defined a lambda) that removes double quotes from strings. It is passed as the value of two of foreach's optional arguments, header_converters and converters:
epposcode = CSV.foreach('ecode.csv',
headers:true,
header_converters: [converter],
converters: [converter]
)
Search for "Data Converters" in the CSV doc.
We invoke foreach without a block to return an enumerator, so it can be chained to map:
epposcode = CSV.foreach('ecode.csv',
headers:true,
header_converters: [converter],
converters: [converter]
).map { |csv| csv["code"] }
For the example,
epposcode
#=> ["LEECO", "LEEC1"]
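The enumerator behaviour can be checked in isolation. This is just a small sketch that uses a temporary file with plain (not doubly quoted) contents, which is an assumption made to keep the example short; the ids paired with the codes at the end are likewise made up.

```ruby
require 'csv'
require 'tempfile'

# Write a small, already-clean sample file (assumed contents).
file = Tempfile.new(['ecode', '.csv'])
file.write("identifier,code\nN1952,LEECO\nN2974,LEEC1\n")
file.close

# Without a block, foreach returns an Enumerator that can be chained.
rows = CSV.foreach(file.path, headers: true)
codes = rows.map { |row| row['code'] }
file.unlink

# zip pairs values by position without consuming codes, unlike shift.
paired = %w[7205 7437].zip(codes)
p codes
p paired
```

Using zip instead of shift leaves the codes array intact, which can be handy if it is needed again later.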

manipulating csv with ruby

I have a CSV from which I've removed the irrelevant data.
Now I need to split "Name and surname" into 2 columns by space but ignoring a 3rd column in case there are 3 names, then invert the order of the columns "Name and surname" and "Phone" (phone first) and then put them into a file ignoring the headers. I've never actually learned Ruby but I've played with Python 10 years ago. Can you help me? This is what I was able to do until now:
E.g.
require 'csv'
csv_table = CSV.read(ARGV[0], :headers => true)
keep = ["Name and surname", "Phone", "Email"]
new_csv_table = csv_table.by_col!.delete_if do |column_name,column_values|
!keep.include? column_name
end
new_csv_table.to_csv
Begin by creating a CSV file.
str =<<~END
Name and surname,Phone,Email
John Doe,250-256-3145,John@Doe.com
Marsha Magpie,250-256-3154,Marsha@Magpie.com
END
File.write('t_in.csv', str)
#=> 109
Initially, let's read the file, add two columns, "Name" and "Surname", and optionally delete the column, "Name and surname", without regard to column order.
First read the file into a CSV::Table object.
require 'csv'
tbl = CSV.read('t_in.csv', headers: true)
#=> #<CSV::Table mode:col_or_row row_count:3>
Add the new columns.
tbl.each do |row|
row["Name"], row["Surname"] = row["Name and surname"].split
end
#=> #<CSV::Table mode:col_or_row row_count:3>
Note that if row["Name and surname"] had equaled “John Paul Jones”, we would have obtained row["Name"] #=> “John” and row["Surname"] #=> “Paul”.
If the column "Name and surname" is no longer required we can delete it.
tbl.delete("Name and surname")
#=> ["John Doe", "Marsha Magpie"]
Write tbl to a new CSV file.
CSV.open('t_out.csv', "w") do |csv|
csv << tbl.headers
tbl.each { |row| csv << row }
end
#=> #<CSV::Table mode:col_or_row row_count:3>
Let's see what was written.
puts File.read('t_out.csv')
displays
Phone,Email,Name,Surname
250-256-3145,John@Doe.com,John,Doe
250-256-3154,Marsha@Magpie.com,Marsha,Magpie
Now let's rearrange the order of the columns.
header_order = ["Phone", "Name", "Surname", "Email"]
CSV.open('t_out.csv', "w") do |csv|
csv << header_order
tbl.each { |row| csv << header_order.map { |header| row[header] } }
end
#=> #<CSV::Table mode:col_or_row row_count:3>
puts File.read('t_out.csv')
displays
Phone,Name,Surname,Email
250-256-3145,John,Doe,John@Doe.com
250-256-3154,Marsha,Magpie,Marsha@Magpie.com
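The question also asks for the header row to be left out and for a third name to be ignored. Here is a compact sketch of that variant, working on an in-memory string. The sample rows and addresses are invented, and keeping the first two words of the name is my reading of "ignoring a 3rd column", not something the question settles.

```ruby
require 'csv'

input = <<~DATA
  Name and surname,Phone,Email
  John Paul Jones,250-256-3145,john@example.com
  Marsha Magpie,250-256-3154,marsha@example.com
DATA

output = CSV.generate do |csv|
  CSV.parse(input, headers: true).each do |row|
    # Keep only the first two words of the name; any third word is dropped.
    name, surname = row['Name and surname'].split.first(2)
    # Phone first, and no header row is written.
    csv << [row['Phone'], name, surname, row['Email']]
  end
end
puts output
```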

How to call hash values outside class from defined hash map inside class methods?

Read a csv format file and construct a new class with the name of the file dynamically. So if the csv is persons.csv, the ruby class should be person, if it's places.csv, the ruby class should be places
Also create methods for reading and displaying each value in "csv" file and values in first row of csv file will act as name of the function.
Construct an array of objects and associate each object with the row of a csv file. For example the content of the csv file could be
name,age,city
abd,45,TUY
kjh,65,HJK
Previous code :
require 'csv'
class Feed
def initialize(source_name, column_names = [])
if column_names.empty?
column_names = CSV.open(source_name, 'r', &:first)
end
columns = column_names.reduce({}) { |columns, col_name| columns[col_name] = []; columns }
define_singleton_method(:columns) { column_names }
column_names.each do |col_name|
define_singleton_method(col_name.to_sym) { columns[col_name] }
end
CSV.foreach(source_name, headers: true) do |row|
column_names.each do |col_name|
columns[col_name] << row[col_name]
end
end
end
end
feed = Feed.new('input.csv')
puts feed.columns #["name", "age", "city"]
puts feed.name # ["abd", "kjh"]
puts feed.age # ["45", "65"]
puts feed.city # ["TUY", "HJK"]
I am trying to refine this solution using class methods and to split the code into smaller methods. I want to access the values outside the class by key name, but I get errors such as "undefined method `age' for Feed:Class". Is there a way I can access the values outside the class?
My solution looks like -
require 'csv'
class Feed
attr_accessor :column_names
def self.col_name(source_name, column_names = [])
if column_names.empty?
@column_names = CSV.open(source_name, :headers => true)
end
columns = @column_names.reduce({}) { |columns, col_name| columns[col_name] = []; columns }
end
def self.get_rows(source_name)
col_name(source_name, column_names = [])
define_singleton_method(:columns) { column_names }
column_names.each do |col_name|
define_singleton_method(col_name.to_sym) { columns[col_name] }
end
CSV.foreach(source_name, headers: true) do |row|
@column_names.each do |col_name|
columns[col_name] << row[col_name]
end
end
end
end
obj = Feed.new
Feed.get_rows('Input.csv')
puts obj.class.columns
puts obj.class.name
puts obj.class.age
puts obj.class.city
Expected Result -
input = Input.new
p input.name # ["abd", "kjh"]
p input.age # ["45", "65"]
input.name ='XYZ' # Value must be appended to array
input.age = 25
p input.name # ["abd", "kjh", "XYZ"]
p input.age # ["45", "65", "25"]
Let's create the CSV file.
str =<<END
name,age,city
abd,45,TUY
kjh,65,HJK
END
FName = 'temp/persons.csv'
File.write(FName, str)
#=> 36
Now let's create a class:
klass = Class.new
#=> #<Class:0x000057d0519de8a0>
and name it:
class_name = File.basename(FName, ".csv").capitalize
#=> "Persons"
Object.const_set(class_name, klass)
#=> Persons
Persons.class
#=> Class
See File::basename, String#capitalize and Module#const_set.
Next read the CSV file with headers into a CSV::Table object:
require 'csv'
csv = CSV.read(FName, headers: true)
#=> #<CSV::Table mode:col_or_row row_count:3>
csv.class
#=> CSV::Table
See CSV::read. We may now create the methods name, age and city.
csv.headers.each { |header| klass.define_method(header) { csv[header] } }
See CSV::Table#headers, Module#define_method and CSV::Table#[].
We can now confirm they work as intended:
k = klass.new
k.name
#=> ["abd", "kjh"]
k.age
#=> ["45", "65"]
k.city
#=> ["TUY", "HJK"]
or
p = Persons.new
#=> #<Persons:0x0000598dc6b01640>
p.name
#=> ["abd", "kjh"]
and so on.
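The expected result in the question also includes setters that append to a column. One way this might be done is sketched below. The helper name build_feed_class is hypothetical, the CSV is passed as a string to keep the example self-contained, and converting appended values with to_s is an assumption based on the expected output showing "25" as a string.

```ruby
require 'csv'

# Hypothetical helper: defines a class named class_name whose instances
# expose one reader and one appending writer per CSV column.
def build_feed_class(class_name, csv_text)
  table = CSV.parse(csv_text, headers: true)
  klass = Class.new do
    define_method(:initialize) do
      # Each instance gets its own copy of the column arrays.
      @columns = table.headers.map { |h| [h, table[h]] }.to_h
    end
    table.headers.each do |header|
      define_method(header) { @columns[header] }
      # The writer appends rather than replaces, as the question requests.
      define_method("#{header}=") { |value| @columns[header] << value.to_s }
    end
  end
  Object.const_set(class_name, klass)
end

build_feed_class('Input', "name,age,city\nabd,45,TUY\nkjh,65,HJK\n")
input = Input.new
input.name = 'XYZ'
input.age = 25
p input.name
p input.age
```

Because the columns live in an instance variable, values appended through one object do not leak into other instances.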

Ruby iterating over hash of hash

I have the following array and am struggling to format it for my needs.
consolidated = [
{:name=>"Bob", :details=>{"work"=>"Carpenter", "age"=>"26", "Experience"=>"6"} },
{:name=>"Colin", :details=>{"work"=>"painting", "age"=>"20", "Experience"=>"4"} }
]
I am trying to format it as below:
Bob work Carpenter
age 26
Experience 6
Colin work painting
age 20
Experience 4
I tried the following:
require 'csv'
CSV.open("output.csv", "wb") do |csv|
csv << ["name", "nature", "details"]
consolidated.each do |val|
csv << [val[:name], val[:details]]
end
end
#=> [{:name=>"Bob", :details=>{"work"=>"Carpenter", "age"=>"26", "Experience"=>"6"}},
# {:name=>"Colin", :details=>{"work"=>"painting", "age"=>"20", "Experience"=>"4"}}]
but it prints the following
name nature details
Bob "work"=>"Carpenter", "age"=>"26", "Experience"=>"6"
Colin "work"=>"painting", "age"=>"20", "Experience"=>"4"
I'm not exactly sure how to iterate hash of hash from the 1st loop only to get the expected format.
Thanks.
Here's something to get you started:
require 'csv'
data = [
{:name => "Bob", :details=>{"work"=>"Carpenter", "age"=>"26", "Experience"=>"6"}},
{:name => "Colin", :details=>{"work"=>"painting", "age"=>"20", "Experience"=>"4"}}
]
str = CSV.generate do |csv|
data.each do |datum|
datum[:details].each do |detail_key, detail_value|
csv << [datum[:name], detail_key, detail_value]
end
end
end
puts str
# >> Bob,work,Carpenter
# >> Bob,age,26
# >> Bob,Experience,6
# >> Colin,work,painting
# >> Colin,age,20
# >> Colin,Experience,4
Simply iterate all details and emit a new row for each key-value pair there, adding a name of a person.
This will get you almost what you need. Missing only blank rows between sections and person's name is duplicated on each line. It'll be your homework to find out how to add those improvements.
I don't know about CSV generation (so, assuming it works as you have written), you can iterate on your object this way:
consolidated = [{:name => "Bob", :details=>{"work"=>"Carpenter", "age"=>"26", "Experience"=>"6"}}, {:name => "Colin", :details=> {"work"=>"painting", "age"=>"20", "Experience"=>"4"}}]
CSV.open("output.csv", "wb") do |csv|
csv << ["name", "nature", "details"]
consolidated.each do |val|
details = val[:details]
nature_1 = details.keys.first
detail_1 = details.delete(nature_1)
csv << [val[:name], nature_1, detail_1]
details.each do |k, v|
csv << [nil, k, v]
end
end
end
Note: This will corrupt your original data array consolidated. So, if you want to preserve it, dup it first. Or modify the logic to not delete the first key-value from val[:details].
You need to iterate over the embedded hash with the each_pair iterator.
Something like this:
require 'csv'
data = [{:name => "Bob", :details=>{"work"=>"Carpenter", "age"=>"26", "Experience"=>"6"}}]
CSV.open("output.csv", "wb") do |csv|
  csv << ["name", "nature", "details"]
  data.each do |val|
    # the first row carries the name together with the 'work' pair
    csv << [ val[:name], 'work', val[:details]['work'] ]
    val[:details].each_pair do |key, value|
      # here we have to drop the 'work' pair because it was used earlier
      next if key == 'work'
      csv << [ "", key, value ]
    end
  end
end
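For reference, here is one possible sketch that produces the layout asked for in the question: the name only on the first row of each person and a blank row between people. It builds a string with CSV.generate and does not modify the input array. The variable name report is arbitrary.

```ruby
require 'csv'

consolidated = [
  { name: 'Bob',   details: { 'work' => 'Carpenter', 'age' => '26', 'Experience' => '6' } },
  { name: 'Colin', details: { 'work' => 'painting',  'age' => '20', 'Experience' => '4' } }
]

report = CSV.generate do |csv|
  consolidated.each_with_index do |person, i|
    # An empty row separates one person's section from the next.
    csv << [] unless i.zero?
    person[:details].each_with_index do |(key, value), j|
      # Only the first detail row of a section carries the name.
      csv << [j.zero? ? person[:name] : nil, key, value]
    end
  end
end
puts report
```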

Output array to CSV in Ruby

It's easy enough to read a CSV file into an array with Ruby but I can't find any good documentation on how to write an array into a CSV file. Can anyone tell me how to do this?
I'm using Ruby 1.9.2 if that matters.
To a file:
require 'csv'
CSV.open("myfile.csv", "w") do |csv|
csv << ["row", "of", "CSV", "data"]
csv << ["another", "row"]
# ...
end
To a string:
require 'csv'
csv_string = CSV.generate do |csv|
csv << ["row", "of", "CSV", "data"]
csv << ["another", "row"]
# ...
end
Here's the current documentation on CSV: http://ruby-doc.org/stdlib/libdoc/csv/rdoc/index.html
If you have an array of arrays of data:
rows = [["a1", "a2", "a3"],["b1", "b2", "b3", "b4"], ["c1", "c2", "c3"]]
Then you can write this to a file with the following, which I think is much simpler:
require "csv"
File.write("ss.csv", rows.map(&:to_csv).join)
I've got this down to just one line.
rows = [['a1', 'a2', 'a3'],['b1', 'b2', 'b3', 'b4'], ['c1', 'c2', 'c3'], ... ]
csv_str = rows.inject([]) { |csv, row| csv << CSV.generate_line(row) }.join("")
#=> "a1,a2,a3\nb1,b2,b3,b4\nc1,c2,c3\n"
Do all of the above and save to a csv, in one line.
File.open("ss.csv", "w") {|f| f.write(rows.inject([]) { |csv, row| csv << CSV.generate_line(row) }.join(""))}
NOTE:
To convert an active record database to csv would be something like this I think
CSV.open(fn, 'w') do |csv|
csv << Model.column_names
Model.where(query).each do |m|
csv << m.attributes.values
end
end
Hmm @tamouse, that gist is somewhat confusing to me without reading the csv source, but generically, assuming each hash in your array has the same number of k/v pairs & that the keys are always the same, in the same order (i.e. if your data is structured), this should do the deed:
CSV.open(fn, 'w') do |csv|
  hsh_ary.each_with_index do |hsh, rowid|
    # add the header row (column labels) once, before the first data row
    csv << hsh.keys if rowid.zero?
    csv << hsh.values
  end # of hsh's (rows)
end # of csv open
If your data isn't structured this obviously won't work
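If the keys are the same but their order varies between hashes, the headers option can take care of the ordering. This is only a sketch with made-up sample data: it assumes string keys and takes the column order from the first hash.

```ruby
require 'csv'

# Assumed sample data; note the second hash lists its keys in another order.
hsh_ary = [
  { 'name' => 'abd', 'age' => 45 },
  { 'age' => 65, 'name' => 'kjh' }
]

headers = hsh_ary.first.keys
csv_text = CSV.generate(headers: headers, write_headers: true) do |csv|
  # With headers set, a Hash row is written in header order.
  hsh_ary.each { |hsh| csv << hsh }
end
puts csv_text
```

A hash that lacks one of the header keys simply yields an empty field in that column.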
If anyone is interested, here are some one-liners (and a note on loss of type information in CSV):
require 'csv'
rows = [[1,2,3],[4,5]] # [[1, 2, 3], [4, 5]]
# To CSV string
csv = rows.map(&:to_csv).join # "1,2,3\n4,5\n"
# ... and back, as String[][]
rows2 = csv.split("\n").map(&:parse_csv) # [["1", "2", "3"], ["4", "5"]]
# File I/O:
filename = '/tmp/vsc.csv'
# Save to file -- answer to your question
IO.write(filename, rows.map(&:to_csv).join)
# Read from file
# rows3 = IO.read(filename).split("\n").map(&:parse_csv)
rows3 = CSV.read(filename)
rows3 == rows2 # true
rows3 == rows # false
Note: CSV loses all type information, you can use JSON to preserve basic type information, or go to verbose (but more easily human-editable) YAML to preserve all type information -- for example, if you need date type, which would become strings in CSV & JSON.
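For numbers specifically, part of the type information can be recovered on reading with the built-in :numeric converter. A brief sketch with arbitrary sample values:

```ruby
require 'csv'

rows = [[1, 2, 3], [4.5, 'text']]
csv_text = rows.map(&:to_csv).join

# Without converters every field comes back as a String.
plain = CSV.parse(csv_text)
# :numeric turns integer-like and float-like fields back into numbers.
typed = CSV.parse(csv_text, converters: :numeric)

p plain
p typed
```

This only helps for values that look like numbers; dates and other types would still come back as strings unless further converters are added.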
Building on @boulder_ruby's answer, this is what I'm looking for, assuming us_eco contains the CSV table as from my gist.
CSV.open('outfile.txt','wb', col_sep: "\t") do |csvfile|
csvfile << us_eco.first.keys
us_eco.each do |row|
csvfile << row.values
end
end
Updated the gist at https://gist.github.com/tamouse/4647196
Struggling with this myself. This is my take:
https://gist.github.com/2639448:
require 'csv'
class CSV
def CSV.unparse array
CSV.generate do |csv|
array.each { |i| csv << i }
end
end
end
CSV.unparse [ %w(your array), %w(goes here) ]
