Replacing text in one CSV column using FasterCSV - ruby

Being relatively new to Ruby, I am trying to figure out how to do the following using FasterCSV:
Open a CSV file, pick a column by its header, in this column only replace all occurrences of string x with y, write out the new file to STDOUT.
The following code almost works:
filename = ARGV[0]
csv = FCSV.read(filename, :headers => true, :header_converters => :symbol, :return_headers => true, :encoding => 'u')
mycol = csv[:mycol]
# construct a mycol_new by iterating over mycol and doing some string replacement
puts csv[:mycol][0] # produces "MyCol" as expected
puts mycol_new[0] # produces "MyCol" as expected
csv[:mycol] = mycol_new
puts csv[:mycol][0] # produces "mycol" while "MyCol" is expected
csv.each do |r|
puts r.to_csv(:force_quotes => true)
end
The only problem is that there is a header conversion where I do not expect it. If the header of the chosen column is "MyCol" before the substitution of the columns in the csv table it is "mycol" afterwards (see comments in the code). Why does this happen? And how to avoid it? Thanks.

There are a couple of things you can change in the initialization line that will help. Change:
csv = FCSV.read(filename, :headers => true, :return_headers => true, :encoding => 'u')
to:
csv = FCSV.read(filename, :headers => true, :encoding => 'u')
I'm using CSV, which is FasterCSV under a new name now that it is part of Ruby 1.9's standard library. The following creates a CSV file called "temp.csv" in the current directory with a modified 'FName' field:
require 'csv'
data = "ID,FName,LName\n1,mickey,mouse\n2,minnie,mouse\n3,donald,duck\n"
# read and parse the data
csv_in = CSV.new(data, :headers => true)
# open the temp file
CSV.open('./temp.csv', 'w') do |csv_out|
# output the headers embedded in the object, then rewind to the start of the list
csv_out << csv_in.first.headers
csv_in.rewind
# loop over the rows
csv_in.each do |row|
# munge the first name
if (row['FName']['mi'])
row['FName'] = row['FName'][1 .. -1] << '-' << row['FName'][0] << 'ay'
end
# output the record
csv_out << row.fields
end
end
The output looks like:
ID,FName,LName
1,ickey-may,mouse
2,innie-may,mouse
3,donald,duck

It is possible to manipulate the desired column directly in the FasterCSV object instead of creating a new column and then trying to replace the old one with the new one.
csv = FCSV.read(filename, :headers => true, :header_converters => :symbol, :return_headers => true, :encoding => 'u')
mycol = csv[:my_col]
# the strings in mycol are the table's own field objects, so gsub! changes the table in place
mycol.each do |field|
  field.gsub!(/\s*;\s*/,"///") unless field.nil? # or any other substitution
end
csv.each do |r|
  puts r.to_csv(:force_quotes => true)
end
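For a self-contained variant of the same idea, here is a minimal sketch using the standard-library CSV class (the successor to FasterCSV). The sample data and the column name MyCol are made up for illustration. Because no header converter is applied and the header row is not returned as data, the original header spelling should survive in the output.

```ruby
require 'csv'

# Hypothetical sample data; "MyCol" is the column to edit.
data = "ID,MyCol,Other\n1,a;b,x\n2,c ; d,y\n"

# Without :header_converters the header keeps its original spelling.
table = CSV.parse(data, headers: true)

table.each do |row|
  value = row['MyCol']
  # Replace only in the chosen column; skip empty (nil) fields.
  row['MyCol'] = value.gsub(/\s*;\s*/, '///') unless value.nil?
end

# Table#to_csv writes the header line followed by every row.
output = table.to_csv(force_quotes: true)
puts output
```

To read from a file instead, CSV.read(filename, headers: true) returns the same kind of table.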

Related

Filtering into a new CSV with headers in Ruby?

I have a CSV with a basic list of people, their genders, and ages, and corresponding headers:
"First Name","Age","Gender"
"Adam",31,"Male"
"Bruce",36,"Male"
"Lawrence",34,"Male"
"James",32,"Male"
"Elyse",30,"Female"
"Matt",32,"Male"
I'd like to open this CSV in Ruby, go through line by line, and append all male members to a new CSV with the same headers, and save this CSV to a new file.
My code right now (which is not working)
require 'csv'
file = 'cast.csv'
new_cast = CSV.new(:headers => CSV.read(file, :headers => :true).headers)
CSV.foreach(file, :headers => :true, :header_converters => :symbol) do |row|
if row[:gender] == 'Male'
new_cast.add_row(row)
end
end
File.open('new_cast.csv', 'w') do |f|
f.write(new_cast)
end
The error message I am receiving:
/usr/local/Cellar/ruby/2.3.0/lib/ruby/2.3.0/csv.rb:1692:in `<<': undefined method `<<' for {:headers=>["First Name", "Age", "Gender"]}:Hash (NoMethodError)
Did you mean? <
from csv.rb:8:in `block in <main>'
from /usr/local/Cellar/ruby/2.3.0/lib/ruby/2.3.0/csv.rb:1748:in `each'
from /usr/local/Cellar/ruby/2.3.0/lib/ruby/2.3.0/csv.rb:1131:in `block in foreach'
from /usr/local/Cellar/ruby/2.3.0/lib/ruby/2.3.0/csv.rb:1282:in `open'
from /usr/local/Cellar/ruby/2.3.0/lib/ruby/2.3.0/csv.rb:1130:in `foreach'
from csv.rb:6:in `<main>'
So, it seems like I'm doing something pretty wrong. What would be the simplest way to do this?
CSV.new takes a "string or IO object" as its first argument, and an optional hash as its second, per the docs.
So it looks like the error is actually caused by this line:
new_cast = CSV.new(:headers => CSV.read(file, :headers => :true).headers)
which should be
new_cast = CSV.new("", :headers => CSV.read(file, :headers => :true).headers)
Note the empty string.
But even with that, this won't write the new CSV. For that, I think you want to set write_headers: true on your new CSV, and then rewind it before writing, which exposes the underlying IO object for reading.
require 'csv'
file = 'cast.csv'
new_cast = CSV.new("", :headers => CSV.read(file, :headers => :true).headers, write_headers: true)
CSV.foreach(file, :headers => :true, :header_converters => :symbol) do |row|
if row[:gender] == 'Male'
new_cast.add_row(row)
end
end
CSV.open('new_cast.csv', 'w') do |csv|
new_cast.rewind
new_cast.each {|row| csv << row}
end
Hope that helps!
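As a shorter alternative, the filtering can be sketched without an intermediate CSV object. This is only an illustration on an in-memory string with a few of the sample rows; the variable names are made up, and for real files CSV.read and CSV.open take the same :headers and :write_headers options.

```ruby
require 'csv'

# A few rows from the question, kept in memory for the example.
input = <<~EOS
  "First Name","Age","Gender"
  "Adam",31,"Male"
  "Elyse",30,"Female"
  "Matt",32,"Male"
EOS

table = CSV.parse(input, headers: true)

# Reuse the parsed headers for the output and have them written once.
males_csv = CSV.generate(headers: table.headers, write_headers: true) do |csv|
  table.each do |row|
    csv << row if row['Gender'] == 'Male'
  end
end

puts males_csv
```

Note that the output is quoted only where CSV requires it; pass force_quotes: true to CSV.generate if the original quoting style matters.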

Ruby Read and Write CSV with Quotes

I'd like to read in a csv row, update one field then output the row again with quotes.
Row Example Input => "Joe", "Blow", "joe#blow.com"
Desired Row Example Output => "Joe", "Blow", "xxxx#xxxx.xxx"
My script below outputs => Joe, Blow, xxxx#xxxx.xxx
It loses the double quotes which I want to retain.
I've tried various options but no joy so far .. any tips?
Many thanks!
require 'csv'
CSV.foreach('transactions.csv',
:quote_char=>'"',
:col_sep =>",",
:headers => true,
:header_converters => :symbol ) do |row|
row[:customer_email] = 'xxxx#xxxx.xxx'
puts row
end
Quotes in CSV fields are usually unnecessary, unless the field itself contains a delimiter or a newline character. But you can force CSV output to always use quotes. For that, you need to set :force_quotes => true:
CSV.foreach('transactions.csv',
:quote_char=>'"',
:col_sep =>",",
:headers => true,
:force_quotes => true,
:header_converters => :symbol ) do |row|
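Since force_quotes is an option that affects how fields are written, one way to get quoted output is to pass it at the point where each row is turned back into text. The following is a minimal sketch on an in-memory string; the column names are assumed from the question and the sample row is invented.

```ruby
require 'csv'

# Invented sample input with the column used in the question.
input = "first_name,last_name,customer_email\nJoe,Blow,joe#blow.com\n"

quoted_lines = []
CSV.parse(input, headers: true, header_converters: :symbol).each do |row|
  row[:customer_email] = 'xxxx#xxxx.xxx'
  # Row#to_csv accepts output options such as :force_quotes.
  quoted_lines << row.to_csv(force_quotes: true)
end

puts quoted_lines
```

The same row.to_csv(force_quotes: true) call can be used inside a CSV.foreach block when reading from a file.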
You can manually add them to all your items
Hash[row.map { |k,v| [k,"\"#{v}\""] }]
(edited because I forgot you had a hash and not an array)
Thanks Justin L.
Built on your solution and ended up with this.
I get the feeling Ruby has something more elegant but this does what I need:
require 'csv'
CSV.foreach('trans.csv',
:quote_char=>'"',
:col_sep =>",",
:headers => true,
:header_converters => :symbol ) do |row|
row[:customer_email] = 'xxxx#xxxx.xxx'
row = Hash[row.map { |k,v| [k,"\"#{v}\""] }]
new_row = ""
row.each_with_index do | (k, v) ,i|
new_row += v.to_s
if i != row.length - 1
new_row += ','
end
end
puts new_row
end

Ruby CSV input value format

I'm using ruby CSV module to read in a csv file.
One of the values inside the CSV file has the format XXX_XXXXX, where each X is a digit. I actually treat this value as a string, but the CSV module reads these values in as numbers (XXXXXXXX), which I do not want.
Options I am currently using
f = CSV.read('file.csv', {:headers => true, :header_converters => :symbol, :converters => :all} )
Is there a way to tell CSV to not do that?
f = CSV.read('file.csv', {:headers => true, :header_converters => :symbol})
Leave out the :converters => :all; that one tries (amongst others) to convert all numerical looking strings to numbers.
The :converters => :all option causes this. Try the following:
require "csv"
CSV.parse(DATA, :col_sep => ",", :headers => true, :converters => :all).each do |row|
puts row["numfield"]
end
__END__
textfield,datetimefield,numfield
foo,2008-07-01 17:50:55.004688,123_45678
bar,2008-07-02 17:50:55.004688,234_56789
# gives
# 12345678
# 23456789
and
CSV.parse(DATA, :col_sep => ",", :headers => true).each do |row|
puts row["numfield"]
end
__END__
textfield,datetimefield,numfield
foo,2008-07-01 17:50:55.004688,123_45678
bar,2008-07-02 17:50:55.004688,234_56789
# gives
# 123_45678
# 234_56789
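If some columns should still become numbers, a custom converter is one possible middle ground. The sketch below is an assumption-laden example: the lambda name digits_only and the count column are made up, and the rule "only pure digit strings become integers" is just one choice that leaves values like 123_45678 untouched.

```ruby
require 'csv'

data = "textfield,count,numfield\nfoo,42,123_45678\nbar,7,234_56789\n"

# Hypothetical converter: only strings made up purely of digits become Integers.
digits_only = lambda do |field|
  field.is_a?(String) && field.match?(/\A\d+\z/) ? field.to_i : field
end

rows = CSV.parse(data, headers: true, converters: [digits_only])

rows.each do |row|
  # count is converted, numfield stays a String with its underscore
  p [row['count'], row['numfield']]
end
```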

How to save a hash into a CSV

I am new to Ruby, so please forgive the noobishness.
I have a CSV with two columns: one for animal name and one for animal type.
I have a hash with all the keys being animal names and the values being animal types. I would like to write the hash to the CSV without using FasterCSV. I have thought of several ideas, but I am not sure which would be easiest. Here is the basic layout.
require "csv"
def write_file
h = { 'dog' => 'canine', 'cat' => 'feline', 'donkey' => 'asinine' }
CSV.open("data.csv", "wb") do |csv|
csv << [???????????]
end
end
When I opened the file to read from it I opened it File.open("blabla.csv", headers: true)
Would it be possible to write back to the file the same way?
If you want column headers and you have multiple hashes:
require 'csv'
hashes = [{'a' => 'aaaa', 'b' => 'bbbb'}]
column_names = hashes.first.keys
s=CSV.generate do |csv|
csv << column_names
hashes.each do |x|
csv << x.values
end
end
File.write('the_file.csv', s)
(tested on Ruby 1.9.3-p429)
Try this:
require 'csv'
h = { 'dog' => 'canine', 'cat' => 'feline', 'donkey' => 'asinine' }
CSV.open("data.csv", "wb") {|csv| h.to_a.each {|elem| csv << elem} }
Will result:
1.9.2-p290:~$ cat data.csv
dog,canine
cat,feline
donkey,asinine
I think the simplest solution to your original question:
def write_file
h = { 'dog' => 'canine', 'cat' => 'feline', 'donkey' => 'asinine' }
CSV.open("data.csv", "w", headers: h.keys) do |csv|
csv << h.values
end
end
With multiple hashes that all share the same keys:
def write_file
hashes = [ { 'dog' => 'canine', 'cat' => 'feline', 'donkey' => 'asinine' },
{ 'dog' => 'rover', 'cat' => 'kitty', 'donkey' => 'ass' } ]
CSV.open("data.csv", "w", headers: hashes.first.keys) do |csv|
hashes.each do |h|
csv << h.values
end
end
end
CSV can take a hash with its keys in any order, leave fields blank for keys that are missing, and omit any keys that are not in HEADERS:
require "csv"
HEADERS = [
'dog',
'cat',
'donkey'
]
def write_file
CSV.open("data.csv", "wb", :headers => HEADERS, :write_headers => true) do |csv|
csv << { 'dog' => 'canine', 'cat' => 'feline', 'donkey' => 'asinine' }
csv << { 'dog' => 'canine'}
csv << { 'cat' => 'feline', 'dog' => 'canine', 'donkey' => 'asinine' }
csv << { 'dog' => 'canine', 'cat' => 'feline', 'donkey' => 'asinine', 'header not provided in the options to #open' => 'not included in output' }
end
end
write_file # =>
# dog,cat,donkey
# canine,feline,asinine
# canine,,
# canine,feline,asinine
# canine,feline,asinine
This makes working with the CSV class more flexible and readable.
I tried the solutions here but got an incorrect result (values in the wrong columns), since my source is an LDIF file that does not always have all the values for a key. I ended up using the following.
First, when building up the hash, I remember the keys in a separate array, which I extend with any keys that are not already there.
# building up the array of hashes
File.read(ARGV[0]).each_line do |lijn|
case
when lijn[0..2] == "dn:" # new record
record = {}
when lijn.chomp == '' # end record
if record['telephonenumber'] # valid record ?
hashes << record
keys = keys.concat(record.keys).uniq
end
when ...
end
end
The important line here is keys = keys.concat(record.keys).uniq which extends the array of keys when new keys (headers) are found.
Now the most important: converting our hashes to a CSV
CSV.open("export.csv", "w", {headers: keys, col_sep: ";"}) do |row|
row << keys # add the headers
hashes.each do |hash|
row << hash # the whole hash, not just the array of values
end
end
[BEWARE] All the answers in this thread assume that the order of the keys defined in the hash is the same for every row.
Otherwise some values can end up under the wrong keys in the CSV (a problem I am facing right now). For example,
hashes = [
{:cola => "hello", :colb => "bye"},
{:colb => "bye", :cola => "hello"}
]
produces the following table with the code from most of the answers in this thread (including the best answer):
cola | colb
-------------
hello | bye
-------------
bye | hello
You should do this instead:
require "csv"
csv_rows = [
{:cola => "hello", :colb => "bye"},
{:colb => "bye", :cola => "hello"}
]
column_names = csv_rows.first.keys
s=CSV.generate do |csv|
csv << column_names
csv_rows.each do |row|
csv << column_names.map{|column_name| row[column_name]} #To be explicit
end
end
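Another way to sidestep the ordering problem, sketched here under the assumption that the hashes may also be missing some keys, is to collect the column names from all rows and hand the hashes themselves to CSV, which then looks up each value by header. The sample rows are invented for illustration.

```ruby
require 'csv'

# Invented rows: different key order, and one row with a missing key.
rows = [
  { cola: 'hello', colb: 'bye' },
  { colb: 'bye', cola: 'hello' },
  { cola: 'only a' }
]

# Union of all keys, in first-seen order.
column_names = rows.flat_map(&:keys).uniq

csv_string = CSV.generate(headers: column_names, write_headers: true) do |csv|
  # Appending a Hash makes CSV pick the values by header, not by position.
  rows.each { |row| csv << row }
end

puts csv_string
```

Keys that a row does not have simply become empty fields.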
Try this:
require 'csv'
data = { 'one' => '1', 'two' => '2', 'three' => '3' }
CSV.open("data.csv", "a+") do |csv|
csv << data.keys
csv << data.values
end
Suppose we have a hash,
hash_1 = {1=>{:rev=>400, :d_odr=>3}, 2=>{:rev=>4003, :d_price=>300}}
The keys of hash_1 are ids (1, 2, ...) and each value is itself a hash with some of the keys :rev, :d_odr, :d_price.
Suppose we want a CSV file with these headers,
headers = ['Designer_id','Revenue','Discount_price','Impression','Designer ODR']
Then build a new array for each entry of hash_1 and append it to the CSV file,
require 'csv'
CSV.open("design_performance_data_temp.csv", "w") do |csv|
  csv << headers
  hash_1.each do |designer_id, stats|
    # missing metrics default to 0; the columns follow the order of headers
    csv << [designer_id, stats[:rev] || 0, stats[:d_price] || 0, stats[:imp] || 0, stats[:d_odr] || 0]
  end
end
The file design_performance_data_temp.csv is now saved in the current working directory.
The code above can be optimized further.

fastercsv - save object table in one go (ruby)

I read my csv using the line below
data = FCSV.table("test.csv", {:quote_char => '"', :col_sep =>',', :row_sep =>:auto, :headers => true, :return_headers => false, :header_converters => :downcase, :converters => :all} )
QUESTION
Can I save object data in the same manner (one line, one go + csv options)? see above
I sort the table (see the code below) and then I want to save it again. I couldn't work out how to save the table in one go. I know how to do it row by row, though.
array_of_arrays = data.to_a()
headers = array_of_arrays.shift # remove the headers
array_of_arrays.sort_by {|e| [e[3], e[4].to_s, e[1]]} .each {|line| p line }
array_of_arrays.insert(0,headers)
Anything I tried did not work and gave me something very similar to
csv.rb:33: syntax error, unexpected '{', expecting ')'
... FCSV.table("sorted.csv","w" {:quote_char => '"', :col_sep =...
NOTE:
Please note that I want to use all the CSV options when saving the file {:quote_char => '"', :col_sep =>',', :row_sep =>:auto, :headers => true, :return_headers => false, :header_converters => :downcase, :converters => :all}
Since you've got an array of arrays in data, it looks like you can just do:
FCSV::Table.new(data).to_csv
to get all the csv for data as a string, then output that back to the file.
Just following up on what dunedain said, the following will write the file out:
@csv = FCSV::Table.new(data).to_csv
File.open("modified_csv.csv", 'w') {|f| f.write(@csv) }
Also, the error you had in the code below is because you didn't have a comma after the "w" and before the {, but it looks like you were perhaps trying to use the reader functions instead of the writer functions:
csv.rb:33: syntax error, unexpected '{', expecting ')'
... FCSV.table("sorted.csv","w" {:quote_char => '"', :col_sep =...
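Putting the pieces together, a sort-then-save round trip might look like the sketch below. It uses the standard-library CSV class rather than FCSV, an in-memory string instead of test.csv, and invented column names; the output options shown (here only col_sep) are passed to to_csv at write time.

```ruby
require 'csv'

# Invented sample data standing in for test.csv.
input = "name,group,score\nbob,b,2\nal,a,3\ncy,a,1\n"

table = CSV.parse(input, headers: true, converters: :numeric)

# Sorting a table yields an Array of rows, so wrap it in a new table.
sorted_rows = table.sort_by { |row| [row['group'], row['score']] }
sorted_table = CSV::Table.new(sorted_rows)

# One call turns the whole table, headers included, into CSV text.
csv_text = sorted_table.to_csv(col_sep: ';')
puts csv_text

# Saving is then a single line, for example:
# File.write('sorted.csv', csv_text)
```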

Resources