How to modify an existing CSV column in Ruby?

I have a CSV file in which some fields are blank. I want to fill these fields with a value. I tried the following, but with no luck.
CSV.foreach("/home/mypc/Desktop/data.csv", { encoding: "UTF-8", headers: true, header_converters: :symbol, converters: :all}).with_index do |row,i|
if row[:images].nil?
row[:images] << ['im1']
end
end

row[:images] ||= "im1"
will set the images cell of your row to "im1" if it's empty.
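As a minimal sketch of why `||=` works where `<<` did not (the in-memory sample data is made up for illustration): an empty unquoted cell is read as nil, which has no `<<` method, while `||=` assigns only when the current value is nil.

```ruby
require 'csv'

# Hypothetical sample; the first data row has an empty images cell.
sample = "images,name\n,name1\nim2,name2\n"

rows = CSV.parse(sample, headers: true, header_converters: :symbol).map do |row|
  # The empty cell is nil, so ||= fills it in;
  # a cell that already holds a value is left alone.
  row[:images] ||= 'im1'
  row
end

filled_images = rows.map { |row| row[:images] }
p filled_images
```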
If you want to write a new CSV file with updated cells, you could use:
require 'csv'
write_parameters = { write_headers: true, headers: %w(images name) }
read_parameters = { encoding: 'UTF-8',
                    headers: true,
                    header_converters: :symbol,
                    converters: :all }
# The double splat passes each hash as keyword options, which Ruby 3 requires.
CSV.open('new_data.csv', 'w+', **write_parameters) do |new_csv|
  CSV.foreach('data.csv', **read_parameters) do |row|
    row[:images] ||= 'im1'
    new_csv << row
  end
end
With data.csv:
images,name
im2,name2
im3,name3
,name1
im4,name4
im5,
new_data.csv becomes:
images,name
im2,name2
im3,name3
im1,name1
im4,name4
im5,
If you're sure that new_data.csv is properly written, you could delete data.csv and rename new_data.csv to data.csv.
I wouldn't write data.csv in place. If anything goes wrong, you'd lose data.
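The write-then-rename idea could be sketched as follows. This is only an illustration under assumptions: `fill_blank_images` is a hypothetical helper name, the `.tmp` suffix is arbitrary, and the demo runs in a throwaway directory. The rename is reached only if writing the temporary file finished without raising.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper: writes an updated copy next to the original,
# then renames the copy over the original.
def fill_blank_images(path, default)
  tmp_path = "#{path}.tmp"
  CSV.open(tmp_path, 'w') do |out|
    CSV.foreach(path, headers: true, return_headers: true) do |row|
      # The header row is copied unchanged; data rows get the default value.
      row['images'] ||= default unless row.header_row?
      out << row
    end
  end
  File.rename(tmp_path, path)
end

# Demo in a temporary directory with made-up data.
result = Dir.mktmpdir do |dir|
  path = File.join(dir, 'data.csv')
  File.write(path, "images,name\nim2,name2\n,name1\n")
  fill_blank_images(path, 'im1')
  File.read(path)
end
puts result
```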

Related

Writing data into a CSV file by two different CSV files

I'm learning Ruby, I've been stuck on this for a long time, and I need some help.
I need to write a CSV file from two different CSV files. I have the code to do it, but it is split across two functions, and I need the data from both files together in one output file.
Here is the code:
require 'CSV'
class Plantas <
Struct.new( :code)
end
class Especies <
Struct.new(:id, :type, :code, :name_es, :name_ca, :name_en, :latin_name, :customer_id )
end
def ecode
f_inECODE = File.open("pflname.csv", "r") #get EPPOCODE
f_out=CSV.open("plantas.csv", "w+", :headers => true) #outputfile
f_inECODE.each_line do |line|
fields = line.split(',')
newPlant = Plantas.new
newPlant.code = fields[2].tr_s('"', '').strip #eppocode
plant = [newPlant.code] #linies a imprimir
f_out << plant
end
end
def data
f_dataspices=File.open("spices.csv", "r")
f_out=CSV.open("plantas.csv", "w+", :headers => true) #outputfile
f_dataspices.each_line do |line|
fields = line.split(',')
newEspecies = Especies.new
newEspecies.id = fields[0].tr_s('"', '').strip
newEspecies.type = fields[1].tr_s('"', '').strip
newEspecies.code = fields[2].tr_s('"', '').strip
newEspecies.name_es = fields[3].tr_s('"', '').strip
newEspecies.name_ca = fields[4].tr_s('"', '').strip
newEspecies.name_en = fields[5].tr_s('"', '').strip
newEspecies.latin_name = fields[6].tr_s('"', '').strip
newEspecies.customer_id = fields[7].tr_s('"', '').strip
especia = [newEspecies.id,newEspecies.type,newEspecies.code,newEspecies.name_es,newEspecies.name_ca,newEspecies.name_en,newEspecies.latin_name,newEspecies.customer_id]
f_out << especia
end
end
data
ecode
The desired output combines species.csv and ecode.csv, like this:
"id","type","code","name_es","name_ca","name_en","latin_name","customer_id","ecode"
7205,"DunSpecies",NULL,"0","0","0","",11630,LEECO
7437,"DunSpecies",NULL,"0","Xicoira","0","",5273,LEE3O
7204,"DunSpecies",NULL,"0","0","0","",11630,L4ECO
And the actual is this:
"id","type","code","name_es","name_ca","name_en","latin_name","customer_id"
7205,"DunSpecies",NULL,"0","0","0","",11630
7437,"DunSpecies",NULL,"0","Xicoira","0","",5273
7204,"DunSpecies",NULL,"0","0","0","",11630
(without ecode)
On one side I have the ecode, and on the other the rest of the data; I just need to put them together in the same file (plantas.csv).
I wrote two separate functions because I don't know how to combine everything in a single loop. I would like to do it all in one function, but I don't know how.
If someone could help me merge this code into one function that writes the results to the same file, I would be very grateful.
An example of the input of the file ecode.csv (in which I just want the ecode field) is this:
"""identifier"",""datatype"",""code"",""lang"",""langno"",""preferred"",""status"",""creation"",""modification"",""country"",""fullname"",""authority"",""shortname"""
"""N1952"",""PFL"",""LEECO"",""la"",""1"",""0"",""N"",""06/06/2000"",""09/03/2010"","""",""Leea coccinea non"",""Planchon"",""Leea coccinea non"""
"""N2974"",""PFL"",""LEECO"",""en"",""1"",""0"",""N"",""06/06/2000"",""21/02/2011"","""",""west Indian holly"","""",""West Indian holly"""
An example of the input of the file data.csv (in which I want all the fields) is this:
"id","type","code","name_es","name_ca","name_en","latin_name","customer_id"
7205,"DunSpecies",NULL,"0","0","0","",11630
7437,"DunSpecies",NULL,"0","Xicoira","0","",5273
My idea for linking both files is to create a third file and write everything into it.
I don't know if there is a simpler way to do it.
Thanks!
Cleaning up ecode.csv made it more challenging, but here is what I came up with.
In case data.csv and ecode.csv are matched by row number:
require 'csv'
data = CSV.read('data.csv', headers: true).to_a
headers = data.shift << 'eppocode'
double_quoted_ecode = CSV.read('ecode.csv')
ecodeIO = StringIO.new
ecodeIO.puts double_quoted_ecode.to_a
ecodeIO.rewind
ecode = CSV.parse(ecodeIO, headers: true)
CSV.open('plantas.csv', 'w+') do |plantas|
plantas << headers
data.each.with_index do |row, idx|
planta = row + [ecode['code'][idx]]
plantas << planta
end
end
Using your example files, this gives you the following plantas.csv:
id,type,code,name_es,name_ca,name_en,latin_name,customer_id,eppocode
7205,DunSpecies,NULL,0,0,0,"",11630,LEECO
7437,DunSpecies,NULL,0,Xicoira,0,"",5273,LEECO
In case entries are matched by data.csv's id and ecode.csv's identifier:
require 'csv'
data = CSV.read('data.csv', headers: true)
headers = data.headers << 'eppocode'
double_quoted_ecode = CSV.read('ecode.csv')
ecodeIO = StringIO.new
ecodeIO.puts double_quoted_ecode.to_a
ecodeIO.rewind
ecode = CSV.parse(ecodeIO, headers: true)
CSV.open('plantas.csv', 'w+') do |plantas|
plantas << headers
data.each do |row|
id = row['id']
ecode_row = ecode.find { |entry| entry['identifier'] == id } || {}
planta = row << ecode_row['code']
plantas << planta
end
end
I hope you find this helpful.
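The two-pass cleanup used above can be seen in isolation on an in-memory string. This is a sketch with a shortened, made-up sample in the same doubly quoted style; it joins the inner lines into a String instead of going through a StringIO, which is a simplification of the answer's approach.

```ruby
require 'csv'

# Shortened sample in the doubly quoted style of ecode.csv (illustrative only).
raw = %q("""identifier"",""code"",""fullname""") + "\n" +
      %q("""N1952"",""LEECO"",""Leea coccinea non""") + "\n"

# First pass: every line is a single quoted field that holds an ordinary CSV line.
inner_lines = CSV.parse(raw).map(&:first)

# Second pass: parse those inner lines as a normal CSV with headers.
ecode = CSV.parse(inner_lines.join("\n"), headers: true)
codes = ecode['code']
p codes
```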
Data
Let's begin by creating the two CSV files. To make the results easier to follow I have arbitrarily removed some of the fields in each file, and changed one field value.
ecode.csv
ecode = '"""identifier"",""datatype"",""code"",""lang"",""langno"",""preferred"",""status"",""creation"",""modification"",""country"",""fullname"",""authority"",""shortname"""
"""N1952"",""PFL"",""LEECO"",""la"",""1"",""0"",""N"",""06/06/2000"",""09/03/2010"","""",""Leea coccinea non"",""Planchon"",""Leea coccinea non"""
"""N2974"",""PFL"",""LEEC1"",""en"",""1"",""0"",""N"",""06/06/2000"",""21/02/2011"","""",""west Indian holly"","""",""West Indian holly"""'
File.write('ecode.csv', ecode)
#=> 452
data.csv
data = '"id","type","code","customer_id"
7205,"DunSpecies",NULL,11630
7437,"DunSpecies",NULL,,5273'
File.write('data.csv', data)
#=> 90
Code
CSV.open('plantas.csv', 'w') do |csv_out|
converter = ->(s) { s.delete('"') }
epposcode = CSV.foreach('ecode.csv',
headers:true,
header_converters: [converter],
converters: [converter]
).map { |csv| csv["code"] }
headers = CSV.open('data.csv', &:readline) << 'epposcode'
csv_out << headers
CSV.foreach('data.csv', headers:true) do |row|
csv_out << (row << epposcode.shift)
end
end
#=> 90
Result
Let's see what was written.
puts File.read('plantas.csv')
id,type,code,customer_id,epposcode
7205,DunSpecies,NULL,11630,LEECO
7437,DunSpecies,NULL,,5273,LEEC1
Explanation
The structure we want is the following.
CSV.open('plantas.csv', 'w') do |csv_out|
epposcode = <array of 'code' field values from 'ecode.csv'>
headers = <headers from 'data.csv' to which 'epposcode' is appended>
csv_out << headers
CSV.foreach('data.csv', headers:true) do |row|
csv_out << <row of 'data.csv' to which an element of epposcode is appended>
end
end
CSV::open is the main CSV method for writing files and CSV::foreach is generally my method-of-choice for reading CSV files. I could have instead written the following.
csv_out = CSV.open('plantas.csv', 'w')
epposcode = <array of 'code' field values from 'ecode.csv'>
headers = <headers from 'data.csv' to which 'epposcode' is appended>
csv_out << headers
CSV.foreach('data.csv', headers:true) do |row|
csv_out << <row of 'data.csv' to which an element of epposcode is appended>
end
csv_out.close
but using a block is convenient because the file is closed before returning from the block.
It is convenient to use a converter for both the header fields and the row fields:
converter = ->(s) { s.delete('"') }
This is a proc (I've defined a lambda) that removes double quotes from strings. It is passed to foreach twice, as the value of two of its optional arguments, header_converters and converters:
epposcode = CSV.foreach('ecode.csv',
headers:true,
header_converters: [converter],
converters: [converter]
)
Search for "Data Converters" in the CSV doc.
We invoke foreach without a block to return an enumerator, so it can be chained to map:
epposcode = CSV.foreach('ecode.csv',
headers:true,
header_converters: [converter],
converters: [converter]
).map { |csv| csv["code"] }
For the example,
epposcode
#=> ["LEECO", "LEEC1"]
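The enumerator chaining described above can be tried on its own. A small sketch, assuming a plain (not doubly quoted) two-row sample written to a temporary file purely for the demonstration:

```ruby
require 'csv'
require 'tempfile'

codes = Tempfile.create(['ecode', '.csv']) do |file|
  # Made-up, already clean sample data.
  file.write("identifier,code\nN1952,LEECO\nN2974,LEEC1\n")
  file.flush
  # Without a block, CSV.foreach returns an Enumerator over the rows,
  # so Enumerable methods such as map can be chained onto it.
  CSV.foreach(file.path, headers: true).map { |row| row['code'] }
end
p codes
```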

Ruby - reading CSV from STDIN

I'm trying to read from .CSV file and create objects with attributes of every row.
My code works fine:
def self.load_csv
puts "Name of a file?"
filename = STDIN.gets.chomp
rows = []
text = File.read(filename).gsub(/\\"/,'""')
CSV.parse(text, headers: true, header_converters: :symbol) do |row|
row = row.to_h
row = row.each_with_object({}){|(k,v), h| h[k.to_sym] = v}
rows << row
end
rows.map do |row|
Call.new(row)
end
end
end
Now I wanted to take filename from STDIN. I simply changed:
def self.load_csv(filename)
rows = []
text = File.read(filename).gsub(/\\"/,'""')
CSV.parse(text, headers: true, header_converters: :symbol) do |row|
row = row.to_h
row = row.each_with_object({}){|(k,v), h| h[k.to_sym] = v}
rows << row
end
rows.map do |row|
Call.new(row)
end
end
end
When I run ruby program.rb filename.csv, I get the error no implicit conversion of String into IO, and after removing the line with File.read it does nothing, like an infinite loop maybe? Of course, I invoke certain methods with a STDIN argument in different parts of the code. I have used similar code for reading from STDIN successfully in the past. What am I doing wrong this time?
This code is working:
require 'csv'
class Call
def initialize(args)
end
end
def load_csv(filename)
rows = []
text = File.read(filename).gsub(/\\"/,'""')
CSV.parse(text, headers: true, header_converters: :symbol) do |row|
row = row.to_h
row = row.each_with_object({}){ |(k,v), h| h[k.to_sym] = v }
rows << row
end
rows.map { |row| Call.new(row) }
end
filename = ARGV[0]
load_csv(filename)
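With ruby program.rb filename.csv, the name arrives in ARGV rather than on standard input, so a program that still calls STDIN.gets just waits for typing, which can look like a hang. A minimal sketch that supports both ways of passing the name; `pick_filename` is a hypothetical helper, not part of the original code:

```ruby
require 'stringio'

# Hypothetical helper: use the first command-line argument when there is one,
# otherwise ask for the name on the given input stream.
def pick_filename(argv, input = $stdin)
  return argv.first unless argv.empty?

  puts 'Name of a file?'
  input.gets.to_s.chomp
end

# Simulated invocations: one with an argument, one with typed input.
from_args  = pick_filename(['calls.csv'])
from_input = pick_filename([], StringIO.new("other.csv\n"))
```

In a real script the call would be pick_filename(ARGV).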

Ruby- CSV merge field value

I started learning Ruby this weekend. I'm working on a script that is going to read a CSV file that has a Date field and a Time field, and merge the values into a new DateTime field written to the output.
What I have is partially working, but the problem I have is the Date and Time values are comma separated. I would like to remove the comma and replace it with a space. How can I remove the comma and merge the values together?
require 'csv'
CSV.open("output.csv", "wb", :headers => true) do |output|
CSV.foreach("input.csv", :headers => true, :return_headers => true) do |row|
if row.header_row?
output << (row << 'DateTime')
else
output << (row << row['Date'].to_s << (row['Time'].to_s))
end
end
end
You can use tr to replace contents in a string.
date = row['Date']
time = row['Time']
datetime = "#{date} #{time}".tr(',', ' ')
Something like this should help:
require 'csv'
CSV.open("output.csv", "wb") do |output|
  output << ['DateTime']
  CSV.foreach("input.csv", :headers => true, :header_converters => :symbol) do |row|
    output << ["#{row[:date]} #{row[:time]}"]
  end
end
The changes here represent this functionality:
:header_converters => :symbol converts header field names to symbols, which can significantly improve performance for even moderate-length CSV files
Dropped :return_headers => true, since the header is written separately and the header row should not be processed as data
Moved the header output outside the input CSV loop, as the headers can safely be written before any row data
Use the efficient column reference mechanism for row[:date] and row[:time]
Wrote the row data as an array (of one element), consisting of the interpolated string containing both row[:date] and row[:time]
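If the other columns should be kept and DateTime appended as one extra field, a sketch along these lines may help. It works on an in-memory string with made-up rows; only the Date and Time column names come from the question.

```ruby
require 'csv'

# Made-up input; Name is an extra column added for illustration.
input = "Name,Date,Time\nalice,2015-03-01,10:15\nbob,2015-03-02,18:40\n"

output = CSV.generate do |out|
  CSV.parse(input, headers: true, return_headers: true) do |row|
    if row.header_row?
      out << (row.fields + ['DateTime'])
    else
      # Build one field from both values, so no comma ends up between them.
      out << (row.fields + ["#{row['Date']} #{row['Time']}"])
    end
  end
end
puts output
```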

How to use CSV.open and CSV.foreach methods to convert specific data in a csv file?

The Old.csv file contains these headers, "article_category_id", "articleID", "timestamp", "udid", but some of the values in those columns are strings. So, I am trying to convert them to integers and store in another CSV file, New.csv. This is my code:
require 'csv'
require 'time'
CSV.foreach('New.csv', "wb", :write_headers=> true, :headers =>["article_category_id", "articleID", "timestamp", "udid"]) do |csv|
CSV.open('Old.csv', :headers=>true) do |row|
csv['article_category_id']=row['article_category_id'].to_i
csv['articleID']=row['articleID'].to_i
csv['timestamp'] = row['timestamp'].to_time.to_i unless row['timestamp'].nil?
unless udids.include?(row['udid'])
udids << row['udid']
end
csv['udid'] = udids.index(row['udid']) + 1
csv<<row
end
end
But, I am getting the following error: in 'foreach': ruby wrong number of arguments (3 for 1..2) (ArgumentError).
When I change the foreach to open, I get the following error: undefined method '[]' for #<CSV:0x36e0298> (NoMethodError). Why is that? And how can I resolve it? Thanks.
CSV.foreach reads a file, so it should not be given a write mode such as "wb" as its second parameter. The write mode and the output headers belong to CSV.open:
CSV.open('New.csv', 'wb',
  :write_headers => true,
  :headers => ["article_category_id", "articleID", "timestamp", "udid"]
) do |csv|
  CSV.foreach('Old.csv', :headers => true) do |row|
    row['article_category_id'] = row['article_category_id'].to_i
    ...
    csv << row
  end
end
CSV.open should be placed before foreach: you iterate over the old file and produce the new one. Inside the loop you should change row and then append it to the output.
You can refer to my code:
require 'csv'
require 'time'
udids = [] # udid values seen so far; the snippet used this without defining it
CSV.open('New.csv', "wb") do |csv|
csv << ["article_category_id", "articleID", "timestamp", "udid"]
CSV.foreach('Old.csv', :headers=>true) do |row|
array = []
article_category_id=row['article_category_id'].to_i
articleID=row['articleID'].to_i
timestamp = row['timestamp'].to_i unless row['timestamp'].nil?
unless udids.include?(row['udid'])
udids << row['udid']
end
udid = udids.index(row['udid']) + 1
array << [article_category_id, articleID, timestamp, udid]
csv<<array
end
end
The problem with Vinh's answer is that, at the end, the array variable is an array with another array inside it.
So what is inserted into the CSV looks like
[[article_category_id, articleID, timestamp, udid]]
And that is why you get results in double quotes.
Please try something like this:
require 'csv'
require 'time'
udids = [] # udid values seen so far; their position gives the numeric id
CSV.open('New.csv', "wb") do |csv|
csv << ["article_category_id", "articleID", "timestamp", "udid"]
CSV.foreach('Old.csv', :headers=>true) do |row|
article_category_id = row['article_category_id'].to_i
articleID = row['articleID'].to_i
timestamp = row['timestamp'].to_i unless row['timestamp'].nil?
unless udids.include?(row['udid'])
udids << row['udid']
end
udid = udids.index(row['udid']) + 1
output_row = [article_category_id, articleID, timestamp, udid]
csv << output_row
end
end
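The same conversion can be tried without any files. The sketch below uses made-up rows and swaps the udids array plus index lookup for a Hash that numbers each udid on first appearance; that swap is my own choice and not from the answers above.

```ruby
require 'csv'

# Made-up rows; the headers follow the question.
old_csv = "article_category_id,articleID,timestamp,udid\n" \
          "3,17,1400000000,abc\n" \
          "4,18,,xyz\n" \
          "3,19,1400000500,abc\n"

udid_numbers = {} # udid string => number, assigned in order of first appearance

new_csv = CSV.generate do |csv|
  csv << %w[article_category_id articleID timestamp udid]
  CSV.parse(old_csv, headers: true) do |row|
    udid_numbers[row['udid']] ||= udid_numbers.size + 1
    csv << [
      row['article_category_id'].to_i,
      row['articleID'].to_i,
      row['timestamp']&.to_i, # stays nil (an empty cell) when missing
      udid_numbers[row['udid']]
    ]
  end
end
puts new_csv
```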

JumpStart Labs Event Manager: Syntax Error, unexpected ':'

I'm working on the JumpStart Labs Event Manager, specifically the time/day of the week targeting, and I'm running into trouble. When I run the following code through Terminal, it gives me the following error [EDIT]:
austin-winslows-macbook-4:event_manager HypnoBoy$ ruby event_manager.rb
event_manager.rb:8: odd number list for Hash
...vent_attendees.csv', {headers: true, header_converters: :sym...
^
event_manager.rb:8: syntax error, unexpected ':', expecting '}'
...vent_attendees.csv', {headers: true, header_converters: :sym...
^
event_manager.rb:8: Can't assign to true
...ttendees.csv', {headers: true, header_converters: :symbol})
^
event_manager.rb:8: syntax error, unexpected ':', expecting '='
...ders: true, header_converters: :symbol})
I've posted my code below, and am looking for suggestions! Something about the syntax is obviously off, but I've followed the steps to the letter thus far, and haven't had any problems, so I'm not sure where to look anymore. Any help would be a great help, thanks!
require 'csv'
require 'sunlight/congress'
require 'erb'
require 'date'
Sunglight::Congress.api_key = "e179a6973728c4dd3fb1204283aaccb5"
contents = CSV.open('event_attendees.csv', {headers: true, header_converters: :symbol})
def clean_zipcode(zipcode)
zipcode.to_s.rjust(5,"0")[0..4]
end
def clean_phone(number)
number.to_s.rjust(10,"0")[0..4]
end
def legislators_by_zipcode(zipcode)
Sunglight::Congress::Legislator.by_zipcode(zipcode)
end
def peak_days
time = row[:regdate]
day_array = []
time.each { |t|
array << Datetime.strptime(t, '%m/%d/%Y %H:%M').wday }
end
def peak_hours
time = row[:regdate]
hr_array = []
time.each { |t|
array << DateTime.strptime(t, '%m/%d/%Y %H:%M').hour }
array
end
def save_thanks_you_letters(id,form_letter)
Dir.mkdir("output") unless Dir.exists? "output"
filename = "output/thanks_#{id}.html"
File.open(filename, 'w') { |file|
file.puts form_letter}
end
puts "EventManager Initialized!"
template_letter = File.read "form_letter.erb"
erb_template = ERB.new template_letter
contents.each { |row|
id = row[0]
name = row[:first_name]
zipcode = clean_zipcode(row[:zipcode])
phone = clean_phone(row[:homephone])
legislators = legislators_by_zipcode(zipcode)
form_letter = erb_template.result(binding)
save_thank_you_letters(id,form_letter)
}
From the documentation of CSV::open, you are using this form:
open( filename, options = Hash.new )
So your line:
contents = CSV.open 'event_attendees.csv', headers: true, header_converters: :symbol
is wrong, as from the second parameter onward it expects a Hash. Thus change it to:
contents = CSV.open('event_attendees.csv', {headers: true, header_converters: :symbol})
I completed this exercise today. I didn't have to change the contents = CSV.open line. What caused the error for me was that the date was not formatted in the Excel file. I formatted that date column as mm/dd/yyyy hh:mm in Excel. Capitalization also matters in the '%m/%d/%Y %H:%M' string: %Y expects a four-digit year and %y a two-digit one. I used lowercase 'y'.
This is what my first time exercise looks like:
# Iteration: Time Targeting
contents = CSV.open "event_attendees.csv", headers: true, header_converters: :symbol
regtimes = Array.new(25, 0)
contents.each do |row|
reghour = DateTime.strptime(row[:regdate],'%m/%d/%y %H:%M').hour
regtimes[reghour] += 1
end
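The tallying step can be checked without the attendee file. A minimal sketch with made-up registration stamps that use two-digit years, as %y expects; it uses 24 counters, one per hour from 0 to 23:

```ruby
require 'date'

# Made-up registration stamps in month/day/two-digit-year form.
regdates = ['11/12/08 10:47', '11/12/08 13:23', '11/25/08 13:30']

regtimes = Array.new(24, 0) # one counter per hour of the day
regdates.each do |stamp|
  hour = DateTime.strptime(stamp, '%m/%d/%y %H:%M').hour
  regtimes[hour] += 1
end

# The busiest hour is the index of the largest counter.
peak_hour = regtimes.index(regtimes.max)
puts "Peak registration hour: #{peak_hour}"
```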
