How to split an array of cells in Ruby?

Code:
doc = Nokogiri::HTML(html)
showings = []
doc.css('.ok-product').each do |showing|
showing_id = showing['data-cart-id'].to_i
price = showing.at_css('.ok-product__price-main').text.gsub(/[\u0440\u0443\u0431.]/, '').strip
showings.push(
id: showing_id,
price: price
)
end
CSV.open("file.csv", "wb") do |csv|
csv << showings
end
All of the data ends up in cell A1 of the CSV:
{:id=>26999, :price=>"395,00"},"{:id=>26963, :price=>""254,00""}"...
I need to break the data into separate cells and remove the unnecessary symbols.

CSV.open("file.csv", "wb") do |csv|
showings.each do |id_price|
csv << [id_price[:id], id_price[:price]]
end
end
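The first attempt puts everything in one row because `csv <<` treats its argument as a single row, so an array of hashes becomes one row whose fields are the hashes' string forms. A minimal sketch of the per-row approach, using `CSV.generate` so the result can be inspected as a string (the `showings_to_csv` helper name, the header row, and the sample data are assumptions for illustration):

```ruby
require 'csv'

# Hypothetical helper: turns an array of {id:, price:} hashes into CSV text,
# writing one hash per row.
def showings_to_csv(showings)
  CSV.generate do |csv|
    csv << %w[id price]                      # optional header row
    showings.each do |showing|
      csv << [showing[:id], showing[:price]] # one row per hash
    end
  end
end

sample = [{ id: 26999, price: '395,00' }, { id: 26963, price: '254,00' }]
puts showings_to_csv(sample)
```

Because the prices contain a comma, CSV wraps them in double quotes; spreadsheet programs still read each one as a single cell.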

Related

Writing data into a CSV file by two different CSV files

I'm learning Ruby and have been stuck on this for a long time, so I need some help.
I need to write one CSV file from two different CSV files. I have code that does it, but in two separate functions, and I need the data from both files together in one output file.
Here is the code:
require 'CSV'
class Plantas <
Struct.new( :code)
end
class Especies <
Struct.new(:id, :type, :code, :name_es, :name_ca, :name_en, :latin_name, :customer_id )
end
def ecode
f_inECODE = File.open("pflname.csv", "r") #get EPPOCODE
f_out=CSV.open("plantas.csv", "w+", :headers => true) #outputfile
f_inECODE.each_line do |line|
fields = line.split(',')
newPlant = Plantas.new
newPlant.code = fields[2].tr_s('"', '').strip #eppocode
plant = [newPlant.code] #linies a imprimir
f_out << plant
end
end
def data
f_dataspices=File.open("spices.csv", "r")
f_out=CSV.open("plantas.csv", "w+", :headers => true) #outputfile
f_dataspices.each_line do |line|
fields = line.split(',')
newEspecies = Especies.new
newEspecies.id = fields[0].tr_s('"', '').strip
newEspecies.type = fields[1].tr_s('"', '').strip
newEspecies.code = fields[2].tr_s('"', '').strip
newEspecies.name_es = fields[3].tr_s('"', '').strip
newEspecies.name_ca = fields[4].tr_s('"', '').strip
newEspecies.name_en = fields[5].tr_s('"', '').strip
newEspecies.latin_name = fields[6].tr_s('"', '').strip
newEspecies.customer_id = fields[7].tr_s('"', '').strip
especia = [newEspecies.id,newEspecies.type,newEspecies.code,newEspecies.name_es,newEspecies.name_ca,newEspecies.name_en,newEspecies.latin_name,newEspecies.customer_id]
f_out << especia
end
end
data
ecode
The desired output, combining species.csv and ecode.csv, would look like this:
"id","type","code","name_es","name_ca","name_en","latin_name","customer_id","ecode"
7205,"DunSpecies",NULL,"0","0","0","",11630,LEECO
7437,"DunSpecies",NULL,"0","Xicoira","0","",5273,LEE3O
7204,"DunSpecies",NULL,"0","0","0","",11630,L4ECO
The actual output is this:
"id","type","code","name_es","name_ca","name_en","latin_name","customer_id"
7205,"DunSpecies",NULL,"0","0","0","",11630
7437,"DunSpecies",NULL,"0","Xicoira","0","",5273
7204,"DunSpecies",NULL,"0","0","0","",11630
(without the ecode column)
I have the ecode on one side and the full data on the other; I just need to put them together.
I'd like to write everything to the same file (plantas.csv).
I used two functions because I don't know how to combine everything in a single loop. I would like it all in one function, but I don't know how to do it.
If someone could help me get this code into one function that writes the results to the same file, I would be very grateful.
An example of the input of the file ecode.csv (in which I just want the ecode field) is this:
"""identifier"",""datatype"",""code"",""lang"",""langno"",""preferred"",""status"",""creation"",""modification"",""country"",""fullname"",""authority"",""shortname"""
"""N1952"",""PFL"",""LEECO"",""la"",""1"",""0"",""N"",""06/06/2000"",""09/03/2010"","""",""Leea coccinea non"",""Planchon"",""Leea coccinea non"""
"""N2974"",""PFL"",""LEECO"",""en"",""1"",""0"",""N"",""06/06/2000"",""21/02/2011"","""",""west Indian holly"","""",""West Indian holly"""
An example of the input of the file data.csv (in which I want all the fields) is this:
"id","type","code","name_es","name_ca","name_en","latin_name","customer_id"
7205,"DunSpecies",NULL,"0","0","0","",11630
7437,"DunSpecies",NULL,"0","Xicoira","0","",5273
My way of linking both files is to create a third file and write everything into it.
At least that is my idea; I don't know if there is a simpler way to do it.
Thanks!
Cleaning up ecode.csv made it more challenging, but here is what I came up with.
If data.csv and ecode.csv are matched by row number:
require 'csv'
data = CSV.read('data.csv', headers: true).to_a
headers = data.shift << 'eppocode'
double_quoted_ecode = CSV.read('ecode.csv')
ecodeIO = StringIO.new
ecodeIO.puts double_quoted_ecode.to_a
ecodeIO.rewind
ecode = CSV.parse(ecodeIO, headers: true)
CSV.open('plantas.csv', 'w+') do |plantas|
plantas << headers
data.each.with_index do |row, idx|
planta = row + [ecode['code'][idx]]
plantas << planta
end
end
Using your example files, this gives you the following plantas.csv:
id,type,code,name_es,name_ca,name_en,latin_name,customer_id,eppocode
7205,DunSpecies,NULL,0,0,0,"",11630,LEECO
7437,DunSpecies,NULL,0,Xicoira,0,"",5273,LEECO
If entries are matched by data.csv's id and ecode.csv's identifier:
require 'csv'
data = CSV.read('data.csv', headers: true)
headers = data.headers << 'eppocode'
double_quoted_ecode = CSV.read('ecode.csv')
ecodeIO = StringIO.new
ecodeIO.puts double_quoted_ecode.to_a
ecodeIO.rewind
ecode = CSV.parse(ecodeIO, headers: true)
CSV.open('plantas.csv', 'w+') do |plantas|
plantas << headers
data.each do |row|
id = row['id']
ecode_row = ecode.find { |entry| entry['identifier'] == id } || {}
planta = row << ecode_row['code']
plantas << planta
end
end
I hope you find this helpful.
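When matching by key, `find` rescans the ecode table for every data row. A possible variation is to build a Hash from identifier to code once and look each id up in it. The sketch below is an assumption-laden illustration, not the answer's code: the `join_by_id` helper and the sample texts are hypothetical, and the ecode text is assumed to be already cleaned of its extra quoting.

```ruby
require 'csv'

# Hypothetical, simplified inputs in which ids and identifiers actually match.
DATA_TEXT = <<~TEXT
  id,type
  7205,DunSpecies
  7437,DunSpecies
TEXT

ECODE_TEXT = <<~TEXT
  identifier,code
  7437,LEE3O
  7205,LEECO
TEXT

# Joins the two tables on data.id == ecode.identifier and returns CSV text.
def join_by_id(data_text, ecode_text)
  # Build the lookup table once: identifier => code.
  code_by_id = CSV.parse(ecode_text, headers: true)
                  .each_with_object({}) { |row, index| index[row['identifier']] = row['code'] }

  data = CSV.parse(data_text, headers: true)
  CSV.generate do |out|
    out << data.headers + ['eppocode']
    data.each do |row|
      # An id with no match yields nil, which is written as an empty cell.
      out << row.fields + [code_by_id[row['id']]]
    end
  end
end

puts join_by_id(DATA_TEXT, ECODE_TEXT)
```

The order of rows in the ecode input no longer matters, since each data row finds its code by key.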
Data
Let's begin by creating the two CSV files. To make the results easier to follow I have arbitrarily removed some of the fields in each file, and changed one field value.
ecode.csv
ecode = '"""identifier"",""datatype"",""code"",""lang"",""langno"",""preferred"",""status"",""creation"",""modification"",""country"",""fullname"",""authority"",""shortname"""
"""N1952"",""PFL"",""LEECO"",""la"",""1"",""0"",""N"",""06/06/2000"",""09/03/2010"","""",""Leea coccinea non"",""Planchon"",""Leea coccinea non"""
"""N2974"",""PFL"",""LEEC1"",""en"",""1"",""0"",""N"",""06/06/2000"",""21/02/2011"","""",""west Indian holly"","""",""West Indian holly"""'
File.write('ecode.csv', ecode)
#=> 452
data.csv
data = '"id","type","code","customer_id"
7205,"DunSpecies",NULL,11630
7437,"DunSpecies",NULL,,5273'
File.write('data.csv', data)
#=> 90
Code
CSV.open('plantas.csv', 'w') do |csv_out|
converter = ->(s) { s.delete('"') }
epposcode = CSV.foreach('ecode.csv',
headers:true,
header_converters: [converter],
converters: [converter]
).map { |csv| csv["code"] }
headers = CSV.open('data.csv', &:readline) << 'epposcode'
csv_out << headers
CSV.foreach('data.csv', headers:true) do |row|
csv_out << (row << epposcode.shift)
end
end
#=> 90
Result
Let's see what was written.
puts File.read('plantas.csv')
id,type,code,customer_id,epposcode
7205,DunSpecies,NULL,11630,LEECO
7437,DunSpecies,NULL,,5273,LEEC1
Explanation
The structure we want is the following.
CSV.open('plantas.csv', 'w') do |csv_out|
epposcode = <array of 'code' field values from 'ecode.csv'>
headers = <headers from 'data.csv' to which 'epposcode' is appended>
csv_out << headers
CSV.foreach('data.csv', headers:true) do |row|
csv_out << <row of 'data.csv' to which an element of epposcode is appended>
end
end
CSV::open is the main CSV method for writing files and CSV::foreach is generally my method-of-choice for reading CSV files. I could have instead written the following.
csv_out = CSV.open('plantas.csv', 'w')
epposcode = <array of 'code' field values from 'ecode.csv'>
headers = <headers from 'data.csv' to which 'epposcode' is appended>
csv_out << headers
CSV.foreach('data.csv', headers:true) do |row|
csv_out << <row of 'data.csv' to which an element of epposcode is appended>
end
csv_out.close
but using a block is convenient because the file is closed before returning from the block.
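A small sketch of that point: the handle yielded by the block form of `CSV.open` reports itself closed once the block has returned. The `block_form_closes?` helper and the `demo.csv` file name are hypothetical, and a temporary directory is used so nothing is left behind.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper: writes one row with the block form of CSV.open and
# reports whether the handle is closed after the block returns.
def block_form_closes?
  Dir.mktmpdir do |dir|
    handle = nil
    CSV.open(File.join(dir, 'demo.csv'), 'w') do |csv|
      handle = csv        # keep a reference so it can be inspected afterwards
      csv << %w[a b]
    end
    handle.closed?        # no explicit close call was needed
  end
end

puts block_form_closes?
```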
It is convenient to use a converter for both the header fields and the row fields:
converter = ->(s) { s.delete('"') }
This is a proc (specifically a lambda) that removes double quotes from a string. It is passed to foreach twice, as both the header_converters and the converters option:
epposcode = CSV.foreach('ecode.csv',
headers:true,
header_converters: [converter],
converters: [converter]
)
Search for "Data Converters" in the CSV doc.
We invoke foreach without a block to return an enumerator, so it can be chained to map:
epposcode = CSV.foreach('ecode.csv',
headers:true,
header_converters: [converter],
converters: [converter]
).map { |csv| csv["code"] }
For the example,
epposcode
#=> ["LEECO", "LEEC1"]
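The enumerator step can be tried in isolation. The sketch below assumes an ordinary, singly quoted CSV file (so no quote-stripping converter is needed); the `column_values` helper and the temporary file contents are hypothetical.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: reads one column by chaining map onto the enumerator
# that CSV.foreach returns when it is called without a block.
def column_values(path, column)
  CSV.foreach(path, headers: true).map { |row| row[column] }
end

Tempfile.create(['ecode', '.csv']) do |file|
  file.write("identifier,code\nN1952,LEECO\nN2974,LEEC1\n")
  file.flush
  p column_values(file.path, 'code')
end
```

Each element yielded to `map` is a `CSV::Row`, so fields can be looked up by header name.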

Read from a CSV file, multiply two columns, and then write back to the CSV file in ruby?

I have created a CSV file with values. I am able to read the rows, but I don't know how to access the individual values of a column.
require "csv"
CSV.open("file.csv", "w") do |csv|
  csv << ["val1", "val2", "mul"]
  csv << ["53", "27"]
  csv << ["32", "20"]
end
You probably need to ignore the header row if you have one. But the general idea is this:
CSV.open('dest.csv', 'w') do |csv|
csv << ["val1", "val2","mul"]
CSV.foreach('source.csv') do |row|
c1 = row[0].to_i # fields are read as strings, so convert before multiplying
c2 = row[1].to_i
csv << [c1, c2, c1*c2]
end
end
If you have headers, you could do this:
CSV.open('dest.csv', 'w') do |csv|
csv << ["val1", "val2", "mul"]
CSV.foreach('source.csv', headers: true) do |row|
c1 = row['val1'].to_i # fields are read as strings, so convert before multiplying
c2 = row['val2'].to_i
csv << [c1, c2, c1*c2]
end
end
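CSV fields are read as strings, so they must become numbers before they can be multiplied. One option is the built-in `:numeric` converter, which turns fields that look like integers or floats into numbers while parsing. A minimal sketch that works on CSV text rather than files (the `with_product` helper name is an assumption):

```ruby
require 'csv'

# Hypothetical helper: multiplies the val1 and val2 columns of CSV text and
# returns new CSV text with a mul column added.
def with_product(csv_text)
  CSV.generate do |out|
    out << %w[val1 val2 mul]
    # converters: :numeric parses numeric-looking fields into Integer or Float.
    CSV.parse(csv_text, headers: true, converters: :numeric) do |row|
      a = row['val1']
      b = row['val2']
      out << [a, b, a * b]
    end
  end
end

puts with_product("val1,val2\n53,27\n32,20\n")
```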
For a non-Ruby solution, you can also use awk (setting OFS so the output stays comma-separated):
awk -F ',' -v OFS=',' '{print $1, $2, $1*$2}' source.csv > dest.csv

Ruby: How to add two lines at once to a csv?

I have a list of items and a script which generates two lines of csv for each item.
Can I add two lines at once to the CSV generator? I want something like this:
CSV.generate do |csv|
items.each do |item|
csv << rows(item)
end
end
def rows(item)
return \
['value1', 'value2', 'value2'],
['value3', 'value4', 'value5']
end
But csv << can't accept two rows at once.
My best code so far is:
CSV.generate do |csv|
items.each do |item|
rows(item).each { |row| csv << row }
end
end
Update: the best version I have that still adds one row at a time looks like this:
CSV.generate do |csv|
items.
flat_map(&method(:rows)).
each(&csv.method(:<<))
end
CSV.generate do |csv|
csv << items.flat_map(&method(:rows))
end
Array#push or Array#append work the same way, and can take multiple arguments. Edit: As it turns out, CSV.generate yields a CSV object which has neither of those methods.
You can also do it like this:
CSV.generate do |csv|
items.each do |item|
r = rows(item)
csv << r[0] << r[1]
end
end
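The chaining works because `CSV#<<` returns the CSV object itself. The same property lets `reduce` thread the CSV object through any number of rows, not just two. A sketch under that assumption, with a stand-in `rows` method and a hypothetical `generate_for` helper:

```ruby
require 'csv'

# Stand-in for the question's rows(item): returns two rows per item.
def rows(item)
  [[item, 'first'], [item, 'second']]
end

# Hypothetical helper: writes every row of every item into one CSV string.
def generate_for(items)
  CSV.generate do |csv|
    items.each do |item|
      # Equivalent to csv << row1 << row2 << ... for however many rows there are.
      rows(item).reduce(csv, :<<)
    end
  end
end

puts generate_for(%w[x y])
```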

How to use CSV.open and CSV.foreach methods to convert specific data in a csv file?

The Old.csv file contains these headers, "article_category_id", "articleID", "timestamp", "udid", but some of the values in those columns are strings. So, I am trying to convert them to integers and store in another CSV file, New.csv. This is my code:
require 'csv'
require 'time'
CSV.foreach('New.csv', "wb", :write_headers=> true, :headers =>["article_category_id", "articleID", "timestamp", "udid"]) do |csv|
CSV.open('Old.csv', :headers=>true) do |row|
csv['article_category_id']=row['article_category_id'].to_i
csv['articleID']=row['articleID'].to_i
csv['timestamp'] = row['timestamp'].to_time.to_i unless row['timestamp'].nil?
unless udids.include?(row['udid'])
udids << row['udid']
end
csv['udid'] = udids.index(row['udid']) + 1
csv<<row
end
end
But, I am getting the following error: in 'foreach': ruby wrong number of arguments (3 for 1..2) (ArgumentError).
When I change the foreach to open, I get the following error: undefined method '[]' for #<CSV:0x36e0298> (NoMethodError). Why is that? And how can I resolve it? Thanks.
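In the Ruby version that raised this error, CSV.foreach takes only a path and an options hash, so it rejects the "wb" file mode argument. Open the output file for writing with CSV.open and read the input with CSV.foreach: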
CSV.open('New.csv', 'wb',
:write_headers => true,
:headers => ["article_category_id", "articleID", "timestamp", "udid"]
) do |csv|
CSV.foreach('Old.csv', :headers => true) do |row|
row['article_category_id'] = row['article_category_id'].to_i
...
csv << row
end
end
CSV.open should be placed before foreach: you iterate over the old file and produce the new one. Inside the loop, change the row and then append it to the output.
You can refer to my code:
require 'csv'
require 'time'
udids = [] # distinct udid values seen so far; must exist before the loop
CSV.open('New.csv', "wb") do |csv|
csv << ["article_category_id", "articleID", "timestamp", "udid"]
CSV.foreach('Old.csv', :headers=>true) do |row|
array = []
article_category_id=row['article_category_id'].to_i
articleID=row['articleID'].to_i
timestamp = row['timestamp'].to_i unless row['timestamp'].nil?
unless udids.include?(row['udid'])
udids << row['udid']
end
udid = udids.index(row['udid']) + 1
array << [article_category_id, articleID, timestamp, udid]
csv<<array
end
end
The problem with Vinh's answer is that, at the end, the array variable is an array with another array inside it.
So what is inserted into the CSV looks like
[[article_category_id, articleID, timestamp, udid]]
That is why the results come out in double quotes.
Please try something like this:
require 'csv'
require 'time'
udids = [] # distinct udid values seen so far; must exist before the loop
CSV.open('New.csv', "wb") do |csv|
csv << ["article_category_id", "articleID", "timestamp", "udid"]
CSV.foreach('Old.csv', :headers=>true) do |row|
article_category_id = row['article_category_id'].to_i
articleID = row['articleID'].to_i
timestamp = row['timestamp'].to_i unless row['timestamp'].nil?
unless udids.include?(row['udid'])
udids << row['udid']
end
udid = udids.index(row['udid']) + 1
output_row = [article_category_id, articleID, timestamp, udid]
csv << output_row
end
end
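The `udids.include?` and `udids.index` calls scan the array for every row. As an alternative technique (not what the answers above use), a Hash with a default block can hand out the same 1-based numbers in order of first appearance. The `number_distinct` helper name and the sample values are assumptions for illustration:

```ruby
# Hypothetical helper: assigns 1, 2, 3, ... to distinct values in order of
# first appearance, reusing the number when a value repeats.
def number_distinct(values)
  # The default block runs only for unseen keys; hash.size is the count of
  # keys stored so far, so the first new key gets 1.
  ids = Hash.new { |hash, key| hash[key] = hash.size + 1 }
  values.map { |value| ids[value] }
end

p number_distinct(%w[abc def abc xyz])
```

Inside the CSV loop this would replace the include?/index pair with a single lookup such as `ids[row['udid']]`.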

Output array to CSV in Ruby

It's easy enough to read a CSV file into an array with Ruby but I can't find any good documentation on how to write an array into a CSV file. Can anyone tell me how to do this?
I'm using Ruby 1.9.2 if that matters.
To a file:
require 'csv'
CSV.open("myfile.csv", "w") do |csv|
csv << ["row", "of", "CSV", "data"]
csv << ["another", "row"]
# ...
end
To a string:
require 'csv'
csv_string = CSV.generate do |csv|
csv << ["row", "of", "CSV", "data"]
csv << ["another", "row"]
# ...
end
Here's the current documentation on CSV: http://ruby-doc.org/stdlib/libdoc/csv/rdoc/index.html
If you have an array of arrays of data:
rows = [["a1", "a2", "a3"],["b1", "b2", "b3", "b4"], ["c1", "c2", "c3"]]
Then you can write this to a file with the following, which I think is much simpler:
require "csv"
File.write("ss.csv", rows.map(&:to_csv).join)
I've got this down to just one line.
rows = [['a1', 'a2', 'a3'],['b1', 'b2', 'b3', 'b4'], ['c1', 'c2', 'c3'], ... ]
csv_str = rows.inject([]) { |csv, row| csv << CSV.generate_line(row) }.join("")
#=> "a1,a2,a3\nb1,b2,b3,b4\nc1,c2,c3\n"
Do all of the above and save to a csv, in one line.
File.open("ss.csv", "w") {|f| f.write(rows.inject([]) { |csv, row| csv << CSV.generate_line(row) }.join(""))}
NOTE:
Converting the rows of an ActiveRecord model to CSV would look something like this, I think:
CSV.open(fn, 'w') do |csv|
csv << Model.column_names
Model.where(query).each do |m|
csv << m.attributes.values
end
end
Hmm @tamouse, that gist is somewhat confusing to me without reading the csv source, but generically, assuming each hash in your array has the same number of k/v pairs and that the keys are always the same, in the same order (i.e. if your data is structured), this should do the deed:
CSV.open(fn, 'w') do |csv|
  hsh_ary.each_with_index do |hsh, idx|
    csv << hsh.keys if idx.zero? # header row (column labels), written once
    csv << hsh.values            # data row, including the first hash
  end
end
If your data isn't structured this obviously won't work
If anyone is interested, here are some one-liners (and a note on loss of type information in CSV):
require 'csv'
rows = [[1,2,3],[4,5]] # [[1, 2, 3], [4, 5]]
# To CSV string
csv = rows.map(&:to_csv).join # "1,2,3\n4,5\n"
# ... and back, as String[][]
rows2 = csv.split("\n").map(&:parse_csv) # [["1", "2", "3"], ["4", "5"]]
# File I/O:
filename = '/tmp/vsc.csv'
# Save to file -- answer to your question
IO.write(filename, rows.map(&:to_csv).join)
# Read from file
# rows3 = IO.read(filename).split("\n").map(&:parse_csv)
rows3 = CSV.read(filename)
rows3 == rows2 # true
rows3 == rows # false
Note: CSV loses all type information, you can use JSON to preserve basic type information, or go to verbose (but more easily human-editable) YAML to preserve all type information -- for example, if you need date type, which would become strings in CSV & JSON.
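A short sketch of that difference, comparing a CSV round trip with a JSON round trip of the same rows (the two helper names are hypothetical):

```ruby
require 'csv'
require 'json'

# Hypothetical helper: serializes rows to CSV text and parses them back.
def csv_round_trip(rows)
  CSV.parse(rows.map(&:to_csv).join)
end

# Hypothetical helper: serializes rows to JSON text and parses them back.
def json_round_trip(rows)
  JSON.parse(JSON.generate(rows))
end

rows = [[1, 2, 3], [4, 5]]
p csv_round_trip(rows)  # every field comes back as a String
p json_round_trip(rows) # the integers survive
```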
Building on @boulder_ruby's answer, this is what I'm looking for, assuming us_eco contains the CSV table as from my gist.
CSV.open('outfile.txt','wb', col_sep: "\t") do |csvfile|
csvfile << us_eco.first.keys
us_eco.each do |row|
csvfile << row.values
end
end
Updated the gist at https://gist.github.com/tamouse/4647196
Struggling with this myself. This is my take:
https://gist.github.com/2639448:
require 'csv'
class CSV
def CSV.unparse array
CSV.generate do |csv|
array.each { |i| csv << i }
end
end
end
CSV.unparse [ %w(your array), %w(goes here) ]
