I currently have a single column CSV file such as: ["firstname lastname", "firstname lastname", ...].
I would like to create a CSV file such as ["f.lastname", "f.lastname", ...], where f is the first letter of the first name.
Any idea how I should do that?
update
OK, I feel that I am close thanks to you guys. Here's what I've got so far:
require 'csv'
filename = CSV.read("mails.csv")
mails = []
CSV.foreach(filename) do |col|
  mails << filename.map { |n| n.sub(/\A(\w)\w* (\w+)\z/, '\1. \2') }
end
puts mails.to_s
But I still get an error.
update2
OK, this works just fine:
require 'csv'
mails = []
CSV.foreach('mails.csv', :headers => false) do |row|
  mails << row.map(&:split).map { |f, l| "#{f[0]}.#{l}#mail.com" }
end
File.open("mails_final.csv", 'w') {|f| f.puts mails }
puts mails.to_s
Thanks a lot to all of you ;)
A solution without using a regular expression:
ary = ["firstname lastname", "firstname lastname"]
ary.map(&:split).map{|f, l| "#{f[0]}. #{l}" }
#=> ["f. lastname", "f. lastname"]
ary = ["firstname lastname", "firstname lastname"]
ary.map{|a| e=a.split(" "); e[0][0]+"."+e[1]}
#=> ["f.lastname", "f.lastname"]
You need to modify this part of your code:
CSV.foreach(filename) do |col|
  mails << filename.map { |n| n.sub(/\A(\w)\w* (\w+)\z/, '\1. \2') }
end
to something like the following:
CSV.foreach('path/to/mails.csv', headers: false) do |row|
  # with headers: false each row is an Array of fields (with headers: true it is a CSV::Row);
  # do not map over the whole parsed table (your filename variable) inside the loop, that is what causes the error
  mails << row.map { |n| n.sub(/\A(\w)\w* (\w+)\z/, '\1. \2') }
end
I would do it this way:
array = ["firstname lastname", "firstname lastname"]
array.map { |n| "#{n[0]}.#{n.split[1]}" }
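For the array above this returns:
#=> ["f.lastname", "f.lastname"]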
Related
I have a CSV from which I've removed the irrelevant data.
Now I need to split "Name and surname" into two columns on the space (ignoring a third name in case there are three), invert the order of the "Name and surname" and "Phone" columns (phone first), and then write them to a file without the headers. I've never actually learned Ruby, but I played with Python 10 years ago. Can you help me? This is what I've been able to do so far:
E.g.
require 'csv'
csv_table = CSV.read(ARGV[0], :headers => true)
keep = ["Name and surname", "Phone", "Email"]
new_csv_table = csv_table.by_col!.delete_if do |column_name, column_values|
  !keep.include? column_name
end
new_csv_table.to_csv
Begin by creating a CSV file.
str =<<~END
Name and surname,Phone,Email
John Doe,250-256-3145,John#Doe.com
Marsha Magpie,250-256-3154,Marsha#Magpie.com
END
File.write('t_in.csv', str)
#=> 109
Initially, let's read the file, add two columns, "Name" and "Surname", and optionally delete the column, "Name and surname", without regard to column order.
First read the file into a CSV::Table object.
require 'csv'
tbl = CSV.read('t_in.csv', headers: true)
#=> #<CSV::Table mode:col_or_row row_count:3>
Add the new columns.
tbl.each do |row|
  row["Name"], row["Surname"] = row["Name and surname"].split
end
#=> #<CSV::Table mode:col_or_row row_count:3>
Note that if row["Name and surname"] had equaled “John Paul Jones”, we would have obtained row["Name"] #=> “John” and row["Surname"] #=> “Paul”.
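If you would rather keep everything after the first name as the surname when there are three names, a small variation (just a sketch, not part of the steps above) is to split with a limit of 2:
row["Name"], row["Surname"] = row["Name and surname"].split(' ', 2)
# "John Paul Jones" would then give Name "John" and Surname "Paul Jones"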
If the column "Name and surname" is no longer required we can delete it.
tbl.delete("Name and surname")
#=> ["John Doe", "Marsha Magpie"]
Write tbl to a new CSV file.
CSV.open('t_out.csv', "w") do |csv|
  csv << tbl.headers
  tbl.each { |row| csv << row }
end
#=> #<CSV::Table mode:col_or_row row_count:3>
Let's see what was written.
puts File.read('t_out.csv')
displays
Phone,Email,Name,Surname
250-256-3145,John#Doe.com,John,Doe
250-256-3154,Marsha#Magpie.com,Marsha,Magpie
Now let's rearrange the order of the columns.
header_order = ["Phone", "Name", "Surname", "Email"]
CSV.open('t_out.csv', "w") do |csv|
  csv << header_order
  tbl.each { |row| csv << header_order.map { |header| row[header] } }
end
#=> #<CSV::Table mode:col_or_row row_count:3>
puts File.read('t_out.csv')
displays
Phone,Name,Surname,Email
250-256-3145,John,Doe,John#Doe.com
250-256-3154,Marsha,Magpie,Marsha#Magpie.com
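Putting the pieces together in the style of your original script (a sketch; the output file name t_out.csv and the decision to skip the header row are assumptions based on your description):
require 'csv'
tbl = CSV.read(ARGV[0], headers: true)
tbl.each do |row|
  row["Name"], row["Surname"] = row["Name and surname"].split
end
tbl.delete("Name and surname")
header_order = ["Phone", "Name", "Surname", "Email"]
CSV.open('t_out.csv', "w") do |csv|
  # no `csv << header_order` here, since you said the headers should be ignored
  tbl.each { |row| csv << header_order.map { |h| row[h] } }
end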
I have created a CSV file about my eshop that contains multiple items with different SKUs. Some SKUs appear more than once because they can be in more than one category (but the Title and Price will always be the same for a given SKU). Example:
SKU,Title,Category,Price
001,Soap,Bathroom,0.5
001,Soap,Kitchen,0.5
002,Water,Kitchen,0.4
002,Water,Garage,0.4
003,Juice,Kitchen,0.8
I now wish to create from that file another CSV file that has no duplicate SKU's and aggregates the "Category" attributes as follows:
SKU,Title,Category,Price
001,Soap,Bathroom/Kitchen,0.5
002,Water,Kitchen/Garage,0.4
003,Juice,Kitchen,0.8
How can I do that?
It's my understanding that you wish to read a CSV file, perform some operations on the data and then write the result to a new CSV file. You could do that as follows.
Code
require 'csv'
def convert(csv_file_in, csv_file_out, group_field, aggregate_field)
  csv = CSV.read(csv_file_in, headers: true)
  headers = csv.headers
  arr = csv.group_by { |row| row[group_field] }.
            map do |_, a|
              headers.map { |h| h == aggregate_field ?
                (a.map { |row| row[aggregate_field] }.join('/')) : a.first[h] }
            end
  CSV.open(csv_file_out, "wb") do |csv|
    csv << headers
    arr.each { |row| csv << row }
  end
end
Example
Let's create a CSV file with the following data:
s =<<_
SKU,Title,Category,Price
001,Soap,Bathroom,0.5
001,Soap,Kitchen,0.5
002,Water,Kitchen,0.4
002,Water,Garage,0.4
003,Juice,Kitchen,0.8
_
FNameIn = 'testin.csv'
FNameOut = 'testout.csv'
IO.write(FNameIn, s)
#=> 135
Now execute the method with these values:
convert(FNameIn, FNameOut, "SKU", "Category")
and confirm FNameOut was written correctly:
puts IO.read(FNameOut)
SKU,Title,Category,Price
001,Soap,Bathroom/Kitchen,0.5
002,Water,Kitchen/Garage,0.4
003,Juice,Kitchen,0.8
Explanation
The steps are as follows:
csv_file_in = FNameIn
csv_file_out = FNameOut
group_field = "SKU"
aggregate_field = "Category"
csv = CSV.read(FNameIn, headers: true)
See CSV::read.
headers = csv.headers
#=> ["SKU", "Title", "Category", "Price"]
h = csv.group_by { |row| row[group_field] }
#=> {"001"=>[
#<CSV::Row "SKU":"001" "Title":"Soap" "Category":"Bathroom" "Price":"0.5">,
# #<CSV::Row "SKU":"001" "Title":"Soap" "Category":"Kitchen" "Price":"0.5">
# ],
# "002"=>[
# #<CSV::Row "SKU":"002" "Title":"Water" "Category":"Kitchen" "Price":"0.4">,
# #<CSV::Row "SKU":"002" "Title":"Water" "Category":"Garage" "Price":"0.4">
# ],
# "003"=>[
# #<CSV::Row "SKU":"003" "Title":"Juice" "Category":"Kitchen" "Price":"0.8">
# ]
# }
arr = h.map do |_, a|
  headers.map { |h| h == aggregate_field ?
    (a.map { |row| row[aggregate_field] }.join('/')) : a.first[h] }
end
#=> [["001", "Soap", "Bathroom/Kitchen", "0.5"],
# ["002", "Water", "Kitchen/Garage", "0.4"],
# ["003", "Juice", "Kitchen", "0.8"]]
See CSV#headers and Enumerable#group_by, an oft-used method. Lastly, write the output file:
CSV.open(FNameOut, "wb") do |csv|
  csv << headers
  arr.each { |row| csv << row }
end
See CSV::open. Now let's return to the calculation of arr. This is most easily explained by inserting some puts statements and executing the code.
arr = h.map do |_, a|
  puts " _=#{_}"
  puts " a=#{a}"
  headers.map do |h|
    puts " header=#{h}"
    if h == aggregate_field
      a.map { |row| row[aggregate_field] }.join('/')
    else
      a.first[h]
    end.
      tap { |s| puts " mapped to #{s}" }
  end
end
See Object#tap. The following is displayed.
_=001
a=[#<CSV::Row "SKU":"001" "Title":"Soap" "Category":"Bathroom" "Price":"0.5">,
#<CSV::Row "SKU":"001" "Title":"Soap" "Category":"Kitchen" "Price":"0.5">]
header=SKU
mapped to 001
header=Title
mapped to Soap
header=Category
mapped to Bathroom/Kitchen
header=Price
mapped to 0.5
_=002
a=[#<CSV::Row "SKU":"002" "Title":"Water" "Category":"Kitchen" "Price":"0.4">,
#<CSV::Row "SKU":"002" "Title":"Water" "Category":"Garage" "Price":"0.4">]
header=SKU
mapped to 002
header=Title
mapped to Water
header=Category
mapped to Kitchen/Garage
header=Price
mapped to 0.4
_=003
a=[#<CSV::Row "SKU":"003" "Title":"Juice" "Category":"Kitchen" "Price":"0.8">]
header=SKU
mapped to 003
header=Title
mapped to Juice
header=Category
mapped to Kitchen
header=Price
mapped to 0.8
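For comparison, here is a more compact sketch of the same idea (group the rows by SKU, join the Category values, keep the first row's other fields), reusing the FNameIn and FNameOut constants from the example above:
require 'csv'
rows = CSV.read(FNameIn, headers: true)
merged = rows.group_by { |r| r["SKU"] }.map do |_, grp|
  h = grp.first.to_h
  h["Category"] = grp.map { |r| r["Category"] }.join('/')
  h
end
CSV.open(FNameOut, "wb") do |csv|
  csv << merged.first.keys
  merged.each { |h| csv << h.values }
end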
It seems that in order for this to be correct, we must assume the Title and Price are always the same for a given SKU (which you say they are). Since you know the only field you want to merge is Category, here is how you can do it.
Assuming this is your test.csv in the same path as the ruby script:
# test.csv
SKU,Title,Category,Price
001,Soap,Bathroom,0.5
001,Soap,Kitchen,0.5
002,Water,Kitchen,0.4
002,Water,Garage,0.4
003,Juice,Kitchen,0.8
Ruby script in same directory as your test.csv file
# fix_csv.rb
require 'csv'
rows = CSV.read 'test.csv', :headers => true
skews = rows.group_by { |row| row['SKU'] }.keys.uniq
values = rows.group_by { |row| row['SKU'] }
merged = skews.map do |key|
  group = values.select { |k, v| k == key }.values.flatten.map(&:to_h)
  category = group.map { |h| h['Category'] }.join('/')
  new_data = group[0]
  new_data['Category'] = category
  new_data
end

CSV.open('merged_data.csv', 'w') do |csv|
  csv << merged.first.keys # writes the header row
  merged.each do |hash|
    csv << hash.values
  end
end
puts 'see contents of merged_data.csv'
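You can then check the result, which should match the desired output from the question:
puts File.read('merged_data.csv')
SKU,Title,Category,Price
001,Soap,Bathroom/Kitchen,0.5
002,Water,Kitchen/Garage,0.4
003,Juice,Kitchen,0.8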
I have an array of hashes:
records = [
{
ID: 'BOATY',
Name: 'McBoatface, Boaty'
},
{
ID: 'TRAINY',
Name: 'McTrainface, Trainy'
}
]
I'm trying to combine them into an array of strings:
["ID,BOATY","Name,McBoatface, Boaty","ID,TRAINY","Name,McTrainface, Trainy"]
This doesn't seem to do anything:
irb> records.collect{|r| r.each{|k,v| "\"#{k},#{v}\"" }}
#=> [{:ID=>"BOATY", :Name=>"McBoatface, Boaty"}, {:ID=>"TRAINY", :Name=>"McTrainface, Trainy"}]
** edit **
Formatting (i.e. ["Key0,Value0","Key1,Value1",...]) is required to match a vendor's interface.
** /edit **
What am I missing?
records.flat_map(&:to_a).map { |a| a.join(',') }
#=> ["ID,BOATY", "Name,McBoatface, Boaty", "ID,TRAINY", "Name,McTrainface, Trainy"]
records = [
{
ID: 'BOATY',
Name: 'McBoatface, Boaty'
},
{
ID: 'TRAINY',
Name: 'McTrainface, Trainy'
}
]
# straightforward code
result = []
records.each do |hash|
  hash.each do |key, value|
    result << "#{key},#{value}"
  end
end
puts result.inspect

# a rubyish way (probably less efficient; I have not benchmarked it)
puts records.map(&:to_a).flatten(1).map { |pair| pair.join(',') }.inspect
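Both of these should print:
["ID,BOATY", "Name,McBoatface, Boaty", "ID,TRAINY", "Name,McTrainface, Trainy"]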
Hope it helps.
li = []
records.each do |rec|
  rec.each do |k, v|
    li << "#{k},#{v}"
  end
end
print li
["ID,BOATY", "Name,McBoatface, Boaty", "ID,TRAINY", "Name,McTrainface, Trainy"]
You sure you wanna do it this way?
Check out Marshal. Or JSON.
You could even do it this stupid way using Hash#inspect and eval:
serialized_hashes = records.map(&:inspect) # ["{ID: 'Boaty'...", ...]
unserialized = serialized_hashes.map { |s| eval(s) }
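For example, a minimal JSON round-trip (a sketch using only the standard json library):
require 'json'
json = records.to_json
# json now contains: [{"ID":"BOATY","Name":"McBoatface, Boaty"},{"ID":"TRAINY","Name":"McTrainface, Trainy"}]
restored = JSON.parse(json, symbolize_names: true)
# restored == records, since symbolize_names: true turns the string keys back into symbols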
I have the following array and am struggling to format it for my needs.
consolidated = [
{:name=>"Bob", :details=>{"work"=>"Carpenter", "age"=>"26", "Experience"=>"6"} },
{:name=>"Colin", :details=>{"work"=>"painting", "age"=>"20", "Experience"=>"4"} }
]
I am trying to format it as below:
Bob work Carpenter
age 26
Experience 6
Colin work painting
age 20
Experience 4
I tried the following:
require 'csv'
CSV.open("output.csv", "wb") do |csv|
csv << ["name", "nature", "details"]
consolidated.each do |val|
csv << [val[:name], val[:details]]
end
end
#=> [{:name=>"Bob", :details=>{"work"=>"Carpenter", "age"=>"26", "Experience"=>"6"}},
# {:name=>"Colin", :details=>{"work"=>"painting", "age"=>"20", "Experience"=>"4"}}]
but it prints the following
name nature details
Bob "work"=>"Carpenter", "age"=>"26", "Experience"=>"6"
Colin "work"=>"painting", "age"=>"20", "Experience"=>"4"
I'm not exactly sure how to iterate over the nested hash inside the first loop to get the expected format.
Thanks.
Here's something to get you started:
require 'csv'
data = [
{:name => "Bob", :details=>{"work"=>"Carpenter", "age"=>"26", "Experience"=>"6"}},
{:name => "Colin", :details=>{"work"=>"painting", "age"=>"20", "Experience"=>"4"}}
]
str = CSV.generate do |csv|
  data.each do |datum|
    datum[:details].each do |detail_key, detail_value|
      csv << [datum[:name], detail_key, detail_value]
    end
  end
end
puts str
# >> Bob,work,Carpenter
# >> Bob,age,26
# >> Bob,Experience,6
# >> Colin,work,painting
# >> Colin,age,20
# >> Colin,Experience,4
Simply iterate over all the details and emit a new row for each key-value pair, adding the person's name.
This gets you almost what you need; it's only missing the blank rows between sections, and the person's name is duplicated on each line. It'll be your homework to find out how to add those improvements.
I'm not familiar with the CSV generation part (so I'm assuming it works as you have written it), but you can iterate over your object this way:
consolidated = [{:name => "Bob", :details=>{"work"=>"Carpenter", "age"=>"26", "Experience"=>"6"}}, {:name => "Colin", :details=> {"work"=>"painting", "age"=>"20", "Experience"=>"4"}}]
CSV.open("output.csv", "wb") do |csv|
csv << ["name", "nature", "details"]
consolidated.each do |val|
details = val[:details]
nature_1 = details.keys.first
detail_1 = details.delete(nature_1)
csv << [val[:name], nature_1, detail_1]
details.each do |k, v|
csv << [nil, k, v]
end
end
end
Note: this will mutate your original consolidated array, because delete removes the first key-value pair from each val[:details] hash. If you want to preserve the original, deep-copy it first (a plain dup is shallow, so the nested hashes would still be modified), or change the logic so it does not delete the first key-value pair from val[:details].
You need to iterate over the embedded hash with the each_pair iterator.
Something like this:
data = [
  { :name => "Bob", :details => { "work" => "Carpenter", "age" => "26", "Experience" => "6" } }
]

CSV.open("output.csv", "wb") do |csv|
  csv << ["name", "nature", "details"]
  data.each do |val|
    csv << [val[:name], 'work', val[:details]['work']]
    val[:details].each_pair do |key, value|
      # drop the first pair here because it was already used on the line above
      next if key == 'work'
      csv << ["", key, value]
    end
  end
end
There is a Hash like this:
params = { k1: :v1, k2: :v2, etc: :etc }
I need it converted to a string like this:
k1="v1", k2="v2", etc="etc"
I have a working version:
str = ""
params.each_pair { |k,v| str << "#{k}=\"#{v}\", " }
But it smells like ten PHP spirits...
What's the Ruby way to do this?
Try this:
str = params.map {|p| '%s="%s"' % p }.join(', ')
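Here p is each [key, value] pair from the hash, and String#% fills both %s placeholders from that two-element array:
'%s="%s"' % [:k1, :v1]
#=> "k1=\"v1\""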
Try this...
params.collect { |k, v| "#{k}=\"#{v}\"" }.join(", ")
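For the params hash above this returns the requested string:
k1="v1", k2="v2", etc="etc"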