Create a CSV::Table from a 2D array in Ruby

I have a 2D array... is there any way to create a CSV::Table with the first row treated as headers, assuming all rows have the same number of fields as the header row?

You can create a CSV::Table object with headers from a 2D array using CSV.parse.
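First convert your 2D array to a string in which the values in each row are joined by commas and the rows are joined by newlines, then pass that string to CSV.parse along with the headers: true option. Note that a plain join like this is only safe when no field itself contains a comma, a double quote, or a newline.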
require 'csv'
sample_array = [
  ["column1", "column2", "column3"],
  ["r1c1", "r1c2", "r1c3"],
  ["r2c1", "r2c2", "r2c3"],
  ["r3c1", "r3c2", "r3c3"],
]
csv_data = sample_array.map {_1.join(",")}.join("\n")
table = CSV.parse(csv_data, headers: true)
p table
p table.headers
p table[0]
p table[1]
p table[2]
=>
#<CSV::Table mode:col_or_row row_count:4>
["column1", "column2", "column3"]
#<CSV::Row "column1":"r1c1" "column2":"r1c2" "column3":"r1c3">
#<CSV::Row "column1":"r2c1" "column2":"r2c2" "column3":"r2c3">
#<CSV::Row "column1":"r3c1" "column2":"r3c2" "column3":"r3c3">
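If the fields may contain commas or quotes, one alternative is to skip the string round trip and build the table from CSV::Row objects. This is only a sketch of that idea, not part of the answer above; the variable names (headers, data_rows, direct_table) and the sample value containing a comma are made up for illustration.

```ruby
require 'csv'

sample_array = [
  ["column1", "column2", "column3"],
  ["r1c1", "r1, with comma", "r1c3"],
  ["r2c1", "r2c2", "r2c3"],
]

# Split the first row (headers) from the remaining data rows.
headers, *data_rows = sample_array

# Wrap each data row in a CSV::Row that carries the headers,
# then hand the rows to CSV::Table.new.
rows = data_rows.map { |fields| CSV::Row.new(headers, fields) }
direct_table = CSV::Table.new(rows)

p direct_table.headers        # ["column1", "column2", "column3"]
p direct_table[0]["column2"]  # "r1, with comma"
```

Because no CSV text is generated and re-parsed, nothing needs to be quoted or escaped along the way.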

Below is a basic example of creating a CSV file in Ruby:
hash = {a: [1, 2, 3], b: [4, 5, 6]}
require 'csv'
CSV.open("my_file.csv", "wb") do |csv|
  csv << %w(header1 header2 header3)
  hash.each_value do |array|
    csv << array
  end
end
#diwanshu-tyagi, will this help resolve your question? If not, please add an example of your input value and I'll update this answer.
Thanks

Related

How to extract multiple columns from a csv file with ruby?

Right now I can extract 1 column (column 6) from the csv file. How could I edit the script below to extract more than 1 column? Let's say I also want to extract column 9 and 10 as well as 6. I would want the output to be such that column 6 ends up in column 1 of the output file, 9 in the 2nd column of the output file, and column 10 in the 3rd column of the output file.
ruby -rcsv -e 'CSV.foreach(ARGV.shift) {|row| puts row [5]}' input.csv &> output.csv
Since row is an array, your question boils down to how to pick certain elements from an array; this is not related to CSV.
You can use values_at:
row.values_at(5,6,9,10)
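returns the fields at the (zero-based) indices 5, 6, 9 and 10, in that order. For the 6th, 9th and 10th columns asked about in the question, the call would be row.values_at(5, 8, 9).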
If you want to present these picked fields in a different order, it is however easier to map each index explicitly:
output_row = Array.new(row.size) # Or row.dup, depending on your needs
output_row[1] = row[6]
# Or, if you have used row.dup and want to swap the fields:
output_row[1],output_row[6] = row[6],row[1]
# and so on
out_csv.puts(output_row)
This assumes that you have previously defined
out_csv=CSV.new(STDOUT)
since you want to have your new CSV be created on standard output.
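Putting the values_at idea together for the columns named in the question (6, 9 and 10, i.e. zero-based indices 5, 8 and 9) might look like the sketch below. The inline ten-column sample stands in for input.csv and is made up for illustration.

```ruby
require 'csv'

# Hypothetical ten-column input standing in for input.csv.
input = <<~INPUT
  a1,a2,a3,a4,a5,a6,a7,a8,a9,a10
  b1,b2,b3,b4,b5,b6,b7,b8,b9,b10
INPUT

# Columns 6, 9 and 10 are at zero-based indices 5, 8 and 9.
picked = CSV.parse(input).map { |row| row.values_at(5, 8, 9) }

# Array#to_csv (available after require 'csv') renders one CSV line per row.
output = picked.map(&:to_csv).join
puts output
# a6,a9,a10
# b6,b9,b10
```

Applied to the original one-liner, the same idea would read: ruby -rcsv -e 'CSV.foreach(ARGV.shift) { |row| puts row.values_at(5, 8, 9).to_csv }' input.csv > output.csv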
Let's first create a (header-less) CSV file:
enum = 1.step
FNameIn = 't_in.csv'
CSV.open(FNameIn, "wb") { |csv| 3.times { csv << 5.times.map { enum.next } } }
#=> 3
I've assumed the file contains string representations of integers.
The file contains the three lines:
File.read(FNameIn).each_line { |line| p line }
"1,2,3,4,5\n"
"6,7,8,9,10\n"
"11,12,13,14,15\n"
Now let's extract the columns at indices 1 and 3. These columns are to be written to the output file in that order.
cols = [1, 3]
Now write to the CSV output file.
arr = CSV.read(FNameIn, converters: :integer).
map { |row| row.values_at(*cols) }
#=> [[2, 4], [7, 9], [12, 14]]
FNameOut = 't_out.csv'
CSV.open(FNameOut, 'wb') { |csv| arr.each { |row| csv << row } }
We have written three lines:
File.read(FNameOut).each_line { |line| p line }
"2,4\n"
"7,9\n"
"12,14\n"
which we can read back into an array:
CSV.read(FNameOut, converters: :integer)
#=> [[2, 4], [7, 9], [12, 14]]
These operations translate fairly directly into a one-liner or short script that can be run from the command line.
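One possible shape for such a script is sketched below. The helper name extract_columns, the script name extract.rb, and the argument order are assumptions made for this example, not something from the answer above.

```ruby
require 'csv'

# Hypothetical helper: copy the columns at the given zero-based indices
# from one CSV file to another, in the order the indices are listed.
def extract_columns(in_path, out_path, cols)
  CSV.open(out_path, 'wb') do |out|
    CSV.foreach(in_path) { |row| out << row.values_at(*cols) }
  end
end

# When run as a script, e.g.: ruby extract.rb t_in.csv t_out.csv 1 3
if $PROGRAM_NAME == __FILE__ && ARGV.size >= 3
  in_path, out_path, *cols = ARGV
  extract_columns(in_path, out_path, cols.map { |c| Integer(c) })
end
```

CSV.foreach reads the input one row at a time, so unlike CSV.read the whole file does not have to fit in memory. Without converters the values stay strings, which is enough when they are only copied.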

How to write a CSV without headers

I have a file that looks like this:
milk 7,dark 0,white 0,sugar free 1
milk 0,dark 3,white 0,sugar free 0
milk 0,dark 3,white 0,sugar free 5
milk 0,dark 1,white 5,sugar free 3
There are no headers in that CSV. However, every time I open this file in a program like Numbers, it looks like this:
Is this a problem with how I wrote data to the CSV or a problem with the CSV program... it probably is because the choice of delimiter is wrong.
My code when writing looks like this:
def self.write(output_path:, data:, write_headers: false)
  CSV.open(output_path, "w", write_headers: write_headers) do |csv|
    data.each do |row|
      csv << row
    end
  end
end
What does the write_headers part even do?
Is this a problem with how I wrote data to the CSV or a problem with the CSV program... it probably is because the choice of delimiter is wrong.
Your CSV data is perfectly fine. CSV is only loosely standardized and there's no concept of marking a header row as such within the file. It cannot be distinguished from any other row, syntax-wise.
What does the write_headers part even do?
It's used together with headers. From the docs for CSV.new:
:write_headers
When true and :headers is set, a header row will be added to the output.
Example:
require 'csv'
CSV.generate(headers: %w[a b c], write_headers: true) do |csv|
  csv << [1, 2, 3]
  csv << [4, 5, 6]
end
#=> "a,b,c\n1,2,3\n4,5,6\n"
as opposed to:
CSV.generate(headers: %w[a b c], write_headers: false) do |csv|
  csv << [1, 2, 3]
  csv << [4, 5, 6]
end
#=> "1,2,3\n4,5,6\n"
write_headers adds a header row to the output when it is set to true. I think setting headers: false in your options hash will fix your issue (when headers is set to true, the first line of the CSV is treated as headers).
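To illustrate the point that a header row cannot be told apart from a data row inside the file, here is a small sketch that parses the same text twice. The two-line sample is taken from the question; the variable names are made up for this example.

```ruby
require 'csv'

data = "milk 7,dark 0,white 0,sugar free 1\nmilk 0,dark 3,white 0,sugar free 0\n"

# Without the headers option, every line is a data row.
plain = CSV.parse(data)

# With headers: true, the reader decides to consume the first line as headers.
with_headers = CSV.parse(data, headers: true)

p plain.size            # 2
p with_headers.size     # 1
p with_headers.headers  # ["milk 7", "dark 0", "white 0", "sugar free 1"]
```

The file is identical in both cases; only the reader's interpretation differs, which is presumably what a spreadsheet program does when it shows the first row as a header.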

Is it possible to generate a CSV using hashes as rows?

I have a big array of hashes, like this
# note that the key order isn't consistent
data = [
  {foo: 1, bar: 2, baz: 3},
  {foo: 11, baz: 33, bar: 22}
]
I want to turn this into a CSV
foo,bar,baz
1,2,3
11,22,33
I am doing so like this:
columns = [:foo, :bar, :baz]
csv_string = CSV.generate do |csv|
  csv << columns
  data.each do |d|
    row = []
    columns.each do |column|
      row << d[column]
    end
    csv << row
  end
end
Is there a better way to do this? What I'd like to do is something like...
csv_string = CSV.generate do |csv|
  csv << [:foo, :bar, :baz]
  data.each do |row|
    csv.add_row_hash row
  end
end
With the appropriate options passed to generate, you can achieve what you want. Note that you can add the hash directly to the CSV once the headers are set.
c = CSV.generate(:headers => [:foo, :bar, :baz], :write_headers => true) do |csv|
  data.each { |row| csv << row }
end
Output:
foo,bar,baz
1,2,3
11,22,33
If keys can be missing, you first need to collect all the possible keys:
keys = data.map(&:keys).flatten.uniq
Then map each row using those keys. Note the splat: values_at takes the keys as separate arguments, not as a single array.
csv_string = CSV.generate do |csv|
  csv << keys
  data.each do |row|
    csv << row.values_at(*keys)
  end
end
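A runnable sketch of the missing-keys case, with made-up sample data in which each hash lacks one key, shows what ends up in the output: a missing key yields nil, which CSV writes as an empty field.

```ruby
require 'csv'

# Hypothetical data: each hash is missing one of the keys.
data = [
  {foo: 1, bar: 2},
  {foo: 11, baz: 33}
]

# Collect every key that appears anywhere, in first-seen order.
keys = data.flat_map(&:keys).uniq

csv_string = CSV.generate do |csv|
  csv << keys
  # The splat passes the keys as separate arguments to values_at.
  data.each { |row| csv << row.values_at(*keys) }
end

puts csv_string
# foo,bar,baz
# 1,2,
# 11,,33
```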
My first idea: Your data could be used to insert data into a database table.
If you combine this with a csv-output of a DB-table you have another solution.
Example:
data = [
  {foo: 1, bar: 2, baz: 3},
  {foo: 11, baz: 33, bar: 22},
  {foo: 11, baz: 33, bar: 22, xx: 3}, #additional parameters are no problem
]
#Prepare DB as a helper
require 'sequel'
DB = Sequel.sqlite
DB.extension(:sequel_3_dataset_methods) #define to_csv
DB.create_table(:tab){
  add_column :foo
  add_column :bar
  add_column :baz
}
DB[:tab].multi_insert(data) #fill table
#output as csv (the gsub is necessary on Windows, maybe not necessary on other OSes)
puts DB[:tab].to_csv.gsub("\r\n","\n")
Disadvantage: You need Sequel
Advantage: You can adapt the column order quite easily:
puts DB[:tab].select(:bar, :baz).to_csv.gsub("\r\n","\n")

How do I skip headers while writing CSV?

I am writing a CSV file and CSV.dump outputs two header lines which I don't want.
I tried setting :write_headers => false but still it outputs a header:
irb> A = Struct.new(:a, :b)
=> A
irb> a = A.new(1,2)
=> #<struct A a=1, b=2>
irb> require 'csv'
=> true
irb> puts CSV.dump [a], '', :write_headers => false, :headers=>false
class,A
a=,b=
1,2
I don't think you can do it with option parameters. But you can easily accomplish what you want by using the generate method instead of dump:
irb> arr = [a, a]
=> [#<struct A a=1, b=2>, #<struct A a=1, b=2>]
irb> csv_string = CSV.generate do |csv|
irb* arr.each {|a| csv << a}
irb> end
irb> puts csv_string
1,2
1,2
=> nil
I think the problem is two-fold:
CSV.dump [a]
wraps an instance of the struct a in an array, which CSV then tries to marshal. While that might be useful sometimes, when you are trying to generate a CSV file for consumption by some other non-Ruby app that recognizes CSV, you're going to end up with values that can't be used. Looking at the output, it isn't plain CSV data:
class,A
a=,b=
1,2
Looking at it in IRB shows:
=> "class,A\na=,b=\n1,2\n"
which, again, isn't going to be accepted by something like a spreadsheet or database. So, another tactic is needed.
Removing the array from a doesn't help:
CSV.dump a
=> "class,Fixnum\n\n\n\n"
Heading off a different way, I looked at a standard way of generating CSV from an array:
puts a.to_a.to_csv
=> 1,2
An alternate way to create it is:
CSV.generate do |csv|
  csv << a.to_a
end
=> "1,2\n"
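Extending that idea to several structs, optionally with a single header line taken from the struct's member names, could look like the sketch below. The struct, its members, and the sample values are made up for this example.

```ruby
require 'csv'

# Hypothetical struct class and records, for illustration only.
point_class = Struct.new(:x, :y)
points = [point_class.new(1, 2), point_class.new(3, 4)]

csv_string = CSV.generate do |csv|
  # Optional: one header line built from the struct's member names.
  csv << point_class.members
  # Struct#to_a returns the values in member order.
  points.each { |point| csv << point.to_a }
end

puts csv_string
# x,y
# 1,2
# 3,4
```

Leaving out the csv << point_class.members line gives output with no header at all.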

Transform data from one column to multiple columns in Ruby

I have two columns of data in CSV format, as shown below, from a prediction server. The first column is an index position for each variable within each prediction, so the data for a new prediction starts at index 1.
1,2.0
2,1.5
3,1.4
1,1.1
2,2.0
3,1.5
4,2.0
5,1.6
1,2.0
2,4.0
.
.
.
I would like to have the data in this format instead,
2.0,1.1,2.0
1.5,2.0,4.0
1.4,1.5
2.0
1.6
For ease of work, the empty 'cells' can be filled with zeros or #, e.g.
2.0,1.1,2.0
1.5,2.0,4.0
1.4,1.5,0
0,2.0,0
0,1.6,0
Does anyone have an elegant way to do this in Ruby?
Let's try to transpose it with Array#transpose:
require 'csv'
# first get a 2d representation of the data (fn is the path to the input file)
rows = CSV.read(fn).slice_before { |row| "1" == row[0] }.map { |x| x.map { |y| y[1] } }
# we want to transpose the array but first we have to fill empty cells
max_length = rows.max_by { |x| x.length }.length
# the exclusive range (...) pads each row to exactly max_length cells
rows.each { |row| row.fill('#', row.length...max_length) }
# now we can transpose the array
pp rows.transpose
["2.0", "1.1", "2.0", "5.0"],
["1.5", "2.0", "4.0", "#"],
["1.4", "1.5", "#", "#"],
["#", "2.0", "#", "#"],
["#", "1.6", "#", "#"]
This should work for you:
require 'csv'
# read csv contents from file to array
rows = CSV.read("path/to/in_file.csv")
# group the values by their index (the first column)
res = Hash.new { |h, k| h[k] = [] }
rows.each do |(key, val)|
  res[key] << val
end
# write to output csv file
CSV.open("path/to/out_file.csv", "wb") do |csv|
  # sort by numeric index (the keys are strings, so a plain sort would put "10" before "2")
  res.sort_by { |k, _| k.to_i }.each do |_, values|
    csv << values
  end
end
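For the zero-padded variant of the desired output, a self-contained sketch combining slice_before, padding and transpose might look like this. The inline input is the sample from the question (without the elided rows), and "0" is used as the filler as the question suggests.

```ruby
require 'csv'

input = <<~INPUT
  1,2.0
  2,1.5
  3,1.4
  1,1.1
  2,2.0
  3,1.5
  4,2.0
  5,1.6
  1,2.0
  2,4.0
INPUT

# One array of values per prediction; a new prediction starts at index "1".
groups = CSV.parse(input)
            .slice_before { |index, _| index == "1" }
            .map { |group| group.map(&:last) }

# Pad every group to the length of the longest one so transpose is possible.
width = groups.map(&:size).max
padded = groups.map { |g| g + ["0"] * (width - g.size) }

output = padded.transpose.map(&:to_csv).join
puts output
# 2.0,1.1,2.0
# 1.5,2.0,4.0
# 1.4,1.5,0
# 0,2.0,0
# 0,1.6,0
```

Padding before transposing keeps each prediction in its own output column, even when a shorter prediction sits between two longer ones.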
