How to write a CSV without headers - ruby

I have a file that looks like this:
milk 7,dark 0,white 0,sugar free 1
milk 0,dark 3,white 0,sugar free 0
milk 0,dark 3,white 0,sugar free 5
milk 0,dark 1,white 5,sugar free 3
There are no headers in that CSV. However, every time I open this file in a program like Numbers, it looks like this:
Is this a problem with how I wrote data to the CSV or a problem with the CSV program... it probably is because the choice of delimiter is wrong.
My code when writing looks like this:
def self.write(output_path:, data:, write_headers: false)
  CSV.open(output_path, "w", write_headers: write_headers) do |csv|
    data.each do |row|
      csv << row
    end
  end
end
What does the write_headers part even do?

Is this a problem with how I wrote data to the CSV or a problem with the CSV program... it probably is because the choice of delimiter is wrong.
Your CSV data is perfectly fine. CSV is only loosely standardized and there's no concept of marking a header row as such within the file. It cannot be distinguished from any other row, syntax-wise.
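To illustrate that a header row exists only in the reader's interpretation, here is a minimal sketch that parses the same two lines twice: once as plain rows and once with headers: true. The sample string is taken from the question's data; the variable names are illustrative.

```ruby
require 'csv'

# Two lines from the question's file; nothing marks either one as a header.
data = "milk 7,dark 0,white 0,sugar free 1\nmilk 0,dark 3,white 0,sugar free 0\n"

# Parsed without options, both lines are ordinary rows.
plain = CSV.parse(data)

# With headers: true, the reader chooses to treat the first line as headers.
table = CSV.parse(data, headers: true)

puts plain.size            # rows when no header is assumed
puts table.headers.inspect # the first line, now used as column names
puts table.size            # data rows left once the first line became headers
```

A spreadsheet application presumably makes the same kind of choice when it opens the file, which would explain why the first row shows up as a header even though the file itself is fine.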
What does the write_headers part even do?
It's used together with headers. From the docs for CSV.new:
:write_headers
When true and :headers is set, a header row will be added to the output.
Example:
require 'csv'
CSV.generate(headers: %w[a b c], write_headers: true) do |csv|
  csv << [1, 2, 3]
  csv << [4, 5, 6]
end
#=> "a,b,c\n1,2,3\n4,5,6\n"
as opposed to:
CSV.generate(headers: %w[a b c], write_headers: false) do |csv|
  csv << [1, 2, 3]
  csv << [4, 5, 6]
end
#=> "1,2,3\n4,5,6\n"

write_headers adds a header row to the output when it is set to true. I think setting headers: false in your options hash will fix your issue (when headers is set to true, the first line of the CSV is treated as headers).

Related

ruby - create CSV::Table from 2d array

I have a 2D array... is there any way to create a CSV::Table with the first row treated as headers, assuming all rows have the same number of fields as there are headers?
You can create a CSV::Table object with headers from a 2D array using CSV.parse.
First convert your 2D array to a string in which the values in each row are joined by commas and the rows are joined by newlines, then pass that string to CSV.parse along with the headers: true option.
require 'csv'
sample_array = [
  ["column1", "column2", "column3"],
  ["r1c1", "r1c2", "r1c3"],
  ["r2c1", "r2c2", "r2c3"],
  ["r3c1", "r3c2", "r3c3"],
]
csv_data = sample_array.map { _1.join(",") }.join("\n")
table = CSV.parse(csv_data, headers: true)
p table
p table.headers
p table[0]
p table[1]
p table[2]
=>
#<CSV::Table mode:col_or_row row_count:4>
["column1", "column2", "column3"]
#<CSV::Row "column1":"r1c1" "column2":"r1c2" "column3":"r1c3">
#<CSV::Row "column1":"r2c1" "column2":"r2c2" "column3":"r2c3">
#<CSV::Row "column1":"r3c1" "column2":"r3c2" "column3":"r3c3">
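Joining fields with commas assumes that no value contains a comma, quote, or newline. As an alternative sketch (not the approach used in the answer above), the table can be assembled directly from CSV::Row objects, which skips the string round trip. The sample values here are made up so that one of them contains an embedded comma.

```ruby
require 'csv'

# Illustrative data; "r1,c2" contains a comma that a plain join would break on.
sample_array = [
  ["column1", "column2", "column3"],
  ["r1c1", "r1,c2", "r1c3"],
  ["r2c1", "r2c2", "r2c3"],
]

# Split off the first row as headers and wrap every remaining row in a CSV::Row.
headers, *rows = sample_array
table = CSV::Table.new(rows.map { |fields| CSV::Row.new(headers, fields) })

p table.headers
p table[0]["column2"]
```

Because no CSV text is generated and re-parsed, the field values reach the table unchanged.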
Below is a basic example of creating a CSV file in Ruby:
require 'csv'
hash = {a: [1, 2, 3], b: [4, 5, 6]}
CSV.open("my_file.csv", "wb") do |csv|
  # Write the header row first, then one row per hash value.
  csv << %w(header1 header2 header3)
  hash.each_value do |array|
    csv << array
  end
end

How to extract multiple columns from a csv file with ruby?

Right now I can extract 1 column (column 6) from the csv file. How could I edit the script below to extract more than 1 column? Let's say I also want to extract column 9 and 10 as well as 6. I would want the output to be such that column 6 ends up in column 1 of the output file, 9 in the 2nd column of the output file, and column 10 in the 3rd column of the output file.
ruby -rcsv -e 'CSV.foreach(ARGV.shift) {|row| puts row[5]}' input.csv &> output.csv
Since row is an array, your question boils down to how to pick certain elements from an array; this is not related to CSV.
You can use values_at:
row.values_at(5,6,9,10)
returns the fields at the (zero-based) indices 5, 6, 9 and 10, in that order.
If you want to present these picked fields in a different order, it is however easier to map each index explicitly:
output_row = Array.new(row.size) # Or row.dup, depending on your needs
output_row[1] = row[6]
# Or, if you have used row.dup and want to swap two fields:
output_row[1],output_row[6] = row[6],row[1]
# and so on
out_csv.puts(output_row)
This assumes that you have previously defined
out_csv=CSV.new(STDOUT)
since you want the new CSV to be written to standard output.
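Putting the pieces together for the question's case, here is a minimal sketch that pulls columns 6, 9 and 10 (indices 5, 8 and 9) into a three-column result. The temporary input file and its contents are hypothetical stand-ins for input.csv.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical input file with two rows of ten fields each.
input = Tempfile.new(['input', '.csv'])
input.write("a1,a2,a3,a4,a5,a6,a7,a8,a9,a10\n")
input.write("b1,b2,b3,b4,b5,b6,b7,b8,b9,b10\n")
input.close

# Columns 6, 9 and 10 in one-based terms are indices 5, 8 and 9.
wanted = [5, 8, 9]

# Build the output CSV as a string, one reduced row per input row.
output = CSV.generate do |csv|
  CSV.foreach(input.path) { |row| csv << row.values_at(*wanted) }
end

print output
```

values_at returns the fields in the order the indices are given, so reordering the output columns only requires reordering the wanted array.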
Let's first create a (header-less) CSV file:
require 'csv'
enum = 1.step
FNameIn = 't_in.csv'
CSV.open(FNameIn, "wb") { |csv| 3.times { csv << 5.times.map { enum.next } } }
#=> 3
I've assumed the file contains string representations of integers.
The file contains the three lines:
File.read(FNameIn).each_line { |line| p line }
"1,2,3,4,5\n"
"6,7,8,9,10\n"
"11,12,13,14,15\n"
Now let's extract the columns at indices 1 and 3. These columns are to be written to the output file in that order.
cols = [1, 3]
Now write to the CSV output file.
arr = CSV.read(FNameIn, converters: :integer).
map { |row| row.values_at(*cols) }
#=> [[2, 4], [7, 9], [12, 14]]
FNameOut = 't_out.csv'
CSV.open(FNameOut, 'wb') { |csv| arr.each { |row| csv << row } }
We have written three lines:
File.read(FNameOut).each_line { |line| p line }
"2,4\n"
"7,9\n"
"12,14\n"
which we can read back into an array:
CSV.read(FNameOut, converters: :integer)
#=> [[2, 4], [7, 9], [12, 14]]
Adapting these steps to run from the command line is straightforward.

Consolidate csv data in ruby to get totals/sums of unique values

I'm still struggling with a basic problem I have not found an answer to online.
I am getting CSV like data as name and quantity:
Foo, 1.5
Bar, 1.2
Foo, 1.1
...
And want to consolidate it to unique names with the totals as a new value:
Foo, 2.6 #total of both Foo lines
Bar, 1.2
...
Every single time the data set is not large, but the task is quite repetitive.
I tried to convert it into an array of hashes, finding uniq names, and then use inject, but somehow it got quite complicated and did not work. Also, looping through everything seems not to be the ideal approach.
Does anyone have a nice and easy idea or solution I am missing? (I only found "Extract value from row in csv and sum it" for PHP.)
First of all, you can use Ruby's CSV library to parse and convert your CSV data:
require 'csv'
csv_data = "Foo, 1.5\nBar, 1.2\nFoo, 1.1"
data_array = CSV.parse(csv_data, converters: :numeric)
#=> [["Foo", 1.5], ["Bar", 1.2], ["Foo", 1.1]]
To sum the values I'd use a hash along with each_with_object:
data_array.each_with_object(Hash.new(0)) { |(k, v), h| h[k] += v }
#=> {"Foo"=>2.6, "Bar"=>1.2}
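If the consolidated totals need to go back out as CSV, as in the question's desired result, a minimal sketch could look like the following. It reuses the sample data from above; the names totals and consolidated are illustrative.

```ruby
require 'csv'

csv_data = "Foo, 1.5\nBar, 1.2\nFoo, 1.1"

# Sum the quantities per name, as in the each_with_object approach above.
totals = CSV.parse(csv_data, converters: :numeric)
            .each_with_object(Hash.new(0)) { |(name, quantity), sums| sums[name] += quantity }

# Write one row per unique name with its total.
consolidated = CSV.generate do |csv|
  totals.each { |name, total| csv << [name, total] }
end

print consolidated
```

Hashes keep insertion order, so the names appear in the order they were first seen in the input.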
Passing 0.0 as the default value for your Hash accounts nicely for the first occurrence of each item:
input = [ ['Foo', 1.5],
          ['Bar', 1.2],
          ['Foo', 1.1] ]
result = input.inject(Hash.new(0.0)) do |sum, (key, value)|
  sum[key] += value
  sum
end
p result
A hash keyed by name seems to be the easiest approach.
Let's say that:
myCSV=[["foo",1.5],["bar",1.2],["foo",1.1]]
Just do:
myCSV.each_with_object(Hash.new(0.0)){|row,sum| sum[row[0]]+=row[1]}
=> {
"foo" => 2.6,
"bar" => 1.2
}
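If you are reading from a file, it's more or less the same using the CSV library. Pass converters: :numeric so that the quantity is read as a number rather than a string:
sum=Hash.new(0.0)
CSV.foreach("path/to/file.csv", converters: :numeric) do |row|
  sum[row[0]]+=row[1]
end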

How do I skip headers while writing CSV?

I am writing a CSV file and CSV.dump outputs two header lines which I don't want.
I tried setting :write_headers => false but still it outputs a header:
irb> A = Struct.new(:a, :b)
=> A
irb> a = A.new(1,2)
=> #<struct A a=1, b=2>
irb> require 'csv'
=> true
irb> puts CSV.dump [a], '', :write_headers => false, :headers=>false
class,A
a=,b=
1,2
I don't think you can do it with option parameters. But you can easily accomplish what you want by using the generate method instead of dump:
irb> arr = [a, a]
=> [#<struct A a=1, b=2>, #<struct A a=1, b=2>]
irb> csv_string = CSV.generate do |csv|
irb* arr.each {|a| csv << a}
irb> end
irb> puts csv_string
1,2
1,2
=> nil
I think the problem is two-fold:
CSV.dump [a]
wraps the struct instance a in an array, which CSV then tries to marshal. While that might sometimes be useful, when you are generating a CSV file for consumption by some other non-Ruby app that recognizes CSV, you end up with values that can't be used. Looking at the output, it isn't CSV:
class,A
a=,b=
1,2
Looking at it in IRB shows:
=> "class,A\na=,b=\n1,2\n"
which, again, isn't going to be accepted by something like a spreadsheet or database. So, another tactic is needed.
Removing the array from a doesn't help:
CSV.dump a
=> "class,Fixnum\n\n\n\n"
Taking a different approach, I looked at a standard way of generating CSV from an array:
puts a.to_a.to_csv
=> 1,2
An alternate way to create it is:
CSV.generate do |csv|
  csv << a.to_a
end
=> "1,2\n"
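Extending this to a whole collection of structs, here is a minimal sketch that writes only data rows and, for comparison, the same rows with a single header row. Point is a hypothetical struct standing in for the question's A, and deriving the header from the struct's member names is an assumption about what a useful header would be.

```ruby
require 'csv'

# Point is a hypothetical struct standing in for the question's A.
Point = Struct.new(:a, :b)
records = [Point.new(1, 2), Point.new(3, 4)]

# Data rows only: each struct is converted to a plain array first.
without_header = CSV.generate do |csv|
  records.each { |record| csv << record.to_a }
end

# Same rows, preceded by one header row built from the struct's member names.
with_header = CSV.generate(headers: Point.members.map(&:to_s), write_headers: true) do |csv|
  records.each { |record| csv << record.to_a }
end

print without_header
print with_header
```

Either string can be written to a file as is; neither contains the class and accessor lines that CSV.dump emitted.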

How to parse output of array.inspect back into an array

I want to store multidimensional arrays in text files and reload them efficiently. The tricky part is that the array includes strings which could look like " ] , [ \\\"" or anything.
Easiest way of writing the table to file is just as my_array.inspect (right?)
How do I then recreate the array as quickly and painlessly as possible from a string read back from the text file that might look like "[\" ] , [ \\\\\\\"\"]" (as in the above case)?
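If your array only includes objects that can be written as literals, such as numbers, strings, arrays, and hashes, you can use eval. Only do this with data you trust, because eval executes whatever Ruby code the string contains.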
a = [1, 2, 3].inspect
# => "[1, 2, 3]"
eval(a)
# => [1, 2, 3]
In my opinion, this sounds like too much trouble. Use YAML instead.
require 'yaml'
a = [ [ [], [] ], [ [], [] ] ]
File.open("output.yml", "w") do |f|
  f.write a.to_yaml
end
# load_file opens, parses, and closes the file in one call
b = YAML.load_file("output.yml")
As an alternative, you could use JSON instead.
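A minimal sketch of the JSON route, assuming the array holds only strings, numbers, booleans, nil, and nested arrays (other objects, such as symbols, would not come back as the same type). The sample array and the temporary file are illustrative.

```ruby
require 'json'
require 'tempfile'

# A nested array whose strings contain brackets, commas, quotes and backslashes.
original = [["plain", " ] , [ \\\""], [1, 2.5, nil, true]]

# Serialize to a file; JSON takes care of escaping the awkward characters.
file = Tempfile.new(['array', '.json'])
file.write(JSON.generate(original))
file.close

# Read the text back and rebuild the array.
restored = JSON.parse(File.read(file.path))

puts restored == original
```

Unlike eval, JSON.parse only builds data and never executes code from the file.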
Say you have an array ary.
You could write the array to a file, opened in binary mode because Marshal.dump produces binary data:
File.open(path, 'wb') { |f| f.write Marshal.dump(ary) }
and then re-create the array by reading the file into a string and saying
ary = Marshal.load(File.binread(path))

Resources