Ruby: Write to CSV if condition met - ruby

I am brand new to Ruby and using it to try to read/write to csv. So far, I have a script that does the following:
Imports data from a CSV file, storing select columns as a separate array (I don't need data from every column)
Performs calculations on the data, stores the results in newly created arrays
Transposes the arrays to table rows, to be outputted to a csv
table = [Result1, Result2, Result3].transpose
Currently, I am able to output the table using the following:
CSV.open(resultsFile, "wb",
  :write_headers => true,
  :headers => ["Result1", "Result2", "Result3"]
) do |csv|
  table.each do |row|
    csv << row
  end
end
My question is: how can I add a conditional so that only rows where one of the results equals a certain text string are output? For example, if the value in result2 is equal to "Apple", I want the data in that row to be written to the csv file. If not, skip that row.
I've tried placing if/else in a few different areas and have not had any success.
Thanks for any help

You could do something like below:
header = ["Result1", "Result2", "Result3"]
CSV.open(resultsFile, "wb", :write_headers=> true, :headers => header) do |csv|
table.each do |row|
csv << row if header.zip(row).to_h["Result2"] == "Apple"
end
end
zip merges two arrays into an array of arrays, where each sub-array holds the elements found at the same index in the input arrays. to_h converts any array of 2-element arrays into a hash. For example:
row = ["Orange", "Apple", "Guava"]
header = ["Result1", "Result2", "Result3"]
header.zip(row).to_h
=> {"Result1"=>"Orange", "Result2"=>"Apple", "Result3"=>"Guava"}
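Since the position of Result2 in the header is known, the same filter can also be sketched without building a hash for every row. This is only a minimal sketch: the sample rows are made up for illustration, and CSV.generate is used in place of CSV.open so that the result is a string rather than a file.

```ruby
require "csv"

header = ["Result1", "Result2", "Result3"]

# Hypothetical sample data standing in for the asker's transposed table.
table = [
  ["Orange", "Apple", "Guava"],
  ["Pear", "Banana", "Kiwi"],
  ["Plum", "Apple", "Fig"]
]

# Look up the column position once instead of zipping every row.
result2_index = header.index("Result2")

output = CSV.generate(write_headers: true, headers: header) do |csv|
  table.each do |row|
    # Only rows whose Result2 value is "Apple" are written.
    csv << row if row[result2_index] == "Apple"
  end
end

puts output
```

The zip/to_h version reads a little more clearly when several columns are checked by name; the index version avoids allocating a hash per row.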

Related

Ruby: How to iterate through a hash created from a csv file

I am trying to take an existing CSV file, add a fourth column to it, and then iterate through the second and third columns to create the fourth column's values. Using Ruby I've created hashes where the headers are the keys and the column values are the hash values (ex: "id"=>"1", "new_fruit" => "apple")
My practice CSV file looks like this: [image: practice CSV file]
My goal is to create a fourth column, "brand_new" (which I was able to do), and then add values to it by concatenating the values from the second and third columns (which I am stuck on). At the moment I just have a placeholder value (x) for the fourth column's values so I could see if adding the fourth column to the hash actually worked: [image: results with x = 1]
Here is my code:
require 'csv'
def self.import
table = []
CSV.foreach(File.path("practice.csv"), headers: true) do |row|
table.each do |row|
row["brand_new"] = full_name
end
table << row.to_h
end
table
end
def full_name
x = 1
return x
end
# Add another col, row by row:
import.each do |row|
row["brand_new"] = full_name
end
puts import
Any suggestions or guidance would be much appreciated. Thank you.
I simplified your code a bit. I read the file first, then iterate over the content that was read.
Note: Change col_sep to comma or delete it to use the default if needed.
require "csv"
def self.import
table = CSV.read("practice.csv", headers: true , col_sep: ";")
table.each do |row|
row["brand_new"] = "#{row["old_fruit"]} #{row["new_fruit"]}"
end
puts table
end
I use the read method to read the CSV file content. It allows you to directly access the column/cell values.
The line inside the loop shows how to concatenate the column values as a string:
"#{row["old_fruit"]} #{row["new_fruit"]}"
Refer to this old SO post and the CSV Ruby docs to learn more about working with CSV files.
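The same idea can be tried without a file on disk. The following is a minimal sketch, assuming a made-up three-column sample in place of practice.csv and the default comma separator; CSV.parse stands in for CSV.read.

```ruby
require "csv"

# Hypothetical stand-in for practice.csv (the real file is only shown as an image).
csv_text = <<~CSV
  id,old_fruit,new_fruit
  1,apple,banana
  2,pear,cherry
CSV

table = CSV.parse(csv_text, headers: true)

# Assigning to a header that does not exist yet appends a new field to the row.
table.each do |row|
  row["brand_new"] = "#{row["old_fruit"]} #{row["new_fruit"]}"
end

puts table.to_csv
```

Because the rows are modified in place, the table itself carries the new column afterwards and can be written back out with to_csv.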

How to create a new CSV row of data per X amount of strings in an array

I'm trying to create a spreadsheet from an array.
#Loop through each .olpOffer (product listing) and gather content from various elements
parse_page.css('.olpOffer').each do |a|
if a.css('.olpSellerName img').empty?
seller = a.css('.olpSellerName').text.strip
else
seller = a.css('.olpSellerName img').attr('alt').value
end
offer_price = a.css('.olpOfferPrice').text.strip
prime = a.css('.supersaver').text.strip
shipping_info = a.css('.olpShippingInfo').text.strip.squeeze(" ").gsub!(/(\n)/, '')
condition = a.css('.olpCondition').text.strip
fba = "FBA" unless a.css('.olpBadge').empty?
#Push data from each product listing into array
arr.push(seller,offer_price,prime,shipping_info,condition,fba)
end
#Need to make each product listing's data begin in new row [HELP!!]
CSV.open("file.csv", "wb") do |csv|
csv << ["Seller", "Price", "Prime", "Shipping", "Condition", "FBA"]
end
end
I need to reset the row that the array is writing to after the "FBA" column so that I don't end up with one huge row of data in row 2.
I can't figure out how to correlate each string to a specific column header. Should I not use an array?
I figured it out. I needed the array that I was feeding into my csv to create a new row after every 7 strings in the array. Here's how I did it:
# arr is an array holding some number of strings, always divisible by 7
rows = arr.each_slice(7)
CSV.open(file_name, "ab") do |csv|
  csv << [title, asin]
  rows.each do |row|
    csv << row
  end
end
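To see each_slice in isolation, here is a small sketch. The listing values are invented, and the slice size is taken from the six headers in the question rather than the 7 used above; CSV.generate is used so nothing is written to disk.

```ruby
require "csv"

headers = ["Seller", "Price", "Prime", "Shipping", "Condition", "FBA"]

# Hypothetical flat array: two listings of six fields each, pushed one after another.
arr = [
  "Shop A", "$10.00", "Prime", "Free shipping", "New", "FBA",
  "Shop B", "$9.50", "None", "+ $3.99 shipping", "Used", "None"
]

# each_slice cuts the flat array into chunks of headers.size elements, one chunk per row.
rows = arr.each_slice(headers.size).to_a

output = CSV.generate do |csv|
  csv << headers
  rows.each { |row| csv << row }
end

puts output
```

An alternative that avoids slicing altogether would be to push one array per listing inside the scraping loop (arr << [seller, offer_price, ...]), so that arr is already an array of rows.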

Ruby CSV::Table sort in place

I'm sorting a CSV::Table object. I have a table with headers ("date", "amount", "source") and O(50) entries.
Input:
data = CSV.table('filename.csv', headers:true) # note headers are :date, :source, :amount
amounts = []
data[:amount].each {|i| amounts << i.to_f}
data.sort_by! {|row| row[:amount]}
# error - not a defined function
data = data.sort_by {|row| row[:amount]}
# sorted but data is now an array not CSV::Table. would like to retain access to headers
I want a bang function to sort the table in place by the "amount" column without losing the CSV::Table structure. Specifically, I want the result to be a CSV::Table, so that I still have access to the headers. Right now, I'm getting an Array, which is not what I want.
I'm sure there is an easier way to do this, especially with the CSV::Table class. Any help?
You can use CSV::Table.new(data) to convert the Array back into a CSV::Table object, if that is what you want.
sort_by is a method from the Enumerable module, so it always returns an Array when a block is given.
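A minimal sketch of that round trip follows. The sample data is made up, and CSV.parse is given the options that CSV.table applies (headers, numeric converters, symbol header converters) so that no file is needed.

```ruby
require "csv"

# Hypothetical sample data with the headers from the question.
csv_text = <<~CSV
  date,amount,source
  2021-12-24,25.66,food
  2021-04-24,333.03,plates
  2021-12-07,5.00,coffee
CSV

data = CSV.parse(csv_text, headers: true,
                 converters: :numeric, header_converters: :symbol)

# sort_by returns an Array of CSV::Row objects; wrapping it restores the table.
sorted = CSV::Table.new(data.sort_by { |row| row[:amount] })

puts sorted.class
puts sorted.to_csv
```

This is not an in-place sort: a new table is built and would have to be assigned back to data.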
Suppose you define the following string:
txt=<<-CSV_TXT
Item, Type, Amount, Date
gasoline, expense, 200.00, 2022-01-01
Food, expense, 25.66, 2021-12-24
Plates, expense, 333.03, 2021-04-24
Presents, expense, 1500.01, 2021-12-15
Pay check, income, 2000, 2021-12-07
Consulting, income, 300, 2021-12-16
CSV_TXT
# for giggles, using a multi character separator of ', '
Now create a CSV Table from that (this in the IRB...):
> require 'csv'
=> true
> options={:col_sep=>", ", :headers=>true, :return_headers=>true}
=> {:col_sep=>", ", :headers=>true, :return_headers=>true}
> data=CSV.parse(txt, **options)
=> #<CSV::Table mode:col_or_row row_count:7>
We now have a CSV::Table with a header defined. If you use CSV::Table the header is not optional.
There are two ways (that I know of) you can now sort this table by the Date field and end up with a CSV::Table with the header unchanged. Neither is fully an 'in-place' solution.
The first creates a new CSV::Table after a round trip through an array of CSV::Rows. The call to .sort_by creates that array of CSV::Rows for you, and you can index a CSV::Row for sorting purposes. You use the first row of the existing table as the header row:
> data=CSV::Table.new([data[0]]+data[1..].sort_by{ |r| r[3] })
=> #<CSV::Table mode:col_or_row row_count:7>
The second is similar, but allows the header to be split off more easily by using .to_a to create an array. This also allows the individual rows to be filtered or otherwise processed further:
> data=CSV.parse(txt, **options).to_a
=>
[["Item", "Type", "Amount", "Date"],
...
> header=data.shift.to_csv(**options)
=> "Item, Type, Amount, Date\n"
Now you have data with the header split off. With that array, you can sort or filter or process at will; then put back into an array of CSV strings. This is all in place:
> data.sort_by!{|r| r[3]}.map!{|r| r.to_csv(**options)}
=>
["Plates, expense, 333.03, 2021-04-24\n",
"\"Pay check\", income, 2000, 2021-12-07\n",
"Presents, expense, 1500.01, 2021-12-15\n",
"Consulting, income, 300, 2021-12-16\n",
"Food, expense, 25.66, 2021-12-24\n",
"gasoline, expense, 200.00, 2022-01-01\n"]
(Note the field with "Pay check" is quoted. If any character from a multi-character :col_sep is in a field, Ruby will quote it...)
To print the first, just use puts data since Ruby knows how to print a CSV::Table; to print the second, you can do puts header,data.join("")
For the second, to rejoin the header and data into a new table, use parse with options again:
> data_new=CSV.parse(header+data.join(""), **options)
=> #<CSV::Table mode:col_or_row row_count:7>

Open CSV without reading header rows in Ruby

I'm opening CSV using Ruby:
CSV.foreach(file_name, "r+") do |row|
next if row[0] == 'id'
update_row! row
end
and I don't really care about headers row.
I don't like next if row[0] == 'id' inside the loop. Is there any way to tell CSV to skip the header row and just iterate through the rows with data?
I assume provided CSVs always have a header row.
There are a few ways you could handle this. The simplest method would be to pass the {headers: true} option to your loop:
CSV.foreach(file_name, headers: true) do |row|
update_row! row
end
Notice that no mode is specified. This is because, according to the documentation, CSV::foreach takes only the file and an options hash as its arguments (as opposed to, say, CSV::open, which does allow one to specify a mode).
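The effect of headers: true can be checked quickly on an inline string. This is just a sketch: CSV.parse stands in for CSV.foreach so that no file is required, and the two data rows are invented.

```ruby
require "csv"

# Hypothetical file content with a header row followed by two data rows.
csv_text = "id,name\n1,hat\n2,cap\n"

rows = []
CSV.parse(csv_text, headers: true) do |row|
  # The header line is consumed as headers and never yielded to the block.
  rows << row.to_h
end

p rows
```

Each yielded row is a CSV::Row, so fields can be read by header name (row["id"]) as well as by position.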
Alternatively, you could read the data into an array (rather than using foreach), and shift the array before iterating over it:
my_csv = CSV.read(file_name)
my_csv.shift # drop the header row
my_csv.each do |row|
  update_row! row
end
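According to the Ruby docs:
options = {:headers=>true}
# Ruby 3 no longer converts a trailing hash to keyword arguments implicitly, hence the double splat.
CSV.foreach(file_name, **options) ...
should suffice.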
A simple thing to do that works when reading files line-by-line is:
CSV.foreach(file_name, "r+") do |row|
next if $. == 1
update_row! row
end
$. is a global variable in Ruby that contains the line-number of the file being read.

Pull min and max value from CSV file

I have a CSV file like:
123,hat,19.99
321,cap,13.99
I have this code:
products_file = File.open('text.txt')
while ! products_file.eof?
line = products_file.gets.chomp
puts line.inspect
products[ line[0].to_i] = [line[1], line[2].to_f]
end
products_file.close
which reads the file. While it's not at the end of the file, it reads each line. I don't need the line.inspect in there, but it stores the file in an array inside of my products hash.
Now I want to pull the min and max value from the hash.
My code so far is:
read_file = File.open('text.txt', "r+").read
read_file.(?) |line|
products[ products.length] = gets.chomp.to_f
products.min_by { |x| x.size }
smallest = products
puts "Your highest priced product is #{smallest}"
Right now I don't have anything after read_file.(?) |line| so I get an error. I tried using min or max but neither worked.
Without using CSV
If I understand your question correctly, you don't have to use CSV class methods: just read the file (less header) into an array and determine the min and max as follows:
arr = ["123,hat,19.99", "321,cap,13.99",
"222,shoes,33.41", "255,shirt,19.95"]
arr.map { |s| s.split(',').last.to_f }.minmax
#=> [13.99, 33.41]
or
arr.map { |s| s[/\d+\.\d+$/].to_f }.minmax
#=> [13.99, 33.41]
If you want the associated records:
arr.minmax_by { |s| s.split(',').last.to_f }
#=> ["321,cap,13.99", "222,shoes,33.41"]
With CSV
If you wish to use CSV to read the file into an array:
arr = [["123", "hat", "19.99"],
["321", "cap", "13.99"],
["222", "shoes", "33.41"],
["255", "shirt", "19.95"]]
then
arr.map(&:last).minmax
# => ["13.99", "33.41"]
or
arr.minmax_by(&:last)
#=> [["321", "cap", "13.99"],
# ["222", "shoes", "33.41"]]
if you want the records. Note that in the CSV examples I didn't convert the last field to a float. That only works if all prices have the same number of digits before and after the decimal point, because strings are compared character by character.
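The difference between comparing prices as strings and as floats shows up as soon as the prices differ in length. The following sketch uses invented prices chosen to trigger it:

```ruby
# Hypothetical rows whose prices have different numbers of digits.
arr = [["123", "hat", "19.99"],
       ["321", "cap", "9.50"],
       ["222", "shoes", "133.41"]]

# String comparison is lexicographic: "133.41" < "19.99" < "9.50".
by_string = arr.minmax_by(&:last)

# Converting to Float compares the numeric values instead.
by_float = arr.minmax_by { |row| row.last.to_f }

p by_string
p by_float
```

With these rows the string comparison reports the shoes as the cheapest item and the cap as the most expensive, which is the reverse of the numeric result.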
You should use the built-in CSV class as such:
require 'csv'
data = CSV.read("text.txt")
data.sort!{ |row1, row2| row1[2].to_f <=> row2[2].to_f }
least_expensive = data.first
most_expensive = data.last
The Array#sort! method modifies data in place, so it is sorted based on the condition in the block for later usage. As you can see, the block sorts based on the values in each row at index 2, which in your case are the prices. The conversion to floats matters: strings are compared character by character, so they sort the same way as the numbers only when every price has the same number of digits (as a string, "9.99" sorts after "13.99"). Note also that to_f returns 0.0 if a value has a leading non-digit character (e.g., $), so strip such characters before converting.
Then you can grab the most and least expensive, or the 5 most expensive, or whatever, at your leisure.
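If only the cheapest and the most expensive rows are needed, a full sort is not required. The sketch below uses minmax_by instead; the sample prices with a leading $ are invented, and price_of is a hypothetical helper introduced here to strip non-numeric characters before converting.

```ruby
require "csv"

# Hypothetical file content where prices carry a currency symbol.
csv_text = "123,hat,$19.99\n321,cap,$13.99\n222,shoes,$133.41\n"

# Hypothetical helper: keep only digits and the decimal point, then convert to Float.
def price_of(row)
  row[2].gsub(/[^\d.]/, "").to_f
end

data = CSV.parse(csv_text)

# minmax_by returns the rows with the smallest and largest price in one pass.
least_expensive, most_expensive = data.minmax_by { |row| price_of(row) }

p least_expensive
p most_expensive
```

Sorting remains the better fit when more than the two extremes are needed, such as the five most expensive products.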
