Iterate over an Excel workbook and index everything? - ruby

This would be done in Ruby. I have included what I have attempted so far.
I am curious whether it is possible to iterate over an Excel workbook (so multiple sheets) and index/record where everything is located. Let's say I have a workbook of 10 sheets. I want it to grab the first sheet, record that sheet's name, then move to the first cell and begin indexing (not sure if that is the correct word) the data on that sheet. It would record the cell location, so (1,A) for the first cell, and the data that's in it. I am trying to output the data in a format such as a CSV file:
Here is some code I have written that just iterates over every sheet and every cell in a workbook, skips whitespace-only cells, grabs the data, and puts it into a CSV, with no sheet names or cell locations. I am using the roo and csv gems:
require 'rubygems'
require 'roo'
require 'csv'
# Classes used
class ArrayIterator
def initialize(array)
@array = array
@index = 0
end
def has_next?
@index < @array.length
end
def item
@array[@index]
end
def next_item
value = @array[@index]
@index += 1
value
end
end
#Open up files to compare
w1 = Excelx.new ( "C:/Ruby/myworkbook.xlsx" )
$values = Array.new
i = 0.to_i
# Continue until no worksheets left
num_sheets = w1.sheets().size
while (i < num_sheets)
puts "i is currently : #{i}"
puts "length of sheet array is : #{num_sheets}"
#Grab first sheet of each workbook
w1.default_sheet = w1.sheets[i]
1.upto(w1.last_row) do | row |
1.upto(w1.last_column) do | column |
string = w1.cell(row, column).to_s
if (string.strip.empty?)
puts "Whitespace!"
else
$values << string
end
end
end
i = i + 1.to_i
end
count = 0.to_i
CSV.open('C:/Ruby/results.csv', "w") do |csv|
csv << ['String']
i = ArrayIterator.new($values)
while i.has_next?
csv << [i.next_item]
count += 1
end
end

I took the liberty of shortening your script and adding a check for empty sheets, which produced errors.
require 'roo'
require 'csv'
w1 = Excelx.new ( "C:/Ruby193/test/roo/book1.xlsx" )
CSV.open("book1.csv", "w") do |csv|
w1.sheets.each do |sheet|
w1.default_sheet = sheet
if w1.first_row && w1.first_column
eval(w1.to_s).each do |index, value|
csv << [sheet, index, value]
end
end
end
end
which gives the following in book1.csv:
Sheet1,"[1, 1]",a1
Sheet1,"[1, 2]",b1
Sheet1,"[2, 1]",a2
Sheet1,"[2, 2]",b2
Sheet2,"[1, 1]",aa1
Sheet2,"[1, 2]",bb1
Sheet2,"[2, 1]",aa2
Sheet2,"[2, 2]",bb2
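The question asked for locations like (1,A) rather than numeric [row, column] pairs. A minimal sketch of that conversion follows, using only the standard library. The helpers column_letter and cell_reference are hypothetical names, not part of roo, and the cells array is sample data standing in for the sheet/index/value triples shown above.

```ruby
require 'csv'

# Hypothetical helper: convert a 1-based column number to Excel letters
# (1 -> "A", 26 -> "Z", 27 -> "AA").
def column_letter(number)
  letters = String.new
  while number > 0
    number, remainder = (number - 1).divmod(26)
    letters.prepend(('A'.ord + remainder).chr)
  end
  letters
end

# Hypothetical helper: build an A1-style reference from a 1-based row and column.
def cell_reference(row, column)
  "#{column_letter(column)}#{row}"
end

# Sample data (an assumption) in the same shape as the CSV rows above.
cells = [
  ['Sheet1', [1, 1], 'a1'],
  ['Sheet1', [2, 2], 'b2'],
  ['Sheet2', [1, 27], 'aa1']
]

output = CSV.generate do |csv|
  csv << %w[Sheet Cell Value]
  cells.each do |sheet, (row, column), value|
    csv << [sheet, cell_reference(row, column), value]
  end
end

puts output
```

In the script above, the same cell_reference call could replace index in the row written to the CSV, assuming index is a [row, column] pair as the output suggests.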

Related

Write an array to multi column CSV format using Ruby

I have an array of arrays in Ruby that I'm trying to output to a CSV (or text) file that I can then easily transfer to an XML file for graphing.
I can't seem to get the output (in text format) to look like this; instead I get one line of data that is just one large array.
0,2
0,3
0,4
0,5
I originally tried something along the lines of this
File.open('02.3.gyro_trends.text' , 'w') { |file| trend_array.each { |x,y| file.puts(x,y)}}
And it outputs
0.2
46558
0
46560
0
....etc etc.
Can anyone point me in the "write" direction for getting either:
(i) .text file that can put my data like so.
trend_array[0][0], trend_array[0][1]
trend_array[1][0], trend_array[1][1]
trend_array[2][0], trend_array[2][1]
trend_array[3][0], trend_array[3][1]
(ii) .csv file that would put this data in separate columns.
Edit: I recently added more than two values to my array; check out my answer, which combines Cameck's solution.
This is currently what I have at the moment.
trend_array=[]
j=1
# cycle through array and find change in gyro data.
while j < gyro_array.length-2
if gyro_array[j+1][1] < 0.025 && gyro_array[j+1][1] > -0.025
trend_array << [0, gyro_array[j][0]]
j+=1
elsif gyro_array[j+1][1] > -0.025 # if the next value is increasing by x1.2 the value of the previous amount. Log it as +1
trend_array << [0.2, gyro_array[j][0]]
j+=1
elsif gyro_array[j+1][1] < 0.025 # if the next value is decreasing by x1.2 the value of the previous amount. Log it as -1
trend_array << [-0.2, gyro_array[j][0]]
j+=1
end
end
#for graphing and analysis purposes (wanted to print it all as a csv in two columns)
File.open('02.3test.gyro_trends.text' , 'w') { |file| trend_array.each { |x,y| file.puts(x,y)}}
File.open('02.3test.gyro_trends_count.text' , 'w') { |file| trend_array.each {|x,y| file.puts(y)}}
I know it's something really easy, but for some reason I'm missing it. It has something to do with concatenation, but I found that if I try to concatenate a \n in my last line of code, it doesn't output it to the file. It prints to my console the way I want, but not when I write it to a file.
Thanks for taking the time to read this all.
File.open('02.3test.gyro_trends.text' , 'w') { |file| trend_array.each { |a| file.puts(a.join(","))}}
Alternatively, using the CSV class:
def write_to_csv(row)
if csv_exists?
CSV.open(@csv_name, 'a+') { |csv| csv << row }
else
# create and add headers if doesn't exist already
CSV.open(@csv_name, 'wb') do |csv|
csv << CSV_HEADER
csv << row
end
end
end
def csv_exists?
@exists ||= File.file?(@csv_name)
end
Call write_to_csv with an array [col_1, col_2, col_3].
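To see why the original attempt produced one value per line, it may help to compare the three approaches side by side. This is a small sketch that writes to in-memory StringIO buffers instead of files; the trend_array values are made-up sample data.

```ruby
require 'csv'
require 'stringio'

# Sample data (an assumption) shaped like the question's trend_array.
trend_array = [[0, 46558], [0.2, 46560], [-0.2, 46562]]

# puts with several arguments writes each argument on its own line,
# which matches the unwanted output in the question.
separate_lines = StringIO.new
trend_array.each { |x, y| separate_lines.puts(x, y) }

# Joining each inner array first gives one comma-separated line per pair.
joined = StringIO.new
trend_array.each { |pair| joined.puts(pair.join(',')) }

# The CSV class does the joining (and any quoting) itself.
csv_text = CSV.generate do |csv|
  trend_array.each { |pair| csv << pair }
end

puts separate_lines.string
puts joined.string
puts csv_text
```

For plain numbers the join and CSV versions produce the same text; the CSV class becomes the safer choice once values can contain commas or quotes.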
Thank you both @cameck & @tkupari; both answers were what I was looking for. I went with Cameck's answer in the end, because it cut out cutting and pasting text into XML. Here's what I did to get an array of arrays into their proper places.
require 'csv'
CSV_HEADER = [
"Apples",
"Oranges",
"Pears"
]
@csv_name = "Test_file.csv"
def write_to_csv(row)
if csv_exists?
CSV.open(@csv_name, 'a+') { |csv| csv << row }
else
# create and add headers if doesn't exist already
CSV.open(@csv_name, 'wb') do |csv|
csv << CSV_HEADER
csv << row
end
end
end
def csv_exists?
@exists ||= File.file?(@csv_name)
end
array = [ [1,2,3] , ['a','b','c'] , ['dog', 'cat' , 'poop'] ]
array.each { |row| write_to_csv(row) }

How to add data from array of array into cell using Ruby Spreadsheet gem

I have an array of arrays like:
arr_all = [arr_1, arr_2, arr_3, arr_r]
where:
arr_1 = [2015-08-19 17:30:24 -0700, 2015-08-19 17:30:34 -0700, 2015-08-19 17:30:55 -0700]
arr_2 = ...
arr_3 = ...
I have a file to modify. I know how to add an array as a row, but I need help inserting each of the arrays in @@ar_data as columns. I find the row at which to insert the data, and then I want to insert arr_1 starting at cell (next_empty_row, B), then arr_2 at (next_empty_row, C), etc. Please advise. The number of rows to fill is the size of each array; arr_1, arr_2, and arr_3 are each of size 3.
def performance_report
Spreadsheet.client_encoding = 'UTF-8'
f = "PerformanceTest_S.xls"
if File.exist? (f)
# Open the previously created Workbook
book = Spreadsheet.open(f)
sheet_1_row_index = book.worksheet(0).last_row_index + 1
sheet_2_row_index = book.worksheet(1).last_row_index + 1
# Indicate the row index to the user
print "Inserting new row at index: #{sheet_2_row_index}\n"
# Insert array as column - I need help with the below code to insert data in arr_data which is array of arrays.
column = 1
row
@@ar_data.each do |time|
len = time.size
book.worksheet(0).cell(sheet_1_row_index, )
book.worksheet(0).Column.size
end
# This insert row is for worksheet 2 and works fine.
book.worksheet(1).insert_row(sheet_2_row_index, @@ar_calc)
# Delete the file so that it can be re-written
File.delete(f)
# puts @@ar_calc
# Write out the Workbook again
book.write(f)
I'm not really sure what you are trying to accomplish. This is an example of how you can write your arrays to the end of your file.
require 'spreadsheet'
arrays = [
['Text 1','Text 2','Text 3'],
['Text 4','Text 5','Text 6'],
['Text 7','Text 8','Text 9']]
f = '/your/file/path.xls'
Spreadsheet.client_encoding = 'UTF-8'
if File.exist? f
book = Spreadsheet.open(f)
sheet = book.worksheet(0)
lastRow = sheet.last_row_index + 1
arrays.each_with_index do |row, rowNum|
row.each_with_index do |cell, cellNum|
sheet[ rowNum + lastRow, cellNum ] = cell
end
end
File.delete f
book.write f
end
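The example above writes each inner array as a row, whereas the question asks for each array to fill a column. One way to sketch the index arithmetic without the gem is shown here: a plain Hash named grid stands in for the worksheet, and first_row, first_column, and the array contents are assumptions for illustration.

```ruby
# Sample stand-ins (assumptions) for arr_1, arr_2, arr_3 from the question.
arr_1 = %w[a1 a2 a3]
arr_2 = %w[b1 b2 b3]
arr_3 = %w[c1 c2 c3]
arr_all = [arr_1, arr_2, arr_3]

first_row = 5     # hypothetical index of the next empty row
first_column = 1  # zero-based index of column B

# grid is a plain Hash keyed by [row, column], standing in for a worksheet.
grid = {}
arr_all.each_with_index do |values, column_offset|
  values.each_with_index do |value, row_offset|
    grid[[first_row + row_offset, first_column + column_offset]] = value
  end
end

# Equivalent view: transpose turns the column arrays into row arrays.
rows = arr_all.transpose

p grid
p rows
```

With the Spreadsheet gem, the Hash assignment would become the same sheet[row, column] = value form used in the answer above. When all arrays have equal length, arr_all.transpose gives rows that can be written with the original row-wise loop instead.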
I don't even know what you're talking about, but what you should probably do is convert that xls to CSV files (one for each sheet) and parse them something like this. I'll use a hash to make it platform-independent (normally I just add spreadsheet data directly to a database using Rails):
require 'csv'
rows = CSV.read("filename.csv")
column_names = rows.shift
records = []
rows.each do |row|
this_record = {}
column_names.each_with_index do |col, i|
this_record[col] = row[i]
end
records << this_record
end
If you don't want to manually convert each sheet into CSV, what you could do is use the Spreadsheet gem or something like it to convert each sheet into an array of arrays and that's basically a CSV file right there.
In Ruby, Hash includes the Enumerable module, just as Array does. So to convert each hash into an array of pairs (two-element arrays holding a key and its value), you'd just have to do this:
records = records.map(&:to_a)
But that's not even necessary; you can iterate over a hash and destructure each key/value pair directly, just as you can with an array of arrays:
records.each_with_index do |hsh, i|
hsh.each do |k,v|
puts "record #{i}: #{k}='#{v}'"
end
end
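The manual header bookkeeping above can also be left to the CSV library. A short sketch, assuming the sheet has already been exported to CSV text; the sample data and column names are invented for illustration.

```ruby
require 'csv'

# Sample CSV text (an assumption); in practice this would come from a file.
csv_text = "name,food\nAnn,butter\nBob,jam\n"

# headers: true treats the first line as column names, and each
# CSV::Row converts to a Hash keyed by those names.
records = CSV.parse(csv_text, headers: true).map(&:to_h)

records.each_with_index do |record, i|
  record.each do |key, value|
    puts "record #{i}: #{key}='#{value}'"
  end
end
```

The result has the same shape as the records array built by hand in the answer, so the iteration code that follows it works unchanged.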

Trouble writing to a workbook using ruby spreadsheet gem from CSV file

I am new to Ruby and am having a hard time writing to an Excel file.
I want to parse a CSV file, extract the rows where the 'food' column equals butter, and put those rows into a new Excel workbook. I can write the matching rows to a CSV file just fine, but I am having trouble writing them to a workbook (Excel format).
require 'rubygems'
require 'csv'
require 'spreadsheet'
csv_fname = 'commissions.csv'
options = { headers: :first_row }
food_type = { 'food' => 'butter'}
food_type_match = nil
CSV.open(csv_fname, 'r', options) do |csv|
food_type_match = csv.find_all do |row|
Hash[row].select { |k,v| food_type[k] } == food_type
end
end
#writing the 'butter' data to a CSV file
#CSV.open('butter.csv', 'w') do |csv_object|
# food_type_match.each do |row_array|
# csv_object << row_array
# end
#end
book = Spreadsheet::Workbook.new
sheet1 = book.create_worksheet
food_type_match.each do |csv|
csv.each_with_index do |row, i|
sheet1.row(i).replace(row)
end
end
The spreadsheet generates but comes out blank. I have searched through numerous topics on ruby spreadsheet but I cannot get it to work. Any help would be greatly appreciated.
Updated Completely
What if you try this:
book = Spreadsheet::Workbook.new
sheet1 = book.create_worksheet
food_type_match.each do |csv|
csv.each_with_index do |row, i|
sheet1.insert_row(i,row)
end
end
book.write('/path_to_output_location/book.xls')
Also, where does this output to? I cannot see a given path, so I would think that is the issue, but you say it generates? I added the write line because the documentation says this for #write:
Write this Workbook to a File, IO Stream or Writer Object. The latter will
make more sense once there are more than just an Excel-Writer available.
Like I said, I am completely unfamiliar with this gem and the documentation is terrible. With axlsx it would be something like this:
package = Axlsx::Package.new
book = package.workbook
book.add_worksheet do |sheet|
food_type_match.each do |csv|
sheet.add_row csv
end
end
package.serialize('/path_to_output_location/book.xlsx')
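One detail worth checking in either version is what each element of food_type_match actually is. With headers enabled, every match is a CSV::Row, and iterating over a CSV::Row yields [header, value] pairs rather than whole rows. The sketch below uses only the standard library with invented sample data, and shows one way to get plain arrays that a spreadsheet writer could take one per row.

```ruby
require 'csv'

# Sample CSV text (an assumption) standing in for commissions.csv.
csv_text = <<~DATA
  name,food,amount
  Ann,butter,3
  Bob,jam,1
  Cid,butter,7
DATA

table = CSV.parse(csv_text, headers: true)

# Keep only the rows whose 'food' column is butter.
butter_rows = table.select { |row| row['food'] == 'butter' }

# Each match is a CSV::Row; #fields returns just its values as an Array.
sheet_rows = [table.headers] + butter_rows.map(&:fields)

sheet_rows.each_with_index do |values, i|
  puts "row #{i}: #{values.inspect}"
end
```

Each entry of sheet_rows is a flat array with one value per cell, so the row index comes from a single each_with_index over the matches rather than from a nested loop over one row's fields.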
Try the write_xlsx gem. Here is my simple csvtoxlsx.rb script to combine the *.csv files in a folder into a single .xlsx:
require "csv"
require "write_xlsx"
def csvtoxls(csv, xlsx)
count = 0
workbook = WriteXLSX.new(xlsx)
Dir[csv].sort.each do | file |
puts file
name = File.basename(file, ".csv")
worksheet = workbook.add_worksheet(name)
i = 0
CSV.foreach(file) do | row |
worksheet.write_row(i, 0, row)
i = i + 1
count = count + 1
end
end
workbook.close
count
end
abort("Syntax: ruby -W0 csvtoxlsx.rb 'folder/*.csv' single.xlsx") if ARGV.length < 2
time_begin = Time.now
count = csvtoxls(ARGV[0], ARGV[1])
time_spent = Time.now - time_begin
puts "csvtoxlsx process #{ARGV[0]} with #{count} rows in #{time_spent.round(2)} seconds"

Output many arrays to CSV-files in Ruby

I have a question about Ruby. What I want to do is first to sort my items ascending and then write them out to a CSV-file. Now, the problem is further complicated by the fact that I want to iterate over a lot of CSV-files. I found this thread and the answer looks fine, but I am not able to get more than the last line written to my output file.
How can I get the whole data sorted and written to different CSV-files?
My code:
require 'date'
require 'csv'
class Daily <
# Daily has a open
Struct.new(:open)
# a method to print out a csv record for the current Daily.
def print_csv_record
printf("%s,", open)
printf("\n")
end
end
#------#
# MAIN #
#------#
# This is where I iterate over my csv-files:
foobar = ['foo', 'bar']
foobar.each do |foobar|
# get the input filename from the command line
input_file = "#{foobar}.csv"
# define an array to hold the Daily records
arr = Array.new
# loop through each record in the csv file, adding
# each record to my array while overlooking the header.
f = File.open(input_file, "r")
f.each_with_index { |row, i|
next if i == 0
words = row.split(',')
p = Daily.new
# do a little work here to convert my numbers
p.open = words[1].to_f
arr.push(p)
}
# sort the data by ascending opens
arr.sort! { |a,b| a.open <=> b.open }
# print out all the sorted records (just print to stdout)
arr.each { |p|
CSV.open("#{foobar}_new.csv", "w") do |csv|
csv << p.print_csv_record
end
}
end
My input CSV-file:
Open
52.23
52.45
52.36
52.07
52.69
52.38
51.2
50.99
51.41
51.89
51.38
50.94
49.55
50.21
50.13
50.14
49.49
48.5
47.92
My output CSV-file:
47.92
You need to put the iteration inside the open CSV file:
CSV.open("#{foobar}_new.csv", "w") do |csv|
arr.each { |p|
csv << p.print_csv_record
}
end
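Opening the file with "w" inside the loop truncates it on every pass, which is why only the last record survived. Note also that print_csv_record prints to stdout and returns nil (printf returns nil), so appending its return value to csv does not pass the value itself. A sketch that appends the values directly is below; it uses CSV.generate on an in-memory string with a shortened sample of the input, as an assumption, in place of the foo.csv and bar.csv files.

```ruby
require 'csv'

Daily = Struct.new(:open)

# Shortened sample (an assumption) of the input file from the question.
input_text = "Open\n52.23\n51.2\n47.92\n50.99\n"

# headers: true skips the header line; convert each value to a Float.
records = CSV.parse(input_text, headers: true).map do |row|
  Daily.new(row['Open'].to_f)
end

# Sort by ascending open.
records.sort_by!(&:open)

# Open the output once, then write every record inside the block.
output_text = CSV.generate do |csv|
  csv << ['Open']
  records.each { |record| csv << [record.open] }
end

puts output_text
```

Replacing CSV.generate with CSV.open("#{foobar}_new.csv", "w") keeps the same block and writes the sorted rows to a file for each input.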

Calling multiple methods on a CSV object

I have constructed an EventManager class that parses a CSV file and produces HTML letters using ERB. It is part of a Jumpstart Labs tutorial.
The program works fine, but I am unable to call multiple methods on an object without the earlier methods interfering with the later methods. As a result, I have opted to create multiple objects to call instance methods on, which seems like a clunky inelegant solution. Is there a better way to do this, where I can create a single new object and call methods on it?
Like so:
eventmg = EventManager.new("event_attendees.csv")
eventmg.print_valid_phone_numbers
eventmg_2 = EventManager.new("event_attendees.csv")
eventmg_2.print_zipcodes
eventmg_3 = EventManager.new("event_attendees.csv")
eventmg_3.time_targeter
eventmg_4 = EventManager.new("event_attendees.csv")
eventmg_4.day_of_week
eventmg_5 = EventManager.new("event_attendees.csv")
eventmg_5.create_thank_you_letters
The complete code is as follows
require 'csv'
require 'sunlight/congress'
require 'erb'
class EventManager
INVALID_PHONE_NUMBER = "0000000000"
Sunlight::Congress.api_key = "e179a6973728c4dd3fb1204283aaccb5"
def initialize(file_name, list_selections = [])
puts "EventManager Initialized."
@file = CSV.open(file_name, {:headers => true,
:header_converters => :symbol} )
@list_selections = list_selections
end
def clean_zipcode(zipcode)
zipcode.to_s.rjust(5,"0")[0..4]
end
def print_zipcodes
puts "Valid Participant Zipcodes"
@file.each do |line|
zipcode = clean_zipcode(line[:zipcode])
puts zipcode
end
end
def clean_phone(phone_number)
converted = phone_number.scan(/\d/).join('').split('')
if converted.count == 10
phone_number
elsif phone_number.to_s.length < 10
INVALID_PHONE_NUMBER
elsif phone_number.to_s.length == 11 && converted[0] == 1
phone_number.shift
phone_number.join('')
elsif phone_number.to_s.length == 11 && converted[0] != 1
INVALID_PHONE_NUMBER
else
phone_number.to_s.length > 11
INVALID_PHONE_NUMBER
end
end
def print_valid_phone_numbers
puts "Valid Participant Phone Numbers"
@file.each do |line|
clean_number = clean_phone(line[:homephone])
puts clean_number
end
end
def time_targeter
busy_times = Array.new(24) {0}
@file.each do |line|
registration = line[:regdate]
prepped_time = DateTime.strptime(registration, "%m/%d/%Y %H:%M")
prepped_time = prepped_time.hour.to_i
# inserts filtered hour into the array 'list_selections'
@list_selections << prepped_time
end
# tallies number of registrations for each hour
i = 0
while i < @list_selections.count
busy_times[@list_selections[i]] += 1
i+=1
end
# delivers a result showing the hour and the number of registrations
puts "Number of Registered Participants by Hour:"
busy_times.each_with_index {|counter, hours| puts "#{hours}\t#{counter}"}
end
def day_of_week
busy_day = Array.new(7) {0}
d_of_w = ["Monday:", "Tuesday:", "Wednesday:", "Thursday:", "Friday:", "Saturday:", "Sunday:"]
@file.each do |line|
registration = line[:regdate]
# you have to reformat date because of parser format
prepped_date = Date.strptime(registration, "%m/%d/%y")
prepped_date = prepped_date.wday
# adds filtered day of week into array 'list selections'
@list_selections << prepped_date
end
i = 0
while i < @list_selections.count
# i is minus one since days of week begin at '1' and arrays begin at '0'
busy_day[@list_selections[i-1]] += 1
i+=1
end
#busy_day.each_with_index {|counter, day| puts "#{day}\t#{counter}"}
prepared = d_of_w.zip(busy_day)
puts "Number of Registered Participants by Day of Week"
prepared.each{|date| puts date.join(" ")}
end
def legislators_by_zipcode(zipcode)
Sunlight::Congress::Legislator.by_zipcode(zipcode)
end
def save_thank_you_letters(id,form_letter)
Dir.mkdir("output") unless Dir.exists?("output")
filename = "output/thanks_#{id}.html"
File.open(filename,'w') do |file|
file.puts form_letter
end
end
def create_thank_you_letters
puts "Thank You Letters Available in Output Folder"
template_letter = File.read "form_letter.erb"
erb_template = ERB.new template_letter
@file.each do |line|
id = line[0]
name = line[:first_name]
zipcode = clean_zipcode(line[:zipcode])
legislators = legislators_by_zipcode(zipcode)
form_letter = erb_template.result(binding)
save_thank_you_letters(id,form_letter)
end
end
end
You're experiencing this problem because every call to each on the result of CSV.open advances the file pointer. Once one of your methods reaches the end of the file, there is nothing left for the others to read.
An alternative is to read the contents of the file into an instance variable at initialization with readlines. You'll get an array of arrays (or a CSV::Table when headers are enabled, as they are here), which you can operate on with each just as easily.
"Is there a better way to do this, where I can create a single new object and call methods on it?"
Probably. If your methods are interfering with one another, it means you're changing state within the manager, instead of working on local variables.
Sometimes, it's the right thing to do (e.g. Array#<<); sometimes not (e.g. Fixnum#+)... Seeing your method names, it probably isn't.
Nail the offenders down and adjust the code accordingly. (I only scanned your code, but those Array#<< calls on an instance variable, in particular, look fishy.)
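Both suggestions can be combined in a small sketch: parse the rows once at initialization, and keep tallies in local variables instead of appending to a shared instance array. AttendeeReport is a hypothetical, trimmed-down stand-in for EventManager, and the sample data, column names, and date format are assumptions.

```ruby
require 'csv'
require 'date'

# Hypothetical, trimmed-down variant of EventManager.
class AttendeeReport
  def initialize(csv_text)
    # Parse once and keep the rows in memory, so every method sees all of them.
    @rows = CSV.parse(csv_text, headers: true, header_converters: :symbol)
  end

  def zipcodes
    @rows.map { |row| row[:zipcode].to_s.rjust(5, '0')[0..4] }
  end

  def registrations_by_hour
    counts = Hash.new(0) # local tally, so repeated calls do not accumulate
    @rows.each do |row|
      hour = DateTime.strptime(row[:regdate], '%m/%d/%y %H:%M').hour
      counts[hour] += 1
    end
    counts
  end
end

# Sample data (an assumption) in place of event_attendees.csv.
sample = <<~DATA
  regdate,zipcode
  11/12/08 10:47,20010
  11/12/08 13:23,7306
  11/13/08 10:05,
DATA

report = AttendeeReport.new(sample)
p report.zipcodes
p report.registrations_by_hour
p report.zipcodes
```

Because no method changes @rows or any other instance state, a single object can answer every query in any order, and calling the same method twice gives the same result.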
