How to read excel values using queries in ruby? - ruby

I need to read values from an Excel sheet (.xls) with Ruby using queries. Are there any gems available in Ruby to do this? If so, please help me with this.
Any tips or advice on this would be great.
Thanks
Anto

You can use Sequel and OLEDB to read Excel Files:
require 'sequel'
Encoding.default_external = 'utf-8' # needed for umlauts in Excel
def read_excel(source)
  source = File.expand_path(source) # full path needed
  db = Sequel.ado(:conn_string=>"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=#{source};Extended Properties=Excel 8.0;")
  # Excel 2000 (for table names, use a dollar after the sheet name, e.g. Sheet1$)
  p db.test_connection
  dataset = db[:'Tabelle1$']
  p dataset
  dataset.each { |row|
    puts row
  }
end # read_excel
read_excel('my_spreadsheet.xls')
You need to know the name of the tab (in my example it's Tabelle1).
The 'real' solution here is not Sequel but the ADO interface. I'm not familiar with other ORMs, so I may not be able to help much there, but you could look at ActiveRecord, for example.
There are guides on connecting to MS Access or SQL Server via ADO, and some of them use ActiveRecord.
If you replace the connection string with the Excel string from my Sequel example, you may be able to use other ORMs.
You may also try to read the Excel data via an ODBC connection.

Read data from excel file using spreadsheet gem
require 'spreadsheet'
doc = Spreadsheet.open('simple.xls')
sheet = doc.worksheet(0) # sheet index; the first sheet is 0, and so on
val = sheet[r,c] # read a particular cell from sheet 0; r is the row, c is the column
Some information is there.
More information on the net, just use Google.

Related

Editing a spreadsheet using SPREADSHEET ruby gem

I have to read data from a spreadsheet, modify some rows, and then write the updated rows / cells into the same file.
I have used Spreadsheet gem with Ruby 2.0.0.
When I write the results back to the same file, I am unable to open the xls any more. I get an error
"File Format is not Valid"
in MS Excel.
When the updates are written onto a different file, I am able to open the file but it is in protected view. Is there a solution to this issue?
Below is the sample code:
require 'rubygems'
require 'spreadsheet'
book = Spreadsheet::open('filePath')
sheet = book.worksheet 0
## have application logic in here
book.write('filePath')
I've worked with this problem a few times and they've had the issue on log for around a year now.
The first problem is that Spreadsheet locks the file when it loads it, and there is no clear way to close it. The only way I've found to avoid the lock is the code block below: it opens the file, stores the first worksheet in its own variable, and then closes the file.
worksheet = nil
Spreadsheet.open workbook_name do |inner_book|
worksheet = inner_book.worksheet 0
end
worksheet
If you want all the worksheets, you could do something similar. In addition to the file opening/closing problem, there is the issue of capturing the content of the worksheet, which depends on the format. For my purposes I end up doing the following to capture the content. Sadly, this loses any formatting you might have had in the source spreadsheet.
rows = []
worksheet.each do |row|
rows << row
end
You can then make your own workbook/sheet and iterate through the rows and add them to the new sheet/book. Then save the new book with the same file name.
It's not fun or efficient, but it is a way to go about solving the problem. Hope this helped.
Check your file extension.
The spreadsheet and writeexcel gems (among others) do not seem to work with .xlsx files.
Try .xls, not .xlsx.
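Because a renamed file keeps its old contents, the extension alone can be misleading. A rough way to check what a file really is, using only the standard library, is to look at its first bytes: legacy binary .xls files are OLE2 compound documents, while .xlsx files are ZIP archives. The helper name excel_container and its return symbols are invented for this sketch.

```ruby
# Leading bytes of the two container formats (file-format signatures).
OLE2_MAGIC = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1".b # legacy binary .xls
ZIP_MAGIC  = "PK\x03\x04".b                       # .xlsx is a ZIP archive

# Hypothetical helper: returns :xls, :xlsx_or_zip, or :unknown.
def excel_container(path)
  head = File.binread(path, 8) || ''.b # nil for an empty file
  return :xls if head == OLE2_MAGIC
  return :xlsx_or_zip if head.start_with?(ZIP_MAGIC)
  :unknown
end
```

A ZIP signature only says the file is some ZIP archive, so :xlsx_or_zip is a hint rather than proof that the file is a valid workbook.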

Ruby and Excel Data Extraction

I am learning Ruby and trying to manipulate Excel data.
my goal:
To be able to extract email addresses from an excel file and place them in a text file one per line and add a comma to the end.
my ideas:
i think my answer lies in the use of spreadsheet and File.new.
What I am looking for is direction. I would like to hear any tips or rather hints to accomplish my goal. thanks
Please do not post exact code only looking for direction would like to figure it out myself...
thanks, karen
UPDATE::
So, regex seems to be able to find all matching strings and store them into an array. I'm having some trouble setting that up but should be able to figure it out....but for right now to get started I will extract only the column labeled "E Mail"..... the question I have now is:
`parse_csv = CSV.parse(read_csv, :headers => true)`
The default value for :skip_blanks is set to false. I need to set it to true, but nowhere can I find the correct syntax for doing so. I was assuming something like
`parse_csv = CSV.parse(read_csv, :headers => true :skip_blanks => true)`
But no.....
Save your Excel file as CSV (comma-separated values) and work with Ruby's libraries.
Besides spreadsheet (which can read and write), you can read Excel and other file types with RemoteTable.
gem install remote_table
and
require 'remote_table'
t = RemoteTable.new('/path/to/file.xlsx', headers: :first_row)
When you write the CSV, as @aug2uag says, you can use Ruby's standard library (no gem install required):
require 'csv'
puts [name, email].to_csv
Personally, I'd keep it as simple as possible and use a CSV.
Here is some pseudocode of how that would work:
read in your file line by line
extract your fields using regex, or cell count (depending on how consistent the email address location is), and insert into an array
iterate through the array and write the values in the fashion you wish (to console, or file)
The code in the comment you had is a great start, however, puts will only write to console, not file. You will also need to figure out how you are going to know you are getting the email address.
Hope this helps.
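For readers who do want to see it spelled out, here is a minimal sketch of the CSV route with Ruby's standard library. It assumes the spreadsheet was saved as CSV with a header row containing "E Mail"; the helper name emails_as_lines and the sample data are made up. It also shows the option syntax asked about in the update: the options are separated by commas.

```ruby
require 'csv'

# Hypothetical helper: pulls one column out of CSV text and returns the
# non-empty values, each followed by a comma, one per line.
def emails_as_lines(csv_text, column = 'E Mail')
  # Options are comma-separated; skip_blanks drops empty lines.
  table = CSV.parse(csv_text, headers: true, skip_blanks: true)
  emails = table.map { |row| row[column].to_s.strip }.reject(&:empty?)
  emails.map { |email| "#{email},\n" }.join
end

sample = "Name,E Mail\nKaren,karen@example.com\n\nBob,bob@example.com\n"
print emails_as_lines(sample)
# prints:
# karen@example.com,
# bob@example.com,
```

To write to a file instead of the console, the returned string can be passed to File.write, for example File.write('emails.txt', emails_as_lines(File.read('contacts.csv'))), where both file names are placeholders.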

Extracting text from a webtable in Watir / Ruby

I'm trying to extract the data from an Income Statement, url is http://finance.yahoo.com/q/is?s=LMT+Income+Statement&annual
I was unable to find the table using the browser.table(:name, 'blah') or (:id, 'blah'), but had some luck using the xpath with Nokogiri using this code, which picks up after I've initialized everything and browsed to the page:
page_html = Nokogiri::HTML.parse(browser.html)
tobj = page_html.xpath('//*[@id="yfncsumtab"]').inner_text
Now I'm able to take tobj and pull the data out, but it doesn't do me any good for trying to manipulate the object as a table. Any suggestions on how to go about storing the table as a variable would help. I can probably figure out iterating through the rows/columns from there, but I wouldn't mind if you tacked on some code that would do that.
Do you know Watir has xpath support?
browser.element(:xpath => '//*[@id="yfncsumtab"]')
Look at it this way:
doc = Nokogiri::HTML.parse(browser.html)
table = doc.at('table#yfncsumtab')
# iterate through tr's
table.search('tr').each do |tr|
# do something with tr
end
Try browser.element(id: "yfncsumtab").text

Rails 3 - Export to Excel with gridlines

What can I add to this method to force full gridlines in Excel export?
def export_invoices
  headers['Content-Type'] = "application/vnd.ms-excel"
  headers['Content-Disposition'] = 'attachment; filename="Invoices.xls"'
  headers['Cache-Control'] = ''
  @invoices = Invoice.all
  render :layout => nil
end
Thanks!
Hmm, lots of things going on here that I don't think make sense. The line
@invoices = Invoice.all
results in SQL like SELECT "invoices".* FROM "invoices" -- the * means you want all columns from the table, and the .all means you want all the invoices, not just one. Unless the contents of the table is a single column binary type, I cannot see this working, since Excel's file format is vendor-specific binary (I think!).
Are you using some gem like paperclip or other to handle saving files? Unless you are manipulating the actual excel data from within Rails (perhaps with a gem that knows how to do this), either the file was saved with gridlines on, or not.
This page describes how you can format your Excel file using XML.
If I understand your question correctly, you are looking to style the output in Excel. To do that you need to actually generate an Office Open XML document, not dump CSV with application headers.
Have a look at these two gems
http://rubygems.org/gems/axlsx
http://rubygems.org/gems/acts_as_xlsx
They should give you what you want.

Ruby: Parse Excel 95-2003 files?

Is there a way to read Excel 97-2003 files from Ruby?
Background
I'm currently using the Ruby Gem parseexcel -- http://raa.ruby-lang.org/project/parseexcel/
But it is an old port of the perl module. It works fine, but the latest format it parses is Excel 95. And guess what? Excel 2007 will not produce the Excel 95 format.
John McNamara has taken over duties as the maintainer for the Perl Excel parser, see http://metacpan.org/pod/Spreadsheet::ParseExcel The current version will parse Excel 95-2003 files. But is there a port to Ruby?
My other thought is to build some Ruby to Perl glue code to enable use of the Perl library itself from Ruby. Eg, see What's the best way to export UTF8 data into Excel?
(I think it would be much faster to write the glue code than to port the parser.)
Thanks,
Larry
I'm using spreadsheet, give it a shot.
There is also roo:
http://roo.rubyforge.org/
In my experience spreadsheet works much faster than roo, however roo can support the .xlsx format which spreadsheet cannot.
As khell mentioned, spreadsheet is a great tool. See my code below that I used to build a crawler.
require 'find'
require 'spreadsheet'
Spreadsheet.client_encoding = 'UTF-8'
count = 0
Find.find('/Users/toor/crawler/') do |file| # iterate over each file in the specified directory
  if file =~ /\.xls\z/ # check whether the file name ends in .xls
    workbook = Spreadsheet.open(file).worksheets # an array of all worksheets in the workbook
    workbook.each do |worksheet| # iterate over each worksheet
      worksheet.each do |row| # iterate over each row of the worksheet
        if row.to_s =~ /regex/ # rows must be converted to strings in order to match the regex
          puts file
          count += 1
        end
      end
    end
  end
end
puts "#{count} pieces of information were found"
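The file-discovery step of the crawler can also be written without a regex, which avoids mistakes such as an unescaped dot. The following is only a sketch of that one step using the standard library; the helper name xls_files is invented, and comparing the extension case-insensitively is an assumption about what is wanted.

```ruby
require 'find'

# Hypothetical helper: lists every .xls file below root, sorted by path.
def xls_files(root)
  Find.find(root).select do |path|
    # Compare the extension exactly, so .xlsx and names like "notxls" are skipped.
    File.file?(path) && File.extname(path).casecmp?('.xls')
  end.sort
end
```

Each returned path could then be passed to Spreadsheet.open as in the crawler above.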
I've not tried to parse Excel files before, but I know FasterCSV is a great library for parsing CSV files (which Excel can produce).
If you are on Windows,
you can always use WIN32OLE.
Have a look at http://rubyonwindows.blogspot.com/search/label/excel
