Update existing Excel file template formulas using Ruby

I had been using spreadsheet to read in a template excel file, modify it and output a new file for the end-user.
As far as I can tell from the documentation, spreadsheet provides no way to input or edit formulas in the produced document.
However, the purpose of my script is to read an undefined number of items from a site and enter them into the spreadsheet, then calculate totals and subtotals.
The end user (using excel or libreoffice etc) is then able to make slight modifications to the quantity of items whilst the totals update (due to formulas) as they are accustomed.
I have looked into the writeexcel gem which claims to be able to input formulas, but I can't see how to take an existing template file and modify it to produce my output. I can only create fresh workbooks.
Any tips please? I do not want to use Win32OLE.

This is surprisingly difficult; apparently all Gems for handling Excel files are missing some crucial functionality.
I can think of two approaches for this problem:
Use a combination of spreadsheet (to read the Excel file) and writeexcel (to write the output file).
Use an input file that already contains the required formulas on a separate "formula" sheet, and copy the formulas to the "real" sheet.
Here's a simplistic version of the second approach:
require 'rubygems'
require 'spreadsheet'
Dir.chdir(File.dirname(__FILE__))
# input file, contains this data
# Sheet0: headers + data (for this simple demo, we will generate the data on-the-fly)
# Sheet1: Formula '=SUM(Worksheet1.A2:A255)' in cell A1
book = Spreadsheet.open 'in.xls'
sheet = book.worksheet 0
formulasheet = book.worksheet 1
# insert some input data (in a real application,
# this data would already be present in the input sheet)
rows = rand(20) + 1
(1..rows).each do |i|
  sheet[i,0] = i
end
# add total at bottom of column C
sheet[rows+1,2] = formulasheet[0,0]
# write output file
book.write 'out.xls'
However, this will fail if you're using the same column for your input data and your totals (since the total would then try to include itself in the calculation).
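One way around the self-reference problem, if you go with the first approach and write the formula text yourself (the question mentions that writeexcel claims to support formulas), is to bound the SUM range to the data rows only. A minimal sketch, assuming zero-based row/column indices as in the script above; column_letters and sum_formula are hypothetical helper names, not part of any gem:

```ruby
# Hypothetical helper: convert a zero-based column index to spreadsheet
# letters (0 -> "A", 25 -> "Z", 26 -> "AA").
def column_letters(col)
  letters = +''
  n = col
  loop do
    letters.prepend((65 + n % 26).chr)
    n = n / 26 - 1
    break if n < 0
  end
  letters
end

# Hypothetical helper: build a SUM formula covering only the data rows
# (zero-based, inclusive), so a total placed below them is never included.
def sum_formula(col, first_row, last_row)
  name = column_letters(col)
  "=SUM(#{name}#{first_row + 1}:#{name}#{last_row + 1})"
end

puts sum_formula(0, 1, 5) # => =SUM(A2:A6)
```

With five data rows in zero-based rows 1 to 5, the total would go in row 6 (cell A7), which lies outside the range A2:A6, so the formula can share a column with the data.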

Related

Update cell values referenced by a formula with RubyXL

I have an xlsx file with values in cells, many of which are referenced by formulas in other cells (in the same sheet). I'm working with RubyXL because I haven't found another gem that lets me write, edit and save an existing xlsx file.
To be clear, let's look at an example of what I'm doing and what I want.
Imagine a group of 3 cells, A1, B1 and C1, where C1 is the sum of A1 and B1 (=A1+B1); if we have a 4 in A1 and a 6 in B1, then C1 evaluates to 10. I open the xlsx with workbook = RubyXL::Parser.parse('example.xlsx'), then I change the value of cell A1 from 4 to 5 and save the file. Here is the problem: if we read cell C1 after the change, we still get the previous result, 10.
How can I update that value according to the formula? Is it possible with RubyXL, or is there another solution?
The problem with the accepted answer is that Office often isn't installed on servers, so COM automation won't work there.
With the following code, the formulas are recalculated when Excel opens the spreadsheet.
workbook = RubyXL::Parser.parse(file)
workbook.calc_pr.full_calc_on_load = true
From the RubyXL README it sounds like that utility is intended to read Excel files and then write them back out. When you then open the file in Excel, you would see your changes and the formulas would be recalculated.
You might want to look at win32ole if you want to do COM automation of Excel.
I finally solved this. In case you are interested, I used win32ole because, after testing a lot of gems, it was the only one that works the way I described in the question.
require 'win32ole'
xl = WIN32OLE.new('Excel.Application')
begin
  workbook = xl.Workbooks.Open('my_route')
  worksheet = workbook.Worksheets(1)
  # Here we perform operations like this one...
  worksheet.Cells(2,2).value = 2
  # After any operation, the results of the referencing cells are updated immediately
  # Save the file if you want
  # workbook.Save
  workbook.Close
rescue => e
  warn e.message
ensure
  # Always quit Excel, whether or not an error occurred
  xl.Quit
end
So in conclusion, RubyXL works fine, but it doesn't update the results of cells whose formulas reference the cells you edit. win32ole does.

SPSS syntax for naming individual analyses in output file outline

I have created syntax in SPSS that gives me 90 separate iterations of a general linear model, each with slightly different variations of fixed factors and covariates. In the output file, they are all just named "General Linear Model." I then have to manually rename each analysis in the output, and I want to find syntax that will add a more specific name to each result to help me identify it among the other 89 results (e.g. "General Linear Model - Males Only: Mean by Gender w/ Weight covariate").
This is an example of one analysis from the syntax:
USE ALL.
COMPUTE filter_$=(Muscle = "BICEPS" & Subj = "S1" & SMU = 1 ).
VARIABLE LABELS filter_$ 'Muscle = "BICEPS" & Subj = "S1" & SMU = 1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
GLM Frequency_Wk6 Frequency_Wk9
  Frequency_Wk12 Frequency_Wk16
  Frequency_Wk20
  /WSFACTOR=Time 5 Polynomial
  /METHOD=SSTYPE(3)
  /PLOT=PROFILE(Time)
  /EMMEANS=TABLES(Time)
  /CRITERIA=ALPHA(.05)
  /WSDESIGN=Time.
I am looking for syntax to add to this that will name this analysis "S1, SMU1 BICEPS, GLM". I don't want to name the whole output file, just each analysis within the output, so I don't have to do it one by one. I sometimes have over 200 iterations in a single output file, and renaming them individually within the output file is taking too much time.
I am assuming that you are exporting the models to Excel (please clarify if otherwise).
There is an undocumented command (OUTPUT COMMENT TEXT) that you can use here. There is also a custom extension, TEXT, designed to achieve the same thing, but it needs to be downloaded explicitly via:
Utilities-->Extension Bundles-->Download And Install Extension Bundles--->TEXT
You can use OUTPUT COMMENT TEXT to assign a title/descriptive text just before the output of the GLM model (in the example below I have used FREQUENCIES as an example).
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
oms /select all /if commands=['output comment' 'frequencies'] subtypes=['comment' 'frequencies']
/destination format=xlsx outfile='C:\Temp\ExportOutput.xlsx' /tag='ExportOutput'.
output comment text="##Model##: This is a long/descriptive title to help me identify the next model that is to be run - jobcat".
freq jobcat.
output comment text="##Model##: This is a long/descriptive title to help me identify the next model that is to be run - gender".
freq gender.
output comment text="##Model##: This is a long/descriptive title to help me identify the next model that is to be run - minority".
freq minority.
omsend tag=['ExportOutput'].
You could use TITLE command here also but it is limited to only 60 characters.
You would have to change the OMS tags appropriately if using TITLE or TEXT.
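If the syntax for the many iterations is generated by a script, the titles can be built and checked against the 60-character limit at the same time. A rough sketch in Ruby, assuming the title format requested in the question; title_command and TITLE_LIMIT are hypothetical names, and each generated line would be placed before the corresponding GLM command:

```ruby
# Assumption: 60 is the TITLE length limit mentioned above.
TITLE_LIMIT = 60

# Hypothetical helper: build one TITLE line per analysis and refuse
# titles that would exceed the limit instead of letting them be cut off.
def title_command(subject, smu, muscle)
  text = "#{subject}, SMU#{smu} #{muscle}, GLM"
  raise ArgumentError, "title too long: #{text}" if text.length > TITLE_LIMIT
  %(TITLE "#{text}".)
end

puts title_command('S1', 1, 'BICEPS') # => TITLE "S1, SMU1 BICEPS, GLM".
```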
Edit:
Given that the OP actually wants to add a title to the left-hand pane in the output viewer, a solution is as follows (credit to Albert-Jan Roskam for the Python code):
First save the Python file "editTitles.py" to a valid Python search path (for example, on my machine: "C:\ProgramData\IBM\SPSS\Statistics\23\extensions")
#editTitles.py
import tempfile, os, sys
import SpssClient

def _titleToPane():
    """See titleToPane(). This function does the actual job"""
    outputDoc = SpssClient.GetDesignatedOutputDoc()
    outputItemList = outputDoc.GetOutputItems()
    textFormat = SpssClient.DocExportFormat.SpssFormatText
    filename = tempfile.mktemp() + ".txt"
    for index in range(outputItemList.Size()):
        outputItem = outputItemList.GetItemAt(index)
        if outputItem.GetDescription() == u"Page Title":
            outputItem.ExportToDocument(filename, textFormat)
            with open(filename) as f:
                outputItem.SetDescription(f.read().rstrip())
            os.remove(filename)
    return outputDoc

def titleToPane(spv=None):
    """Copy the contents of the TITLE command of the designated output document
    to the left output viewer pane"""
    try:
        outputDoc = None
        SpssClient.StartClient()
        if spv:
            SpssClient.OpenOutputDoc(spv)
        outputDoc = _titleToPane()
        if spv and outputDoc:
            outputDoc.SaveAs(spv)
    except:
        print("Error filling TITLE in Output Viewer [%s]" % sys.exc_info()[1])
    finally:
        SpssClient.StopClient()
Re-start SPSS Statistics and run below as a test:
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
title="##Model##: jobcat".
freq jobcat.
title="##Model##: gender".
freq gender.
title="##Model##: minority".
freq minority.
begin program.
import editTitles
editTitles.titleToPane()
end program.
The TITLE command initially adds a title to the main output viewer (right-hand side), and the Python code then transfers that text to the output tree in the left-hand pane. As mentioned already, TITLE is capped at 60 characters; a warning will be triggered to highlight this.
This editTitles.py approach is the closest you are going to get to including a descriptive title that identifies each model. Replacing the actual title "General Linear Model" with a custom title would require scripting knowledge and a lot more code; this is a simpler alternative. Python integration is required for this to work.
Also consider using:
SPLIT FILE SEPARATE BY <list of filter variables>.
This will automatically produce filter labels in the left hand pane.
This is easy to use for mutually exclusive filters but even if you have overlapping filters you can re-run multiple times (and have filters applied to get as close to your desired set of results).
For example:
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
sort cases by jobcat minority.
split file separate by jobcat minority.
freq educ.
split file off.

Editing a spreadsheet using SPREADSHEET ruby gem

I have to read data from a spreadsheet, modify some rows, and then write the updated rows/cells back into the same file.
I have used Spreadsheet gem with Ruby 2.0.0.
When I write the results back to the same file, I am unable to open the xls any more. I get an error
"File Format is not Valid"
in MS Excel.
When the updates are written onto a different file, I am able to open the file but it is in protected view. Is there a solution to this issue?
Below is the sample code:
require 'rubygems'
require 'spreadsheet'
book = Spreadsheet::open('filePath')
sheet = book.worksheet 0
## have application logic in here
book.write('filePath')
I've run into this problem a few times, and the maintainers have had the issue logged for around a year now.
The first problem is that spreadsheet locks the file when it loads it, and there is no clear way to close it. The only way I've been able to avoid the lock is with this code block, which opens the file, stores the first worksheet in its own variable, and then closes the file.
worksheet = nil
Spreadsheet.open workbook_name do |inner_book|
  worksheet = inner_book.worksheet 0
end
worksheet
If you want all the worksheets, you could do something similar. In addition to the file opening/closing problem, there is the issue of capturing the content of the worksheet, depending on the format. For my purposes, I end up doing the following to capture the content. Sadly, this loses any formatting you might have had in the source spreadsheet.
rows = []
worksheet.each do |row|
  rows << row
end
You can then make your own workbook/sheet and iterate through the rows and add them to the new sheet/book. Then save the new book with the same file name.
It's not fun or efficient, but it is a way to go about solving the problem. Hope this helped.
Check your file extension.
The spreadsheet and writeexcel gems don't seem to work with xlsx files.
Try .xls, not .xlsx.
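A small guard can catch the wrong format before the file is opened. This is only a sketch; legacy_xls? is a hypothetical helper name, and it checks the extension only, not the actual file contents:

```ruby
# Hypothetical guard: accept only the legacy binary .xls extension,
# ignoring case, and reject .xlsx or anything else.
def legacy_xls?(path)
  File.extname(path).casecmp?('.xls')
end

%w[in.xls REPORT.XLS in.xlsx].each do |path|
  puts "#{path}: #{legacy_xls?(path) ? 'ok' : 'convert to .xls first'}"
end
```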

How do I create a copy of some columns of a CSV file in Ruby with different data in one column?

I have a CSV file called "A.csv". I need to generate a new CSV file called "B.csv" with data from "A.csv".
I will be using a subset of columns from "A.csv" and will have to update one column's values to new values in "B.csv". Ultimately, I will use this data from B.csv to validate against a database.
How do I create a new CSV file?
How do I copy the required columns' data from A.csv to "B.csv"?
How do I append values for a particular column?
I am new to Ruby, but I am able to read CSV to get an array or hash.
As mikeb pointed out, there are the docs - http://ruby-doc.org/stdlib-1.9.3/libdoc/csv/rdoc/CSV.html - Or you can follow along with the examples below (all are tested and working):
To create a new file:
In this file we'll have two rows, a header row and data row, very simple CSV:
require "csv"
CSV.open("file.csv", "wb") do |csv|
  csv << ["animal", "count", "price"]
  csv << ["fox", "1", "$90.00"]
end
The result is a file called "file.csv" with the following:
animal,count,price
fox,1,$90.00
How to append data to a CSV
Almost the same formula as above only instead of using "wb" mode, we'll use "a+" mode. For more information on these see this stack overflow answer: What are the Ruby File.open modes and options?
CSV.open("file.csv", "a+") do |csv|
  csv << ["cow", "3","2500"]
end
Now when we open our file.csv we have:
animal,count,price
fox,1,$90.00
cow,3,2500
Read from our CSV file
Now that you know how to create and append to a file, reading a CSV to grab the data for manipulation is just:
CSV.foreach("file.csv") do |row|
  puts row #first row would be ["animal", "count", "price"] - etc.
end
Of course, this is just one of a hundred different ways you can pull info from a CSV using this library. For more info, I suggest visiting the docs now that you have a primer: http://ruby-doc.org/stdlib-1.9.3/libdoc/csv/rdoc/CSV.html
Have you seen Ruby's CSV class? It seems pretty comprehensive. Check it out here:
http://ruby-doc.org/stdlib-1.9.3/libdoc/csv/rdoc/CSV.html
You will probably want to use CSV::parse to help Ruby understand your CSV as the table of data that it is and enable easy access to values by header.
Unfortunately, the available documentation on the CSV::parse method doesn't make it very clear how to actually use it for this purpose.
I had a similar task and was helped much more by How to Read & Parse CSV Files With Ruby on rubyguides.com than by the CSV class documentation or by the answers pointing to it from here.
I recommend reading that page in its entirety. The crucial part is about transforming a given CSV into a CSV::Table object using:
table = CSV.parse(File.read("cats.csv"), headers: true)
Now there's documentation on the CSV::Table class, but again you might be helped more by the clear examples on the rubyguides.com page. One thing I'll highlight is that when you tell .parse to expect headers, the resulting table will treat the first row of data as row [0].
You will probably be especially interested in the .by_col method available for your new Table object. This will allow you to iterate through different column index positions in the input and/or output and either copy from one to the other or add a new value to the output. If I get it working, I'll come back and post an example.
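For the task in the question (copy some columns and replace the values of one of them), a minimal sketch using CSV.parse with headers could look like the following. The column names, the sample data and the copy_columns helper are assumptions made up for illustration; it iterates row by row rather than using .by_col:

```ruby
require 'csv'

# Hypothetical stand-in for the contents of A.csv.
a_csv = <<~DATA
  id,name,status,notes
  1,fox,old,keep
  2,cow,old,drop
DATA

# Hypothetical helper: keep only the named columns, in the given order.
# For a column listed in overrides, the lambda computes the new value
# from the whole input row instead of copying the old one.
def copy_columns(csv_text, keep, overrides = {})
  table = CSV.parse(csv_text, headers: true)
  CSV.generate do |out|
    out << keep # header row of the new file
    table.each do |row|
      out << keep.map do |name|
        overrides.key?(name) ? overrides[name].call(row) : row[name]
      end
    end
  end
end

b_csv = copy_columns(a_csv, %w[id status],
                     'status' => ->(row) { "checked-#{row['id']}" })
puts b_csv
```

The returned string can then be saved with File.write("B.csv", b_csv), and the input can come from File.read("A.csv") instead of the inline sample.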

Microsoft Excel spreadsheet used as a computation engine called from code

I have a MS Excel spreadsheet which does some complex computations. I'd like to create a script which will create a CSV file with the results obtained from the spreadsheet.
I could rewrite the logic from the spreadsheet in my programming language (for example Ruby, but I'm open to using a different language), but then I would have to update my code whenever someone changes the logic in the spreadsheet. Is it possible to use an MS Excel spreadsheet as a black box, a computation engine, which can be invoked from my code? Then I would only have to write the CSV part and the input data download in my code; the whole computation logic could stay in the spreadsheet and could be easily updated.
Ideally, I don't want to add any CSV generation or data download code to the spreadsheet, because it's used by domain-experts (not programmers). Additionally, I have to download some data from the Internet and pass it to the spreadsheet as the input values. I'd like to keep that part of the code externally, in a version control system like Git. One additional note is that the spreadsheet uses the Solver Excel plugin.
Any help how to do that would be very appreciated.
Thanks,
Michal
To manipulate an Excel spreadsheet using Ruby, you may want to use win32ole.
Here's a sample script:
data = [["Hello", "World"]]
# Require the WIN32OLE library
require 'win32ole'
# Create an instance of the Excel application object
xl = WIN32OLE.new('Excel.Application')
# Make Excel visible
xl.Visible = 1
# Add a new Workbook object
wb = xl.Workbooks.Add
# Get the first Worksheet
ws = wb.Worksheets(1)
# Set the name of the worksheet tab
ws.Name = 'Sample Worksheet'
# For each row in the data set
data.each_with_index do |row, r|
  # For each field in the row
  row.each_with_index do |field, c|
    # Write the data to the Worksheet
    ws.Cells(r+1, c+1).Value = field.to_s
  end
end
# Save the workbook
wb.SaveAs('workbook.xls')
# Close the workbook
wb.Close
# Quit Excel
xl.Quit
To work out more complicated code, just record a macro of what you want to do, and then look at the code of your macro, and convert it from VB into Ruby.
