How to edit tab in .xls with Spreadsheet gem? - ruby

I am trying to open an existing .xls file and overwrite the contents of one worksheet (tab). The file has many tabs, and many of them contain pivot tables and other visual presentations.
I have tried Spreadsheet and axlsx. Axlsx has great controls but overwrites the entire file, including any other tabs. Spreadsheet will open and edit a file, but you have to copy the other tabs, which removes the Excel formatting.
Is there a way to use Ruby to add data to only one tab in a spreadsheet without changing content in the other tabs?
Update:
Here is what I am testing now using the Spreadsheet gem. I can open a spreadsheet that has multiple tabs where one tab contains a pivottable, another contains a chart, and another the raw data. They have to be saved out as a new doc otherwise you get a File Format is not Valid error.
open_book = Spreadsheet.open('../data/exports/test_output_dashboard.xls')
puts "#{open_book.worksheet(0)}"
puts "#{open_book.worksheet(1)}"
puts "#{open_book.worksheet(2)}"
open_book.write('../data/exports/test_output_dashboard_2.xls')
If I just open and resave, the new document is fine: a working copy of the original. However, if I edit the tab with the raw data as in the code below, Excel reports on opening that the file needs to be 'repaired', and none of the tabs show the correct information.
open_book = Spreadsheet.open('../data/exports/test_output_dashboard.xls')
puts "#{open_book.worksheet(0)}"
puts "#{open_book.worksheet(1)}"
puts "#{open_book.worksheet(2)}"
new_row_index = open_book.worksheet(1).last_row_index + 1
open_book.worksheet(1).insert_row(new_row_index, row_2)
open_book.write('../data/exports/test_output_dashboard_4.xls')
Any suggestions for adding data to one tab of an excel doc while keeping the other tabs intact would be greatly appreciated. The solution can be any gem or could be any language or automatable tool.
UPDATE:
Here is an example Excel dashboard that I am using for testing. I am writing rows into the data tab. https://dl.dropboxusercontent.com/u/23226147/test_output_dashboard.xlsx
UPDATE:
With RubyXL I can open and inspect the content of each tab but the saved doc cannot be opened by Excel.
workbook = RubyXL::Parser.parse("../data/exports/test_output_dashboard.xlsx")
puts "#{workbook.worksheets[0].inspect}"
puts "#{workbook.worksheets[1].inspect}"
puts "#{workbook.worksheets[2].inspect}"
workbook.write("../data/exports/test_output_dashboard_5.xlsx")

If you're just looking for a quick tool, RubyXL might do the trick for you:
https://github.com/weshatheleopard/rubyXL
It parses existing .xlsx and .xlsm files and has a decent set of documentation.

You could try the cloudxls.com API, which can merge data into existing xls and xlsx files. If that is not an option, you will most likely have to use a Java library such as Apache POI.

Solution for Windows users only.
I modify Excel spreadsheets with the win32ole library and it works fine.
In case it is useful, here is a short sample that opens a file, selects a given tab, and writes a value to a cell:
require 'win32ole'
# Start Excel through OLE automation (requires Excel to be installed)
excel = WIN32OLE.new('Excel.Application')
excel.visible = true
filepath = 'e:\tmp\file.xlsx'
cur_book = excel.workbooks.Open(filepath)
sheet_name = 'sheet1'
cur_sheet = cur_book.Worksheets(sheet_name)
# put value 10 in Cell(2,2)
cur_sheet.Cells(2,2).Value = 10
Official documentation: http://ruby-doc.org/stdlib-1.9.3/libdoc/win32ole/rdoc/WIN32OLE.html

Related

How to edit numbers of Evernote notes, and make them sync?

I've been trying to format my Evernote notes (thousands of them) so that they are readable on any device.
I've accessed the Evernote storage on my Mac and saw folders of entries. Every folder contains a note.xhtml file and a content.enml file, which directly store the note contents.
I can modify the *.xhtml file, and the changes are reflected in the Evernote client, but they just won't sync to the server. Additionally, the *.enml file contains content corresponding to the xhtml file, but the change doesn't propagate there.
Is there any way I can neatly edit my notes, on the HTML level?
Thx!
In AppleScript, it's pretty easy to get and set the HTML. To actually manipulate the HTML you might want another language.
Here's how you read and write HTML content to a single selected Evernote note:
tell application "Evernote"
set noteList to selection
set n to item 1 of noteList
set extractedHtml to HTML content of n
set HTML content of n to "<p>Foo Bar</p><p>foo baz</p>"
end tell
Evernote provides some good examples of using AppleScript on their developer site. You can also use xsltproc for more systematic manipulation. I have a read-only example of using xslt via AppleScript in a recent post of mine. The snippet above might be enough of an example to show you how to set the HTML content.
But, to give you a better answer, I'd need to know a little more about how you want to manipulate your notes. The above example just grabs the first item in your current selection and sets the content.

I need code to unblock adding new worksheets in excel vba

I am writing an add-in for Excel. It is supposed to create a new worksheet and then copy data from the pre-existing worksheets.
The add-in works on another Excel document, but the one I need it to work in has disabled the ability to add new worksheets.
Can someone please tell me what code enables this?
Sub Auto_Open()
Dim WSheet As Worksheet
On Error Resume Next
Set WSheet = Sheets("DispersionList")
On Error Resume Next
Dim works As Worksheet
ActiveWorkbook.Unprotect
If WSheet Is Nothing Then
Set works = Worksheets.Add(After:=Sheets(Worksheets.Count))
works.Name = "DispersionList"
Call makeFormat
Worksheets(1).Activate
End If
DispersionForm.Enabled = True
DispersionForm.Show
End Sub
Like pnuts mentioned above, if you search Google/SO, you will find plenty of posts which talk about hacking the password. However my answer is not about hacking the password but about alternative(s) that you can employ to play around with the workbook. If someone protected the Excel file then it is obvious that you are not meant to fiddle with it :) And as a personal choice, I do not assist with hacking. If you can get the password from the author then there's nothing like it.
There could be 2 kinds of protection that I can think of which will stop you from adding sheets to that workbook.
Via Code
Workbook Structure is protected.
If you want to work with the workbook without the intention of hacking it, then you have a couple of options.
If the protection is via code, then the file is probably an xls or xlsm file, which supports macros. The author might have written code that immediately deletes any sheet the moment you add it. In such a case, simply resave the file as an xlsx file, which cannot contain macros. Close the file and re-open it so that you can add sheets to it and work with it without hacking it.
If the workbook structure is protected, then there is not much you can do except copy (if it allows) the cells from the existing workbook to a new workbook and work with that.

Sphinx: include xlsx data into rst

What I want is to include an xlsx sheet (or rather the data of said file) into Sphinx documentation.
Is there any way to convert an xlsx sheet to restructuredText?
You do not need to convert it to rst. If you export your xlsx sheet to a comma separated file (csv) you can then use the csv-table directive.
The great thing is that you need to set up your table in your csv only once, and whenever you update your xlsx sheet, just export again to csv to the same location of where your table was before.
.. csv-table:: The contents of my xlsx sheet exported to mytable.csv
   :widths: 15 40 20
   :header: "Header 1", "Header 2", "Header 3"
   :file: mytable.csv
That is all. Add as many widths and headers as you have columns in your file.
Note that in the documentation a security warning is given when using the :file: option. You can choose to copy paste the comma separated text into the document. However, if you update the table regularly I find it easiest to just export again to csv.
I have just released sphinxcontrib-excel, which embeds xlsx, xls and ods data (without style, font, charts) into your Sphinx documentation. You do not need to convert from xlsx to csv because the library does it for you.
Here is an example page where it was used to render excel data in readthedocs.org directly.

Save thousands of html files as txt files, using firefox - how to automate this job?

I have thousands of html files, and need to save each of them as txt, using firefox.
If I do this job manually, I would open each html file in firefox, click the File menu, click the 'Save Page As' menu item, then select the format as 'TEXT', and save to local disk.
But how to automate this job?
Any script/tool can help this?
Thanks.
If your goal is to get firefox to strip the html out of each page and save just the text, then there are a ton of options. I'm not aware of any firefox add-ons that will be intelligent enough to loop over every file in a directory in order to perform a macro, so here are some options:
Refer to this SO question regarding how to use python to strip the html from each file. It provides examples for both the built in HTMLParser module and for using BeautifulSoup
Use Selenium to automate your webbrowser: http://seleniumhq.org/
If you know JavaScript, you can use PhantomJS: http://www.phantomjs.org/, which is a headless web browser that you drive with JavaScript scripts.
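If driving a browser turns out to be more than you need, the strip-the-tags approach from the first option can also be sketched in Ruby with only the standard library. This is a rough sketch, not a real HTML parser: it assumes simple, reasonably well-formed pages, decodes only a handful of entities, and the helper names html_to_text and convert_directory are made up for illustration.

```ruby
# Crude HTML-to-text conversion using only the Ruby standard library.
# Assumption: regex stripping is good enough for simple pages; a real
# HTML parser would be more robust on messy markup.

# Hypothetical helper: returns the visible text of an HTML string.
def html_to_text(html)
  entities = {
    '&amp;' => '&', '&lt;' => '<', '&gt;' => '>',
    '&quot;' => '"', '&nbsp;' => ' '
  }
  text = html.gsub(%r{<(script|style)\b.*?</\1>}mi, ' ') # drop script/style blocks
  text = text.gsub(/<[^>]+>/, ' ')                        # replace remaining tags with spaces
  text = text.gsub(/&(?:amp|lt|gt|quot|nbsp);/, entities) # decode a few common entities
  text.gsub(/\s+/, ' ').strip                             # collapse runs of whitespace
end

# Hypothetical helper: for every .htm/.html file in dir, writes the
# extracted text to "<original name>.txt" and returns the written paths.
def convert_directory(dir)
  Dir.glob(File.join(dir, '*.{htm,html}')).sort.map do |path|
    out_path = "#{path}.txt"
    File.write(out_path, html_to_text(File.read(path)))
    out_path
  end
end
```

Calling convert_directory with the folder that holds the pages processes them all in one pass. The result will not match Firefox's "Save as text" output exactly, since the layout of the original page is flattened into a single run of text.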
I have thousands of html files...
Do you actually have these files on-hand, or are they online?
...and need to save each of them as txt...
Any text editor should be able to save the data within (i.e. why use Firefox?), and I think a straight rename from .htm or .html to .txt will work (at least on any Windows system). Or do you mean saving just the displayed text of the HTML file?
EDIT:
First, start off with this link, which has a good explanation of how to get started with shdocvw, which you will need to do this.
Once you have the reference set up, copy the functions
Function GetNewIE() As SHDocVw.InternetExplorer
and
Function LoadWebPage(i_IE As SHDocVw.InternetExplorer, i_URL As String) As Boolean
from the link into your project (as described there) and use them to load your individual html files, looping through each file. (Excel would be good for this, because you can put your list of files into the cells and cycle through each cell to retrieve them.) I have never done something like this with so many files, so I cannot guarantee this will work, unfortunately...
Dim IE As SHDocVw.InternetExplorer
Dim lRow as Long 'Long in case you have a LOT of files
Dim iFNum As Integer
Dim sFilePath As String
Set IE = GetNewIE
For lRow = 1 To 5000 Step 1 ' Assuming you have 5,000 html files, so 5,000 rows with the paths to each
sFilePath = ActiveSheet.Range("A" & lRow).Value ' This should also include the filepath. i.e. "C:\dir\..."
If LoadWebPage(IE, sFilePath) Then
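iFNum = FreeFile ' Next available file number (FreeFile only accepts 0 or 1 as an argument)
Open sFilePath & ".txt" For Output As iFNum
Print #iFNum, IE.Document.body.innerText ' Print, unlike Write, does not wrap the text in quotes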
Close #iFNum
End If
Next lRow

OOXML - Spreadsheet (.XLSX) created with Ruby won't recalc

I am building a ruby class/component to use in my Rails projects for creating reports/exports based on Excel .xlsx files. With the component, I can open a "template" .xlsx file, add data in rows to a sheet, save and then download the file to the user. It has been working well for several months now.
Now I need to take a pre-existing .xlsx file (think "form"), open it as a template, insert values in several of the cells, and then save and download to the user. For the most part, the process works. The one hitch is that one of the cells I am updating with data is within a range of cells that gets a SUM function applied to it. The problem: the SUM cell doesn't have the correct sum in it.
I've checked the cell both in Excel upon download, and also the underlying xml - the cell and its data is numeric - not text. When I try to manually recalc the sheet - nada. I can update one of the other cells in the range that is getting SUM'd, and it magically starts working - the SUM cell shows the proper total.
I read a post earlier today that mentioned removing the element from the total field in order to signal to Excel when the spreadsheet is opened that it should recalc - nope.
I'd really like to open source this component once I get this further along; I think it would be a BIG help to the Ruby community. Thanks in advance for any help!
Sounds like you need to set the fullCalcOnLoad attribute of the calcPr element to true:
<workbook>
<calcPr fullCalcOnLoad="1"/>
</workbook>
This will cause Excel to recalculate all formulas in the workbook when the file is opened.
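Since the component already rewrites the package's XML, the attribute can be set on the content of xl/workbook.xml before the file is zipped back up. Below is a minimal sketch that works on the XML as a plain string. The helper name force_full_calc_on_load is made up for illustration, and reading and writing the zip entry is left out because it depends on the zip library in use. When no calcPr element exists, the sketch inserts one right after </sheets>, which assumes the workbook has no functionGroups, externalReferences or definedNames elements (the schema places those between sheets and calcPr).

```ruby
# Hypothetical helper: returns the workbook.xml content with
# fullCalcOnLoad="1" set on the calcPr element.
def force_full_calc_on_load(workbook_xml)
  calc_pr_tag = /<calcPr\b[^>]*>/
  if workbook_xml.match?(calc_pr_tag)
    # calcPr already exists: overwrite or add the attribute on that tag.
    workbook_xml.sub(calc_pr_tag) do |tag|
      if tag.include?('fullCalcOnLoad=')
        tag.sub(/fullCalcOnLoad="[^"]*"/, 'fullCalcOnLoad="1"')
      else
        tag.sub(/<calcPr\b/, '<calcPr fullCalcOnLoad="1"')
      end
    end
  else
    # Assumption: nothing that must precede calcPr follows </sheets>.
    workbook_xml.sub('</sheets>', '</sheets><calcPr fullCalcOnLoad="1"/>')
  end
end
```

A string-based edit keeps the rest of the file byte-for-byte unchanged, which avoids side effects from re-serializing the whole document. A proper XML library would be the safer choice if the workbook part uses namespace prefixes on its elements.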
