Excel List-Object VBA Performance Bug?

I have a performance issue with an Excel application which uses List Objects (AKA Excel Tables). I suspect it may be a bug, but despite my Googling I could not find any reference to it. I've already developed a workaround for my application, but what I'm interested in is whether anyone can give any insight into why this happens.
Note: I’m using Excel 2007 on Windows Vista. The setup is as follows: I have a spreadsheet which holds data in a List Object, with VBA code which can be kicked off via a command button; this code may make several edits to any number of cells on the worksheet, so Excel’s Calculation mode is set to Manual prior to any edits.
The problem I've encountered is that if the currently active cell is within the List Object, then setting the Calculation mode to Manual seems to have no effect whatsoever. So if a user happens to have a calculation-heavy workbook open in the same instance, the VBA code runs very slowly. I practically had to pull my application apart to discover that this was caused by the active cell, and I created a new workbook with a simple version of this scenario to confirm that there wasn't some sort of corruption in my application.
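For reference, here is a minimal sketch of one possible workaround along these lines: move the selection off the table before editing. The actual workaround used in my application isn't shown here, so treat this as purely illustrative; it assumes cell A1 sits outside the table and that a range (not a chart) is currently selected.

' Illustrative workaround sketch: if the active cell sits inside a
' ListObject, move the selection off the table before switching to
' manual calculation and making edits.
Sub EditWithManualCalc()
    Dim previousSelection As Range
    Set previousSelection = Selection        ' assumes a range is selected

    If Not ActiveCell.ListObject Is Nothing Then
        ActiveSheet.Range("A1").Select       ' assumes A1 is outside the table
    End If

    Application.Calculation = xlCalculationManual
    ' ... perform the cell edits here ...
    Application.Calculation = xlCalculationAutomatic

    previousSelection.Select                 ' restore the user's selection
End Sub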
I’ve been doing a number of test cases with this, and below are the results from what I’ve found:
Although it seems generally related to the calculation, there is still a time difference when the calculation mode is switched between Manual and Automatic...
Manual = 7.64 secs
Automatic = 9.39 secs
Manual mode is just under 20% faster than Automatic. But my expectation was that they'd be more or less the same, considering the issue seems to be the calculation kicking off even when in Manual mode.
Compare that to when the active cell is not on a List Object, and the results are vastly different...
Manual = 0.14 secs
Automatic = 3.23 secs
Now the Manual run is over 50 times faster, and the Automatic run shows that the calculation shouldn't have taken any more than 3.2 secs! So the first test looks like it might have run the Calculation twice while in Manual mode, and nearly three times while in Automatic mode.
Repeating this test, this time in an instance with no formulas in any cells, it suddenly doesn't seem as bad:
Active cell is List Object & Calc is Manual = 0.17 secs
Active cell is List Object & Calc is Automatic = 0.20 secs
Active cell is Empty & Calc is Manual = 0.14 secs
Active cell is Empty & Calc is Automatic = 0.18 secs
It’s still slower, but now it’s only by 10-20%, making it unnoticeable. But this does show that the issue must be related to the Calculation in some way, as otherwise it should have taken just as long as the first test.
If anyone wants to create these tests to see for themselves, the setup is as follows:
New Workbook with a List Object added (doesn’t have to be linked to any data)
Add some formulas that will take Excel a while to calculate (I just did ‘=1*1’ repeated 30,000 times)
Write a quick VBA macro which will: (i) loop through a simple edit of a cell several hundred times, and (ii) record the time it took
Then just run the code while changing the active cell between the List Object and an empty cell; a rough sketch of such a test harness follows this list.
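Here is a rough sketch of what that test harness might look like, assuming the active sheet already contains the List Object and the 30,000 ‘=1*1’ formulas; the edited cell and the loop count are arbitrary:

' Sketch of the timing test described above. Toggle the Calculation
' line between xlCalculationManual and xlCalculationAutomatic, and run
' with the active cell inside and then outside the List Object.
Sub TimeCellEdits()
    Dim i As Long
    Dim startTime As Double

    Application.Calculation = xlCalculationManual
    startTime = Timer

    For i = 1 To 500                          ' several hundred edits
        ActiveSheet.Range("Z1").Value = i
    Next i

    Debug.Print "Elapsed: " & Format$(Timer - startTime, "0.00") & " secs"
    Application.Calculation = xlCalculationAutomatic
End Sub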
I'd be very interested to hear if anyone can explain why Excel behaves in this way, and whether it is a bug or some feature of List Objects which actually has a genuine use.
Thanks,
Stuart

This is not related to the "bug" you found, which is quite interesting and intriguing.
I just want to share that there is a great way to avoid calculation delays. I had fantastic results with this and now I use it all the time.
Simply put, Excel takes a long time copying data back and forth between the "VBA world" and the "spreadsheet world".
If you do all the "reads" at once, process, and then do all the "writes" at once, you get amazing performance. This is done using variant arrays as documented here:
http://msdn.microsoft.com/en-us/library/ff726673.aspx#xlFasterVBA
in the section labeled: Read and Write Large Blocks of Data in a Single Operation
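For illustration, a minimal sketch of that read-process-write pattern, with placeholder sheet and range names:

' Sketch of the read-once / process / write-once pattern using a
' variant array; the sheet is only touched twice.
Sub BulkReadWrite()
    Dim data As Variant
    Dim r As Long, c As Long

    ' One read: pull the whole range into a 1-based 2-D variant array.
    data = Worksheets("Sheet1").Range("A1:D10000").Value2

    ' Process entirely in VBA, with no sheet access inside the loops.
    For r = LBound(data, 1) To UBound(data, 1)
        For c = LBound(data, 2) To UBound(data, 2)
            If IsNumeric(data(r, c)) Then data(r, c) = data(r, c) * 2
        Next c
    Next r

    ' One write: push the array back in a single operation.
    Worksheets("Sheet1").Range("A1:D10000").Value2 = data
End Sub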
I was able to refactor some code I had that took 5 minutes to run and bring it down to 1.5 minutes. The refactoring took me 10 minutes, which is amazing because it was quite complex code.

Regarding Table performance (and performance, in general):
I know this is an old question, but I want to get this documented.
One thing that changed between older versions of Excel and the post-2007 versions is that Excel now activates the target sheet of any PasteSpecial operation. You cannot override it by turning off ScreenUpdating and making calculations manual. Such Activation WILL make the sheet visible, and cause uncontrollable flicker.
My original VBA code ran very fast on an old, single-processor XP box running Excel 2000. After the change to Excel 2013 on a modern machine, the slowness of code execution was stunning. The three areas that kill performance are: PasteSpecial from one sheet to another; any other code that requires activating sheets (Zoom level, Advanced Filter, sheet-level range names, etc.); and automating sheet protection/unprotection.
This is too bad, because PasteSpecial helped "cleanse" data you copy (Direct use of .Copy to a target will throw the occasional error).
So you need to review your code and make sure you are using direct assignment to the right property for the data type you need (from among Value, Value2, Text, and Formula, for example), instead of PasteSpecial.
e.g. .Range("MYRANGE").Value = .Cells(5, 7).Value2
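Expanding that one-liner, here is a hedged sketch of replacing a values-only PasteSpecial with direct assignment; the sheet and range names are placeholders:

' Instead of Copy + PasteSpecial xlPasteValues, assign the values
' directly; neither sheet needs to be activated.
Worksheets("Target").Range("A1:C100").Value2 = _
    Worksheets("Source").Range("A1:C100").Value2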
You also need to be scrupulous in resisting use of Select and Activate throughout your code.
As referenced above, many comments you'll find in Excel fora on that last point claim that you "never" need to use activation, which is clearly untrue, since several things in Excel only apply to or require active sheets. Understanding the cases where activation is forced automatically by a particular method or use of an object will help your coding as well. Unfortunately, you won't see much in the way of documentation of this.
Update:
Regarding Conditional Formatting, you'll find many complaints in various fora about the slowness of Excel when encountering a large number of Conditionally-formatted cells. I suspected this would impact Excel Tables since they have many table format options. To test this, I took a large workbook we use that is currently formatted as several worksheets with the same style of Excel Table on them.
After converting the tables to a conventional range, I noticed no difference in the speed of code execution, so the Table formatting itself evidently adds no meaningful overhead. This would seem to indicate that using Excel Table formats is far superior to conditionally formatting your own arrays of cells.

Lost Duration while Debugging Apex CPU time limit exceeded

I'm open to posting the code in this section to work through the optimization, but it's a bit lengthy and complex, so instead I'm hoping that somebody can assist me with a few debugging questions I have. My goal is to find out what is causing my Apex CPU Time Limit Exceeded issue.
When using the Debug Log in its basic or normal layout I receive the message
Maximum CPU Time: 15062 out of 10,000 ** Close to Limit
I've optimized and rewritten various loops and queries several times now, and in each case this number ends up around the same place, which leads me to believe it is lying to me and that my actual usage far exceeds that number. So on my journey I switched the Log Panels of the Developer Console to Analysis in hopes of isolating exactly which loop, method, or area of the code is giving me a headache.
This leads me to my main question and problem.
Execution Tree, Performance Tree & Executed Units
All show me that my durations are UNDER the 10,000 ms allowance. My largest consumption is 3,556.19 ms, used by a wrapper class I created; it is consumed in the constructor method, where a fair amount of logic constructs a fairly complicated wrapper class that spans 5-7 custom objects. Still, even with those 3,000 ms, the remainder of the process shows negligible times, bringing my total to around 4,000 ms. Again, my question is: why am I unable to see or find what is consuming all my time?
Incorrect Iteration Data
In addition to this, the Performance tree has a column of data that shows the number of iterations for each method. I know that my Production Org has 81 objects that would essentially call the constructor for my custom wrapper object, i.e. my constructor SHOULD be called 81 times, but instead it is called 32 times. So my other question is: can I rely on the iteration data in that column? Or, because it was iterating so many times, does it stop counting at a certain point? It's possible that one of my objects is corrupted or causing an infinite loop somehow, but I don't want to dig through all the data in search of that conclusion if it's a known issue that the iteration data is not accurate anyway.
System.Debug in the Production org
The last question is why my System.debug() lines are not displaying in the Developer Console on the production org. I've added several breadcrumbs throughout the code that would help me isolate just which objects are making it through and which are not; however, I cannot, in any layout, view System.debug messages outside of my Sandbox.
Sorry for the wealth of questions but I did want to give an honest effort to better understand the debugging process in Salesforce. If this is a lost cause I'm happy to start sharing some code as well but hopefully some debugging tips can get me to the solution.
It's likely your debug log got truncated, see "Each debug log must be 20 MB or smaller. If it exceeds this amount, you won’t see everything you need." in https://trailhead.salesforce.com/en/content/learn/modules/apex_basics_dotnet/debugging_diagnostics
Download the log and search for text similar to "skipped 123456 bytes of detailed log" to confirm, some system.debug statements will just not show up.
You might have to fine-tune the log levels (don't log validation rules and workflows? don't log every single variable assignment with "FINE" level etc). You might have to set all flags to NONE, then track only 1 particular class/trigger that you suspect (see https://help.salesforce.com/articleView?id=code_debug_log_classes.htm&type=5 and https://salesforce.stackexchange.com/questions/214380/how-are-we-supposed-to-use-debug-logs-for-a-specific-apex-class-only)
If it's truncated, it's possible the analysis tools give up (I've had mixed luck with the console, to be honest; sometimes https://apextimeline.herokuapp.com/ is great for an overview, but it'll also fail to parse a 20 MB log).
When all else fails, you can load the log into Notepad++ (or any editor of your choice), find the lines related to method entry/method exit (you might need a regular expression search), take these filtered lines to Excel, play with "text to columns", and just look at the timing manually to see if there's a record that causes the spike. It could be record #10 that's the problem; the fact that it exhausts limits on #32 of 81 doesn't mean much. A search like [METHOD_ENTRY|METHOD_EXIT]MyTriggerHandler.onBeforeUpdate could be a good start. But the first thing is to make sure the log is not truncated.

Making an application in Visual Basic to handle Dialogue in Morrowind?

I want to make a program for a very specific, catered purpose: to aid me in making a large set of quest mods for the videogame Elder Scrolls III: Morrowind. I'm attempting to do this through either Excel or Visual Basic, and here I've provided a little summary of how dialogue works in the game's normal creation program and then what I want to create outside of it and improve on.
How Morrowind dialogue works:
For those of you who may be familiar with the game, you'll remember that talking to NPCs brings up a set of text, and this text is their dialogue. There are different “topics”; if an NPC has dialogue set for a topic, the player can see the topic and click on it, bringing up a new wall of text. This is generally how dialogue works in the entire game on the player's end.
In creating a Morrowind mod, the way dialogue really works in the “Construction Set” (the program used to create and edit the game) is that a database contains every entry of text, and these entries have conditions set on them which limit which NPCs can say a given entry of dialogue. So, for instance, a topic like “latest rumors” will have lots of entries, with lots of different NPCs having something to say about it. The topic itself is a condition of sorts, with potentially dozens of entries attached to it, and conditions can also be set on specific entries. Conditions can include checking whether the NPC is in a given city, whether the in-game time is night or day, whether the player is at a certain numbered stage/index of a given quest line, and much, much more. This system is what makes all quests possible and the game dynamic.
What I want to create:
I am beginning a rather large mod project that includes many entries of dialogue, many new and old topics, and many quests and quest stages. I could list all the reasons here, but essentially my problem is that the Construction Set has many limitations in terms of organization that make it difficult to write a large mod's dialogue in it. I would be better off designing, setting the topics for, and editing all of my dialogue entries outside of the Construction Set program and implementing them when I'm confident that the writing and quests are finished.
Essentially if this is too complicated I could just write all the quests and dialogue in Microsoft Word, but optimistically I'd like to do something more dynamic and helpful to me, as a writer, and be able to use real variables to store and set Journal/Quest Indexes, filter dialogue by Quest or by NPC, and easily edit dialogue and quests without getting lost in the normal game’s thousands of lines of other dialogue.
*I can't post more than two links here, but I posted on reddit and there I have a gallery showing how the Construction Set works and what I have made in Visual Studio so far:
https://www.reddit.com/r/learnprogramming/comments/4oap6w/making_an_application_in_visual_basic_to_handle/
So, my intention is to make a program in Visual Studio using Visual Basic or Python that leaves me with a program that lets me write, organize, and set the text for dialogue and filter based on conditions.
This likely requires creating a database file for the program in Visual Studio, and being able to create variables at runtime, because I want the user of the program to be able to add new dialogue topics and new journals/quests, and all of these things will have conditions with values associated with them.
Any help, advice, and direction is appreciated. I am relearning Visual Studio (I took two courses in it) and I am unfortunately very new to Excel and databases in general.
You are correct in that a database of some kind would be needed. However, you could approach this several different ways depending upon your comfort level, money, portability requirements, etc...
One way to do it would be to use XML to store your data. It has the advantage of being extremely portable and transformable. Since this is likely a program where only one person would be directly accessing the data at any given time, it might be your best bet.
Another option is to use MS Access if you have office. This gives you a workable, albeit fairly basic, relational database. This would probably be a better choice if you have 2 or 3 people that could possibly be working in it.
A third option would be a full DBMS. MySQL is free and you could install that to your local machine, or to a remote server. Installed to a remote server would give you the option of allowing many people to connect to it and modify data transactionally. However, this would be overkill if it is only a one or two person system.
Circling back around to XML... That will most likely be your best bet. It is simple and integrates perfectly with .Net applications. It can be imported/transformed to any data-store later once you are finished (or multiple times as you progress). Interfacing with XML via .Net allows you to work with it like a database within your code, so if you design your data layer properly up front, you could even migrate to a full database later if the project expands drastically. The biggest downside to XML would be that it isn't relational in the way that a regular DBMS is, and it is not inherently transactional. You do not have atomic updates, so if you have several people modifying things at once you could lose data if it is overwritten.
You could get around that to an extent by writing a more advanced data layer to interface with the XML files, but if only one person is making changes locally, and then the data file is, say, uploaded to a remote datastore later, the only thing to keep in mind would be coordinating when and who can modify that file. Mostly logistics stuff at that point.
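As a concrete illustration of the XML route, here is a minimal VB.NET sketch using LINQ to XML; every element, attribute, and file name here is hypothetical, just to show the shape of the approach:

' Hypothetical sketch: persisting dialogue entries as XML via LINQ to XML.
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module DialogueStore
    Sub Main()
        ' Build a document holding one dialogue entry and its conditions.
        Dim doc As New XDocument(
            New XElement("Dialogue",
                New XElement("Entry",
                    New XAttribute("Topic", "latest rumors"),
                    New XAttribute("Npc", "Example NPC"),
                    New XElement("Condition",
                        New XAttribute("Type", "QuestIndex"),
                        New XAttribute("Value", "10")),
                    New XElement("Text", "Have you heard about..."))))
        doc.Save("dialogue.xml")

        ' Read it back and filter by topic, like querying a small database.
        Dim loaded = XDocument.Load("dialogue.xml")
        For Each entry In loaded.Descendants("Entry").
                Where(Function(e) e.Attribute("Topic").Value = "latest rumors")
            Console.WriteLine(entry.Element("Text").Value)
        Next
    End Sub
End Module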

TClientDataSet works VERY SLOWLY with 100K+ rows

I have a problem retrieving data with Delphi's TClientDataSet.
Code with ADO:
ADOQuery1.SQL.Text:='SELECT * FROM Table1 WITH (NoLock)';
DataSource1.DataSet:=ADOQuery1;
DataSource1.DataSet.Open;
DataSource1.DataSet.Last;
Code above returns over 180k rows in 3-6 seconds when using pure ADO.
Same code with TClientDataSet:
ADOQuery1.SQL.Text:='SELECT * FROM Table1 WITH (NoLock)';
CDS1.SetProvider(ADOQuery1);
DataSource1.DataSet:=CDS1;
DataSource1.DataSet.Open;
DataSource1.DataSet.Last;
The code above returns the same number of rows (over 180k), but takes 3-4 minutes.
What's wrong with CDS? It's about 100 times slower than using pure ADO.
Is it possible to fix it?
Code above returns over 180k rows in 3-6 seconds when using pure ADO.
For some reason, I wouldn't expect the code you posted to return every single one of the 180k records... I'd expect to see the first "X" records loaded after TADOQuery.Open is called, and then the last "X" records fetched when TADOQuery.Last is called. Looping with "while not EoF do" instead of calling ".Last" would probably be a better performance test, since (I assume) you actually want to browse all the records.
When calling TClientDataset.Last while linked to a DataProvider, it most likely does the equivalent of a "while not EoF do" over your query, which transfers all 180k records. Also, TClientDataset's Insert/Append operations tend to get slower and slower the more records it holds. My best guess is that it has to realloc its memory buffer every now and then. If that's the case, I haven't found a way to tell the TClientDataset: "Hey! Brace yourself, 180k records incoming!" (akin to TList.SetCapacity).
If you have an older version of delphi, one thing that could help is the Midas Speed Fix.
This is pretty old, but there are a lot of new Delphi programmers out there now, so here is a bit of a scoop.
When using a CDS in Delphi you are in fact creating an in-memory table. Your query probably went to swap hell.
To get the most out of a CDS, use DBX components to grab the data. They are so-called "fast forward" cursors that don't create a temp table with a cursor in the database. Forward-only doesn't do the fancy stuff that ADO does.
If you need massive datasets with total update notifications and full control, then use ADO. If you need to plow through a ton of data in a hurry with little load on the server, that is where CDS/DBX shines.
The way of DBX is harder. It is just a dragster: First and Next are the only things that work for it. No updates, no commitments, just a fast one-way relationship. Hook up a DBX/provider/CDS combo and you have it all: speed and editability. Use a version number to detect another user doing something to the data while you were editing. Study the provider options to see how to get power with flexibility. This is pretty much as hardcore as it gets in Delphi.
Setting the CDS1.LogChanges property to either False or True makes no difference; it's the same problem.
Try setting the CDS1.LogChanges property to False before loading the data. This needs to be done in code as it is not a published property.
From the helpfile: For large datasets, a value of True for LogChanges may severely impact the performance of the application.
You can then turn it on after the initial load.

Are Sitecore's sublayout rendering stats incorrect?

The built-in Sitecore rendering stats page at http://<sitename>/sitecore/admin/stats.aspx is really helpful for identifying inefficient and slow-loading XSLT renderings. Recently I've started switching to .ascx sublayouts to take advantage of the Sitecore C# API, which can help improve performance when used correctly.
However, I've noticed that sublayouts (as opposed to XSLT renderings) are not reported correctly on the stats page. See the screenshot below...
I know for a fact that this sublayout takes about 1.8 seconds to generate (I measured this in the code-behind). Caching is turned off, and I've refreshed the page 20 times to ensure I get an average. You will see that the "Avg. items" is always 0 - I can live with this - but the "Avg. time (ms)" is less than 1 ms, which is just clearly wrong.
Does anyone have any insights into this? Has anyone found a way to get it to work correctly?
Judging whether a statistic is right/wrong is going to rely on understanding exactly what it is measuring.
Digging around in Sitecore.Diagnostics.Statistics using Reflector I note the following:
Sitecore.Web.UI.Webcontrol contains a field m_timer
This is 'started' in the BeforeRender() method and 'stopped' in the AfterRender() method
Data from that timer is sent to Statistics.AddRenderingData() and is logged against the control
This means it is measuring the time taken to render the control. For an XSLT rendering, that includes the processing time for preparing all of its data; but since much of the work of a normal ASCX is done prior to the Render stage, the statistic is much less useful there. Incorporating the Load stage in the timing would inadvertently include the processing time of all child components, since the Load sequence is chained and called recursively, so that probably wouldn't help much either.
I suspect there is no good way of measuring the processing time for a specific ASCX control (excluding children) without first acquiring cumulative data then post-processing the call chain and splitting the time apart. This is the sort of thing RedGate ANTS does really well, but might not be so good if it was being executed on a live production system, given the overheads.

Performance tuning VBA code in large procedure

I've been asked to tune the performance of a specific function which runs every time a worksheet is opened (so it's important that it doesn't make things slow). One of the things that seems to make this function slow is a long call to the database (which is remote), but there are a bunch of other possibilities too. So far, I've been stepping through the code and making a note of anything that seems to take a long time as a candidate for tuning.
I'd like a more objective way to tell which calls are slowing me down. Searching for timing and VBA yields a lot of results which basically amount to "Write a counter, and start and stop it either side of the critical section" (often with the macro explicitly called). I was wondering whether there was a way to (in the debugger) do something like "Step to next line, and tell me the time elapsed".
If not, can someone suggest a reasonable macro that I could use in the Immediate window to get what I'm after? Specifically, I would like to be able to time an arbitrary line of code within a larger procedure (rather than a whole procedure at once, which is what I found through Google).
A keyword for your further search would be "profiler" for VBA. I've heard of VB Watch and the VBA Code Profiler System (VBACP), as well as Stephen Bull's PerfMon, but apart from the latter they're mostly not free.
So much for the official part of my answer; let me toss in some extra, possibly useless, suggestions:
Identifying "slow" code by human measurement in the debugger (run a line and say: "Whoa, that takes forever") is certainly helpful, and you can then start looking into why it's slow. Your remote database call may take quite long if it has to transmit a lot of data; in that case, it may be a good idea to timestamp the data on both ends and ask the DB whether the data has been modified before you grab it.
Writing the data into the sheet may be slow depending on the way you write it; this can sometimes be improved by writing arrays to a range instead of using some form of iteration.
And I probably don't need to tell you about ScreenUpdating and EnableEvents and so on?
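Since the question asks for a reasonable macro, here is a minimal sketch of the start/stop counter approach it mentions, packaged as two helpers (names are illustrative). Call Tic before the suspect line and Toc "some label" after it, either in code or from the Immediate window while stepping:

' Minimal start/stop timer helpers; put these in a standard module.
Private startTime As Double

Public Sub Tic()
    startTime = Timer                     ' seconds since midnight
End Sub

Public Sub Toc(label As String)
    Debug.Print label & ": " & Format$(Timer - startTime, "0.000") & " secs"
End Sub

Note that Timer's resolution is fairly coarse, so a very fast line may need to be wrapped in a loop to measure reliably.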
