Best Practices for Multiple OnEdit Functions - performance

Problem
I have 6 OnEdit Functions, which work as intended individually, but when together they don't work as intended. By this I mean some simply don't trigger.
Properties of the Script
They have different names - function onEdit(e) {code}, function onEdit1(e1) {code}, function onEdit2(e2) {code}, function onEdit3(e3) {code}, function onEdit4(e4) {code}, function onEdit5(e5) {code}
They are all in the same .gs tab
Some of them have the same variables. For example OnEdit has var range = e.range; and OnEdit5 has var range = e5.range;
My Understanding
I believe that you can run multiple OnEdit functions within the same .gs tab. Is this correct? Or do I need to somehow create new .gs tabs?
I believe that my onEdit functions should be named differently, so they are called correctly. Is this correct, or should I be getting rid of the different functions and putting them into one massive function? (I imagine this would lead to slower execution and more cases of not being able to isolate incorrect code).
I believe that the variables that are created within each function are specific to that function. Is this true? Or are they impacting each other?
Why I'm asking this
Iterations of this question seem to have been asked before. But people generally give advice on integrating two functions into one big one, rather than preparing someone to integrate 10-20 different OnEdit functions. Nor do they give a clear indication of best coding practices.
I've spent hours reading through this subject and feel that people new to scripts, like me, would greatly benefit from knowing this.
Thank you in advance for any contributions!

Notes:
There can only be one function with a same name. If there are two, the latter will overwrite the former. It's like the former never existed.
A function named onEdit is triggered automatically on (You guessed it!)edit
There's no simple trigger for other names like onEdit1 or onEdit2....
Simple triggers are limited to 30 seconds of execution
So, in a single code.gs file or even in a single project, there can only be one function named onEdit and trigger successfully.
If you create multiple projects, onEdit will trigger in each project asynchronously. But there are limits to number of projects that can be created and other quotas will apply.
Alternatively, you can use installed triggers: which doesn't have limit of 30s. You can also use any name for your function.
The best way to optimize functions is to never touch the spreadsheet unless it is absolutely necessary. For example, sorting various values inside the script is better than repeatedly calling .sort on the multiple ranges multiple times. The lesser the interaction between sheets and scripts, the better. A highly optimized script will only require two calls to spreadsheet: one to get the data and the other to set the data.
After optimizing the number of calls to sheet, you can optimize the script itself: Control the logic such that only the necessary amount of operations are done for each edit. For example, if the edit is in A1(A1,B1 are checkboxes, if clicked clears A2:A10,B2:B10 respectively), then you should check if A1 is clicked and If clicked, clear the range and exit and not to check for B1 again. Script optimization requires atleast a basic knowledge of JavaScript objects. Nevertheless, this isn't as effective as reducing the number of calls-which is the slowest part of any apps script.
References:
Best practices

Related

Performance implications of function calls in PSM1 Modules

I have a function that does a find/replace on text files, and it has worked well for some time. Until I needed to process a 12 million line file.
My initial code used Get-Content and Write-Content, and with the massive file it was going to take hours to process, not to mention the memory implications of loading 12 million lines into RAM.
So, I wrote a little test script to compare that approach vs Stream Reader/Writer. And Streaming looked like it was going to be a massive performance improvement, dropping processing to 30 seconds. I then added a .Replace() on each line, and total processing time only went up to maybe a minute. All good. So then I went to implement it in my real code, and performance has tanked again. That code is a PS1 that loads a number of PSM1 files. The function to do the find replace is in one of those PSM1 files, and that code calls functions in another PSM1. The test script was everything in a single small PS1.
Given that my test script didn't use a function call at all, I tested that first, so there is a function in the PS1 that gets called 12 million times from the loop in the same PS1. No real performance impact.
So, my thought then was that calling a function in one PSM1 that then calls a function in another PSM1 (12 million times) might be the issue. So I made a dummy function (which just returns the passed string, as if no replacement was needed) in the same PSM1 as the loop. And that is orders of magnitude slower.
I have not tested this with everything in the PS1, mostly because these functions are needed in three different scripts with very different argument requirements, so implementing it with Modules really made a lot of sense logistically, and changing that would be a massive undertaking.
That said, is there a known performance hit when calling a function that lives in a Module? I was under the impression that once the Modules are loaded, it's basically the same as if it was all in a single PS1, but perhaps not? FWIW, I am not using NameSpaces. All of my functions just have function name prefix on the noun side to avoid conflicts.
I also can't really post minimally functional code very easily since that's in a single file that doesn't exhibit the behavior. If there is no obvious answer to someone I guess my next step is to implement the test script with some modules, but that's not really apples to apples either, since my real modules are rather large.
To add a little context: When the function (in a PSM1) does not call a function and simply sets $writeLine = $originalLine total time is 15 seconds.
When doing an actual find and replace inline (no call to a function) like this $writeLine = $originalLine.Replace($replace, $with) total processing time is 16 seconds.
When calling a function in the same PSM1 that just returns the original string total time is 17 minutes.
But again, when it's all in a PS1 file with no modules, calling a function has minimal impact. So it certainly seems like calling a function in a PSM1, even from a function in that same PSM1, has a massive performance overhead.
And more context:
I moved the replace function in the test script into a Module. No appreciable change. So I moved the main code, including the loop, into a function in that module, and called it from the main script. Again, no real change. Both took around 15 seconds.
So, it's not something innate in Modules. That then begs the question, what could I be doing in my other modules that would trigger this behavior? This modules are 3000-10,000 lines of code, so there is a lot going on. Hopefully someone has some insight as to best practices with modules to mitigate this. And hopefully it's not "Don't use big modules". ;)
Final update:
It seems it IS a function of how big the module is. I deleted all the other functions in the Module that contains the loop, and performance is fine, 17 seconds. So, basically even as of PS5.0, the implementation of modules is pretty useless for anything large. Rather disconcerting. I wonder if the same would be true if all the functions where in a single file, and PowerShell performance with large files with lots of functions is just bad? Anyone have any experience down this road?

How to get the best perfomance in Google Script?

As you might know, Google gives only 1 hour of execution time for triggered scripts (web-apps and manual calls are not in this hour).
I hadn't found a good explanation of how to boost script perfomance, so I had to find out it myself. Here's what helped me:
10 tips & tricks to get the best perfomance in Google Script
There is an official Google's best practices documentation on how to write scripts. However it's not that comprehensive and misses some important tips and explanations, which you will additionally find here.
Use as less as possible time consuming methods.
Any method which is considered as time-consuming by Google you will always find in the execution report .
Some examples of time-consuming methods:
Methods, which read/save information to the Sheet: getValue, getValues, getDisplayValue, getRangeByName, getSheet, setValue, deleteRow etc. Some consume more, some less.
Methods, using other external services, like GmailApp, Utilities etc.
Document/Script properties (but they are a somewhat faster than getting data from sheet).
If you use Libraries, don't forget to create a new version of the library, when you finish coding, and switch off developer mode in you calling scripts (sorry for non-English screenshots). In my case it speeded up the dummy script launches (when the script got a signal, that it should stop) by 3 times.
The parts of code, which are non included into any function, and can be executed conditionally must be grouped into initialization functions and called when a condition is met. This actually can be applied to any part of code. The thing is that the code, which is located outside of any function, is always executed, whether it's in your basic project or in a library.
Use Batch operations when possible. One getValues() on 100 cells is much faster then 100 getValue() on each cell. Same for deleteRows() and setValues() etc.
If you use many script/document properties, use batch methods also.
If you use many static Named Ranges, create a cache for all of them and make an object (hash array) from that. Make a refreshing procedure and use it when required. My project has 137 named ranges: this point had a huge effect.
Avoid sleep method, if possible.
Google advises to use cache for web pages with fetchUrl (if applicable).
Google advises to avoid Library usage in UI heavy scripts (e.g. when you use triggers based on Google Sheet actions).
If your triggered script isn't supposed to work 24 hours make a work schedule for it and route your script to a lightweight procedure.
Example:
if ((new Date()).getHours() < 9) {
var TriggeredEveryMinuteFunction = function() {
//...do some lightweight stuff or nothing...
}
} else {
function TriggeredEveryMinuteFunction() {
// ...do some heavy stuff...
func2();
}
function func2() { /*some more stuff*/ }
var func3 = function() { /*some more stuff*/ }
var etc() { /*some more stuff*/ }
}
In this example functions func2,func3,etc are not compiled when it's less than 9 o'clock.
If you try to call them, you'll get "not found" message.

Does using global variables impact performance in MATLAB?

As I understand, MATLAB cannot use pass by reference when sending arguments to other functions. I am doing audio processing, and I frequently have to pass waveforms as arguments into functions, and because MATLAB uses pass by value for these arguments, it really eats up a lot of RAM when I do this.
I was considering using global variables as a method to pass my waveforms into functions, but everywhere I read there seems to be a general opinion that this is a bad idea, for organization of code, and potentially performance issues... but I haven't really read any detailed answers on how this might impact performance...
My question: What are the negative impacts of using global variables (with sizes > 100MB) to pass arguments to other functions in MATLAB, both in terms of 1) performance and 2) general code organization and good practice.
EDIT: From #Justin's answer below, it turns out MATLAB does on occasion use pass by reference when you do not modify the argument within the function! From this, I have a second related question about global variable performance:
Will using global variables be any slower than using pass by reference arguments to functions?
MATLAB does use pass by reference, but also uses copy-on-write. That is to say, your variable will be passed by reference into the function (and so won't double up on RAM), but if you change the variable within the the function, then MATLAB will create a copy and change the copy (leaving the original unaffected).
This fact doesn't seem to be too well known, but there's a good post on Loren's blog discussing it.
Bottom line: it sounds like you don't need to use global variables at all (which are a bad idea as #Adriaan says).
While relying on copy on write as Justin suggested is typically the best choice, you can easily implement pass by reference. With Matlab oop being nearly as fast as traditional functions in Matlab 2015b or newer, using handle is a reasonable option.
I encountered an interesting use case of a global variable yesterday. I tried to parallellise a piece of code (1200 lines, multiple functions inside the main function, not written by me), using parfor.
Some weird errors came out and it turned out that this piece of code wrote to a log file, but used multiple functions to write to the log file. Rather than opening and closing the relevant log file every time a function wanted to write to it, which is very slow, the file ID was made global, so that all write-functions could access it.
For the serial case this made perfect sense, but when trying to parallellise this, using global apparently breaks the scope of a worker instance as well. So suddenly we had 4 workers all trying to write into the same log file, which resulted in some weird errors.
So all in all, I maintain my position that using global variables is generally a bad idea, although I can see its use in specific cases, provided you know what you're doing.
Using global variables in Matlab may increase performance alot. This is because you can avoid copying of data in some cases.
Before attempting to gain such performance tweaks, think carefully of the cost to your project, in terms of the many drawbacks that global variables come with. There are also pitfalls to using globals with bad consequences to performance, and those may be difficult to avoid(although possible). Any code that is littered with globals tend to be difficult to comprehend.
If you want to see globals in use for performance, you can look at this real-time toolbox for optical flow that I made. This is the only project in native Matlab that is capable of real-time optical flow that I know of. Using globals was one of the reasons this was doable. It is also a reason to why the code is quite difficult to grasp: Globals are evil.
That globals can be used this way is not a way to argue for their use, rather it should be a hint that something should be updated with Matlabs unflexible notions of workspace and inefficient alternatives to globals such as guidata/getappdata/setappdata.

Is it bad to have many global functions?

I'm relatively new to software development, and I'm on my way to completing my first app for the iPhone.
While learning Swift, I learned that I could add functions outside the class definition, and have it accessible across all views. After a while, I found myself making many global functions for setting app preferences (registering defaults, UIAppearance, etc).
Is this bad practice? The only alternate way I could think of was creating a custom class to encapsulate them, but then the class itself wouldn't serve any purpose and I'd have to think of ways to passing it around views.
Global functions: good (IMHO anyway, though some disagree)
Global state: bad (fairly universally agreed upon)
By which I mean, it’s probably a good practice to break up your code to create lots of small utility functions, to make them general, and to re-use them. So long as they are “pure functions”
For example, suppose you find yourself checking if all the entries in an array have a certain property. You might write a for loop over the array checking them. You might even re-use the standard reduce to do it. Or you could write a re-useable function, all, that takes a closure that checks an element, and runs it against every element in the array. It’s nice and clear when you’re reading code that goes let allAboveGround = all(sprites) { $0.position.y > 0 } rather than a for…in loop that does the same thing. You can also write a separate unit test specifically for your all function, and be confident it works correctly, rather than a much more involved test for a function that includes embedded in it a version of all amongst other business logic.
Breaking up your code into smaller functions can also help avoid needing to use var so much. For example, in the above example you would probably need a var to track the result of your looping but the result of the all function can be assigned using let. Favoring immutable variables declared with let can help make your program easier to reason about and debug.
What you shouldn’t do, as #drewag points out in his answer, is write functions that change global variables (or access singletons which amount to the same thing). Any global function you write should operate only on their inputs and produce the exact same results every time regardless of when they are called. Global functions that mutate global state (i.e. make changes to global variables (or change values of variables passed to them as arguments by reference) can be incredibly confusing to debug due to unexpected side-effects they might cause.
There is one downside to writing pure global functions,* which is that you end up “polluting the namespace” – that is, you have all these functions lying around that might have specific relevance to a particular part of your program, but accessible everywhere. To be honest, for a medium-sized application, with well-written generic functions named sensibly, this is probably not an issue. If a function is purely of use to a specific struct or class, maybe make it a static method. If your project really is getting too big, you could perhaps factor out your most general functions into a separate framework, though this is quite a big overhead/learning exercise (and Swift frameworks aren’t entirely fully-baked yet), so if you are just starting out so I’d suggest leaving this for now until you get more confident.
* edit: ok two downsides – member functions are more discoverable (via autocomplete when you hit .)
Updated after discussion with #AirspeedVelocity
Global functions can be ok and they really aren't much different than having type methods or even instance methods on a custom type that is not actually intended to contain state.
The entire thing comes down mostly to personal preference. Here are some pros and cons.
Cons:
They sometimes can cause unintended side effects. That is they can change some global state that you or the caller forgets about causing hard to track down bugs. As long as you are careful about not using global variables and ensure that your function always returns the same result with the same input regardless of the state of the rest of the system, you can mostly ignore this con.
They make code that uses them difficult to test which is important once you start unit testing (which is a definite good policy in most circumstances). It is hard to test because you can't mock out the implementation of a global function easily. For example, to change the value of a global setting. Instead your test will start to depend on your other class that sets this global setting. Being able to inject a setting into your class instead of having to fake out a global function is generally preferable.
They sometimes hint at poor code organization. All of your code should be separable into small, single purpose, logical units. This ensures your code will remain understandable as your code base grows in size and age. The exception to this is truly universal functions that have very high level and reusable concepts. For example, a function that lets you test all of the elements in a sequence. You can also still separate global functions into logical units by separating them into well named files.
Pros:
High level global functions can be very easy to test. However, you cannot ignore the need to still test their logic where they are used because your unit test should not be written with knowledge of how your code is actually implemented.
Easily accessible. It can often be a pain to inject many types into another class (pass objects into an initializer and probably store it as a property). Global functions can often remove this boiler plate code (even if it has the trade off of being less flexible and less testable).
In the end, every code architecture decision is a balance of trade offs each time you go to use it.
I have a Framework.swift that contains a set of common global functions like local(str:String) to get rid of the 2nd parameter from NSLocalize. Also there are a number of alert functions internally using local and with varying number of parameters which makes use of NSAlert as modal dialogs more easy.
So for that purpose global functions are good. They are bad habit when it comes to information hiding where you would expose internal class knowledge to some global functionality.

Excel List-Object VBA Performance Bug?

I have an issue with performance on an excel application which uses List Objects (AKA Excel Tables). I suspect it may be a bug, but despite my Googling I could not find any reference of it. I've already developed a workaround for my application, but what I’m interested in is if anyone can give any insight into why this happens.
Note: I’m using Excel 2007 on Windows Vista. The setup is as follows: I have a spreadsheet which holds data in a List Object, with VBA code which can be kicked off via a command button; this code may make several edits to any number of cells on the worksheet, so Excel’s Calculation mode is set to Manual prior to any edits.
The problem I’ve encountered is that if the currently active cell is within the List Object, then setting the Calculation Mode to manual seems to have no effect whatsoever. So if a user happens to have a heavy calculation workbook open in the same instance, then the VBA code runs very slowly. I practically had to pull my application apart to discover that this was caused by the active cell; and I created a new workbook with simple version of this scenario to confirm that there wasn’t some sort of corruption on my application.
I’ve been doing a number of test cases with this, and below are the results from what I’ve found:
Although it seems generally related to the calculation, there is still a time difference when the calculation mode is switched between Manual and Automatic...
Manual = 7.64 secs
Automatic = 9.39 secs
Manual mode is just fewer than 20% faster than Automatic. But my expectation was they’d be more or less the same, considering the issue seems to be the calculation kicking off even when in Manual mode.
Compare that to when the active cell is not on a List Object, and the results are vastly different...
Manual = 0.14 secs
Automatic = 3.23 secs
Now, the Manual run is 50 times faster, and Automatic run shows that the calculation shouldn’t have taken any more than 3.2 secs! So now the first test looks like it might have run the Calculation twice while in Manual mode, and nearly 3 times while in Automatic mode.
Repeating this test again, this time in an instance with no calculation formula in any cells, and suddenly it doesn’t seem as bad,
Active cell is List Object & Calc is Manual = 0.17 secs
Active cell is List Object & Calc is Automatic = 0.20 secs
Active cell is Empty & Calc is Manual = 0.14 secs
Active cell is Empty & Calc is Automatic = 0.18 secs
It’s still slower, but now it’s only by 10-20%, making it unnoticeable. But this does show that the issue must be related to the Calculation in some way, as otherwise it should have taken just as long as the first test.
If anyone wants to create these tests to see for themselves, the setup is as follows:
New Workbook with a List Object added (doesn’t have to be linked to any data)
Add some formula that will take excel a while to calculate (I just did ‘=1*1’ repeated 30,000 times)
Write a quick VBA code which will; (i) loop through a simple edit of a cell several hundred times, (ii) and record the time it took
Then just run the code while changing the active cell between the List Object and an empty cell
I’d be very interested to hear if anyone can explain why Excel behaves in this way, and if is a bug or if is some feature to do with List Objects which actually has some genuine use?
Thanks,
Stuart
This is not relative to the "bug" you found, which is quite interesting and intriguing.
I just want to share that there is a great way to avoid calculation delays. I had fantastic results with this and now I use it all the time.
Simply put, Excel takes a long time copying data back and forth between the "VBA world" and the "spreadsheet world".
If you do all the "reads" at once, process, and then do all the "writes" at once, you get amazing performance. This is done using variant arrays as documented here:
http://msdn.microsoft.com/en-us/library/ff726673.aspx#xlFasterVBA
in the section labeled: Read and Write Large Blocks of Data in a Single Operation
I was able to refactor some code I had that took 5 minutes to run and bring it down to 1.5 minutes. The refactoring took me 10 minutes, which is amazing because it was quite complex code.
Regarding Table performance (and performance, in general):
I know this is an old question, but I want to get this documented.
One thing that changed between older versions of Excel and the post-2007 versions is that Excel now activates the target sheet of any PasteSpecial operation. You cannot override it by turning off ScreenUpdating and making calculations manual. Such Activation WILL make the sheet visible, and cause uncontrollable flicker.
My original VBA code ran very fast on an old, single-processor XP box running Excel 2000. The change to Excel 2013 on a modern machine was stunning in the terrible slowness of code execution. The three areas that kill performance are PasteSpecial from one sheet to another, any other code that requires activating sheets (Zoom level, Advanced Filter, Sheet-Level range names, etc), and automating sheet protection/unprotection.
This is too bad, because PasteSpecial helped "cleanse" data you copy (Direct use of .Copy to a target will throw the occasional error).
So you need to review your code and make sure you are using direct assignment to the right property for the data type you need (from among Value, Value2, Text, and Formula, for example), instead of PasteSpecial.
e.g. .Range("MYRANGE").Value = .Cells(5, 7).Value2
You also need to be scrupulous in resisting use of Select and Activate throughout your code.
As referenced above, many comments you'll find in Excel fora about that last point will make statements that you "never" need to use Activation, which is clearly untrue, since several things in Excel only apply to or require active sheets. Understanding the cases where activation is forced automatically by a particular method or use of an object will help in coding as well. Unfortunately, you won't see much in the way of documentation of this.
Update:
Regarding Conditional Formatting, you'll find many complaints in various fora about the slowness of Excel when encountering a large number of Conditionally-formatted cells. I suspected this would impact Excel Tables since they have many table format options. To test this, I took a large workbook we use that is currently formatted as several worksheets with the same style of Excel Table on them.
After converting the tables to a conventional range, I noticed no difference in speed of code execution. This would seem to indicate that use of Excel Table formats is far superior to conditionally-formatting your own arrays of cells.

Resources