Perl, Alias sub to variable - performance

I'm currently doing micro-optimization of a Perl program and would like to optimize some getters.
I have a package with this getter structure:
package Test;
our $ABC;
sub GetABC { $ABC } # exported sub...
Calling GetABC() creates a lot of sub-related overhead. Accessing the variable directly via $Test::ABC is insanely faster.
Is there a way to alias the getter to the variable to gain the same performance boost as if I accessed the variable directly? The inlining hint with a "()" prototype doesn't seem to work...

There is no way to turn a variable into an accessor sub, or to replace a sub with a variable access. You will have to live with the overhead.
Non-solutions:
Using a () prototype does not turn calls to your sub into constant accesses, because that prototype merely makes a sub potentially eligible for inlining. Since the body of the sub is not itself constant, this sub cannot be inlined as a constant.
The overhead is per-call, as Perl has to do considerable bookkeeping for each call. Therefore, rewriting the accessor in XS won't help much.
Creating a constant won't help because the constant will be a copy, not an alias of your variable (see the sketch below).
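A minimal sketch of that copy behaviour, using use constant purely for illustration (the BEGIN block is only there so the variable has a value by the time the constant is created):
use strict;
use warnings;
use feature 'say';

package Test;
our $ABC;
BEGIN { $ABC = 42 }                 # assign before the constant below is compiled

package main;
use constant GetABC => $Test::ABC;  # the constant copies the value 42 at compile time

$Test::ABC = 7;                     # later changes to the variable...
say GetABC;                         #=> 42 (...are not reflected by the constant)
say $Test::ABC;                     #=> 7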
But looking at the constant.pm source code seems to open up an interesting solution. Note that this is a hack and may not work in all versions of Perl: when we assign a scalar ref directly to a symbol table entry that does not yet contain a typeglob, an inlineable sub springs into place:
package Foo;
use strict;
use warnings;
use feature 'say';
my $x = 7;
BEGIN { $Foo::{GetX} = \$x } # don't try this at home
say GetX; #=> 7
$x = 3;
say GetX; #=> 3
This currently works on most of my installed perl versions (5.14, 5.22, 5.24, 5.26). However, my 5.22-multi and 5.26-multi die with "Modification of a read-only value attempted". This is not a problem for the constant module, since it makes the reference target read-only first and (more importantly) never modifies that variable.
So not only does this not work reliably, it will also completely mess up constant folding.
If the function call overhead is indeed unbearable (e.g. it takes a double-digit percentage of your processing time), then doing the inlining yourself in the source code is going to be your best bet. Even if you have a lot of call locations, you can probably write a simple script (sketched below) that fixes the easy cases for you: select all files that import your module and contain only a single package declaration. Within such files, replace calls to GetABC (with or without parens) with fully qualified variable accesses, hoping that the token is not mentioned within any strings. Afterwards, you can manually inspect the few remaining occurrences of these calls.
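A rough sketch of such a fix-up script; the file list and the exact regex are assumptions, only GetABC and $Test::ABC come from the question, and the "single package declaration" check is left out:
#!/usr/bin/perl
# Naive sketch: rewrite GetABC calls into direct $Test::ABC accesses.
use strict;
use warnings;

@ARGV = glob 'lib/*.pm';   # hypothetical list of files that import the getter
$^I   = '.bak';            # edit in place, keeping a .bak copy of each file

while (<>) {
    # Replace bare calls and empty-paren calls; calls inside strings are not detected.
    s/\bGetABC\b(?:\s*\(\s*\))?/\$Test::ABC/g;
    print;
}
Inspect the resulting diffs before deleting the backups.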

Related

Local declaration of (built-in) Lua functions to reduce overhead

It is often said that one should re-declare (certain) Lua functions locally, as this reduces the overhead.
But what is the exact rule / principle behind this? How do I know for which functions this should be done and for which it is superfluous? Or should it be done for EVERY function, even your own?
Unfortunately I can't figure it out from the Lua manual.
The principle is that every time you write table.insert for example, the Lua interpreter looks up the "insert" entry in the table called table. Actually, it means _ENV.table.insert - _ENV is where the "global variables" are in Lua 5.2+. Lua 5.1 has something similar but it's not called _ENV. The interpreter looks up the string "table" in _ENV and then looks up the string "insert" in that table. Two table lookups every time you call table.insert, before the function actually gets called.
But if you put it in a local variable, then the interpreter gets the function directly from the local variable, which is faster. It still has to look it up once, to fill in the local variable.
It is superfluous if you only call the function once within the scope of the local variable, but that is pretty rare. There is no reason to do it for functions which are already declared as local. It also makes the code harder to read, so typically you won't do it except when it actually matters (in code that runs a lot of times).
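A minimal sketch of what that looks like in practice (the loop and the table are just illustrative):
local insert = table.insert   -- one lookup, done once, stored in a local

local t = {}
for i = 1, 1000000 do
    insert(t, i * i)          -- no _ENV.table / insert lookups inside the hot loop
end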
My favorite tool for speeding things up in Lua is to place all the usable stuff for a table in a metatable entry called __index.
A common example of this is the string datatype.
It has all the string functions available as methods through its __index metatable.
Therefore you can do things like this directly on a string...
print(('istaqsinaayok'):upper():reverse())
-- Output: KOYAANISQATSI
The logic above: the lookup for a method on the string itself fails, so the __index metamethod is consulted for that method.
I like to implement the same behaviour for the number datatype...
-- do debug.setmetatable() only once; it affects all numbers
math.pi = debug.setmetatable(math.pi, {__index = math})
-- From now on, numbers are objects ;-)
-- Let's output pi, but without writing math.pi this time
print((180):rad()) -- pi computed with the rad() method
-- Output: 3.1415926535898
The logic: if a key does not exist, __index is looked up. That is only one step behind a local, imho.
Another example that works with this method...
-- koysenv.lua
_G = setmetatable(_G,
  { -- metamethods
    __index = {},                -- fresh table that will hold the old globals
    __name  = 'Global Environment'
  })

-- copy everything in _G into __index
for key, value in pairs(_G) do
  getmetatable(_G)['__index'][key] = value
end

-- then remove everything that is now in __index from _G
for key, value in pairs(getmetatable(_G)['__index']) do
  _G[key] = nil
end

return _G
When loaded as the last require, it moves everything in _G into the freshly created metatable entry __index.
After that, _G looks totally empty ;-P
...but the environment keeps working as if nothing had happened.
To add to what user253751 already said:
Code Quality
Lua is a very flexible language. Other languages require you to import the parts of the standard library you use; Lua doesn't. Lua provides a single global environment, which you usually don't want to pollute. If you play with the environment _ENV (setfenv/getfenv on Lua 5.1 / LuaJIT), you'll still want to be able to access the Lua libraries. For that purpose you may want to localize them before changing the environment; you can then use your "clean" environment for your module / API table / class / whatever. Another option here is to use metatables; metatable chains may quickly get hairy though, and are likely to harm performance, since a failed table lookup is required each time to trigger the indexing metamethods. Localizing otherwise global variables can thus be seen as a way of importing them; to give a minimal & rough example:
local print = print -- localize ("import") everything we need first
_ENV = {} -- set environment to a clean table for the module
function hello() -- this writes to _ENV instead of _G
    print("Hello World!")
end
hello() -- inside the environment, all variables set here are accessible
return _ENV -- "export" the API table
Performance
Very minor nitpick: Local variables aren't strictly always faster. In very extreme cases (i.e. lots of upvalues), indexing a table (which doesn't need an upvalue if it's the environment, the string metatable or the like) may actually be faster.
I imagine that localizing variables is required for many of the optimizations of optimizing compilers such as LuaJIT to be applicable, though; otherwise very few guarantees can be made. A global like print might as well be overwritten somewhere in a deep code path, so the indexing operation has to be repeated every time; for a local, on the other hand, the interpreter has far more guarantees regarding its scope. It is thus able to detect constants that are only written to once, on initialization for instance; for globals, very little code analysis is possible.
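If you want to see whether this matters for your code, a rough micro-benchmark sketch (timings will vary a lot between PUC Lua and LuaJIT; the workload is made up):
local clock = os.clock

local function bench(label, fn)
    local t0 = clock()
    fn()
    print(label, clock() - t0)
end

bench("global math.sqrt", function()
    local sum = 0
    for i = 1, 1e7 do sum = sum + math.sqrt(i) end   -- two table lookups per call
end)

bench("localized sqrt", function()
    local sqrt = math.sqrt                           -- looked up once
    local sum = 0
    for i = 1, 1e7 do sum = sum + sqrt(i) end
end)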

Is there a way to use a dynamic function name in Elixir from string interpolation like in Ruby?

I want to be able to construct a function call from a string in elixir. Is this possible? The equivalent ruby method call would be:
"uppercase".send("u#{:pcase}")
Although the answer by fhdhsni is perfectly correct, I'd add some nitpicking clarification.
An exact equivalent of Kernel#send from Ruby is impossible in Elixir, because Kernel#send allows calling private methods on the receiver, while in Elixir private functions do not exist in the compiled code at all.
If you meant Kernel#public_send, it can be achieved with Kernel.apply/3, as mentioned by fhdhsni. The only correction: since the atom table is not garbage collected, and one surely wants to call a function that actually exists, it should be done with String.to_existing_atom/1.
apply(
  String,
  String.to_existing_atom("u#{:pcase}"),
  ["uppercase"]
)
Also, one might use macros during the compilation stage to generate respective clauses when the list of functions to call is predictable (when it’s not, the code already smells.)
defmodule Helper do
  Enum.each(~w|upcase|a, fn fname ->
    def unquote(fname)(param),
      do: String.unquote(fname)(param)
    # or
    # defdelegate unquote(fname)(param), to: String
  end)
end
Helper.upcase("uppercase")
#⇒ "UPPERCASE"
In Elixir module and function names are atoms. You can use apply to call them dynamically.
apply(String, String.to_atom("u#{:pcase}"), ["uppercase"]) # "UPPERCASE"
Depending on your use case it might not be a good idea to create atoms dynamically (since the atom table is not garbage collected).

Type mismatch when calling a function in qtp

I am using QTP 11.5 for automating a web application. I am trying to call an action in QTP from a driver script as below:
RFSTestPath = "D:\vf74\D Drive\RFS Automation\"
LoadAndRunAction RFStestPath & LogInApplication,"Action1",oneIteration
Inside LogInApplication (Action1) I am calling a login function as:
Call fncLogInApplication(strURL,strUsesrName,strPasssword)
Definition of fncLogInApplication is written in fncLogInApplication.vbs
When I associate the fncLogInApplication.vbs file with the driver script, I am able to execute my code without any errors. But when I de-associate the .vbs file from the driver script and associate it with the LogInApplication test, I get "Type mismatch: 'fncLogInApplication'".
Can anyone help me with the association, please? I want fncLogInApplication to be executed when it is associated with LogInApplication, not with the main driver script.
Please comment back if you require any more info
There is only one set of associated libraries that is active at any one time: that of the outermost test.
This means that if test A calls test B, test B will be executed with the libraries loaded based upon test A's associated libraries list, not B's.
This also means that if B depends on a library, and B associates this library, but B is called from test A (which does not associate this library), then B will fail to call (locate) the function, since the associated libraries of B are never loaded (only those from A are). (As would A, naturally.)
If you are still interested: "Type mismatch" is QTP's (or VBScript's) poor way of telling you: "The function called is not known, so I bet you instead meant an array variable dereference, and the variable you specified is Empty, so it is not an array and thus cannot be dereferenced as an array variable, which is what I call a 'type mismatch'."
This reasoning is valid, considering the syntax tree of VB/VBScript: function calls and array variable dereferences cannot be formally differentiated. Syntactically, they are very similar, or identical in most cases. So be prepared to treat "Type mismatch" as the "unknown function referenced" message that VB/VBScript never displays.
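A minimal repro sketch of that behaviour, assuming Option Explicit is not in effect:
' Without Option Explicit, an unknown name silently becomes an Empty variable.
' The engine then treats the "call" as an array access on that Empty variable
' and reports a type mismatch instead of an unknown function.
result = NoSuchFunction(42)   ' => "Type mismatch: 'NoSuchFunction'"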
You can, however, load the library you want in test B's code (for example, using LoadFunctionLibrary), but this also allows A to call functions from that library once B has loaded it and control has returned to A. This, and all the possible variations of this procedure, have side effects on aspects like debugging, forward references and visibility of global variables, so I would recommend against it.
Additional notes:
There is no good reason to use CALL. Just call the sub or function.
If you call a function and use the result it returns, you must put the arguments in parentheses.
If you call a sub (or a function whose result you don't use), you must not put the argument list in parentheses. If the sub or function accepts only one argument, it might look like you are allowed to put it in parentheses, but that is not true: in this case, the argument is simply treated as a term in parentheses.
The argument "bracketing" aspects just listed can create very nasty bugs, especially if the argument is ByRef, also due (but not limited) to the fact that VBScript unfortunately allows you to pass values for a ByRef argument (where a variable parameter is expected), so it is generally a good idea to put parentheses only where they belong, i.e. where absolutely needed (a short sketch follows below).
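A small sketch of that ByRef pitfall (Increment is a made-up sub, not from the question):
Sub Increment(ByRef n)
    n = n + 1
End Sub

Dim x
x = 1
Increment x    ' proper sub call: x is passed by reference, x becomes 2
Increment (x)  ' (x) is an expression, so only a copy is passed: x stays 2
MsgBox x       ' => 2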

Using the name "function" for a variable in Python code

Is using the word function for the name of an argument considered bad style in Python code?
def conjunction_junction(function):
pass # do something and call function in here
This pattern occurs all the time, especially in decorators. You see func, fn and f used all of the time but I prefer to avoid abbreviations when possible. Is the fact that it's the name of a type enough to warrant abbreviating it?
>>> type(conjunction_junction).__name__
'function'
It's not a reserved keyword, so I don't see why not.
From the Style Guide
If a function argument's name clashes with a reserved keyword, it is
generally better to append a single trailing underscore rather than
use an abbreviation or spelling corruption. Thus class_ is better than
clss. (Perhaps better is to avoid such clashes by using a synonym.)
Using function is perfectly fine.
There is nothing in the style guide about it specifically. The reason that using type names such as str and list is highly discouraged is that they have functionality within the language; overwriting them would obscure the behaviour of the code. function, on the other hand, is not a built-in name, so shadowing it does nothing.
I suspect func, fn, and f are used because they are all shorter than typing function ;)
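For instance, a decorator reads perfectly well with the unabbreviated name (a minimal sketch, names made up):
import functools

def logged(function):                      # 'function' shadows no built-in name
    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        print(f"calling {function.__name__}")
        return function(*args, **kwargs)
    return wrapper

@logged
def greet(name):
    return f"Hello, {name}!"

print(greet("world"))  # prints "calling greet", then "Hello, world!"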

Using function arguments as local variables

Something like this (yes, this doesn't deal with some edge cases - that's not the point):
int CountDigits(int num) {
    int count = 1;
    while (num >= 10) {
        count++;
        num /= 10;
    }
    return count;
}
What's your opinion about this? That is, using function arguments as local variables.
Both are placed on the stack and are pretty much identical performance-wise; I'm wondering about the best-practices aspect of this.
I feel like an idiot when I add an additional and quite redundant line to that function consisting of int numCopy = num, however it does bug me.
What do you think? Should this be avoided?
As a general rule, I wouldn't use a function parameter as a local processing variable, i.e. I treat function parameters as read-only.
In my mind, intuitively understandable code is paramount for maintainability, and modifying a function parameter to use it as a local processing variable tends to run counter to that goal. I have come to expect that a parameter will have the same value in the middle and at the bottom of a method as it does at the top. Plus, an aptly named local processing variable may improve understandability.
Still, as Stewart says, this rule is more or less important depending on the length and complexity of the function. For short, simple functions like the one you show, simply using the parameter itself may be easier to understand than introducing a new local variable (very subjective).
Nevertheless, if I were to write something as simple as countDigits(), I'd tend to use a remainingBalance local processing variable in lieu of modifying the num parameter as part of the local processing; that just seems clearer to me (see the sketch below).
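For comparison, a minimal sketch of that style, keeping the parameter read-only by working on a copy (the variable name is just illustrative):
int CountDigits(int num) {
    int remaining = num;   /* local working copy; num keeps its original meaning */
    int count = 1;
    while (remaining >= 10) {
        count++;
        remaining /= 10;
    }
    return count;
}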
Sometimes, I will modify a local parameter at the beginning of a method to normalize the parameter:
void saveName(String name) {
    name = (name != null ? name.trim() : "");
    ...
}
I rationalize that this is okay because:
a. it is easy to see at the top of the method,
b. the parameter maintains its original conceptual intent, and
c. the parameter is stable for the rest of the method
Then again, half the time, I'm just as apt to use a local variable anyway, just to get a couple of extra finals in there (okay, that's a bad reason, but I like final):
void saveName(final String name) {
    final String normalizedName = (name != null ? name.trim() : "");
    ...
}
If, 99% of the time, the code leaves function parameters unmodified (i.e. mutating parameters is unintuitive or unexpected for this code base), then, during the other 1% of the time, dropping a quick comment about a mutating parameter at the top of a long or complex function can be a big boon to understandability:
int CountDigits(int num) {
    // num is consumed
    int count = 1;
    while (num >= 10) {
        count++;
        num /= 10;
    }
    return count;
}
P.S. :-)
parameters vs arguments
http://en.wikipedia.org/wiki/Parameter_(computer_science)#Parameters_and_arguments
These two terms are sometimes loosely used interchangeably; in particular, "argument" is sometimes used in place of "parameter". Nevertheless, there is a difference. Properly, parameters appear in procedure definitions; arguments appear in procedure calls.
So,
int foo(int bar)
bar is a parameter.
int x = 5
int y = foo(x)
The value of x is the argument for the bar parameter.
It always feels a little funny to me when I do this, but that's not really a good reason to avoid it.
One reason you might potentially want to avoid it is for debugging purposes. Being able to tell the difference between "scratchpad" variables and the input to the function can be very useful when you're halfway through debugging.
I can't say it's something that comes up very often in my experience - and often you can find that it's worth introducing another variable just for the sake of having a different name, but if the code which is otherwise cleanest ends up changing the value of the variable, then so be it.
One situation where this can come up and be entirely reasonable is where you've got some value meaning "use the default" (typically a null reference in a language like Java or C#). In that case I think it's entirely reasonable to modify the value of the parameter to the "real" default value. This is particularly useful in C# 4 where you can have optional parameters, but the default value has to be a constant:
For example:
public static void WriteText(string file, string text, Encoding encoding = null)
{
    // Null means "use the default", which we would document to be UTF-8
    encoding = encoding ?? Encoding.UTF8;
    // Rest of code here
}
About C and C++:
My opinion is that using the parameter as a local variable of the function is fine because it is a local variable already. Why then not use it as such?
I feel silly too when copying the parameter into a new local variable just to have a modifiable variable to work with.
But I think this is pretty much a personal opinion. Do it as you like. If you feel silly copying the parameter just because of this, it indicates you don't like doing it, and then you shouldn't do it.
If I don't need a copy of the original value, I don't declare a new variable.
IMO, I don't think mutating parameter values is a bad practice in general; it depends on how you're going to use them in your code.
My team coding standard recommends against this because it can get out of hand. To my mind for a function like the one you show, it doesn't hurt because everyone can see what is going on. The problem is that with time functions get longer, and they get bug fixes in them. As soon as a function is more than one screen full of code, this starts to get confusing which is why our coding standard bans it.
The compiler ought to be able to get rid of the redundant variable quite easily, so it has no efficiency impact. It is probably just between you and your code reviewer whether this is OK or not.
I would generally not change the parameter value within the function. If at some point later in the function you need to refer to the original value, you still have it. In your simple case there is no problem, but if you add more code later, you may refer to num without realizing it has been changed.
The code needs to be as self-sufficient as possible. What I mean by that is that you now have a dependency on what is being passed in as part of your algorithm. If another member of your team decides to change this to pass-by-reference, you might have big problems.
The best practice is definitely to copy the inbound parameters if you expect them to be immutable.
I typically don't modify function parameters, unless they're pointers, in which case I might alter the value that's pointed to.
I think the best practice here varies by language. For example, in Perl you can localize a global variable, or even part of a data structure (such as a single hash element), so that changing it in that scope will not have any effect outside of it:
sub my_function
{
    my ($arg1, $arg2) = @_;        # copy the arguments off the stack into lexicals
    $arg1++;                       # $arg1 is a lexical copy, so this is not visible to the caller
    local $arg2->{key1};           # only the key1 slot of the hashref in $arg2 is localized
    $arg2->{key1}->{key2} = 'foo'; # this change is undone (not visible) once the function returns
}
Occasionally I have been bitten by forgetting to localize a data structure that was passed by reference into a function and that I then changed inside the function. Conversely, I have also returned a data structure as a function result that was shared among multiple systems, and the caller then proceeded to change the data by mistake, affecting those other systems in a difficult-to-trace problem usually called action at a distance. The best thing to do here is to make a clone of the data before returning it*, or to make it read-only** (a sketch follows after the footnotes).
* In Perl, see the function dclone() in the built-in Storable module.
** In Perl, see lock_hash() or lock_hash_ref() in the built-in Hash::Util module.
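A small sketch of both defensive techniques (the data structure and sub names are made up):
use strict;
use warnings;
use Storable   qw(dclone);
use Hash::Util qw(lock_hash_ref);

my %config = ( retries => 3, hosts => [ 'a.example', 'b.example' ] );

# Hand out a deep copy, so callers cannot change the shared structure by mistake.
sub get_config_copy {
    return dclone(\%config);
}

# Or hand out the original, but locked, so accidental writes die immediately.
sub get_config_readonly {
    return lock_hash_ref(\%config);
}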
