I'm running into serious performance issues with anonymous functions in MATLAB 2011a, where the overhead introduced by an anonymous container function is far greater than the time taken by the enclosed function itself.
I've read a couple of related questions in which users have helpfully explained that this is a problem that others experience, showing that I could increase performance dramatically by doing away with the anonymous containers. Unfortunately, my code is structured in such a way that I'm not sure how to do that without breaking a lot of things.
So, are there workarounds to improve performance of anonymous functions without doing away with them entirely, or design patterns that would allow me to do away with them without bloating my code and spending a lot of time refactoring?
Some details that might help:
Below is the collection of anonymous functions, which is stored as a class property. In principle, the cell array could be replaced by an int array driving a switch statement, but the contents of GPs are subject to change -- there are other functions with the same argument structure as traingp that could be used there -- and GPs' contents may in some cases be determined at runtime.
m3.GPs = {@(X,ytrain,xStar,noisevar,params)traingp(X,ytrain,xStar,noisevar,1,params,[1 0]');
@(X,ytrain,xStar,noisevar,params)traingp(X,ytrain,xStar,noisevar,1,params,[-1 1]');
@(X,ytrain,xStar,noisevar,params)traingp(X,ytrain,xStar,noisevar,2,params,0);
@(X,ytrain,xStar,noisevar,params)traingp(X,ytrain,xStar,noisevar,3,params,0);
@(X,ytrain,xStar,noisevar,params)traingp(X,ytrain,xStar,noisevar,4,params,[0 0 0]')};
Later, elements of GPs are called by a member function of the class, like so:
GPt = GPs{t(j)}(xj,yj,gridX(xi),thetaT(1),thetaT(2:end));
According to the profiler, the self-time of the anonymous wrapper accounts for 95% of the total time (1.7 seconds for 44 calls!), versus 5% for the contained function. I'm using a similar approach elsewhere, where the anonymous wrapper's cost is even greater, proportionally speaking.
Does anyone have any thoughts on how to reduce the overhead of the anonymous calls or, failing that, how to replace the anonymous functions while retaining the flexibility they provide (and without introducing a bunch of additional bookkeeping and argument passing)?
Thanks!
It all comes down to how much pain you are willing to endure to improve performance. Here's one trick that avoids anonymous functions; I don't know how it will profile for you. I believe you can put these "tiny" functions at the end of class files (I know you can put them at the end of regular function files).
function [output] = GP1(X,ytrain,xStar,noisevar,params)
output = traingp(X,ytrain,xStar,noisevar,1,params,[1 0]');
end
...
m3.GPs = {@GP1, @GP2, ...};
Perhaps a function "factory" would help:
>> factory = @(a,b,c) @(x,y,z) a*x+b*y+c*z;
>> f1 = factory(1,2,3);
>> f2 = factory(0,1,2);
>> f1(1,2,3)
ans =
14
>> f1(4,5,6)
ans =
32
>> f2(1,2,3)
ans =
8
>> f2(4,5,6)
ans =
17
Here, factory is a function that returns a new function with the given arguments baked in. Another example could be:
factory = @(a,b,c) @(x,y,z) some_function(x,y,z,a,b,c)
which returns a function of x,y,z with a,b,c specified.
I'm trying to optimize some object-oriented code in MATLAB. It is an economic model and consists of a Market and Agents. The time-consuming part is updating certain attributes of all Agents during each timestep, which is implemented in a for loop.
However, I have failed to vectorize the object-oriented code.
Here is an example (note: the second thing that slows down the code is that new entries are appended to the end of growing vectors; I'm aware of that and will fix it separately):
for i=1:length(obj.traders)
obj.traders(i).update(obj.Price,obj.Sentiment(end),obj.h);
end
Where update looks like
function obj=update(obj,price,s,h)
obj.pos(end+1)=obj.p;
obj.wealth(end+1)=obj.w(1,1,1);
obj.g(end+1)=s;
obj.price=price;
obj.Update_pos(s,h);
if (obj.c)
obj.Switch_Pos;
end
...
My first idea was to try something like
obj.traders(:).update(obj.Price,obj.Sentiment(end),obj.h);
Which didn't work. If someone has any suggestions on how to vectorize this code while keeping the object-oriented implementation, I would be very happy.
I cannot provide a complete solution as this depends on the details of your implementation, but here are some tips which you could use to improve your code:
Remembering that a MATLAB object generally behaves like a struct, assignment of a constant value to a field can be done using [obj.field] = deal(val); e.g.:
[obj.trader.price] = deal(obj.Price);
This can also be extended to a non-constant RHS, using a cell array, like so:
[aStruct.(fieldNamesCell{idx})] = deal(valueCell{:}); %// or deal(numericVector(:));
To improve the update function, I would suggest first building the RHS vectors/cells on separate lines, then performing a "simultaneous" assignment to all relevant fields of the objects in the array, as in the sketch below.
Other than that consider:
setfield: s = setfield(s,{sIndx1,...,sIndxM},'field',{fIndx1,...,fIndxN},value);
structfun:
s = structfun(@(x)x(1:3), s, 'UniformOutput', false, 'ErrorHandler', @errfn);
"A loop-based solution can be flexible and easily readable".
P.S.
On a side note, I'd suggest you name the obj in your functions according to the class name, which would make it more readable to others, i.e.:
function obj=update(obj,price,s,h) => function traderObj=update(traderObj,price,s,h)
I have been playing with Julia because it seems syntactically similar to Python (which I like) but claims to be faster. I tried porting a Python script of mine that tests whether the values in a text file are valid numbers, using this function:
function isFloat(s)
try:
float64(s)
return true
catch:
return false
end
end
For some reason, this takes a great deal of time for a text file with a reasonable number of rows of text (~500,000).
Why would this be? Is there a better way to do this? What general feature of the language can I understand from this to apply to other languages?
Here are the two exact scripts I ran, with times, for reference:
python: ~0.5 seconds
def is_number(s):
try:
np.float64(s)
return True
except ValueError:
return False
start = time.time()
file_data = open('SMW100.asc').readlines()
file_data = map(lambda line: line.rstrip('\n').replace(',',' ').split(), file_data)
bools = [(all(map(is_number, x)), x) for x in file_data]
print time.time() - start
julia: ~73.5 seconds
start = time()
function isFloat(s)
try:
float64(s)
return true
catch:
return false
end
end
x = map(x-> split(replace(x, ",", " ")), open(readlines, "SMW100.asc"))
u = [(all(map(isFloat, i)), i) for i in x]
print(time() - start)
Note that you can use the float64_isvalid function in the standard library to (a) check whether a string is a valid floating-point value and (b) return the value.
Note also that the colons (:) after try and catch in your isFloat code are wrong in Julia (this is a Pythonism).
A much faster version of your code should be:
const isFloat2_out = [1.0]
isFloat2(s::String) = float64_isvalid(s, isFloat2_out)
function foo(L)
x = split(L, ",")
(all(isFloat2, x), x)
end
u = map(foo, open(readlines, "SMW100.asc"))
On my machine, for a sample file with 100,000 rows and 10 columns of data, 50% of which are valid numbers, your Python code takes 4.21 seconds and my Julia code takes 2.45 seconds.
This is an interesting performance problem that might be worth submitting to julia-users to get more focused feedback than SO will probably provide. At a first glance, I think you're hitting problems because (1) try/catch is just slightly slow to begin with and (2) you're using try/catch in a context where there's a very considerable amount of type uncertainty, because of lots of function calls that don't return stable types. As a result, the Julia interpreter spends its time trying to figure out the types of objects rather than doing your computation. It's a bit hard to tell exactly where the big bottlenecks are because you're doing a lot of things that are not very idiomatic in Julia. Also, you seem to be doing your computations in the global scope, where Julia's compiler can't perform many meaningful optimizations due to additional type uncertainty.
Python is oddly ambiguous on the subject of whether using exceptions for control flow is good or bad. See Python using exceptions for control flow considered bad?. But even in Python, the consensus is that user code shouldn't use exceptions for control flow (although for some reason generators are allowed to do this). So basically, the simple answer is that you should not be doing that – exceptions are for exceptional situations, not for control flow. That is why almost zero effort has been put into making Julia's try/catch construct faster – you shouldn't be using it like that in the first place. Of course, we will probably get around to making it faster at some point.
That said, the onus is on us as the designers of Julia's standard library to make sure that we provide APIs that never force you to use exceptions for control flow. In this case, you need a function that allows you to try to parse something as a floating-point value and indicate whether that was possible or not – not by throwing an exception, but rather by returning normal values. We don't provide such an API, so this is ultimately a shortcoming of Julia's standard library – as it exists right now. I've opened an issue to discuss this API design question: https://github.com/JuliaLang/julia/issues/5704. We'll see how it pans out.
I want to optimize my code. I have 3 options and don't know which is better for memory in Lua:
1)
local Test = {}
Test.var1 = function ()
-- Code
end
Test.var2 = function ()
-- Code
end
2) Or
function var1()
-- Code
end
function var2()
-- Code
end
3) Or maybe
local var1 = function ()
-- Code
end
local var2 = function ()
-- Code
end
Quoting from Lua Programming Gems, the two maxims of program optimization:
Rule #1: Don’t do it.
Rule #2: Don’t do it yet. (for experts only)
Back to your examples: the second piece of code is a little bit worse, as access to globals is slower. But the performance difference is hardly noticeable.
It depends on your needs; the first one uses an extra table compared to the third, but the namespace is cleaner.
None will really affect memory, barring the use of a table in #1 (so some 40 bytes + some per entry).
If it's performance you want, then option #3 is far better, assuming you can access said functions at the local scope.
If it's about memory usage more than processing and you're using object-oriented programming where you're instantiating multiple instances of Test as you showed above, you have a fourth option with metatables.
TestMt = {}
TestMt.func1 = function(self, ...)
...
end
TestMt.func2 = function(self, ...)
...
end
TestMt.func3 = function(self, ...)
...
end
local TestMeta = {__index = TestMt} -- create this table once and share it across instances

function new_test()
  local t = {}
  t.data = ...
  setmetatable(t, TestMeta) -- avoids allocating a fresh metatable per object
  return t
end
foo = new_test()
foo:func1()
foo:func2()
foo:func3()
If you're doing object-oriented kinds of programming, metatables can lead to massive memory savings (I once accidentally used over 1 gigabyte for numerous mathematical vectors, only to reduce that to 40 megabytes by using a shared metatable).
If it's not about objects and tables that get instantiated many times, and just about organizing your globally-accessible functions, worrying about memory here is ridiculous. It's like putting the entirety of your Lua code into one file in order to reduce file system overhead. You're talking about such negligible savings that you would need an extraordinary use case, backed by meticulous measurements, to even concern yourself with it.
If it's about processing, then you can get some small improvements by keeping your global functions out of nested tables, and by favoring locals when possible.
In Jeffrey Richter's "CLR via C#" (the .NET 2.0 edition, page 353) he says that, as a self-discipline, he never makes anonymous functions longer than 3 lines of code. He cites mostly readability / understandability as his reasons. This suits me fine, because I already had a self-discipline of using no more than 5 lines for an anonymous method.
But how does that "coding standard" advice stack up against lambdas? At face value, I'd treat them the same - keeping a lambda equally short. But how do others feel about this? In particular, when lambdas are being used where (arguably) they shine brightest - in LINQ statements - is there genuine cause to abandon that self-discipline / coding standard?
Bear in mind that things have changed a lot since 2.0. For example, consider .NET 4's Parallel Extensions, which use delegates heavily. You might have:
Parallel.For(0, 100, i =>
{
// Potentially significant amounts of code
});
To me it doesn't matter whether this is a lambda expression or an anonymous method - it's not really being used in the same way that delegates typically were in .NET 2.0.
Within normal LINQ, I don't typically find myself using large lambda expressions - certainly not in terms of the number of statements. Sometimes a particular single expression will be quite long in terms of lines because it's projecting a number of properties; the alternative is having huge lines!
In fact, LINQ tends to favour single-expression lambda expressions (which don't even have braces). I'd be fairly surprised to see a good use of LINQ which had a lambda expression with 5 statements in it.
I don't know if having a guideline for short lambdas and delegates is really useful; however, do have a guideline for short functions. The methods I write are on average 6 or 7 lines long. Functions should hardly ever be 20 lines long. You should create the most readable code, and if you follow Robert Martin's or Steve McConnell's advice, they tell you to keep functions short and also to keep the inner part of loops as short as possible, preferably just a single method call.
So you shouldn't write a for loop as follows:
for (int i = 0; i < 100; i++)
{
// Potentially significant amounts of code
}
but simply with a single method call inside the loop:
for (int i = 0; i < 100; i++)
{
WellDescribedOperationOnElementI(i);
}
With this in mind, while I in general agree with Jon Skeet’s answer, I don't see any reason why you shouldn't want his example to be written as:
Parallel.For(0, 100, i =>
{
WellDescribedPartOfHeavyCalculation(i);
});
or
Parallel.For(0, 100, i => WellDescribedPartOfHeavyCalculation(i));
or even:
Parallel.For(0, 100, WellDescribedPartOfHeavyCalculation);
Always go for the most readable code, and many times this means short anonymous methods and short lambdas, but most of all short - but well described - methods.
Something like this (yes, this doesn't deal with some edge cases - that's not the point):
int CountDigits(int num) {
int count = 1;
while (num >= 10) {
count++;
num /= 10;
}
return count;
}
What's your opinion about this? That is, using function arguments as local variables.
Both are placed on the stack and are pretty much identical performance-wise; I'm wondering about the best-practices aspect of this.
I feel like an idiot when I add an additional, quite redundant line to that function consisting of int numCopy = num; however, it does bug me.
What do you think? Should this be avoided?
As a general rule, I wouldn't use a function parameter as a local processing variable, i.e. I treat function parameters as read-only.
In my mind, intuitively understandable code is paramount for maintainability, and modifying a function parameter to use as a local processing variable tends to run counter to that goal. I have come to expect that a parameter will have the same value in the middle and at the bottom of a method as it does at the top. Plus, an aptly-named local processing variable may improve understandability.
Still, as @Stewart says, this rule is more or less important depending on the length and complexity of the function. For short simple functions like the one you show, simply using the parameter itself may be easier to understand than introducing a new local variable (very subjective).
Nevertheless, if I were to write something as simple as countDigits(), I'd tend to use a local processing variable (say, remaining) in lieu of modifying the num parameter as part of local processing - it just seems clearer to me.
Sometimes, I will modify a local parameter at the beginning of a method to normalize the parameter:
void saveName(String name) {
name = (name != null ? name.trim() : "");
...
}
I rationalize that this is okay because:
a. it is easy to see at the top of the method,
b. the parameter maintains its original conceptual intent, and
c. the parameter is stable for the rest of the method
Then again, half the time, I'm just as apt to use a local variable anyway, just to get a couple of extra finals in there (okay, that's a bad reason, but I like final):
void saveName(final String name) {
final String normalizedName = (name != null ? name.trim() : "");
...
}
If, 99% of the time, the code leaves function parameters unmodified (i.e. mutating parameters are unintuitive or unexpected in this code base), then, during that other 1% of the time, dropping a quick comment about a mutating parameter at the top of a long/complex function can be a big boon to understandability:
int CountDigits(int num) {
// num is consumed
int count = 1;
while (num >= 10) {
count++;
num /= 10;
}
return count;
}
P.S. :-)
parameters vs arguments
http://en.wikipedia.org/wiki/Parameter_(computer_science)#Parameters_and_arguments
These two terms are sometimes loosely used interchangeably; in particular, "argument" is sometimes used in place of "parameter". Nevertheless, there is a difference. Properly, parameters appear in procedure definitions; arguments appear in procedure calls.
So,
int foo(int bar)
bar is a parameter.
int x = 5;
int y = foo(x);
The value of x is the argument for the bar parameter.
It always feels a little funny to me when I do this, but that's not really a good reason to avoid it.
One reason you might potentially want to avoid it is for debugging purposes. Being able to tell the difference between "scratchpad" variables and the input to the function can be very useful when you're halfway through debugging.
I can't say it's something that comes up very often in my experience - and often you can find that it's worth introducing another variable just for the sake of having a different name, but if the code which is otherwise cleanest ends up changing the value of the variable, then so be it.
One situation where this can come up and be entirely reasonable is where you've got some value meaning "use the default" (typically a null reference in a language like Java or C#). In that case I think it's entirely reasonable to modify the value of the parameter to the "real" default value. This is particularly useful in C# 4 where you can have optional parameters, but the default value has to be a constant:
For example:
public static void WriteText(string file, string text, Encoding encoding = null)
{
// Null means "use the default" which we would document to be UTF-8
encoding = encoding ?? Encoding.UTF8;
// Rest of code here
}
About C and C++:
My opinion is that using the parameter as a local variable of the function is fine because it is a local variable already. Why then not use it as such?
I feel silly too when copying the parameter into a new local variable just to have a modifiable variable to work with.
But I think this is pretty much a personal opinion. Do it as you like. If you feel silly copying the parameter just because of this, it indicates your personality doesn't like it, and then you shouldn't do it.
If I don't need a copy of the original value, I don't declare a new variable.
IMO, I don't think mutating parameter values is a bad practice in general; it depends on how you're going to use it in your code.
My team coding standard recommends against this because it can get out of hand. To my mind for a function like the one you show, it doesn't hurt because everyone can see what is going on. The problem is that with time functions get longer, and they get bug fixes in them. As soon as a function is more than one screen full of code, this starts to get confusing which is why our coding standard bans it.
The compiler ought to be able to get rid of the redundant variable quite easily, so it has no efficiency impact. It is probably just between you and your code reviewer whether this is OK or not.
I would generally not change the parameter value within the function. If at some point later in the function you need to refer to the original value, you still have it. In your simple case there is no problem, but if you add more code later, you may refer to 'num' without realizing it has been changed.
The code needs to be as self-sufficient as possible. What I mean by that is that you now have a dependency on what is being passed in as part of your algorithm. If another member of your team decides to change this to pass by reference, then you might have big problems.
The best practice is definitely to copy the inbound parameters if you expect them to be immutable.
I typically don't modify function parameters, unless they're pointers, in which case I might alter the value that's pointed to.
I think the best practice here varies by language. For example, in Perl you can localize any variable, or even part of a variable, to a local scope, so that changing it in that scope will not have any effect outside of it:
sub my_function
{
    my ($arg1, $arg2) = @_;         # lexical copies of the arguments off the stack
    $arg1++;                        # $arg1 is already a copy, so this is invisible to the caller

    local $arg2->{key1};            # only the key1 portion of the hashref referenced by $arg2 is localized
    $arg2->{key1}->{key2} = 'foo';  # this change is undone when the function exits
}
Occasionally I have been bitten by forgetting to localize a data structure that was passed by reference to a function, that I changed inside the function. Conversely, I have also returned a data structure as a function result that was shared among multiple systems and the caller then proceeded to change the data by mistake, affecting these other systems in a difficult-to-trace problem usually called action at a distance. The best thing to do here would be to make a clone of the data before returning it*, or make it read-only**.
* In Perl, see the function dclone() in the built-in Storable module.
** In Perl, see lock_hash() or lock_hash_ref() in the built-in Hash::Util module.