I have a function that returns a vector. Since I call this function many times, I want it to update a vector I provide to it rather than create a new vector, to avoid allocating memory and so increase speed.
The original code essentially looks like:
function update!(prob1, pi, prob0)
    prob1 = pi' * prob0   # allocates a new vector on every call
    return prob1
end
Of course this creates a new prob1 vector each time. I've attempted to amend this in two different ways:
function update!(prob1, pi, prob0)
    for i in 1:length(prob1)
        prob1[i] = pi[:, i]' * prob0
    end
    return prob1
end

# OR

function update!(prob1, pi, prob0)
    for i in 1:length(prob1)
        prob1[i] = dot(pi[:, i], prob0)
    end
    return prob1
end
However, both run slower than the original code, although they do use less memory. Any suggestions for improving performance would be great.
You actually don't need to define a function; there already is one (albeit undocumented): At_mul_B!(prob1,pi,prob0) should give you what you want.
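For reference, a minimal sketch of the in-place call; At_mul_B! belongs to pre-1.0 Julia, and on Julia 1.0+ the documented in-place equivalent is mul! from the LinearAlgebra standard library (the variable names are just the ones from the question):

using LinearAlgebra

# Pre-1.0 Julia, as in the answer above:
# At_mul_B!(prob1, pi, prob0)          # prob1 = pi' * prob0, computed in place

# Julia 1.0 and later:
mul!(prob1, transpose(pi), prob0)      # overwrites prob1; no new vector is allocated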
I'm trying to optimize a given piece of object-oriented code in MATLAB. It is an economic model and consists of a Market and Agents. The time-consuming part is updating certain attributes of all Agents during each timestep, which is implemented in a for loop.
However, I have failed to vectorize the object-oriented code.
Here is an example (note: the second thing that slows down the code is that new entries are appended to the end of the vector; I'm aware of that and will fix it as well):
for i=1:length(obj.traders)
    obj.traders(i).update(obj.Price,obj.Sentiment(end),obj.h);
end
Where update looks like
function obj=update(obj,price,s,h)
    obj.pos(end+1)=obj.p;
    obj.wealth(end+1)=obj.w(1,1,1);
    obj.g(end+1)=s;
    obj.price=price;
    obj.Update_pos(s,h);
    if (obj.c)
        obj.Switch_Pos;
    end
    ...
My first idea was to try something like
obj.traders(:).update(obj.Price,obj.Sentiment(end),obj.h);
Which didn't work. If anyone has suggestions on how to vectorize this code while keeping the object-oriented implementation, I would be very happy.
I cannot provide a complete solution as this depends on the details of your implementation, but here are some tips which you could use to improve your code:
Remembering that a MATLAB object generally behaves like a struct, assignment of a constant value to a field can be done using [obj.field] = deal(val); e.g.:
[obj.traders.price] = deal(obj.Price);
This can also be extended to non-constant RHS, using cell, like so:
[aStruct.(fieldNamesCell{idx})] = deal(valueCell{:}); % or deal(numericVector(:));
To improve the update function, I would suggest first building the RHS vectors/cells over a few lines, followed by a "simultaneous" assignment to all relevant fields of the objects in the array (see the sketch below).
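A minimal sketch of what that could look like for the trader array, assuming obj.traders is an object array whose fields can be assigned this way; latestPos is a hypothetical field used only for illustration:

nT = numel(obj.traders);

% Constant RHS: deal() assigns the same value to every trader at once.
[obj.traders.price] = deal(obj.Price);

% Non-constant RHS: build the per-trader values first, convert them to a cell,
% then expand the cell into a comma-separated list for a single assignment.
newPos = num2cell(zeros(1, nT));         % placeholder for real per-trader values
[obj.traders.latestPos] = newPos{:};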
Other than that consider:
setfield: s = setfield(s,{sIndx1,...,sIndxM},'field',{fIndx1,...,fIndxN},value);
structfun:
s = structfun(@(x)x(1:3), s, 'UniformOutput', false, 'ErrorHandler', @errfn);
"A loop-based solution can be flexible and easily readable".
P.S.
On a side note, I'd suggest you name the obj in your functions according to the class name, which would make it more readable to others, i.e.:
function obj=update(obj,price,s,h) => function traderObj=update(traderObj,price,s,h)
I have the following recursive function written in Ruby; however, I find that the method runs too slowly. I am unsure whether this is the correct way to do it, so please suggest how to improve the performance of this code. The total file count, including subdirectories, is 4,535,347.
def start(directory)
  Dir.foreach(directory) do |file|
    next if file == '.' or file == '..'
    full_file_path = "#{directory}/#{file}"
    if File.directory?(full_file_path)
      start(full_file_path)
    elsif File.file?(full_file_path)
      extract(full_file_path)
    else
      raise "Unexpected input type neither file nor folder"
    end
  end
end
With 4.5M files and directories, you might be better off working with a specialized lazy enumerator so as to only process the entries you actually need, rather than generating every one of those 4.5M entries, returning the entire set and iterating through it in its entirety.
Here's the example from the docs:
class Enumerator::Lazy
  def filter_map
    Lazy.new(self) do |yielder, *values|
      result = yield *values
      yielder << result if result
    end
  end
end
(1..Float::INFINITY).lazy.filter_map{|i| i*i if i.even?}.first(5)
http://ruby-doc.org/core-2.1.1/Enumerator/Lazy.html
It's not a very good example, btw: the important part is Lazy.new() rather than the fact that Enumerator::Lazy gets monkey patched. Here's a much better example imho:
What's the best way to return an Enumerator::Lazy when your class doesn't define #each?
Further reading on the topic:
http://patshaughnessy.net/2013/4/3/ruby-2-0-works-hard-so-you-can-be-lazy
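To make that concrete, here is a rough sketch of a lazy, on-demand walk of the question's directory tree; each_file is a hypothetical helper name, and the chain at the end is only an example of consuming a few entries without generating the rest:

def each_file(directory)
  return enum_for(:each_file, directory) unless block_given?
  Dir.foreach(directory) do |entry|
    next if entry == '.' || entry == '..'
    path = File.join(directory, entry)
    if File.directory?(path)
      each_file(path) { |f| yield f }   # recurse, re-yielding each file as it is found
    else
      yield path
    end
  end
end

# Entries are produced one at a time, only as far as the chain needs them:
each_file('/some/root').lazy.select { |p| p.end_with?('.csv') }.first(100)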
Another option you might want to consider is computing the list across multiple threads.
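As a rough sketch of that idea (reusing the hypothetical each_file enumerator from the sketch above, and assuming extract is IO-bound, since MRI threads don't run Ruby code in parallel but can overlap IO waits):

queue = Queue.new
workers = Array.new(4) do
  Thread.new do
    while (path = queue.pop)   # a nil in the queue tells the worker to stop
      extract(path)
    end
  end
end

each_file('/some/root') { |path| queue << path }
4.times { queue << nil }
workers.each(&:join)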
I don't think there's a way to speed up your start method much; it does the correct thing of going through your files and processing them as soon as it encounters them. You can probably simplify it with a single recursive Dir.glob call, but it will still be slow. I suspect that this is not where most of the time is spent.
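For reference, the single Dir.glob pass alluded to above might look something like this (the '**' pattern recurses into subdirectories; hidden files are skipped by glob's default matching):

Dir.glob(File.join(directory, '**', '*')) do |path|
  extract(path) if File.file?(path)
end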
There very well might be a way to speed up your extract method, but that's impossible to know without seeing the code.
The other way to speed this up might be to split the processing across multiple processes. Since reading and writing is probably what is slowing you down, this would let the Ruby code execute in one process while another process is waiting on IO.
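A rough sketch of that, assuming a Unix platform (fork is unavailable on Windows) and that the top-level subdirectories split the tree reasonably evenly:

top_dirs = (Dir.entries(directory) - %w[. ..])
             .map { |d| File.join(directory, d) }
             .select { |p| File.directory?(p) }

top_dirs.each_slice((top_dirs.size / 4.0).ceil) do |slice|
  fork { slice.each { |dir| start(dir) } }   # each child process walks its own subtrees
end
Process.waitall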
I want to optimize my code. I have 3 options and don't know which is better for memory in Lua:
1)
local Test = {}
Test.var1 = function ()
-- Code
end
Test.var2 = function ()
-- Code
end
2) Or
function var1()
-- Code
end
function var2()
-- Code
end
3) Or maybe
local var1 = function ()
-- Code
end
local var2 = function ()
-- Code
end
Quoting the two maxims of program optimization from Lua Programming Gems:
Rule #1: Don’t do it.
Rule #2: Don’t do it yet. (for experts only)
Back to your examples: the second piece of code is a little bit worse, since access to globals is slower than access to locals. But the performance difference is hardly noticeable.
It depends on your needs: the first one uses an extra table compared to the third, but the namespace is cleaner.
None will really affect memory, barring the use of a table in #1 (so some 40 bytes + some per entry).
If it's performance you want, then option #3 is far better, assuming you can access said functions at local scope.
If it's about memory usage more than processing and you're using object-oriented programming where you're instantiating multiple instances of Test as you showed above, you have a fourth option with metatables.
TestMt = {}

TestMt.func1 = function(self, ...)
    -- body
end
TestMt.func2 = function(self, ...)
    -- body
end
TestMt.func3 = function(self, ...)
    -- body
end

function new_test(data)
    local t = {}
    t.data = data
    setmetatable(t, {__index = TestMt})
    return t
end

foo = new_test({})   -- pass whatever per-instance data is needed
foo:func1()
foo:func2()
foo:func3()
If you're doing this object-oriented kind of programming, metatables can lead to massive savings in memory (I once accidentally used over a gigabyte for numerous mathematical vectors, only to reduce that to 40 megabytes by using a metatable).
If it's not about objects and tables that get instantiated many times, and just about organizing your globally accessible functions, worrying about memory here is ridiculous. It's like putting the entirety of your Lua code into one file in order to reduce file-system overhead. You're talking about such negligible savings that you would need an extraordinary use case backed by meticulous measurements to even concern yourself with it.
If it's about processing, then you can get some small improvements by keeping your global functions out of nested tables, and by favoring locals when possible.
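A minimal sketch of that last point, hoisting a nested-table function into a local before a hot loop (the names are the ones from the first example above):

local var1 = Test.var1       -- one table lookup here, instead of one per iteration
for i = 1, 1000000 do
    var1()                   -- same function, reached through a faster local access
end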
I'm running into serious performance issues with anonymous functions in MATLAB 2011a, where the overhead introduced by an anonymous container function is far greater than the time taken by the enclosed function itself.
I've read a couple of related questions in which users have helpfully explained that this is a problem that others experience, showing that I could increase performance dramatically by doing away with the anonymous containers. Unfortunately, my code is structured in such a way that I'm not sure how to do that without breaking a lot of things.
So, are there workarounds to improve performance of anonymous functions without doing away with them entirely, or design patterns that would allow me to do away with them without bloating my code and spending a lot of time refactoring?
Some details that might help:
Below is the collection of anonymous functions, which is stored as a class property. An int array feeding a switch statement could replace the cell array in principle, but the content of GPs is subject to change -- there are other functions with the same argument structure as traingp that could be used there -- and GPs' contents may in some cases be determined at runtime.
m3.GPs = {@(X,ytrain,xStar,noisevar,params)traingp(X,ytrain,xStar,noisevar,1,params,[1 0]');
          @(X,ytrain,xStar,noisevar,params)traingp(X,ytrain,xStar,noisevar,1,params,[-1 1]');
          @(X,ytrain,xStar,noisevar,params)traingp(X,ytrain,xStar,noisevar,2,params,0);
          @(X,ytrain,xStar,noisevar,params)traingp(X,ytrain,xStar,noisevar,3,params,0);
          @(X,ytrain,xStar,noisevar,params)traingp(X,ytrain,xStar,noisevar,4,params,[0 0 0]')};
Later, elements of GPs are called by a member function of the class, like so:
GPt = GPs{t(j)}(xj,yj,gridX(xi),thetaT(1),thetaT(2:end));
According to the profiler, the self-time for the anonymous wrapper takes 95% of the total time (1.7 seconds for 44 calls!), versus 5% for the contained function. I'm using a similar approach elsewhere, where the anonymous wrapper's cost is even greater, proportionally speaking.
Does anyone have any thoughts on how to reduce the overhead of the anonymous calls, or, absent that, how to replace the anonymous function while retaining the flexibility they provide (and not introducing a bunch of additional bookkeeping and argument passing)?
Thanks!
It all comes down to how much pain you are willing to endure to improve performance. Here's one trick that avoids anonymous functions; I don't know how it will profile for you. I believe you can put these "tiny" functions at the end of class files (I know you can put them at the end of regular function files).
function [output] = GP1(X,ytrain,xStar,noisevar,params)
    output = traingp(X,ytrain,xStar,noisevar,1,params,[1 0]');
end
...
m3.GPs = {@GP1, @GP2, ...};
Perhaps a function "factory" would help:
>> factory = @(a,b,c) @(x,y,z) a*x+b*y+c*z;
>> f1 = factory(1,2,3);
>> f2 = factory(0,1,2);
>> f1(1,2,3)
ans =
14
>> f1(4,5,6)
ans =
32
>> f2(1,2,3)
ans =
8
>> f2(4,5,6)
ans =
17
Here, factory is a function that returns a new function with different captured arguments. Another example could be:
factory = @(a,b,c) @(x,y,z) some_function(x,y,z,a,b,c)
which returns a function of x,y,z with a,b,c specified.
In C#, you could do something like this:
public IEnumerable<int> GetItems()
{
    for (int i = 0; i < 10000000; i++) {
        yield return i;
    }
}
This returns an enumerable sequence of 10 million integers without ever allocating a collection in memory of that length.
Is there a way of doing an equivalent thing in Ruby? The specific example I am trying to deal with is the flattening of a rectangular array into a sequence of values to be enumerated. The return value does not have to be an Array or Set, but rather some kind of sequence that can only be iterated/enumerated in order, not by index. Consequently, the entire sequence need not be allocated in memory concurrently. In .NET, this is IEnumerable and IEnumerable<T>.
Any clarification on the terminology used here in the Ruby world would be helpful, as I am more familiar with .NET terminology.
EDIT
Perhaps my original question wasn't really clear enough -- I think the fact that yield has very different meanings in C# and Ruby is the cause of confusion here.
I don't want a solution that requires my method to use a block. I want a solution that has an actual return value. A return value allows convenient processing of the sequence (filtering, projection, concatenation, zipping, etc).
Here's a simple example of how I might use get_items:
things = obj.get_items.select { |i| !i.thing.nil? }.map { |i| i.thing }
In C#, any method returning IEnumerable that uses a yield return causes the compiler to generate a finite state machine behind the scenes that caters for this behaviour. I suspect something similar could be achieved using Ruby's continuations, but I haven't seen an example and am not quite clear myself on how this would be done.
It does indeed seem possible that I might use Enumerable to achieve this. A simple solution would be to use an Array (which includes the module Enumerable), but I do not want to create an intermediate collection with N items in memory when it's possible to provide them lazily and avoid any memory spike at all.
If this still doesn't make sense, then consider the above code example. get_items returns an enumeration, upon which select is called. What is passed to select is an instance that knows how to provide the next item in the sequence whenever it is needed. Importantly, the whole collection of items hasn't been calculated yet. Only when select needs an item will it ask for it, and the latent code in get_items will kick into action and provide it. This laziness carries along the chain, such that select only draws the next item from the sequence when map asks for it. As such, a long chain of operations can be performed on one data item at a time. In fact, code structured in this way can even process an infinite sequence of values without any kinds of memory errors.
So, this kind of laziness is easily coded in C#, and I don't know how to do it in Ruby.
I hope that's clearer (I'll try to avoid writing questions at 3AM in future.)
It's supported by Enumerator since Ruby 1.9 (and back-ported to 1.8.7). See Generator: Ruby.
Cliche example:
fib = Enumerator.new do |y|
  y.yield i = 0
  y.yield j = 1
  while true
    k = i + j
    y.yield k
    i = j
    j = k
  end
end
100.times { puts fib.next() }
Your specific example is equivalent to 10000000.times, but let's assume for a moment that the times method didn't exist and you wanted to implement it yourself, it'd look like this:
class Integer
  def my_times
    return enum_for(:my_times) unless block_given?
    i = 0
    while i < self
      yield i
      i += 1
    end
  end
end

10000.my_times # Returns an Enumerator which will let you iterate
               # over the numbers from 0 to 10000 (exclusive)
Edit: To clarify my answer a bit:
In the above example my_times can be (and is) used without a block and it will return an Enumerable object, which will let you iterate over the numbers from 0 to n. So it is exactly equivalent to your example in C#.
This works using the enum_for method. The enum_for method takes as its argument the name of a method, which will yield some items. It then returns an instance of class Enumerator (which includes the module Enumerable), which when iterated over will execute the given method and give you the items which were yielded by the method. Note that if you only iterate over the first x items of the enumerable, the method will only execute until x items have been yielded (i.e. only as much as necessary of the method will be executed) and if you iterate over the enumerable twice, the method will be executed twice.
In 1.8.7+ it has become idiomatic to define methods that yield items so that, when called without a block, they return an Enumerator which lets the user iterate over those items lazily. This is done by adding the line return enum_for(:name_of_this_method) unless block_given? to the beginning of the method, as I did in my example.
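Applied to the get_items method from the question, the same pattern might look roughly like this (the body is only illustrative and assumes the items come from some rectangular @rows array):

def get_items
  return enum_for(:get_items) unless block_given?
  @rows.each do |row|               # hypothetical 2-D source of items
    row.each { |item| yield item }
  end
end

# The chain from the question then works unchanged:
things = obj.get_items.select { |i| !i.thing.nil? }.map { |i| i.thing }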
I don't have much Ruby experience, but what C# does with yield return is usually known as lazy evaluation or lazy execution: providing answers only as they are needed. It's not about allocating memory; it's about deferring computation until it is actually needed, expressed in a way that reads like simple linear execution (rather than the underlying iterator with saved state).
A quick Google search turned up a Ruby library in beta. See if it's what you want.
C# ripped the 'yield' keyword right out of Ruby; see Implementing Iterators here for more.
As for your actual problem, you presumably have an array of arrays and want a one-way iteration over the complete length of the list? Perhaps worth looking at array.flatten as a starting point; if the performance is alright, you probably don't need to go much further.
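For what it's worth, a quick sketch of the two obvious options (grid and process are hypothetical names): flatten builds one intermediate array of all the values, while a nested each avoids the copy entirely:

grid.flatten.each { |value| process(value) }               # simple; allocates one flat array
grid.each { |row| row.each { |value| process(value) } }    # no intermediate array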