What would be the best way to manage a large number of instances of the same class in MATLAB?
The naive approach produces abysmal results:
classdef Request
    properties
        num = 7;
    end
    methods
        function f = foo(this)
            f = this.num + 4;
        end
    end
end
>> a=[];
>> tic,for i=1:1000 a=[a Request];end;toc
Elapsed time is 5.426852 seconds.
>> tic,for i=1:1000 a=[a Request];end;toc
Elapsed time is 31.261500 seconds.
Inheriting from handle drastically improves the results:
classdef RequestH < handle
    properties
        num = 7;
    end
    methods
        function f = foo(this)
            f = this.num + 4;
        end
    end
end
>> tic,for i=1:1000 a=[a RequestH];end;toc
Elapsed time is 0.097472 seconds.
>> tic,for i=1:1000 a=[a RequestH];end;toc
Elapsed time is 0.134007 seconds.
>> tic,for i=1:1000 a=[a RequestH];end;toc
Elapsed time is 0.174573 seconds.
but the performance is still not acceptable, especially considering the growing reallocation overhead.
Is there a way to preallocate a class array? Any ideas on how to manage large quantities of objects effectively?
Thanks,
Dani
Coming to this late, but would this not be another solution?
a = Request.empty(1000,0); tic; for i=1:1000, a(i)=Request; end; toc;
Elapsed time is 0.087539 seconds.
Or even better:
a(1000, 1) = Request;
Elapsed time is 0.019755 seconds.
This solution expands on Marc's answer. Use repmat to initialize an array of RequestH objects and then use a loop to create the desired objects:
>> a = repmat(RequestH,10000,1);tic,for i=1:10000 a(i)=RequestH;end;toc
Elapsed time is 0.396645 seconds.
This is an improvement over:
>> a=[];tic,for i=1:10000 a=[a RequestH];end;toc
Elapsed time is 2.313368 seconds.
repmat is your friend:
b = repmat(Request, 1000, 1);
Elapsed time is 0.056720 seconds
b = repmat(RequestH, 1000, 1);
Elapsed time is 0.021749 seconds.
Growing by appending is abysmally slow, which is why mlint calls it out.
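The quadratic cost of growing by appending is not MATLAB specific; here is a hedged NumPy sketch (function names are my own) contrasting repeated concatenation with fill-in-place preallocation:

```python
import numpy as np
import time

def grow_by_append(n):
    # Repeated concatenation copies the whole array every step: O(n^2) total work.
    a = np.empty(0)
    for i in range(n):
        a = np.concatenate([a, [float(i)]])
    return a

def preallocate(n):
    # Allocate once, then fill in place: O(n) total work.
    a = np.empty(n)
    for i in range(n):
        a[i] = float(i)
    return a

t0 = time.perf_counter(); grown = grow_by_append(5000); t1 = time.perf_counter()
filled = preallocate(5000); t2 = time.perf_counter()
assert np.array_equal(grown, filled)
print(f"append: {t1 - t0:.4f}s, preallocated: {t2 - t1:.4f}s")
```

The results match; only the allocation strategy differs, which is exactly the repmat-versus-append trade-off above.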
Is there a way to vectorize this for loop to speed it up?
Thank you.
for j = 1:size(Rond_Input2Cell,1)
    for k = 1:size(Rond_Input2Cell,2)
        Rond_Input2Cell(j,k) = (Pre_Rond_Input2Cell(j,k)*Y_FGate(k)) + (net_Cell(k)*Y_InGate(k)*tmp_input(j));
    end
end
P.S.
Matrix sizes:
Rond_Input2Cell = 39x120
Pre_Rond_Input2Cell = 39x120
Y_FGate = 1x120 (row vector)
net_Cell = 1x120 (row vector)
Y_InGate = 1x120 (row vector)
tmp_input = 1x39 (row vector)
You can speed up this calculation without a for loop by using bsxfun, which trades memory for processing speed.
The code below performs the same computation row by row and adds the results:
Rond_Input2Cell = bsxfun(@times, tmp_input.', net_Cell.*Y_InGate) + bsxfun(@times, Pre_Rond_Input2Cell, Y_FGate);
Explanation:
Pre_Rond_Input2Cell(j,k)*Y_FGate(k)
is performed by bsxfun(@times, Pre_Rond_Input2Cell, Y_FGate), which multiplies each of the 39 rows of Pre_Rond_Input2Cell elementwise by the 120 columns of Y_FGate.
net_Cell(k)*Y_InGate(k)*tmp_input(j) is replaced by bsxfun(@times, tmp_input.', net_Cell.*Y_InGate), which multiplies each element of tmp_input by the elementwise product of net_Cell and Y_InGate. The sum of the two terms is stored in Rond_Input2Cell.
Here is a performance check
>> perform_check
Elapsed time is 0.000475 seconds.
Elapsed time is 0.000156 seconds.
>> perform_check
Elapsed time is 0.001089 seconds.
Elapsed time is 0.000288 seconds.
Another method is to use repmat:
tic;
Rond_Input2Cell =(Pre_Rond_Input2Cell.*repmat(Y_FGate,size(Pre_Rond_Input2Cell,1),1)) + (repmat(tmp_input.',1,size(Pre_Rond_Input2Cell,2)).*repmat(net_Cell.*Y_InGate,size(Pre_Rond_Input2Cell,1),1));
toc;
Here is a performance test with a for loop
>> perf_test
Elapsed time is 0.003268 seconds.
Elapsed time is 0.001719 seconds.
>> perf_test
Elapsed time is 0.004211 seconds.
Elapsed time is 0.002348 seconds.
>> perf_test
Elapsed time is 0.002384 seconds.
Elapsed time is 0.000509 seconds.
Here is an article by Loren on Performance of repmat vs bsxfun
Your vectorized code should be something like this (the elementwise product with a row vector relies on implicit expansion, available in R2016b and later; on older versions use bsxfun as above):
temp_mat = tmp_input.' * (net_Cell .* Y_InGate);               % size 39x120
Rond_Input2Cell = (Pre_Rond_Input2Cell .* Y_FGate) + temp_mat; % size 39x120
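The same computation can be expressed with NumPy broadcasting, which plays the role bsxfun/implicit expansion plays in MATLAB. A sketch with random data of the question's sizes (the values are illustrative only), checked against the original double loop:

```python
import numpy as np

rng = np.random.default_rng(0)
Pre = rng.random((39, 120))        # Pre_Rond_Input2Cell
Y_FGate = rng.random(120)
net_Cell = rng.random(120)
Y_InGate = rng.random(120)
tmp_input = rng.random(39)

# Loop version, mirroring the original nested for loops.
loop_out = np.empty((39, 120))
for j in range(39):
    for k in range(120):
        loop_out[j, k] = Pre[j, k] * Y_FGate[k] + net_Cell[k] * Y_InGate[k] * tmp_input[j]

# Vectorized: row-wise scaling by broadcasting, plus an outer product.
vec_out = Pre * Y_FGate + np.outer(tmp_input, net_Cell * Y_InGate)

assert np.allclose(loop_out, vec_out)
```

The outer product term corresponds to tmp_input.' * (net_Cell .* Y_InGate) in the MATLAB answer.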
I get a pretty consistent time difference for small matrices in favor of max(A(:)):
>> A=rand(100); tic; max(A(:)); toc; tic; max(max(A)); toc;
Elapsed time is 0.000060 seconds.
Elapsed time is 0.000083 seconds.
but for large matrices, the time difference is inconsistent:
>> A=rand(1e3); tic; max(A(:)); toc; tic; max(max(A)); toc;
Elapsed time is 0.001072 seconds.
Elapsed time is 0.001103 seconds.
>> A=rand(1e3); tic; max(A(:)); toc; tic; max(max(A)); toc;
Elapsed time is 0.000847 seconds.
Elapsed time is 0.000792 seconds.
and the same for larger matrices:
>> A = rand(1e4); tic; max(A(:)); toc; tic; max(max(A)); toc;
Elapsed time is 0.049073 seconds.
Elapsed time is 0.050206 seconds.
>> A = rand(1e4); tic; max(A(:)); toc; tic; max(max(A)); toc;
Elapsed time is 0.072577 seconds.
Elapsed time is 0.060357 seconds.
Why is there a difference and what would be the best practice?
As horchler says, this is machine dependent. However, on my machine I saw a clear performance decrease for max(max(max(... in higher dimensions. I also saw a slight (but consistent) speed advantage for max(A(:)) on a more sorted type of matrix, such as a Toeplitz matrix. Still, for the test case you tried, I saw hardly any difference.
Also, max(max(max(... is error prone because of all the parentheses, so I would prefer max(A(:)). Its execution time seems to be stable across dimensions, which makes it easy to predict how long it will take.
Thirdly, max seems to be very fast, which means performance should be a minor issue here, so max(A(:)) would be preferred for its readability.
In conclusion, I would prefer max(A(:)), but if you think max(max(A)) is clearer, you can probably use that.
On my machine there are no differences in times that are really worth worrying about.
n = 2:0.2:4;
for i = 1:numel(n)
    a = rand(floor(10^n(i)));
    t1(i) = timeit(@() max(a(:)));
    t2(i) = timeit(@() max(max(a)));
end
>> t1
t1 =
Columns 1 through 7
7.4706e-06 1.5349e-05 3.1569e-05 2.803e-05 5.6141e-05 0.00041006 0.0011328
Columns 8 through 11
0.0027755 0.006876 0.0171 0.042889
>> t2
t2 =
Columns 1 through 7
1.1959e-05 2.2539e-05 2.3641e-05 4.1313e-05 7.6301e-05 0.00040654 0.0011396
Columns 8 through 11
0.0027885 0.0068966 0.01718 0.042997
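For comparison, the same two reduction styles exist in NumPy; a small sketch (timings omitted, since they are machine dependent as discussed above) confirming both give the same answer:

```python
import numpy as np

A = np.random.default_rng(1).random((100, 100))

flat_max = A.max()                # reduce over the flattened array, like max(A(:))
nested_max = A.max(axis=0).max()  # column maxima first, like max(max(A))

assert flat_max == nested_max
```

As in MATLAB, the flattened form generalizes to any number of dimensions without nesting more calls.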
I have the following data:
a cell array of labels (e.g. a cell array of 4 types of messages, where each type is a string)
a cell array of messages (e.g. a cell array of 5000 messages, where each message is a cell array of many word strings)
a cell array of labels for each message (e.g. a cell array of 5000 strings, where the string in cell i is the type of the message in cell i of the array in part 2)
My goal is to get from this data a cell array whose size is the number of labels, where each cell holds the concatenated contents of all the messages of that label's type (e.g. a cell array of 4 cells, where cell i contains a cell array of all the words from all the messages whose type is i).
I implemented three methods to do this. Here is the code for my three implementations:
%...............................................................
% setting data for tic toc tests
messagesTypesOptions = {'type1';'type2';'type3';'type4'};
messages = cell(5000,1);
for i = 1:5000
    messages{i} = {'word1';'word2';'word3';'word4';'word5';'word6';'word7';'word8';'word9';'word10'};
end
messages_labels = cell(5000,1);
for i = 1:5000
    messages_labels{i} = messagesTypesOptions{randi([1 4])};
end
%...............................................................
% start test
% method 1
type_to_msgs1 = cell(size(messagesTypesOptions,1),1);
tic
for i = 1:size(messagesTypesOptions,1)
    type_to_msgs1{i} = messages(strcmp(messages_labels,messagesTypesOptions{i}));
end
type_to_concatenated1 = cell(4,1);
for i = 1:4
    type_to_msgs1{i} = type_to_msgs1{i}';
end
for i = 1:4
    label_msgs = type_to_msgs1{i};
    num_of_label_msgs = size(label_msgs,2);
    for j = 1:num_of_label_msgs
        label_msgs{j} = label_msgs{j}';
    end
    type_to_concatenated1{i} = [label_msgs{:}];
end
toc
% method 2
type_to_concatenated2 = cell(4,1);
tic
labelStr_to_labelIndex = containers.Map(messagesTypesOptions,1:4);
for textIndex = 1:5000
    type_to_concatenated2{labelStr_to_labelIndex(messages_labels{textIndex})} = ...
        [type_to_concatenated2{labelStr_to_labelIndex(messages_labels{textIndex})}, ...
        messages{textIndex}'];
end
toc
% method 3
type_to_concatenated3 = cell(4,1);
tic
labelStr_to_labelIndex2 = containers.Map(messagesTypesOptions,1:4);
matrix_label_to_isMsgFromLabel = zeros(4,5000);
for textIndex = 1:5000
    matrix_label_to_isMsgFromLabel(labelStr_to_labelIndex2(messages_labels{textIndex}),textIndex) = 1;
end
for i = 1:4
    label_msgs3 = messages(~~matrix_label_to_isMsgFromLabel(i,:))';
    num_of_label_msgs3 = size(label_msgs3,2);
    for j = 1:num_of_label_msgs3
        label_msgs3{j} = label_msgs3{j}';
    end
    type_to_concatenated3{i} = [label_msgs3{:}];
end
toc
Those are the results I get:
Elapsed time is 0.033120 seconds.
Elapsed time is 0.471959 seconds.
Elapsed time is 0.095011 seconds.
So, the conclusion is that method 1 is the fastest.
Now, my question is: Is there a way to solve this in a faster way?
Intuitively, my method 1 seems inefficient because it has a for loop with strcmp, and each strcmp call scans all the message labels, so the full label list is read once per label (type), i.e. the same data is read num-of-labels times.
So, is there a way to modify one of my methods to get faster solution? Is there another method which is faster?
EDIT: In the example I used constant messages, but I want a solution for the case where the messages differ from each other and can have different sizes.
EDIT2: Also, the types are strings that don't necessarily have numbers in them (e.g. instead of the type1, type2, ... used in the example code, they could be 'error', 'warning', 'valid').
Basically you have messages and need to index into them to produce each cell of the output cell array, and finally concatenate the elements. For the indexing you can use logical indexing, which in most cases is very efficient, and you can build the logical index arrays with bsxfun. Here's the code to wrap up the discussion -
%// Get the parameters
lbls_len = numel(messages_labels);
msgtypeops_len = numel(messagesTypesOptions);
%// Tag messages_labels and messagesTypesOptions with numbers
alltypes = [messages_labels ; messagesTypesOptions];
[~,~,IDs] = unique(alltypes,'stable');
lbls = IDs(1:lbls_len);
typeops = IDs(lbls_len+1:end);
%// Positions of matches for each label ID against the type IDs
pos = bsxfun(@eq,lbls,typeops');
%// Logically index into messages and select the ones based on positions
%// obtained in the previous step for the final output and finally
%// concatenate along the rows to get the final output cell array
out = arrayfun(@(n) vertcat(messages{pos(:,n)})',1:msgtypeops_len,'Uni',0)';
Benchmarking
Here are some runtimes comparing Method 1, which turned out to be the best one listed in the question, against the proposed solution.
1) With length of messages_labels as 5000:
------------------ With Method - 1
Elapsed time is 0.072821 seconds.
------------------ With Proposed solution
Elapsed time is 0.053961 seconds.
2) With length of messages_labels as 500000:
------------------ With Method - 1
Elapsed time is 6.998149 seconds.
------------------ With Proposed solution
Elapsed time is 2.765090 seconds.
An almost 1.5x-2.5x speedup might be good enough for you!
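The tag-and-index idea is language agnostic; here is a hedged Python sketch of the same grouping (the labels and words are invented for illustration), mapping each label to a bucket and concatenating the matching messages in a single pass:

```python
from collections import defaultdict

message_types = ['error', 'warning', 'valid']
messages = [['w1', 'w2'], ['w3'], ['w4', 'w5'], ['w6']]
labels = ['error', 'valid', 'error', 'warning']

# Single pass: append each message's words to its label's bucket.
grouped = defaultdict(list)
for label, words in zip(labels, messages):
    grouped[label].extend(words)

# One output cell per type, in the order of message_types.
out = [grouped[t] for t in message_types]
print(out)  # [['w1', 'w2', 'w4', 'w5'], ['w6'], ['w3']]
```

Like the ID-tagging above, this reads every message exactly once instead of once per label.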
As ever, this boils down to a simple indexing problem, and for cell arrays of strings MATLAB has a nice way to generate those indices: ismember. There might be a clever way to then use that index vector to pull all the messages out in one go, but logical indexing is easy and quick enough, and JIT magic actually makes the trivial loop faster than arrayfun (using R2013b on Linux). That gives us this:
tic
out = cell(4,1);
[~, idx] = ismember(messages_labels, messagesTypesOptions);
for ii = 1:4
    out{ii} = vertcat(messages{idx == ii})';
end
toc
With the above added to the end of the original code:
>> test
Elapsed time is 0.056497 seconds.
Elapsed time is 0.857934 seconds.
Elapsed time is 0.201966 seconds.
Elapsed time is 0.017667 seconds.
Not bad :D
Replace all the 5000's with 50000's and it still scales linearly, like #1 and #3:
>> test
Elapsed time is 0.550462 seconds.
Elapsed time is 48.685048 seconds.
Elapsed time is 1.965559 seconds.
Elapsed time is 0.162989 seconds.
Just to be sure:
>> isequal(type_to_concatenated1, type_to_concatenated2, type_to_concatenated3, out)
ans =
1
And, if you can handle the grouped messages being column vectors rather than rows, take out the transpose...
...
out{ii} = vertcat(messages{idx == ii});
...
...and it's twice as fast again:
>> test
Elapsed time is 0.552040 seconds.
Elapsed time is <skipped>
Elapsed time is 1.986059 seconds.
Elapsed time is 0.077958 seconds.
Just to check how parallel processing works in MATLAB, I tried the pieces of code below and measured the execution time. But I found that the parallel processing code takes more time than the normal code, which is unexpected. Am I doing something wrong somewhere?
Code with parallel processing
function t = parl()
    matlabpool('open',2);
    tic;
    A = 5:10000000;
    parfor i = 1:length(A)
        A(i) = 3*A(i) + (A(i)/5);
        A(i) = 0.456*A(i) + (A(i)/45);
    end
    tic;
    matlabpool('close');
    t = toc;
end
The result for parallel processing:
>> parl
Starting matlabpool using the 'local' profile ... connected to 2 workers.
Sending a stop signal to all the workers ... stopped.
ans =
3.3332
function t = parl()
    tic;
    A = 5:10000000;
    for i = 1:length(A)
        A(i) = 3*A(i) + (A(i)/5);
        A(i) = 0.456*A(i) + (A(i)/45);
    end
    tic;
    t = toc;
end
Result for the code without parallel processing:
>> parl
ans =
2.8737e-05
Look at the time taken to (apparently) execute the serial version of the code: it is effectively 0. That's suspicious, so look at the code ...
tic;
t = toc;
Hmmm, this starts a stopwatch and immediately stops it. Yep, that should take about 0 s. Now have a look at the parallel code ...
tic;
matlabpool('close');
t = toc;
Ahh, in this case the code times the closing of the pool of workers. That requires a fair bit of work, and the time it takes, the 3.33 s, is part of the overhead of using parallel computation in MATLAB.
Yes, I do believe you are doing something wrong: you are not measuring what you (probably) think you are measuring. tic starts a stopwatch and toc reads it. Your code starts the stopwatch twice and reads it once; it should start timing only once, before the work being measured.
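The corrected measurement pattern, start the clock once before the work and read it once after, looks the same in any language; a minimal Python sketch (the work function is a stand-in for the loop being timed):

```python
import time

def work(n):
    # Stand-in for the loop body being timed.
    total = 0.0
    for i in range(n):
        x = 3 * i + i / 5
        total += 0.456 * x + x / 45
    return total

start = time.perf_counter()            # start the stopwatch once, before the work
result = work(100_000)
elapsed = time.perf_counter() - start  # read it once, after the work
print(f"{elapsed:.4f}s, result={result:.2f}")
```

With this layout the elapsed time covers exactly the computation, not pool setup or teardown.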
I have this trial timer code to time euler solutions in Ruby.
$RUNS = 12
def run(solve)
  times = []
  $RUNS.times do
    start_t = Time.now.usec
    solve.call
    end_t = Time.now.usec
    times << (end_t - start_t) / 1000.0
  end
  #times = times.delete_if {|i| i < 0}
  puts times.inspect
  times.sort!  # sort in place; plain sort returns a copy and would leave times unsorted
  mean = times.inject {|a, c| a + c } / $RUNS
  puts("Mean:\t#{mean}")
  if times.length % 2 == 0
    median = (times[times.length / 2 - 1] + times[times.length / 2]) / 2.0
  else
    median = times[times.length / 2]
  end
  puts("Median: #{median}")
end
Unfortunately, I keep getting answers like this:
[409.805, 418.16, -582.23, 402.223, -581.94, 413.196, 426.816, -584.732, 519.457, -569.557, 558.918, -579.176]
What can I do to avoid these strange negative numbers?
usec returns the microseconds component of the time, in the same way as month returns the month. It is not the number of microseconds since the epoch for the given time.
So if start_t was 1049896564.259970 seconds and end_t was 1049896592.123130 seconds, then subtracting the usecs would give 123130 - 259970, i.e. a negative number.
Instead you could use Time.now.to_f to convert to a floating-point number of seconds since the epoch and subtract those from each other. You can also subtract one Time object from another directly, e.g.:
start_t = Time.now
solve.call
end_t = Time.now
times << end_t - start_t
Current time in seconds since the Epoch:
Time.now.to_f
=> 1278631398.143
That should have microsecond resolution, despite only three decimal places being shown here.
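Python's datetime has the same trap: .microsecond is a field of the timestamp (0..999999), just like Ruby's Time#usec, so differencing it between two timestamps can go negative even when time has moved forward; a sketch of the wrong and the right way:

```python
import time
from datetime import datetime

# Wrong: .microsecond is only the sub-second field, like Ruby's Time#usec.
a = datetime(2024, 1, 1, 12, 0, 1, 259970)
b = datetime(2024, 1, 1, 12, 0, 29, 123130)
print(b.microsecond - a.microsecond)  # -136840, even though b is ~28 s later

# Right: subtract full timestamps, or use a monotonic clock for intervals.
print((b - a).total_seconds())        # 27.86316
start = time.perf_counter()
# ... work being timed ...
elapsed = time.perf_counter() - start
assert elapsed >= 0.0
```

Subtracting whole time values, as the answer suggests for Ruby, always yields the true elapsed interval.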