Speed increase for textscan use - performance

I wrote a program that takes .csv files and stacks the 3rd column of each file into the corresponding page of the 3rd dimension of a 512x512xNumberOfFiles cell array. The code goes like this:
[filenames,filepath] = uigetfile('*.csv','Opening the data files','','Multiselect','on');
filenames = fullfile(filepath,filenames);
NumFiles = numel(filenames);
Pixel = cell(512,512,NumFiles);
count=0;
num_pixels = size(Pixel,1)*size(Pixel,2);
for k = 1:NumFiles
    fid = fopen(char(filenames(k)));
    C = textscan(fid, '%d, %d, %d','HeaderLines',1);
    Pixel(count + sub2ind(size(Pixel),C{1}+1,C{2}+1)) = num2cell(C{3});
    count = count + num_pixels;
    fclose(fid);
end
The textscan call here takes approximately 0.5 +/- 0.03 s per file I open (each file is 262144 (512x512) values long), and my sub2ind call takes approximately 0.2 +/- 0.01 s per file.
Is there any way to decrease this time, or does this seem like the most efficient way to run the code? I'll be working with approximately 1000 files each time, so waiting 8-9 minutes just to load the data seems a bit excessive (considering I haven't used it for anything else yet).
Any tips?
Marc-Olivier

I'm hoping this results in some improvement while still keeping textscan. Also, make sure the values look right.
Code
[filenames,filepath] = uigetfile('*.csv','Opening the data files',...
'','Multiselect','on');
filenames = fullfile(filepath,filenames);
NumFiles = numel(filenames);
PixelDouble = NaN(512*512,NumFiles);
for k = 1:NumFiles
    fid = fopen(char(filenames(k)));
    C = textscan(fid, '%d, %d, %d','HeaderLines',1);
    PixelDouble(:,k) = C{3};
    fclose(fid);
end
Pixel = num2cell(permute(reshape(PixelDouble,512,512,[]),[2 1 3]));
I must encourage you to follow this question - Fastest Matlab file reading? - and its answers.
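One further tweak that might help while still keeping textscan: since only the third column is used, you can tell textscan to skip the first two fields with %*d so it doesn't convert values that are never kept. This is untested and assumes, as the snippet above already does, that the rows in each file are in pixel order:
fid = fopen(char(filenames(k)));
C = textscan(fid, '%*d, %*d, %d', 'HeaderLines', 1);
PixelDouble(:,k) = C{1}; % the kept (third) column is the only cell returned
fclose(fid);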

Related

Is heap sort supposed to be very slow on MATLAB?

I wrote a heap sort function in MATLAB and it works fine, except that when the length of the input is greater than or equal to 1000, it can take a long time (e.g. a length of 1000 takes half a second). I'm not sure whether MATLAB just doesn't run heap sort very fast or whether it's my code that needs to be improved.
My code is shown below:
function b = heapsort(a)
[~,n] = size(a);
b = zeros(1,n);
for i = 1:n
    a = build_max_heap(a);
    b(n+1-i) = a(1);
    temp = a(1);
    a(1) = a(n+1-i);
    a(n+1-i) = temp;
    a(n+1-i) = [];
    a = heapify(a,1);
end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function a = build_max_heap(a)
[~,n] = size(a);
m = floor(n/2);
for i = m:-1:1
    a = heapify(a,i);
end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function a = heapify(a,i)
[~,n] = size(a);
left = 2*i;
right = 2*i + 1;
if left <= n
    if a(left) >= a(i)
        large = left;
    else
        large = i;
    end
else
    return
end
if right <= n
    if a(right) >= a(large)
        large = right;
    end
end
if large ~= i
    temp = a(large);
    a(large) = a(i);
    a(i) = temp;
    a = heapify(a,large);
end
end
I'm aware that it may be the line a(n+1-i) = []; that consumes a lot of time. But when I changed the [] to -999 (lower than any number in the input vector), it didn't help and even took more time.
You should use the profiler to check which lines take the most time. It's definitely a(n+1-i) = []; that's slowing down your function.
Resizing arrays in loops is very slow, so you should always try to avoid it.
A simple test:
Create a function that takes a large vector as input, and iteratively removes elements until it's empty.
Create a function that takes the same vector as input and iteratively sets each value to 0, Inf, NaN or something else.
Use timeit to check which function is faster. You'll see that the last function is approximately 100 times faster (depending on the size of the vector of course).
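Here is a minimal sketch of that test; the two helper functions are mine (save them on your path or at the end of a script), and the exact ratio will depend on your MATLAB version and machine:
% shrink_loop.m
function a = shrink_loop(a)
while ~isempty(a)
    a(end) = []; % deleting an element resizes the array every iteration
end
end
% overwrite_loop.m
function a = overwrite_loop(a)
for k = numel(a):-1:1
    a(k) = NaN; % same loop, but the array keeps its size
end
end
% timing
v = rand(1,1e4);
timeit(@() shrink_loop(v))
timeit(@() overwrite_loop(v))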
The reason why -999 takes more time is most likely that a no longer gets smaller and smaller, so a = heapify(a,1); won't get faster and faster. I haven't tested it, but if you try the following in your first function you'll probably get a much faster program (you must apply the n+1-ii bound in other places in your code as well, but I'll leave that to you):
a(n+1-ii) = NaN;
a(1:n+1-ii) = heapify(a(1:n+1-ii),1);
Note that I changed i to ii. That's partially because I want to give you good advice, and partially to avoid being reminded not to use i and j as variable names in MATLAB.

How to create an array of concatenated contents from an array of labeled arrays

I have the following data:
a cell array of labels (e.g. a cell array of 4 options of message types, where each type is a string)
a cell array of messages (e.g. a cell array of 5000 messages, where each message is a cell array of many word strings).
a cell array of labels for the messages (e.g. a cell array of 5000 strings, where the string in cell i is the type of the message in cell i of the array in part 2).
My goal is to get from this data a cell array whose length equals the number of labels, where each cell holds the concatenated contents of all the messages of that label's type (e.g. get a cell array of 4 cells, where cell i contains a cell array of all the words from all the messages whose type is i).
I implemented 3 methods to perform this. This is the code for my 3 implementations:
%...............................................................
% setting data for tic toc tests
messagesTypesOptions = {'type1';'type2';'type3';'type4'};
messages = cell(5000,1);
for i = 1:5000
messages{i} = {'word1';'word2';'word3';'word4';'word5';'word6';'word7';'word8';'word9';'word10'};
end
messages_labels = cell(5000,1);
for i = 1:5000
messages_labels{i} = messagesTypesOptions{randi([1 4])};
end
%...............................................................
% start test
% method 1
type_to_msgs1 = cell(size(messagesTypesOptions,1),1);
tic
for i = 1:size(messagesTypesOptions,1)
type_to_msgs1{i} = messages(strcmp(messages_labels,messagesTypesOptions{i}));
end
type_to_concatenated1 = cell(4,1);
for i = 1:4
type_to_msgs1{i} = type_to_msgs1{i}';
end
for i = 1:4
    label_msgs = type_to_msgs1{i};
    num_of_label_msgs = size(label_msgs,2);
    for j = 1:num_of_label_msgs
        label_msgs{j} = label_msgs{j}';
    end
    type_to_concatenated1{i} = [label_msgs{:}];
end
toc
% method 2
type_to_concatenated2 = cell(4,1);
tic
labelStr_to_labelIndex = containers.Map(messagesTypesOptions,1:4);
for textIndex = 1:5000
    type_to_concatenated2{labelStr_to_labelIndex(messages_labels{textIndex})} = ...
        [type_to_concatenated2{labelStr_to_labelIndex(messages_labels{textIndex})},...
        messages{textIndex}'];
end
toc
% method 3
type_to_concatenated3 = cell(4,1);
tic
labelStr_to_labelIndex2 = containers.Map(messagesTypesOptions,1:4);
matrix_label_to_isMsgFromLabel = zeros(4,5000);
for textIndex = 1:5000
    matrix_label_to_isMsgFromLabel(labelStr_to_labelIndex2(messages_labels{textIndex})...
        ,textIndex) = 1;
end
for i = 1:4
    label_msgs3 = messages(~~matrix_label_to_isMsgFromLabel(i,:))';
    num_of_label_msgs3 = size(label_msgs3,2);
    for j = 1:num_of_label_msgs3
        label_msgs3{j} = label_msgs3{j}';
    end
    type_to_concatenated3{i} = [label_msgs3{:}];
end
toc
Those are the results I get:
Elapsed time is 0.033120 seconds.
Elapsed time is 0.471959 seconds.
Elapsed time is 0.095011 seconds.
So, the conclusion is that method 1 is the fastest.
Now, my question is: Is there a way to solve this in a faster way?
Intuitively, it seems that my method 1 is not very efficient, because it has a for loop with strcmp, and the strcmp scans all the message labels on every iteration; i.e. it reads the same thing num-of-labels (types) times.
So, is there a way to modify one of my methods to get faster solution? Is there another method which is faster?
EDIT: Here I used constant messages for the example. But I want a solution for the case where the messages are different from each other and can be of different sizes.
EDIT2: Also, the types are strings that don't necessarily have numbers in them (e.g. instead of type1, type2, ... that I used for the example code, they could be 'error', 'warning', 'valid').
Basically you have messages and need to index into them to get output for each cell of the output cell array and finally concatenate the elements. For indexing you can use logical indexing which in most cases is very efficient. For getting the logical indexing arrays, you can take help of bsxfun. Here's the code to wrap up the discussion -
%// Get the parameters
lbls_len = numel(messages_labels);
msgtypeops_len = numel(messagesTypesOptions);
%// Tag messages_labels and messagesTypesOptions with numbers
alltypes = [messages_labels ; messagesTypesOptions];
[~,~,IDs] = unique(alltypes,'stable');
lbls = IDs(1:lbls_len);
typeops = IDs(lbls_len+1:end);
%// Positions of matches for each label IDs against type IDS
pos = bsxfun(@eq,lbls,typeops');
%// Logically index into messages and select the ones based on positions
%// obtained in the previous step for the final output and finally
%// concatenate along the rows to get us the final output cell array
out = arrayfun(@(n) vertcat(messages{pos(:,n)})',1:msgtypeops_len,'Uni',0)';
Benchmarking
Here are some runtimes comparing Method 1, which turned out to be the best one listed in the question, against the proposed solution.
1) With length of messages_labels as 5000:
------------------ With Method - 1
Elapsed time is 0.072821 seconds.
------------------ With Proposed solution
Elapsed time is 0.053961 seconds.
2) With length of messages_labels as 500000:
------------------ With Method - 1
Elapsed time is 6.998149 seconds.
------------------ With Proposed solution
Elapsed time is 2.765090 seconds.
An almost 1.5x-2.5x speedup might be good enough for you!
As ever, this boils down to a simple indexing problem, and for cell arrays of strings MATLAB has a nice way to generate those indices: ismember. There might be a clever way to then use that index vector to pull all the messages out in one go, but logical indexing is easy and quick enough, and JIT magic actually makes the trivial loop faster than arrayfun (using R2013b on Linux). That gives us this:
tic
out = cell(4,1);
[~, idx] = ismember(messages_labels, messagesTypesOptions);
for ii=1:4
out{ii} = vertcat(messages{idx == ii})';
end
toc
With the above added to the end of the original code:
>> test
Elapsed time is 0.056497 seconds.
Elapsed time is 0.857934 seconds.
Elapsed time is 0.201966 seconds.
Elapsed time is 0.017667 seconds.
Not bad :D
Replace all the 5000's with 50000's and it still scales linearly like #1 and #3:
>> test
Elapsed time is 0.550462 seconds.
Elapsed time is 48.685048 seconds.
Elapsed time is 1.965559 seconds.
Elapsed time is 0.162989 seconds.
Just to be sure:
>> isequal(type_to_concatenated1, type_to_concatenated2, type_to_concatenated3, out)
ans =
1
And, if you can handle the grouped messages being column vectors rather than rows, take out the transpose...
...
out{ii} = vertcat(messages{idx == ii});
...
...and it's twice as fast again:
>> test
Elapsed time is 0.552040 seconds.
Elapsed time is <skipped>
Elapsed time is 1.986059 seconds.
Elapsed time is 0.077958 seconds.
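If you want to push the "pull everything out in one go" idea further, one more option worth sketching (not benchmarked here) is accumarray with a cell-returning function handle. Note that accumarray does not guarantee the order in which elements of each group are passed to the function, so the word order inside each cell may differ from the loop versions:
[~, idx] = ismember(messages_labels, messagesTypesOptions);
out2 = accumarray(idx, (1:numel(messages))', [numel(messagesTypesOptions) 1], ...
    @(r) {vertcat(messages{r})}); % one cell of stacked words per label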

More Efficient Alternatives to Max and Min

So, I'm trying to optimize a program I made, and two glaring inefficiencies I have found with the help of the profiler are these:
if (min(image_arr(j,i,:)) > 0.1)
image_arr(j,i,:) = image_arr(j,i,:) - min(image_arr(j,i,:));
end
%"Grounds" the data, making sure the points start close to 0
Called 4990464 times, takes 58.126 s total, 21.8% of total run time.
[max_mag , max_index] = max(image_arr(j, i, :));
%finds the maximum value and its index in the set
Called 4990464 times, takes 50.900 s total, 19.1% of total run time.
Is there any alternative to max and min that I can use here, that would be more efficient?
There is no way to reduce the number of times these lines are called.
Based on the call count, these lines are probably inside a loop. Both min and max are vectorized (they operate on whole arrays along a chosen dimension).
Since you want to find extrema along the third dimension, you can use:
image_arr = bsxfun(@minus, image_arr, min(image_arr, [], 3));
and
[max_mag , max_index] = max(image_arr, [], 3);
It seems like:
if (min(image_arr(j,i,:)) > 0.1)
image_arr(j,i,:) = image_arr(j,i,:) - min(image_arr(j,i,:));
end
could be rewritten like this:
data = image_arr(j,i,:);
mn = min(data);
if (mn > 0.1)
image_arr(j,i,:) = data - mn;
end
which seems like the inner loop of something that could be written like:
minarr = min(image_arr, [], 3);
[a,b] = find(minarr > 0.1);
image_arr(a,b,:) = image_arr(a,b,:) - minarr(a,b)
Rename your i and j.
Those names have meaning to MATLAB, and every time it sees them it has to check whether you have your own definition or they mean sqrt(-1).
The first part can be done without loops using bsxfun.
m = min(image_arr,[],3);
image_arr = bsxfun(@minus, image_arr, m.*(m>0.1));
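If you're on R2016b or newer, implicit expansion should let you drop bsxfun entirely and write the same operation directly:
m = min(image_arr, [], 3);
image_arr = image_arr - m.*(m > 0.1); % m is expanded along the 3rd dimension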

How can I avoid if else statements within a for loop?

I have code that yields a solution similar to the desired output, and I don't know how to perfect it.
The code is as follows.
N = 4; % sampling period
for nB = -30:-1
    if rem(nB,N)==0
        xnB(abs(nB)) = -(cos(.1*pi*nB)-(4*sin(.2*pi*nB)));
    else
        xnB(abs(nB)) = 0;
    end
end
for nC = 1:30
    if rem(nC,N)==0
        xnC(nC) = cos(.1*pi*nC)-(4*sin(.2*pi*nC));
    else
        xnC(nC) = 0;
    end
end
nB = -30:-1;
nC = 1:30;
nD = 0;
xnD = 0;
plot(nA,xnA,nB,xnB,'r--o',nC,xnC,'r--o',nD,xnD,'r--o')
This produces something that is close, but not close enough for proper data recovery.
I have tried using an index that has the same length but simply starts at 1, but the output was even worse than this; if that is a viable option, please explain thoroughly how it should be done.
I have tried running this in a single for-loop with one if-statement but there is a problem when the counter passes zero. What is a way around this that would allow me to avoid using two for-loops? (I'm fairly confident that, solving this issue would increase the accuracy of my output enough to successfully recover the signal.)
EDIT/CLARIFICATION/ADD - 1
I do in fact want to evaluate the signal at the index of zero. The if-statement cannot handle an index of zero which is an index that I'd prefer not to skip.
The goal of this code is to be able to sample a signal, and then I will build a code that will put it through a recovery filter.
EDIT/UPDATE - 2
nA = -30:.1:30; % n values for original function
xnA = cos(.1*pi*nA)-(4*sin(.2*pi*nA)); % original function
N = 4; % sampling period
n = -30:30;
xn = zeros(size(n));
xn(rem(n,N)==0) = -(cos(.1*pi*n)-(4*sin(.2*pi*n)));
plot(nA,xnA,n,xn,'r--o')
title('Original seq. x and Sampled seq. xp')
xlabel('n')
ylabel('x(n) and xp(n)')
legend('original','sampled');
This threw an error at the line xn(rem(n,N)==0) = -(cos(.1*pi*n)-(4*sin(.2*pi*n))); which read: In an assignment A(I) = B, the number of elements in B and I must be the same. I have run into this error before, but my previous encounters were usually the result of faulty looping. Could someone point out why it isn't working this time?
EDIT/Clarification - 3
N = 4; % sampling period
for nB = -30:30
    if rem(nB,N)==0
        xnB(abs(nB)) = -(cos(.1*pi*nB)-(4*sin(.2*pi*nB)));
    else
        xnB(abs(nB)) = 0;
    end
end
The resulting error message is: Attempted to access xnB(0); index must be a positive integer or logical.
EDIT/SUCCESS - 4
After taking another look at the answers posted, I realized that the negative sign in front of the cos function wasn't supposed to be in the original coding.
You could do something like the following:
nB = -30:-1;
nC = 1:30;
xnB = zeros(size(nB));
remB = rem(nB,N)==0;
xnB(remB) = -(cos(.1*pi*nB(remB))-(4*sin(.2*pi*nB(remB))));
xnC = zeros(size(nC));
remC = rem(nC,N)==0;
xnC(remC) = cos(.1*pi*nC(remC))-(4*sin(.2*pi*nC(remC)));
This avoids the issue of having for-loops entirely. However, this would produce the exact same output as you had before, so I'm not sure that it would fix your initial problem...
EDIT for your most recent addition:
nB = -30:30;
xnB = zeros(size(nB));
remB = rem(nB,N)==0;
xnB(remB) = -(cos(.1*pi*nB(remB))-(4*sin(.2*pi*nB(remB))));
In your original post you had the sign dependent on the sign of nB - if you wanted to maintain this functionality, you would do the following:
xnB(remB) = sign(nB(remB)).*(cos(.1*pi*nB(remB))-(4*sin(.2*pi*nB(remB))));
From what I understand, you want to iterate over all integer values in [-30, 30] excluding 0 using a single for loop. This can easily be done as:
for ii = [-30:-1,1:30]
%Your code
end
Resolution for edit - 2
As per your updated code, try replacing
xn(rem(n,N)==0) = -(cos(.1*pi*n)-(4*sin(.2*pi*n)));
with
xn(rem(n,N)==0) = -(cos(.1*pi*n(rem(n,N)==0))-(4*sin(.2*pi*n(rem(n,N)==0))));
This should fix the dimension mismatch.
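Equivalently, you can precompute the logical mask once, which keeps the line readable and avoids evaluating rem twice (a stylistic variant rather than a real speedup):
keep = rem(n,N)==0;
xn(keep) = -(cos(.1*pi*n(keep))-(4*sin(.2*pi*n(keep))));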
Resolution for edit - 3
Try:
N = 4; % sampling period
for nB = -30:30
    if rem(nB,N)==0
        xnB(nB-(-30)+1) = -(cos(.1*pi*nB)-(4*sin(.2*pi*nB)));
    else
        xnB(nB-(-30)+1) = 0;
    end
end

How to speed this kind of for-loop?

I would like to compute the maximum of translated images along the direction of a given axis. I know about ordfilt2; however, I would like to avoid using the Image Processing Toolbox.
So here is the code I have so far:
imInput = imread('tire.tif');
n = 10;
imMax = imInput(:, n:end);
for i = 1:(n-1)
imMax = max(imMax, imInput(:, i:end-(n-i)));
end
Is it possible to avoid using a for-loop in order to speed the computation up, and, if so, how?
First edit: Using Octave's code for im2col is actually 50% slower.
Second edit: Pre-allocating did not appear to improve the result enough.
sz = [size(imInput,1), size(imInput,2)-n+1];
range_j = 1:size(imInput, 2)-sz(2)+1;
range_i = 1:size(imInput, 1)-sz(1)+1;
B = zeros(prod(sz), length(range_j)*length(range_i));
counter = 0;
for j = range_j % left to right
    for i = range_i % up to bottom
        counter = counter + 1;
        v = imInput(i:i+sz(1)-1, j:j+sz(2)-1);
        B(:, counter) = v(:);
    end
end
imMax = reshape(max(B, [], 2), sz);
Third edit: I shall show the timings.
For what it's worth, here's a vectorized solution using IM2COL function from the Image Processing Toolbox:
imInput = imread('tire.tif');
n = 10;
sz = [size(imInput,1) size(imInput,2)-n+1];
imMax = reshape(max(im2col(imInput, sz, 'sliding'),[],2), sz);
imshow(imMax)
You could perhaps write your own version of IM2COL, as it simply consists of well-crafted indexing, or even look at how Octave implements it.
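If your MATLAB is recent enough (R2016a or newer), movmax should give you another toolbox-free option: it computes a sliding maximum along a chosen dimension, and trimming the trailing columns makes its window match the loop in the question (the double conversion is just to be safe about integer input support):
imInput = double(imread('tire.tif'));
n = 10;
imMax = movmax(imInput, [0 n-1], 2); % max over columns i..i+n-1
imMax = imMax(:, 1:end-n+1); % keep only positions where a full window fits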
Check out the answer to this question about doing a rolling median in C. I've successfully made it into a MEX function and it is way faster than even ordfilt2. It will take some work to do a max, but I'm sure it's possible.
Rolling median in C - Turlach implementation
