Vectorization of nested loops and if statements in MATLAB - performance

I am fairly new to the concept of vectorization in MATLAB so please excuse my naivety in this regard. I was trying to vectorize the following MATLAB code which includes if statements within nested for loops:
h = zeros(dimV);
for a = 1 : dimV
for b = 1 : dimV
if a ~= b && C(a,b) == 1
h(a,b) = C(a,b)*exp(-1i*A(a,b)*L(a,b))*(sin(k*L(a,b)))^-1;
else if a == b
for m = 1 : dimV
if m ~= a && C(a,m) == 1
h(a,b) = h(a,b) - C(a,m)*cot(k*L(a,m));
end
end
end
end
end
end
Here the variable dimV specifies the size of the matrix h and is fairly large, of the order of 100. C is a previously defined symmetric square matrix of size dimV whose off-diagonal elements are all either 0 or 1 and whose diagonal elements are necessarily 0. The elements of the matrix L are zero in the same positions as the zeros of C. Following the vectorization techniques that I found on this website and here, I was able to vectorize the code, albeit partially, and my MWE is as follows:
h = zeros(dimV);
idx = (C == 1);
h(idx) = C(idx).*exp(-1i*A(idx).*L(idx)).*(sin(k*L(idx))).^-1;
for a = 1:dimV;
m = 1 : dimV;
m = m(C(a,:) == 1);
h(a,a) = - sum (C(a,m).*cot(k*L(a,m)));
end
My main problem is in converting the for loop over the variable a to a vectorized form, as I need the individual values of a to address the diagonal elements of h. I compared the evaluation time of both code blocks using the MATLAB profiler, and the latter version is only marginally faster; the improvement is insignificant. In fact, the profiler showed that the line assigning values to h(a,a) takes up nearly 50% of the execution time in the second case. So I was wondering if there is a more elegant way to rewrite the above code using appropriate vectorization schemes, which would help improve its efficiency. I would greatly appreciate any help in this regard. Thank you so much.

Vectorized Code -
diag_ind = 1:dimV+1:numel(C);
C_neq1 = C~=1;
parte1_2 = (sin(k.*L)).^-1;
parte1_2(C_neq1 & (L==0))=0;
parte1 = exp(-1i.*A.*L).*parte1_2;
parte1(C_neq1)=0;
h = parte1;
parte2 = cot(k*L);
parte2(C_neq1)=0;
parte2(diag_ind)=0;
h(diag_ind) = - sum(parte2,2);
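As a sanity check, the sketch below compares the original nested loops against a compact mask-based variant of the vectorized idea. It is not the answer's exact code: the test data (dimV = 6, k = 0.7, random A, C and L) and the names mask, cotL, h_loop, h_vec and max_err are assumptions made up for illustration.

```matlab
% Hypothetical test data, chosen only to satisfy the stated assumptions
dimV = 6;
k = 0.7;
U = triu(rand(dimV) > 0.5, 1);      % random strict upper triangle
C = double(U | U.');                % symmetric 0/1 matrix, zero diagonal
L = C .* (1 + rand(dimV));          % zero wherever C is zero
A = rand(dimV);

% Reference: the nested loops from the question
h_loop = zeros(dimV);
for a = 1:dimV
    for b = 1:dimV
        if a ~= b && C(a,b) == 1
            h_loop(a,b) = exp(-1i*A(a,b)*L(a,b)) / sin(k*L(a,b));
        elseif a == b
            for m = 1:dimV
                if m ~= a && C(a,m) == 1
                    h_loop(a,a) = h_loop(a,a) - cot(k*L(a,m));
                end
            end
        end
    end
end

% Mask-based variant: only touch the entries where C == 1
mask = (C == 1);                    % diagonal is false because diag(C) is 0
h_vec = zeros(dimV);
h_vec(mask) = exp(-1i*A(mask).*L(mask)) ./ sin(k*L(mask));
cotL = zeros(dimV);
cotL(mask) = cot(k*L(mask));        % avoids evaluating cot(0)
h_vec(1:dimV+1:end) = -sum(cotL, 2);  % row sums go on the diagonal

max_err = max(abs(h_loop(:) - h_vec(:)));
fprintf('max abs difference: %g\n', max_err);
```

Evaluating sin and cot only on the masked entries sidesteps the Inf values that would otherwise appear where L is zero, so no clean-up assignments are needed afterwards.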

Related

A faster alternative to all(a(:,i)==a,1) in MATLAB

It is a straightforward question: Is there a faster alternative to all(a(:,i)==a,1) in MATLAB?
I'm thinking of an implementation that benefits from short-circuit evaluation throughout the whole process. I mean, all() definitely benefits from short-circuit evaluation, but a(:,i)==a doesn't.
I tried the following code,
% example for the input matrix
m = 3; % m and n aren't necessarily equal to those values.
n = 5000; % It's only possible to know in advance that 'm' << 'n'.
a = randi([0,5],m,n); % the maximum value of 'a' isn't necessarily equal to
% 5 but it's possible to state that every element in
% 'a' is a positive integer.
% all, equal solution
tic
for i = 1:n % stepping up the elapsed time in orders of magnitude
%%%%%%%%%% all and equal solution %%%%%%%%%
ax_boo = all(a(:,i)==a,1);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
end
toc
% alternative solution
tic
for i = 1:n % stepping up the elapsed time in orders of magnitude
%%%%%%%%%%% alternative solution %%%%%%%%%%%
ax_boo = a(1,i) == a(1,:);
for k = 2:m
ax_boo(ax_boo) = a(k,i) == a(k,ax_boo);
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
end
toc
but it's intuitive that any "for-loop-solution" within the MATLAB environment will be naturally slower. I'm wondering if there is a MATLAB built-in function written in a faster language.
EDIT:
After running more tests I found out that the implicit expansion does have a performance impact in evaluating a(:,i)==a. If the matrix a has more than one row, all(repmat(a(:,i),[1,n])==a,1) may be faster than all(a(:,i)==a,1) depending on the number of columns (n). For n=5000 repmat explicit expansion has proved to be faster.
But I think that a generalization of Kenneth Boyd's answer is the "ultimate solution" if all elements of a are positive integers. Instead of dealing with a (m x n matrix) in its original form, I will store and deal with adec (1 x n matrix):
exps = ((0):(m-1)).';
base = max(a,[],[1,2]) + 1;
adec = sum( a .* base.^exps , 1 );
In other words, each column will be encoded to one integer. And of course adec(i)==adec is faster than all(a(:,i)==a,1).
EDIT 2:
I forgot to mention that the adec approach has a functional limitation. At best, storing adec as uint64, the following inequality must hold: base^m < 2^64 + 1.
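The encoding idea can be checked on a small example. The sketch below is only illustrative: the sizes m = 3 and n = 50 and the names ref, enc and mismatch are assumptions, and max(a(:)) is used in place of max(a,[],[1,2]) so that it also runs on older releases. Because adec is computed in double precision here, the encoding is only exact while base^m stays at or below 2^53.

```matlab
m = 3;
n = 50;
a = randi([0,5], m, n);            % hypothetical input, non-negative integers

exps = (0:m-1).';                  % column of exponents 0..m-1
base = max(a(:)) + 1;              % one more than the largest entry
adec = sum(a .* base.^exps, 1);    % one integer code per column (implicit expansion)

% Doubles represent integers exactly only up to 2^53
exact = (base^m <= 2^53);

% Compare the encoded test against the direct column comparison
mismatch = 0;
for i = 1:n
    ref = all(a(:,i) == a, 1);     % direct comparison
    enc = (adec(i) == adec);       % comparison on the encoded row
    mismatch = mismatch + nnz(ref ~= enc);
end
fprintf('exact encoding: %d, mismatches: %d\n', exact, mismatch);
```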
Since your goal is to count the number of columns that match, my example converts the binary encoding of each column to a decimal integer. Then you just loop over the possible values (with 3 rows there are 8 possible values) and count the number of matches.
a_dec = 2.^(0:(m-1)) * a;
num_poss_values = 2 ^ m;
num_matches = zeros(num_poss_values, 1);
for i = 1:num_poss_values
num_matches(i) = sum(a_dec == (i - 1));
end
On my computer, using 2020a, here are the execution times for your first 2 options and the code above:
Elapsed time is 0.246623 seconds.
Elapsed time is 0.553173 seconds.
Elapsed time is 0.000289 seconds.
So my code is 853 times faster!
I wrote my code so it will work with m being an arbitrary integer.
The num_matches variable contains the number of columns that encode to 0, 1, 2, ..., 7 when converted to a decimal.
As an alternative you can use the third output of unique:
[~, ~, iu] = unique(a.', 'rows');
for i = 1:n
ax_boo = iu(i) == iu;
end
As indicated in a comment:
ax_boo isolates the indices of the columns I have to sum in a row vector b. So, basically the next line would be something like c = sum(b(ax_boo),2);
It is a typical usage of accumarray:
[~, ~, iu] = unique(a.', 'rows');
C = accumarray(iu,b);
for i = 1:n
c = C(iu(i)); % C is indexed by group id, so look up the group of column i
end
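A small worked example may make the unique/accumarray combination easier to follow. The data a and b and the names c_fast and c_ref below are made up for illustration; the sketch looks up each column's sum through its group index iu and compares the result with the direct column-matching sum.

```matlab
% Hypothetical data: columns 1 and 3 are equal, columns 2 and 5 are equal
a = [1 2 1 3 2;
     0 4 0 1 4];
b = [10 20 30 40 50];

% iu(i) is the group id of column i (identical columns share an id)
[~, ~, iu] = unique(a.', 'rows');

% One sum of b per group, then one lookup per column
C = accumarray(iu(:), b(:));
c_fast = C(iu(:)).';

% Reference: sum b over all columns that match column i
c_ref = zeros(1, numel(b));
for i = 1:numel(b)
    ax_boo = all(a(:,i) == a, 1);
    c_ref(i) = sum(b(ax_boo), 2);
end

disp(c_fast);
disp(isequal(c_fast, c_ref));
```

Note that C has one entry per distinct column, not one per column, which is why the lookup goes through iu.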

inverse of symmetric matrix is not symmetric in Julia

I am using Julia version 0.6.2 and I am facing this problem.
mat = zeros(6, 6)
for i = 1 : 6
for j = 1 : 6
mat[i, j] = exp(-(i - j)^2)
end
end
issymmetric(mat)
issymmetric(inv(mat))
And the output is
Main> issymmetric(mat)
true
Main> issymmetric(inv(mat))
false
I also tried the following Matlab code
mat = zeros(6, 6);
for i = 1 : 6
for j = 1 : 6
mat(i, j) = exp(-(i - j)^2);
end
end
issymmetric(mat)
issymmetric(inv(mat))
And the output is
logical 1
logical 1
Apart from manually making the matrix symmetric as you propose, e.g. taking the average of matrix and its transpose like
A = inv(mat)
(A+A.')/2
probably a cleaner way is
smat = Symmetric(mat)
B = inv(smat)
Now B (as well as smat) passes issymmetric. Moreover, the symmetry is guaranteed at the type level (Symmetric), and some functions can take advantage of this additional information. This is exactly what inv does for smat.
EDIT: the question was also posted on Discourse, where you can find additional discussion about the performance of Symmetric.

Implementing Neville's Algorithm in MatLab

I'm attempting to implement Neville's algorithm in MatLab with four given points in this case. However, I'm a little stuck at the moment.
This is my script so far:
% Neville's Method
% Function parameters
x = [7,14,21,28];
fx = [58,50,54,53];
t = 10;
n = length(x);
Q = zeros(n,n);
for i = 1:n
Q(i,1) = fx(i);
end
for j = 2:n
for i = j:n
Q(i,j) = ((t-x(i-j)) * Q(i,j-1)/(x(i)-x(i-j))) + ((x(i)-t) * Q(i-1,j-1)/(x(i)-x(i-j)));
end
end
print(Q);
As for the problem I'm having, I'm getting this output consistently:
Subscript indices must either be real positive integers or logicals.
I've been trying to tweak the loop iterations but to no avail. I know the problem is the primary logic line in the inner loop. Some of the operations result in array indices that are equal to zero initially.
That's where I am, any help would be appreciated!
In the first iteration of your inner loop, i-j is 0 because i starts at j, and in MATLAB indices start at 1. A simple fix to get running code would be to change
for i = j:n
to
for i = j+1:n
This solves
Subscript indices must either be real positive integers or logicals.
However, this may not be ideal and you may need to rethink your logic. The output I get is
>> neville
Q =
58.0000 0 0 0
50.0000 0 0 0
54.0000 50.8571 0 0
53.0000 54.2857 51.3469 0
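An alternative that keeps the full triangular table is to shift the node index by one, so the 0-based x(i-j) of the textbook recurrence becomes x(i-j+1) in MATLAB. The sketch below is one possible reading of Neville's recurrence under that assumption, not the code from the question or the answer; the helper name i0 is made up, and disp replaces print, which in MATLAB is meant for figures.

```matlab
% Neville's method with 1-based indexing (sketch)
x  = [7, 14, 21, 28];
fx = [58, 50, 54, 53];
t  = 10;
n  = length(x);

Q = zeros(n, n);
Q(:, 1) = fx(:);                  % first column holds the function values

for j = 2:n
    for i = j:n
        i0 = i - j + 1;           % first node used by this entry (1-based)
        Q(i,j) = ((t - x(i0)) * Q(i,j-1) + (x(i) - t) * Q(i-1,j-1)) ...
                 / (x(i) - x(i0));
    end
end

disp(Q);
fprintf('Interpolated value at t = %g: %.4f\n', t, Q(n,n));
```

With this indexing the loop can start at i = j again, and Q(n,n) holds the value at t of the polynomial through all four points.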

matlab code nested loop performance improvement

I would be very interested to receive suggestions on how to improve performance of the following nested for loop:
I = (U > q); % matrix of indicator variables, I(i,j) is 1 if U(i,j) > q
for i = 2:K
for j = 1:(i-1)
mTau(i,j) = sum(I(:,i) .* I(:,j));
mTau(j,i) = mTau(i,j);
end
end
For each pair of variables, the code counts the observations in which both variables exceed the threshold q, thereby filling a symmetric matrix. I appreciate your help!
You can use matrix multiplication:
I = double(U>q);
mTau = I.'*I;
This will have nonzero values on the diagonal, so you can set them to zero with
mTau = mTau - diag(diag(mTau));
One approach with bsxfun -
out = squeeze(sum(bsxfun(@and,I,permute(I,[1 3 2])),1));
out(1:size(out,1)+1:end)=0;
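The matrix-product approach can be checked against the original loop on small random input. The sizes N = 200 and K = 5, the threshold q = 0.5, and the names mTau_loop, mTau_mat and same below are assumptions for illustration only.

```matlab
% Hypothetical input
N = 200;
K = 5;
q = 0.5;
U = rand(N, K);
I = double(U > q);                 % indicator matrix as doubles

% Reference: the nested loops from the question
mTau_loop = zeros(K);
for i = 2:K
    for j = 1:(i-1)
        mTau_loop(i,j) = sum(I(:,i) .* I(:,j));
        mTau_loop(j,i) = mTau_loop(i,j);
    end
end

% Matrix product: entry (i,j) is the dot product of columns i and j
mTau_mat = I.' * I;
mTau_mat(1:K+1:end) = 0;           % clear the diagonal via linear indexing

same = isequal(mTau_loop, mTau_mat);
disp(same);
```

The counts are small integers stored as doubles, so the two results can be compared exactly with isequal.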

Histogram intersection kernel optimization in MATLAB

I want to try a svm classifier using histogram intersection kernel, for a dataset of 153 images but it takes a long time. This is my code:
a = load('...'); %vectors
b = load('...'); %labels
g = dataset(a,b);
error = crossval(g,libsvc([],proxm([],'ih'),100),10,10);
error1 = crossval(g,libsvc([],proxm([],'ih'),10),10,10);
error2 = crossval(g,libsvc([],proxm([],'ih'),1),10,10);
My implementation of the kernel within the proxm function is:
...
case {'dist_histint','ih'}
[m,d]=size(A);
[n,d1]=size(B);
if (d ~= d1)
error('column length of A (%d) != column length of B (%d)\n',d,d1);
end
% With the MATLAB JIT compiler the trivial implementation turns out
% to be the fastest, especially for large matrices.
D = zeros(m,n);
for i=1:m % m is number of samples of A
if (0==mod(i,1000)) fprintf('.'); end
for j=1:n % n is number of samples of B
D(i,j) = sum(min([A(i,:);B(j,:)]));%./max(A(:,i),B(:,j)));
end
end
I need some matlab optimization for this code!
You can get rid of that kernel loop to calculate D with this bsxfun based vectorized approach -
D = squeeze(sum(bsxfun(@min,A,permute(B,[3 2 1])),2))
Or avoid squeeze with this modification -
D = sum(bsxfun(@min,permute(A,[1 3 2]),permute(B,[3 1 2])),3)
If the calculations of D involve max instead of min, just replace @min with @max there.
Explanation: bsxfun expands its inputs along singleton dimensions and applies the operation named by the @ function handle inside its call. This expansion is how one arrives at vectorized solutions that replace for-loops. By singleton dimensions, we mean array dimensions of size 1.
In many cases, singleton dimensions aren't already present, and for vectorization with bsxfun we need to create them. One of the tools for doing so is permute. That is essentially how the vectorized approach stated earlier works.
Thus, your kernel code -
...
case {'dist_histint','ih'}
[m,d]=size(A);
[n,d1]=size(B);
if (d ~= d1)
error('column length of A (%d) != column length of B (%d)\n',d,d1);
end
% With the MATLAB JIT compiler the trivial implementation turns out
% to be the fastest, especially for large matrices.
D = zeros(m,n);
for i=1:m % m is number of samples of A
if (0==mod(i,1000)) fprintf('.'); end
for j=1:n % n is number of samples of B
D(i,j) = sum(min([A(i,:);B(j,:)]));%./max(A(:,i),B(:,j)));
end
end
reduces to -
...
case {'dist_histint','ih'}
[m,d]=size(A);
[n,d1]=size(B);
if (d ~= d1)
error('column length of A (%d) != column length of B (%d)\n',d,d1);
end
D = squeeze(sum(bsxfun(@min,A,permute(B,[3 2 1])),2))
%// OR D = sum(bsxfun(@min,permute(A,[1 3 2]),permute(B,[3 1 2])),3)
I am assuming the line: if (0==mod(i,1000)) fprintf('.'); end isn't important to the calculations as it does printing of some message.
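The singleton-expansion idea described above can be checked on a small example by tracking the array shapes step by step. The sizes m = 4, n = 3, d = 5 and the names Bp, T, D_loop, D_vec and max_err below are assumptions for illustration.

```matlab
% Hypothetical histograms: m samples in A, n samples in B, d bins each
m = 4;
n = 3;
d = 5;
A = rand(m, d);
B = rand(n, d);

% Reference: the double loop from the kernel
D_loop = zeros(m, n);
for i = 1:m
    for j = 1:n
        D_loop(i,j) = sum(min([A(i,:); B(j,:)]));
    end
end

% Vectorized: move the samples of B to the third dimension
Bp = permute(B, [3 2 1]);          % size 1 x d x n
T  = bsxfun(@min, A, Bp);          % size m x d x n after expansion
D_vec = squeeze(sum(T, 2));        % sum over bins, then drop the singleton

max_err = max(abs(D_loop(:) - D_vec(:)));
fprintf('size of T: %s, max abs difference: %g\n', mat2str(size(T)), max_err);
```

The intermediate array T holds m*d*n elements, so for large sample counts the memory cost may outweigh the gain from removing the loops. Also, when m is 1, squeeze returns an n-by-1 column, so reshape(sum(T,2), m, n) is a safer choice if that case can occur.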
