I would be very interested to receive suggestions on how to improve performance of the following nested for loop:
I = (U > q); % matrix of indicator variables, I(i,j) is 1 if U(i,j) > q
for i = 2:K
for j = 1:(i-1)
mTau(i,j) = sum(I(:,i) .* I(:,j));
mTau(j,i) = mTau(i,j);
end
end
The code evaluates if for pairs of variables both variables are below a certain threshold, thereby filling a matrix. I appreciate your help!
You can use matrix multiplication:
I = double(U>q);
mTau = I.'*I;
This will have none-zero values on diagonal so you can set them to zero by
mTau = mTau - diag(diag(mTau));
One approach with bsxfun -
out = squeeze(sum(bsxfun(#and,I,permute(I,[1 3 2])),1));
out(1:size(out,1)+1:end)=0;
Related
I'm attempting to implement Neville's algorithm in MatLab with four given points in this case. However, I'm a little stuck at the moment.
This is my script so far:
% Neville's Method
% Function parameters
x = [7,14,21,28];
fx = [58,50,54,53];
t = 10;
n = length(x);
Q = zeros(n,n);
for i = 1:n
Q(i,1) = fx(i);
end
for j = 2:n
for i = j:n
Q(i,j) = ((t-x(i-j)) * Q(i,j-1)/(x(i)-x(i-j))) + ((x(i)-t) * Q(i-1,j-1)/(x(i)-x(i-j)));
end
end
print(Q);
As for the problem I'm having, I'm getting this output consistently:
Subscript indices must either be real positive integers or logicals.
I've been trying to tweak the loop iterations but to no avail. I know the problem is the primary logic line in the inner loop. Some of the operations result in array indices that are equal to zero initially.
That's where I am, any help would be appreciated!
In your loop at the first time i-j is 0 because you set i = j. In MATLAB indices start at 1. A simple fix to get running code would be to change
for i = j:n
to
for i = j+1:n
This solves
Subscript indices must either be real positive integers or logicals.
However, this may not be ideal and you may need to rethink your logic. The output I get is
>> neville
Q =
58.0000 0 0 0
50.0000 0 0 0
54.0000 50.8571 0 0
53.0000 54.2857 51.3469 0
I am fairly new to the concept of vectorization in MATLAB so please excuse my naivety in this regard. I was trying to vectorize the following MATLAB code which includes if statements within nested for loops:
h = zeros(dimV);
for a = 1 : dimV
for b = 1 : dimV
if a ~= b && C(a,b) == 1
h(a,b) = C(a,b)*exp(-1i*A(a,b)*L(a,b))*(sin(k*L(a,b)))^-1;
else if a == b
for m = 1 : dimV
if m ~= a && C(a,m) == 1
h(a,b) = h(a,b) - C(a,m)*cot(k*L(a,m));
end
end
end
end
end
end
Here the variable dimV specifies the size of the matrix h and is fairly large, of the order of 100, and C is a symmetric square matrix (previously defined) of size dimV all of whose off-diagonal elements are either 0 or 1 and the diagonal elements are necesarrily 0. The elements of matrix L are also zero in the same positions as the zeros of the matrix C. Following the vectorization techniques that I found on this website and here, I was able to vectorize the code, albeit partially, and my MWE is as follows:
h = zeros(dimV);
idx = (C == 1);
h(idx) = C(idx).*exp(-1i*A(idx).*L(idx)).*(sin(k*L(idx))).^-1;
for a = 1:dimV;
m = 1 : dimV;
m = m(C(a,:) == 1);
h(a,a) = - sum (C(a,m).*cot(k*L(a,m)));
end
My main problem is in converting the for loop in the variable a to a vector as I need the individual values of a to address the diagonal elements of h. I compared the evaluation time of both the code blocks using the MATLAB profiler and the latter version is only marginally faster and the improvement in efficiency is really insignificant. In fact the profiler showed that the line allocating the values to h(a,a) takes up nearly 50% of the execution time in the second case. So I was wondering if there is a more elegant way to rewrite the above code using the appropriate vectorization schemes, which would help improve its efficiency. I am really in a bit of bother about this and any I would be greatly appreciative of any help in this regard. Thank you so much.
Vectorized Code -
diag_ind = 1:dimV+1:numel(C);
C_neq1 = C~=1;
parte1_2 = (sin(k.*L)).^-1;
parte1_2(C_neq1 & (L==0))=0;
parte1 = exp(-1i.*A.*L).*parte1_2;
parte1(C_neq1)=0;
h = parte1;
parte2 = cot(k*L);
parte2(C_neq1)=0;
parte2(diag_ind)=0;
h(diag_ind) = - sum(parte2,2);
I've been working on speeding up the following function, but with no results:
function beta = beta_c(k,c,gamma)
beta = zeros(size(k));
E = #(x) (1.453*x.^4)./((1 + x.^2).^(17/6));
for ii = 1:size(k,1)
for jj = 1:size(k,2)
E_int = integral(E,k(ii,jj),10000);
beta(ii,jj) = c*gamma/(k(ii,jj)*sqrt(E_int));
end
end
end
Up to now, I solved it this way:
function beta = beta_calc(k,c,gamma)
k_1d = reshape(k,[1,numel(k)]);
E_1d =#(k) 1.453.*k.^4./((1 + k.^2).^(17/6));
E_int = zeros(1,numel(k_1d));
parfor ii = 1:numel(k_1d)
E_int(ii) = quad(E_1d,k_1d(ii),10000);
end
beta_1d = c*gamma./(k_1d.*sqrt(E_int));
beta = reshape(beta_1d,[size(k,1),size(k,2)]);
end
Seems to me, it didn't really enhance performances. What do you think about this?
Would you mind to shed a light?
I thank you in advance.
EDIT
I am gonna introduce some theoretical background involving my question.
Generally, beta is to be calculated as follows
Therefore, in the reduced case of unidimensional k array, E_int may be calculated as
E = 1.453.*k.^4./((1 + k.^2).^(17/6));
E_int = 1.5 - cumtrapz(k,E);
or, alternatively as
E_int(1) = 1.5;
for jj = 2:numel(k)
E =#(k) 1.453.*k.^4./((1 + k.^2).^(17/6));
E_int(jj) = E_int(jj - 1) - integral(E,k(jj-1),k(jj));
end
Nonetheless, k is currently a matrix k(size1,size2).
Here's another approach, parallelize, because it's easy using spmd or parfor. Instead of integral consider quad, see this link for examples...
I like this question.
The problem: the function integral takes as integration limits only scalars. Hence, it is difficult to vectorize the computation of of E_int.
A clue: there seems to be lot of redundancy in integrating the same function over and over from k(ii,jj) to infinity...
Proposed solution: How about sorting the values of k from smallest to largest and integrating E_sort_int(si) = integral( E, sortedK(si), sortedK(si+1) ); with sortedK( numel(k) + 1 ) = 10000;. Then the full value of E_int = cumsum( E_sort_int ); (you only need to "undo" the sorting and reshape it back to the size of k).
I have been given an assignment in which I am supposed to write an algorithm which performs polynomial interpolation by the barycentric formula. The formulas states that:
p(x) = (SIGMA_(j=0 to n) w(j)*f(j)/(x - x(j)))/(SIGMA_(j=0 to n) w(j)/(x - x(j)))
I have written an algorithm which works just fine, and I get the polynomial output I desire. However, this requires the use of some quite long loops, and for a large grid number, lots of nastly loop operations will have to be done. Thus, I would appreciate it greatly if anyone has any hints as to how I may improve this, so that I will avoid all these loops.
In the algorithm, x and f stand for the given points we are supposed to interpolate. w stands for the barycentric weights, which have been calculated before running the algorithm. And grid is the linspace over which the interpolation should take place:
function p = barycentric_formula(x,f,w,grid)
%Assert x-vectors and f-vectors have same length.
if length(x) ~= length(f)
sprintf('Not equal amounts of x- and y-values. Function is terminated.')
return;
end
n = length(x);
m = length(grid);
p = zeros(1,m);
% Loops for finding polynomial values at grid points. All values are
% calculated by the barycentric formula.
for i = 1:m
var = 0;
sum1 = 0;
sum2 = 0;
for j = 1:n
if grid(i) == x(j)
p(i) = f(j);
var = 1;
else
sum1 = sum1 + (w(j)*f(j))/(grid(i) - x(j));
sum2 = sum2 + (w(j)/(grid(i) - x(j)));
end
end
if var == 0
p(i) = sum1/sum2;
end
end
This is a classical case for matlab 'vectorization'. I would say - just remove the loops. It is almost that simple. First, have a look at this code:
function p = bf2(x, f, w, grid)
m = length(grid);
p = zeros(1,m);
for i = 1:m
var = grid(i)==x;
if any(var)
p(i) = f(var);
else
sum1 = sum((w.*f)./(grid(i) - x));
sum2 = sum(w./(grid(i) - x));
p(i) = sum1/sum2;
end
end
end
I have removed the inner loop over j. All I did here was in fact removing the (j) indexing and changing the arithmetic operators from / to ./ and from * to .* - the same, but with a dot in front to signify that the operation is performed on element by element basis. This is called array operators in contrast to ordinary matrix operators. Also note that treating the special case where the grid points fall onto x is very similar to what you had in the original implementation, only using a vector var such that x(var)==grid(i).
Now, you can also remove the outermost loop. This is a bit more tricky and there are two major approaches how you can do that in MATLAB. I will do it the simpler way, which can be less efficient, but more clear to read - using repmat:
function p = bf3(x, f, w, grid)
% Find grid points that coincide with x.
% The below compares all grid values with all x values
% and returns a matrix of 0/1. 1 is in the (row,col)
% for which grid(row)==x(col)
var = bsxfun(#eq, grid', x);
% find the logical indexes of those x entries
varx = sum(var, 1)~=0;
% and of those grid entries
varp = sum(var, 2)~=0;
% Outer-most loop removal - use repmat to
% replicate the vectors into matrices.
% Thus, instead of having a loop over j
% you have matrices of values that would be
% referenced in the loop
ww = repmat(w, numel(grid), 1);
ff = repmat(f, numel(grid), 1);
xx = repmat(x, numel(grid), 1);
gg = repmat(grid', 1, numel(x));
% perform the calculations element-wise on the matrices
sum1 = sum((ww.*ff)./(gg - xx),2);
sum2 = sum(ww./(gg - xx),2);
p = sum1./sum2;
% fix the case where grid==x and return
p(varp) = f(varx);
end
The fully vectorized version can be implemented with bsxfun rather than repmat. This can potentially be a bit faster, since the matrices are not explicitly formed. However, the speed difference may not be large for small system sizes.
Also, the first solution with one loop is also not too bad performance-wise. I suggest you test those and see, what is better. Maybe it is not worth it to fully vectorize? The first code looks a bit more readable..
I have a matrix, matrix_logical(50000,100000), that is a sparse logical matrix (a lot of falses, some true). I have to produce a matrix, intersect(50000,50000), that, for each pair, i,j, of rows of matrix_logical(50000,100000), stores the number of columns for which rows i and j have both "true" as the value.
Here is the code I wrote:
% store in advance the nonzeros cols
for i=1:50000
nonzeros{i} = num2cell(find(matrix_logical(i,:)));
end
intersect = zeros(50000,50000);
for i=1:49999
a = cell2mat(nonzeros{i});
for j=(i+1):50000
b = cell2mat(nonzeros{j});
intersect(i,j) = numel(intersect(a,b));
end
end
Is it possible to further increase the performance? It takes too long to compute the matrix. I would like to avoid the double loop in the second part of the code.
matrix_logical is sparse, but it is not saved as sparse in MATLAB because otherwise the performance become the worst possible.
Since the [i,j] entry counts the number of non zero elements in the element-wise multiplication of rows i and j, you can do it by multiplying matrix_logical with its transpose (you should convert to numeric data type first, e.g matrix_logical = single(matrix_logical)):
inter = matrix_logical * matrix_logical';
And it works both for sparse or full representation.
EDIT
In order to calculate numel(intersect(a,b))/numel(union(a,b)); (as asked in your comment), you can use the fact that for two sets a and b, you have
length(union(a,b)) = length(a) + length(b) - length(intersect(a,b))
so, you can do the following:
unLen = sum(matrix_logical,2);
tmp = repmat(unLen, 1, length(unLen)) + repmat(unLen', length(unLen), 1);
inter = matrix_logical * matrix_logical';
inter = inter ./ (tmp-inter);
If I understood you correctly, you want a logical AND of the rows:
intersct = zeros(50000, 50000)
for ii = 1:49999
for jj = ii:50000
intersct(ii, jj) = sum(matrix_logical(ii, :) & matrix_logical(jj, :));
intersct(jj, ii) = intersct(ii, jj);
end
end
Doesn't avoid the double loop, but at least works without the first loop and the slow find command.
Elaborating on my comment, here is a distance function suitable for pdist()
function out = distfun(xi,xj)
out = zeros(size(xj,1),1);
for i=1:size(xj,1)
out(i) = sum(sum( xi & xj(i,:) )) / sum(sum( xi | xj(i,:) ));
end
In my experience, sum(sum()) is faster for logicals than nnz(), thus its appearance above.
You would also need to use squareform() to reshape the output of pdist() appropriately:
squareform(pdist(martrix_logical,#distfun));
Note that pdist() includes a 'jaccard' distance measure, but it is actually the Jaccard distance and not the Jaccard index or coefficient, which is the value you are apparently after.