I am new to MATLAB, so I do not know all the shortcuts it offers for making code faster and more efficient. I have been hacking something together for a homework assignment, focusing on finishing it rather than on efficiency. Now I spend more time waiting on the program than actually coding it. Below is a headache of nested for loops that takes forever to finish. Is there a faster or more efficient way of coding this without so many for loops?
for i = 1:ysize
for j = 1:xsize
MArr = zeros(windowSize^2, 2, 2);
for i2 = i - floor(windowSize/2): i + floor(windowSize/2)
if i2 > 0 && i2 < ysize + 1
for j2 = j - floor(windowSize/2): j + floor(windowSize/2)
if j2 > 0 && j2 < xsize + 1
mat = weight*[mappedGX(i2,j2)^2, mappedGX(i2,j2)*mappedGY(i2,j2); mappedGX(i2,j2)*mappedGY(i2,j2), mappedGY(i2,j2)^2];
for i3 = 1:2
for j3 = 1:2
MArr(windowSize*(j2-(j - floor(windowSize/2))+1) + (i2-(i - floor(windowSize/2)) + 1),i3,j3) = mat(i3,j3);
end
end
end
end
end
end
Msum = zeros(2,2);
for k = size(MArr)
for i2 = 1:2
for j2 = 1:2
Msum = Msum + MArr(k,i2,j2);
end
end
end
R(i,j) = det(Msum) - alpha*(trace(Msum)^2);
R = -1 * R;
end
end
Instead of looping, use colons. For example:
for i3 = 1:2
for j3 = 1:2
MArr(windowSize*(j2-(j - floor(windowSize/2))+1) + (i2-(i - floor(windowSize/2)) + 1),i3,j3) = mat(i3,j3);
end
end
Can be written as:
MArr(windowSize*(j2-(j-floor(windowSize/2))+1)+(i2-(i-floor(windowSize/2))+1),:,:)=mat;
After you find all places where this can be done, learn to use indexing instead of looping, e.g.,
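i2 = i - floor(windowSize/2): i + floor(windowSize/2);
i2 = i2(i2>0 & i2<ysize+1); % element-wise &, because && only accepts scalar operands
j2 = j - floor(windowSize/2): j + floor(windowSize/2);
j2 = j2(j2>0 & j2<xsize+1);
% element-wise .^ and .* because mappedGX(i2,j2) is now a sub-matrix, not a scalar
mat = weight*[mappedGX(i2,j2).^2, mappedGX(i2,j2).*mappedGY(i2,j2); mappedGX(i2,j2).*mappedGY(i2,j2), mappedGY(i2,j2).^2];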
(Note for advanced users: the last line may not work if mappedGX is a matrix, and i2/j2 don't represent a rectangular sub-matrix. In such a case you will need sub2ind())
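Taking the indexing idea one step further, the whole window sum can be done for every pixel at once with conv2. The following is only a sketch under some assumptions: windowSize is odd, weight is a scalar, and the intent of the original loops is to sum the 2-by-2 matrices over the window (clipped at the image border) before computing det(Msum) - alpha*trace(Msum)^2. The sample inputs are made up for illustration.

```matlab
% Hypothetical sample inputs; the variable names mirror the question
mappedGX = magic(5);
mappedGY = magic(5).';
windowSize = 3;   % assumed odd
weight = 1;       % assumed scalar
alpha = 0.04;     % assumed value, not given in the question

box = ones(windowSize);   % summing window

% conv2(..., 'same') pads with zeros, so windows at the border are clipped
% exactly like the i2/j2 range checks in the original loops
Sxx = weight * conv2(mappedGX.^2,        box, 'same');
Sxy = weight * conv2(mappedGX.*mappedGY, box, 'same');
Syy = weight * conv2(mappedGY.^2,        box, 'same');

% det([Sxx Sxy; Sxy Syy]) - alpha*trace(...)^2, evaluated for all pixels at once
R = (Sxx.*Syy - Sxy.^2) - alpha*(Sxx + Syy).^2;
```

If the negated response is what you want, apply R = -R once after this, not inside a loop.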
Related
I want to apply the following transformation function to a grayscale image. I know how to apply the following function,
my question is how to write a program for the following transformation function.
Code so far:
clear;
pollen = imread('Fig3.10(b).jpg');
u = double(pollen);
[nx, ny] = size(u);
nshades = 256;
r1 = 80; s1 = 10; % Transformation by piecewise linear function.
r2 = 140; s2 = 245;
for i = 1:nx
for j = 1:ny
if (u(i,j)< r1)
uspread(i,j) = ((s1-0)/(r1-0))*u(i,j); % semicolon added so the value is not printed on every iteration
end
if ((u(i,j)>=r1) & (u(i,j)<= r2))
uspread(i,j) = ((s2 - s1)/(r2 - r1))*(u(i,j) - r1)+ s1;
end
if (u(i,j)>r2)
uspread(i,j) = ((255 - s2)/(255 - r2))*(u(i,j) - r2) + s2;
end
end
end
hist= zeros(nshades,1);
for i=1:nx
for j=1:ny
for k=0:nshades-1
if uspread(i,j)==k
hist(k+1)=hist(k+1)+1;
end
end
end
end
plot(hist);
pollenspreadmat = uint8(uspread);
imwrite(pollenspreadmat, 'pollenspread.jpg');
Thanks in advance
The figure says that for any intensities that are between A and B, they should be set to C. All you have to do is modify your two for loops so that for any values between A and B, set the output location to C. I'll also assume the range is inclusive. You can simply remove the first and last if conditions and use the middle one:
for i = 1:nx
for j = 1:ny
if ((u(i,j)>=r1) && (u(i,j)<= r2))
uspread(i,j) = C;
end
end
end
C is a constant that you would set yourself. Usually for segmentation, this value is set very high to distinguish the foreground from the background. You have a uint8 image here, so C = 255; would work.
However, I would recommend you achieve a more vectorized solution. Avoid for loops and use logical indexing instead:
uspread = u;
uspread(u >= r1 & u <= r2) = C;
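As a small worked example of the logical indexing above, and of how the triple histogram loop from the question could be dropped as well, here is a sketch. The 3-by-3 matrix is made-up sample data, and the accumarray call assumes every value in uspread is an integer between 0 and 255. The result is named counts here (a name chosen for illustration) so that it does not shadow the built-in hist function.

```matlab
% Hypothetical 3-by-3 image standing in for u (integer values stored as double)
u = [10 90 200; 80 140 141; 0 100 255];
r1 = 80;
r2 = 140;
C = 255;

% Logical indexing: every pixel in [r1, r2] is set to C in one statement
uspread = u;
uspread(u >= r1 & u <= r2) = C;

% Histogram without loops: counts(k+1) is the number of pixels equal to k.
% Assumes all values of uspread are integers in 0..255.
nshades = 256;
counts = accumarray(uspread(:) + 1, 1, [nshades 1]);
```

After this, plot(counts) gives the same kind of histogram plot as the loop version.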
In the problem I'm working on there is a piece of code like the one shown below. The definition part is only there to show the sizes of the arrays. Below it I pasted a vectorized version, and it is more than 2x slower. Why does this happen? I know it can happen if vectorization requires large temporary variables, but that does not seem to be the case here.
And more generally, what (other than parfor, which I already use) can I do to speed up this code?
maxN = 100;
levels = maxN+1;
xElements = 101;
umn = complex(zeros(levels, levels));
umn2 = umn;
bessels = ones(xElements, xElements, levels); % 1.09 GB
posMcontainer = ones(xElements, xElements, maxN);
tic
for j = 1 : xElements
for i = 1 : xElements
for n = 1 : 2 : maxN
nn = n + 1;
mm = 1;
for m = 1 : 2 : n
umn(nn, mm) = bessels(i, j, nn) * posMcontainer(i, j, m);
mm = mm + 1;
end
end
end
end
toc % 0.520594 seconds
tic
for j = 1 : xElements
for i = 1 : xElements
for n = 1 : 2 : maxN
nn = n + 1;
m = 1:2:n;
numOfEl = ceil(n/2);
umn2(nn, 1:numOfEl) = bessels(i, j, nn) * posMcontainer(i, j, m);
end
end
end
toc % 1.275926 seconds
sum(sum(umn-umn2)) % verifying that both versions give the same result
Best regards,
Alex
From the profiler:
Edit:
In reply to @Jason's answer, this alternative takes the same time:
for n = 1:2:maxN
nn(n) = n + 1;
numOfEl(n) = ceil(n/2);
end
for j = 1 : xElements
for i = 1 : xElements
for n = 1 : 2 : maxN
umn2(nn(n), 1:numOfEl(n)) = bessels(i, j, nn(n)) * posMcontainer(i, j, 1:2:n);
end
end
end
Edit2:
In reply to @EBH:
The point is to do the following:
parfor i = 1 : xElements
for j = 1 : xElements
umn = complex(zeros(levels, levels)); % cleaning
for n = 0:maxN
mm = 1;
for m = -n:2:n
nn = n + 1; % for indexing
if m < 0
umn(nn, mm) = bessels(i, j, nn) * negMcontainer(i, j, abs(m));
end
if m > 0
umn(nn, mm) = bessels(i, j, nn) * posMcontainer(i, j, m);
end
if m == 0
umn(nn, mm) = bessels(i, j, nn);
end
mm = mm + 1; % for indexing
end % m
end % n
beta1 = sum(sum(Aj1.*umn));
betaSumSq1(i, j) = abs(beta1).^2;
beta2 = sum(sum(Aj2.*umn));
betaSumSq2(i, j) = abs(beta2).^2;
end % j
end % i
I sped it up as much as I was able to. What you have written takes only the last bessels and posMcontainer values, so it does not produce the same result. In the real code, those two containers are not filled with 1 but with precalculated values.
After your edit, I can see that umn is just a temporary variable for another calculation. It can still be mostly vectorized:
betaSumSq1 = zeros(xElements); % preallocating
betaSumSq2 = zeros(xElements); % preallocating
% an index matrix to fetch the right values from negMcontainer and
% posMcontainer:
indmat = tril(repmat([0 1;1 0],ceil((maxN+1)/2),floor(levels/2)));
indmat(end,:) = [];
% an index matrix to fetch the values in correct order for umn:
b_ind = repmat([1;0],ceil((maxN+1)/2),1);
b_ind(end) = [];
tempind = logical([fliplr(indmat) b_ind indmat+triu(ones(size(indmat)))]);
% permute the arrays to prevent squeeze:
PM = permute(posMcontainer,[3 1 2]);
NM = permute(negMcontainer,[3 1 2]);
B = permute(bessels,[3 1 2]);
for k = 1 : maxN+1 % third dim
for jj = 1 : xElements % columns
b = B(:,jj,k); % get one vector of B
% perform b*NM for every row of NM*indmat, then flip the result:
neg = fliplr(bsxfun(@times,bsxfun(@times,indmat,NM(:,jj,k).'),b));
% perform b*PM for every row of PM*indmat:
pos = bsxfun(@times,bsxfun(@times,indmat,PM(:,jj,k).'),b);
temp = [neg mod(1:levels,2).'.*b pos].'; % concat neg and pos
% assign them to the right place in umn:
umn = reshape(temp(tempind.'),[levels levels]).';
beta1 = Aj1.*umn;
betaSumSq1(jj,k) = abs(sum(beta1(:))).^2;
beta2 = Aj2.*umn;
betaSumSq2(jj,k) = abs(sum(beta2(:))).^2;
end
end
This reduces the running time from ~95 seconds to less than 3 seconds (both without parfor), an improvement of almost 97%.
I would suspect it is memory allocation. You are re-allocating the m array in a 3 deep loop.
Try rearranging the code:
tic
for n = 1 : 2 : maxN
nn = n + 1;
m = 1:2:n;
numOfEl = ceil(n/2);
for j = 1 : xElements
for i = 1 : xElements
umn2(nn, 1:numOfEl) = bessels(i, j, nn) * posMcontainer(i, j, m);
end
end
end
toc % 1.275926 seconds
I was trying this in Igor Pro, which is a similar language but with different optimizations, so direct translations don't time the same way as in MATLAB (the vectorized version was slightly faster in Igor). But reordering the loops did speed up the vectorized form.
In your second part of the code, that is setting umn2, inside the loops, you have:
nn = n + 1;
m = 1:2:n;
numOfEl = ceil(n/2);
Those 3 lines don't require any input from the i and j loops; they only use the n loop. So reordering the loops such that i and j are inside the n loop means those 3 lines are executed xElements^2 (101^2) times less often. I suspect it is the m = 1:2:n line that takes the time, since it allocates an array.
I am having problems with the following loop, since it takes too much time. Hence, I would like to use parallel processing, specifically the parfor function.
P = numel(scaleX); % quite BIG number
sz = P;
start = 1;
sqrL = 10; % sqr len
e = 200;
A = false(sz, sz);
for m = sz-sqrL/2:(-1)*sqrL:start
for n = M(m):-sqrL:1
temp = [scaleX(m), scaleY(m); scaleX(n), scaleY(n)];
d = pdist(temp, 'euclidean');
if d < e
A(m, n) = 1;
end
end
end
Can anyone please help me convert the outer 'for' loop into 'parfor' in this code?
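One possible way to approach this, offered only as a sketch: parfor needs a range of consecutive increasing integers, so the descending, strided values of m can be stored in a vector first and the parfor can run over an index into that vector. Each iteration then fills one row, which is collected in a cell array and copied into A afterwards. The sample data and the definition of M below are placeholders (M is not defined in the question), and hypot is swapped in for pdist because it computes the same Euclidean distance for all n at once. Running in parallel requires the Parallel Computing Toolbox.

```matlab
% Hypothetical sample data standing in for scaleX, scaleY and M
scaleX = (1:40).';
scaleY = zeros(40, 1);
P = numel(scaleX);
sz = P;
start = 1;
sqrL = 10;
e = 200;
M = 1:P;   % placeholder: M(m) is assumed to give the first value of n

mVals = sz-sqrL/2 : -sqrL : start;   % the values the original outer loop visits
rows = cell(numel(mVals), 1);        % one output row per parfor iteration

parfor idx = 1:numel(mVals)
    m = mVals(idx);
    n = M(m):-sqrL:1;                % all inner-loop values for this m
    % Euclidean distance from point m to every point n in one call
    d = hypot(scaleX(n) - scaleX(m), scaleY(n) - scaleY(m));
    r = false(1, sz);
    r(n(d < e)) = true;
    rows{idx} = r;
end

% Copy the collected rows into A
A = false(sz, sz);
for idx = 1:numel(mVals)
    A(mVals(idx), :) = rows{idx};
end
```

Since the inner loop is vectorized here, it may be worth timing this with a plain for first; the parallel overhead only pays off if each iteration does enough work.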
Suppose that I have an N-by-K matrix A, N-by-P matrix B. I want to do the following calculations to get my final N-by-P matrix X.
X(n,p) = B(n,p) - dot(gamma(p,:),A(n,:))
where
gamma(p,k) = dot(A(:,k),B(:,p))/sum( A(:,k).^2 )
In MATLAB, I have my code like
for p = 1:P
for n = 1:N
for k = 1:K
gamma(p,k) = dot(A(:,k),B(:,p))/sum(A(:,k).^2);
end
x(n,p) = B(n,p) - dot(gamma(p,:),A(n,:));
end
end
which is highly inefficient since it uses three for loops! Is there a good way to speed up this code?
Use bsxfun for the division and matrix multiplication for the loops:
gamma = bsxfun(@rdivide, B.'*A, sum(A.^2));
x = B - A*gamma.';
And here is a test script
N = 3;
K = 4;
P = 5;
A = rand(N, K);
B = rand(N, P);
for p = 1:P
for n = 1:N
for k = 1:K
gamma(p,k) = dot(A(:,k),B(:,p))/sum(A(:,k).^2);
end
x(n,p) = B(n,p) - dot(gamma(p,:),A(n,:));
end
end
gamma2 = bsxfun(@rdivide, B.'*A, sum(A.^2));
X2 = B - A*gamma2.';
isequal(x, X2)
isequal(gamma, gamma2)
which returns
ans =
1
ans =
1
It looks to me like you can hoist the gamma calculations out of the loop; at least, I don't see any dependencies on N in the gamma calculations.
So something like this:
for p = 1:P
for k = 1:K
gamma(p,k) = dot(A(:,k),B(:,p))/sum(A(:,k).^2);
end
end
for p = 1:P
for n = 1:N
x(n,p) = B(n,p) - dot(gamma(p,:),A(n,:));
end
end
I'm not familiar enough with your code (or matlab) to really know if you can merge the two loops, but if you can:
for p = 1:P
for k = 1:K
gamma(p,k) = dot(A(:,k),B(:,p))/sum(A(:,k).^2);
end
for n = 1:N
x(n,p) = B(n,p) - dot(gamma(p,:),A(n,:));
end
end
bsxfun is slow...
How about something like the following, which scales the columns of A first:
modA = A * diag(1 ./ sum(A.^2, 1)); % column k of A divided by sum(A(:,k).^2)
gamma = B' * modA;
x = B - A * gamma';
I have been running a MATLAB program for almost six hours now, and it is still not complete. It is cycling through three while loops (the outer two loops are n=855, the inner loop is n=500). Is this a surprise that it is taking this long? Is there anything I can do to increase the speed? I am including the code below, as well as the variable data types underneath that.
while i < (numAtoms + 1)
pointAccessible = ones(numPoints,1);
j = 1;
while j <(numAtoms + 1)
if (i ~= j)
k=1;
while k < (numPoints + 1)
if (pointAccessible(k) == 1)
sphereCoord = [cell2mat(atomX(i)) + p + sphereX(k), cell2mat(atomY(i)) + p + sphereY(k), cell2mat(atomZ(i)) + p + sphereZ(k)];
neighborCoord = [cell2mat(atomX(j)), cell2mat(atomY(j)), cell2mat(atomZ(j))];
coords(1,:) = [sphereCoord];
coords(2,:) = [neighborCoord];
if (pdist(coords) < (atomRadius(j) + p))
pointAccessible(k)=0;
end
end
k = k + 1;
end
end
j = j+1;
end
remainingPoints(i) = sum(pointAccessible);
i = i +1;
end
Variable Data Types:
numAtoms = 855
numPoints = 500
p = 1.4
atomRadius = <855 * 1 double>
pointAccessible = <500 * 1 double>
atomX, atomY, atomZ = <1 * 855 cell>
sphereX, sphereY, sphereZ = <500 * 1 double>
remainingPoints = <855 * 1 double>
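A sketch of one way this could be sped up, under the assumption that each cell of atomX, atomY and atomZ holds a single number: convert the cell arrays to numeric arrays once, outside all loops, and test all sphere points against a neighbor in one vectorized step instead of calling cell2mat and pdist for every point. The small inputs below are made up for illustration, and the names centers, sphere and pts are hypothetical; the coordinate formula (atom coordinate + p + sphere coordinate) is copied from the question as is.

```matlab
% Hypothetical small inputs with the same shapes as in the question
numAtoms = 4;
numPoints = 6;
p = 1.4;
atomX = num2cell([0 3 0 10]);
atomY = num2cell([0 0 3 10]);
atomZ = num2cell([0 0 0 10]);
atomRadius = [1.5; 1.5; 1.5; 1.5];
sphereX = [1; -1; 0; 0; 0; 0];
sphereY = [0; 0; 1; -1; 0; 0];
sphereZ = [0; 0; 0; 0; 1; -1];

% Convert the cell arrays to numeric arrays once, outside all loops
centers = [cell2mat(atomX(:)), cell2mat(atomY(:)), cell2mat(atomZ(:))]; % numAtoms-by-3
sphere = [sphereX, sphereY, sphereZ];                                  % numPoints-by-3

remainingPoints = zeros(numAtoms, 1);
for i = 1:numAtoms
    % All sphere points of atom i at once (same formula as in the question)
    pts = bsxfun(@plus, sphere, centers(i,:) + p);
    accessible = true(numPoints, 1);
    for j = [1:i-1, i+1:numAtoms]        % every neighbor except i itself
        diffs = bsxfun(@minus, pts, centers(j,:));
        d = sqrt(sum(diffs.^2, 2));      % distance of every point to atom j
        accessible = accessible & (d >= atomRadius(j) + p);
    end
    remainingPoints(i) = sum(accessible);
end
```

This keeps two loops but removes the innermost one along with the repeated cell2mat and pdist calls, which are likely where most of the time goes.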