I'm trying parallelise some bits of a code but I do not understand why the following functions main1() and main2() give different results using Julia's multi-threading:
a = rand(4,4);b = rand(4,4);c = rand(4,4);d = rand(4,4)
function main1(a,b,c,d)
L = zeros(2,2,16)
FF = zeros(2,2,16)
FT = zeros(2,2,16)
F = Array{Float32}(undef,2,2)
# L = Array{Array{Float32, 1}, 4}
for i = 1:4
for j = 1:4
ic = i + j*(i-1)
F[1,1] = a[i,j]
F[1,2] = b[i,j]
F[2,1] = c[i,j]
F[2,2] = d[i,j]
L[:,:,ic] .= F * F'
FF[:,:,ic] .= F
FT[:,:,ic] .= F'
end
end
return L,FF,FT
end
function main2(a,b,c,d)
L = zeros(2,2,16)
FF = zeros(2,2,16)
FT = zeros(2,2,16)
F = Array{Float32}(undef,2,2)
# L = Array{Array{Float32, 1}, 4}
Threads.#threads for i = 1:4
Threads.#threads for j = 1:4
ic = i + j*(i-1)
F[1,1] = a[i,j]
F[1,2] = b[i,j]
F[2,1] = c[i,j]
F[2,2] = d[i,j]
L[:,:,ic] .= F * F'
FF[:,:,ic] .= F
FT[:,:,ic] .= F'
end
end
return L,FF,FT
end
How could the parallelisation of main1() be properly fixed?
You cannot nest #threads loops so normally you should do:
Threads.#threads for u in vec(CartesianIndices((4,4)))
i,j = u.I
# your code goes here
end
However, in your code you get the same ic value for different pair of values of (i,j). In the main1 you are overwriting the same parts of L, FF, FT many times which is an obvious bug. Multi-threading will change the order the data is overwritten so it will yields different results. In conclusion, first fix main1 and than parallelize it.
Related
I would love some help with an algorithm I've working on. My specific problem is in the while loop of the algorithm.
What I want the algorithm to do is to sequentially add a singular value in Sk(until the desired TOL is achieved) in the diagonal while making the rest of the matrix 0. The desired goal is to do this to greyscale images.
I can only get 2x2 cycles.
here is the code:
A = rand(5)*30;
[M N] = size(A);
[U S V] = svds(A);
% r = rank(A);
Sn = S(1,1);
Sk = blkdiag(Sn,zeros(N-1));
Ak = U*Sk*V;
X = A - Ak;
F = norm(X,'fro');
tol = 10;
i = 2;
while F > tol
Snk = S(i,i);
Snew = diag([Sn;Snk]);
Sk = blkdiag(Snew,zeros(N-2));
Ak = U*Sk*V';
F = norm(A-Ak,'fro');
i = i + 1;
end
I have a equation that used to compute sigma, in which i is index from 1 to N,* denotes convolution operation, Omega is image domain.
I want to implement it by matlab code. Currently, I have three options to implement the above equation. Could you look at my equation and said to me which one is correct? I spend so much time to see what is differnent amongs methods but I could not find. Thanks in advance
The different between Method 1 and Method 2 that is method 1 compute the sigma after loop but Method 2 computes it in loop.
sigma(1:row,1:col,1:dim) = nu/d;
Does it give same result?
===========Matlab code==============
Method 1
nu = 0;
d = 0;
I2 = I.^2;
[row,col] = size(I);
for i = 1:N
KuI2 = conv2(u(:,:,i).*I2,k,'same');
bc = b.*(c(:,:,i));
bcKuI = -2*bc.*conv2(u(:,:,i).*I,k,'same');
bc2Ku = bc.^2.*conv2(u(:,:,i),k,'same');
nu = nu + sum(sum(KuI2+bcKuI+bc2Ku));
ku = conv2(u(:,:,i),k,'same');
d = d + sum(sum(ku));
end
d = d + (d==0)*eps;
sigma(1:row,1:col,1:dim) = nu/d;
Method 2:
I2 = I.^2;
[row,col] = size(I);
for i = 1:dim
KuI2 = conv2(u(:,:,i).*I2,k,'same');
bc = b.*(c(:,:,i));
bcKuI = -2*bc.*conv2(u(:,:,i).*I,k,'same');
bc2Ku = bc.^2.*conv2(u(:,:,i),k,'same');
nu = sum(sum(KuI2+bcKuI+bc2Ku));
ku = conv2(u(:,:,i),k,'same');
d = sum(sum(ku));
d = d + (d==0)*eps;
sigma(1:row,1:col,i) = nu/d;
end
Method 3:
I2 = I.^2;
[row,col] = size(I);
for i = 1:dim
KuI2 = conv2(u(:,:,i).*I2,k,'same');
bc = b.*(c(:,:,i));
bcKuI = -2*bc.*conv2(u(:,:,i).*I,k,'same');
bc2Ku = bc.^2.*conv2(u(:,:,i),k,'same');
ku = conv2(u(:,:,i),k,'same');
d = ku + (ku==0)*eps;
sigma(:,:,i) = (KuI2+bcKuI+bc2Ku)./d;
end
sigma = sigma + (sigma==0).*eps;
I think that Method 1 is assume that sigma1=sigma2=...sigman because you were computed out of loop function
sigma(1:row,1:col,1:dim) = nu/d;
where nu and d are cumulative sum for each iteration.
While, the Method 2 shown that sigma1 !=sigma 2 !=..sigman because each sigma is calculated in loop function
Hope it help
I write in variable 'O' some values using
for i = 1:size(I,1)
for j = 1:size(1,I)
h = i * j;
O{h} = I(i, j) * theta(h);
end
end
I - double, theta - double.
I need to sum()all 'O' values, but when I do it its give me error: sum: wrong type argument 'cell'.
How can I sum() it?
P.s. when I want to see O(), its give me
O =
{
[1,1] = 0.0079764
[1,2] = 0.0035291
[1,3] = 0.0027539
[1,4] = 0.0034392
[1,5] = 0.017066
[1,6] = 0.0082958
[1,7] = 1.4764e-04
[1,8] = 0.0024597
[1,9] = 1.1155e-04
[1,10] = 0.0010342
[1,11] = 0.0039654
[1,12] = 0.0047713
[1,13] = 0.0054305
[1,14] = 3.3794e-04
[1,15] = 0.014323
[1,16] = 0.0026826
[1,17] = 0.013864
[1,18] = 0.0097778
[1,19] = 0.0058029
[1,20] = 0.0020726
[1,21] = 0.0016430
etc...
The exact answer to your question is to use cell2mat
sum (cell2mat (your_cell_o))
However, this is the very wrong way to solve your problem. The thing is that you should not have created a cell array in first place. You should have created a numeric array:
O = zeros (size (I), class (I));
for i = 1:rows (I)
for j = 1:columns (I)
h = i * j;
O(h) = I(i, j) * theta(h);
endfor
endfor
but even this is just really bad and slow. Octave is a language to vectorize operations. Instead, you should have:
h = (1:rows (I))' .* (1:columns (I)); # automatic broadcasting
O = I .* theta (h);
which assumes your function theta behaves properly and if givena matrix will compute the value for each of the element of h and return something of the same size.
If you get an error about wrong sizes, I will guess you have an old version of Octave that does not perform automatic broadcasting. If so, update Octave. If you really can't, then:
h = bsxfun (#times, (1:rows (I))', 1:columns (I));
I'm trying to use parfor to estimate the time it takes over 96 sec and I've more than one image to treat but I got this error:
The variable B in a parfor cannot be classified
this the code I've written:
Io=im2double(imread('C:My path\0.1s.tif'));
Io=double(Io);
In=Io;
sigma=[1.8 20];
[X,Y] = meshgrid(-3:3,-3:3);
G = exp(-(X.^2+Y.^2)/(2*1.8^2));
dim = size(In);
B = zeros(dim);
c = parcluster
matlabpool(c)
parfor i = 1:dim(1)
for j = 1:dim(2)
% Extract local region.
iMin = max(i-3,1);
iMax = min(i+3,dim(1));
jMin = max(j-3,1);
jMax = min(j+3,dim(2));
I = In(iMin:iMax,jMin:jMax);
% Compute Gaussian intensity weights.
H = exp(-(I-In(i,j)).^2/(2*20^2));
% Calculate bilateral filter response.
F = H.*G((iMin:iMax)-i+3+1,(jMin:jMax)-j+3+1);
B(i,j) = sum(F(:).*I(:))/sum(F(:));
end
end
matlabpool close
any Idea?
Unfortunately, it's actually dim that is confusing MATLAB in this case. You can fix it by doing
[n, m] = size(In);
parfor i = 1:n
for j = 1:m
B(i, j) = ...
end
end
Suppose that I have an N-by-K matrix A, N-by-P matrix B. I want to do the following calculations to get my final N-by-P matrix X.
X(n,p) = B(n,p) - dot(gamma(p,:),A(n,:))
where
gamma(p,k) = dot(A(:,k),B(:,p))/sum( A(:,k).^2 )
In MATLAB, I have my code like
for p = 1:P
for n = 1:N
for k = 1:K
gamma(p,k) = dot(A(:,k),B(:,p))/sum(A(:,k).^2);
end
x(n,p) = B(n,p) - dot(gamma(p,:),A(n,:));
end
end
which are highly inefficient since it uses three for loops! Is there a good way to speed up this code?
Use bsxfun for the division and matrix multiplication for the loops:
gamma = bsxfun(#rdivide, B.'*A, sum(A.^2));
x = B - A*gamma.';
And here is a test script
N = 3;
K = 4;
P = 5;
A = rand(N, K);
B = rand(N, P);
for p = 1:P
for n = 1:N
for k = 1:K
gamma(p,k) = dot(A(:,k),B(:,p))/sum(A(:,k).^2);
end
x(n,p) = B(n,p) - dot(gamma(p,:),A(n,:));
end
end
gamma2 = bsxfun(#rdivide, B.'*A, sum(A.^2));
X2 = B - A*gamma2.';
isequal(x, X2)
isequal(gamma, gamma2)
which returns
ans =
1
ans =
1
It looks to me like you can hoist the gamma calculations out of the loop; at least, I don't see any dependencies on N in the gamma calculations.
So something like this:
for p = 1:P
for k = 1:K
gamma(p,k) = dot(A(:,k),B(:,p))/sum(A(:,k).^2);
end
end
for p = 1:P
for n = 1:N
x(n,p) = B(n,p) - dot(gamma(p,:),A(n,:));
end
end
I'm not familiar enough with your code (or matlab) to really know if you can merge the two loops, but if you can:
for p = 1:P
for k = 1:K
gamma(p,k) = dot(A(:,k),B(:,p))/sum(A(:,k).^2);
end
for n = 1:N
x(n,p) = B(n,p) - dot(gamma(p,:),A(n,:));
end
end
bxfun is slow...
How about something like the following (I might have a transpose wrong)
modA = A * (1./sum(A.^2,2)) * ones(1,k);
gamma = B' * modA;
x = B - A * gamma';