Can this operation be vectorized in Octave? - matrix

I have a matrix A and B. I want to take the sum of squares errors between them ss = sum(sum( (A-B).^2 )), but I only want to do so if NEITHER matrix elements are identically zero. For now, I am going through each matrix as follows:
for i = 1:N
for j = 1:M
if( A(i,j) == 0 )
B(i,j) = 0;
elseif( B(i,j) == 0 )
A(i,j) = 0;
end
end
end
and then taking the sum of squares after that. Is there a way to vectorize the comparison and reassigning of values?

If you were just trying to achieve what the listed code is doing, but in a vectorized fashion, you can use this approach -
%// Create mask to set elements in both A and B to zeros
mask = A==0 | B==0
%// Set A and B to zeros at places where mask has TRUE values
A(mask) = 0
B(mask) = 0
If the bigger context of finding the sum of squares errors after the listed code could be considered, you can do so with this -
df = A - B;
df(A==0 | B==0) = 0;
ss_vectorized = sum(df(:).^2);
Or as #carandraug commented, you can use the built-in sumsq for the sum of squares calculation at the last step -
ss_vectorized = sumsq(df(:));

Related

Image Processing: Algorithm taking too long in MATLAB

I am working in MATLAB to process two 512x512 images, the domain image and the range image. What I am trying to accomplish is the following:
Divide both domain and range images into 8x8 pixel blocks
For each 8x8 block in the domain image, I have to apply a linear transformations to it and compare each of the 4096 transformed blocks with each of the 4096 range blocks.
Compute error in each case between the transformed block and the range image block and find the minimum error.
Finally I'll have for each 8x8 range block, the id of the 8x8 domain block for which the error was minimum (error between the range block and the transformed domain block)
To achieve this, I have written the following code:
RangeImagecolor = imread('input.png'); %input is 512x512
DomainImagecolor = imread('input.png'); %Range and Domain images are identical
RangeImagetemp = rgb2gray(RangeImagecolor);
DomainImagetemp = rgb2gray(DomainImagecolor);
RangeImage = im2double(RangeImagetemp);
DomainImage = im2double(DomainImagetemp);
%For the (k,l)th 8x8 range image block
for k = 1:64
for l = 1:64
minerror = 9999;
min_i = 0;
min_j = 0;
for i = 1:64
for j = 1:64
%here I compute for the (i,j)th domain block, the transformed domain block stored in D_trans
error = 0;
D_trans = zeros(8,8);
R = zeros(8,8); %Contains the pixel values of the (k,l)th range block
for m = 1:8
for n = 1:8
R(m,n) = RangeImage(8*k-8+m,8*l-8+n);
%ApplyTransformation can depend on (k,l) so I can't compute the transformation outside the k,l loop.
[m_dash,n_dash] = ApplyTransformation(8*i-8+m,8*j-8+n);
D_trans(m,n) = DomainImage(m_dash,n_dash);
error = error + (R(m,n)-D_trans(m,n))^2;
end
end
if(error < minerror)
minerror = error;
min_i = i;
min_j = j;
end
end
end
end
end
As an example ApplyTransformation, one can use the identity transformation:
function [x_dash,y_dash] = Iden(x,y)
x_dash = x;
y_dash = y;
end
Now the problem I am facing is the high computation time. The order of computation in the above code is 64^5, which is of the order 10^9. This computation should take at the worst minutes or an hour. It takes about 40 minutes to compute just 50 iterations. I don't know why the code is running so slow.
Thanks for reading my question.
You can use im2col* to convert the image to column format so each block forms a column of a [64 * 4096] matrix. Then apply transformation to each column and use bsxfun to vectorize computation of error.
DomainImage=rand(512);
RangeImage=rand(512);
DomainImage_col = im2col(DomainImage,[8 8],'distinct');
R = im2col(RangeImage,[8 8],'distinct');
[x y]=ndgrid(1:8);
function [x_dash, y_dash] = ApplyTransformation(x,y)
x_dash = x;
y_dash = y;
end
[x_dash, y_dash] = ApplyTransformation(x,y);
idx = sub2ind([8 8],x_dash, y_dash);
D_trans = DomainImage_col(idx,:); %transformation is reduced to matrix indexing
Error = 0;
for mn = 1:64
Error = Error + bsxfun(#minus,R(mn,:),D_trans(mn,:).').^2;
end
[minerror ,min_ij]= min(Error,[],2); % linear index of minimum of each block;
[min_i min_j]=ind2sub([64 64],min_ij); % convert linear index to subscript
Explanation:
Our goal is to reduce number of loops as much as possible. For it we should avoid matrix indexing and instead we should use vectorization. Nested loops should be converted to one loop. As the first step we can create a more optimized loop as here:
min_ij = zeros(4096,1);
for kl = 1:4096 %%% => 1:size(D_trans,2)
minerror = 9999;
min_ij(kl) = 0;
for ij = 1:4096 %%% => 1:size(R,2)
Error = 0;
for mn = 1:64
Error = Error + (R(mn,kl) - D_trans(mn,ij)).^2;
end
if(Error < minerror)
minerror = Error;
min_ij(kl) = ij;
end
end
end
We can re-arrange the loops and we can make the most inner loop as the outer loop and separate computation of the minimum from the computation of the error.
% Computation of the error
Error = zeros(4096,4096);
for mn = 1:64
for kl = 1:4096
for ij = 1:4096
Error(kl,ij) = Error(kl,ij) + (R(mn,kl) - D_trans(mn,ij)).^2;
end
end
end
% Computation of the min
min_ij = zeros(4096,1);
for kl = 1:4096
minerror = 9999;
min_ij(kl) = 0;
for ij = 1:4096
if(Error(kl,ij) < minerror)
minerror = Error(kl,ij);
min_ij(kl) = ij;
end
end
end
Now the code is arranged in a way that can best be vectorized:
Error = 0;
for mn = 1:64
Error = Error + bsxfun(#minus,R(mn,:),D_trans(mn,:).').^2;
end
[minerror ,min_ij] = min(Error, [], 2);
[min_i ,min_j] = ind2sub([64 64], min_ij);
*If you don't have the Image Processing Toolbox a more efficient implementation of im2col can be found here.
*The whole computation takes less than a minute.
First things first - your code doesn't do anything. But you likely do something with this minimum error stuff and only forgot to paste this here, or still need to code that bit. Never mind for now.
One big issue with your code is that you calculate transformation for 64x64 blocks of resulting image AND source image. 64^5 iterations of a complex operation are bound to be slow. Rather, you should calculate all transformations at once and save them.
allTransMats = cell(64);
for i = 1 : 64
for j = 1 : 64
allTransMats{i,j} = getTransformation(DomainImage, i, j)
end
end
function D_trans = getTransformation(DomainImage, i,j)
D_trans = zeros(8);
for m = 1 : 8
for n = 1 : 8
[m_dash,n_dash] = ApplyTransformation(8*i-8+m,8*j-8+n);
D_trans(m,n) = DomainImage(m_dash,n_dash);
end
end
end
This serves to get allTransMat and is OUTSIDE the k, l loop. Preferably as a simple function.
Now, you make your big k, l, i, j loop, where you compare all the elements as needed. Comparison could be also done block-wise instead of filling a small 8x8 matrix, yet doing it per element for some reason.
m = 1 : 8;
n = m;
for ...
R = RangeImage(...); % This will give 8x8 output as n and m are vectors.
D = allTransMats{i,j};
difference = sum(sum((R-D).^2));
if (difference < minDifference) ...
end
Even though this is a simple no transformations case, this speeds up code a lot.
Finally, are you sure you need to compare each block of transformed output with each block in the source? Typically you compare block1(a,b) with block2(a,b) - blocks (or pixels) on the same position.
EDIT: allTransMats requires k and l too. Ouch. There is NO WAY to make this fast for a single iteration, as you require 64^5 calls to ApplyTransformation (or a vectorization of that function, but even then it might not be fast - we would have to see the function to help here).
Therefore, I will re-iterate my advice to generate all transformations and then perform lookup: this upper part of the answer with allTransMats generation should be changed to have all 4 loops and generate allTransMats{i,j,k,l};. It WILL be slow, there is no way around that as I mentioned in the upper part of edit. But, it is a cost you pay once, as after saving the allTransMats, all further image analyses will be able to simply load it instead of generating it again.
But ... what do you even do? Transformation that depends on source and destination block indices plus pixel indices (= 6 values total) sounds like a mistake somewhere, or a prime candidate to optimize instead of all the rest.

K-means for image compression only gives black-and-white result

I'm doing this exercise by Andrew NG about using k-means to reduce the number of colors in an image. But the problem is my code only gives a black-and-white image :( . I have checked every step in the algorithm but it still won't give the correct result. Please help me, thank you very much
Here is the link of the exercise, and here is the dataset.
The correct result is given in the link of the exercise. And here is my black-and-white image:
Here is my code:
function [] = KMeans()
Image = double(imread('bird_small.tiff'));
[rows,cols, RGB] = size(Image);
Points = reshape(Image,rows * cols, RGB);
K = 16;
Centroids = zeros(K,RGB);
s = RandStream('mt19937ar','Seed',0);
% Initialization :
% Pick out K random colours and make sure they are all different
% from each other! This prevents the situation where two of the means
% are assigned to the exact same colour, therefore we don't have to
% worry about division by zero in the E-step
% However, if K = 16 for example, and there are only 15 colours in the
% image, then this while loop will never exit!!! This needs to be
% addressed in the future :(
% TODO : Vectorize this part!
done = false;
while done == false
RowIndex = randperm(s,rows);
ColIndex = randperm(s,cols);
RowIndex = RowIndex(1:K);
ColIndex = ColIndex(1:K);
for i = 1 : K
for j = 1 : RGB
Centroids(i,j) = Image(RowIndex(i),ColIndex(i),j);
end
end
Centroids = sort(Centroids,2);
Centroids = unique(Centroids,'rows');
if size(Centroids,1) == K
done = true;
end
end;
% imshow(imread('bird_small.tiff'))
%
% for i = 1 : K
% hold on;
% plot(RowIndex(i),ColIndex(i),'r+','MarkerSize',50)
% end
eps = 0.01; % Epsilon
IterNum = 0;
while 1
% E-step: Estimate membership given parameters
% Membership: The centroid that each colour is assigned to
% Parameters: Location of centroids
Dist = pdist2(Points,Centroids,'euclidean');
[~, WhichCentroid] = min(Dist,[],2);
% M-step: Estimate parameters given membership
% Membership: The centroid that each colour is assigned to
% Parameters: Location of centroids
% TODO: Vectorize this part!
OldCentroids = Centroids;
for i = 1 : K
PointsInCentroid = Points((find(WhichCentroid == i))',:);
NumOfPoints = size(PointsInCentroid,1);
% Note that NumOfPoints is never equal to 0, as a result of
% the initialization. Or .... ???????
if NumOfPoints ~= 0
Centroids(i,:) = sum(PointsInCentroid , 1) / NumOfPoints ;
end
end
% Check for convergence: Here we use the L2 distance
IterNum = IterNum + 1;
Margins = sqrt(sum((Centroids - OldCentroids).^2, 2));
if sum(Margins > eps) == 0
break;
end
end
IterNum;
Centroids ;
% Load the larger image
[LargerImage,ColorMap] = imread('bird_large.tiff');
LargerImage = double(LargerImage);
[largeRows,largeCols,~] = size(LargerImage); % RGB is always 3
% Dist = zeros(size(Centroids,1),RGB);
% TODO: Vectorize this part!
% Replace each of the pixel with the nearest centroid
for i = 1 : largeRows
for j = 1 : largeCols
Dist = pdist2(Centroids,reshape(LargerImage(i,j,:),1,RGB),'euclidean');
[~,WhichCentroid] = min(Dist);
LargerImage(i,j,:) = Centroids(WhichCentroid);
end
end
% Display new image
imshow(uint8(round(LargerImage)),ColorMap)
imwrite(uint8(round(LargerImage)), 'D:\Hoctap\bird_kmeans.tiff');
You're indexing into Centroids with a single linear index.
Centroids(WhichCentroid)
This is going to return a single value (specifically the red value for that centroid). When you assign this to LargerImage(i,j,:), it will assign all RGB channels the same value resulting in a grayscale image.
You likely want to grab all columns of the selected centroid to provide an array of red, green, and blue values that you want to assign to LargerImage(i,j,:). You can do by using a colon : to specify all columns of Centroids which belong to the row indicated by WhichCentroid.
LargerImage(i,j,:) = Centroids(WhichCentroid,:);

K-means for color quantization - Code not vectorized

I'm doing this exercise by Andrew NG about using k-means to reduce the number of colors in an image. It worked correctly but I'm afraid it's a little slow because of all the for loops in the code, so I'd like to vectorize them. But there are those loops that I just can't seem to vectorize effectively. Please help me, thank you very much!
Also if possible please give some feedback on my coding style :)
Here is the link of the exercise, and here is the dataset.
The correct result is given in the link of the exercise.
And here is my code:
function [] = KMeans()
Image = double(imread('bird_small.tiff'));
[rows,cols, RGB] = size(Image);
Points = reshape(Image,rows * cols, RGB);
K = 16;
Centroids = zeros(K,RGB);
s = RandStream('mt19937ar','Seed',0);
% Initialization :
% Pick out K random colours and make sure they are all different
% from each other! This prevents the situation where two of the means
% are assigned to the exact same colour, therefore we don't have to
% worry about division by zero in the E-step
% However, if K = 16 for example, and there are only 15 colours in the
% image, then this while loop will never exit!!! This needs to be
% addressed in the future :(
% TODO : Vectorize this part!
done = false;
while done == false
RowIndex = randperm(s,rows);
ColIndex = randperm(s,cols);
RowIndex = RowIndex(1:K);
ColIndex = ColIndex(1:K);
for i = 1 : K
for j = 1 : RGB
Centroids(i,j) = Image(RowIndex(i),ColIndex(i),j);
end
end
Centroids = sort(Centroids,2);
Centroids = unique(Centroids,'rows');
if size(Centroids,1) == K
done = true;
end
end;
% imshow(imread('bird_small.tiff'))
%
% for i = 1 : K
% hold on;
% plot(RowIndex(i),ColIndex(i),'r+','MarkerSize',50)
% end
eps = 0.01; % Epsilon
IterNum = 0;
while 1
% E-step: Estimate membership given parameters
% Membership: The centroid that each colour is assigned to
% Parameters: Location of centroids
Dist = pdist2(Points,Centroids,'euclidean');
[~, WhichCentroid] = min(Dist,[],2);
% M-step: Estimate parameters given membership
% Membership: The centroid that each colour is assigned to
% Parameters: Location of centroids
% TODO: Vectorize this part!
OldCentroids = Centroids;
for i = 1 : K
PointsInCentroid = Points((find(WhichCentroid == i))',:);
NumOfPoints = size(PointsInCentroid,1);
% Note that NumOfPoints is never equal to 0, as a result of
% the initialization. Or .... ???????
if NumOfPoints ~= 0
Centroids(i,:) = sum(PointsInCentroid , 1) / NumOfPoints ;
end
end
% Check for convergence: Here we use the L2 distance
IterNum = IterNum + 1;
Margins = sqrt(sum((Centroids - OldCentroids).^2, 2));
if sum(Margins > eps) == 0
break;
end
end
IterNum;
Centroids ;
% Load the larger image
[LargerImage,ColorMap] = imread('bird_large.tiff');
LargerImage = double(LargerImage);
[largeRows,largeCols,NewRGB] = size(LargerImage); % RGB is always 3
% TODO: Vectorize this part!
largeRows
largeCols
NewRGB
% Replace each of the pixel with the nearest centroid
NewPoints = reshape(LargerImage,largeRows * largeCols, NewRGB);
Dist = pdist2(NewPoints,Centroids,'euclidean');
[~,WhichCentroid] = min(Dist,[],2);
NewPoints = Centroids(WhichCentroid,:);
LargerImage = reshape(NewPoints,largeRows,largeCols,NewRGB);
% for i = 1 : largeRows
% for j = 1 : largeCols
% Dist = pdist2(Centroids,reshape(LargerImage(i,j,:),1,RGB),'euclidean');
% [~,WhichCentroid] = min(Dist);
% LargerImage(i,j,:) = Centroids(WhichCentroid,:);
% end
% end
% Display new image
imshow(uint8(round(LargerImage)),ColorMap)
UPDATE: Replaced
for i = 1 : K
for j = 1 : RGB
Centroids(i,j) = Image(RowIndex(i),ColIndex(i),j);
end
end
with
for i = 1 : K
Centroids(i,:) = Image(RowIndex(i),ColIndex(i),:);
end
I think this may be vectorized further by using linear indexing, but for now I should just focus on the while loop since it takes most of the time.
Also when I tried #Dev-iL's suggestion and replaced
for i = 1 : K
PointsInCentroid = Points((find(WhichCentroid == i))',:);
NumOfPoints = size(PointsInCentroid,1);
% Note that NumOfPoints is never equal to 0, as a result of
% the initialization. Or .... ???????
if NumOfPoints ~= 0
Centroids(i,:) = sum(PointsInCentroid , 1) / NumOfPoints ;
end
end
with
E = sparse(1:size(WhichCentroid), WhichCentroid' , 1, Num, K, Num);
Centroids = (E * spdiags(1./sum(E,1)',0,K,K))' * Points ;
the results were always worse: With K = 16, the first takes 2,414s , the second takes 2,455s ; K = 32, the first takes 4,529s , the second takes 5,022s. Seems like vectorization does not help, but maybe there's something wrong with my code :( .
Replaced
for i = 1 : K
for j = 1 : RGB
Centroids(i,j) = Image(RowIndex(i),ColIndex(i),j);
end
end
with
for i = 1 : K
Centroids(i,:) = Image(RowIndex(i),ColIndex(i),:);
end
I think this may be vectorized further by using linear indexing, but for now I should just focus on the while loop since it takes most of the time.
Also when I tried #Dev-iL's suggestion and replaced
for i = 1 : K
PointsInCentroid = Points((find(WhichCentroid == i))',:);
NumOfPoints = size(PointsInCentroid,1);
% Note that NumOfPoints is never equal to 0, as a result of
% the initialization. Or .... ???????
if NumOfPoints ~= 0
Centroids(i,:) = sum(PointsInCentroid , 1) / NumOfPoints ;
end
end
with
E = sparse(1:size(WhichCentroid), WhichCentroid' , 1, Num, K, Num);
Centroids = (E * spdiags(1./sum(E,1)',0,K,K))' * Points ;
the results were always worse: With K = 16, the first takes 2,414s , the second takes 2,455s ; K = 32, the first took 4,529s , the second took 5,022s. Seems like vectorization did not help in this case.
However, when I replaced
Dist = pdist2(Points,Centroids,'euclidean');
[~, WhichCentroid] = min(Dist,[],2);
(in the while loop) with
Dist = bsxfun(#minus,dot(Centroids',Centroids',1)' / 2 , Centroids * Points' );
[~, WhichCentroid] = min(Dist,[],1);
WhichCentroid = WhichCentroid';
the code ran much faster, especially when K is large (K=32)
Thank you everyone!

MATLAB program takes more than 1 hour to execute

The below program is a program for finding k-clique communities from a input graph.
The graph dataset can be found here.
The first line of the dataset contains 'number of nodes and edges' respectively. The following lines have 'node1 node2' representing an edge between node1 and node2 .
For example:
2500 6589 // number_of_nodes, number_of_edges
0 5 // edge between node[0] and node[5]
.
.
.
The k-clique( aCliqueSIZE, anAdjacencyMATRIX ) function is contained here.
The following commands are executed in command window of MATLAB:
x = textread( 'amazon.graph.small' ); %% source input file text
s = max(x(1,1), x(1,2)); %% take largest dimemsion
adjMatrix = sparse(x(2:end,1)+1, x(2:end,2)+1, 1, s, s); %% now matrix is square
adjMatrix = adjMatrix | adjMatrix.'; %% apply "or" with transpose to make symmetric
adjMatrix = full(adjMatrix); %% convert to full if needed
k=4;
[X,Y,Z]=k_clique(k,adjMatrix); %%
% The output can be viewed by the following commands
celldisp(X);
celldisp(Y);
Z
The above program takes more than 1 hour to execute whereas I think this shouldn't be the case. While running the program on windows, I checked the task manager and found that only 500 MB is allocated for the program. Is this the reason for the slowness of the program? If yes, then how can I allocate more heap memory (close to 4GB) to this program in MATLAB?
The problem does not seem to be Memory-bound
Having a sparse, square, symmetric matrix of 6k5 * 6k5 edges does not mean a big memory.
The provided code has many for loops and is heavily recursive in the tail function transfer_nodes()
Add a "Stone-Age-Profiler" into the code
To show the respective times spent on a CPU-bound sections of the processing, wrap the main sections of the code into a construct of:
tic(); for .... end;toc()
which will print you the CPU-bound times spent on relevent sections of the k_clique.m code, showing the readings "on-the-fly"
Your original code k_clique.m
function [components,cliques,CC] = k_clique(k,M)
% k-clique algorithm for detecting overlapping communities in a network
% as defined in the paper "Uncovering the overlapping
% community structure of complex networks in nature and society"
%
% [X,Y,Z] = k_clique(k,A)
%
% Inputs:
% k - clique size
% A - adjacency matrix
%
% Outputs:
% X - detected communities
% Y - all cliques (i.e. complete subgraphs that are not parts of larger
% complete subgraphs)
% Z - k-clique matrix
nb_nodes = size(M,1); % number of nodes
% Find the largest possible clique size via the degree sequence:
% Let {d1,d2,...,dk} be the degree sequence of a graph. The largest
% possible clique size of the graph is the maximum value k such that
% dk >= k-1
degree_sequence = sort(sum(M,2) - 1,'descend');
%max_s = degree_sequence(1);
max_s = 0;
for i = 1:length(degree_sequence)
if degree_sequence(i) >= i - 1
max_s = i;
else
break;
end
end
cliques = cell(0);
% Find all s-size kliques in the graph
for s = max_s:-1:3
M_aux = M;
% Looping over nodes
for n = 1:nb_nodes
A = n; % Set of nodes all linked to each other
B = setdiff(find(M_aux(n,:)==1),n); % Set of nodes that are linked to each node in A, but not necessarily to the nodes in B
C = transfer_nodes(A,B,s,M_aux); % Enlarging A by transferring nodes from B
if ~isempty(C)
for i = size(C,1)
cliques = [cliques;{C(i,:)}];
end
end
M_aux(n,:) = 0; % Remove the processed node
M_aux(:,n) = 0;
end
end
% Generating the clique-clique overlap matrix
CC = zeros(length(cliques));
for c1 = 1:length(cliques)
for c2 = c1:length(cliques)
if c1==c2
CC(c1,c2) = numel(cliques{c1});
else
CC(c1,c2) = numel(intersect(cliques{c1},cliques{c2}));
CC(c2,c1) = CC(c1,c2);
end
end
end
% Extracting the k-clique matrix from the clique-clique overlap matrix
% Off-diagonal elements <= k-1 --> 0
% Diagonal elements <= k --> 0
CC(eye(size(CC))==1) = CC(eye(size(CC))==1) - k;
CC(eye(size(CC))~=1) = CC(eye(size(CC))~=1) - k + 1;
CC(CC >= 0) = 1;
CC(CC < 0) = 0;
% Extracting components (or k-clique communities) from the k-clique matrix
components = [];
for i = 1:length(cliques)
linked_cliques = find(CC(i,:)==1);
new_component = [];
for j = 1:length(linked_cliques)
new_component = union(new_component,cliques{linked_cliques(j)});
end
found = false;
if ~isempty(new_component)
for j = 1:length(components)
if all(ismember(new_component,components{j}))
found = true;
end
end
if ~found
components = [components; {new_component}];
end
end
end
function R = transfer_nodes(S1,S2,clique_size,C)
% Recursive function to transfer nodes from set B to set A (as
% defined above)
% Check if the union of S1 and S2 or S1 is inside an already found larger
% clique
found_s12 = false;
found_s1 = false;
for c = 1:length(cliques)
for cc = 1:size(cliques{c},1)
if all(ismember(S1,cliques{c}(cc,:)))
found_s1 = true;
end
if all(ismember(union(S1,S2),cliques{c}(cc,:)))
found_s12 = true;
break;
end
end
end
if found_s12 || (length(S1) ~= clique_size && isempty(S2))
% If the union of the sets A and B can be included in an
% already found (larger) clique, the recursion is stepped back
% to check other possibilities
R = [];
elseif length(S1) == clique_size;
% The size of A reaches s, a new clique is found
if found_s1
R = [];
else
R = S1;
end
else
% Check the remaining possible combinations of the neighbors
% indices
if isempty(find(S2>=max(S1),1))
R = [];
else
R = [];
for w = find(S2>=max(S1),1):length(S2)
S2_aux = S2;
S1_aux = S1;
S1_aux = [S1_aux S2_aux(w)];
S2_aux = setdiff(S2_aux(C(S2(w),S2_aux)==1),S2_aux(w));
R = [R;transfer_nodes(S1_aux,S2_aux,clique_size,C)];
end
end
end
end
end

Fast computation of warp matrices

For a fixed and given tform, the imwarp command in the Image Processing Toolbox
B = imwarp(A,tform)
is linear with respect to A, meaning there exists some sparse matrix W, depending on tform but independent of A, such that the above can be equivalently implemented
B(:)=W*A(:)
for all A of fixed known dimensions [n,n]. My question is whether there are fast/efficient options for computing W. The matrix form is necessary when I need the transpose operation W.'*B(:), or if I need to do W\B(:) or similar linear algebraic things which I can't do directly through imwarp alone.
I know that it is possible to compute W column-by-column by doing
E=zeros(n);
W=spalloc(n^2,n^2,4*n^2);
for i=1:n^2
E(i)=1;
tmp=imwarp(E,tform);
E(i)=0;
W(:,i)=tmp(:);
end
but this is brute force and slow.
The routine FUNC2MAT is somewhat more optimal in that it uses the loop to compute/gather the sparse entry data I,J,S of each column W(:,i). Then, after the loop, it uses this to construct the overall sparse matrix. It also offers the option of using a PARFOR loop. However, this is still slower than I would like.
Can anyone suggest more speed-optimal alternatives?
EDIT:
For those uncomfortable with my claim that imwarp(A,tform) is linear w.r.t. A, I include the demo script below, which tests that the superposition property is satisfied for random input images and tform data. It can be run repeatedly to see that the nonlinearityError is always small, and easily attributable to floating point noise.
tform=affine2d(rand(3,2));
%tform=projective2d(rand(3));
fun=#(A) imwarp(A,tform,'cubic');
I1=rand(100); I2=rand(100);
c1=rand; c2=rand;
LHS=fun(c1*I1+c2*I2); %left hand side
RHS=c1*fun(I1)+c2*fun(I2); %right hand side
linearityError = norm(LHS(:)-RHS(:),'inf')
That's actually pretty simple:
W = sparse(B(:)/A(:));
Note that W is not unique, but this operation probably produces the most sparse result. Another way to calculate it would be
W = sparse( B(:) * pinv(A(:)) );
but that results in a much less sparse (yet still valid) result.
I constructed the warping matrix using the optical flow fields [u,v] and it is working well for my application
% this function computes the warping matrix
% M x N is the size of the image
function [ Fw ] = generateFwi( u,v,M,N )
Fw = zeros(M*N, M*N);
k =1;
for i=1:M
for j= 1:N
newcoord(1) = i+u(i,j);
newcoord(2) = j+v(i,j);
newi = newcoord(1);
newj = newcoord(2);
if newi >0 && newj >0
newi1x = floor(newi);
newi1y = floor(newj);
newi2x = floor(newi);
newi2y = ceil(newj);
newi3x = ceil(newi); % four nearest points to the given point
newi3y = floor(newj);
newi4x = ceil(newi);
newi4y = ceil(newj);
x1 = [newi,newj;newi1x,newi1y];
x2 = [newi,newj;newi2x,newi2y];
x3 = [newi,newj;newi3x,newi3y];
x4 = [newi,newj;newi4x,newi4y];
w1 = pdist(x1,'euclidean');
w2 = pdist(x2,'euclidean');
w3 = pdist(x3,'euclidean');
w4 = pdist(x4,'euclidean');
if ceil(newi) == floor(newi) && ceil(newj)==floor(newj) % both the new coordinates are integers
Fw(k,(newi1x-1)*N+newi1y) = 1;
else if ceil(newi) == floor(newi) % one of the new coordinates is an integer
w = w1+w2;
w1new = w1/w;
w2new = w2/w;
W = w1new*w2new;
y1coord = (newi1x-1)*N+newi1y;
y2coord = (newi2x-1)*N+newi2y;
if y1coord <= M*N && y2coord <=M*N
Fw(k,y1coord) = W/w2new;
Fw(k,y2coord) = W/w1new;
end
else if ceil(newj) == floor(newj) % one of the new coordinates is an integer
w = w1+w3;
w1 = w1/w;
w3 = w3/w;
W = w1*w3;
y1coord = (newi1x-1)*N+newi1y;
y2coord = (newi3x-1)*N+newi3y;
if y1coord <= M*N && y2coord <=M*N
Fw(k,y1coord) = W/w3;
Fw(k,y2coord) = W/w1;
end
else % both the new coordinates are not integers
w = w1+w2+w3+w4;
w1 = w1/w;
w2 = w2/w;
w3 = w3/w;
w4 = w4/w;
W = w1*w2*w3 + w2*w3*w4 + w3*w4*w1 + w4*w1*w2;
y1coord = (newi1x-1)*N+newi1y;
y2coord = (newi2x-1)*N+newi2y;
y3coord = (newi3x-1)*N+newi3y;
y4coord = (newi4x-1)*N+newi4y;
if y1coord <= M*N && y2coord <= M*N && y3coord <= M*N && y4coord <= M*N
Fw(k,y1coord) = w2*w3*w4/W;
Fw(k,y2coord) = w3*w4*w1/W;
Fw(k,y3coord) = w4*w1*w2/W;
Fw(k,y4coord) = w1*w2*w3/W;
end
end
end
end
else
Fw(k,k) = 1;
end
k=k+1;
end
end
end

Resources