Speed-efficient classification in Matlab - performance

I have an image of size as RGB uint8(576,720,3) where I want to classify each pixel to a set of colors. I have transformed using rgb2lab from RGB to LAB space, and then removed the L layer so it is now a double(576,720,2) consisting of AB.
Now, I want to classify this to some colors that I have trained on another image, and calculated their respective AB-representations as:
Cluster 1: -17.7903 -13.1170
Cluster 2: -30.1957 40.3520
Cluster 3: -4.4608 47.2543
Cluster 4: 46.3738 36.5225
Cluster 5: 43.3134 -17.6443
Cluster 6: -0.9003 1.4042
Cluster 7: 7.3884 11.5584
Now, in order to classify/label each pixel to a cluster 1-7, I currently do the following (pseudo-code):
clusters;
for each x
for each y
ab = im(x,y,2:3);
dist = norm(ab - clusters); // norm of dist between ab and each cluster
[~, idx] = min(dist);
end
end
However, this is terribly slow (52 seconds) because of the image resolution and that I manually loop through each x and y.
Are there some built-in functions I can use that performs the same job? There must be.
To summarize: I need a classification method that classifies pixel images to an already defined set of clusters.

Approach #1
For a N x 2 sized points/pixels array, you can avoid permute as suggested in the other solution by Luis, which could slow down things a bit, to have a kind of "permute-unrolled" version of it and also let's bsxfun work towards a 2D array instead of a 3D array, which must be better with performance.
Thus, assuming clusters to be ordered as a N x 2 sized array, you may try this other bsxfun based approach -
%// Get a's and b's
im_a = im(:,:,2);
im_b = im(:,:,3);
%// Get the minimum indices that correspond to the cluster IDs
[~,idx] = min(bsxfun(#minus,im_a(:),clusters(:,1).').^2 + ...
bsxfun(#minus,im_b(:),clusters(:,2).').^2,[],2);
idx = reshape(idx,size(im,1),[]);
Approach #2
You can try out another approach that leverages fast matrix multiplication in MATLAB and is based on this smart solution -
d = 2; %// dimension of the problem size
im23 = reshape(im(:,:,2:3),[],2);
numA = size(im23,1);
numB = size(clusters,1);
A_ext = zeros(numA,3*d);
B_ext = zeros(numB,3*d);
for id = 1:d
A_ext(:,3*id-2:3*id) = [ones(numA,1), -2*im23(:,id), im23(:,id).^2 ];
B_ext(:,3*id-2:3*id) = [clusters(:,id).^2 , clusters(:,id), ones(numB,1)];
end
[~, idx] = min(A_ext * B_ext',[],2); %//'
idx = reshape(idx, size(im,1),[]); %// Desired IDs
What’s going on with the matrix multiplication based distance matrix calculation?
Let us consider two matrices A and B between whom we want to calculate the distance matrix. For the sake of an easier explanation that follows next, let us consider A as 3 x 2 and B as 4 x 2 sized arrays, thus indicating that we are working with X-Y points. If we had A as N x 3 and B as M x 3 sized arrays, then those would be X-Y-Z points.
Now, if we have to manually calculate the first element of the square of distance matrix, it would look like this –
first_element = ( A(1,1) – B(1,1) )^2 + ( A(1,2) – B(1,2) )^2
which would be –
first_element = A(1,1)^2 + B(1,1)^2 -2*A(1,1)* B(1,1) + ...
A(1,2)^2 + B(1,2)^2 -2*A(1,2)* B(1,2) … Equation (1)
Now, according to our proposed matrix multiplication, if you check the output of A_ext and B_ext after the loop in the earlier code ends, they would look like the following –
So, if you perform matrix multiplication between A_ext and transpose of B_ext, the first element of the product would be the sum of elementwise multiplication between the first rows of A_ext and B_ext, i.e. sum of these –
The result would be identical to the result obtained from Equation (1) earlier. This would continue for all the elements of A against all the elements of B that are in the same column as in A. Thus, we would end up with the complete squared distance matrix. That’s all there is!!
Vectorized Variations
Vectorized variations of the matrix multiplication based distance matrix calculations are possible, though there weren't any big performance improvements seen with them. Two such variations are listed next.
Variation #1
[nA,dim] = size(A);
nB = size(B,1);
A_ext = ones(nA,dim*3);
A_ext(:,2:3:end) = -2*A;
A_ext(:,3:3:end) = A.^2;
B_ext = ones(nB,dim*3);
B_ext(:,1:3:end) = B.^2;
B_ext(:,2:3:end) = B;
distmat = A_ext * B_ext.';
Variation #2
[nA,dim] = size(A);
nB = size(B,1);
A_ext = [ones(nA*dim,1) -2*A(:) A(:).^2];
B_ext = [B(:).^2 B(:) ones(nB*dim,1)];
A_ext = reshape(permute(reshape(A_ext,nA,dim,[]),[1 3 2]),nA,[]);
B_ext = reshape(permute(reshape(B_ext,nB,dim,[]),[1 3 2]),nB,[]);
distmat = A_ext * B_ext.';
So, these could be considered as experimental versions too.

Use pdist2 (Statistics Toolbox) to compute the distances in a vectorized manner:
ab = im(:,:,2:3); % // get A, B components
ab = reshape(ab, [size(im,1)*size(im,2) 2]); % // reshape into 2-column
dist = pdist2(clusters, ab); % // compute distances
[~, idx] = min(dist); % // find minimizer for each pixel
idx = reshape(idx, size(im,1), size(im,2)); % // reshape result
If you don't have the Statistics Toolbox, you can replace the third line by
dist = squeeze(sum(bsxfun(#minus, clusters, permute(ab, [3 2 1])).^2, 2));
This gives squared distance instead of distance, but for the purposes of minimizing it doesn't matter.

Related

Efficiently calculate optical flow parameters - MATLAB

I am implementing the partial derivative equations from the Horn & Schunck paper on optical flow. However, even for relative small images (320x568), it takes a frustratingly long time (~30-40 seconds) to complete. I assume this is due to the 320 x 568 = 181760 loop iterations, but I can't figure out a more efficient way to do this (short of a MEX file).
Is there some way to turn this into a more efficient MATLAB operation (a convolution perhaps)? I can figure out how to do this as a convolution for It but not Ix and Iy. I've also considered matrix shifting, but that only works for It as well, as far as I can figure out.
Has anyone else run into this issue and found a solution?
My code is below:
function [Ix, Iy, It] = getFlowParams(img1, img2)
% Make sure image dimensions match up
assert(size(img1, 1) == size(img2, 1) && size(img1, 2) == size(img2, 2), ...
'Images must be the same size');
assert(size(img1, 3) == 1, 'Images must be grayscale');
% Dimensions of original image
[rows, cols] = size(img1);
Ix = zeros(numel(img1), 1);
Iy = zeros(numel(img1), 1);
It = zeros(numel(img1), 1);
% Pad images to handle edge cases
img1 = padarray(img1, [1,1], 'post');
img2 = padarray(img2, [1,1], 'post');
% Concatenate i-th image with i-th + 1 image
imgs = cat(3, img1, img2);
% Calculate energy for each pixel
for i = 1 : rows
for j = 1 : cols
cube = imgs(i:i+1, j:j+1, :);
Ix(sub2ind([rows, cols], i, j)) = mean(mean(cube(:, 2, :) - cube(:, 1, :)));
Iy(sub2ind([rows, cols], i, j)) = mean(mean(cube(2, :, :) - cube(1, :, :)));
It(sub2ind([rows, cols], i, j)) = mean(mean(cube(:, :, 2) - cube(:, :, 1)));
end
end
2D convolution is the way to go here as also predicted in the question to replace those heavy mean/average calculations. Also, those iterative differentiations could be replaced by MATLAB's diff. Thus, incorporating all that, a vectorized implementation would be -
%// Pad images to handle edge cases
img1 = padarray(img1, [1,1], 'post');
img2 = padarray(img2, [1,1], 'post');
%// Store size parameters for later usage
[m,n] = size(img1);
%// Differentiation along dim-2 on input imgs for Ix calculations
df1 = diff(img1,[],2)
df2 = diff(img2,[],2)
%// 2D Convolution to simulate average calculations & reshape to col vector
Ixvals = (conv2(df1,ones(2,1),'same') + conv2(df2,ones(2,1),'same'))./4;
Ixout = reshape(Ixvals(1:m-1,:),[],1);
%// Differentiation along dim-1 on input imgs for Iy calculations
df1 = diff(img1,[],1)
df2 = diff(img2,[],1)
%// 2D Convolution to simulate average calculations & reshape to col vector
Iyvals = (conv2(df1,ones(1,2),'same') + conv2(df2,ones(1,2),'same'))./4
Iyout = reshape(Iyvals(:,1:n-1),[],1);
%// It just needs elementwise diffentiation between input imgs.
%// 2D convolution to simulate mean calculations & reshape to col vector
Itvals = conv2(img2-img1,ones(2,2),'same')./4
Itout = reshape(Itvals(1:m-1,1:n-1),[],1)
Benefits with such a vectorized implementation would be :
Memory efficiency : No more concatenation along the third dimension that would incur memory overhead. Again, performance wise it would be a benefit as we won't need to index into such heavy arrays.
The iterative differentiations inside the loopy codes are replaced by differentiation with diff, so this should be another improvement.
Those expensive average calculations are replaced by very fast convolution calculations and this should be the major improvement section.
A faster method, with improved results (in the cases I have observed) is the following:
function [Ix, Iy, It] = getFlowParams(imNew,imPrev)
gg = [0.2163, 0.5674, 0.2163];
f = imNew + imPrev;
Ix = f(:,[2:end end]) - f(:,[1 1:(end-1)]);
Ix = conv2(Ix,gg','same');
Iy = f([2:end end],:) - f([1 1:(end-1)],:);
Iy = conv2(Iy,gg ,'same');
It = 2*conv2(gg,gg,imNew - imPrev,'same');
This handles the boundary cases elegantly.
I made this as part of my optical flow toolbox, where you can easily view H&S, Lucas Kanade and more in realtime. In the toolbox, the function is called grad3D.m. You may also want to check out grad3Drec.m, in the same toolbox, which adds simple temporal blurring.

How to construct a loop with reducing iterations

In MATLAB, I have a 256x256 RGB image and a 3x3 kernel that passes over it. The 3x3 kernel computes the colour-euclidean distance between every pair combination of the 9 pixels in the kernel, and stores the maximum value in an array. It then moves by 1 pixel and performs the same computation, and so on.
I can easily code the movement of the kernel over the image, as well as the extraction of the RGB values from the pixels in the kernel.
HOWEVER, I do have trouble efficiently computing the colour-euclidean distance operation for every pair combination of pixels.
For example if I had a 3x3 matrix with the following values:
[55 12 5; 77 15 99; 124 87 2]
I need to code a loop such that the 1st element performs an operation with the 2nd,3rd...9th element. Then the 2nd element performs the operation with the 3rd,4th...9th element and so on until finally the 8th element performs the operation with the 9th element. Preferrably, the same pixel combination shouldn't compute again (like if you computed 2nd with 7th, don't compute 7th with 2nd).
Thank you in advance.
EDIT: My code so far
K=3;
s=1; %If S=0, don't reject, If S=1 Reject first max distance pixel pair
OI=imread('onion.png');
Rch = im2col(OI(:,:,1),[K,K],'sliding')
Gch = im2col(OI(:,:,2),[K,K],'sliding')
Bch = im2col(OI(:,:,3),[K,K],'sliding')
indexes = bsxfun(#gt,(1:K^2)',1:K^2)
a=find(indexes);
[idx1,idx2] = find(indexes);
Rsqdiff = (Rch(idx2,:) - Rch(idx1,:)).^2
Gsqdiff = (Gch(idx2,:) - Gch(idx1,:)).^2
Bsqdiff = (Bch(idx2,:) - Bch(idx1,:)).^2
dists = sqrt(double(Rsqdiff + Gsqdiff + Bsqdiff)) %Distance values for all 36 combinations in 1 column
[maxdist,idx3] = max(dists,[],1) %idx3 is each column's index of max value
if s==0
y = reshape(maxdist,size(OI,1)-K+1,[]) %max value of each column (each column has 36 values)
elseif s==1
[~,I]=max(maxdist);
idx3=idx3(I);
n=size(idx3,2);
for i=1:1:n
idx3(i)=a(idx3(i));
end
[I,J]=ind2sub([K*K K*K],idx3);
for j=1:1:a
[M,N]=ind2sub([K*K K*K],dists(j,:));
M(I,:)=0;
N(:,J)=0;
dists(j,:)=sub2ind; %Incomplete line, don't know what to do here
end
[maxdist,idx3] = max(dists,[],1);
y = reshape(maxdist,size(OI,1)-K+1,[]);
end
If I understood the question correctly, you are looking to form unique pairwise combinations within a sliding 3x3 window, perform euclidean distance calculations consider all three channels, which we are calling as colour-euclidean distances and finally picking out the largest of all distances for each sliding window. So, for a 3x3 window that has 9 elements, you would have 36 unique pairs. If the image size is MxN, because of the sliding nature, you would have (M-3+1)*(N-3+1) = 64516 (for 256x256 case) such sliding windows with 36 pairs each, and therefore the distances array would be 36x64516 sized and the output array of maximum distances would be of size 254x254. The implementation suggested here involves im2col to extract sliding windowed elements as columns, nchoosek to form the pairs and finally performing the square-root of squared differences between three channels of such pairs and would look something like this -
K = 3; %// Kernel size
Rch = im2col(img(:,:,1),[K,K],'sliding')
Gch = im2col(img(:,:,2),[K,K],'sliding')
Bch = im2col(img(:,:,3),[K,K],'sliding')
[idx1,idx2] = find(bsxfun(#gt,(1:K^2)',1:K^2)); %//'
Rsqdiff = (Rch(idx2,:) - Rch(idx1,:)).^2
Gsqdiff = (Gch(idx2,:) - Gch(idx1,:)).^2
Bsqdiff = (Bch(idx2,:) - Bch(idx1,:)).^2
dists = sqrt(Rsqdiff + Gsqdiff + Bsqdiff)
out = reshape(max(dists,[],1),size(img,1)-K+1,[])
Your question is interesting and caught my attention. As far as I understood, you need to calculate euclidean distance between RGB color values of all cells inside 3x3 kernel and to find the largest one. I suggest a possible way to do this by using circshift function and 4D array operations:
Firstly, we pad the input array and create 8 shifted versions of it for each direction:
DIM = 256;
A = zeros(DIM,DIM,3,9);
A(:,:,:,1) = round(255*rand(DIM,DIM,3));%// random 256x256 array (suppose it is your image)
A = padarray(A,[1,1]);%// add zeros on each side of image
%// compute shifted versions of the input array
%// and write them as 4th dimension starting from shifted up clockwise:
A(:,:,:,2) = circshift(A(:,:,:,1),[-1, 0]);
A(:,:,:,3) = circshift(A(:,:,:,1),[-1, 1]);
A(:,:,:,4) = circshift(A(:,:,:,1),[ 0, 1]);
A(:,:,:,5) = circshift(A(:,:,:,1),[ 1, 1]);
A(:,:,:,6) = circshift(A(:,:,:,1),[ 1, 0]);
A(:,:,:,7) = circshift(A(:,:,:,1),[ 1,-1]);
A(:,:,:,8) = circshift(A(:,:,:,1),[ 0,-1]);
A(:,:,:,9) = circshift(A(:,:,:,1),[-1,-1]);
Next, we create an array that calculates the difference for all the possible combinations between all the above arrays:
q = nchoosek(1:9,2);
B = zeros(DIM+2,DIM+2,3,size(q,1));
for i = 1:size(q,1)
B(:,:,:,i) = (A(:,:,:,q(i,1)) - A(:,:,:,q(i,2))).^2;
end
C = sqrt(sum(B,3));
Finally, what we have is all the euclidean distances between all possible pairs within a 3x3 kernel. All we have to do is to extract the maximum values. As far as I understood, you do not consider image edges, so:
C = sqrt(sum(B,3));
D = zeros(DIM-2);
for i = 3:DIM
for j = 3:DIM
temp = C(i-1:i+1,j-1:j+1);
D(i-2,j-2) = max(temp(:));
end
end
D is the 254x254 array with maximum Euclidean distances for A(2:255,2:255), i.e. we exclude image edges.
Hope that helps.
P.S. I am amazed by the shortness of the code provided by #Divakar.

Calculating translation value and rotation angle of a rotated 2D image

I have two images which one of them is the Original image and the second one is Transformed image.
I have to find out how many degrees Transformed image was rotated using 3x3 transformation matrix. Plus, I need to find how far translated from origin.
Both images are grayscaled and held in matrix variables. Their sizes are same [350 500].
I have found a few lecture notes like this.
Lecture notes say that I should use the following matrix formula for rotation:
For translation matrix the formula is given:
Everything is good. But there are two problems:
I could not imagine how to implement the formulas using MATLAB.
The formulas are shaped to find x',y' values but I already have got x,x',y,y' values. I need to find rotation angle (theta) and tx and ty.
I want to know the equivailence of x, x', y, y' in the the matrix.
I have got the following code:
rotationMatrix = [ cos(theta) sin(theta) 0 ; ...
-sin(theta) cos(theta) 0 ; ...
0 0 1];
translationMatrix = [ 1 0 tx; ...
0 1 ty; ...
0 0 1];
But as you can see, tx, ty, theta variables are not defined before used. How can I calculate theta, tx and ty?
PS: It is forbidden to use Image Processing Toolbox functions.
This is essentially a homography recovery problem. What you are doing is given co-ordinates in one image and the corresponding co-ordinates in the other image, you are trying to recover the combined translation and rotation matrix that was used to warp the points from the one image to the other.
You can essentially combine the rotation and translation into a single matrix by multiplying the two matrices together. Multiplying is simply compositing the two operations together. You would this get:
H = [cos(theta) -sin(theta) tx]
[sin(theta) cos(theta) ty]
[ 0 0 1]
The idea behind this is to find the parameters by minimizing the error through least squares between each pair of points.
Basically, what you want to find is the following relationship:
xi_after = H*xi_before
H is the combined rotation and translation matrix required to map the co-ordinates from the one image to the other. H is also a 3 x 3 matrix, and knowing that the lower right entry (row 3, column 3) is 1, it makes things easier. Also, assuming that your points are in the augmented co-ordinate system, we essentially want to find this relationship for each pair of co-ordinates from the first image (x_i, y_i) to the other (x_i', y_i'):
[p_i*x_i'] [h11 h12 h13] [x_i]
[p_i*y_i'] = [h21 h22 h23] * [y_i]
[ p_i ] [h31 h32 1 ] [ 1 ]
The scale of p_i is to account for homography scaling and vanishing points. Let's perform a matrix-vector multiplication of this equation. We can ignore the 3rd element as it isn't useful to us (for now):
p_i*x_i' = h11*x_i + h12*y_i + h13
p_i*y_i' = h21*x_i + h22*y_i + h23
Now let's take a look at the 3rd element. We know that p_i = h31*x_i + h32*y_i + 1. As such, substituting p_i into each of the equations, and rearranging to solve for x_i' and y_i', we thus get:
x_i' = h11*x_i + h12*y_i + h13 - h31*x_i*x_i' - h32*y_i*x_i'
y_i' = h21*x_i + h22*y_i + h23 - h31*x_i*y_i' - h32*y_i*y_i'
What you have here now are two equations for each unique pair of points. What we can do now is build an over-determined system of equations. Take each pair and build two equations out of them. You will then put it into matrix form, i.e.:
Ah = b
A would be a matrix of coefficients that were built from each set of equations using the co-ordinates from the first image, b would be each pair of points for the second image and h would be the parameters you are solving for. Ultimately, you are finally solving this linear system of equations reformulated in matrix form:
You would solve for the vector h which can be performed through least squares. In MATLAB, you can do this via:
h = A \ b;
A sidenote for you: If the movement between images is truly just a rotation and translation, then h31 and h32 will both be zero after we solve for the parameters. However, I always like to be thorough and so I will solve for h31 and h32 anyway.
NB: This method will only work if you have at least 4 unique pairs of points. Because there are 8 parameters to solve for, and there are 2 equations per point, A must have at least a rank of 8 in order for the system to be consistent (if you want to throw in some linear algebra terminology in the loop). You will not be able to solve this problem if you have less than 4 points.
If you want some MATLAB code, let's assume that your points are stored in sourcePoints and targetPoints. sourcePoints are from the first image and targetPoints are for the second image. Obviously, there should be the same number of points between both images. It is assumed that both sourcePoints and targetPoints are stored as M x 2 matrices. The first columns contain your x co-ordinates while the second columns contain your y co-ordinates.
numPoints = size(sourcePoints, 1);
%// Cast data to double to be sure
sourcePoints = double(sourcePoints);
targetPoints = double(targetPoints);
%//Extract relevant data
xSource = sourcePoints(:,1);
ySource = sourcePoints(:,2);
xTarget = targetPoints(:,1);
yTarget = targetPoints(:,2);
%//Create helper vectors
vec0 = zeros(numPoints, 1);
vec1 = ones(numPoints, 1);
xSourcexTarget = -xSource.*xTarget;
ySourcexTarget = -ySource.*xTarget;
xSourceyTarget = -xSource.*yTarget;
ySourceyTarget = -ySource.*yTarget;
%//Build matrix
A = [xSource ySource vec1 vec0 vec0 vec0 xSourcexTarget ySourcexTarget; ...
vec0 vec0 vec0 xSource ySource vec1 xSourceyTarget ySourceyTarget];
%//Build RHS vector
b = [xTarget; yTarget];
%//Solve homography by least squares
h = A \ b;
%// Reshape to a 3 x 3 matrix (optional)
%// Must transpose as reshape is performed
%// in column major format
h(9) = 1; %// Add in that h33 is 1 before we reshape
hmatrix = reshape(h, 3, 3)';
Once you are finished, you have a combined rotation and translation matrix. If you want the x and y translations, simply pick off column 3, rows 1 and 2 in hmatrix. However, we can also work with the vector of h itself, and so h13 would be element 3, and h23 would be element number 6. If you want the angle of rotation, simply take the appropriate inverse trigonometric function to rows 1, 2 and columns 1, 2. For the h vector, this would be elements 1, 2, 4 and 5. There will be a bit of inconsistency depending on which elements you choose as this was solved by least squares. One way to get a good overall angle would perhaps be to find the angles of all 4 elements then do some sort of average. Either way, this is a good starting point.
References
I learned about homography a while ago through Leow Wee Kheng's Computer Vision course. What I have told you is based on his slides: http://www.comp.nus.edu.sg/~cs4243/lecture/camera.pdf. Take a look at slides 30-32 if you want to know where I pulled this material from. However, the MATLAB code I wrote myself :)

Downsize a matrix in which some entries are unknown

I have a 2D grid (G= 250x250) and just about 100 points of this is known and the rest is unknown (NaN). I want to resize this matrix. My problem is that imresize cannot do it for me in MATLAB, because it deletes the known values for me and just gives a NaN matrix.
Anyone know about a method that can do it for me? A suggestion is to use an interpolation method (e.g. by using inverse distance weighting), but I am not sure whether it works or not or even is there any better method?
G = NaN(250,250);
a = ceil(rand(1,50)*250*250);
b = ceil(rand(1,50)*250*250);
G (a) = 1; G (b) = 0;
How about this:
% find the non-NaN entries in G
idx = ~isnan(G);
% find their corresponding row/column indices
[i,j] = find(idx);
% resize your matrix as desired, i.e. scale the row/column indices
i = ceil(i*100/250);
j = ceil(j*100/250);
% write the old non-NaN entries to Gnew using accumarray
% you have to set the correct size of Gnew explicitly
% maximum value is chosen if many entries share the same scaled i/j indices
% NaNs are used as the fill
Gnew = accumarray([i, j], G(idx), [100 100], #max, NaN);
You can also choose a different accumulation function for accumarray if max is not suitable for you. And you can change the fill value from NaN to something else if it is not what you need.

3D Least Squares Plane

What's the algorithm for computing a least squares plane in (x, y, z) space, given a set of 3D data points? In other words, if I had a bunch of points like (1, 2, 3), (4, 5, 6), (7, 8, 9), etc., how would one go about calculating the best fit plane f(x, y) = ax + by + c? What's the algorithm for getting a, b, and c out of a set of 3D points?
If you have n data points (x[i], y[i], z[i]), compute the 3x3 symmetric matrix A whose entries are:
sum_i x[i]*x[i], sum_i x[i]*y[i], sum_i x[i]
sum_i x[i]*y[i], sum_i y[i]*y[i], sum_i y[i]
sum_i x[i], sum_i y[i], n
Also compute the 3 element vector b:
{sum_i x[i]*z[i], sum_i y[i]*z[i], sum_i z[i]}
Then solve Ax = b for the given A and b. The three components of the solution vector are the coefficients to the least-square fit plane {a,b,c}.
Note that this is the "ordinary least squares" fit, which is appropriate only when z is expected to be a linear function of x and y. If you are looking more generally for a "best fit plane" in 3-space, you may want to learn about "geometric" least squares.
Note also that this will fail if your points are in a line, as your example points are.
The equation for a plane is: ax + by + c = z. So set up matrices like this with all your data:
x_0 y_0 1
A = x_1 y_1 1
...
x_n y_n 1
And
a
x = b
c
And
z_0
B = z_1
...
z_n
In other words: Ax = B. Now solve for x which are your coefficients. But since (I assume) you have more than 3 points, the system is over-determined so you need to use the left pseudo inverse. So the answer is:
a
b = (A^T A)^-1 A^T B
c
And here is some simple Python code with an example:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
N_POINTS = 10
TARGET_X_SLOPE = 2
TARGET_y_SLOPE = 3
TARGET_OFFSET = 5
EXTENTS = 5
NOISE = 5
# create random data
xs = [np.random.uniform(2*EXTENTS)-EXTENTS for i in range(N_POINTS)]
ys = [np.random.uniform(2*EXTENTS)-EXTENTS for i in range(N_POINTS)]
zs = []
for i in range(N_POINTS):
zs.append(xs[i]*TARGET_X_SLOPE + \
ys[i]*TARGET_y_SLOPE + \
TARGET_OFFSET + np.random.normal(scale=NOISE))
# plot raw data
plt.figure()
ax = plt.subplot(111, projection='3d')
ax.scatter(xs, ys, zs, color='b')
# do fit
tmp_A = []
tmp_b = []
for i in range(len(xs)):
tmp_A.append([xs[i], ys[i], 1])
tmp_b.append(zs[i])
b = np.matrix(tmp_b).T
A = np.matrix(tmp_A)
fit = (A.T * A).I * A.T * b
errors = b - A * fit
residual = np.linalg.norm(errors)
print("solution:")
print("%f x + %f y + %f = z" % (fit[0], fit[1], fit[2]))
print("errors:")
print(errors)
print("residual:")
print(residual)
# plot plane
xlim = ax.get_xlim()
ylim = ax.get_ylim()
X,Y = np.meshgrid(np.arange(xlim[0], xlim[1]),
np.arange(ylim[0], ylim[1]))
Z = np.zeros(X.shape)
for r in range(X.shape[0]):
for c in range(X.shape[1]):
Z[r,c] = fit[0] * X[r,c] + fit[1] * Y[r,c] + fit[2]
ax.plot_wireframe(X,Y,Z, color='k')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
unless someone tells me how to type equations here, let me just write down the final computations you have to do:
first, given points r_i \n \R, i=1..N, calculate the center of mass of all points:
r_G = \frac{\sum_{i=1}^N r_i}{N}
then, calculate the normal vector n, that together with the base vector r_G defines the plane by calculating the 3x3 matrix A as
A = \sum_{i=1}^N (r_i - r_G)(r_i - r_G)^T
with this matrix, the normal vector n is now given by the eigenvector of A corresponding to the minimal eigenvalue of A.
To find out about the eigenvector/eigenvalue pairs, use any linear algebra library of your choice.
This solution is based on the Rayleight-Ritz Theorem for the Hermitian matrix A.
See 'Least Squares Fitting of Data' by David Eberly for how I came up with this one to minimize the geometric fit (orthogonal distance from points to the plane).
bool Geom_utils::Fit_plane_direct(const arma::mat& pts_in, Plane& plane_out)
{
bool success(false);
int K(pts_in.n_cols);
if(pts_in.n_rows == 3 && K > 2) // check for bad sizing and indeterminate case
{
plane_out._p_3 = (1.0/static_cast<double>(K))*arma::sum(pts_in,1);
arma::mat A(pts_in);
A.each_col() -= plane_out._p_3; //[x1-p, x2-p, ..., xk-p]
arma::mat33 M(A*A.t());
arma::vec3 D;
arma::mat33 V;
if(arma::eig_sym(D,V,M))
{
// diagonalization succeeded
plane_out._n_3 = V.col(0); // in ascending order by default
if(plane_out._n_3(2) < 0)
{
plane_out._n_3 = -plane_out._n_3; // upward pointing
}
success = true;
}
}
return success;
}
Timed at 37 micro seconds fitting a plane to 1000 points (Windows 7, i7, 32bit program)
This reduces to the Total Least Squares problem, that can be solved using SVD decomposition.
C++ code using OpenCV:
float fitPlaneToSetOfPoints(const std::vector<cv::Point3f> &pts, cv::Point3f &p0, cv::Vec3f &nml) {
const int SCALAR_TYPE = CV_32F;
typedef float ScalarType;
// Calculate centroid
p0 = cv::Point3f(0,0,0);
for (int i = 0; i < pts.size(); ++i)
p0 = p0 + conv<cv::Vec3f>(pts[i]);
p0 *= 1.0/pts.size();
// Compose data matrix subtracting the centroid from each point
cv::Mat Q(pts.size(), 3, SCALAR_TYPE);
for (int i = 0; i < pts.size(); ++i) {
Q.at<ScalarType>(i,0) = pts[i].x - p0.x;
Q.at<ScalarType>(i,1) = pts[i].y - p0.y;
Q.at<ScalarType>(i,2) = pts[i].z - p0.z;
}
// Compute SVD decomposition and the Total Least Squares solution, which is the eigenvector corresponding to the least eigenvalue
cv::SVD svd(Q, cv::SVD::MODIFY_A|cv::SVD::FULL_UV);
nml = svd.vt.row(2);
// Calculate the actual RMS error
float err = 0;
for (int i = 0; i < pts.size(); ++i)
err += powf(nml.dot(pts[i] - p0), 2);
err = sqrtf(err / pts.size());
return err;
}
As with any least-squares approach, you proceed like this:
Before you start coding
Write down an equation for a plane in some parameterization, say 0 = ax + by + z + d in thee parameters (a, b, d).
Find an expression D(\vec{v};a, b, d) for the distance from an arbitrary point \vec{v}.
Write down the sum S = \sigma_i=0,n D^2(\vec{x}_i), and simplify until it is expressed in terms of simple sums of the components of v like \sigma v_x, \sigma v_y^2, \sigma v_x*v_z ...
Write down the per parameter minimization expressions dS/dx_0 = 0, dS/dy_0 = 0 ... which gives you a set of three equations in three parameters and the sums from the previous step.
Solve this set of equations for the parameters.
(or for simple cases, just look up the form). Using a symbolic algebra package (like Mathematica) could make you life much easier.
The coding
Write code to form the needed sums and find the parameters from the last set above.
Alternatives
Note that if you actually had only three points, you'd be better just finding the plane that goes through them.
Also, if the analytic solution in unfeasible (not the case for a plane, but possible in general) you can do steps 1 and 2, and use a Monte Carlo minimizer on the sum in step 3.
CGAL::linear_least_squares_fitting_3
Function linear_least_squares_fitting_3 computes the best fitting 3D
line or plane (in the least squares sense) of a set of 3D objects such
as points, segments, triangles, spheres, balls, cuboids or tetrahedra.
http://www.cgal.org/Manual/latest/doc_html/cgal_manual/Principal_component_analysis_ref/Function_linear_least_squares_fitting_3.html
It sounds like all you want to do is linear regression with 2 regressors. The wikipedia page on the subject should tell you all you need to know and then some.
All you'll have to do is to solve the system of equations.
If those are your points:
(1, 2, 3), (4, 5, 6), (7, 8, 9)
That gives you the equations:
3=a*1 + b*2 + c
6=a*4 + b*5 + c
9=a*7 + b*8 + c
So your question actually should be: How do I solve a system of equations?
Therefore I recommend reading this SO question.
If I've misunderstood your question let us know.
EDIT:
Ignore my answer as you probably meant something else.
We first present a linear least-squares plane fitting method that minimizes the residuals between the estimated normal vector and provided points.
Recall that the equation for a plane passing through origin is Ax + By + Cz = 0, where (x, y, z) can be any point on the plane and (A, B, C) is the normal vector perpendicular to this plane.
The equation for a general plane (that may or may not pass through origin) is Ax + By + Cz + D = 0, where the additional coefficient D represents how far the plane is away from the origin, along the direction of the normal vector of the plane. [Note that in this equation (A, B, C) forms a unit normal vector.]
Now, we can apply a trick here and fit the plane using only provided point coordinates. Divide both sides by D and rearrange this term to the right-hand side. This leads to A/D x + B/D y + C/D z = -1. [Note that in this equation (A/D, B/D, C/D) forms a normal vector with length 1/D.]
We can set up a system of linear equations accordingly, and then solve it by an Eigen solver in C++ as follows.
// Example for 5 points
Eigen::Matrix<double, 5, 3> matA; // row: 5 points; column: xyz coordinates
Eigen::Matrix<double, 5, 1> matB = -1 * Eigen::Matrix<double, 5, 1>::Ones();
// Find the plane normal
Eigen::Vector3d normal = matA.colPivHouseholderQr().solve(matB);
// Check if the fitting is healthy
double D = 1 / normal.norm();
normal.normalize(); // normal is a unit vector from now on
bool planeValid = true;
for (int i = 0; i < 5; ++i) { // compare Ax + By + Cz + D with 0.2 (ideally Ax + By + Cz + D = 0)
if ( fabs( normal(0)*matA(i, 0) + normal(1)*matA(i, 1) + normal(2)*matA(i, 2) + D) > 0.2) {
planeValid = false; // 0.2 is an experimental threshold; can be tuned
break;
}
}
We then discuss its equivalence to the typical SVD-based method and their comparison.
The aforementioned linear least-squares (LLS) method fits the general plane equation Ax + By + Cz + D = 0, whereas the SVD-based method replaces D with D = - (Ax0 + By0 + Cz0) and fits the plane equation A(x-x0) + B(y-y0) + C(z-z0) = 0, where (x0, y0, z0) is the mean of all points that serves as the origin of the new local coordinate frame.
Comparison between two methods:
The LLS fitting method is much faster than the SVD-based method, and is suitable for use when points are known to be roughly in a plane shape.
The SVD-based method is more numerically stable when the plane is far away from origin, because the LLS method would require more digits after decimal to be stored and processed in such cases.
The LLS method can detect outliers by checking the dot product residual between each point and the estimated normal vector, whereas the SVD-based method can detect outliers by checking if the smallest eigenvalue of the covariance matrix is significantly smaller than the two larger eigenvalues (i.e. checking the shape of the covariance matrix).
We finally provide a test case in C++ and MATLAB.
// Test case in C++ (using LLS fitting method)
matA(0,0) = 5.4637; matA(0,1) = 10.3354; matA(0,2) = 2.7203;
matA(1,0) = 5.8038; matA(1,1) = 10.2393; matA(1,2) = 2.7354;
matA(2,0) = 5.8565; matA(2,1) = 10.2520; matA(2,2) = 2.3138;
matA(3,0) = 6.0405; matA(3,1) = 10.1836; matA(3,2) = 2.3218;
matA(4,0) = 5.5537; matA(4,1) = 10.3349; matA(4,2) = 1.8796;
// With this sample data, LLS fitting method can produce the following result
// fitted normal vector = (-0.0231143, -0.0838307, -0.00266429)
// unit normal vector = (-0.265682, -0.963574, -0.0306241)
// D = 11.4943
% Test case in MATLAB (using SVD-based method)
points = [5.4637 10.3354 2.7203;
5.8038 10.2393 2.7354;
5.8565 10.2520 2.3138;
6.0405 10.1836 2.3218;
5.5537 10.3349 1.8796]
covariance = cov(points)
[V, D] = eig(covariance)
normal = V(:, 1) % pick the eigenvector that corresponds to the smallest eigenvalue
% normal = (0.2655, 0.9636, 0.0306)

Resources