Computing the homography matrix

I have a poster in the world image, and I have to replace the poster with my own image.
Let the poster in the world image have corner points A, B, C, D. My own image's corner coordinates are a, b, c, d.
The idea is to compute a homography matrix H such that a = HA, b = HB, c = HC, d = HD. After that I apply H^-1 to my image to transform it onto the poster in the world image. I have seen this in a couple of books and lectures.
Why can't I compute a homography H' such that A = H'a, B = H'b, and so on? Why find H and then its inverse instead of finding H' directly? Is there some problem with that?

Actually there isn't any difference between finding H and inverting it, and directly finding H^-1. Usually it is better to avoid the inversion, both to speed up the algorithm and to reduce the accumulation of numerical errors.
I can't think of any 'real' reason to use the inversion the way you described. Maybe the authors did it for the sake of clarity, or to follow some notation convention.
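To make that concrete, here is a minimal sketch (Python with NumPy and OpenCV; the point values are made up for illustration) showing that estimating H' directly, by swapping the roles of the two point sets, gives the same matrix as inverting H, up to the usual scale ambiguity:

import numpy as np
import cv2

# Four corners of the poster in the world image (A, B, C, D) ...
world_pts = np.float32([[120, 80], [520, 95], [510, 400], [130, 390]])
# ... and the corresponding corners of my own image (a, b, c, d).
own_pts = np.float32([[0, 0], [640, 0], [640, 480], [0, 480]])

H, _ = cv2.findHomography(world_pts, own_pts)        # a = H A
H_prime, _ = cv2.findHomography(own_pts, world_pts)  # A = H' a

# Homographies are defined only up to scale, so normalize before comparing.
Hinv = np.linalg.inv(H)
print(np.allclose(Hinv / Hinv[2, 2], H_prime / H_prime[2, 2], atol=1e-4))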
Also, when you warp an image using a homography, it is usually inverted under the hood. Instead of calculating the 'new' coordinates for each point in the source image and writing the source colors there, you iterate over the pixels of the destination image and look up where each color value should be taken from in the source image. This way you avoid writing to the same pixel several times (when downscaling) or producing a sparse image full of holes (when upscaling).
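A minimal sketch of that backward mapping in Python/NumPy (names are illustrative; H_dst_to_src is assumed to map destination pixels into the source image). Real warpers such as cv2.warpPerspective vectorize this and interpolate instead of rounding:

import numpy as np

def warp_backward(src, H_dst_to_src, out_shape):
    h, w = out_shape
    dst = np.zeros((h, w, 3), dtype=src.dtype)
    for y in range(h):
        for x in range(w):
            # Map the destination pixel back into the source image.
            p = H_dst_to_src @ np.array([x, y, 1.0])
            sx, sy = p[0] / p[2], p[1] / p[2]
            sxi, syi = int(round(sx)), int(round(sy))  # nearest neighbour
            if 0 <= sxi < src.shape[1] and 0 <= syi < src.shape[0]:
                dst[y, x] = src[syi, sxi]
    return dst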
If you post the exact sources where you have seen this unnecessary inversion, we can give you more helpful comments.

Related

Pose from essential matrix, the order of transform

From what I've understood, the transform order (rotate first or translate first) yields a different [R|t].
So I want to know the order of the 4 possible poses you get from the essential matrix SVD.
I've read a code implementation of pose from essential matrix (Hartley and Zisserman's Multiple View Geometry, page 259). The author seems to interpret it as rotate first, then translate, since he retrieves the camera position using p = -R^T * t.
Also, OpenCV seems to use the translate-first-then-rotate rule, because the t vector I get from calibrating the camera is the position of the camera.
Or maybe I have been wrong and the order doesn't matter?
You shouldn't use SVD to decompose a transformation into rotation and translation components. Viewed as x' = M*x = T*R*x, the translation is just the fourth column of M, and the rotation is in the upper-left 3x3 submatrix.
If you feed the whole 4x4 matrix into SVD, I'm not sure what you'll get out, but it won't be useful. (If nothing else, U and V will not be affine.)
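A small sketch of what this answer describes, assuming a 4x4 homogeneous matrix M acting as x' = M*x = T*R*x (Python/NumPy; the camera-position formula is the p = -R^T * t from the question):

import numpy as np

def decompose_rigid(M):
    R = M[:3, :3]    # rotation: the upper-left 3x3 block
    t = M[:3, 3]     # translation: the fourth column
    return R, t

# Camera position in world coordinates:
# R, t = decompose_rigid(M);  p = -R.T @ t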

How can I sort a coordinate matrix based on the distance between points in another coordinate matrix in matlab?

I am using MATLAB's built-in function procrustes to find the rotation, translation, and scale between two images. I am only using the coordinates of the brightest points in the image and rotating these coordinates about the center of the image. procrustes compares two matrices and gives you the rotation, translation, and scale, but it only works correctly if the rows of the two matrices are in corresponding order.
I am given an image and a separate comparison coordinate matrix. The end goal is to find how much the image has been rotated, translated, and scaled relative to the coordinate matrix. I can use procrustes for this, but I first need to order the coordinates found in the image to match the order of the comparison coordinate matrix. My thought was to compare the distances between every possible combination of points in the coordinate matrix against the coordinates found in the picture, but I don't know how to write this code, because with n coordinates there are n! possible orderings.
Just searching for the shortest distance is not so hard.
A = rand(1E4,2);                 % first point set
B = rand(1E4,2);                 % second point set
tic
idx = nan(1,1E4);                % idx(ct) = index in B of the point matched to A(ct,:)
for ct = 1:size(A,1)
    d = sum((A(ct,:)-B).^2,2);   % squared distances from A(ct,:) to every point in B
    [~, idx(ct)] = min(d);       % nearest neighbour (find(d==min(d)) could return several indices)
end
toc
plot(A(1:10,1),A(1:10,2),'.r',B(idx(1:10),1),B(idx(1:10),2),'.b')
This takes about half a second on my PC.
Problems can start when two points in set A are matched to the same point in set B. You can check whether all matches are unique with:
length(unique(idx))==length(idx)   % false if some point in B was matched more than once
This can be solved in several ways. The best (imho) is to assign each candidate match a probability based on the distance (usually something that decreases exponentially with distance), and solve for the most probable overall assignment.
A simpler (but more error-prone) method is to remove each matched point from set B as you go.
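If you want guaranteed one-to-one matches, another standard option (not the probabilistic approach sketched above, but closely related) is to solve the assignment problem on the pairwise distance matrix. A hedged sketch in Python/SciPy; MATLAB's matchpairs (R2019a+) should do the same job:

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

A = np.random.rand(100, 2)
B = np.random.rand(100, 2)

cost = cdist(A, B)                       # pairwise Euclidean distances
row, col = linear_sum_assignment(cost)   # minimizes the total matched distance
# A[row[k]] is matched to B[col[k]], and every point is used at most once.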

How to undo a rotation on a matrix

Please forgive the naive question; I don't remember linear algebra at all.
I use a matrix associated with an image to apply transformations to it. Now I want to recover how much the matrix has been translated and scaled. That's fine when no rotation has been applied, but a rotation confuses things a lot.
Say your new matrix is N = RTS, where R is a rotation, T is a translation, and S is a scaling. This means that, in order, you scale, translate, then rotate. If you want to see the scaling and translation, left-multiply by R's inverse, which is the same as R's transpose: R^T N = TS. With respect to your original view, you will see a stretched and translated matrix.
If instead N = TSR, you would have to right-multiply by R's inverse. Note: the two products obtained by these operations will not in general be the same!
Alternatively you can change coordinate systems, but this is more involved, since rotation and translation do not commute in general.
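A quick numeric check of the first case, as a sketch in Python/NumPy with 2D homogeneous matrices (all values made up): once R is peeled off, the scale sits on the diagonal of TS and the translation in its last column.

import numpy as np

theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])
T = np.array([[1, 0, 5.0],
              [0, 1, 2.0],
              [0, 0, 1.0]])
S = np.diag([2.0, 3.0, 1.0])

N = R @ T @ S
TS = R.T @ N                       # R is orthogonal, so R.T undoes the rotation
print(np.allclose(TS, T @ S))      # True
print(np.diag(TS)[:2], TS[:2, 2])  # scales (2, 3) and translation (5, 2)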

Procrustes analysis / Finding the Angle Between two images represented by two sets of 2d points

If I have two sets of points, I can use Procrustes analysis to rotate one set to align it with the other.
But suppose each of these two point sets is attached to an image, and I would like to rotate the images as well. Is there any way I can rotate the image, instead of rotating just the points? The tutorial there uses matrix products for the rotation (solve u, s, v = svd(p1' · p2) and then apply p2 · v · u', where p' is p transposed).
However, that doesn't tell me what the angle between the images is.
The Wikipedia page calculates an angle between each pair of points, I think.
Maybe what I'm asking is impossible? If I rotate the first set of points to align it with the second, can't I also rotate the corresponding image by the same angle? The point being: which angle is that?
I noticed that v · u' gives me a 2 x 2 matrix which seems to be the rotation matrix (there's a Wikipedia page, but I can't link to it due to posting privileges). I took the sin and cos from the third and first entries and used arctan2, but the results I'm getting are kind of weird. I know the angle has to be converted from radians, but I'm not convinced what I'm doing is right. Trying the rotation it gives me in gimp doesn't produce what I want, but I'll test some more.
It seems like your approach is mostly correct. Two things come to mind:
1) The paper you linked to (Procrustes analysis) involves a translation and a scaling in addition to the rotation. Depending on whether or not those operations are also performed on your images, you may end up with strange results that don't appear to match.
2) I think you may be overcomplicating your angle calculation. v * u' appears to be the correct rotation matrix, and a single entry of that 2x2 matrix is enough: for instance, acos() of the first entry gives the magnitude of the angle (your arctan2 of the sin and cos entries additionally recovers its sign). As you've noticed, this will (depending on the program) give you an answer in radians, which you'll have to convert to degrees if you want to try out the rotation in gimp, etc.
Hope this helps!
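For reference, a hedged sketch of the whole angle computation (Python/NumPy, with both point sets assumed already centered; names are illustrative). arctan2 on two entries of the rotation matrix keeps the sign of the angle, which a single acos() cannot:

import numpy as np

def rotation_angle(p1, p2):
    """Angle (radians) of the rotation that best aligns p1 (Nx2) onto p2 (Nx2)."""
    u, s, vt = np.linalg.svd(p1.T @ p2)
    R = vt.T @ u.T                  # the 2x2 rotation from the Procrustes/SVD step
    if np.linalg.det(R) < 0:        # guard against getting a reflection instead
        vt[-1, :] *= -1
        R = vt.T @ u.T
    return np.arctan2(R[1, 0], R[0, 0])

# np.degrees(rotation_angle(p1, p2)) is the value to type into gimp etc.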

Is there any algorithm for determining 3d position in such case? (images below)

So first of all I have this image (and of course I have all the point coordinates in 2D, so I can regenerate the lines and check where they cross each other):
[first image (source: narod.ru)]
But I also have another image of the same lines (I know they are the same) with new coordinates for my points, as in this image:
[second image (source: narod.ru)]
So, having the point coordinates in the first image, how can I determine the plane rotation and Z depth in the second image (assuming the first one's center was at point (0,0,0) with no rotation)?
What you're trying to find is called a projection matrix. Determining a precise inverse projection usually requires firmly established coordinates in both the source and destination vectors, which the images above aren't going to give you. You can approximate using pixel positions, however.
This thread will give you a basic walkthrough of the techniques you need to use.
Let me say this up front: this problem is hard. There is a reason Dan Story's linked question has not been answered. Let me provide an explanation for people who want to take a stab at it. I hope I'm wrong about how hard it is, though.
I will assume that the 2D screen coordinates and the projection/perspective matrix are known to you. You need to know at least this much (if you don't know the projection matrix, you are essentially using a different camera to look at the world). Let's call each pair of 2D screen coordinates (a_i, b_i), and I will assume the projection matrix has the form
P = [ px  0   0   0  ]
    [ 0   py  0   0  ]
    [ 0   0   pz  pw ]
    [ 0   0   s   0  ],   s = +/-1
Almost any reasonable projection has this form. Working through the rendering pipeline, you find that
a_i = px x_i / (s z_i)
b_i = py y_i / (s z_i)
where (x_i, y_i, z_i) are the original 3D coordinates of the point.
Now, let's assume you know your shape in a set of canonical coordinates (whatever you want), so that the vertices are (x0_i, y0_i, z0_i). We can arrange these as the columns of a matrix C. The actual coordinates of the shape are a rigid transformation of these canonical coordinates. Let's similarly organize the actual coordinates as the columns of a matrix V. Then the two are related by
V = R C + v 1^T (*)
where 1^T is a row vector of ones with the right length, R is an orthogonal rotation matrix of the rigid transformation, and v is the offset vector of the transformation.
Now, the projection equations above give you an expression for each column of V: the first column is { s a_1 z_1 / px, s b_1 z_1 / py, z_1 }, and so on.
You must solve the set of equations (*) for the scalars z_i and for the rigid transformation defined by R and v.
Difficulties
The equation is nonlinear in the unknowns, involving quotients of the entries of R and the z_i.
We have assumed up to now that you know which 2D coordinates correspond to which vertices of the original shape (if your shape is a square, this is slightly less of a problem).
We have assumed there is a solution at all; if there are errors in the 2D data, it's hard to say how well equation (*) will be satisfied; the recovered transformation may be nonrigid or nonlinear.
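In practice this is known as the Perspective-n-Point (PnP) problem, and mature solvers exist for it. A hedged sketch with OpenCV's solver, assuming the camera intrinsics are known and the 2D-3D correspondences are already ordered (all numbers below are placeholders):

import numpy as np
import cv2

# Canonical 3D shape (the columns of C above): a unit square in its own frame.
object_pts = np.float32([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]])
# Where those vertices landed in the image, in pixels (the (a_i, b_i)).
image_pts = np.float32([[320, 240], [480, 250], [470, 390], [310, 380]])
# Pinhole intrinsics (focal lengths and principal point); placeholder values.
K = np.float32([[800, 0, 320], [0, 800, 240], [0, 0, 1]])

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None)
R, _ = cv2.Rodrigues(rvec)   # R and tvec play the roles of R and v in (*)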
It's called (digital) photogrammetry. Start Googling.
If you are really interested in this kind of problem (they are common in computer vision, tracking objects with cameras, etc.), the following book contains a detailed treatment:
Ma, Soatto, Kosecka, Sastry, An Invitation to 3-D Vision, Springer 2004.
Beware: this is an advanced engineering text that uses many mathematical techniques. Skim through the sample chapters featured on the book's web page to get an idea.
