Essential matrix from the 8-point algorithm

In "Multiple View Geometry in Computer Vision" by R. Hartley and A. Zisserman, chapter 11 (on computing the fundamental matrix) says:
"11.7.3 The calibrated case
In the case of calibrated cameras normalized image coordinates may be used, and the essential matrix E computed instead of the fundamental matrix"
Does this mean that if I have the correct intrinsic matrices for both cameras (is that what "calibrated" means here?), I can compute the essential matrix directly with the 8-point algorithm, without computing the fundamental matrix first?
And can I then extract the matrices R and T from the computed essential matrix to reconstruct a 3D model?
Regards,
Artik

Short answer: yes. See also the longer explanation on Wikipedia.

From your correspondences, the 8-point algorithm gives you the fundamental matrix F.
From the relation E = K'^T F K, assuming that you know both K' and K (if both images were taken by the same camera, K' = K), you can compute E.
From E you get 4 possible camera pairs (P_0,P_0'), (P_1,P_1'), ..., (P_3,P_3'). Only one of these pairs satisfies the positive depth constraint (i.e. the 3D points lie in front of both cameras).
That pair will be your cameras.
Hope this helps!
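To make the steps above concrete, here is a minimal NumPy sketch of the linear 8-point estimate applied directly to normalized coordinates, which is what the quoted section 11.7.3 suggests. The function names are hypothetical, the convention x2^T E x1 = 0 is an assumption, and the caller is assumed to have already multiplied the pixel coordinates by K^-1. It does no outlier rejection and does not select among the 4 camera pairs.

```python
import numpy as np

def essential_from_normalized(x1, x2):
    """Linear 8-point estimate of E from N >= 8 correspondences.

    x1, x2: (N, 2) arrays of normalized image coordinates, i.e. pixel
    coordinates already multiplied by K^-1 (assumed done by the caller).
    Convention assumed here: x2h^T E x1h = 0.
    """
    x1h = np.column_stack([x1, np.ones(len(x1))])
    x2h = np.column_stack([x2, np.ones(len(x2))])
    # Each correspondence gives one linear equation in the 9 entries of E
    # (row-major): sum_ij x2h[i] * E[i, j] * x1h[j] = 0.
    A = np.einsum("ni,nj->nij", x2h, x1h).reshape(len(x1h), 9)
    # Least-squares null vector: right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Enforce the essential-matrix structure: singular values (1, 1, 0).
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

def essential_from_fundamental(F, K1, K2):
    """E = K2^T F K1, the relation E = K'^T F K with K' = K2, K = K1."""
    return K2.T @ F @ K1
```

The result is only defined up to scale and sign, so compare it to a reference after normalizing both.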

In general, a calibrated camera in visual odometry refers to a camera for which the intrinsic matrix is known.
In the case of a stereo visual odometry system, I typically take it to mean that the intrinsic matrix is known for both cameras; however, some of my co-workers take it to mean that the rotation and translation between the two cameras are known.
In practice, there is hardly any distinction between the two, as you can estimate the intrinsic matrix of a camera using various functions in MATLAB or OpenCV, and given the intrinsic matrices, you can determine the rotation and translation between the two cameras (the translation only up to scale).
Furthermore, the derivation of the fundamental matrix relies upon the essential matrix and the intrinsic matrices of the two cameras (the intrinsic matrix can be the same in the case of monocular visual odometry). This means that it is often the case that the essential matrix is estimated and the fundamental matrix is not.
For an explanation of how to get the rotation and translation from the essential matrix, I recommend first watching a YouTube video on Singular Value Decomposition (SVD) and then reading: https://www.researchgate.net/publication/220556161_Visual_Odometry_Tutorial.
Good luck with your studies, young scholar.
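As a rough illustration of the SVD step mentioned above, the sketch below lists the four (R, t) candidates that an essential matrix admits. The function name is hypothetical and the convention E = [t]x R is an assumption. Picking the one candidate that puts the points in front of both cameras would still need a triangulation step, which is left out here.

```python
import numpy as np

def decompose_essential(E):
    """Return the four (R, t) candidates for E = [t]x R, with |t| = 1."""
    U, _, Vt = np.linalg.svd(E)
    # E is only defined up to sign, so flip factors to get proper rotations.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]  # left null vector of E: translation direction, sign unknown
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

The translation comes out as a unit vector because the overall scale cannot be recovered from E alone.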

Related

Is there any use for the determinant of a 4x4 matrix in computer graphics?

In most graphics libraries I've seen, there's some function that returns the determinant from 3x3 and 4x4 matrices, but I have no idea when you'd actually need to use the determinant in 3D computer graphics.
What are some examples of using a determinant in 3D graphics programming?
Off the top of my head...
If the determinant is 0 then the matrix cannot be inverted, which can be useful to know.
If the determinant is negative, then objects transformed by the matrix will be reversed as if in a mirror (left-handedness becomes right-handedness and vice versa).
For 3x3 matrices, the volume of an object is multiplied by the absolute value of the determinant when it is transformed by the matrix. Knowing this could be useful for determining, for example, the level of detail / number of polygons to use when rendering an object.
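The three points above can be read off a single determinant. A small NumPy sketch follows; the function name and the tolerance `eps` are made up for illustration, and a sensible tolerance depends on the scale of your matrices.

```python
import numpy as np

def describe_linear_map(M, eps=1e-12):
    """Summarize what det(M) says about a square transform matrix M."""
    det = float(np.linalg.det(M))
    return {
        "invertible": abs(det) > eps,        # det == 0 means no inverse
        "flips_handedness": det < -eps,      # negative det mirrors objects
        "volume_scale": abs(det),            # factor applied to volumes
    }

scale = np.diag([2.0, 3.0, 4.0])    # stretches a unit cube to 2 x 3 x 4
mirror = np.diag([-1.0, 1.0, 1.0])  # reflects across the YZ plane
info_scale = describe_linear_map(scale)
info_mirror = describe_linear_map(mirror)
```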
In 3D vector graphics, 4x4 homogeneous transform matrices are used, and we need both the direct and the inverse matrices, which can be computed from (sub)determinants. But for orthogonal matrices there are faster and more accurate methods, like the full pseudo inverse matrix.
Many intersection tests use determinants (or can be converted to use them), especially for quadratic equations (ellipsoids, ...), for example: ray and ellipsoid intersection accuracy improvement.
As Matt Timmermans suggested, you can decide whether your matrix is invertible or left/right handed, which is useful for detecting errors in matrices (accuracy degradation) or for porting skeletons between formats or engines.
And I am sure there are a lot of other uses for it in vector math (IIRC IGES uses them for rotational surfaces, the cross product is a determinant, ...).
The incircle test is a key primitive for computing Voronoi diagrams and Delaunay triangulations. It is given by the sign of a 4x4 determinant.
(picture from https://www.cs.cmu.edu/~quake/robust.html)
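For reference, a plain floating-point version of that 4x4 determinant looks like the sketch below. It assumes a, b, c are given in counterclockwise order, and it is not the robust adaptive-precision predicate from the linked page, so results very close to zero should not be trusted.

```python
import numpy as np

def incircle(a, b, c, d):
    """Sign test: > 0 if d is inside the circle through a, b, c,
    < 0 if outside, about 0 if on it (a, b, c counterclockwise)."""
    rows = [[p[0], p[1], p[0] ** 2 + p[1] ** 2, 1.0] for p in (a, b, c, d)]
    return float(np.linalg.det(np.array(rows, dtype=float)))

# Three counterclockwise points on the unit circle.
a, b, c = (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)
inside = incircle(a, b, c, (0.0, 0.0))
outside = incircle(a, b, c, (0.0, -2.0))
on_circle = incircle(a, b, c, (0.0, -1.0))
```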

Uncertainty on pose estimate when minimizing measurement errors

Let's say I want to estimate the camera pose for a given image I and I have a set of measurements (e.g. 2D points ui and their associated 3D coordinates Pi) for which I want to minimize the error (e.g. the sum of squared reprojection errors).
My question is: how do I compute the uncertainty on my final pose estimate?
To make my question more concrete, consider an image I from which I extracted 2D points ui and matched them with 3D points Pi. Denote by Tw the camera pose for this image, which I will be estimating, and by piT the transformation mapping the 3D points to their projected 2D points.
My objective is to find the pose Tw that minimizes the sum of squared reprojection errors between each ui and the projection piT(Pi).
There exist several techniques to solve the corresponding non-linear least squares problem; suppose I use the Gauss-Newton algorithm, with Jr denoting the Jacobian of the residuals with respect to the pose parameters.
I read in several places that JrT.Jr could be considered an estimate of the covariance matrix for the pose estimate. Here is a list of more specific questions:
Can anyone explain why this is the case and/or point to a scientific document explaining this in detail?
Should I be using the value of Jr on the last iteration, or should the successive JrT.Jr be somehow combined?
Some people say that this actually is an optimistic estimate of the uncertainty, so what would be a better way to estimate the uncertainty?
Thanks a lot, any insight on this will be appreciated.
The full mathematical argument is rather involved, but in a nutshell it goes like this:
The product Jt * J of the transposed Jacobian matrix of the reprojection error at the optimum with the Jacobian itself is an approximation of the Hessian matrix of the least squares error. The approximation drops the second-derivative terms of the residuals and, like any quadratic model, ignores terms of order three and higher in the Taylor expansion of the error function at the optimum. See here (pp. 800-801) for proof.
The inverse of the Hessian matrix is an approximation of the covariance matrix of the estimated parameters in a neighborhood of the optimum (up to a scale factor given by the variance of the measurement noise), under a local linear approximation of the parameters-to-errors transformation (p. 814 of the reference above).
I do not know where the "optimistic" comment comes from. The main assumption underlying the approximation is that the behavior of the cost function (the reproj. error) in a small neighborhood of the optimum is approximately quadratic.
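A toy sketch of the recipe, assuming independent measurement noise of equal variance: run Gauss-Newton, then take the Jacobian at the final iterate only and use sigma^2 * (Jt * J)^-1 as the covariance estimate. The exponential model below is a made-up stand-in for the reprojection function, and all names are hypothetical; for a pose, the residual would be the stacked reprojection errors and the Jacobian would be taken with respect to the 6 pose parameters.

```python
import numpy as np

def gauss_newton_with_covariance(residual, jacobian, theta0, iterations=20):
    """Plain Gauss-Newton (no damping) plus a covariance estimate."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iterations):
        r, J = residual(theta), jacobian(theta)
        # Normal equations: (J^T J) step = -J^T r
        theta = theta - np.linalg.solve(J.T @ J, J.T @ r)
    r, J = residual(theta), jacobian(theta)
    # Noise variance estimated from the final residuals.
    sigma2 = float(r @ r) / (r.size - theta.size)
    covariance = sigma2 * np.linalg.inv(J.T @ J)
    return theta, covariance

# Toy data: y = a * exp(b * x) with a = 2, b = -1.5 and small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(-1.5 * x) + rng.normal(0.0, 0.01, x.size)

def residual(theta):
    a, b = theta
    return a * np.exp(b * x) - y

def jacobian(theta):
    a, b = theta
    e = np.exp(b * x)
    return np.column_stack([e, a * x * e])  # d r / d a, d r / d b

theta, cov = gauss_newton_with_covariance(residual, jacobian, [1.0, -1.0])
std = np.sqrt(np.diag(cov))  # one-sigma uncertainty per parameter
```

Only the last Jacobian is used because the approximation is about the shape of the cost function at the optimum; the earlier iterates are just the path taken to get there.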

Algorithm to check if a polygon is a projection of a polyhedron

I am trying to develop an algorithm that performs the following:
Given a 2D polygon and a 3D polyhedron, determine if the 2D polygon is a projection of the 3D polyhedron (a perspective projection to be precise) without knowing which transformation matrix we may have possibly used for the projection.
input
{2D Polygon}
{3D Polyhedron}
output
{bool} whether or not it's a perspective projection
I am not asking for code, but I would simply like to know if this is feasible in polynomial time.
Any help will be greatly appreciated.
A 3D to 2D perspective projection has 7 degrees of freedom (6 for the relative motion of the scene with respect to the camera, 1 for the focal length).
Select four vertices in the 2D projection and consider all possible correspondences with polyhedron vertices (there is a polynomial number of such associations). Then form a system of 7 equations in the 7 unknown parameters (unfortunately a nonlinear one; maybe the eighth equation can be useful to select among multiple solutions).
Knowing the parameters, you can check a solution by re-projecting the polyhedron and comparing to the polygon (with further search for correspondences with vertices and edges).
All of this will take polynomial time (quartic if I am right), if one admits that the solver takes bounded time (hence bounded precision).
If the focal length is known, then a better approach is possible. Indeed, with only 6 unknowns, you can find the projection parameters from the projection of just three points. This problem is known to have an analytical solution (actually up to 4 of them), as described at length in "New Algorithms for the Perspective-Three-Point Problem, GAO Xiaoshan & CHEN Hangfei, Vol.16 No.3 J. Comput. Sci. & Technol."
This should lead to an O(N³) exact procedure.
More generally speaking, you form putative correspondences between N pairs of points, solve the corresponding Perspective-N-Point problem, and validate the hypothesis by reprojecting the polyhedron and comparing it to the known projection.
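The validation step can be sketched as follows, assuming a pinhole model with intrinsics K and a hypothesized pose (R, t) that some P3P/PnP solver has already produced (the solver itself is not shown). Function names and the tolerance are made up. Note that this only checks that every polygon vertex coincides with some projected polyhedron vertex, which is a necessary condition; comparing edges or the silhouette would be needed on top.

```python
import numpy as np

def project(points, K, R, t):
    """Pinhole projection of (N, 3) points to (N, 2) pixel coordinates."""
    cam = points @ R.T + t   # world frame -> camera frame
    uvw = cam @ K.T          # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]

def vertices_match(polygon, polyhedron, K, R, t, tol=1e-6):
    """True if each polygon vertex lies on a projected polyhedron vertex."""
    proj = project(polyhedron, K, R, t)
    dist = np.linalg.norm(polygon[:, None, :] - proj[None, :, :], axis=2)
    return bool(np.all(dist.min(axis=1) < tol))

# Example: a cube seen head-on from 5 units away.
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                dtype=float)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
front_face = project(cube[cube[:, 2] < 0], K, R, t)  # nearest face
ok = vertices_match(front_face, cube, K, R, t)
bad = vertices_match(front_face + 3.0, cube, K, R, t)
```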
Just an idea for an algorithm:
Take a triangle of the projection made of three points next to each other not on the same line. Iterate through all corresponding triangles of the original. For all possible projections that solve the pair of triangles, check if the rest matches.
I must admit I am not sure right now if there could be infinite solutions for triangles (which would be hard to iterate)? If so, start with four points.
I think it is possible, but you have to do a fair amount of reverse engineering. One common way to draw a 3D object as a 2D sketch is an orthographic projection (the question asks about perspective projection, which uses a different matrix). The link shows you the transformation matrices you need to apply to transform a 3D point onto its 2D projection. Now, how do you go the opposite way? Inverse matrices with a mix of some inverse transformations (translation, scaling, rotation...)? I think this is a good lead to follow.

What are eigen values and expansions?

What are eigen values, vectors and expansions and as an algorithm designer how can I use them?
EDIT: I want to know how YOU have used it in your program so that I get an idea. Thanks.
they're used for a lot more than matrix algebra. examples include:
the asymptotic state distribution of a hidden markov model is given by the left eigenvector associated with the eigenvalue of unity from the state transition matrix.
one of the best & fastest methods of finding community structure in a network is to construct what's called the modularity matrix (which basically is how "surprising" is a connection between two nodes), and then the signs of the elements of the eigenvector associated with the largest eigenvalue tell you how to partition the network into two communities
in principal component analysis you essentially select the eigenvectors associated with the k largest eigenvalues from the n>=k dimensional covariance matrix of your data and project your data down to the k dimensional subspace. the use of the largest eigenvalues ensures that you're retaining the dimensions that are most significant to the data, since they are the ones that have the greatest variance.
many methods of image recognition (e.g. facial recognition) rely on building an eigenbasis from known data (a large set of faces) and seeing how difficult it is to reconstruct a target image using the eigenbasis -- if it's easy, then the target image is likely to be from the set the eigenbasis describes (i.e. eigenfaces easily reconstruct faces, but not cars).
if you're into scientific computing, the eigenvectors of a quantum hamiltonian are those states that are stable, in that if a system is in an eigenstate at time t1, then at time t2>t1, if it hasn't been disturbed, it will still be in that eigenstate. also, the eigenvector associated with the smallest eigenvalue of a hamiltonian is the ground state of a system.
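As a small illustration of the first item in the list above, here is a NumPy sketch that finds the asymptotic state distribution of a Markov chain from the left eigenvector with eigenvalue 1. It assumes a row-stochastic transition matrix (each row sums to 1) with a unique stationary distribution; the function name and the 2-state matrix are made up.

```python
import numpy as np

def stationary_distribution(P):
    """Left eigenvector of P for eigenvalue 1, normalized to sum to 1."""
    # Left eigenvectors of P are right eigenvectors of P transposed.
    eigenvalues, eigenvectors = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigenvalues - 1.0))
    pi = np.real(eigenvectors[:, k])
    return pi / pi.sum()

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi = stationary_distribution(P)  # satisfies pi @ P == pi
```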
Eigenvectors and their corresponding eigenvalues are mainly used to switch between different coordinate systems. This can simplify problems and computations enormously by moving the problem from one coordinate system to another.
This new coordinate system has the eigenvectors as its base vectors, i.e. they "span" this coordinate system. Since they can be normalized, and since the eigenvectors of a symmetric matrix are perpendicular to each other, the transformation matrix from the first coordinate system is then "orthonormal", that is, the eigenvectors have magnitude 1 and are perpendicular to each other.
In the transformed coordinate system, a linear operation A (matrix) is purely diagonal. See Spectral Theorem and Eigendecomposition for more information.
A quick implication is, for example, that you can take a general quadratic curve:
ax^2 + 2bxy + cy^2 + 2dx + 2fy + g = 0
and rewrite it as
AX^2 + BY^2 + C = 0
where X and Y are measured along the directions of the eigenvectors (after also shifting the origin to the center of the curve, for curves that have one).
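A quick numerical check of the quadratic part, as a sketch: the coefficients a, b, c form the symmetric matrix [[a, b], [b, c]], and its eigendecomposition gives A, B and the rotated axes. The function name and the example numbers are made up; the linear terms (d, f) and the shift of origin are not handled here.

```python
import numpy as np

def principal_axes(a, b, c):
    """Eigen-decompose the quadratic form a x^2 + 2 b x y + c y^2."""
    M = np.array([[a, b], [b, c]], dtype=float)
    # eigh is meant for symmetric matrices: real eigenvalues in ascending
    # order, orthonormal eigenvectors as the columns of the second result.
    return np.linalg.eigh(M)

(A, B), V = principal_axes(5.0, 2.0, 5.0)  # 5x^2 + 4xy + 5y^2
p = np.array([0.3, -1.2])                   # any point (x, y)
X, Y = V.T @ p                              # same point in eigenvector axes
lhs = 5.0 * p[0] ** 2 + 2 * 2.0 * p[0] * p[1] + 5.0 * p[1] ** 2
rhs = A * X ** 2 + B * Y ** 2               # no cross term left
```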
Cheers !
check out http://mathworld.wolfram.com/Eigenvalue.html
Using eigen values in algorithms will need you to be proficient with the math involved.
I'm absolutely the wrong person to be talking about math: I puke on it.
cheers, jrh.
Eigenvalues and eigenvectors are used in matrix computations such as finding the inverse matrix. So if you need to write math code, precomputing them can speed up some operations.
In short, you need them if you do matrix algebra, linear algebra etc.
Using the notation favored by physicists, if we have an operator H, then |x> is an eigenstate of H if and only if
H|x> = h|x>
where we call h the eigenvalue associated with the eigenvector |x> under H.
(Here the state of the system can be represented by a matrix, making this math isomorphic with all the other expressions already linked.)
Which brings us to the uses of these things once they have been discovered:
The full set of eigenvectors of a system under a given operator forms an orthogonal spanning set for the system. This set may be a basis if there is no degeneracy. This is very useful because it allows extremely compact expressions of arbitrary (non-eigen-) states of the system.

Find a similarity of two vector shapes

Looking for any information/algorithms relating to comparing vector graphics. E.g. say there are two point collections or vector files with two almost identical figures. I want to determine that the first figure is about 90% similar to the second one.
A common way to test for similarity is with image moments. Moments are intrinsically translationally invariant, and if the objects you compare might be scaled or rotated you can use moments that are invariant to these transformations, such as Hu moments.
Most of the programs I know of would require rasterized versions of the vector objects, but the moments could be calculated directly from the vector graphics using a Green's theorem approach. A more simplistic approach, which just identifies unique (unordered) vertex configurations, would be to convert the Hu moment integrals to sums over the vertices; in a physics analogy, this replaces the continuous object with equal point masses at each vertex.
There is a paper on a tool called VISTO that sorts vector graphics images (using moments, I think), which should certainly be useful for more details.
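A minimal sketch of the Green's theorem route for a single simple polygon, computing only the first two Hu invariants from second-order area moments. The function name is made up, and this is not how VISTO does it as far as I know. Curves, holes and self-intersections are not handled.

```python
import numpy as np

def polygon_hu_invariants(vertices):
    """First two Hu invariants of a simple polygon given as (N, 2) vertices."""
    v = np.asarray(vertices, dtype=float)
    x, y = v[:, 0], v[:, 1]
    xn, yn = np.roll(x, -1), np.roll(y, -1)  # next vertex along each edge
    cross = x * yn - xn * y
    # Raw area moments as sums over edges (Green's theorem).
    m = np.array([
        cross.sum() / 2.0,                                            # m00
        ((x + xn) * cross).sum() / 6.0,                               # m10
        ((y + yn) * cross).sum() / 6.0,                               # m01
        ((x * x + x * xn + xn * xn) * cross).sum() / 12.0,            # m20
        ((y * y + y * yn + yn * yn) * cross).sum() / 12.0,            # m02
        ((x * yn + 2 * x * y + 2 * xn * yn + xn * y) * cross).sum() / 24.0,  # m11
    ])
    if m[0] < 0:  # clockwise vertex order flips every sign
        m = -m
    area, m10, m01, m20, m02, m11 = m
    cx, cy = m10 / area, m01 / area
    # Central moments (translation invariant), then scale-normalized.
    eta20 = (m20 - area * cx * cx) / area ** 2
    eta02 = (m02 - area * cy * cy) / area ** 2
    eta11 = (m11 - area * cx * cy) / area ** 2
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4.0 * eta11 ** 2
    return phi1, phi2
```

Two shapes can then be compared by the distance between their invariant vectors. Turning that distance into a "90% similar" score is a separate design choice that this sketch does not make.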
You could search for fingerprint matching algorithms. Fingerprints are usually converted to a set of points with their relative location to each other, which makes it basically the same problem as yours.
You could transform it to a non-vector graphic and then apply standard image analysis techniques like SIFT points, etc.

Resources