I am implementing the Gaussian distribution model on some data, if the sigma(covariance matrix) is singular, then it's not invertible, and will result in the failure in calculating the probability. I think add an Identity matrix to the sigma will make the sigma invertible, but that will make the model not fits the data.
Is there a way to make the sigma matrix invertible and keep the model fitting data?
Have a set of data: (x1, x2)_1 , (x1, x2)_2 , ... , (x1, x2)_i . where x1 and x2 are continues real numbers and some (x1, x2) can appear serval times, And I assumpt that those data follow Guassian distribution, and then can calculate the mean vector as (mean(x1), mean(x2)), and then calculate the covariance matrix as usual. And in some case the covariance matrix may be singular, I think add some random small shifts to it can make it nonsingular, but I don't know how to do it correctly so that the model can still fit data well.
You only need to model one dimension of the data with a 1D gaussian distribution in this case.
If you have two-dimensional data {(x1,x2)_i} whose covariance matrix is singular, this means that the data lies along a straight line. The {x2} data is a deterministic function of the {x1} data, so you only need to model the {x1} data randomly. The {x2} data follows immediately from {x1} and is no longer random once you know {x1}.
Here is my reasoning:
The covariance matrix would look something like this, since all covariance matrices are symmetric:
| a b |
| b c |
Where a = var(x1), c = var(x2), b = cov(x1,x2).
Now if this matrix is singular, the second column vector would have to be a scalar multiple of the first (since they are linearly dependent). Let's say the constant is k. Then:
b = k*c
a = k*b = k*k*c
So the covariance matrix really looks like:
| k*k*c k*c |
| k*c c |
Here there is only one parameter c = var(x2) which determines the distribution (since k can be anything), so the data in inherently one-dimension. Modelling it with one variable x1 is enough. Another way of seeing this is by checking that the Pearson Correlation Coefficient for this distribution is b/(sqrt(a)*sqrt(c)) = 1.
Related
I want to do scattered interpolation in Matlab, but scatteredInterpolant does not do quite what I want.
scatteredInterpolant allows me to provide a set of input sampling positions and the corresponding sample values. Then I can query the interpolated values by supplying a set of positions:
F = scatteredInterpolant(xpos, ypos, samplevals)
interpvals = F(xgrid, ygrid)
This is sort of the opposite of what I want. I already have a fixed set of sample positions, xpos/ypos, and output grid, xgrid/ygrid, and then I want to vary the sample values. The use case is that I have many quantities sampled at the same sampling positions, that should all be interpolated to the same output grid.
I have an idea how to do this for nearest neighbor and linear interpolation, but not for more general cases, in particular for natural neighbor interpolation.
This is what I want, in mock code:
G = myScatteredInterpolant(xpos, ypos, xgrid, ygrid, interp_method)
interpvals = G(samplevals)
In terms of what this means, I suppose G holds a (presumably sparse) matrix of weights, W, and then G(samplevals) basically does W * samplevals, where the weights in the matrix W depends on the input and output grid, as well as the interpolation method (nearest neighbor, linear, natural neighbor). Calculating the matrix W is probably much more expensive than evaluating the product W * samplevals, which is why I want this to be reused.
Is there any code in Matlab, or in a similar language that I could adapt, that does this? Can it somehow be extracted from scatteredInterpolant in reasonable processing time?
I have a 3D Cartesian grid data that needs to be used to create a 3D regular mesh for interpolation method. x,y & z are 3 vectors with data points that are used to form this grid. My question is, how can i efficiently give 2 index to these points say,
where c000 is indexed as 1 point (1,1,1), c100 is indexed as 2 for (2,1,1) for (x,y,z)
coordinate points and another index to identify the 8 points forming the cube. Say if I have a point C, I must retrieve the nearest 8 points for interpolation. so for points c000,c100,c110,c010,c001,c101,c111,c011 point index and cube index. Since the data available is huge, the focus is to use faster implementation. pls give me some hints on how to proceed.
About the maths:
Identifying the cube which a point p surrounds requires a mapping
U ⊂ ℝ+**3 -> ℕ:
p' (= p - O_) -> hash_r(p');
"O_" being located at (min_x(G),min_y(G),min_z(G)) of the Grid G.
Along each axis, the cube numbering is trivial.
Given a compound cube number
n_ = (s,t,u)
and N_x, N_y, N_z being the size of your X_, Y_, Z_, a suitable hash would be
hash_n(n_) = s
| t * 2**(floor(log_2(N_x))+1)
| u * 2**(floor(log_2(N_x)) + floor(log_2(N_y)) + 2).
To calculate e.g. "s" for a point C, take
s = floor((C[0] - O_)/ a)
"a" being the edge length of the cubes.
About taking that to C++
Given you have enough space to allocate
(2**(floor(log_2(max(N_x, N_y, N_z)))+1)**3
buckets, a std::unordered_map<hash_t,Cube> using that (perfect) hash would offer O(1) for finding the cube for a point p.
A lesser pompous std::map<hash_t,Cube> using a less based on that hash would offer O(log(N)) find complexity.
After multiplying a lot of rotation matrices, the end result might not be a valid rotation matrix any more, due to rounding issues (de-orthogonalized)
One way to re-orthogonalize is to follow these steps:
Convert the rotation matrix to an axis-angle representation (link)
Convert back the axis-angle to a rotation matrix (link)
Is there something in Eigen library that does the same thing by hiding all the details? Or is there any better recipe?
This procedure has to be handled with care due to special singularity cases, so if Eigen provides a better tool for this it would be great.
I don't use Eigen and didn't bother to look up the API but here is a simple, computationally cheap and stable procedure to re-orthogonalize the rotation matrix. This orthogonalization procedure is taken from Direction Cosine Matrix IMU: Theory by
William Premerlani and Paul Bizard; equations 19-21.
Let x, y and z be the row vectors of the (slightly messed-up) rotation matrix. Let error=dot(x,y) where dot() is the dot product. If the matrix was orthogonal, the dot product of x and y, that is, the error would be zero.
The error is spread across x and y equally: x_ort=x-(error/2)*y and y_ort=y-(error/2)*x. The third row z_ort=cross(x_ort, y_ort), which is, by definition orthogonal to x_ort and y_ort.
Now, you still need to normalize x_ort, y_ort and z_ort as these vectors are supposed to be unit vectors.
x_new = 0.5*(3-dot(x_ort,x_ort))*x_ort
y_new = 0.5*(3-dot(y_ort,y_ort))*y_ort
z_new = 0.5*(3-dot(z_ort,z_ort))*z_ort
That's all, were are done.
It should be pretty easy to implement this with the API provided by Eigen. You can easily come up with other orthoginalization procedures but I don't think it will make a noticable difference in practice. I used the above procedure in my motion tracking application and it worked beatifully; it's both stable and fast.
You can use a QR decomposition to systematically re-orthogonalize, where you replace the original matrix with the Q factor. In the library routines you have to check and correct, if necessary, by negating the corresponding column in Q, that the diagonal entries of R are positive (close to 1 if the original matrix was close to orthogonal).
The closest rotation matrix Q to a given matrix is obtained from the polar or QP decomposition, where P is a positive semi-definite symmetric matrix. The QP decomposition can be computed iteratively or using a SVD. If the latter has the factorization USV', then Q=UV'.
Singular Value Decomposition should be very robust. To quote from the reference:
Let M=UΣV be the singular value decomposition of M, then R=UV.
For your matrix, the singular-values in Σ should be very close to one. The matrix R is guaranteed to be orthogonal, which is the defining property of a rotation matrix. If there weren't any rounding errors in calculating your original rotation matrix, then R will be exactly the same as your M to within numerical precision.
In the meantime:
#include <Eigen/Geometry>
Eigen::Matrix3d mmm;
Eigen::Matrix3d rrr;
rrr << 0.882966, -0.321461, 0.342102,
0.431433, 0.842929, -0.321461,
-0.185031, 0.431433, 0.882966;
// replace this with any rotation matrix
mmm = rrr;
Eigen::AngleAxisd aa(rrr); // RotationMatrix to AxisAngle
rrr = aa.toRotationMatrix(); // AxisAngle to RotationMatrix
std::cout << mmm << std::endl << std::endl;
std::cout << rrr << std::endl << std::endl;
std::cout << rrr-mmm << std::endl << std::endl;
Which is nice news, because I can get rid of my custom method and have one headache less (how can one be sure that he takes care of all singularities?),
but I really want your opinion on better/alternative ways :)
An alternative is to use Eigen::Quaternion to represent your rotation. This is much easier to normalize, and rotation*rotation products are generally faster. If you have a lot of rotation*vector products (with the same matrix), you should locally convert the quaternion to a 3x3 matrix.
M is the matrix we want to orthonormalize, and R is the rotation matrix closest to M.
Analytic Solution
Matrix R = M*inverse(sqrt(transpose(M)*M));
Iterative Solution
// To re-orthogonalize matrix M, repeat:
M = 0.5f*(inverse(transpose(M)) + M);
// until M converges
M converges to R, the nearest rotation matrix. The number of digits of accuracy will approximately double with each iteration.
Check whether the sum of the squares of the elements of (M - M^-T)/2 is less than the square of your error goal to know when (M + M^-T)/2 meets your accuracy threshold. M^-T is the inverse transpose of M.
Why It Works
We want to find the rotation matrix R which is closest to M. We will define the error as the sum of squared differences of the elements. That is, minimize trace((M - R)^T (M - R)).
The analytic solution is R = M (M^T M)^-(1/2), outlined here.
The problem is that this requires finding the square root of M^T M. However, if we notice, there are many matrices whose closest rotation matrix is R. One of these is M (M^T M)^-1, which simplifies to M^-T, the inverse transpose. The nice thing is that M and M^-T are on opposite sides of R, (intuitively like a and 1/a are on opposite side of 1).
We recognize that the average, (M + M^-T)/2 will be closer to R than M, and because it is a linear combination, will also maintain R as the closest rotation matrix. Doing this recursively, we converge to R.
Worst Case Convergence (Speculative)
Because it is related to the Babylonian square root algorithm, it converges quadratically.
The exact worst case error after one iteration of a matrix M and error e is nextE:
nextE = e^2/(2 e + 2)
e = sqrt(trace((M - R)^T (M - R)))
R = M (M^T M)^-(1/2)
This problem has been bothering me for several days, hence I decided to ask you for help.
I am reading the book "Quaternions and Rotation Sequence" written by Jack B. Kuipers. In section 6.4, the author derives a formula of a composite rotation quaternion. One of the steps of this derivation is difficult for me to understand.
I would like to briefly describe the derivation process as follow:
Consider a tracking problem as in this picture.
(I am sorry I have to use links instead of posting pictures directly because this is the first time I post a question here so I am not eligible to do so yet)
In the picture, XYZ is a global, reference frame. 2 successive rotations are performed:
The first one is a rotation about the Z axis through an angle alpha, transforming frame XYZ into a new frame x1y1z1.
The second one is a rotation about the y1 axis through an angle beta, transforming frame x1y1z1 into a new frame x2y2z2.
The goal is to find a single composite rotation quaternion which is equivalent to the two rotations above.
The author does this as follow. The first rotation can be represented by the following quaternion p:
p = cos(alpha/2) + k*sin(alpha/2) (1)
In this formula, k is a standard basis vector (we have vectors i, j, k in R3 corresponding to the axes x, y, z respectively).
The second rotation can be represented by the following quaternion q:
q = cos(beta/2) + j*sin(beta/2) (2)
The composite quaternion we are looking for is the product of these 2 quaternions: qp. The formula of this product is in this picture.
In order to derive this final formula, the author uses 2 assumptions about the standard basis vectors i, j, k, which are: k.j = 0 and k x j = -i. And this is where I dont understand.
We all know that, for a set of 3 mutually orthogonal vectors i, j, k, these 2 assumptions above are correct. However, vector k in (1) and vector j in (2) don't belong to the same coordinate frame. In other words, k in (1) corresponds to Z in frame XYZ, and j in (2) corresponds to y1 in x1y1z1. And these are 2 different, distinguish frames, so I think the second assumption used by the author is incorrect.
What do you think about this? Any answer would be appreciated. Thank you.
author uses 2 assumptions about the standard basis vectors i, j, k...
it is not assumption!
You not understand cross product and dot product see
http://en.wikipedia.org/wiki/Cross_product
http://en.wikipedia.org/wiki/Dot_product
3 mutually orthogonal vectors i, j, k
orthogonal vectors.... What is it(definition)?
Dot_product... What is it(definition)?
Can we define orthogonal vectors via Dot_product?
You must learn a basic of Linear algebra and Complex analysis before understand quaternion.
I have two sets of 3D points (original and reconstructed) and correspondence information about pairs - which point from one set represents the second one. I need to find 3D translation and scaling factor which transforms reconstruct set so the sum of square distances would be least (rotation would be nice too, but points are rotated similarly, so this is not main priority and might be omitted in sake of simplicity and speed). And so my question is - is this solved and available somewhere on the Internet? Personally, I would use least square method, but I don't have much time (and although I'm somewhat good at math, I don't use it often, so it would be better for me to avoid it), so I would like to use other's solution if it exists. I prefer solution in C++, for example using OpenCV, but algorithm alone is good enough.
If there is no such solution, I will calculate it by myself, I don't want to bother you so much.
SOLUTION: (from your answers)
For me it's Kabsch alhorithm;
Base info: http://en.wikipedia.org/wiki/Kabsch_algorithm
General solution: http://nghiaho.com/?page_id=671
STILL NOT SOLVED:
I also need scale. Scale values from SVD are not understandable for me; when I need scale about 1-4 for all axises (estimated by me), SVD scale is about [2000, 200, 20], which is not helping at all.
Since you are already using Kabsch algorithm, just have a look at Umeyama's paper which extends it to get scale. All you need to do is to get the standard deviation of your points and calculate scale as:
(1/sigma^2)*trace(D*S)
where D is the diagonal matrix in SVD decomposition in the rotation estimation and S is either identity matrix or [1 1 -1] diagonal matrix, depending on the sign of determinant of UV (which Kabsch uses to correct reflections into proper rotations). So if you have [2000, 200, 20], multiply the last element by +-1 (depending on the sign of determinant of UV), sum them and divide by the standard deviation of your points to get scale.
You can recycle the following code, which is using the Eigen library:
typedef Eigen::Matrix<double, 3, 1, Eigen::DontAlign> Vector3d_U; // microsoft's 32-bit compiler can't put Eigen::Vector3d inside a std::vector. for other compilers or for 64-bit, feel free to replace this by Eigen::Vector3d
/**
* #brief rigidly aligns two sets of poses
*
* This calculates such a relative pose <tt>R, t</tt>, such that:
*
* #code
* _TyVector v_pose = R * r_vertices[i] + t;
* double f_error = (r_tar_vertices[i] - v_pose).squaredNorm();
* #endcode
*
* The sum of squared errors in <tt>f_error</tt> for each <tt>i</tt> is minimized.
*
* #param[in] r_vertices is a set of vertices to be aligned
* #param[in] r_tar_vertices is a set of vertices to align to
*
* #return Returns a relative pose that rigidly aligns the two given sets of poses.
*
* #note This requires the two sets of poses to have the corresponding vertices stored under the same index.
*/
static std::pair<Eigen::Matrix3d, Eigen::Vector3d> t_Align_Points(
const std::vector<Vector3d_U> &r_vertices, const std::vector<Vector3d_U> &r_tar_vertices)
{
_ASSERTE(r_tar_vertices.size() == r_vertices.size());
const size_t n = r_vertices.size();
Eigen::Vector3d v_center_tar3 = Eigen::Vector3d::Zero(), v_center3 = Eigen::Vector3d::Zero();
for(size_t i = 0; i < n; ++ i) {
v_center_tar3 += r_tar_vertices[i];
v_center3 += r_vertices[i];
}
v_center_tar3 /= double(n);
v_center3 /= double(n);
// calculate centers of positions, potentially extend to 3D
double f_sd2_tar = 0, f_sd2 = 0; // only one of those is really needed
Eigen::Matrix3d t_cov = Eigen::Matrix3d::Zero();
for(size_t i = 0; i < n; ++ i) {
Eigen::Vector3d v_vert_i_tar = r_tar_vertices[i] - v_center_tar3;
Eigen::Vector3d v_vert_i = r_vertices[i] - v_center3;
// get both vertices
f_sd2 += v_vert_i.squaredNorm();
f_sd2_tar += v_vert_i_tar.squaredNorm();
// accumulate squared standard deviation (only one of those is really needed)
t_cov.noalias() += v_vert_i * v_vert_i_tar.transpose();
// accumulate covariance
}
// calculate the covariance matrix
Eigen::JacobiSVD<Eigen::Matrix3d> svd(t_cov, Eigen::ComputeFullU | Eigen::ComputeFullV);
// calculate the SVD
Eigen::Matrix3d R = svd.matrixV() * svd.matrixU().transpose();
// compute the rotation
double f_det = R.determinant();
Eigen::Vector3d e(1, 1, (f_det < 0)? -1 : 1);
// calculate determinant of V*U^T to disambiguate rotation sign
if(f_det < 0)
R.noalias() = svd.matrixV() * e.asDiagonal() * svd.matrixU().transpose();
// recompute the rotation part if the determinant was negative
R = Eigen::Quaterniond(R).normalized().toRotationMatrix();
// renormalize the rotation (not needed but gives slightly more orthogonal transformations)
double f_scale = svd.singularValues().dot(e) / f_sd2_tar;
double f_inv_scale = svd.singularValues().dot(e) / f_sd2; // only one of those is needed
// calculate the scale
R *= f_inv_scale;
// apply scale
Eigen::Vector3d t = v_center_tar3 - (R * v_center3); // R needs to contain scale here, otherwise the translation is wrong
// want to align center with ground truth
return std::make_pair(R, t); // or put it in a single 4x4 matrix if you like
}
For 3D points the problem is known as the Absolute Orientation problem. A c++ implementation is available from Eigen http://eigen.tuxfamily.org/dox/group__Geometry__Module.html#gab3f5a82a24490b936f8694cf8fef8e60 and paper http://web.stanford.edu/class/cs273/refs/umeyama.pdf
you can use it via opencv by converting the matrices to eigen with cv::cv2eigen() calls.
Start with translation of both sets of points. So that their centroid coincides with the origin of the coordinate system. Translation vector is just the difference between these centroids.
Now we have two sets of coordinates represented as matrices P and Q. One set of points may be obtained from other one by applying some linear operator (which performs both scaling and rotation). This operator is represented by 3x3 matrix X:
P * X = Q
To find proper scale/rotation we just need to solve this matrix equation, find X, then decompose it into several matrices, each representing some scaling or rotation.
A simple (but probably not numerically stable) way to solve it is to multiply both parts of the equation to the transposed matrix P (to get rid of non-square matrices), then multiply both parts of the equation to the inverted PT * P:
PT * P * X = PT * Q
X = (PT * P)-1 * PT * Q
Applying Singular value decomposition to matrix X gives two rotation matrices and a matrix with scale factors:
X = U * S * V
Here S is a diagonal matrix with scale factors (one scale for each coordinate), U and V are rotation matrices, one properly rotates the points so that they may be scaled along the coordinate axes, other one rotates them once more to align their orientation to second set of points.
Example (2D points are used for simplicity):
P = 1 2 Q = 7.5391 4.3455
2 3 12.9796 5.8897
-2 1 -4.5847 5.3159
-1 -6 -15.9340 -15.5511
After solving the equation:
X = 3.3417 -1.2573
2.0987 2.8014
After SVD decomposition:
U = -0.7317 -0.6816
-0.6816 0.7317
S = 4 0
0 3
V = -0.9689 -0.2474
-0.2474 0.9689
Here SVD has properly reconstructed all manipulations I performed on matrix P to get matrix Q: rotate by the angle 0.75, scale X axis by 4, scale Y axis by 3, rotate by the angle -0.25.
If sets of points are scaled uniformly (scale factor is equal by each axis), this procedure may be significantly simplified.
Just use Kabsch algorithm to get translation/rotation values. Then perform these translation and rotation (centroids should coincide with the origin of the coordinate system). Then for each pair of points (and for each coordinate) estimate Linear regression. Linear regression coefficient is exactly the scale factor.
A good explanation Finding optimal rotation and translation between corresponding 3D points
The code is in matlab but it's trivial to convert to opengl using the cv::SVD function
You might want to try ICP (Iterative closest point).
Given two sets of 3d points, it will tell you the transformation (rotation + translation) to go from the first set to the second one.
If you're interested in a c++ lightweight implementation, try libicp.
Good luck!
The general transformation, as well the scale can be retrieved via Procrustes Analysis. It works by superimposing the objects on top of each other and tries to estimate the transformation from that setting. It has been used in the context of ICP, many times. In fact, your preference, Kabash algorithm is a special case of this.
Moreover, Horn's alignment algorithm (based on quaternions) also finds a very good solution, while being quite efficient. A Matlab implementation is also available.
Scale can be inferred without SVD, if your points are uniformly scaled in all directions (I could not make sense of SVD-s scale matrix either). Here is how I solved the same problem:
Measure distances of each point to other points in the point cloud to get a 2d table of distances, where entry at (i,j) is norm(point_i-point_j). Do the same thing for the other point cloud, so you get two tables -- one for original and the other for reconstructed points.
Divide all values in one table by the corresponding values in the other table. Because the points correspond to each other, the distances do too. Ideally, the resulting table has all values being equal to each other, and this is the scale.
The median value of the divisions should be pretty close to the scale you are looking for. The mean value is also close, but I chose median just to exclude outliers.
Now you can use the scale value to scale all the reconstructed points and then proceed to estimating the rotation.
Tip: If there are too many points in the point clouds to find distances between all of them, then a smaller subset of distances will work, too, as long as it is the same subset for both point clouds. Ideally, just one distance pair would work if there is no measurement noise, e.g when one point cloud is directly derived from the other by just rotating it.
you can also use ScaleRatio ICP proposed by BaoweiLin
The code can be found in github