In ThreeJS Object3D.applyMatrix4 is not continuous - three.js

I was transforming an object with a matrix A using Object3D.applyMatrix4 and I found that at one point it didn't preserve an eigenvector's direction.
So I tried animating the interpolation between the identity matrix I and A, and I found this:
How could the transformation be not continuous?

Linear interpolation of rotation matrices isn't mathematically sound. The vectors composing a rotation matrix need to be unit length, or at least stay a consistent length.
Imagine a clock with a hand at 12 and a hand at 6.
If you linearly interpolate the point at the tip of the 12 o'clock hand to the tip of the 6 o'clock hand, the point travels in a straight line from the top of the clock to the bottom.
To interpolate the rotation represented by a 4x4 matrix, you can convert the rotations of the matrices to quaternions, .slerp (spherical linear interpolate) between those quaternions, then convert back to a matrix.
And then linearly interpolate the object.position (although again, this assumes linear motion between keyframes).
Now in the case that the rotation is small, you can get away with linearly interpolating the matrix, but you will need to orthonormalize it at each step, to reshape it back into a matrix whose vectors have consistent lengths and are orthogonal to each other. That isn't that hard: you use a combination of dot products, multiplies and adds of the vectors forming the matrix rows (or columns, I forget) to orthonormalize the matrix. But it's more of a pain, and less accurate than just using quaternions and .slerp.
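For reference, a minimal sketch of that approach in three.js (assuming the target transform A is a Matrix4 containing only rotation and translation, as in the question):

import * as THREE from 'three';

const pos = new THREE.Vector3(), quat = new THREE.Quaternion(), scl = new THREE.Vector3();
A.decompose(pos, quat, scl);                        // split A into translation and rotation

function interpolateFromIdentity(t) {               // t in [0, 1]
  const q = new THREE.Quaternion().slerp(quat, t);  // slerp from identity toward A's rotation
  const p = new THREE.Vector3().lerp(pos, t);       // lerp the translation
  return new THREE.Matrix4().compose(p, q, new THREE.Vector3(1, 1, 1));
}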

#manthrax's answer pointed out the fundamental problem with interpolating a matrix linearly, which I wasn't aware of at the time, and he was right about that. But the real problem was that Object3D.applyMatrix4 wasn't the right function for explicitly setting the local matrix: it premultiplies the object's current matrix and then decomposes the result back into position, quaternion and scale. I tried setting the Object3D.matrix property directly instead, and it worked. The linear interpolation (although I shouldn't do that) became continuous.
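For anyone hitting the same thing, a minimal sketch of what worked (object is your Object3D and interpolatedMatrix is whatever Matrix4 you computed; the names are just for illustration):

object.matrixAutoUpdate = false;         // keep three.js from rebuilding .matrix from position/quaternion/scale
object.matrix.copy(interpolatedMatrix);  // explicitly set the local matrix
object.matrixWorldNeedsUpdate = true;    // so the world matrix picks up the change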

Related

Linear depth buffer

Many people use the usual perspective matrix, with a third row like this:
(0 0 (n+f)/(n-f) 2*n*f/(n-f))
But it has a problem with float precision near the far clipping plane, and the result is z-fighting.
What about using a linear transformation of z instead? Let's change the matrix's third row to this:
(0 0 -2/(f-n) (-f-n)/(f-n))
This transforms z linearly from [-n, -f] to [-1, 1]. Then we add this line in the vertex shader:
gl_Position.z *= gl_Position.w;
After the perspective divide, the z value will be restored.
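A quick numeric check of that claim (plain JavaScript, just the arithmetic): with the modified third row and the usual fourth row (0 0 -1 0) giving w_clip = -z_eye, multiplying z by w before the divide leaves an NDC z that is linear in eye-space z:

const n = 0.1, f = 100.0;

function linearNdcZ(zEye) {
  const zClip = (-2 / (f - n)) * zEye + (-f - n) / (f - n); // modified third row, w_eye = 1
  const wClip = -zEye;                                      // usual fourth row (0 0 -1 0)
  const zMul = zClip * wClip;                               // gl_Position.z *= gl_Position.w
  return zMul / wClip;                                      // perspective divide undoes the multiply
}

console.log(linearNdcZ(-n));           // -1
console.log(linearNdcZ(-f));           //  1
console.log(linearNdcZ(-(n + f) / 2)); //  0, exactly half way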
Why isn't this used everywhere? I found a lot of articles on the internet, and all of them use the usual matrix.
Does the linear transformation I describe have problems that I don't see?
Update: This is not a duplicate of this. My question is not about how to get a linear depth buffer; in my case, the buffer is already linear. I don't understand why this method isn't used. Are there traps in the inner WebGL pipeline?
The approach you're describing simply doesn't work. One advantage of a hyperbolic Z buffer is that we can interpolate the resulting depth values linearly in screen space. If you multiply gl_Position.z by gl_Position.w, the resulting z value will not be linear in screen space any more, but the depth test will still use linearly interpolated values. This results in your primitives becoming bent in the z-dimension, leading to completely wrong occlusions and intersections between nearby primitives (especially if the vertices of one primitive lie near the center of the other).
The only way to use a linear depth buffer is to actually do the non-linear interpolation for the Z value yourself in the fragment shader. This can be done (and boils down to just linearly transforming the perspective-corrected interpolated w value for each fragment, hence it is sometimes called "W buffering"), but you lose the benefits of the early Z test and - much worse - of the hierarchical depth test.
An interesting way to improve the precision of the depth test is to use a floating point buffer in combination with a reversed Z projection matrix, as explained in this Depth Precision Visualized blog article.
UPDATE
From your comment:
Depth in screen space is a linear interpolation of NDC, as I understand from here. In my case, it will be a linear interpolation of a linear interpolation of z from camera space. Thus, the depth in screen space is already interpolated.
You misunderstood this. My main point was that the linear interpolation in screen space is only valid if you're using Z values which are already hyperbolically distorted (like NDC Z). If you want to use eye-space Z, it cannot be linearly interpolated. I made some drawings of the situation:
This is a top-down view on eye-space and NDC. All drawings are actually to scale. The green ray is a view ray going through some pixel. This pixel happens to be the one which directly represents the mid-point of that one triangle (green point).
After the projection matrix is applied and the division by w has happened, we are in normalized device coordinates. Note that the direction of the viewing ray is now just +z, and all view rays of all pixels became parallel (so that we can just ignore Z when rasterizing). Due to the hyperbolic relation of the z value, the green point no longer lies exactly at the center, but is squeezed towards the far plane. However, the important point is that this point now lies on the straight line formed by the (hyperbolically distorted) end points of the primitive - hence we can simply interpolate z_ndc linearly in screen space.
If you use a linear depth buffer, the green point now lies at z in the center of the primitive again, but that point is not on the straight line - you actually bend your primitives.
Since the depth test will use a linear interpolation, it will just get the points as in the rightmost drawing as input from the vertex shader, but will interpolate them linearly - connecting those points by straight lines. As a result, the intersection between primitives will not be where it actually has to be.
Another way to think of this: imagine you draw some primitive which extends into the z-dimension, with some perspective projection. Due to perspective, stuff that is farther away will appear smaller. So if you just go one pixel to the right in screen space, the z extent covered by that step will actually be bigger if the primitive is far away, and it will become smaller and smaller the closer you get. So if you just go in equal-sized steps to the right, the z-steps you're making will vary depending on the orientation and position of your primitive. However, we want to use a linear interpolation, so we want to make the same z step size for every x step. The only way to do this is by distorting the space z is in - and the hyperbolic distortion introduced by the division by w does exactly that.
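A small numeric illustration of this (plain JavaScript, just the arithmetic, using a simplified projection where screen x = x_eye / -z_eye): take an edge with endpoints at different depths, find the eye-space point that actually projects to the screen-space midpoint, and compare it with a linear screen-space interpolation of the endpoint depths. The hyperbolic NDC z matches; eye-space z does not:

const n = 1, f = 100;
const a = (n + f) / (n - f), b = 2 * n * f / (n - f); // usual third row
const project = (x, z) => x / -z;                     // simplified screen x
const hypZ = z => (a * z + b) / -z;                   // hyperbolic NDC z

const p0 = { x: -1, z: -2 }, p1 = { x: 1, z: -20 };   // edge endpoints in eye space
const sMid = (project(p0.x, p0.z) + project(p1.x, p1.z)) / 2; // screen-space midpoint

// Eye-space point on the edge that actually projects to sMid.
const dx = p1.x - p0.x, dz = p1.z - p0.z;
const t = (-sMid * p0.z - p0.x) / (dx + sMid * dz);
const zTrue = p0.z + t * dz;

console.log(hypZ(zTrue), (hypZ(p0.z) + hypZ(p1.z)) / 2); // ~0.465 for both: the lerp is exact
console.log(zTrue, (p0.z + p1.z) / 2);                   // -3.64 vs -11: the lerp is wrong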
We don't use a linear transformation because that will have precision problems at all distances equally. At least now, the precision problems only show up far away, where you're less likely to notice. A linear mapping spaces the error out evenly, which makes errors more likely to happen close to the camera.

Algorithmic complexity of finding subset of 3D points in cube

Given an array of 3D integer points, what is the algorithmic complexity of determining which of those points lie within a cube? I'm assuming the points can be represented in a number of concurrent data structures, each sorted in one or more dimensions.
My intuition tells me that, given a sorted array of points in 1D, one can determine the subset of points between some lower and upper bound in something like O(log n), but I would be very grateful for any insights others can offer on this notion (and any help others can offer generalizing to the multidimensional case!).
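To make the 1D case concrete, here is a sketch of what I have in mind (plain JavaScript; lowerBound/upperBound are just illustrative helpers): two binary searches give the index range in O(log n), and listing the k points inside it is then O(k).

function lowerBound(a, x) {            // first index with a[i] >= x
  let lo = 0, hi = a.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (a[mid] < x) lo = mid + 1; else hi = mid;
  }
  return lo;
}
function upperBound(a, x) {            // first index with a[i] > x
  let lo = 0, hi = a.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (a[mid] <= x) lo = mid + 1; else hi = mid;
  }
  return lo;
}

const xs = [1, 3, 4, 7, 9, 12, 15];
const [lo, hi] = [lowerBound(xs, 3), upperBound(xs, 9)];
console.log(xs.slice(lo, hi));         // [3, 4, 7, 9] - range found in O(log n), listed in O(k)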
If you're unfamiliar with the math involved, I recommend doing this problem in two dimensions first, with a rectangle. That way, you can get familiar with the math, which is really just a bit of basic trigonometry. After that, stepping up to three dimensions isn't very difficult.
The problem is much simpler if the cube (or rectangle) is axis aligned, so you probably should do that first. For an example of determining the rotation you need, see How to calculate rotation angle from rectangle points?.
Once you've determined the rotation angle, you can translate the rectangle to the origin and rotate it by doing the first two steps in the accepted answer here: Drawing a Rotated Rectangle.
You now have an axis-aligned rectangle that's centered at the origin.
Finally, for each of your points:
Apply the same translation and rotation that you applied to the rectangle.
Test to see if the x and y coordinates in the resulting point are within the rectangle. This is a matter of, at most, four bounds checks.
If the point is in the rectangle, save it.
Once you've done this in two dimensions, you should be able to apply those concepts to three dimensions.
The algorithm is O(n), where n is the number of points.
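A minimal sketch of the per-point test, written with three.js (the library choice is just an assumption for illustration): bring each point into the cube's local frame and do three bounds checks, which is O(n) overall.

import * as THREE from 'three';

// cubeMatrix: the cube's world transform (rotation + translation); half: half its edge length.
function pointsInCube(points, cubeMatrix, half) {
  const toLocal = cubeMatrix.clone().invert();   // the same translation/rotation applied to each point
  const box = new THREE.Box3(
    new THREE.Vector3(-half, -half, -half),
    new THREE.Vector3( half,  half,  half)
  );
  return points.filter(p => box.containsPoint(p.clone().applyMatrix4(toLocal)));
}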

Average transformation matrix for a list of transformations

I have multiple estimates for a transformation matrix, from mapping two point clouds to each other via ICP (Iterative Closest Point).
How can I generate the average transformation matrix for all these matrices?
Each matrix consists of a rigid translation and a rotation only, no scale or skew.
Ideally I would also like to calculate a weighted average, but an unweighted one is fine for now.
Averaging the translation vectors is of course trivial, but the rotations are problematic. One approach I found is averaging the individual basis vectors of the rotations, but I am not sure that will result in a new orthonormal basis, and the approach seems a little ad hoc.
Splitting the transformation in translation and rotation is a good start. Averaging the translation is trivial.
Averaging the rotation is not that easy. Most approaches will use quaternions. So you need to transform the rotation matrix to a quaternion.
The easiest way to approximate the average is a linear blending, followed by renormalization of the quaternion:
q* = w1 * q1 + w2 * q2 + ... + wn * qn
normalize q*
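A minimal sketch of that approximation with three.js quaternions (blendQuaternions is my own helper, not a library function; the sign flip keeps all inputs in the same hemisphere, since q and -q represent the same rotation):

import * as THREE from 'three';

function blendQuaternions(quats, weights) {
  const ref = quats[0];
  let x = 0, y = 0, z = 0, w = 0;
  quats.forEach((q, i) => {
    const s = q.dot(ref) < 0 ? -weights[i] : weights[i]; // flip q to the same hemisphere as ref
    x += s * q.x; y += s * q.y; z += s * q.z; w += s * q.w;
  });
  return new THREE.Quaternion(x, y, z, w).normalize();   // q* = sum(w_i * q_i), renormalized
}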
However, this is only an approximation. The reason is that the combination of two rotations is not performed by adding the quaternions, but by multiplying them. If we convert quaternions to a logarithmic space, we can use a simple linear blend (because multiplications become additions), and then transform the quaternion back to the original space. This is the idea of the Spherical Average (Buss 2001). If you're lucky, you'll find a library that supports log and exp of quaternions:
start with q* as above
do until convergence
    for each input quaternion i (index)
        diff = q[i] * inverse(q*)
        u[i] = log(diff, base q*)
    // Now perform the linear blend
    adapt := zero quaternion
    weights := 0
    for each input quaternion i
        adapt += weight[i] * u[i]
        weights += weight[i]
    adapt *= 1/weights
    adaptInOriginalSpace = q* ^ adapt (^ is the power operator)
    q* = adaptInOriginalSpace * q*
You can define a threshold for adaptInOriginalSpace. If it is a very very small rotation, you can break the loop. This algorithm is proven to preserve geodesic distances on a sphere.
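If no such library is at hand, a rough sketch of one way to implement the iteration in three.js, with hand-rolled log/exp maps (this is a variant of the pseudocode above that works in the tangent space at the current estimate, not a drop-in copy of it):

import * as THREE from 'three';

function quatLog(q) {                       // unit quaternion -> 3-vector (axis * half-angle)
  const v = new THREE.Vector3(q.x, q.y, q.z);
  const len = v.length();
  if (len < 1e-12) return new THREE.Vector3();
  return v.multiplyScalar(Math.atan2(len, q.w) / len);
}

function quatExp(v) {                       // 3-vector -> unit quaternion
  const angle = v.length();
  if (angle < 1e-12) return new THREE.Quaternion();
  const s = Math.sin(angle) / angle;
  return new THREE.Quaternion(v.x * s, v.y * s, v.z * s, Math.cos(angle));
}

function sphericalAverage(quats, weights, iterations = 16) {
  const qStar = quats[0].clone();
  for (let it = 0; it < iterations; it++) {
    const qInv = qStar.clone().invert();
    const mean = new THREE.Vector3();
    let wSum = 0;
    quats.forEach((q, i) => {
      const diff = qInv.clone().multiply(q);                          // rotation from q* to q[i]
      if (diff.w < 0) diff.set(-diff.x, -diff.y, -diff.z, -diff.w);   // stay on the short arc
      mean.add(quatLog(diff).multiplyScalar(weights[i]));
      wSum += weights[i];
    });
    mean.multiplyScalar(1 / wSum);
    if (mean.length() < 1e-9) break;                                  // adapt is a very, very small rotation
    qStar.multiply(quatExp(mean));                                    // q* <- q* * exp(mean)
  }
  return qStar;
}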
http://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation and http://en.wikipedia.org/wiki/Rotation_matrix#Quaternion will give you some elegant mathematics and a way to turn a rotation matrix into an angle of rotation round an axis of rotation. There will be two possible representations of each rotation, with different signs for both angle of rotation and axis of rotation.
You could convert everything and normalize them to have +ve angles of rotation, then work out the average angle of rotation and the average axis of rotation, renormalising this into a unit vector.
OTOH if your intention is to work out the most accurate possible estimate of the transformation, you need to write down some measure of the goodness of fit of any candidate transformation - a sum of squared errors is often mathematically convenient - and then solve an optimization problem to work out which transformation minimizes the sum of squared errors. This is at least easier to justify than taking an average of individually error-prone estimates, and may well be more accurate.
If you have an existing lerp method, then there is a trivial solution:
count = 1
average_transform = Matrix.Identity(4)
for new_transform in list_of_matrices:
    factor = 1/count
    average_transform = lerp(average_transform, new_transform, factor)
    count += 1
This is only useful because far more mathematics packages can lerp matrices than can average a whole list of them.
Because I haven't come across this method elsewhere, here's an informal proof:
If there is one matrix, use just that matrix (factor will equal 1 for first matrix)
If there are two matrices, we need 50% of the second one (second factor is 50% so we lerp to half way between the existing first one and the new one)
If there are three matrices we need 33% of each, or 66% of the average of the first two and 33% of the third. The lerp factor of 0.3333 makes this happen.
And so on.
I haven't tested extensively with matrices, but I've used this successfully as a rolling average for other datatypes.
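For what it's worth, a sketch of this rolling average in three.js, using slerp for the rotation part (as a comment below suggests) and lerp for the translation:

import * as THREE from 'three';

function averageTransforms(matrices) {
  const avgPos = new THREE.Vector3(), avgQuat = new THREE.Quaternion();
  const pos = new THREE.Vector3(), quat = new THREE.Quaternion(), scl = new THREE.Vector3();
  matrices.forEach((m, i) => {
    m.decompose(pos, quat, scl);
    const factor = 1 / (i + 1);          // 1, 1/2, 1/3, ... as in the informal proof above
    avgPos.lerp(pos, factor);
    avgQuat.slerp(quat, factor);
  });
  return new THREE.Matrix4().compose(avgPos, avgQuat, new THREE.Vector3(1, 1, 1));
}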
The singular value decomposition (SVD) can be used here.
Take the SVD of the sum of the rotation matrices, and then the average rotation matrix is simply given by Ravg = UV'.
"sdfgeoff" I can't comment in your answer because I'm new here, but you are the most correct, I think. Beutifull and elegant solution, by the way. Would be perfect if you use Spherical Linear Interpolation (SLERP) with quaternions, instead of Linear Interpolation (LERP) because quaternions that map rotations (quaternions with norm 1) define a sphere in 4D, and interpolating between then is in fact interpolate between two point in a sphere surface.
From my experience with point cloud registration, I would like to say that this will not work. ICP does not return random rotations scattered around the correct rotation. You need to use a better algorithm to register your point clouds (global registration algorithms like FPFH, 4PCS, K4PCS, BSC, FGR, etc.), or a better initial guess for the transformation. ICP will only give you totally wrong rotations (when stuck in local minima) or almost perfect rotations, when initialized with good initial transformations.
Conclusion: averaging will not work.
I would suggest taking a look at "Average" of multiple quaternions? for a more elaborate discussion on how to compute the average of rotations.

how to Rotate about an arbitrary axis?

Givens
1- X, Y, and Z: the world coordinate system.
2- i, j, k: another coordinate system.
3- The cosines of the angles that each of i, j, and k makes with X, Y, and Z.
Problem
How do I rotate the i, j, k system about i, or j, or k?
If you have the cosines of the angles formed by pairing each of i,j,k with each of xhat, yhat, and zhat (nine angles altogether), you have the makings for the direction cosine matrix. For example, see http://www.ae.illinois.edu/~tbretl/ae403/handouts/06-dcm.pdf (or just google direction cosine matrix). The direction cosine matrix is just another name for a transformation or rotation matrix.
Be careful, though!
There is no single standard scheme. You need to know that this is the case and read the literature carefully.
Are you rotating the object or transforming coordinates? Rotation and transformation are conjugate operations. Some people (many people!) use the term 'rotation matrix' when they mean 'transformation matrix', and vice versa.
Do you represent vectors as column vectors or row vectors? Here there is a lot more consistency; most people use column vectors rather than row vectors for things like positions, velocities, etc. BUT there are very good reasons to use row vectors (or column vectors if you are one of those contrarians) for things that properly belong in the dual space.
Quaternions have even more ambiguity of representation than matrices. There's nothing wrong with that (I use quaternions all the time), but you do have to beware of these ambiguities when you read a paper or book, look at someone else's code, or exchange data.
Finally, matrices and quaternions are only two of many charts on SO(3). There are lots of ways to represent rotations in 3-space.
You can first create either a rotation matrix or a quaternion. Then you use that to transform your vectors.
You can find the code to create a rotation matrix or a quaternion in pretty much any 3d maths library.
If I recall correctly, you calculate the rotation quaternion as (assuming a normalized axis and a rotation angle alpha):
q.x = axis.x * sin(alpha/2)
q.y = axis.y * sin(alpha/2)
q.z = axis.z * sin(alpha/2)
q.w = cos(alpha/2)
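With three.js, for example, that boils down to something like the following (the library choice is just for illustration; setFromAxisAngle takes the full angle and applies the half-angle internally):

import * as THREE from 'three';

const axis = new THREE.Vector3(0, 1, 0).normalize();           // e.g. rotate about the local j axis
const q = new THREE.Quaternion().setFromAxisAngle(axis, Math.PI / 2);

const m = new THREE.Matrix4().makeRotationFromQuaternion(q);   // the same rotation as a 4x4 matrix
const v = new THREE.Vector3(1, 0, 0).applyQuaternion(q);       // rotates (1, 0, 0) to roughly (0, 0, -1)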

Quaternions and Transform Matrices

Tell me if I am wrong.
I'm starting to use quaternions. Using a 4 x 4 rotation matrix (as used in OpenGL), I can compute the model-view matrix by multiplying the current model-view matrix with a rotation matrix. The rotation matrix is derived from the quaternion.
The quaternion is a direction vector (not necessarily normalized) and a rotation angle. The resulting rotation depends on the direction vector's magnitude and on the w component of the quaternion.
But why should I use quaternions instead of Euler axis/angle notation? The latter is simpler to visualize and to manage...
All the information I found can be summed up by this beautiful article:
http://en.wikipedia.org/wiki/Rotation_representation
The article explains why it is better to use quaternions:
More compact than the DCM representation and less susceptible to round-off errors
The quaternion elements vary continuously over the unit sphere in R4 (denoted by S3) as the orientation changes, avoiding the discontinuous jumps (inherent to three-dimensional parameterizations) that are often referred to as gimbal lock.
Expression of the DCM in terms of quaternion parameters involves no trigonometric functions
It is simple to combine two individual rotations represented as quaternions using a quaternion product
Unlike Euler angles, quaternions don't suffer from gimbal lock.
I disagree that quaternions are easier to visualize, but the main reason for using them is that it's easy to concatenate rotations without "matrix creep".
Quaternions are generally used for calculative simplicity - it's a lot easier (and faster) to do things like composing transformations when using quaternions. To quote the Wikipedia page you linked,
Combining two successive rotations, each represented by an Euler axis and angle, is not straightforward, and in fact does not satisfy the law of vector addition, which shows that finite rotations are not really vectors at all. It is best to employ the direction cosine matrix (DCM), or tensor, or quaternion notation, calculate the product, and then convert back to Euler axis and angle.
They also do not suffer from a problem common to axis/angle form, gimbal lock.
Quaternions are easier to visualize, manage and create in scenarios where you want to rotate about a particular axis that can be easily calculated. Determining a single rotation angle is much easier than decomposing a rotation into multiple angles.
Corrections to the OP: the vector represents the axis of rotation, not a direction, and the rotation component is the cosine of the half-angle, not the angle itself.
As mentioned, quaternions don't suffer from gimbal lock.
For a given rotation, there is exactly one normalized quaternion representation.
There can be several seemingly unrelated axis/angle values that result in the same rotation.
Quaternion rotations can be easily combined.
It is extraordinarily complex to calculate an axis/angle notation that is the composition of two other axis/angle rotations.
Floating point numbers have a higher degree of accuracy when representing values between 0.0 and 1.0.
The short answer is that axis/angle notation can initially seem like the most reasonable representation, but in practice quaternions alleviate many problems that axis/angle notation presents.
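To make the "easily combined" point concrete, a small three.js sketch (the library is just for illustration, and modelView stands in for whatever matrix you are accumulating): composing two rotations is a single quaternion product, and converting back to a 4x4 matrix needs no trigonometry.

import * as THREE from 'three';

const qA = new THREE.Quaternion().setFromAxisAngle(new THREE.Vector3(0, 1, 0), 0.4);
const qB = new THREE.Quaternion().setFromAxisAngle(new THREE.Vector3(1, 0, 0), 0.7);

const qAB = qA.clone().multiply(qB);                 // applies qB first, then qA
const rot = new THREE.Matrix4().makeRotationFromQuaternion(qAB);

const modelView = new THREE.Matrix4();               // stand-in for the current model-view
modelView.multiply(rot);                             // fold the combined rotation into it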
