Calculate distance from a static camera to an object on a ground plane - matrix

I have a stationary camera, which is 640mm above the ground and is tilted slightly forward (about the x-axis?) roughly 30 degrees, so it's looking slightly down towards a flat ground plane.
My goal is to determine the distance from the camera to any small objects that it detects on the ground, for example, if my camera detects an object which is located at pixel [314, 203], I would like to know where on the ground that object would be in world coordinates [x, y, z] with y=0, and the distance to that object.
I've drawn a diagram to better visualize the problem:
camera plane diagram
I have my rotation matrix and translation vector and also my intrinsic matrix, but I'm not sure 1) if the rotation matrix and translation vector are correct given the information/diagram above, and 2) how to proceed with figuring out a mathematical formula for finding distance and real world location. Here is what I have so far:
Rotation matrix R (generated here https://www.andre-gaschler.com/rotationconverter/) from the orientation [-30, 0, 0] (degrees)
R =
[ 1.0000000, 0.0000000, 0.0000000;
0.0000000, 0.8660254, -0.5000000;
0.0000000, 0.5000000, 0.8660254 ]
Camera is 640mm above ground plane
t =
[0, 640, 0]
Intrinsic matrix from calibration information given by camera
fx=349.595, fy=349.505, cx=328.875, cy=178.204
K =
[ 349.595, 0.0000, 328.875;
0.0000, 349.505, 178.204;
0.0000, 0.0000, 1.000 ]
I also have these distortion parameters, I'm unsure what to do with them or if they are relevant to K
k1=-0.170901, k2=0.0255027, k3=-9.30328e-11, p1=0.000117187, p2=6.42836e-05
I get about this far and then I'm lost, any help would be much appreciated.
Also sorry if this is a lot of information or if it's confusing in any way, I'm very much a beginner when it comes to projection matrices
UPDATE:
After some more research and testing on my own I found a formula that seems to give me a somewhat decent approximation. Given pixel [x, y], I find (I think) the direction vector from the camera origin to the pixel coordinate with:
dir_x = (x - cx) / fx
dir_y = (y - cy) / fy
dir_z = 1
which I then multiply by rotation matrix R, which gives me the real-world vector. I then divide my camera height (640mm) by the y-value of that vector, which gives me (I think) the distance to the specified pixel in the real-world. After some testing and measuring by hand, this seems to be an adequate method for finding the distance, but I'm not sure if I'm missing steps for accuracy or if I'm actually doing this completely wrong.
Again, any insight is greatly appreciated.

Related

Inverse Camera Intrinsic Matrix for Image Plane at Z = -1

A similar question was asked before, unfortunately I cannot comment Samgaks answer so I open up a new post with this one. Here is the link to the old question:
How to calculate ray in real-world coordinate system from image using projection matrix?
My goal is to map from image coordinates to world coordinates. In fact I am trying to do this with the Camera Intrinsics Parameters of the HoloLens Camera.
Of course this mapping will only give me a ray connecting the Camera Optical Centre and all points, which can lie on that ray. For the mapping from image coordinates to world coordinates we can use the inverse camera matrix which is:
K^-1 = [1/fx 0 -cx/fx; 0 1/fy -cy/fy; 0 0 1]
Pcam = K^-1 * Ppix;
Pcam_x = P_pix_x/fx - cx/fx;
Pcam_y = P_pix_y/fy - cy/fy;
Pcam_z = 1
Orientation of Camera Coordinate System and Image Plane
In this specific case the image plane is probably at Z = -1 (However, I am a bit uncertain about this). The Section Pixel to Application-specified Coordinate System on page HoloLens CameraProjectionTransform describes how to go form pixel coordinates to world coordinates. To what I understand two signs in the K^-1 are flipped s.t. we calculate the coordinates as follows:
Pcam_x = (Ppix_x/fx) - (cx*(-1)/fx) = P_pix_x/fx + cx/fx;
Pcam_y = (Ppix_y/fy) - (cy*(-1)/fy) = P_pix_y/fy + cy/fy;
Pcam_z = -1
Pcam = (Pcam_x, Pcam_y, -1)
CameraOpticalCentre = (0,0,0)
Ray = Pcam - CameraOpticalCentre
I do not understand how to create the Camera Intrinsics for the case of the image plane being at a negative Z-coordinate. And I would like to have a mathematical explanation or intuitive understanding of why we have the sign flip (P_pix_x/fx + cx/fx instead of P_pix_x/fx - cx/fx).
Edit: I read in another post that the thirst column of the camera matrix has to be negated for the case that the camera is facing down the negative z-direction. This would explain the sign flip. However, why do we need to change the sign of the third column. I would like to have a intuitive understanding of this.
Here the link to the post Negation of third column
Thanks a lot in advance,
Lisa
why do we need to change the sign of the third column
To understand why we need to negate the third column of K (i.e. negate the principal points of the intrinsic matrix) let's first understand how to get the pixel coordinates of a 3D point already in the camera coordinates frame. After that, it is easier to understand why -z requires negating things.
let's imagine a Camera c, and one point B in the space (w.r.t. the camera coordinate frame), let's put the camera sensor (i.e. image) at E' as in the image below. Therefore f (in red) will be the focal length and ? (in blue) will be the x coordinate in pixels of B (from the center of the image). To simplify things let's place B at the corner of the field of view (i.e. in the corner of the image)
We need to calculate the coordinates of B projected into the sensor d (which is the same as the 2d image). Because the triangles AEB and AE'B' are similar triangles then ?/f = X/Z therefore ? = X*f/Z. X*f is the first operation of the K matrix is. We can multiply K*B (with B as a column vector) to check.
This will give us coordinates in pixels w.r.t. the center of the image. Let's imagine the image is size 480x480. Therefore B' will look like this in the image below. Keep in mind that in image coordinates, the y-axis increases going down and the x-axis increases going right.
In images, the pixel at coordinates 0,0 is in the top left corner, therefore we need to add half of the width of the image to the point we have. then px = X*f/Z + cx. Where cx is the principal point in the x-axis, usually W/2. px = X*f/Z + cx is exactly as doing K * B / Z. So X*f/Z was -240, if we add cx (W/2 = 480/2 = 240) and therefore X*f/Z + cx = 0, same with the Y. The final pixel coordinates in the image are 0,0 (i.e. top left corner)
Now in the case where we use z as negative, when we divide X and Y by Z, because Z is negative, it will change the sign of X and Y, therefore it will be projected to B'' at the opposite quadrant as in the image below.
Now the second image will instead be:
Because of this, instead of adding the principal point, we need to subtract it. That is the same as negating the last column of K.
So we have 240 - 240 = 0 (where the second 240 is the principal point in x, cx) and the same for Y. The pixel coordinates are 0,0 as in the example when z was positive. If we do not negate the last column we will end up with 480,480 instead of 0,0.
Hope this helped a little bit

Having a 3D point projected onto a 3D plane, find the 2D coord based on the plane two axis

I have a THREE.Plane plane which is intersected by a number of THREE.Line3 lines[].
Using only this information, how can I acquire a 2D coordinate set of points?
Edit for better understanding the problem:
The 2D coordinate is related to the plane, so imagine the 3D plane becomes a Cartesian plane drawn on a blackboard. It is pretty much a 3D drawing of a 2D plane. What I want to find is the X, Y values of points previously projected onto this Cartesian plane. But they are 3D, just like the 3D plane.
You don't have enough information. In this answer I'll explain why, and provide more information to achieve what you want, should you be able to provide the necessary information
First, let's create a plane. Like you, I'm uing Plane.setFromNormalAndCoplanarPoint. I'm considering the co-planar point as the origin ((0, 0)) of the plane's Cartesian space.
let normal = new Vector3(Math.random(), Math.random(), Math.random()).normalize()
let origin = new Vector3(Math.random(), Math.random(), Math.random()).normalize().setLength(10)
let plane = new Plane.setFromNormalAndCoplanarPoint(normal, origin)
Now, we create a random 3D point, and project it onto the plane.
let point1 = new Vector3(Math.random(), Math.random(), Math.random()).normalize()
let projectedPoint1 = new Vector3()
plane.projectPoint(point1, projectedPoint1)
The projectedPoint1 variable is now co-planar with your plane. But this plane is infinite, with no discrete X/Y axes. So currently we can only get the distance from the origin to the projected point.
let distance = origin.distanceTo(projectedPoint1)
In order to turn this into a Cartesian coordinate, you need to define at least one axis. To make this truly random, let's compute a random +Y axis:
let tempY = new Vector3(Math.random(), Math.random(), Math.random())
let pY = new Vector3()
plane.projectPoint(tempY, pY)
pY.normalize()
Now that we have +Y, let's get +X:
let pX = new Vector3().crossVectors(pY, normal)
pX.normalize()
Now, we can project the plane-projected point onto the axis vectors to get the Cartesian coordinates.
let x = projectedPoint1.clone().projectOnVector(pX).distanceTo(origin)
if(!projectedPoint1.clone().projectOnVector(pX).normalize().equals(pX)){
x = -x
}
let y = projectedPoint1.clone().projectOnVector(pY).distanceTo(origin)
if(!projectedPoint1.clone().projectOnVector(pY).normalize().equals(pY)){
y = -y
}
Note that in order to get negative values, I check a normalized copy of the axis-projected vector against the normalized axis vector. If they match, the value is positive. If they don't match, the value is negative.
Also, all the clone-ing I did above was to be explicit with the steps. This is not an efficient way to perform this operation, but I'll leave optimization up to you.
EDIT: My logic for determining the sign of the value was flawed. I've corrected the logic to normalize the projected point and check against the normalized axis vector.

PIXI.js - Canvas Coordinate to Container Coordinate

I have initiated a PIXI js canvas:
g_App = new PIXI.Application(800, 600, { backgroundColor: 0x1099bb });
Set up a container:
container = new PIXI.Container();
g_App.stage.addChild(container);
Put a background texture (2000x2000) into the container:
var texture = PIXI.Texture.fromImage('picBottom.png');
var back = new PIXI.Sprite(texture);
container.addChild(back);
Set the global:
var g_Container = container;
I do various pivot points and rotations on container and canvas stage element:
// Set the focus point of the container
g_App.stage.x = Math.floor(400);
g_App.stage.y = Math.floor(500); // Note this one is not central
g_Container.pivot.set(1000, 1000);
g_Container.rotation = 1.5; // radians
Now I need to be able to convert a canvas pixel to the pixel on the background texture.
g_Container has an element transform which in turn has several elements localTransform, pivot, position, scale ands skew. Similarly g_App.stage has the same transform element.
In Maths this is simple, you just have vector point and do matix operations on them. Then to go back the other way you just find inverses of those matrices and multiply backwards.
So what do I do here in pixi.js?
How do I convert a pixel on the canvas and see what pixel it is on the background container?
Note: The following is written using the USA convention of using matrices. They have row vectors on the left and multiply them by the matrix on the right. (Us pesky Brits in the UK do the opposite. We have column vectors on the right and multiply it by the matrix on the left. This means UK and USA matrices to do the same job will look slightly different.)
Now I have confused you all, on with the answer.
g_Container.transform.localTransform - this matrix takes the world coords to the scaled/transposed/rotated COORDS
g_App.stage.transform.localTransform - this matrix takes the rotated world coords and outputs screen (or more accurately) html canvas coords
So for example the Container matrix is:
MatContainer = [g_Container.transform.localTransform.a, g_Container.transform.localTransform.b, 0]
[g_Container.transform.localTransform.c, g_Container.transform.localTransform.d, 0]
[g_Container.transform.localTransform.tx, g_Container.transform.localTransform.ty, 1]
and the rotated container matrix to screen is:
MatToScreen = [g_App.stage.transform.localTransform.a, g_App.stage.transform.localTransform.b, 0]
[g_App.stage.transform.localTransform.c, g_App.stage.transform.localTransform.d, 0]
[g_App.stage.transform.localTransform.tx, g_App.stage.transform.localTransform.ty, 1]
So to get from World Coordinates to Screen Coordinates (noting our vector will be a row on the left, so the first operation matrix that acts first on the World coordinates must also be on the left), we would need to multiply the vector by:
MatAll = MatContainer * MatToScreen
So if you have a world coordinate vector vectWorld = [worldX, worldY, 1.0] (I'll explain the 1.0 at the end), then to get to the screen coords you would do the following:
vectScreen = vectWorld * MatAll
So to get screen coords and to get to world coords we first need to calculate the inverse matrix of MatAll, call it invMatAll. (There are loads of places that tell you how to do this, so I will not do it here.)
So if we have screen (canvas) coordinates screenX and screenY, we need to create a vector vectScreen = [screenX, screenY, 1.0] (again I will explain the 1.0 later), then to get to world coordinates worldX and worldY we do:
vectWorld = vectScreen * invMatAll
And that is it.
So what about the 1.0?
In a 2D system you can do rotations, scaling with 2x2 matrices. Unfortunately you cannot do a 2D translations with a 2x2 matrix. Consequently you need 3x3 matrices to fully describe all 2D scaling, rotations and translations. This means you need to make your vector 3D as well, and you need to put a 1.0 in the third position in order to do the translations properly. This 1.0 will also be 1.0 after any matrix operation as well.
Note: If we were working in a 3D system we would need 4x4 matrices and put a dummy 1.0 in our 4D vectors for exactly the same reasons.

Determine whether the direction of a line segment is clockwise or anti clockwise

I have a list of 2D points (x1,y1),(x2,y2)......(Xn,Yn) representing a curved segment, is there any formula to determine whether the direction of drawing that segment is clockwise or anti clockwise ?
any help is appreciated
Alternately, you can use a bit of linear algebra. If you have three points a, b, and c, in that order, then do the following:
1) create the vectors u = (b-a) = (b.x-a.x,b.y-a.y) and v = (c-b) ...
2) calculate the cross product uxv = u.x*v.y-u.y*v.x
3) if uxv is -ve then a-b-c is curving in clockwise direction (and vice-versa).
by following a longer curve along in the same manner, you can even detect when as 's'-shaped curve changes from clockwise to anticlockwise, if that is useful.
One possible approach. It should work reasonably well if the sampling of the line represented by your list of points is uniform and smooth enough, and if the line is sufficiently simple.
Subtract the mean to "center" the line.
Convert to polar coordinates to get the angle.
Unwrap the angle, to make sure its increments are meaningful.
Check if total increment is possitive or negative.
I'm assuming you have the data in x and y vectors.
theta = cart2pol(x-mean(x), y-mean(y)); %// steps 1 and 2
theta = unwrap(theta); %// step 3
clockwise = theta(end)<theta(1); %// step 4. Gives 1 if CW, 0 if ACW
This only considers the integrated effect of all points. It doesn't tell you if there are "kinks" or sections with different directions of turn along the way.
A possible improvement would be to replace the average of x and y by some kind of integral. The reason is: if sampling is denser in a region the average will be biased towards that, whereas the integral wouldn't.
Now this is my approach, as mentioned in a comment to the question -
Another approach: draw a line from starting point to ending point. This line is indeed a vector. A CW curve has most of its part on RHS of this line. For CCW, left.
I wrote a sample code to elaborate this idea. Most of the explanation can be found in comments in the code.
clear;clc;close all
%% draw a spiral curve
N = 30;
theta = linspace(0,pi/2,N); % a CCW curve
rho = linspace(1,.5,N);
[x,y] = pol2cart(theta,rho);
clearvars theta rho N
plot(x,y);
hold on
%% find "the vector"
vec(:,:,1) = [x(1), y(1); x(end), y(end)]; % "the vector"
scatter(x(1),y(1), 200,'s','r','fill') % square is the starting point
scatter(x(end),y(end), 200,'^','r','fill') % triangle is the ending point
line(vec(:,1,1), vec(:,2,1), 'LineStyle', '-', 'Color', 'r')
%% find center of mass
com = [mean(x), mean(y)]; % center of mass
vec(:,:,2) = [x(1), y(1); com]; % secondary vector (start -> com)
scatter(com(1), com(2), 200,'d','k','fill') % diamond is the com
line(vec(:,1,2), vec(:,2,2), 'LineStyle', '-', 'Color', 'k')
%% find rotation angle
dif = diff(vec,1,1);
[ang, ~] = cart2pol(reshape(dif(1,1,:),1,[]), reshape(dif(1,2,:),1,[]));
clearvars dif
% now you can tell the answer by the rotation angle
if ( diff(ang)>0 )
disp('CW!')
else
disp('CCW!')
end
One can always tell on which side of the directed line (the vector) a point is, by comparing two vectors, namely, rotating vector [starting point -> center of mass] to the vector [starting point -> ending point], and then comparing the rotation angle to 0. A few seconds of mind-animating can help understand.

Finding the spin of a sphere given X, Y, and Z vectors relative to sphere

I'm using Electro in Lua for some 3D simulations, and I'm running in to something of a mathematical/algorithmic/physics snag.
I'm trying to figure out how I would find the "spin" of a sphere of a sphere that is spinning on some axis. By "spin" I mean a vector along the axis that the sphere is spinning on with a magnitude relative to the speed at which it is spinning. The reason I need this information is to be able to slow down the spin of the sphere by applying reverse torque to the sphere until it stops spinning.
The only information I have access to is the X, Y, and Z unit vectors relative to the sphere. That is, each frame, I can call three different functions, each of which returns a unit vector pointing in the direction of the sphere model's local X, Y and Z axes, respectively. I can keep track of how each of these change by essentially keeping the "previous" value of each vector and comparing it to the "new" value each frame. The question, then, is how would I use this information to determine the sphere's spin? I'm stumped.
Any help would be great. Thanks!
My first answer was wrong. This is my edited answer.
Your unit vectors X,Y,Z can be put together to form a 3x3 matrix:
A = [[x1 y1 z1],
[x2 y2 z2],
[x3 y3 z3]]
Since X,Y,Z change with time, A also changes with time.
A is a rotation matrix!
After all, if you let i=(1,0,0) be the unit vector along the x-axis, then
A i = X so A rotates i into X. Similarly, it rotates the y-axis into Y and the
z-axis into Z.
A is called the direction cosine matrix (DCM).
So using the DCM to Euler axis formula
Compute
theta = arccos((A_11 + A_22 + A_33 - 1)/2)
theta is the Euler angle of rotation.
The magnitude of the angular velocity, |w|, equals
w = d(theta)/dt ~= (theta(t+dt)-theta(t)) / dt
The axis of rotation is given by e = (e1,e2,e3) where
e1 = (A_32 - A_23)/(2 sin(theta))
e2 = (A_13 - A_31)/(2 sin(theta))
e3 = (A_21 - A_12)/(2 sin(theta))
I applaud ~unutbu's, answer, but I think there's a simpler approach that will suffice for this problem.
Take the X unit vector at three successive frames, and compare them to get two deltas:
deltaX1 = X2 - X1
deltaX2 = X3 - X2
(These are vector equations. X1 is a vector, the X vector at time 1, not a number.)
Now take the cross-product of the deltas and you'll get a vector in the direction of the rotation vector.
Now for the magnitude. The angle between the two deltas is the angle swept out in one time interval, so use the dot product:
dx1 = deltaX1/|deltaX1|
dx2 = deltax2/|deltaX2|
costheta = dx1.dx2
theta = acos(costheta)
w = theta/dt
For the sake of precision you should choose the unit vector (X, Y or Z) that changes the most.

Resources