If a mesh moves from position (0,0,0) to (10,0,0), its new position along the x axis is 10. What is the unit of measure of that position: is it 10 m, 10 mm, 10 cm, or 10 px?
The official answer is that Three.js uses SI units, so distances are in meters:
https://github.com/mrdoob/three.js/issues/6259
A similar question was asked before; unfortunately I cannot comment on Samgak's answer, so I am opening a new post. Here is the link to the old question:
How to calculate ray in real-world coordinate system from image using projection matrix?
My goal is to map from image coordinates to world coordinates. In fact, I am trying to do this with the camera intrinsic parameters of the HoloLens camera.
Of course, this mapping only gives me a ray connecting the camera's optical centre with all the points that can lie on that ray. For the mapping from image coordinates to world coordinates we can use the inverse camera matrix, which is:
K^-1 = [1/fx 0 -cx/fx; 0 1/fy -cy/fy; 0 0 1]
Pcam = K^-1 * Ppix;
Pcam_x = Ppix_x/fx - cx/fx;
Pcam_y = Ppix_y/fy - cy/fy;
Pcam_z = 1
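To make this concrete, here is a minimal numpy sketch of that unprojection; fx, fy, cx, cy and the pixel are placeholder values, not the actual HoloLens intrinsics:

import numpy as np

# Placeholder intrinsics (not the real HoloLens values)
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

Ppix = np.array([400.0, 300.0, 1.0])  # pixel (u, v) in homogeneous coordinates
Pcam = np.linalg.inv(K) @ Ppix        # = (u/fx - cx/fx, v/fy - cy/fy, 1)
print(Pcam)                           # direction of the ray through the optical centre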
Orientation of Camera Coordinate System and Image Plane
In this specific case the image plane is probably at Z = -1 (however, I am a bit uncertain about this). The section "Pixel to Application-specified Coordinate System" on the HoloLens CameraProjectionTransform page describes how to go from pixel coordinates to world coordinates. As far as I understand, two signs in K^-1 are flipped, so that we calculate the coordinates as follows:
Pcam_x = (Ppix_x/fx) - (cx*(-1)/fx) = Ppix_x/fx + cx/fx;
Pcam_y = (Ppix_y/fy) - (cy*(-1)/fy) = Ppix_y/fy + cy/fy;
Pcam_z = -1
Pcam = (Pcam_x, Pcam_y, -1)
CameraOpticalCentre = (0,0,0)
Ray = Pcam - CameraOpticalCentre
I do not understand how to construct the camera intrinsics for the case where the image plane is at a negative Z coordinate, and I would like a mathematical explanation or an intuitive understanding of why we have the sign flip (Ppix_x/fx + cx/fx instead of Ppix_x/fx - cx/fx).
Edit: I read in another post that the third column of the camera matrix has to be negated when the camera faces down the negative z direction. This would explain the sign flip. However, why do we need to change the sign of the third column? I would like an intuitive understanding of this.
Here is the link to that post: Negation of third column
Thanks a lot in advance,
Lisa
why do we need to change the sign of the third column
To understand why we need to negate the third column of K (i.e. negate the principal point entries of the intrinsic matrix), let's first understand how to get the pixel coordinates of a 3D point that is already in the camera coordinate frame. After that, it is easier to understand why a negative z requires negating things.
Let's imagine a camera C and a point B in space (w.r.t. the camera coordinate frame), and let's put the camera sensor (i.e. the image) at E' as in the image below. Then f (in red) is the focal length and u (in blue) is the x coordinate, in pixels, of B measured from the center of the image. To simplify things, let's place B at the corner of the field of view (i.e. at the corner of the image).
We need to calculate the coordinates of B projected onto the sensor (which is the same as the 2D image). Because the triangles AEB and AE'B' are similar, u/f = X/Z and therefore u = X*f/Z. X*f is the first operation the K matrix performs; we can multiply K*B (with B as a column vector) to check.
This gives us coordinates in pixels w.r.t. the center of the image. Let's imagine the image has size 480x480; then B' looks like the image below. Keep in mind that in image coordinates the y axis increases going down and the x axis increases going right.
In images, the pixel at coordinates (0,0) is in the top-left corner, so we need to add half the image width to the point we have: px = X*f/Z + cx, where cx is the principal point along the x axis, usually W/2. Computing px = X*f/Z + cx is exactly what K*B/Z does. In our example X*f/Z was -240, so adding cx (W/2 = 480/2 = 240) gives X*f/Z + cx = 0, and the same holds for y. The final pixel coordinates in the image are (0,0), i.e. the top-left corner.
Now consider the case where z is negative. When we divide X and Y by Z, because Z is negative it flips the signs of X and Y, so the point is projected to B'' in the opposite quadrant, as in the image below.
Now the second image will instead be:
Because of this, instead of adding the principal point we need to subtract it, which is the same as negating the last column of K.
So we have 240 - 240 = 0 (where the second 240 is the principal point in x, cx), and the same for y. The pixel coordinates are (0,0), as in the example where z was positive. If we do not negate the last column, we end up with (480,480) instead of (0,0).
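Here is a small numpy sketch of this example; f, cx, cy are illustrative values chosen so the corner point lands exactly at pixel (0,0), not parameters of any real camera:

import numpy as np

f, cx, cy = 240.0, 240.0, 240.0   # 480x480 image, principal point at the centre
K = np.array([[f, 0.0, cx],
              [0.0, f, cy],
              [0.0, 0.0, 1.0]])

B = np.array([-1.0, -1.0, 1.0])   # corner point, camera looking down +Z
p = K @ B
print(p[:2] / p[2])               # [0. 0.]     -> top-left corner

B = np.array([-1.0, -1.0, -1.0])  # same point, camera looking down -Z
p = K @ B
print(p[:2] / p[2])               # [480. 480.] -> wrong quadrant

K_neg = K.copy()
K_neg[:, 2] *= -1                 # negate the third column of K
p = K_neg @ B
print(p[:2] / p[2])               # [0. 0.]     -> correct again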
Hope this helped a little bit
I am trying to determine the distance and the height of an object relative to my camera. Is this possible, or do I need to use OpenCV's calibrate.py to gather more information? I am confused because the Logitech C920HD has a 3 MP sensor and scales to 15 MP via software.
I have the following info:
Resolution (pixel): 1920x1080
Focal Length (mm): 3.67mm
Pixel Size (µm): 3.98
Sensor Size (inches): 1/2.88
Object real height (mm): 180
Object image height (px): 370
I checked this formula:
distance (mm) = 3.67 (mm) * 180 (mm) * 1080 (px) / (511 (px) * (1/2.88) (inch) * 25.4 (mm/inch))
This gives me 15.8 cm, although it should be about 60 cm.
What am I doing wrong?
Thanks for help!
Your formula looks correct; however, for it to hold over the entire image plane, you should correct lens distortion first, e.g. following the answer to
Camera calibration, reverse projection of pixel to direction
Along the way, OpenCV's lens calibration module will estimate your true focal length.
Filling in the formula gives
Distance = 3.67 mm * 180 mm * (1080 px / 511 px) / sensor_height_mm = 1396 mm^2 / sensor_height_mm
This leaves sensor_height_mm unknown. Given that your camera is a 16:9 format:
w^2 + h^2 = D^2
(16x)^2 + (9x)^2 = D^2
<=> x = sqrt(D^2 / 337)
<=> h = 9x = 9 * sqrt(D^2 / 337)
Remember the rule of 16:
https://photo.stackexchange.com/questions/24952/why-is-a-1-sensor-actually-13-2-%C3%97-8-8mm/24954
Most importantly, a 1/2.88" sensor has an image circle diameter of 16/2.88 mm rather than 25.4/2.88 mm. Funnily enough, the true image circle diameter is metric. Thus the sensor diagonal is
D = 16 mm / 2.88 = 5.556 mm
and
sensor_height_mm = h = 2.72 mm
giving
Distance = 513 mm
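Putting the whole computation together as a short Python sketch (the numbers come from the question; the 16 mm image circle rule is the assumption discussed above):

import math

f_mm = 3.67          # focal length
obj_real_mm = 180.0  # real object height
img_h_px = 1080.0    # image height in pixels
obj_img_px = 511.0   # object height in pixels, as used in the formula

D_mm = 16.0 / 2.88                       # sensor diagonal via the rule of 16
h_mm = 9.0 * math.sqrt(D_mm**2 / 337.0)  # 16:9 format -> sensor height, ~2.72 mm

distance_mm = f_mm * obj_real_mm * (img_h_px / obj_img_px) / h_mm
print(round(distance_mm))                # ~513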
Note that this distance is measured with respect to the lens's first principal point, not the sensor position or the lens front element position.
Once you correct the barrel distortion, the reading should get more accurate; the distortion is quite significant for this camera (I have a similar one).
Hope this helps.
I have two cameras that I calibrated assuming both were at the same position. In reality, the positions of the cameras differ slightly from what was assumed during calibration, which causes a parallax error. Now, when I capture a point with these two cameras, I get a misalignment between the images due to parallax, and I want to calculate this misalignment in pixels.
I tried to calculate the misalignment in meters:
Z(measured) = Z(calib) + Du / (tan a1 + tan a2)
where:
Z(measured) is the actual distance from the camera to the object, in meters
Z(calib) is the distance from the camera to the calibration marker point
Du is the distance, in meters, between the projections of the object point captured by the two cameras on the image plane
tan a1 = (distance between the camera position assumed during calibration and the actual camera 1 position) / (distance between the camera position assumed during calibration and the position of the calibration marker point)
tan a2 = (distance between the camera position assumed during calibration and the actual camera 2 position) / (distance between the camera position assumed during calibration and the position of the calibration marker point)
How can I now convert this value of Du from meters to pixels?
If you know the ground sample distance (GSD) of your image, you can use it to determine how much real-world distance a pixel represents, and use that number to convert meters to pixels.
Ground sample distance is calculated as:
GSD = (D / F) * PS
where:
GSD = ground sample distance
D = distance to the object (from the camera)
F = focal length
PS = pixel size (calculated as camera sensor dimension / photo dimension in pixels)
PS should be almost, if not exactly, the same whether you compute it from the width or from the height.
Having the GSD, you can then work backwards to determine the number of pixels corresponding to a distance in meters (note that this means you will want all units to be in meters).
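For example, here is a minimal Python sketch of that conversion; all camera values are made-up placeholders, so substitute your own parameters:

D = 2.0                     # distance to object, meters (example value)
F = 3.67e-3                 # focal length, meters (example value)
sensor_w = 5.0e-3           # sensor width, meters (assumed)
image_w_px = 1920           # image width, pixels

PS = sensor_w / image_w_px  # pixel size, meters per pixel
GSD = (D / F) * PS          # meters of real-world distance per image pixel

Du_m = 0.05                 # misalignment in meters (example value)
Du_px = Du_m / GSD          # misalignment in pixels
print(Du_px)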
I have a depth texture and I would like to know in which coordinate system the values inside it are stored: homogeneous coordinates, camera coordinates, world coordinates, or model coordinates?
I also would like to know which values are stored in the depth texture and what they mean.
Thanks.
The depth value should be in a range [min, max], where min is either -1.0 or 0.0 and max is 1.0, though what you get from the texture might simply be an integer value that needs to be transformed (e.g. from 24-bit to 32-bit). If nothing confirms any of this, you will need to test it yourself.
Anyway, min and max should represent the clipping planes, so min = near and max = far, due to the depth buffer optimisation. To get the true Z value from a texture value ZT:
Z = near + ((far-near) * ((ZT-min)/(max-min)))
This Z then represents the distance from (0,0,0); from the user's perspective, it is the distance between the object and the camera position.
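As a sketch, here is that mapping as a small Python function; note that it assumes the stored depth is linear, whereas with a perspective projection the stored values are typically non-linear in Z:

def linearize_depth(zt, near, far, zmin=0.0, zmax=1.0):
    # Map a raw depth-texture value zt in [zmin, zmax] to a Z distance
    # in [near, far], following the linear formula above.
    return near + (far - near) * (zt - zmin) / (zmax - zmin)

print(linearize_depth(0.5, near=0.1, far=100.0))  # 50.05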
Try looking for some literature.
I'm looking at this example in particular:
http://www.airtightinteractive.com/demos/processing_js/noisefield08.html
And here's the code for it:
http://www.airtightinteractive.com/demos/processing_js/noisefield08.pjs
I guess I need an explanation of what these lines in the particle class do:
d=(noise(id,x/mouseY,y/mouseY)-0.5)*mouseX;
x+=cos(radians(d))*s;
y+=sin(radians(d))*s;
I understand that noise calculates a value based on the coordinates given, but I don't get the logic of dividing the particle's x position by mouseY, or its y position by mouseY. I also don't understand what 'id' (which seems to be a counter) stands for, or what the next two lines accomplish.
Thanks
Move mouse to change particle motion.
d seems to be the direction of motion. Putting mouseY and mouseX into the calculation of d allows the underlying field to depend on the mouse position. Without a better understanding of the noise function itself I can't tell you exactly what effect mouseY and mouseX have on the field.
By running cos(radians(d)) and sin(radians(d)) the code turns an angle d (in degrees) into a unit vector. For example, if d were 180, then radians(d) would be pi, so cos(radians(d)) would be -1 and sin(radians(d)) would be 0, turning the angle of 180 degrees into the unit vector (-1, 0).
So it appears that there is some underlying motion field which determines the direction the particles move. The motion field is represented by the noise function and takes in the current position of the particle, the particle id (perhaps to give each particle independent motion or perhaps to remember a history of the particle's motion and base the future motion on that history) and the current position of the mouse.
The actual distance the particle moves is s which is determined randomly to be between 2 and 7 pixels.
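Here is a tiny Python sketch of that update step; the heading and step size are example values:

import math

def step(x, y, d_degrees, s):
    # Advance a particle by distance s along heading d (in degrees),
    # mirroring x += cos(radians(d))*s; y += sin(radians(d))*s above.
    d = math.radians(d_degrees)
    return x + math.cos(d) * s, y + math.sin(d) * s

print(step(0.0, 0.0, 180.0, 5.0))  # (-5.0, ~0.0): heading 180 degrees moves left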