Color space to Camera space transformation matrix - camera-calibration

I am looking for the transformation matrix to convert color space to camera space.
I know that the point conversion can be done using CoordinateMapper, but I am not using the official Kinect v2 APIs.
I would really appreciate it if someone could share the transformation matrix that converts color space to camera space.
As always, thank you very much.

Important: the raw Kinect RGB image is distorted. Remove the distortion first.
Short answer
The "transformation matrix" you are searching is called projection matrix.
rgb.cx:959.5
rgb.cy:539.5
rgb.fx:1081.37
rgb.fy:1081.37
Long answer
First, understand how the color image is generated by the Kinect. It follows the standard pinhole camera model,
s * [u, v, 1]^T = K * (R|t) * [X, Y, Z, 1]^T,
where K is the 3x3 intrinsics matrix built from fx, fy, cx, cy, and:
X, Y, Z : coordinates of the given point in a coordinate space where the Kinect sensor is considered the origin, a.k.a. camera space. Note that camera space is 3D.
u, v : coordinates of the corresponding color pixel in color space. Note that color space is 2D.
fx, fy : focal lengths (in pixels)
cx, cy : principal point (you can consider the principal point of the Kinect RGB camera to be the center of the image)
(R|t) : extrinsic camera matrix. For the Kinect you can take this as (I|0), where I is the identity matrix.
s : scale factor; you can set it to 1.
To get the most accurate values for fx, fy, cx, cy, you need to calibrate the RGB camera in your Kinect using a chessboard.
The fx, fy, cx, cy values above come from my own calibration of my Kinect. These values differ from one Kinect to another by a very small margin.
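Once you have a registered depth value Z for a color pixel (u, v), inverting that projection gives the camera-space point. Here is a minimal sketch in plain Python (not the official Kinect API), assuming the pixel is already undistorted, depth is in meters, and using my calibration values above (replace them with your own):
RGB_FX, RGB_FY = 1081.37, 1081.37   # focal lengths in pixels (from my calibration)
RGB_CX, RGB_CY = 959.5, 539.5       # principal point in pixels (from my calibration)

def color_pixel_to_camera_space(u, v, z):
    # Invert s*[u, v, 1]^T = K*(I|0)*[X, Y, Z, 1]^T for a known depth Z.
    x = (u - RGB_CX) * z / RGB_FX
    y = (v - RGB_CY) * z / RGB_FY
    return (x, y, z)

# Example: the pixel at the principal point lands on the optical axis.
print(color_pixel_to_camera_space(959.5, 539.5, 2.0))   # -> (0.0, 0.0, 2.0)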
More info and implementation
All Kinect camera matrix
Distort
Registration
I implemented the Registration process in CUDA, since the CPU is not fast enough to do that much work (1920 x 1080 x 30 matrix calculations per second) in real time.

Related

Webots camera default parameters like pixel size and focus

I am using two cameras without lenses or any other settings in Webots to measure the position of an object. To apply the localization, I need to know the focal length, which is the distance from the camera center to the center of the imaging plane, namely f. I see the focus parameter in the Camera node, but when I leave it NULL (the default), the image is still rendered normally, so I assume this parameter is not related to f. In addition, I need to know the width and height of a pixel in the image, namely dx and dy respectively, but I have no idea how to get this information.
This is the calibration model I used, where c means camera and w means world coordinates. I need to calculate xw, yw, zw from u, v. For an ideal camera, gamma is 0 and u0, v0 are just half of the resolution. So my problem lies in fx and fy.
First important thing to know is that in Webots pixels are square, therefore dx and dy are equivalent.
Then, in the Camera node you will find a 'fieldOfView' field which gives you the horizontal field of view; using the resolution of the camera you can then compute the vertical field of view too:
2 * atan(tan(fieldOfView * 0.5) / (resolutionX / resolutionY))
Finally, you can also get the near projection plane from the 'near' field of the Camera node.
Note also that Webots cameras are regular OpenGL cameras, so you can find more information about the OpenGL projection matrix here, for example: http://www.songho.ca/opengl/gl_projectionmatrix.html
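To connect this back to the fx and fy in the question, here is a hedged sketch (assuming the ideal pinhole model with square pixels described above; the helper name is mine) of deriving the intrinsics in pixels from the Webots 'fieldOfView' and resolution:
import math

def webots_intrinsics(field_of_view, resolution_x, resolution_y):
    # field_of_view is the horizontal field of view in radians.
    # Vertical field of view, using the formula given above.
    fov_y = 2 * math.atan(math.tan(field_of_view * 0.5) / (resolution_x / resolution_y))
    # Focal lengths in pixels; with square pixels fx and fy come out equal.
    fx = (resolution_x * 0.5) / math.tan(field_of_view * 0.5)
    fy = (resolution_y * 0.5) / math.tan(fov_y * 0.5)
    # Principal point of an ideal camera: the image center.
    u0, v0 = resolution_x * 0.5, resolution_y * 0.5
    return fx, fy, u0, v0

# Example: a 640x480 camera with a 0.785 rad (45 degree) horizontal field of view.
print(webots_intrinsics(0.785, 640, 480))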

Three.js Image Pixel coordinate to World Coordinate Mapping

I'm creating a 3D object in Three.js with 6 faces. Each face has a mesh which uses a THREE.PlaneGeometry (width and height are both 256). On the mesh I'm using a 256 x 256 JPEG picture as the texture. I'm trying to find a way to get the world coordinate of a pixel coordinate (for example, 200,250) on the Object3D's PlaneGeometry corresponding to where that picture was used as a texture.
Object hierarchy:-
Object3D-->face(object3d) (total 6 faces)-->Each face has a mesh(planegeometry) and uses a jpeg file as texture.
Picture1 pixel coordinate-->Used to create texture for Plane1-->World Coordinate corresponding to that pixel coordinate.
Can someone please help me?
Additional information:-
Thanks for the answer. I'm trying to compare 2 results.
Method 1: One yaw/pitch is obtained by clicking with the mouse on a specific point in the 3D object (e.g., the center of a particular car headlight on the front face) and getting the point of intersection with the front face using raycasting.
Method 2: The other yaw/pitch is obtained by taking the pixel coordinate of the same point (the center of that car headlight) and calculating the world-space coordinate for that pixel. Please note that the pixel coordinate is taken from the JPEG file that was used as the texture for the PlaneGeometry of the mesh (which is a child of the front face).
Do you think the above comparison approach is supposed to produce the same results, assuming all other parameters are identical between the 2 approaches?
Well, assuming your planes are PlaneGeometry(1,1), then the local coordinate X/Y/Z for a given pixel is pixelX / 256, pixelY / 256, and the Z is 0.5,
so something like:
var localPoint = new THREE.Vector3(px / 256, py / 256, 0.5); // pixel coordinate scaled into the plane's local space
var worldPoint = thePlaneObject.localToWorld(localPoint);    // transform from the plane's local space to world space

Obtaining the Q matrix from MATLAB stereoParams

I am working on 3D image reconstruction using a stereo camera. I started with OpenCV 3.2 and Visual Studio. I was unable to correctly register two point clouds from two scenes with an overlap, so I suspect the Q matrix obtained from the camera calibration process. I therefore did the camera calibration using the MATLAB calibrator app. I want to manually create the Q matrix from the calibration parameters obtained from MATLAB and then use it in OpenCV. I found from this post how to create a Q matrix. Now the problem is that I don't know which focal length I should use in this matrix. MATLAB provides the calibration parameters in a stereoParams object which contains the camera parameters for the two camera sensors separately, so I have fx and fy from camera 1 and fx and fy from camera 2. How do I obtain a single focal length for the stereo camera?
As reported here, fx and fy are expressed in pixels.
F, the focal length in world units (typically millimeters), can be computed as
F = fx * px or F = fy * py,
where px and py are the sizes of a pixel along x and y, respectively, in world units.
In particular,
px = image sensor width [mm] / image width [pixel]
py = image sensor height [mm] / image height [pixel].
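For example, with hypothetical numbers: if fx = 1200 pixels, the image is 1920 pixels wide, and the sensor is 6.0 mm wide, then px = 6.0 / 1920 ≈ 0.003125 mm, so F = 1200 * 0.003125 = 3.75 mm.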
The Q matrix, a.k.a. the reprojection matrix, is built from the rectified intrinsic and extrinsic parameters.
Since you have the camera intrinsic matrix and the extrinsic matrix, just fill it in accordingly.
Take note that Tx should be in mm.
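As a hedged sketch of what "fill it in" can look like, here is the standard OpenCV-style reprojection matrix assembled from the rectified parameters (assuming both rectified views share one focal length f in pixels; the helper name and example values are mine):
import numpy as np

def make_q(f, cx, cy, tx, cx_right=None):
    # Layout as documented for cv2.stereoRectify:
    #   f        : rectified focal length in pixels
    #   cx, cy   : principal point of the left rectified image, in pixels
    #   tx       : signed x-translation between the cameras in mm (roughly
    #              -baseline for a usual left/right rig)
    #   cx_right : principal point x of the right rectified image (defaults to cx)
    if cx_right is None:
        cx_right = cx
    return np.array([
        [1.0, 0.0,  0.0,       -cx],
        [0.0, 1.0,  0.0,       -cy],
        [0.0, 0.0,  0.0,         f],
        [0.0, 0.0, -1.0 / tx, (cx - cx_right) / tx],
    ])

# Usage sketch: Q = make_q(1050.0, 960.0, 540.0, -120.0), then pass Q to
# cv2.reprojectImageTo3D(disparity, Q) to get a point cloud in mm.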

Get position from Translation and Scale Matrix (Direct2D)

I have a 2D camera defined by a Direct2D 3x2 matrix like this:
ViewMatrix = ScaleMatrix * TranslationMatrix;
But when trying to do hit testing, I get to a point where I need to know my X,Y camera coordinates. I tried to keep track of them in a vector, but without success; scaling around an offset center complicates the work a lot.
So I guess it should be possible to find my camera coordinates from these two matrices, right? But how?
Thanks a lot for any help.
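One way to recover the camera coordinates is indeed to invert the combined view matrix and map a screen point back through it. Here is a minimal sketch in plain Python (not Direct2D code), assuming the Direct2D row-vector convention p' = p * M with the 3x2 matrix stored as (m11, m12, m21, m22, dx, dy):
def invert_view(m11, m12, m21, m22, dx, dy):
    # Invert a 3x2 affine matrix (row-vector convention, as in Direct2D).
    det = m11 * m22 - m12 * m21
    i11, i12 = m22 / det, -m12 / det
    i21, i22 = -m21 / det, m11 / det
    idx = -(dx * i11 + dy * i21)
    idy = -(dx * i12 + dy * i22)
    return i11, i12, i21, i22, idx, idy

def screen_to_camera(x, y, view):
    # Map a screen-space point back into camera/world space.
    i11, i12, i21, i22, idx, idy = invert_view(*view)
    return x * i11 + y * i21 + idx, x * i12 + y * i22 + idy

# Example: ViewMatrix = Scale(2) * Translation(100, 50).
view = (2.0, 0.0, 0.0, 2.0, 100.0, 50.0)
print(screen_to_camera(100.0, 50.0, view))   # -> (0.0, 0.0): the camera origin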

Principal point in camera matrix (programming issue)

In a 3x3 camera matrix, what does the principal point do? How is its location determined? Can we visualize it?
I am told that the principal point is the intersection of the optical axis with the image plane, but why is it not always at the center of the image?
We use OpenCV.
The 3x3 camera intrinsics matrix maps between pixel coordinates in the image and 3D coordinates in the camera's own coordinate system. The principal point entries in this matrix give the pixel coordinates of "the intersection of the optical axis with the image plane". Ideally the principal point is at the center of the image, and for most cameras it nearly is, but in practice this is not always the case. The principal point may be slightly off center due to tangential distortion, imperfect centering of the lens components, and other manufacturing defects; an off-center principal point in the intrinsics matrix accounts for this offset.
I have found this site to be helpful when learning about camera calibration. Although it is in MATLAB, it is based on the same camera calibration model used in OpenCV.
The principal point in the 3x3 camera calibration matrix can also, more usefully, represent image cropping. If the image has been cropped around an object, then mapping the pixel coordinates to world coordinates requires a translation, which appears as an off-center principal point in the matrix.
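As a small illustration of that last point, here is a sketch with hypothetical numbers showing an intrinsics matrix whose principal point is slightly off the image center, and how cropping shifts it further:
import numpy as np

# Hypothetical intrinsics for a 1920x1080 camera; the exact image center would
# be (960, 540), but the calibrated principal point is slightly off it.
K = np.array([
    [1000.0,    0.0, 955.2],
    [   0.0, 1000.0, 544.8],
    [   0.0,    0.0,   1.0],
])

# Cropping 100 columns off the left and 50 rows off the top shifts the
# principal point by the same amounts; the focal lengths do not change.
K_cropped = K.copy()
K_cropped[0, 2] -= 100.0   # cx
K_cropped[1, 2] -= 50.0    # cy
print(K_cropped)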
