I'm using VisualSfM to build a 3D reconstruction of a scene. Now I want to estimate the depth map and reproject the image. Any ideas on how to do it?
If you have the camera intrinsic matrix K, its position vector C in the world, and an orientation matrix R that rotates from world space to camera space, you can iterate over all pixels (x, y) in your image and form the viewing ray P(t) = C + t * R^T * K^-1 * (x, y, 1)^T.
Then, using ray tracing, find the minimal t that causes the ray to intersect your 3D model (assuming it is dense; otherwise interpolate), so that P lies on your model. The t value you found is then the pixel value of the depth map (perhaps normalized to some range).
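A minimal sketch of that loop in Python (assuming K, R, C are available, e.g. exported from VisualSfM, and intersect_model is your own ray/mesh intersection routine returning the smallest positive t or None):

```python
import numpy as np

def depth_map(K, R, C, width, height, intersect_model):
    """K: 3x3 intrinsics, R: 3x3 world-to-camera rotation, C: camera center (world).
    intersect_model(origin, direction) is a placeholder for your ray/mesh intersection."""
    K_inv = np.linalg.inv(K)
    depth = np.full((height, width), np.nan)
    for y in range(height):
        for x in range(width):
            # Ray in world space: P(t) = C + t * R^T * K^-1 * (x, y, 1)^T
            d = R.T @ K_inv @ np.array([x, y, 1.0])
            d /= np.linalg.norm(d)
            t = intersect_model(C, d)
            if t is not None:
                depth[y, x] = t
    return depth
```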
I currently have two images of a plane in real life, taken from straight above. One is used as a reference image; the other is taken after the plane has undergone a rotation about its centre, which changes its orientation. The camera stays in a constant position.
I was wondering: if I found the homography matrix of this rotation in OpenCV and then decomposed that homography matrix to obtain the rotation matrix, would this yield accurate results? That is, would I be able to recover, to a reasonable degree of accuracy, the three angles needed to describe the plane's rotation in Euclidean coordinates?
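Concretely, I have something like this in mind (a rough sketch; pts_ref and pts_rot are matched points between the two images and K is my camera matrix):

```python
import cv2
import numpy as np

def plane_rotation_angles(pts_ref, pts_rot, K):
    """pts_ref, pts_rot: Nx2 arrays of matched points; K: 3x3 intrinsic matrix."""
    H, _ = cv2.findHomography(pts_ref, pts_rot, cv2.RANSAC)
    # decomposeHomographyMat returns up to four candidate (R, t, n) solutions;
    # the physically plausible one still has to be selected.
    _, rotations, translations, normals = cv2.decomposeHomographyMat(H, K)
    # Convert each candidate rotation to axis-angle, in degrees.
    return [np.degrees(cv2.Rodrigues(R)[0]).ravel() for R in rotations]
```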
Thanks
I have a projected view of a 3D scene. The 2D points are computed by multiplying the 3D points in homogeneous coordinates by a view matrix (which includes a translation and rotation) and a perspective matrix. I want to allow the user to move control points which describe the three axes, and update the rotation matrix based on this.
How do I compute the new rotation matrix given a change in projected 2D coordinates, assuming rotation around the origin? Solving for the position of the end of a single axis leaves a large degeneracy in the set of possible rotations, but perhaps solving for rotation about the axes perpendicular to the moved axis would work.
In my program (using MATLAB), I specified (through dragging) the pedestrian lane as my Region Of Interest (ROI) with the coordinates [7, 178, 620, 190] (xmin, ymin, width, and height respectively) using the getrect, roipoly and insertshape functions. Refer to the image below.
The video from which this snapshot is taken has a resolution of 640x480 pixels (480p).
Defining a real world space as my ROI by mouse dragging is barbaric. That's why the ROI coordinates must be derived mathematically.
What I'm getting at is using real-world measurements from the video-capture site and applying the Pythagorean theorem from where the camera is positioned:
How do I obtain the equivalent pixel coordinates and parameters using the real-world measurements?
I'll try to split your question into 2 smaller questions.
A) How do I obtain the equivalent pixel coordinates of an interesting point? (practical question)
Your program should be able to detect/recognise a feature or marker that you positioned at the interesting point in the real world. The output is a coordinate in pixels. This can be done quite easily (think about QR codes, for example).
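For instance, a minimal sketch with OpenCV's QR-code detector (the file name is illustrative):

```python
import cv2

# Locate a QR code placed at the real-world point of interest and
# read out its corner coordinates in pixels.
img = cv2.imread("frame.png")                # illustrative file name
detector = cv2.QRCodeDetector()
data, points, _ = detector.detectAndDecode(img)
if points is not None:
    print(points.reshape(-1, 2))             # pixel coordinates of the QR corners
```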
B) What is the analytical relationship between a point in 3D space and its pixel coordinates in the image? (theoretical question)
This is the projection equation of the pinhole camera model: it relates the 3D coordinates (X, Y, Z) to the pixel coordinates (x, y).
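In its simplest form (with the world reference frame coinciding with the camera frame) it can be written as

s * [x, y, 1]^T = K * [I | 0] * [X, Y, Z, 1]^T,  with  K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]

where fx, fy are the focal lengths in pixels and (cx, cy) is the principal point.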
Cool, but some details have to be explained (and there will not be any "automatic short formula").
s represents the scale factor. A single pixel in an image can be the projection of infinitely many different points, due to perspective. In your photo, a pixel containing a piece of a car (while the car is present) will be the same pixel that contains a piece of the street under the car (once the car has passed).
So there is no one-to-one relationship starting from pixel coordinates.
The matrix on the left involves the camera parameters (focal length, etc.), which are called intrinsic parameters. They have to be known in order to build the relationship between 3D coordinates and pixel coordinates.
The matrix on the right seems trivial: it is the combination of an identity matrix, which represents the rotation, and a column of zeros, which represents the translation. In general it is something like T = [R|t].
Which rotation, which translation? You have to consider that every set of coordinates is implicitly expressed in its own reference system. So you have to determine the relationship between the reference system of your measurements and the camera reference system: not only the position of the camera in your 3D space (Euclidean geometry), but also the orientation of the camera (angles).
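To make the relationship concrete, here is a minimal sketch (assuming K, R, t are known; all names are illustrative) that projects a point from your measurement frame into pixel coordinates:

```python
import numpy as np

def project(K, R, t, X_world):
    """K: 3x3 intrinsics; R, t: rotation and translation from the world/measurement
    frame to the camera frame; X_world: 3D point as a 3-vector."""
    X_cam = R @ X_world + t          # world -> camera frame
    p = K @ X_cam                    # apply intrinsic parameters
    s = p[2]                         # the scale factor mentioned above
    return p[:2] / s                 # (x, y) in pixels
```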
Actually, I have rendered 3 input images of a sphere with different light directions in PBRT.
As the next step of the process, I am going to compute surface normals of this sphere, so I need to put the Focal length value in my formula.
All that I know is that the field of view (FOV) in my PBRT input files is 45 degrees.
The dimensions of the whole image are 32x32 pixels, and the dimensions of the sphere in the image are 26x26 pixels.
How can I compute the exact value of the focal length from this information?
You cannot use perspective projection in 3D graphics without knowing the focal length. It is also called z_near, and it is the distance from the camera origin (the point from which you cast the rays) to the projection plane. Look at this:
where the point P near the Camera label is the focal point and the blue rectangle labeled Screen (z_near) is the projection plane. The focal length is the perpendicular distance from this point to the plane.
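For the numbers in the question, assuming the 45-degree FOV spans the full 32-pixel image width, the usual pinhole relation tan(fov/2) = (width/2) / f gives the focal length in pixels:

```python
import math

fov = math.radians(45.0)
width = 32
f = (width / 2) / math.tan(fov / 2)
print(f)   # ~38.6 pixels
```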
PS: boyfarrell is right that you do not need the focal length for normal computations; it does not make sense. You could need it to compute some physical process like pupil size, etc., but not for normals.
I have an image of a 3D rectangle (which, due to projection distortion, is not a rectangle in the image). I know all the world and image coordinates of the corners of this rectangle.
What I need is to determine the world coordinates of a point in the image lying inside this rectangle. To do that I need to compute a transformation to unproject that rectangle to a 2D rectangle.
How can I compute that transform?
Thanks in advance
This is a special case of finding mappings between quadrilaterals that preserve straight lines. These are generally called homographic transforms. Here, one of the quads is a rectangle, so this is a popular special case. You can google these terms ("quad to quad", etc) to find explanations and code, but here are some sites for you.
Perspective Transform Estimation
a gaming forum discussion
extracting a quadrilateral image to a rectangle
Projective Warping & Mapping
Projective Mappings for Image Warping by Paul Heckbert.
The math isn't particularly pleasant, but it isn't that hard either. You can also find some code from one of the above links.
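If you happen to be using OpenCV, a minimal sketch of the rectangle-to-quad case looks like this (corner order and names are illustrative):

```python
import cv2
import numpy as np

def unproject_rectangle(img_corners, rect_w, rect_h):
    """img_corners: the four corners of the quad in the image, in a consistent order.
    rect_w, rect_h: the rectangle's real width and height."""
    src = np.float32(img_corners)
    dst = np.float32([[0, 0], [rect_w, 0], [rect_w, rect_h], [0, rect_h]])
    H = cv2.getPerspectiveTransform(src, dst)   # 3x3 homography
    return H

# Mapping an interior image point (x, y) into the rectangle's own coordinates:
# p = cv2.perspectiveTransform(np.float32([[[x, y]]]), H)[0, 0]
```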
If I understand you correctly, you have a 2D point in the projection of the rectangle, and you know the 3D (world) and 2D (image) coordinates of all four corners of the rectangle. The goal is to find the 3D coordinates of the unique point on the interior of the (3D, world) rectangle which projects to the given point.
(Do steps 1-3 below for both the 3D (world) coordinates, and the 2D (image) coordinates of the rectangle.)
1. Identify (any) one corner of the rectangle as its "origin", and call it "A", which we will treat as a vector.
2. Label the other vertices B, C, D, in order, so that C is diagonally opposite A.
3. Calculate the vectors v = AB and w = AD. These form nice local coordinates for points in the rectangle: points in the rectangle are of the form A + rv + sw, where r and s are real numbers in the range [0, 1]. This is true in world coordinates and in image coordinates. In world coordinates v and w are orthogonal, but in image coordinates they generally are not; that's OK.
4. Working in image coordinates, from the point (x, y) in the image of your rectangle, calculate the values of r and s. This can be done by linear algebra on the vector equation (x, y) = A + rv + sw, where only r and s are unknown. It boils down to a 2x2 matrix equation, which you can solve in code using Cramer's rule. (This step breaks if the determinant of the required matrix is zero, which corresponds to the rectangle being seen edge-on; the solution isn't unique in that case. If that's possible, handle it as a special case.)
5. Using the values of r and s from step 4, compute A + rv + sw with the vectors A, v, w in world coordinates. That's the world point on the rectangle (see the sketch below).
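A minimal sketch of steps 3-5 (using a linear solver instead of Cramer's rule; all names are illustrative):

```python
import numpy as np

def rect_point_to_world(A_img, B_img, D_img, A_w, B_w, D_w, p):
    """A_img, B_img, D_img: image-space corners; A_w, B_w, D_w: world-space corners;
    p: the image point (x, y). All arguments are numpy arrays."""
    # Step 3: local coordinate vectors in image space and world space.
    v_img, w_img = B_img - A_img, D_img - A_img
    v_w, w_w = B_w - A_w, D_w - A_w
    # Step 4: solve (x, y) = A + r*v + s*w for r, s (2x2 linear system).
    M = np.column_stack((v_img, w_img))
    if abs(np.linalg.det(M)) < 1e-12:
        raise ValueError("rectangle seen edge-on; solution not unique")
    r, s = np.linalg.solve(M, p - A_img)
    # Step 5: apply the same r, s to the world-space vectors.
    return A_w + r * v_w + s * w_w
```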