ray tracing: "size" of the depth of field - raytracing

How do I actually define the depth of field, that is, the range of z-values where my objects are in focus? Specifically, I want objects with z-coordinates in [f-w, f+w] (camera coordinates) to be in focus, where f is the focal length and w is some predefined constant.
The way I do it now: I find a primary ray from the lens center to P (a point on the focal plane, z = -f), then shoot rays from random points L on the lens to P.
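In code, that sampling looks roughly like the sketch below; GLM types are used here purely for illustration, and the concrete helpers are assumptions rather than anything from the question.

    // Thin-lens sampling in camera space, as described above: the lens sits at
    // the origin, the camera looks down -z, and the focal plane is at z = -f.
    // 'lensSample' is a random point on the unit disk.
    #include <glm/glm.hpp>

    struct Ray { glm::vec3 origin, dir; };   // minimal ray type for this sketch

    Ray depthOfFieldRay(const glm::vec3& pixelDir,   // direction through the pixel, z < 0
                        float f,                     // distance to the focal plane
                        float apertureRadius,
                        const glm::vec2& lensSample) // random point on the unit disk
    {
        // Where the primary ray through the lens center hits the focal plane z = -f.
        float t = -f / pixelDir.z;
        glm::vec3 P = t * pixelDir;

        // Jittered ray origin on the lens aperture (the lens lies in the z = 0 plane).
        glm::vec3 L(lensSample * apertureRadius, 0.0f);

        // Every lens ray converges at P, so only points near z = -f stay sharp.
        return { L, glm::normalize(P - L) };
    }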
What I am seeing is that the implicit value of w is very small, so there is a very visible band where things are in focus, and everything nearer or farther than that band is blurred. Now, I could play with d (the distance from the lens to the image plane) and the aperture to make a specific scene look OK, but I wanted to see some maths on how to solve this issue properly.
I've looked at several ray tracing books and they all skirt this issue.

In case anyone else looks for this: a 1995 paper by Kolb, Mitchell and Hanrahan, "A Realistic Camera Model for Computer Graphics", describes a lens model of thickness t and how to control the depth of field itself.
Link to paper
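As a back-of-the-envelope complement, assuming an ideal thin lens rather than the thick-lens model of the paper: with lens focal length F, aperture diameter A, and focus distance s_f (the z = -f plane above), a point at distance s is imaged as a circle of confusion of diameter

    c(s) = A * F * |s - s_f| / (s * (s_f - F))

Setting c(s) equal to the largest blur you are willing to call "in focus" (say, one pixel) and solving for s gives the near and far limits of the sharp band. The band is not symmetric around s_f, and it widens as the aperture A shrinks or the focus distance s_f grows, which is exactly what tweaking the aperture in the sampler does implicitly.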

Related

How to "focus zoom" on a spherical camera?

So, for anyone familiar with Google Maps: when you zoom, it zooms around the cursor.
That is to say, the matrix transformation for such a zoom is as simple as:
T * S * T^{-1} * x
where T is the translation matrix representing the point of focus, S is the scale matrix, and x is an arbitrary point on the plane.
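A minimal sketch of that 2D case, written with GLM's homogeneous 3x3 matrices purely for illustration:

    // Zoom-about-cursor for the 2D case: translate the focus point to the
    // origin, scale, then translate back, i.e. T * S * T^{-1}.
    #include <glm/glm.hpp>

    glm::mat3 zoomAboutCursor(const glm::vec2& cursor, float s)
    {
        glm::mat3 T(1.0f), S(1.0f), Tinv(1.0f);
        T[2]    = glm::vec3(cursor, 1.0f);   // translate the focus point back
        Tinv[2] = glm::vec3(-cursor, 1.0f);  // translate the focus point to the origin
        S[0][0] = S[1][1] = s;               // uniform scale by the zoom factor

        // Applied to a column vector x: first T^{-1}, then S, then T.
        return T * S * Tinv;
    }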
Now, I want to produce a similar effect with a spherical camera (think Sketchfab).
When you zoom in and out, the camera needs to be translated so as to give an effect similar to the 2D zooming in Maps. To be more precise: given a fully composed MVP matrix, consider the family of planes parallel to the camera (view) plane. Among those there is a unique plane P that also contains the center of the current spherical camera.
Given that plane, there is a point x that is the unprojection of the current cursor position onto that plane.
If the center of the spherical camera is c, then the direction from c to x is d = x - c.
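A sketch of that construction, assuming GLM-style types (the function name and parameters are illustrative, not from any particular engine):

    // Build the cursor ray in world space, intersect it with the plane through
    // the orbit center c whose normal is the camera's forward axis, and take
    // d = x - c.
    #include <glm/glm.hpp>

    glm::vec3 directionTowardsCursor(const glm::mat4& invViewProj, // inverse of the composed VP matrix
                                     const glm::vec2& ndcCursor,   // cursor in [-1, 1]^2
                                     const glm::vec3& forward,     // camera view direction (normal of P)
                                     const glm::vec3& c)           // center of the spherical camera
    {
        // Unproject two depths of the cursor to get its ray in world space.
        glm::vec4 n = invViewProj * glm::vec4(ndcCursor, -1.0f, 1.0f);
        glm::vec4 f = invViewProj * glm::vec4(ndcCursor,  1.0f, 1.0f);
        glm::vec3 a = glm::vec3(n) / n.w;
        glm::vec3 b = glm::vec3(f) / f.w;
        glm::vec3 dir = glm::normalize(b - a);

        // Intersect the ray with the plane through c parallel to the camera plane.
        float t = glm::dot(c - a, forward) / glm::dot(dir, forward);
        glm::vec3 x = a + t * dir;

        return x - c; // d: the direction along which the zoom translation u lies
    }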
And here's where my challenge comes in. Zooming is implemented as just offsetting the camera radially from the center: given a change in zoom Delta, I need to find the translation vector u, collinear with d, that moves the center of the camera towards x such that I get a visual effect similar to zooming in Google Maps.
Since I know this is a bit hard to parse I tried to make a diagram:
TL;DR
I want to offset a spherical camera towards the cursor when I zoom; how do I pick the translation vector?

Perspective Projection given only field of view

I am working on a Perspective camera. The constructor must be:
PerspectiveCamera::PerspectiveCamera(Vec3f &center, Vec3f &direction, Vec3f &up, float angle)
This construction is different from most others, as it lacks near and far clipping planes. I know what to do with center, direction, and up -- the standard look-at algorithm.
We can construct the view (rotation) matrix and the translation matrix accordingly, and the viewing transformation is their product.
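For reference, a sketch of that standard look-at construction, with GLM types used purely for illustration (it is essentially what glm::lookAt computes):

    #include <glm/glm.hpp>

    glm::mat4 viewMatrix(const glm::vec3& center,     // camera position
                         const glm::vec3& direction,  // viewing direction
                         const glm::vec3& up)
    {
        glm::vec3 w = glm::normalize(direction);
        glm::vec3 r = glm::normalize(glm::cross(w, up)); // camera right axis
        glm::vec3 u = glm::cross(r, w);                  // re-orthogonalized up

        glm::mat4 R(1.0f); // rotation: its rows are the camera axes (right, up, -direction)
        R[0][0] = r.x;  R[1][0] = r.y;  R[2][0] = r.z;
        R[0][1] = u.x;  R[1][1] = u.y;  R[2][1] = u.z;
        R[0][2] = -w.x; R[1][2] = -w.y; R[2][2] = -w.z;

        glm::mat4 T(1.0f); // translation that moves 'center' to the origin
        T[3] = glm::vec4(-center, 1.0f);

        return R * T; // world -> camera (the viewing transformation)
    }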
For an orthographic camera (which is working correctly for me), the inverse transformation is used to go from screen space to world space. The camera coordinates go from (-1,-1,0) --> (1,1,0) in screen space.
For the perspective transformation, only the field of view is given. The Wikipedia 3D projection article gives a perspective projection matrix (call it K) using the field-of-view angle and assuming camera coordinates go from (-1,-1) --> (1,1).
In my code, (ex, ey, ez) are the camera coordinates, which go from (-1,-1,ez) --> (1,1,ez). Note that the 1 in the (3,3) spot of K isn't in the Wikipedia article -- I put it in to make the matrix invertible, so that may be a problem.
But anyway, for perspective projection I used the transformation M^{-1} K^{-1} p: K inverse maps a point p in the canonical view volume back into the view frustum, and multiplying the result by M inverse moves it into world coordinates.
I get the wrong results. The correct output is:
My output looks like this:
Am I using the right algorithm for perspective projection, given my constraint (no near and far plane inputs)?
Just in case somebody else runs into this issue: the method presented in the question is not the proper way to create a viewing frustum. The perspective matrix (K) is for projecting the far plane onto the near plane, and we don't have those planes in this case.
To create a frustum, do the inverse transformation on (x, y, ez) (as opposed to (x, y, 0) for the orthographic projection). Find a new direction by subtracting the transformed point from the center of projection, and shoot the ray.
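A rough sketch of that recipe, assuming the camera looks down -z and the screen spans [-1, 1]; GLM types are used here just for illustration:

    #include <glm/glm.hpp>
    #include <cmath>

    struct Ray { glm::vec3 origin, dir; };  // minimal ray type for this sketch

    Ray generateRay(float x, float y,            // pixel position in [-1, 1]^2
                    float angle,                 // field-of-view angle in radians
                    const glm::mat4& invView,    // camera-to-world transform (M^{-1})
                    const glm::vec3& center)     // center of projection, world space
    {
        // With the screen spanning [-1, 1], a plane at this distance subtends
        // the requested field of view.
        float ez = -1.0f / std::tan(angle * 0.5f);

        // Transform the screen point (x, y, ez) to world space and shoot a ray
        // from the center of projection through it.
        glm::vec3 onScreen = glm::vec3(invView * glm::vec4(x, y, ez, 1.0f));
        return { center, glm::normalize(onScreen - center) };
    }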

Stereoscopic camera for depth measurement

Been reading this paper:
http://photon07.pd.infn.it:5210/users/dazzi/Thesis_doctorate/Info/Chapter_6/Stereoscopy_(Mrovlje).pdf
to figure out how to use two parallel cameras to find the depth of an object. It seems that somehow we need the field of view of the camera at the exact plane of the object (which is the very depth the cameras are trying to measure) to get the depth.
Am I interpreting this wrong? Or does anyone else know how one uses a pair of cameras to measure the distance of an object from the camera pair?
Kelvin
Camera sensors either have to lie in the same plane or their images have to be rectified so that 'virtually' they lie in the same plane. This is the only requirement, and it simplifies the search for matches between the left and right image: whatever you have in the left image will be located in the right image on the same row, so you don't need to check other rows. You can skip this requirement, but then your search will be more extensive. When you are done finding correspondences, you can figure out the depth from them.
With rectified cameras, the depth is determined from the shift: for example, if the left image has a feature at row 4, column 11, and the right image has the same feature at row 4 (the same row, since the cameras are rectified), column 1, then we say the disparity is 11-1=10. The disparity D is inversely proportional to the depth Z:
Z = fB/D, where f is the focal length (in pixels) and B is the distance between the cameras (the baseline).
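As a sanity check with made-up numbers: if f = 700 pixels, B = 0.1 m and the measured disparity is D = 10 pixels, then Z = 700 * 0.1 / 10 = 7 m. A feature with only half that disparity (5 pixels) would be twice as far away (14 m), which is why depth resolution degrades quickly with distance.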
In the end you will have depth estimates wherever you found correspondences. So-called dense stereo aims to cover more than 90% of the image area, while sparse stereo recovers only a few depth measurements.
Note that it is hard to find correspondences if there is little texture on the surface of the object, in other words if it is uniformly colored. Some cameras, such as the Kinect, project their own pattern onto the objects to solve this problem of missing features.

Computing the gradient in volume ray casting

Please help me clear up this question I have about the volume ray casting algorithm:
In the wikipedia article (link), it says that "For each sampling point, a gradient of illumination values is computed. These represent the orientation of local surfaces within the volume."
My question is: Why a gradient of illumination values? Why not opacity values? Surely the transition from "stuff" to "no stuff" is more accurately described by changes in opacity.
Consider, for instance, two voxels: [1][2]. 1 is bright and transparent, and 2 is dark and opaque. In my mind this corresponds to a surface facing left. Am I missing something?
The gradient vector points along the direction of greatest increase. In this case, that means it points in the direction in which illumination increases most rapidly, and so in your [1][2] example the gradient at 2 indeed points left.
Conversely, if you took the gradient of the opacity you'd get a vector pointing in the direction of increasing opacity, i.e. "inwards". You could negate it to get a vector pointing outwards, but there's still the problem that opacity is local: it tells you only about the structure of the object in the immediate surroundings. When you're shading something, what you want to know is how well-lit a surface is as you travel from point to point. Knowing the surfaces of constant illumination allows you to deduce that, whereas from a surface of constant opacity you could only deduce how well-lit it would be if there were nothing between the surface and the light source.
The gradient is the normal vector of an iso-surface. You can think of a volumetric object as a stack of (very many) iso-surfaces, each of which has the same scalar field value everywhere on its surface. The transfer function maps scalar values to opacity and red/green/blue, so each iso-surface gets an assigned transparency and color. This representation makes it easy to understand the role of the volumetric gradient in volume rendering.
In classical ray tracing, you calculate a ray from your eye through a pixel on a screen in front of you and out into the world to see what it hits. Once you determine the intersection point, you need to determine the lighting characteristics at that point. The lighting characteristics depend on the surface normal, and when you think about it, the surface normal is a property determined not just by the intersection point itself but by the surrounding surface.
This same principle applies to ray casting. Once you determine that the ray does hit some voxel, you need to analyze surrounding voxels to effectively calculate a "surface normal" that you can use for the lighting calculations.
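A minimal sketch of that neighbourhood analysis, using central differences on the scalar field; the 'volume' accessor is a hypothetical placeholder, and GLM is used just for the vector type:

    #include <glm/glm.hpp>
    #include <functional>

    // Central-difference gradient of the scalar field at voxel (x, y, z).
    glm::vec3 gradientAt(const std::function<float(int, int, int)>& volume,
                         int x, int y, int z)
    {
        float gx = volume(x + 1, y, z) - volume(x - 1, y, z);
        float gy = volume(x, y + 1, z) - volume(x, y - 1, z);
        float gz = volume(x, y, z + 1) - volume(x, y, z - 1);

        // The gradient is perpendicular to the local iso-surface; for shading,
        // the negated, normalized gradient is commonly used as the surface
        // normal so that it points from denser material towards emptier space.
        return glm::normalize(glm::vec3(gx, gy, gz));
    }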

Captured image viewpoint changing

I have a picture that was captured from a fixed position [X Y Z], a fixed angle [Pitch Yaw Roll], and a focal length F (I think this information is called the camera matrix).
I want to change the captured picture so that it looks as if it had been taken from a different position, e.g. from above.
The resulting image should look like this:
In fact I have a picture taken from this position:
and I want to change my picture so that it looks as if it had been taken from this position:
I hope I have expressed my problem clearly.
Thanks in advance.
It can be done accurately only for the (green) plane itself. The 3D objects standing on the plane will be deformed after remapping, but the deformation may be acceptable if their height is small relative to the camera distance.
If the camera never moves, all you need to do is identify in the perspective image four points that are the four vertices of a rectangle of known size (e.g. the soccer field itself), then compute the homography that maps those four points to that rectangle, and apply it to the whole image.
For details and code, see the OpenCV links at the bottom of that Wikipedia article.
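A minimal sketch of that recipe with OpenCV; the pixel coordinates and output size below are made up and would have to be measured on the actual image:

    #include <opencv2/imgproc.hpp>
    #include <vector>

    cv::Mat topDownView(const cv::Mat& photo)
    {
        // Corners of the known rectangle (e.g. the field) as seen in the photo,
        // in the order top-left, top-right, bottom-right, bottom-left.
        std::vector<cv::Point2f> src = {{412, 285}, {1478, 279}, {1904, 1040}, {33, 1051}};

        // Where those corners should land in the rectified, top-down image.
        std::vector<cv::Point2f> dst = {{0, 0}, {1050, 0}, {1050, 680}, {0, 680}};

        // Homography mapping the photographed rectangle to the top-down rectangle.
        cv::Mat H = cv::getPerspectiveTransform(src, dst);

        cv::Mat warped;
        cv::warpPerspective(photo, warped, H, cv::Size(1050, 680));
        return warped;
    }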
