Currently, I'm taking each corner of my object's bounding box, converting it to Normalized Device Coordinates (NDC), and keeping track of the maximum and minimum NDC values. I then calculate the middle of the NDC range, find the corresponding point in the world, and have my camera look at it.
// <Determine max and minimum NDCs over the bounding-box corners>
centerX = (maxX + minX) / 2;
centerY = (maxY + minY) / 2;
// Unproject the NDC centre back into world space
point.set(centerX, centerY, 0);
projector.unprojectVector(point, camera);
// Aim the camera at a point `distance` along the resulting direction
direction = point.sub(camera.position).normalize();
point = camera.position.clone().add(direction.multiplyScalar(distance));
camera.lookAt(point);
camera.updateMatrixWorld();
This is an approximate method, correct? I have seen it suggested in a few places. I ask because every time I center my object, the min and max NDCs should be equal in magnitude when they are calculated again (before any other change is made), but they are not. I get close but not equal numbers (ignoring the sign), and as I step closer and closer the 'error' between the numbers grows bigger and bigger. For example, the errors for the first few centering passes are: 0.0022566539084770687, 0.00541687811360958, 0.011035676399427596, 0.025670088917273515, 0.06396864345885889, and so on.
Is there a step I'm missing that would cause this?
I'm using this code as part of a while loop to maximize and center the object on screen. (I'm programming it so that the user can enter a heading and elevation, and the camera will be positioned so that it's viewing the object at that heading and elevation. After a few weeks I've determined that, for now, it's easier to do it this way.)
However, this seems to start falling apart the closer I move the camera to my object. For example, after a few iterations my max X NDC is 0.9989318709122867 and my min X NDC is -0.9552042384799428. When I look at the calculated point though, I look too far right and on my next iteration my max X NDC is 0.9420058636660581 and my min X NDC is 1.0128126740876888.
Your approach to this problem is incorrect. Rather than thinking about this in terms of screen coordinates, think about it in terms of the scene.
You need to work out how much the camera needs to move so that a ray from it hits the centre of the object. Imagine you are standing in a field and opposite you are two people, Alex and Burt; Burt is standing 2 meters to the right of Alex. You are currently looking directly at Alex but want to look at Burt without turning. If you know the distance and direction between them, 2 meters and to the right, you merely need to move that distance in that direction, i.e. 2 meters to the right.
In a mathematical context you need to do the following:
Get the centre of the object you are focusing on in 3D space, and then construct a plane parallel to your camera's image plane, i.e. perpendicular to the direction the camera is facing, which passes through that point.
Next, raycast from your camera to that plane in the direction the camera is facing. The difference between the centre point of the object and the point where the ray hits the plane is the amount you need to move the camera. This should work irrespective of the direction or position of the camera and object.
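As a minimal sketch of that calculation (plain vector math with illustrative names, not tied to any particular engine):

import numpy as np

def camera_shift_to_center(cam_pos, cam_dir, obj_center):
    # Translation that moves the camera so its forward ray passes through
    # obj_center, without changing the camera's orientation.
    d = cam_dir / np.linalg.norm(cam_dir)   # normalized view direction (also the plane normal)
    # Intersect the camera's forward ray with the plane through obj_center:
    t = np.dot(obj_center - cam_pos, d)     # ray parameter, since dot(d, d) == 1
    hit = cam_pos + t * d                   # where the camera currently "looks" at that depth
    return obj_center - hit                 # difference = how far to move the camera

# usage: cam_pos = cam_pos + camera_shift_to_center(cam_pos, cam_dir, obj_center)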
You are playing the chicken-and-egg game of what came first: every time you change the camera attributes, you effectively change where your object is projected in NDC space. So even though you think you are getting close, you will never get there.
Look at the problem from a different angle. Place your camera somewhere, make it as canonical as possible (i.e. give it an aspect ratio of 1), and place your object around the camera's z-axis. Is this not possible?
Related
A project I've been working on for the past few months involves calculating the top area of objects captured from a top view with a 3D depth camera.
Workflow of my project:
Capture an image (RGB and depth data) of a group of objects from the top view
Run instance segmentation on the RGB image
Calculate the real area of each segmented mask using the depth data
Some problems with the project:
All given objects have different shapes
The side of an object, not just the top, starts to become visible as the object moves towards the edge of the image.
Because of this, the segmented mask area gradually increases.
As a result, the calculated area of an object located near the edge of the image is larger than that of an object located in the center.
In the example image, object 1 is located in the middle of the field of view, so only its top is visible, but object 2 is located towards the edge of the field of view, so part of its top is lost and its side is visible.
Because of this, the segmented mask area is larger for objects located on the periphery than for objects located in the center.
I only want to find the area of the top of an object.
Example image of what I want:
Is there a way to geometrically correct the area of an object located near the edge of the image?
I tried to correct the calculated area by multiplying it by a factor based on the angle between Vector 1 (from the center of the camera lens to the center of the floor) and Vector 2 (from the center of the lens to the center of gravity of the target object).
However, I gave up because I couldn't logically explain how much correction was needed.
What I would do is convert your RGB and depth images to a 3D mesh (a surface with bumps) using your camera settings (FOVs, focal length), something like this:
Align already captured rgb and depth images
and then project it onto the ground plane (perpendicular to the camera view direction in the middle of the screen). To obtain the ground plane, simply take three 3D positions on the ground, p0, p1, p2 (forming a triangle), and use the cross product to compute the ground normal:
n = normalize(cross(p1-p0,p2-p1))
Now your plane is defined by p0 and n, so just convert each 3D coordinate by moving it along the normal by its signed distance to the plane (subtracting the normal multiplied by that distance); if I see it right, something like this:
p' = p - n * dot(p-p0,n)
That should eliminate the problem with visible sides at the edges of the FOV. However, you should also take into account that when a side is visible, part of the top is hidden as well. To remedy that you might also find the axis of symmetry, use just the half of the top that is not partially hidden, and multiply the measured half area by 2 ...
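For reference, a rough sketch of that plane fit and projection, assuming the depth data has already been converted to an N x 3 point cloud (the helper names here are mine, not from any library):

import numpy as np

def ground_plane(p0, p1, p2):
    # Plane through three ground points, returned as (point, unit normal)
    n = np.cross(p1 - p0, p2 - p1)
    return p0, n / np.linalg.norm(n)

def project_to_plane(points, p0, n):
    # Orthogonal projection onto the plane (p0, n): p' = p - n * dot(p - p0, n)
    d = (points - p0) @ n            # signed distance of each point to the plane
    return points - np.outer(d, n)

# usage sketch:
# p0, n = ground_plane(g0, g1, g2)        # g0, g1, g2: three 3D points on the floor
# flat = project_to_plane(cloud, p0, n)   # cloud: N x 3 points from the depth image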
Accurate computation is virtually hopeless, because you don't see all sides.
Assuming your depth information is available as a range image, you can consider the points inside the segmentation mask of a single chicken, estimate the vertical direction at that point, rotate and project the points to obtain the silhouette.
But as a part of the surface is occluded, you may have to reconstruct it using symmetry.
There is no way to do this accurately for arbitrary objects, since there can be parts of the object that contribute to the "top area", but which the camera cannot see. Since the camera cannot see these parts, you can't tell how big they are.
Since all your objects are known to be chickens, though, you could get a pretty accurate estimate like this:
Use Principal Component Analysis to determine the orientation of each chicken.
Using many objects in many images, find a best-fit polynomial that estimates apparent chicken size by distance from the image center, and orientation relative to the distance vector.
For any given chicken, then, you can divide its apparent size by the estimated average apparent size for its distance and orientation, to get a normalized chicken size measurement.
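As a rough illustration of those steps (hypothetical array names; the single-variable fit in distance is a simplification of the distance-and-orientation polynomial suggested above):

import numpy as np

def chicken_orientation(mask_xy):
    # Principal axis of an N x 2 array of mask pixel coordinates (PCA)
    centered = mask_xy - mask_xy.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    axis = eigvecs[:, np.argmax(eigvals)]   # direction of largest variance
    return np.arctan2(axis[1], axis[0])     # orientation angle in radians

# Fit apparent size as a function of distance from the image centre,
# using measurements collected from many segmented chickens:
# coeffs = np.polyfit(distances, sizes, deg=2)
# expected = np.polyval(coeffs, this_distance)
# normalized_size = this_size / expected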
I'm working on my raytracer and it seems I can't manage to handle the case where the direction vector of my camera is parallel to the vector (0,1,0).
I think it is linked to the way I compute the up and right vectors for the camera, but I can't manage to find a workaround.
Here is how I do it:
// build the camera's right and up vectors from the view direction and the world up (0, 1, 0)
cam_right = vector_cross(cam_dir, {0, 1, 0});
cam_up = vector_cross(cam_right, cam_dir);
Can somebody enlighten me?
You have the correct formula for calculating an orthogonal axis from a single cameraOut vector. However, as has been stated, this formula will not account for the camera roll, which could be any direction in the plane perpendicular to the camera direction. This becomes apparent when moving a camera across the pole (y-axis), as there will be undesirable behavior (yes, it will be correctly aimed, but no doubt the roll won't be what was desired).
For more information, look into gimbal lock.
The roll itself is not really incorrect; however, for this camera transition to be smooth and appear correct (rather than suddenly flipping or spinning as its direction approaches (0,1,0)), you need to correct any roll incurred. This is a rotation about the cameraOut axis and ideally should be relative to the previous cameraAlong. This means that in order to maintain the correct roll (or perceived correct roll) you need to consider the camera pose (position and orientation) from the previous frame and ensure the roll is mitigated. Of course, if the camera doesn't move (i.e. you're rendering a frame with a static camera position) you do not have a previous camera state, so the roll cannot be calculated and must instead be explicitly defined as part of the scene definition.
Personally I store an entire orthogonal axis set for a camera so the orientation and roll are always clearly defined. This is only for completeness; to be honest you don't need to store the entire axis set, as 2 vectors, cameraOut and cameraAlong (the third being cameraUp), are enough. cameraAlong depends on the handedness of your coordinate system (e.g. for an initial camera at, say, position (0,0,0) in a left-handed coordinate system, cameraAlong points to the right relative to the viewer; for a right-handed system it points the other way around. cameraUp and cameraOut are the same in both coordinate systems).
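For the degenerate case in the question, one common fallback (on top of the roll bookkeeping described above) is simply to switch the reference "up" vector when the view direction gets too close to it. A sketch, not tied to any particular renderer:

import numpy as np

def camera_basis(cam_dir, world_up=np.array([0.0, 1.0, 0.0])):
    # Build an orthonormal (right, up, forward) basis from a view direction,
    # switching the reference axis when cam_dir is (nearly) parallel to world_up,
    # where the cross product would degenerate to zero.
    forward = cam_dir / np.linalg.norm(cam_dir)
    if abs(np.dot(forward, world_up)) > 0.999:
        world_up = np.array([0.0, 0.0, 1.0])   # arbitrary alternative reference
    right = np.cross(forward, world_up)
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)              # unit length by construction
    return right, up, forward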
Hope this helps.
P.S This isn't ray tracing specific and the same principles apply for OpenGL/DirectX or any 3D representation.
I would like to draw an artificial horizon. The center of the view would represent a perfectly horizontal view, with roll rotating the horizon line and pitch moving it up or down.
The question is: what is the correct calculation to translate the horizon line up or down (pitch) given the pitch angle.
My guess is that this would probably depend on the FOV angle that one would assume for an assumed camera, so this angle would need to be a factor in the algorithm sought. Ideally I would figure out this angle for the iPhone/iPad camera so that the artificial horizon would line up with the actual horizon if you hold the device in front of you and look towards the horizon.
Until now I've been guesstimating the offset, but I would like to have the exact formula.
Try horizon_offset/(screen_height/2)=tan(pitch)/tan(vertical_FOV/2).
Look at the picture, and the formula derives itself.
(Image source: zwibbler.com)
Update: I had two angles mixed up. One is the FOV angle of the camera; the other is the viewing angle of the screen. These are two different things. The latter depends on the viewing distance. You probably have to estimate this distance, and adjust magnification and/or focal distance so that objects visible on the screen have the same angular size as the same objects seen with the naked eye. (With my particular phone, you would need to magnify the image by an additional factor of about 3 after the 5x zoom, if the user stretches their hand with the phone all the way forward.) Then the two angles are the same, and the formula works.
If you want to introduce magnification (i.e. objects on the screen have different sizes from their real-life counterparts), multiply the horizon offset by the magnification factor.
Update 2: When taking the viewing distance into account, the screen size cancels out, and the offset simply becomes viewing_distance * tan(pitch_angle) (with unit magnification).
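For concreteness, a small sketch of the original formula in code, assuming pitch and the vertical FOV are given in radians and the offset is wanted in pixels (the magnification factor is the one discussed above):

import math

def horizon_offset_px(pitch, vertical_fov, screen_height, magnification=1.0):
    # horizon_offset / (screen_height / 2) = tan(pitch) / tan(vertical_FOV / 2)
    half_height = screen_height / 2.0
    return magnification * half_height * math.tan(pitch) / math.tan(vertical_fov / 2.0)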
We are developing a real-time strategy variant for WP7. At the moment, we need some direction/instruction on how to build an effective camera system. In other words, we would like a camera that can pan around a flat surface (a 2D or 3D level map). We have been experimenting with a 2D tile map, while our units/characters are all 3D models.
At first glance, it appears we need to figure out how to calculate a bounding box around the camera and its entire view, or restrict the movement of the camera, based on what it can see, to the bounds of the 2D map.
Any help would be greatly appreciated!!
Cheers
If you're doing true 2D scrolling, it's pretty simple:
Scroll.X must be between 0 and level.width - screen.width
Scroll.Y must be between 0 and level.height - screen.height
(use MathHelper.Clamp to help you with this)
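If it helps to see it outside XNA, the same per-axis clamp as a tiny language-agnostic sketch (the names are placeholders):

def clamp(value, lo, hi):
    return max(lo, min(hi, value))

def clamp_scroll_2d(scroll_x, scroll_y, level_w, level_h, screen_w, screen_h):
    # Keep the scroll position inside the level bounds
    return (clamp(scroll_x, 0, level_w - screen_w),
            clamp(scroll_y, 0, level_h - screen_h))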
As for 3D it's a little trickier but almost the same principle.
All you really need is to define TWO Vector3 points, one is the lower left back corner and the other the upper right front (or you could do upper left front / lower right back, etc., up to you). These would be your bounding values.
The first one you could define as a readonly with just constant values that you tweak the camera bounds exactly as you wish for that corner. There IS a way of computing this, but honestly I prefer to have more control so I typically choose the route of value tweaking instead.
The second one you could start off with a "base" that you could manually tweak (or compute) just like before but this time you have to add the map width and length (to X and Z) so you know the true bounds depending on the map you have loaded.
Once you have these values, clamp them just as before:
//pans the camera but caps at bounds
//note: MathHelper.Clamp expects (value, min, max), so each pair of bound
//components must be ordered so the smaller one is passed first on that axis
public void ScrollByCheckBounds(Vector3 scroll, Vector3 bottomLeftFront, Vector3 topRightBack)
{
    Vector3 newScroll = Scroll + scroll;
    //clamp each dimension
    newScroll.X = MathHelper.Clamp(newScroll.X, topRightBack.X, bottomLeftFront.X);
    newScroll.Y = MathHelper.Clamp(newScroll.Y, topRightBack.Y, bottomLeftFront.Y);
    newScroll.Z = MathHelper.Clamp(newScroll.Z, bottomLeftFront.Z, topRightBack.Z);
    Scroll = newScroll;
}
I'm new to cocos3d, and I have a problem.
In cocos3d, I want to rotate a node. I have the rotation angles about the x, y, and z axes, and I used the rotation property to rotate it, like this:
theNodeToBeRotated.rotation = cc3v(x,y,z);
But I found that it didn't rotate as I expected, because the documentation says the rotation order is Y-X-Z.
I want to change the order to X-Y-Z. Can anyone let me know how?
You might need to clarify further regarding the following: "it didn't rotate as I expected"
OpenGL ES (and ergo, cocos3D) uses the y-axis as up so the rotation order is still x-y-z. If you are importing a model, you then need to take into account the 3D editor's co-ordinate system and adapt accordingly.
If you are not used to working with three-dimensional representations, the leap from 2D to 3D can be a significant hurdle. Within Cocos3D:
the x-axis is positive on the right and negative on the left
the y-axis is positive upwards and negative downwards
the z-axis is positive moving towards you and negative moving away from you
Envisage those three axes, or even better, a piece of string.
If you are rotating around the x-axis, hold the string horizontally from left to right: the object would rotate towards you or away from you.
If you are rotating around the y-axis, hold the string vertically from feet to head: the object would rotate like a revolving door.
If you are rotating around the z-axis, hold one end close to your chest and the other end as far away as possible: the object would rotate similar to a clock face.
-- Update
I would strongly recommend against changing the rotation order, as it is the OpenGL standard to use Y-X-Z. If you wish to modify it, take a look at CC3GLMatrixMath and look for kmMat4RotationYXZ; there is also kmMat4RotationZYX. If you want to have X-Y-Z, you would need to construct your own rotation matrix and update CC3GLMatrix and CC3GLMatrixMath accordingly.
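To see concretely why the order matters, here is a small sketch (plain matrix math, not the cocos3d API) comparing the Y-X-Z and X-Y-Z compositions for the same three Euler angles:

import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

x, y, z = np.radians([30.0, 45.0, 60.0])
yxz = rot_y(y) @ rot_x(x) @ rot_z(z)   # the Y-X-Z order the documentation describes
xyz = rot_x(x) @ rot_y(y) @ rot_z(z)   # the X-Y-Z order the question asks for
print(np.allclose(yxz, xyz))           # False: same angles, different resulting rotation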
As a reference, you also have the OpenGL Red book - it should have some suggestions for you.