Interpolation of 2D camera with pan and zoom

Interpolation of 2D camera with pan and zoom - animation

I'm developing an animation application with 2D virtual camera. The camera viewport can be positioned and scaled in the keyframes and is then interpolated to render the final animation. I'm looking for the best way to interpolate the camera's parameters of x,y position, scale so that objects in the scene transformed by the camera change size at a constant rate and so that all objects travel in a straight line.
The transform matrix for rendering the scene from the point of view of the camera is calculated from the position and scale as follows, where DimX, DimY are the dimensions of the scene image, and Pos and Scale are the position and scale of the camera (the variables that I want to interpolate).
LCen := PointF(DimX*0.5, DimY*0.5);
CamTransformInv := TMatrix.CreateTranslation(-(Pos.X + LCen.X), -(Pos.Y + LCen.Y));
LScaleInv := 1 / Scale;
CamTransformInv := CamTransformInv * TMatrix.CreateScaling(LScaleInv, LScaleInv);
CamTransformInv := CamTransformInv * TMatrix.CreateTranslation(LCen.X, LCen.Y);
Here's an animation created by linearly interpolating the scale and position. The black line extends from the center of the viewport in the first position to the center of the viewport in the second position. You can see the effect of it appearing to speed up as it zooms in, which I'd like to avoid. On the plus side, all objects in the scene move in a straight line. I've made the animation loop to make the acceleration effect more obvious.
So I modified my code to linearly interpolate the Ln of the scale and take Exp of the result. This gives an exponential interpolation in which the scale change slows down as the camera zooms in, which looks good because objects in the scene then grow at a constant rate. This makes sense: object coordinates are multiplied by the scale, whereas the position is added to them, so the interpolation of scale has to be multiplicative. Taking the log before interpolating achieves this. Position is still linear as before. The problem now is that parts of the image to the sides of the line move in a curve. It doesn't look right (see the top of the tower).
It occurred to me that the problem is that I'm interpolating the scale non-linearly and the position linearly. If I made the position decelerate in the same way that the scale is decelerating, it would look correct. However, I can't think how this would be computed, as the position and scale are coupled in a complex way. If there's no scale change then the position should change linearly, but the greater the scale change, the greater the non-linearity of the position should be.
So is there a standard way of doing this?

This has been answered for me elsewhere. Here is my working code.
CamT.Scale := Exp(LinearInterpolate(Ln(Cam1.Scale), Ln(Cam2.Scale), k));
if Abs(Cam2.Scale - Cam1.Scale) > 0.001 then begin
  // Scale changes: warp the position parameter so it follows the exponential zoom
  r := Cam2.Scale / Cam1.Scale;
  w := (Power(r, k) - 1) / (r - 1);
  CamT.Pos.X := LinearInterpolate(Cam1.Pos.X, Cam2.Pos.X, w);
  CamT.Pos.Y := LinearInterpolate(Cam1.Pos.Y, Cam2.Pos.Y, w);
end else begin
  // Scale is effectively constant: plain linear interpolation (also avoids dividing by zero)
  CamT.Pos.X := LinearInterpolate(Cam1.Pos.X, Cam2.Pos.X, k);
  CamT.Pos.Y := LinearInterpolate(Cam1.Pos.Y, Cam2.Pos.Y, k);
end;
And the resulting zoom.
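For readers who want to experiment outside Delphi, the same interpolation can be sketched in Python. This is only a sketch under assumptions: the `Camera` tuple and the function names are made up for illustration, the 0.001 threshold is carried over from the code above, and `to_screen` applies the camera transform from the question (translate by `-(Pos + Cen)`, scale by `1 / Scale`, translate by `Cen`).

```python
import math
from typing import NamedTuple


class Camera(NamedTuple):
    """Hypothetical camera keyframe: viewport offset and scale factor."""
    x: float
    y: float
    scale: float


def lerp(a: float, b: float, t: float) -> float:
    """Plain linear interpolation between a and b."""
    return a + (b - a) * t


def interpolate_camera(cam1: Camera, cam2: Camera, k: float) -> Camera:
    """Interpolate between two keyframes for k in [0, 1]."""
    # Interpolating the log of the scale makes it change by a constant ratio.
    scale = math.exp(lerp(math.log(cam1.scale), math.log(cam2.scale), k))
    if abs(cam2.scale - cam1.scale) > 0.001:
        # Warp the position parameter so position follows the exponential zoom.
        r = cam2.scale / cam1.scale
        w = (r ** k - 1.0) / (r - 1.0)
    else:
        # Scale is effectively constant; this also avoids dividing by zero.
        w = k
    return Camera(lerp(cam1.x, cam2.x, w), lerp(cam1.y, cam2.y, w), scale)


def to_screen(cam: Camera, px: float, py: float, cen_x: float, cen_y: float):
    """Map a scene point to the screen using the camera transform from the question."""
    return ((px - (cam.x + cen_x)) / cam.scale + cen_x,
            (py - (cam.y + cen_y)) / cam.scale + cen_y)
```

With this weighting, the screen position of any scene point has the form `constant + vector * r^(-k)`, so every point moves along a straight line, which is the property the question asks for.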

Related

How to convert a screen coordinate into a translation for a projection matrix?

(More info at end)----->
I am trying to render a small picture-in-picture display over my scene. The PiP is just a smaller texture, but it is intended to reveal secret objects in the scene when it is placed over them.
To do this, I want to render my scene, then render the SAME scene on the smaller texture, but with the exact same positioning as the main scene. The intended result would be something like this:
My problem is... I cannot get the scene on the smaller texture to match up 1:1. I keep trying various kludges, but ultimately I suspect that I need to do something to the projection matrix to pan it over to the location of the frame. I can get it to zoom correctly...just can't get it to pan.
Can anyone suggest what I need to do to my projection matrix to render my scene 1:1 (but panned by x,y) onto a smaller texture?
The data I have:
Resolution of the full-screen framebuffer
Resolution of the smaller texture
XY coordinate where I want to draw the smaller texture as an overlay sprite
The world/view/projection matrices from the original full-screen scene
The viewport from the original full-screen scene
(Edit)
Here is the function I use to produce the 3D camera:
void Make3DCamera(Vector theCameraPos, Vector theLookAt, Vector theUpVector, float theFOV, Point theRez, Matrix& theViewMatrix,Matrix& theProjectionMatrix)
{
Matrix aCombinedViewMatrix;
Matrix aViewMatrix;
aCombinedViewMatrix.Scale(1,1,-1);
theCameraPos.mZ*=-1;
theLookAt.mZ*=-1;
theUpVector.mZ*=-1;
aCombinedViewMatrix.Translate(-theCameraPos);
Vector aLookAtVector=theLookAt-theCameraPos;
Vector aSideVector=theUpVector.Cross(aLookAtVector);
theUpVector=aLookAtVector.Cross(aSideVector);
aLookAtVector.Normalize();
aSideVector.Normalize();
theUpVector.Normalize();
aViewMatrix.mData.m[0][0] = -aSideVector.mX;
aViewMatrix.mData.m[1][0] = -aSideVector.mY;
aViewMatrix.mData.m[2][0] = -aSideVector.mZ;
aViewMatrix.mData.m[3][0] = 0;
aViewMatrix.mData.m[0][1] = -theUpVector.mX;
aViewMatrix.mData.m[1][1] = -theUpVector.mY;
aViewMatrix.mData.m[2][1] = -theUpVector.mZ;
aViewMatrix.mData.m[3][1] = 0;
aViewMatrix.mData.m[0][2] = aLookAtVector.mX;
aViewMatrix.mData.m[1][2] = aLookAtVector.mY;
aViewMatrix.mData.m[2][2] = aLookAtVector.mZ;
aViewMatrix.mData.m[3][2] = 0;
aViewMatrix.mData.m[0][3] = 0;
aViewMatrix.mData.m[1][3] = 0;
aViewMatrix.mData.m[2][3] = 0;
aViewMatrix.mData.m[3][3] = 1;
if (gG.mRenderToSprite) aViewMatrix.Scale(1,-1,1);
aCombinedViewMatrix*=aViewMatrix;
// Projection Matrix
float aAspect = (float) theRez.mX / (float) theRez.mY;
float aNear = gG.mZRange.mData1;
float aFar = gG.mZRange.mData2;
float aWidth = gMath.Cos(theFOV / 2.0f);
float aHeight = gMath.Cos(theFOV / 2.0f);
if (aAspect > 1.0) aWidth /= aAspect;
else aHeight *= aAspect;
float s = gMath.Sin(theFOV / 2.0f);
float d = 1.0f - aNear / aFar;
Matrix aPerspectiveMatrix;
aPerspectiveMatrix.mData.m[0][0] = aWidth;
aPerspectiveMatrix.mData.m[1][0] = 0;
aPerspectiveMatrix.mData.m[2][0] = gG.m3DOffset.mX/theRez.mX/2;
aPerspectiveMatrix.mData.m[3][0] = 0;
aPerspectiveMatrix.mData.m[0][1] = 0;
aPerspectiveMatrix.mData.m[1][1] = aHeight;
aPerspectiveMatrix.mData.m[2][1] = gG.m3DOffset.mY/theRez.mY/2;
aPerspectiveMatrix.mData.m[3][1] = 0;
aPerspectiveMatrix.mData.m[0][2] = 0;
aPerspectiveMatrix.mData.m[1][2] = 0;
aPerspectiveMatrix.mData.m[2][2] = s / d;
aPerspectiveMatrix.mData.m[3][2] = -(s * aNear / d);
aPerspectiveMatrix.mData.m[0][3] = 0;
aPerspectiveMatrix.mData.m[1][3] = 0;
aPerspectiveMatrix.mData.m[2][3] = s;
aPerspectiveMatrix.mData.m[3][3] = 0;
theViewMatrix=aCombinedViewMatrix;
theProjectionMatrix=aPerspectiveMatrix;
}
Edit to add more information:
Just playing and tweaking numbers, I have come to a "close" result. However the "close" result requires a multiplication by some kludge numbers, that I don't understand.
Here's what I'm doing to the perspective matrix to produce my close result:
//Before calling Make3DCamera, adjusting FOV:
aFOV*=smallerTexture.HeightF()/normalRenderSize.HeightF(); // Zoom it
aFOV*=1.02f; // <- WTH is this?
//Then, to pan the camera over to the x/y position I want, I do:
Matrix aPM=GetCurrentProjectionMatrix();
float aX=(screenX-normalRenderSize.WidthF()/2.0f)/2.0f;
float aY=(screenY-normalRenderSize.HeightF()/2.0f)/2.0f;
aX*=1.07f; // <- WTH is this?
aY*=1.07f; // <- WTH is this?
aPM.mData.m[2][0]=-aX/normalRenderSize.HeightF();
aPM.mData.m[2][1]=-aY/normalRenderSize.HeightF();
SetCurrentProjectionMatrix(aPM);
When I do this, my new picture is VERY close... but not exactly perfect-- the small render tends to drift away from "center" the further the "magic window" is from the center. Without the kludge number, the drift away from center with the magic window is very pronounced.
The kludge numbers 1.02f for zoom and 1.07f for pan reduce the inaccuracies and drift to a fraction of a pixel, but those numbers must be a ratio from somewhere, right? They work at ANY RESOLUTION, though: if I have a 1280x800 screen and a 256x256 magic window texture and then change the screen to 1024x768, it all still works.
Where the heck are these numbers coming from?

If you don't care about sub-optimal performance (i.e., drawing the whole scene twice) and if you don't need the smaller scene in a texture, an easy way to obtain the overlay with pixel perfect precision is:
Set up main scene (model/view/projection matrices, etc.) and draw it as you are now.
Use glScissor to set the rectangle for the overlay. glScissor takes the screen-space x, y, width, and height and discards anything outside that rectangle. It looks like you have those four data items already, so you should be good to go.
Call glEnable(GL_SCISSOR_TEST) to actually turn on the test.
Set the shader variables (if you're using shaders) for drawing the greyscale scene/hidden objects/etc. You still use the same view and projection matrices that you used for the main scene.
Draw the greyscale scene/hidden objects/etc.
Call glDisable(GL_SCISSOR_TEST) so you won't be scissoring at the start of the next frame.
Draw the red overlay border, if desired.
Now, if you actually need the overlay in its own texture for some reason, this probably won't be adequate...it could be made to work either with framebuffer objects and/or pixel readback, but this would be less efficient.

Most people completely overcomplicate such issues. There is absolutely no magic to applying transformations after applying the projection matrix.
If you have a projection matrix P (and I'm assuming default OpenGL conventions here where P is constructed in a way that the vector is post-multiplied to the matrix, so for an eye space vector v_eye, we get v_clip = P * v_eye), you can simply pre-multiply some other translate and scale transforms to cut out any region of interest.
Assume you have a viewport of size w_view * h_view pixels, and you want to find a projection matrix which renders only a tile of w_tile * h_tile pixels, beginning at pixel location (x_tile, y_tile) (again assuming default GL conventions, where the window space origin is bottom left, so y_tile is measured from the bottom). Also note that the _tile coordinates are to be interpreted relative to the viewport. In the typical case, the viewport would start at (0,0) and have the size of your full framebuffer, but this is by no means required nor assumed here.
Since after applying the projection matrix we are in clip space, we need to transform our coordinates from window space pixels to clip space. Note that clip space is a 4D homogeneous space, but we can use any w value we like (except 0) to represent any point (as a point in the 3D space we care about forms a line in the 4D space we work in), so let's just use w=1 for simplicity's sake.
The view volume in clip space is denoted by the [-w,w] range, so in the w=1 hyperplane, it is [-1,1]. Converting our tile into this space yields:
x_clip = 2 * (x_tile / w_view) - 1
y_clip = 2 * (y_tile / h_view) - 1
w_clip = 2 * (w_tile / w_view)
h_clip = 2 * (h_tile / h_view)
We now just need to translate the objects such that the center of the tile is moved to the center of the view volume, which by definition is the origin, and scale the w_clip * h_clip sized region to the full [-1,1] extent in each dimension.
That means:
T = translate(-(x_clip + 0.5*w_clip), -(y_clip + 0.5 *h_clip), 0)
S = scale(2.0/w_clip, 2.0/h_clip, 1.0)
We can now create the modified projection matrix P' as P' = S * T * P, and that's all there is. Rendering with P' instead of P will render exactly the region of your tile to whatever viewport you are using, so for it to be pixel-exact with respect to your original viewport, you must now render with a viewport which is also w_tile * h_tile pixels big.
Note that there is also another approach: the viewport is not clamped against the framebuffer you're rendering to. It is actually valid to provide negative values for x and y. If the framebuffer you render your tile into is exactly w_tile * h_tile pixels, you could simply set glViewport(-x_tile, -y_tile, w_view, h_view) and render with the unmodified projection matrix P instead.
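The construction of P' can be checked numerically with a few lines of plain Python. This is a minimal sketch, not GL code: the helper names are made up for illustration, matrices are nested lists applied to column vectors (v_clip = P * v_eye, as in the answer), and the tile size in clip space is taken as 2 * w_tile / w_view (a size is scaled but not offset).

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translate(tx, ty, tz):
    """Translation matrix for column vectors."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def scale(sx, sy, sz):
    """Scaling matrix."""
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]


def tile_projection(p, x_tile, y_tile, w_tile, h_tile, w_view, h_view):
    """Return P' = S * T * P, which renders only the given tile of the viewport."""
    # Lower-left corner of the tile in the [-1, 1] range of the w = 1 plane.
    x_clip = 2.0 * x_tile / w_view - 1.0
    y_clip = 2.0 * y_tile / h_view - 1.0
    # Tile extent in the same units.
    w_clip = 2.0 * w_tile / w_view
    h_clip = 2.0 * h_tile / h_view
    # Move the tile center to the origin, then stretch the tile to [-1, 1].
    t = translate(-(x_clip + 0.5 * w_clip), -(y_clip + 0.5 * h_clip), 0.0)
    s = scale(2.0 / w_clip, 2.0 / h_clip, 1.0)
    return mat_mul(s, mat_mul(t, p))


def transform(m, v):
    """Apply a 4x4 matrix to a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
```

For example, with an identity P, an 800 x 600 viewport, and a 200 x 150 tile at (100, 50), the tile's lower-left corner sits at (-0.75, -0.8333) before the change and lands on (-1, -1) after applying P'. Because the translation column is multiplied by w, the same holds for clip-space points with w other than 1.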

Invariant scale geometry

I am writing a mesh editor where I have manipulators with the help of which I change the vertices of the mesh. The task is to render the manipulators with constant dimensions, which would not change when changing the camera and viewport parameters. The projection matrix is perspective. I will be grateful for ideas how to implement the invariant scale geometry.

If I got it right, you want to render some markers (for example a vertex drag-editing area) with the same visual size at whatever depth they are rendered.
There are 2 approaches for this:
1. scale with depth
Compute the perpendicular distance to the camera (a simple dot product) and scale the marker so that its visual size does not depend on depth.
So if P0 is your camera position and Z is your camera view direction unit vector (usually the Z axis), then for any position P compute the depth like this:
depth = dot(P-P0,Z)
The scale now depends on the wanted visual size size0 at some specified depth0. Using triangle similarity we want:
size/depth = size0/depth0
size = size0*depth/depth0
so render your marker with size, or with scale depth/depth0. If you use scaling, you need to scale around your target position P, otherwise your marker would shift to the sides (so translate, scale, translate back).
2. compute screen position and use non-perspective rendering
Transform the target coordinates the same way the graphics pipeline does until you get the screen x,y position. Remember it, and in the pass that renders your markers use it instead of the real position. For this rendering pass either use some constant depth (distance from camera) or use a non-perspective view matrix.
For more info see Understanding 4x4 homogenous transform matrices
[Edit1] pixel size
you need to use FOVx,FOVy projection angles and view/screen resolution (xs,ys) for that. That means if depth is znear and coordinate is at half of the angle then the projected coordinate will go to edge of screen:
tan(FOVx/2) = (xs/2)*pixelx/znear
tan(FOVy/2) = (ys/2)*pixely/znear
---------------------------------
pixelx = 2*znear*tan(FOVx/2)/xs
pixely = 2*znear*tan(FOVy/2)/ys
Where pixelx,pixely is the size (per axis) that visually represents a single pixel at depth znear. If both sizes are the same (so the pixel is square) you have all you need. If they are not equal (the pixel is not square), you need to render the markers in screen-axis-aligned coordinates, so approach #2 is more suitable for that case.
So if you choose depth0=znear, you can set size0 to n*pixelx and/or n*pixely to get a visual size of n pixels. Or use any depth0 and rewrite the computation to:
pixelx = 2*depth0*tan(FOVx/2)/xs
pixely = 2*depth0*tan(FOVy/2)/ys
Just to be complete:
size0x = size_in_pixels*(2*depth0*tan(FOVx/2)/xs)
size0y = size_in_pixels*(2*depth0*tan(FOVy/2)/ys)
-------------------------------------------------
sizex = size_in_pixels*(2*depth0*tan(FOVx/2)/xs)*(depth/depth0)
sizey = size_in_pixels*(2*depth0*tan(FOVy/2)/ys)*(depth/depth0)
---------------------------------------------------------------
sizex = size_in_pixels*(2*tan(FOVx/2)/xs)*(depth)
sizey = size_in_pixels*(2*tan(FOVy/2)/ys)*(depth)
---------------------------------------------------------------
sizex = size_in_pixels*2*depth*tan(FOVx/2)/xs
sizey = size_in_pixels*2*depth*tan(FOVy/2)/ys
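The final formulas translate directly into code. The following Python sketch is only an illustration: the function names are made up, `fov` is the full field-of-view angle in radians for the axis in question, and `resolution` is the screen size in pixels along that axis.

```python
import math


def view_depth(p, p0, z_dir):
    """Perpendicular distance of p from the camera at p0 along the unit view direction z_dir."""
    return sum((a - b) * c for a, b, c in zip(p, p0, z_dir))


def marker_size(size_in_pixels, depth, fov, resolution):
    """World-space size that covers size_in_pixels on screen at the given depth."""
    return size_in_pixels * 2.0 * depth * math.tan(fov / 2.0) / resolution


# Example: an 8-pixel marker on an 800-pixel-wide screen with a 90 degree FOVx.
depth = view_depth((3.0, 4.0, 10.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
size_x = marker_size(8, depth, math.radians(90.0), 800)
```

Note that the result grows linearly with depth, so a marker twice as far away is drawn twice as large in world units and therefore keeps the same size on screen.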

Direct3D9 Calculating view space point light position

I am working on my own deferred rendering engine. I am rendering the scene to the g-buffer containing diffuse color, view space normals and depth (for now). I have implemented a directional light for the second rendering stage and it works great. Now I want to render a point light, which is a bit harder.
I need the point light position for the shader in view space because I have only depth in the g-buffer and I can't afford a matrix multiplication in every pixel. I took the light position and transformed it by the same matrix by which I transform every vertex in the shader, so it should align with the vertices in the scene (using D3DXVec3Transform). But that isn't the case: the transformed position doesn't represent the view space position at all. Its x,y coordinates are off the charts; they are often way out of the (-1,1) range. The transformed position respects the camera orientation somewhat, but the light moves too quickly and the y-axis is inverted. Only if the camera is at (0,0,0) does the light stand at (0,0) in the center of the screen. Here is my relevant rendering code, executed every frame:
D3DXMATRIX matView; // the view transform matrix
D3DXMATRIX matProjection; // the projection transform matrix
D3DXMatrixLookAtLH(&matView,
&D3DXVECTOR3 (x,y,z), // the camera position
&D3DXVECTOR3 (xt,yt,zt), // the look-at position
&D3DXVECTOR3 (0.0f, 0.0f, 1.0f)); // the up direction
D3DXMatrixPerspectiveFovLH(&matProjection,
fov, // the field of view in the y direction, in radians
asp, // aspect ratio
znear, // the near view-plane
zfar); // the far view-plane
D3DXMATRIX vysl=matView*matProjection;
eff->SetMatrix("worldViewProj",&vysl); //vertices are transformed ok in shader
//render g-buffer
D3DXVECTOR4 lpos; D3DXVECTOR3 lpos2(0,0,0);
D3DXVec3Transform(&lpos,&lpos2,&vysl); //transforming lpos2 into lpos using vysl, still the same matrix
eff->SetVector("poslight",&lpos); //but there is already a mess in lpos at this time
//render the fullscreen quad with wrong lighting
Not that relevant shader code, but still, I see the light position this way (passing IN.texture is just me being lazy):
float dist=length(float2(IN.texture0*2-1)-float2(poslight.xy));
OUT.col=tex2D(Sdiff,IN.texture0)/dist;
I have tried to transform a light only by matView without projection, but the problem is still the same. If I transform the light in a shader, it's the same result, so the problem is the matrix itself. But it is the same matrix as is transforming the vertices! How differently are vertices treated?
Can you please take a look at the code and tell me where the mistake is? It seems to me it should work ok, but it doesn't. Thanks in advance.

You don't need a matrix multiplication to reconstruct the view space position. Here is a code snippet (from Andrew Lauritzen's deferred lighting example).
tP is the projection transform, positionScreen is the pixel coordinate in the -1..1 range, and viewSpaceZ is the linear depth that you sample from your texture.
float3 ViewPosFromDepth(float2 positionScreen,
float viewSpaceZ)
{
float2 screenSpaceRay = float2(positionScreen.x / tP._11,
positionScreen.y / tP._22);
float3 positionView;
positionView.z = viewSpaceZ;
positionView.xy = screenSpaceRay.xy * positionView.z;
return positionView;
}
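The reconstruction can be sanity-checked on the CPU. Below is a small Python sketch of the same arithmetic; the function names are made up for illustration, and it assumes a standard perspective projection where `_11 = cot(fovY/2) / aspect` and `_22 = cot(fovY/2)`.

```python
import math


def perspective_scales(fov_y, aspect):
    """Return the (_11, _22) entries of a standard perspective projection matrix."""
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    return y_scale / aspect, y_scale


def view_pos_from_depth(position_screen, view_space_z, p11, p22):
    """Rebuild a view-space position from -1..1 screen coordinates and linear depth."""
    ray_x = position_screen[0] / p11
    ray_y = position_screen[1] / p22
    return (ray_x * view_space_z, ray_y * view_space_z, view_space_z)


# Round trip: project a known view-space point, then reconstruct it.
p11, p22 = perspective_scales(math.radians(90.0), 2.0)
view_point = (2.0, 1.0, 4.0)
screen = (view_point[0] * p11 / view_point[2], view_point[1] * p22 / view_point[2])
rebuilt = view_pos_from_depth(screen, view_point[2], p11, p22)
```

The round trip works because projection multiplies x and y by `_11` and `_22` and divides by the view-space z, so dividing by the matrix entries and multiplying by z undoes it.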

The result of this transform, D3DXVec3Transform(&lpos,&lpos2,&vysl);, is a vector in homogeneous space (i.e. a projected vector that has not yet been divided by w). But in your shader you use its xy components without taking w into account. This is (quite probably) the problem. You could divide the vector by its w yourself, or use D3DXVec3Project instead of D3DXVec3Transform.
It works fine for vertices because (I suppose) you multiply them by the same view-projection matrix in the vertex shader and pass the transformed values to the interpolator, where the hardware eventually divides their xyz by the interpolated w.
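To see the effect of the missing divide, here is a small Python sketch. It is an illustration under assumptions: the matrix layout follows the one documented for D3DXMatrixPerspectiveFovLH (row vectors, v' = v * M), and the function names are made up.

```python
import math


def perspective_fov_lh(fov_y, aspect, zn, zf):
    """Left-handed perspective matrix for row vectors (layout as documented for D3DX)."""
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    x_scale = y_scale / aspect
    q = zf / (zf - zn)
    return [[x_scale, 0.0, 0.0, 0.0],
            [0.0, y_scale, 0.0, 0.0],
            [0.0, 0.0, q, 1.0],
            [0.0, 0.0, -zn * q, 0.0]]


def transform_row(v, m):
    """Multiply the row vector (x, y, z, 1) by a 4x4 matrix; returns (x, y, z, w)."""
    row = (v[0], v[1], v[2], 1.0)
    return tuple(sum(row[k] * m[k][j] for k in range(4)) for j in range(4))


def to_ndc(clip):
    """Perspective divide: homogeneous clip coordinates to normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


proj = perspective_fov_lh(math.radians(90.0), 1.0, 1.0, 100.0)
clip = transform_row((1.0, 2.0, 4.0), proj)  # x, y still carry a factor of w
ndc = to_ndc(clip)                           # now within the -1..1 range
```

Here `clip` has y close to 2 and w equal to 4, which is outside the (-1,1) range the question expects, while `ndc` has x = 0.25 and y = 0.5 after the divide.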

Invisible, interactable objects in AS3 -- how to code efficient invisibility?

Alpha invisibility.
I currently define circular regions on some images as "hot spots". For instance, I could have my photo on screen and overlay a circle on my head. To check for interaction with my head in realtime, I would returnOverlaps and do some manipulation on all objects overlapping the circle. For debugging, I make the circle yellow with alpha 0.5, and for release I decrease alpha to 0, making the circle invisible (as it should be).
Does this slow down the program? Is there another way to make the circle itself invisible while still remaining capable of interaction? Is there some way to color it "invisible" without using a (potentially) costly alpha of 0? Cache as bitmap matrix? Or some other efficient way to solve the "hot spot" detection without using masks?

Having just a few invisible display objects should not slow it down that much, but having many could. I think a cleaner option may be to handle it all in code, rather than have actual invisible display objects on the stage.
For a circle, you would define the center point and radius. Then to get if anyone clicked on it, you could go:
var xDist:Number = circle.x - mousePoint.x;
var yDist:Number = circle.y - mousePoint.y;
if((xDist * xDist) + (yDist * yDist) <= (circle.radius * circle.radius)){
// mousePoint is within circle
} else {
// mousePoint is outside of circle
}
If you insist on using display objects to set these circular hit areas (sometimes it is easier to do visually than by numbers), you could also write some code that reads those display objects in to get their positions and radii, and then removes them from being rendered.
added method:
// inputX and inputY are the hotspot's x and y positions, and inputRadius is the radius of the hotspot
function hitTestObj(inputA:DisplayObject, inputX:int, inputY:int, inputRadius:int):Boolean {
var xDist:Number = inputX - inputA.x;
var yDist:Number = inputY - inputA.y;
var minDist:Number = inputRadius + (inputA.width / 2);
return (((xDist * xDist) + (yDist * yDist)) <= (minDist * minDist));
}

An alpha=0 isn't all that costly in terms of rendering, as Flash Player will optimize for that (check here for actual figures). Bitmap caching wouldn't be of any help as the sprite is invisible. There are other ways to perform collision detection by doing the math yourself (more relevant in games with tens or even hundreds of sprites), but that would be overkill in your case.

Rotating an image with the mouse

I am writing a drawing program, Whyteboard -- http://code.google.com/p/whyteboard/
I have implemented image rotating functionality, except that its behaviour is a little odd. I can't figure out the proper logic to rotate the image in relation to the mouse position.
My code is something similar to this:
(these are called from a mouse event handler)
def resize(self, x, y, direction=None):
    """Rotate the image"""
    self.angle += 1
    if self.angle > 360:
        self.angle = 0
    self.rotate()

def rotate(self, angle=None):
    """Rotate the image (in radians), turn it back into a bitmap"""
    rad = (2 * math.pi * self.angle) / 360
    if angle:
        rad = (2 * math.pi * angle) / 360
    img = self.img.Rotate(rad, (0, 0))
So, basically the angle to rotate the image keeps getting increased when the user moves the mouse. However, this sometimes means you have to "circle" the mouse many times to rotate an image 90 degrees, let alone 360.
But, I need it similar to other programs - how the image is rotated in relation to your mouse's position to the image.
This is the bit I'm having trouble with. I've left the question language-independent, although using Python and wxPython it could be applicable to any language

I'm assuming resize() is called for every mouse movement update. Your problem seems to be the self.angle += 1, which updates your angle by 1 degree on each mouse event.
A solution to your problem would be: pick the point on the image where the rotation will be centered (in this case, it's your (0,0) point in self.img.Rotate(), but usually it is the center of the image). The rotation angle should be the angle formed by the line that goes from this point to the mouse cursor, minus the angle formed by the line that goes from this point to the mouse position when the user clicked.
To calculate the angle between two points, use math.atan2(y2-y1, x2-x1) which will give you the angle in radians. (you may have to change the order of the subtractions depending on your mouse position axis).
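As a minimal sketch of that calculation (the function name and the tuple-based points are assumptions for illustration, not part of Whyteboard or wxPython):

```python
import math


def drag_angle(center, start, current):
    """Angle in radians swept from the drag start point to the current mouse point.

    All three arguments are (x, y) tuples; center is the rotation pivot.
    """
    start_angle = math.atan2(start[1] - center[1], start[0] - center[0])
    current_angle = math.atan2(current[1] - center[1], current[0] - center[0])
    return current_angle - start_angle


# Dragging from the right of the pivot to directly "below" it in screen
# coordinates (y grows downward) sweeps a quarter turn.
quarter_turn = drag_angle((100, 100), (150, 100), (100, 150))
```

The result is a difference of two angles in (-pi, pi], so it can fall outside that range when the drag crosses the negative x axis; whether it needs to be normalized depends on how the rotation routine treats angles.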

fserb's solution is the way I would go about the rotation too, but something additional to consider is your use of:
img = self.img.Rotate(rad, (0, 0))
If you are performing a bitmap image rotation in response to every mouse drag event, you are going to get a lot of data loss from the combined effect of all the interpolation required for the rotation. For example, rotating by 1 degree 360 times will give you a much blurrier image than the original.
Try having a rotation system something like this:
display_img = self.img.Rotate(rad, pos)
then use the display_img image while you are in rotation mode. When you end rotation mode (onMouseUp maybe), img = display_img.
This type of strategy is good whenever you have a lossy operation with a user preview.

Here's the solution in the end,
def rotate(self, position, origin):
    """ position: mouse x/y position, origin: x/y to rotate around"""
    origin_angle = self.find_angle(origin, self.center)
    mouse_angle = self.find_angle(position, self.center)
    angle = mouse_angle - origin_angle
    # do the rotation here

def find_angle(self, a, b):
    try:
        answer = math.atan2((a[0] - b[0]) , (a[1] - b[1]))
    except:
        answer = 0
    return answer
