PIXI.js - Canvas Coordinate to Container Coordinate - transformation

I have initiated a PIXI js canvas:
g_App = new PIXI.Application(800, 600, { backgroundColor: 0x1099bb });
Set up a container:
container = new PIXI.Container();
g_App.stage.addChild(container);
Put a background texture (2000x2000) into the container:
var texture = PIXI.Texture.fromImage('picBottom.png');
var back = new PIXI.Sprite(texture);
container.addChild(back);
Set the global:
var g_Container = container;
I do various pivot points and rotations on container and canvas stage element:
// Set the focus point of the container
g_App.stage.x = Math.floor(400);
g_App.stage.y = Math.floor(500); // Note this one is not central
g_Container.pivot.set(1000, 1000);
g_Container.rotation = 1.5; // radians
Now I need to be able to convert a canvas pixel to the pixel on the background texture.
g_Container has an element transform which in turn has several elements localTransform, pivot, position, scale ands skew. Similarly g_App.stage has the same transform element.
In Maths this is simple, you just have vector point and do matix operations on them. Then to go back the other way you just find inverses of those matrices and multiply backwards.
So what do I do here in pixi.js?
How do I convert a pixel on the canvas and see what pixel it is on the background container?

Note: The following is written using the USA convention of using matrices. They have row vectors on the left and multiply them by the matrix on the right. (Us pesky Brits in the UK do the opposite. We have column vectors on the right and multiply it by the matrix on the left. This means UK and USA matrices to do the same job will look slightly different.)
Now I have confused you all, on with the answer.
g_Container.transform.localTransform - this matrix takes the world coords to the scaled/transposed/rotated COORDS
g_App.stage.transform.localTransform - this matrix takes the rotated world coords and outputs screen (or more accurately) html canvas coords
So for example the Container matrix is:
MatContainer = [g_Container.transform.localTransform.a, g_Container.transform.localTransform.b, 0]
[g_Container.transform.localTransform.c, g_Container.transform.localTransform.d, 0]
[g_Container.transform.localTransform.tx, g_Container.transform.localTransform.ty, 1]
and the rotated container matrix to screen is:
MatToScreen = [g_App.stage.transform.localTransform.a, g_App.stage.transform.localTransform.b, 0]
[g_App.stage.transform.localTransform.c, g_App.stage.transform.localTransform.d, 0]
[g_App.stage.transform.localTransform.tx, g_App.stage.transform.localTransform.ty, 1]
So to get from World Coordinates to Screen Coordinates (noting our vector will be a row on the left, so the first operation matrix that acts first on the World coordinates must also be on the left), we would need to multiply the vector by:
MatAll = MatContainer * MatToScreen
So if you have a world coordinate vector vectWorld = [worldX, worldY, 1.0] (I'll explain the 1.0 at the end), then to get to the screen coords you would do the following:
vectScreen = vectWorld * MatAll
So to get screen coords and to get to world coords we first need to calculate the inverse matrix of MatAll, call it invMatAll. (There are loads of places that tell you how to do this, so I will not do it here.)
So if we have screen (canvas) coordinates screenX and screenY, we need to create a vector vectScreen = [screenX, screenY, 1.0] (again I will explain the 1.0 later), then to get to world coordinates worldX and worldY we do:
vectWorld = vectScreen * invMatAll
And that is it.
So what about the 1.0?
In a 2D system you can do rotations, scaling with 2x2 matrices. Unfortunately you cannot do a 2D translations with a 2x2 matrix. Consequently you need 3x3 matrices to fully describe all 2D scaling, rotations and translations. This means you need to make your vector 3D as well, and you need to put a 1.0 in the third position in order to do the translations properly. This 1.0 will also be 1.0 after any matrix operation as well.
Note: If we were working in a 3D system we would need 4x4 matrices and put a dummy 1.0 in our 4D vectors for exactly the same reasons.

Related

Inverse Camera Intrinsic Matrix for Image Plane at Z = -1

A similar question was asked before, unfortunately I cannot comment Samgaks answer so I open up a new post with this one. Here is the link to the old question:
How to calculate ray in real-world coordinate system from image using projection matrix?
My goal is to map from image coordinates to world coordinates. In fact I am trying to do this with the Camera Intrinsics Parameters of the HoloLens Camera.
Of course this mapping will only give me a ray connecting the Camera Optical Centre and all points, which can lie on that ray. For the mapping from image coordinates to world coordinates we can use the inverse camera matrix which is:
K^-1 = [1/fx 0 -cx/fx; 0 1/fy -cy/fy; 0 0 1]
Pcam = K^-1 * Ppix;
Pcam_x = P_pix_x/fx - cx/fx;
Pcam_y = P_pix_y/fy - cy/fy;
Pcam_z = 1
Orientation of Camera Coordinate System and Image Plane
In this specific case the image plane is probably at Z = -1 (However, I am a bit uncertain about this). The Section Pixel to Application-specified Coordinate System on page HoloLens CameraProjectionTransform describes how to go form pixel coordinates to world coordinates. To what I understand two signs in the K^-1 are flipped s.t. we calculate the coordinates as follows:
Pcam_x = (Ppix_x/fx) - (cx*(-1)/fx) = P_pix_x/fx + cx/fx;
Pcam_y = (Ppix_y/fy) - (cy*(-1)/fy) = P_pix_y/fy + cy/fy;
Pcam_z = -1
Pcam = (Pcam_x, Pcam_y, -1)
CameraOpticalCentre = (0,0,0)
Ray = Pcam - CameraOpticalCentre
I do not understand how to create the Camera Intrinsics for the case of the image plane being at a negative Z-coordinate. And I would like to have a mathematical explanation or intuitive understanding of why we have the sign flip (P_pix_x/fx + cx/fx instead of P_pix_x/fx - cx/fx).
Edit: I read in another post that the thirst column of the camera matrix has to be negated for the case that the camera is facing down the negative z-direction. This would explain the sign flip. However, why do we need to change the sign of the third column. I would like to have a intuitive understanding of this.
Here the link to the post Negation of third column
Thanks a lot in advance,
Lisa
why do we need to change the sign of the third column
To understand why we need to negate the third column of K (i.e. negate the principal points of the intrinsic matrix) let's first understand how to get the pixel coordinates of a 3D point already in the camera coordinates frame. After that, it is easier to understand why -z requires negating things.
let's imagine a Camera c, and one point B in the space (w.r.t. the camera coordinate frame), let's put the camera sensor (i.e. image) at E' as in the image below. Therefore f (in red) will be the focal length and ? (in blue) will be the x coordinate in pixels of B (from the center of the image). To simplify things let's place B at the corner of the field of view (i.e. in the corner of the image)
We need to calculate the coordinates of B projected into the sensor d (which is the same as the 2d image). Because the triangles AEB and AE'B' are similar triangles then ?/f = X/Z therefore ? = X*f/Z. X*f is the first operation of the K matrix is. We can multiply K*B (with B as a column vector) to check.
This will give us coordinates in pixels w.r.t. the center of the image. Let's imagine the image is size 480x480. Therefore B' will look like this in the image below. Keep in mind that in image coordinates, the y-axis increases going down and the x-axis increases going right.
In images, the pixel at coordinates 0,0 is in the top left corner, therefore we need to add half of the width of the image to the point we have. then px = X*f/Z + cx. Where cx is the principal point in the x-axis, usually W/2. px = X*f/Z + cx is exactly as doing K * B / Z. So X*f/Z was -240, if we add cx (W/2 = 480/2 = 240) and therefore X*f/Z + cx = 0, same with the Y. The final pixel coordinates in the image are 0,0 (i.e. top left corner)
Now in the case where we use z as negative, when we divide X and Y by Z, because Z is negative, it will change the sign of X and Y, therefore it will be projected to B'' at the opposite quadrant as in the image below.
Now the second image will instead be:
Because of this, instead of adding the principal point, we need to subtract it. That is the same as negating the last column of K.
So we have 240 - 240 = 0 (where the second 240 is the principal point in x, cx) and the same for Y. The pixel coordinates are 0,0 as in the example when z was positive. If we do not negate the last column we will end up with 480,480 instead of 0,0.
Hope this helped a little bit

Convert Cubemap coordinates to equivalents in Equirectangular

I have a set of coordinates of a 6-image Cubemap (Front, Back, Left, Right, Top, Bottom) as follows:
[ [160, 314], Front; [253, 231], Front; [345, 273], Left; [347, 92], Bottom; ... ]
Each image is 500x500p, being [0, 0] the top-left corner.
I want to convert these coordinates to their equivalents in equirectangular, for a 2500x1250p image. The layout is like this:
I don't need to convert the whole image, just the set of coordinates. Is there any straight-forward conversion por a specific pixel?
convert your image+2D coordinates to 3D normalized vector
the point (0,0,0) is the center of your cube map to make this work as intended. So basically you need to add the U,V direction vectors scaled to your coordinates to 3D position of texture point (0,0). The direction vectors are just unit vectors where each axis has 3 options {-1, 0 , +1} and only one axis coordinate is non zero for each vector. Each side of cube map has one combination ... Which one depends on your conventions which we do not know as you did not share any specifics.
use Cartesian to spherical coordinate system transformation
you do not need the radius just the two angles ...
convert the spherical angles to your 2D texture coordinates
This step depends on your 2D texture geometry. The simplest is rectangular texture (I think that is what you mean by equirectangular) but there are other mappings out there with specific features and each require different conversion. Here few examples:
Bump-map a sphere with a texture map
How to do a shader to convert to azimuthal_equidistant
For the rectangle texture you just scale the spherical angles into texture resolution size...
U = lon * Usize/(2*Pi)
V = (lat+(Pi/2)) * Vsize/Pi
plus/minus some orientation signs to match your coordinate systems.
btw. just found this (possibly duplicate QA):
GLSL Shader to convert six textures to Equirectangular projection

Rotating an image matrix around its center in MATLAB

Assume I have a 2x2 matrix filled with values which will represent a plane. Now I want to rotate the plane around itself in a 3-D way, in the "z-Direction". For a better understanding, see the following image:
I wondered if this is possible by a simple affine matrix, thus I created the following simple script:
%Create a random value matrix
A = rand*ones(200,200);
%Make a box in the image
A(50:200-50,50:200-50) = 1;
Now I can apply transformations in the 2-D room simply by a rotation matrix like this:
R = affine2d([1 0 0; .5 1 0; 0 0 1])
tform = affine3d(R);
transformed = imwarp(A,tform);
However, this will not produce the desired output above, and I am not quite sure how to create the 2-D affine matrix to create such behavior.
I guess that a 3-D affine matrix can do the trick. However, if I define a 3-D affine matrix I cannot work with the 2-D representation of the matrix anymore, since MATLAB will throw the error:
The number of dimensions of the input image A must be 3 when the
specified geometric transformation is 3-D.
So how can I code the desired output with an affine matrix?
The answer from m3tho correctly addresses how you would apply the transformation you want: using fitgeotrans with a 'projective' transform, thus requiring that you specify 4 control points (i.e. 4 pairs of corresponding points in the input and output image). You can then apply this transform using imwarp.
The issue, then, is how you select these pairs of points to create your desired transformation, which in this case is to create a perspective projection. As shown below, a perspective projection takes into account that a viewing position (i.e. "camera") will have a given view angle defining a conic field of view. The scene is rendered by taking all 3-D points within this cone and projecting them onto the viewing plane, which is the plane located at the camera target which is perpendicular to the line joining the camera and its target.
Let's first assume that your image is lying in the viewing plane and that the corners are described by a normalized reference frame such that they span [-1 1] in each direction. We need to first select the degree of perspective we want by choosing a view angle and then computing the distance between the camera and the viewing plane. A view angle of around 45 degrees can mimic the sense of perspective of normal human sight, so using the corners of the viewing plane to define the edge of the conic field of view, we can compute the camera distance as follows:
camDist = sqrt(2)./tand(viewAngle./2);
Now we can use this to generate a set of control points for the transformation. We first apply a 3-D rotation to the corner points of the viewing plane, rotating around the y axis by an amount theta. This rotates them out of plane, so we now project the corner points back onto the viewing plane by defining a line from the camera through each rotated corner point and finding the point where it intersects the plane. I'm going to spare you the mathematical derivations (you can implement them yourself from the formulas in the above links), but in this case everything simplifies down to the following set of calculations:
term1 = camDist.*cosd(theta);
term2 = camDist-sind(theta);
term3 = camDist+sind(theta);
outP = [-term1./term2 camDist./term2; ...
term1./term3 camDist./term3; ...
term1./term3 -camDist./term3; ...
-term1./term2 -camDist./term2];
And outP now contains your normalized set of control points in the output image. Given an image of size s, we can create a set of input and output control points as follows:
scaledInP = [1 s(1); s(2) s(1); s(2) 1; 1 1];
scaledOutP = bsxfun(#times, outP+1, s([2 1])-1)./2+1;
And you can apply the transformation like so:
tform = fitgeotrans(scaledInP, scaledOutP, 'projective');
outputView = imref2d(s);
newImage = imwarp(oldImage, tform, 'OutputView', outputView);
The only issue you may come across is that a rotation of 90 degrees (i.e. looking end-on at the image plane) would create a set of collinear points that would cause fitgeotrans to error out. In such a case, you would technically just want a blank image, because you can't see a 2-D object when looking at it edge-on.
Here's some code illustrating the above transformations by animating a spinning image:
img = imread('peppers.png');
s = size(img);
outputView = imref2d(s);
scaledInP = [1 s(1); s(2) s(1); s(2) 1; 1 1];
viewAngle = 45;
camDist = sqrt(2)./tand(viewAngle./2);
for theta = linspace(0, 360, 360)
term1 = camDist.*cosd(theta);
term2 = camDist-sind(theta);
term3 = camDist+sind(theta);
outP = [-term1./term2 camDist./term2; ...
term1./term3 camDist./term3; ...
term1./term3 -camDist./term3; ...
-term1./term2 -camDist./term2];
scaledOutP = bsxfun(#times, outP+1, s([2 1])-1)./2+1;
tform = fitgeotrans(scaledInP, scaledOutP, 'projective');
spinImage = imwarp(img, tform, 'OutputView', outputView);
if (theta == 0)
hImage = image(spinImage);
set(gca, 'Visible', 'off');
else
set(hImage, 'CData', spinImage);
end
drawnow;
end
And here's the animation:
You can perform a projective transformation that can be estimated using the position of the corners in the first and second image.
originalP='peppers.png';
original = imread(originalP);
imshow(original);
s = size(original);
matchedPoints1 = [1 1;1 s(1);s(2) s(1);s(2) 1];
matchedPoints2 = [1 1;1 s(1);s(2) s(1)-100;s(2) 100];
transformType = 'projective';
tform = fitgeotrans(matchedPoints1,matchedPoints2,'projective');
outputView = imref2d(size(original));
Ir = imwarp(original,tform,'OutputView',outputView);
figure; imshow(Ir);
This is the result of the code above:
Original image:
Transformed image:

Invariant scale geometry

I am writing a mesh editor where I have manipulators with the help of which I change the vertices of the mesh. The task is to render the manipulators with constant dimensions, which would not change when changing the camera and viewport parameters. The projection matrix is perspective. I will be grateful for ideas how to implement the invariant scale geometry.
If I got it right you want to render some markers (for example vertex drag editation area) with the same visual size for any depth they are rendered to.
There are 2 approaches for this:
scale with depth
compute perpendicular distance to camera view (simple dot product) and scale the marker size so it has the same visual size invariant on the depth.
So if P0 is your camera position and Z is your camera view direction unit vector (usually Z axis). Then for any position P compute the scale like this:
depth = dot(P-P0,Z)
Now the scale depends on wanted visual size0 at some specified depth0. Now using triangle similarity we want:
size/dept = size0/depth0
size = size0*depth/depth0
so render your marker with size or scale depth/depth0. In case of using scaling you need to scale around your target position P otherwise your marker would shift to the sides (so translate, scale, translate back).
compute screen position and use non perspective rendering
so you transform target coordinates the same way as the graphic pipeline does until you got the screen x,y position. Remember it and in pass that will render your markers just use that instead of real position. For this rendering pass either use some constant depth (distance from camera) or use non perspective view matrix.
For more info see Understanding 4x4 homogenous transform matrices
[Edit1] pixel size
you need to use FOVx,FOVy projection angles and view/screen resolution (xs,ys) for that. That means if depth is znear and coordinate is at half of the angle then the projected coordinate will go to edge of screen:
tan(FOVx/2) = (xs/2)*pixelx/znear
tan(FOVy/2) = (ys/2)*pixely/znear
---------------------------------
pixelx = 2*znear*tan(FOVx/2)/xs
pixely = 2*znear*tan(FOVy/2)/ys
Where pixelx,pixely is size (per axis) representing single pixel visually at depth znear. In case booth sizes are the same (so pixel is square) you have all you need. In case they are not equal (pixel is not square) then you need to render markers in screen axis aligned coordinates so approach #2 is more suitable for such case.
So if you chose depth0=znear then you can set size0 as n*pixelx and/or n*pixely to get the visual size of n pixels. Or use any dept0 and rewrite the computation to:
pixelx = 2*depth0*tan(FOVx/2)/xs
pixely = 2*depth0*tan(FOVy/2)/ys
Just to be complete:
size0x = size_in_pixels*(2*depth0*tan(FOVx/2)/xs)
size0y = size_in_pixels*(2*depth0*tan(FOVy/2)/ys)
-------------------------------------------------
sizex = size_in_pixels*(2*depth0*tan(FOVx/2)/xs)*(depth/depth0)
sizey = size_in_pixels*(2*depth0*tan(FOVy/2)/ys)*(depth/depth0)
---------------------------------------------------------------
sizex = size_in_pixels*(2*tan(FOVx/2)/xs)*(depth)
sizey = size_in_pixels*(2*tan(FOVy/2)/ys)*(depth)
---------------------------------------------------------------
sizex = size_in_pixels*2*depth*tan(FOVx/2)/xs
sizey = size_in_pixels*2*depth*tan(FOVy/2)/ys

openGL reverse image texturing logic

I'm about to project image into cylindrical panorama. But first I need to get the pixel (or color from pixel) I'm going to draw, then then do some Math in shaders with polar coordinates to get new position of pixel and then finally draw pixel.
Using this way I'll be able to change shape of image from polygon shape to whatever I want.
But I cannot find anything about this method (get pixel first, then do the Math and get new position for pixel).
Is there something like this, please?
OpenGL historically doesn't work that way around; it forward renders — from geometry to pixels — rather than backwards — from pixel to geometry.
The most natural way to achieve what you want to do is to calculate texture coordinates based on geometry, then render as usual. For a cylindrical mapping:
establish a mapping from cylindrical coordinates to texture coordinates;
with your actual geometry, imagine it placed within the cylinder, then from each vertex proceed along the normal until you intersect the cylinder. Use that location to determine the texture coordinate for the original vertex.
The latter is most easily and conveniently done within your geometry shader; it's a simple ray intersection test, with attributes therefore being only vertex location and vertex normal, and texture location being a varying that is calculated purely from the location and normal.
Extemporaneously, something like:
// get intersection as if ray hits the circular region of the cylinder,
// i.e. where |(position + n*normal).xy| = 1
float planarLengthOfPosition = length(position.xy);
float planarLengthOfNormal = length(normal.xy);
float planarDistanceToPerimeter = 1.0 - planarLengthOfNormal;
vec3 circularIntersection = position +
(planarDistanceToPerimeter/planarLengthOfNormal)*normal;
// get intersection as if ray hits the bottom or top of the cylinder,
// i.e. where |(position + n*normal).z| = 1
float linearLengthOfPosition = abs(position.z);
float linearLengthOfNormal = abs(normal.z);
float linearDistanceToEdge = 1.0 - linearLengthOfPosition;
vec3 endIntersection = position +
(linearDistanceToEdge/linearLengthOfNormal)*normal;
// pick whichever of those was lesser
vec3 cylindricalIntersection = mix(circularIntersection,
endIntersection,
step(linearDistanceToEdge,
planarDistanceToPerimeter));
// ... do something to map cylindrical intersection to texture coordinates ...
textureCoordinateVarying =
coordinateFromCylindricalPosition(cylindricalIntersection);
With a common implementation of coordinateFromCylindricalPosition possibly being simply return vec2(atan(cylindricalIntersection.y, cylindricalIntersection.x) / 6.28318530717959, cylindricalIntersection.z * 0.5);.

Resources