Inverse Camera Intrinsic Matrix for Image Plane at Z = -1

A similar question was asked before; unfortunately I cannot comment on Samgak's answer, so I am opening a new post. Here is the link to the old question:
How to calculate ray in real-world coordinate system from image using projection matrix?
My goal is to map from image coordinates to world coordinates. In fact I am trying to do this with the Camera Intrinsics Parameters of the HoloLens Camera.
Of course this mapping will only give me a ray connecting the camera optical centre and all the points which can lie on that ray. For the mapping from image coordinates to camera coordinates we can use the inverse camera matrix, which is:
K^-1 = [1/fx 0 -cx/fx; 0 1/fy -cy/fy; 0 0 1]
Pcam = K^-1 * Ppix;
Pcam_x = P_pix_x/fx - cx/fx;
Pcam_y = P_pix_y/fy - cy/fy;
Pcam_z = 1
Orientation of Camera Coordinate System and Image Plane
In this specific case the image plane is probably at Z = -1 (however, I am a bit uncertain about this). The section Pixel to Application-specified Coordinate System on the page HoloLens CameraProjectionTransform describes how to go from pixel coordinates to world coordinates. As far as I understand, two signs in K^-1 are flipped such that we calculate the coordinates as follows:
Pcam_x = (Ppix_x/fx) - (cx*(-1)/fx) = P_pix_x/fx + cx/fx;
Pcam_y = (Ppix_y/fy) - (cy*(-1)/fy) = P_pix_y/fy + cy/fy;
Pcam_z = -1
Pcam = (Pcam_x, Pcam_y, -1)
CameraOpticalCentre = (0,0,0)
Ray = Pcam - CameraOpticalCentre
I do not understand how to create the Camera Intrinsics for the case of the image plane being at a negative Z-coordinate. And I would like to have a mathematical explanation or intuitive understanding of why we have the sign flip (P_pix_x/fx + cx/fx instead of P_pix_x/fx - cx/fx).
Edit: I read in another post that the third column of the camera matrix has to be negated when the camera is facing down the negative z-direction. This would explain the sign flip. However, why do we need to change the sign of the third column? I would like to have an intuitive understanding of this.
Here is the link to the post: Negation of third column
Thanks a lot in advance,
Lisa

why do we need to change the sign of the third column
To understand why we need to negate the third column of K (i.e. negate the principal points of the intrinsic matrix), let's first understand how to get the pixel coordinates of a 3D point that is already in the camera coordinate frame. After that, it is easier to understand why -z requires negating things.
Let's imagine a camera C and one point B in space (w.r.t. the camera coordinate frame), and let's put the camera sensor (i.e. the image) at E' as in the image below. Then f (in red) is the focal length and x' (in blue) is the x coordinate of B in pixels (from the center of the image). To simplify things, let's place B at the corner of the field of view (i.e. in the corner of the image).
We need to calculate the coordinates of B projected onto the sensor (which is the same as the 2D image). Because the triangles AEB and AE'B' are similar, x'/f = X/Z and therefore x' = X*f/Z. Multiplying by f is the first operation the K matrix performs. We can multiply K*B (with B as a column vector) to check.
This will give us coordinates in pixels w.r.t. the center of the image. Let's imagine the image is size 480x480. Therefore B' will look like this in the image below. Keep in mind that in image coordinates, the y-axis increases going down and the x-axis increases going right.
In images, the pixel at coordinates 0,0 is in the top-left corner, therefore we need to add half of the width of the image to the point we have: px = X*f/Z + cx, where cx is the principal point in the x-axis, usually W/2. Computing px = X*f/Z + cx is exactly the same as doing K * B / Z. In our example X*f/Z was -240; if we add cx (W/2 = 480/2 = 240) we get X*f/Z + cx = 0, and the same for Y. The final pixel coordinates in the image are 0,0 (i.e. the top-left corner).
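For example, a quick numeric check in MATLAB (the values of f, X, Y and Z are made up so that B lands exactly at the corner; they are not taken from the figures):
f = 240; cx = 240; cy = 240;     % 480x480 image, principal point at the center
K = [f 0 cx; 0 f cy; 0 0 1];
B = [-2; -2; 2];                 % point in camera coordinates, Z > 0
p = K * B;                       % homogeneous pixel coordinates
p = p / p(3);                    % divide by Z
% p(1:2) is now [0; 0], i.e. the top-left corner, as in the example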
Now, in the case where we use a negative z, dividing X and Y by Z changes their sign (because Z is negative), so the point is projected to B'' in the opposite quadrant, as in the image below.
Now the second image will instead be:
Because of this, instead of adding the principal point, we need to subtract it. That is the same as negating the last column of K.
So we have 240 - 240 = 0 (where the second 240 is the principal point in x, cx) and the same for Y. The pixel coordinates are 0,0 as in the example when z was positive. If we do not negate the last column we will end up with 480,480 instead of 0,0.
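And the same check with the camera looking down the negative z axis (same made-up values as above), showing why the last column has to be negated:
f = 240; cx = 240; cy = 240;          % same made-up intrinsics as above
Kneg = [f 0 -cx; 0 f -cy; 0 0 -1];    % K with its last column negated
B = [-2; -2; -2];                     % the same point, but now Z < 0
p = Kneg * B;
p = p / p(3);
% p(1:2) is again [0; 0]; with the original K we would get [480; 480] instead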
Hope this helped a little bit

Related

How to find the pixel location of a GPS point in an orthoimage with known orientation and GPS location

I have a problem where I need to determine whether a given latitude, longitude GPS-point is in a given orthoimage (approx. 1 hectare area) with known real-world orientation and GPS-location (corresponding to the center of image).
That is, given a GPS-point P, I need to determine:
1. Is point P located in the orthoimage, and if yes,
2. What is the pixel location of point P in the orthoimage?
My question is summarized in the following image:
As you can see in the image, I know the GPS-coordinates of the image (center) and where North is located with respect to the image. Also, I know how many centimeters in the ground each pixel corresponds to.
My question is: What would be an efficient and smart way to achieve the goals in my problem?
One approach I had in mind was to solve a linear mapping between the GPS- and pixel-points and then use this mapping to answer both problems 1-2. I thought this could be a reasonable approach even though the earth has curvature and the GPS-coordinates are (I'd say) more like a parabolic function of the pixel coordinates: since the distances are very small (one image covers approximately 1 hectare), I could assume without significant loss in accuracy that the GPS-coordinates change locally linearly w.r.t. pixel coordinates.
What do you think? Thank you.
Update:
The orthophotos have been taken with a Phantom 4 Pro drone with gimbal camera system.
I thought about one possibility myself, not perfect but it's a start:
The following information is given:
a rectangular orthoimage Img, the Yaw of the image (that is, how many degrees the image is facing away from north), and pix_size, the pixel size on the ground (centimeters/pixel).
The problem is: Given an arbitrary GPS-point p = (lat, long), determine the pixel location of p in Img.
Denote c = (latc, longc) and cp = (x,y) as the GPS- and pixel-coordinates of the center point of Img.
Determine how much we must move along the North-South and West-East axes to get from c to p. Let lat_delta = latc-lat and long_delta = longc-long. If lat_delta < 0, p is further north than c, otherwise p is further south than c. The same goes analogously for long_delta.
if lat_delta < 0:
    pN = [latc + abs(lat_delta), longc]
else:
    pN = [latc - abs(lat_delta), longc]

if long_delta < 0:
    pE = [latc, longc + abs(long_delta)]
else:
    pE = [latc, longc - abs(long_delta)]
Now the points c, p, pN and pE form a "spherical" right triangle (I think I can safely assume it to be planar because the orthophoto covers at most a 1 hectare area), so the Pythagorean theorem applies well enough for my purposes.
Next, I calculate the ground distances dN = Haversine(c, pN) and dE = Haversine(c, pE), which tell me how far I must move along the North-South and West-East axes in order to get from c to p.
Now I will apply a rotation matrix R(-Yaw) to vectors n = [0,1] and e = [1,0], which represent the upwards and right vectors in my pixel coordinate system. So I get nr = R(-Yaw)*n and er = R(-Yaw)*e where nr is a unit pixel vector pointing towards North in the image and er is similarly a unit pixel vector pointing towards East in the image.
Next, I calculate the ratios mN = dN/pix_size and mE = dE/pix_size (the factors also need to take into account the +- direction). Now I calculate the pixel location of p by:
pp = cp + mN*nr + mE*er,
where I can now easily check if the pixel values pp are within the bounds of the image Img.
Of course this method does not work in a general large area case and needs to be refined for this purpose.
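A minimal MATLAB sketch of the steps above; all numeric values (coordinates, yaw, pixel size, image size) are hypothetical, and the sign conventions may need flipping depending on whether the image y-axis points up or down:
% hypothetical inputs
latc = 60.0;     longc = 24.0;      % GPS of the image centre (degrees)
lat  = 60.0004;  long_ = 24.0006;   % GPS of the query point p (degrees)
yaw  = 15;                          % image yaw away from north (degrees)
pix_size = 2.0;                     % centimetres per pixel on the ground
cp = [512; 512];                    % pixel coordinates of the image centre
img_size = [1024; 1024];            % width and height of Img in pixels

% Haversine ground distance in centimetres between two GPS points
R_earth = 6371000 * 100;
hav = @(la1, lo1, la2, lo2) 2 * R_earth * asin(sqrt( ...
      sind((la2 - la1)/2).^2 + cosd(la1).*cosd(la2).*sind((lo2 - lo1)/2).^2));

dN = hav(latc, longc, lat,  longc);  % distance along the North-South axis
dE = hav(latc, longc, latc, long_);  % distance along the West-East axis
sN = sign(lat   - latc);             % +1 if p is north of c, -1 if south
sE = sign(long_ - longc);            % +1 if p is east of c,  -1 if west

% unit pixel vectors pointing towards North and East in the image
Rz = [cosd(-yaw) -sind(-yaw); sind(-yaw) cosd(-yaw)];
nr = Rz * [0; 1];
er = Rz * [1; 0];

% pixel location of p and bounds check
pp = cp + (sN * dN / pix_size) * nr + (sE * dE / pix_size) * er;
inside = all(pp >= 1) && all(pp <= img_size);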

Calculate 3D distance based on change in intensity

I have three sections (top, mid, bot) of grayscale images (3D). In each section, I have a point with coordinates (x,y) and intensity values [0-255]. The distance between each section is 20 pixels.
I created an illustration to show how those images were generated using a microscope:
Illustration
Illustration (side view): the red line is the object of interest. Blue stars represent the dots which are visible in the top, mid and bot sections. The (x,y) coordinates of these dots are known. The length of the object remains the same, but it can rotate in space and go 'out of focus' (the illustration shows a rotating line at time point 5). At time point 1, the red line is resting (in the 2D image: 2 dots with a distance equal to the length of the object).
I want to estimate the x,y,z-coordinates of the end points (represented as stars) by using the changes in intensity, the knowledge about the length of the object and the information in the sections I have. Any help would be appreciated.
Here is an example of images:
Bot section
Mid section
Top section
My 3D PSF data:
https://drive.google.com/file/d/1qoyhWtLDD2fUy2zThYUgkYM3vMXxNh64/view?usp=sharing
Attempt so far:
I guess the correct approach would be to record three images with slightly different z-coordinates for your bot and your top frame, then do a 3D-deconvolution (using Richardson-Lucy or whatever algorithm).
However, a simpler approach would be the one I have outlined in my comment. If you use the data for a publication, I strongly recommend emphasizing that this is just an estimate and including the steps of how you obtained it.
I'd suggest the following procedure:
Since I do not have your PSF data, I fake some by estimating the PSF as a 3D Gaussian. Of course, this is a strong simplification, but you should be able to get the idea behind it.
First, fit a Gaussian to the PSF along z:
[xg, yg, zg] = meshgrid(-32:32, -32:32, -32:32);
rg = sqrt(xg.^2+yg.^2);
psf = exp(-(rg/8).^2) .* exp(-(zg/16).^2);
% add some noise to make it a bit more realistic
psf = psf + randn(size(psf)) * 0.05;
% view psf:
%
subplot(1,3,1);
s = slice(xg,yg,zg, psf, 0,0,[]);
title('faked PSF');
for i=1:2
s(i).EdgeColor = 'none';
end
% data along z through PSF's center
z = reshape(psf(33,33,:),[65,1]);
subplot(1,3,2);
plot(-32:32, z);
title('PSF along z');
% Fit the data
% Generate a function for a gaussian distribution plus some background
gauss_d = @(x0, sigma, bg, x)exp(-1*((x-x0)/(sigma)).^2)+bg;
ft = fit ((-32:32)', z, gauss_d, ...
'Start', [0 16 0] ... % You may find proper start points by looking at your data
);
subplot(1,3,3);
plot(-32:32, z, '.');
hold on;
plot(-32:.1:32, feval(ft, -32:.1:32), 'r-');
title('fit to z-profile');
The function that relates the intensity I to the z-coordinate is
gauss_d = @(x0, sigma, bg, x)exp(-1*((x-x0)/(sigma)).^2)+bg;
You can re-arrange this formula for x. Due to the square root, there are two possibilities:
% now make a function that returns the z-coordinate from the intensity
% value:
zfromI = @(I)ft.sigma * sqrt(-1*log(I-ft.bg))+ft.x0;
zfromI2= @(I)ft.sigma * -sqrt(-1*log(I-ft.bg))+ft.x0;
Note that the PSF I have faked is normalized to have one as its maximum value. If your PSF data is not normalized, you can divide the data by its maximum.
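For example, something along these lines (assuming psf holds your measured data):
psf = double(psf) / max(psf(:));   % convert to double and normalize so the maximum is 1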
Now, you can use zfromI or zfromI2 to get the z-coordinate for your intensity. Again, I should be normalized, that is, it should be the fraction of the measured intensity relative to the intensity of your reference spot:
zfromI(.7)
ans =
9.5469
>> zfromI2(.7)
ans =
-9.4644
Note that due to the random noise I have added, your results might look slightly different.

Calculate the angles of a pixel to a camera plane in a depth-image

I have a z-image from a ToF Camera (Kinect V2). I do not have the pixel size, but I know that the depth image has a resolution of 512x424. I also know that I have a fov of 70.6x60 degrees.
I asked before how to get the pixel size here. In MATLAB this code looks like the following.
The brighter the pixel, the closer the object.
close all
clear all
%Load image
depth = imread('depth_0_30_0_0.5.png');
frame_width = 512;
frame_height = 424;
horizontal_scaling = tan((70.6 / 2) * (pi/180));
vertical_scaling = tan((60 / 2) * (pi/180));
%pixel size
with_size = horizontal_scaling * 2 .* (double(depth)/frame_width);
height_size = vertical_scaling * 2 .* (double(depth)/frame_height);
The image itself is a cube rotated by 30 degrees and can be seen here:
What I want to do now is calculate the horizontal angle of a pixel to the camera-plane and the vertical angle to the camera plane.
I tried to do this with triangulation: I calculate the z-distance from one pixel to another, first in the horizontal direction and then in the vertical direction. I do this with a convolution:
%get the horizontal errors
dx = abs(conv2(depth,[1 -1],'same'));
%get the vertical errors
dy = abs(conv2(depth,[1 -1]','same'));
After this I calculate it via the atan, like this:
horizontal_angle = rad2deg(atan(with_size ./ dx));
vertical_angle = rad2deg(atan(height_size ./ dy));
% set NaNs (from divisions by zero) to 0
horizontal_angle(isnan(horizontal_angle)) = 0;
vertical_angle(isnan(vertical_angle)) = 0;
Which gives back promising results, like these:
However, using a little bit more complex image like this, which is turned by 60° and 30°.
Gives back the same angle images for horizontal and vertical angles, which look like this:
After subtracting both images from each other, I get the following image - which shows that there is a difference between those two.
So, I have the following questions: How can I prove this concept? Is the math correct and the test case just poorly chosen? Is the difference between the horizontal and vertical angles in the two images simply too small? Are there any errors in the calculation?
While my previous code may look good, it had a flaw. I tested it with smaller images (5x5, 3x3 and so on) and saw that there is an offset created by the difference picture (dx, dy) made by the convolution. It is simply not possible to map the difference picture (which holds the difference between two pixels) to the pixels themselves, since the difference picture is smaller than the original one.
For a fast fix, I do a downsampling. So I changed the filter mask to:
%get the horizontal differences
dx = abs(conv2(depth,[1 0 -1],'valid'));
%get the vertical differences
dy = abs(conv2(depth,[1 0 -1]','valid'));
And changed the angle function to:
%get the angles by the tangent
horizontal_angle = rad2deg(atan(with_size(2:end-1,2:end-1)...
./ dx(2:end-1,:)))
vertical_angle = rad2deg(atan(height_size(2:end-1,2:end-1)...
./ dy(:,2:end-1)))
Also I used a padding function to get the angle map to the same size as the original images.
horizontal_angle = padarray(horizontal_angle,[1 1],0);
vertical_angle = padarray(vertical_angle,[1 1],0);

How to check a point is inside an ellipsoid with orientation?

For an ellipsoid of the form
x^2/a^2 + y^2/b^2 + z^2/c^2 <= 1 (in its own axis-aligned frame),
with orientation vector r and centre at point p, how do I find whether a point q is inside the ellipsoid or not?
An additional note: the geometry actually has a = b (a spheroid), and therefore one axis is sufficient to define the orientation.
Note: I see a similar question asked in the forum, but it is about an ellipsoid at the origin without any arbitrary orientation, whereas here both arbitrary position and orientation are considered.
Find the affine transform M that turns this ellipsoid into an axis-aligned one (a translation by -p and a rotation aligning the orientation vector r with the proper coordinate axis).
Then apply this transform to the point you want to test and check that the transformed point lies inside the axis-aligned ellipsoid, i.e.
x^2/a^2+ y^2/b^2+z^2/c^2 <= 1
Create a coordinate system E with its center at p and with the long axis of the ellipsoid aligned with r. Create a matrix that transforms global coordinates into the coordinate system E. Then put the transformed coordinates into the ellipsoid equation.
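For illustration, a minimal MATLAB sketch of this transform-based test under the a = b (spheroid) assumption from the question; the centre p, orientation r, semi-axes and test point q are placeholder values:
p = [1; 2; 3];          % centre of the spheroid
r = [0; 0; 1];          % unit vector along the distinct (symmetry) axis
a = 2; b = 2; c = 5;    % semi-axes, a = b equatorial, c along r
q = [1.5; 2; 6];        % point to test

% build an orthonormal basis whose third axis is r
tmp = [1; 0; 0];
if abs(dot(tmp, r)) > 0.9, tmp = [0; 1; 0]; end
e1 = cross(r, tmp); e1 = e1 / norm(e1);
e2 = cross(r, e1);
E  = [e1 e2 r];         % columns are the spheroid's axes in world coordinates

% transform q into the spheroid's local frame and apply the standard test
ql = E' * (q - p);
inside = (ql(1)/a)^2 + (ql(2)/b)^2 + (ql(3)/c)^2 <= 1;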
A center point p and an "orientation vector" r do not suffice to completely specify the position of the ellipsoid, there is one degree of freedom left. Your problem is indeterminate.
If your vector r is a unit vector from the origin to the pole, then the test for whether a point q is in (or on) the ellipse is:
v = q-p; // 3d vector difference
dot = v.r; // 3d dot product
f = dot*dot;
g = v.v - f; // 3d dot product and scalar subtraction
return f/(b*b) + g/(a*a) <= 1
Note that if the ellipse was aligned so that r was the z unit vector, then the test above translates into the usual test for inclusion of a point in an ellipse.
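For reference, the same test written out in MATLAB (again with placeholder values; as in the pseudocode above, b is the semi-axis along r and a the equatorial semi-axis):
p = [1; 2; 3];  r = [0; 0; 1];   % centre and unit orientation vector
a = 2;  b = 5;                   % equatorial semi-axis a, polar semi-axis b
q = [1.5; 2; 6];                 % point to test

v = q - p;
f = dot(v, r)^2;                 % squared component of v along the pole axis
g = dot(v, v) - f;               % squared component perpendicular to it
inside = f/(b*b) + g/(a*a) <= 1;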

Projection of a plane onto a cylinder

I have a plain bitmap and I want to do a projection on a cylinder.
That means I want to transform the image in such a way that if I print it, wrap it around a cylindrical column and photograph it from a certain position, the resulting image looks like the original.
Still I'm quite lost in all the projection algorithms (that are often related to earth projections).
So I'd be thankful for hints what the correct algorithm could be and which tools I could use to apply it to my image.
Let's say you have a rectangular image of length L and height H, and a cylinder of radius R and height H'.
Let A (x,z) be a point in the picture,
Then A' (x',y',z') = ( R*cos(x*(2Pi/L)) , R*sin(x*(2Pi/L)) , z*(H'/H)) will be the projection of your point A on your cylinder.
Proof:
1. z' = z*(H'/H)
I first fit the cylinder to the image size, which is why I multiply by (H'/H), and I keep the same z axis (if you draw it you will see it immediately).
2. x' and y'?
I project each line of my image onto a circle. The parametric equation of a circle is (R*cos(t), R*sin(t)) for t in [0,2Pi]; this equation maps a segment (t in [0,2Pi]) to a circle, which is exactly what we are trying to do.
So if x describes a line of length L, then x*(2Pi)/L describes a line of length 2Pi, and I can use the parametric equation to map each point of this line to a circle.
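As a small illustration, here is that forward mapping sketched in MATLAB; L, H, R and the cylinder height are made-up values, and H' is written as Hc because quotes are not valid in variable names:
L  = 800;  H  = 600;        % image width (length) and height in pixels
R  = 50;   Hc = 120;        % cylinder radius and height (arbitrary units)
x  = 200;  z  = 150;        % a point A(x, z) in the picture
% projection A'(xc, yc, zc) of A onto the cylinder
xc = R * cos(x * 2*pi / L);
yc = R * sin(x * 2*pi / L);
zc = z * (Hc / H);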
Hope it helps
The previous answer gave the function to "press" a plane against a cylinder.
This is a bijection, so from a given point in the cylinder you can easily get the original image.
For a point A(x,y,z) on the cylinder, the corresponding point A'(x',z') in the image is:
z' = z*(H/H')
and x' = L/(2Pi)* { arccos(x/R) *(sign(y)) (mod(2Pi)) }
(it's a pretty ugly formula but that's it :D and you need to express the modulo as a positive value)
If you can apply that to your cylindrical image you get how to uncoil your picture.
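For illustration, a sketch of the inverse mapping in MATLAB, using atan2 in place of the arccos/sign/modulo combination above (it is equivalent, but handles the sign and the branch in one call); the sizes continue the made-up values from the previous sketch:
L  = 800;  H = 600;  R = 50;  Hc = 120;    % same made-up sizes as before
xc = R * cos(200 * 2*pi / L);              % a point on the cylinder surface
yc = R * sin(200 * 2*pi / L);
zc = 150 * (Hc / H);
% uncoil it back into image coordinates
t  = mod(atan2(yc, xc), 2*pi);   % angle expressed as a positive value in [0, 2*pi)
xi = L * t / (2*pi);             % recovers x = 200
zi = zc * (H / Hc);              % recovers z = 150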
