What is the Google Maps zoom algorithm?

I'm working on a map zoom algorithm that changes the coordinates of the area (the visible part of the map) on click.
For example, at the beginning, the area has these coordinates:
(0, 0) for the upper-left corner
(100, 100) for the lower-right corner
(50, 50) for the center of the area
When the user clicks somewhere in the area, at an (x, y) coordinate, I set the new coordinates of the area to:
(x-(100-0)/3, y-(100-0)/3) for the upper-left corner
(x+(100-0)/3, y+(100-0)/3) for the lower-right corner
(x, y) for the center of the area
The problem is that this algorithm does not work well: when the user clicks somewhere, the point under the mouse moves to the middle of the area.
So I would like to know which algorithm Google Maps uses to change the area coordinates, because it behaves well: when the user clicks somewhere, the point under the mouse stays under the mouse while the rest of the area around it is zoomed.
Does anybody have an idea of how Google does it?

Let's say you have a rectangle windowArea that holds the drawing area coordinates (i.e. the web browser window area in pixels). For example, if you are drawing the map on the whole screen and the top-left corner has coordinates (0, 0), then that rectangle will have these values:
windowArea.top = 0;
windowArea.left = 0;
windowArea.right = maxWindowWidth;
windowArea.bottom = maxWindowHeight;
You also need to know the visible map fragment, i.e. its longitude and latitude ranges, for example:
mapArea.top = 51.00; // lat
mapArea.left = 8.00; // lng
mapArea.right = 12.00; // lng
mapArea.bottom = 54.00; // lat
When zooming, recalculate mapArea:
mapArea.left = mapClickPoint.x - (windowClickPoint.x - windowArea.left) * (newMapWidth / windowArea.width());
mapArea.top = mapClickPoint.y - (windowArea.bottom - windowClickPoint.y) * (newMapHeight / windowArea.height());
mapArea.right = mapArea.left + newMapWidth;
mapArea.bottom = mapArea.top + newMapHeight;
mapClickPoint holds the map coordinates under the mouse pointer (longitude, latitude).
windowClickPoint holds the window coordinates under the mouse pointer (pixels).
newMapHeight and newMapWidth hold the new ranges of the visible map fragment after the zoom:
newMapWidth = zoomFactor * mapArea.width(); // let's say that zoomFactor is in the range <1.0, maxZoomFactor>
newMapHeight = zoomFactor * mapArea.height();
Once you have the new mapArea values, you need to stretch the map to cover the whole windowArea: mapArea.top/left should be drawn at windowArea.top/left, and mapArea.right/bottom should be drawn at windowArea.right/bottom.
I am not sure whether Google Maps uses the same algorithm, but this gives similar results and is pretty versatile. You do need to know the window coordinates and some kind of coordinates for the visible part of the object being zoomed.
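The recalculation described in this answer can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not Google's implementation: the Rect class and the zoom_map_area function are hypothetical names, and the sketch follows the answer's convention that mapArea.top is the smaller y value while the window's y axis runs the opposite way.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    """Axis-aligned rectangle (hypothetical helper, not part of the answer)."""
    left: float
    top: float
    right: float
    bottom: float

    def width(self) -> float:
        return self.right - self.left

    def height(self) -> float:
        return self.bottom - self.top


def zoom_map_area(window, map_area, window_click, map_click, zoom_factor):
    """Return the new visible map range so that map_click stays under window_click."""
    new_w = zoom_factor * map_area.width()
    new_h = zoom_factor * map_area.height()
    # new_w / window.width() is the number of map units per pixel after the zoom.
    left = map_click[0] - (window_click[0] - window.left) * (new_w / window.width())
    # The answer measures the click from the window's bottom edge for the y axis.
    top = map_click[1] - (window.bottom - window_click[1]) * (new_h / window.height())
    return Rect(left, top, left + new_w, top + new_h)


window = Rect(left=0, top=0, right=800, bottom=600)
map_area = Rect(left=8.0, top=51.0, right=12.0, bottom=54.0)
# Under this convention, pixel (200, 150) lies over map point (9.0, 53.25).
zoomed = zoom_map_area(window, map_area, (200, 150), (9.0, 53.25), 0.5)
```

After the call, the clicked map point is still a quarter of the way across the new horizontal range and three quarters of the way up the new vertical range, which is what keeps it under the mouse.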

Let us state the problem in one dimension, with the input (left, right, clickx, ratio).
Basically, you want the distances from the click to the left edge and to the right edge to scale by the same factor:
(left' - clickx) / (left - clickx) = (right' - clickx) / (right - clickx)
Furthermore, the window is reduced, so:
(right' - left') / (right - left) = ratio
Therefore, the solution is:
left' = ratio*(left -clickx)+clickx
right' = ratio*(right-clickx)+clickx
And you can do the same for the other dimensions.
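As a quick check of the one-dimensional solution, here is a minimal Python sketch. The function name zoom_interval and the sample numbers are made up for illustration.

```python
def zoom_interval(left, right, click, ratio):
    """Scale the interval [left, right] by ratio while keeping click fixed."""
    new_left = ratio * (left - click) + click
    new_right = ratio * (right - click) + click
    return new_left, new_right


# Zoom in by a factor of two around x = 25 in a window spanning 0..100.
new_left, new_right = zoom_interval(0.0, 100.0, 25.0, 0.5)
```

The click sits a quarter of the way across both the old interval and the new one, so the point under the mouse does not move. Calling the same function with the top and bottom edges handles the second dimension.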

Related

Inverse Camera Intrinsic Matrix for Image Plane at Z = -1

A similar question was asked before. Unfortunately I cannot comment on Samgak's answer, so I am opening a new post. Here is the link to the old question:
How to calculate ray in real-world coordinate system from image using projection matrix?
My goal is to map from image coordinates to world coordinates. In fact, I am trying to do this with the camera intrinsic parameters of the HoloLens camera.
Of course this mapping only gives me a ray that starts at the camera optical centre and passes through all the points that project to that pixel. For the mapping from image coordinates to world coordinates we can use the inverse camera matrix, which is:
K^-1 = [1/fx 0 -cx/fx; 0 1/fy -cy/fy; 0 0 1]
Pcam = K^-1 * Ppix;
Pcam_x = Ppix_x/fx - cx/fx;
Pcam_y = Ppix_y/fy - cy/fy;
Pcam_z = 1
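For reference, the standard mapping with the image plane at Z = +1 can be written as a small Python sketch. The function name pixel_to_ray and the intrinsics used here (a 480x480 image with fx = fy = cx = cy = 240) are assumptions for illustration, not HoloLens values.

```python
def pixel_to_ray(px, py, fx, fy, cx, cy):
    """Apply K^-1 to the homogeneous pixel (px, py, 1); image plane at Z = +1."""
    return ((px - cx) / fx, (py - cy) / fy, 1.0)


# Hypothetical intrinsics for a 480x480 image with a 90-degree field of view.
ray_corner = pixel_to_ray(0, 0, fx=240.0, fy=240.0, cx=240.0, cy=240.0)
ray_center = pixel_to_ray(240, 240, fx=240.0, fy=240.0, cx=240.0, cy=240.0)
```

The top-left pixel maps to the direction (-1, -1, 1) and the principal point maps to the optical axis (0, 0, 1).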
Orientation of Camera Coordinate System and Image Plane
In this specific case the image plane is probably at Z = -1 (however, I am a bit uncertain about this). The section "Pixel to Application-specified Coordinate System" on the HoloLens CameraProjectionTransform page describes how to go from pixel coordinates to world coordinates. As far as I understand, two signs in K^-1 are flipped, so that we calculate the coordinates as follows:
Pcam_x = (Ppix_x/fx) - (cx*(-1)/fx) = Ppix_x/fx + cx/fx;
Pcam_y = (Ppix_y/fy) - (cy*(-1)/fy) = Ppix_y/fy + cy/fy;
Pcam_z = -1
Pcam = (Pcam_x, Pcam_y, -1)
CameraOpticalCentre = (0,0,0)
Ray = Pcam - CameraOpticalCentre
I do not understand how to create the camera intrinsics for the case where the image plane is at a negative Z coordinate, and I would like a mathematical explanation or an intuitive understanding of why we have the sign flip (Ppix_x/fx + cx/fx instead of Ppix_x/fx - cx/fx).
Edit: I read in another post that the third column of the camera matrix has to be negated when the camera is facing down the negative z-direction. This would explain the sign flip. However, why do we need to change the sign of the third column? I would like to have an intuitive understanding of this.
Here is the link to the post: Negation of third column
Thanks a lot in advance,
Lisa
why do we need to change the sign of the third column
To understand why we need to negate the third column of K (i.e. negate the principal points of the intrinsic matrix) let's first understand how to get the pixel coordinates of a 3D point already in the camera coordinates frame. After that, it is easier to understand why -z requires negating things.
Let's imagine a camera c and a point B in space (w.r.t. the camera coordinate frame), and let's put the camera sensor (i.e. the image) at E', as in the image below. Then f (in red) is the focal length and u (in blue) is the x coordinate in pixels of B (measured from the center of the image). To simplify things, let's place B at the corner of the field of view (i.e. in the corner of the image).
We need to calculate the coordinates of B projected onto the sensor d (which is the same as the 2D image). Because the triangles AEB and AE'B' are similar, u/f = X/Z, therefore u = X*f/Z. Multiplying X by f is the first operation the K matrix performs; we can multiply K*B (with B as a column vector) to check.
This will give us coordinates in pixels w.r.t. the center of the image. Let's imagine the image is size 480x480. Therefore B' will look like this in the image below. Keep in mind that in image coordinates, the y-axis increases going down and the x-axis increases going right.
In images, the pixel at coordinates (0,0) is in the top-left corner, so we need to add half of the image width to the point we have: px = X*f/Z + cx, where cx is the principal point on the x-axis, usually W/2. Computing px = X*f/Z + cx is exactly what K * B / Z does. In our example X*f/Z was -240; adding cx (W/2 = 480/2 = 240) gives X*f/Z + cx = 0, and the same holds for Y. The final pixel coordinates in the image are (0,0), i.e. the top-left corner.
Now consider the case where z is negative. When we divide X and Y by Z, the negative Z changes the sign of both, so the point is projected to B'' in the opposite quadrant, as in the image below.
Now the second image will instead be:
Because of this, instead of adding the principal point, we need to subtract it. That is the same as negating the last column of K.
So we have 240 - 240 = 0 (where the second 240 is the principal point in x, cx) and the same for Y. The pixel coordinates are 0,0 as in the example when z was positive. If we do not negate the last column we will end up with 480,480 instead of 0,0.
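The numbers in this example can be reproduced with a short Python sketch. It assumes a focal length of 240 pixels and a 480x480 image so that the corner point projects to -240 before the principal point is added. The helper project and the matrix names are hypothetical, and only the principal point entries of the third column are negated, as the answer describes.

```python
def project(K, point):
    """Multiply the 3x3 matrix K by a 3D point and divide by the third component."""
    u, v, w = (sum(K[r][c] * point[c] for c in range(3)) for r in range(3))
    return (u / w, v / w)


f, cx, cy = 240.0, 240.0, 240.0  # assumed values for a 480x480 image
K = [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]
# Same matrix with the principal point entries of the third column negated.
K_neg = [[f, 0.0, -cx], [0.0, f, -cy], [0.0, 0.0, 1.0]]

front = project(K, (-1.0, -1.0, 1.0))       # point in front of a camera looking down +Z
back = project(K_neg, (-1.0, -1.0, -1.0))   # same X and Y, camera looking down -Z
wrong = project(K, (-1.0, -1.0, -1.0))      # -Z case without negating the column
```

Both front and back land on the top-left pixel, while wrong ends up at (480, 480), which matches the reasoning above.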
Hope this helped a little bit

Three.js determine camera distance based on object3D size

I'm trying to determine how far away the camera needs to be from my object3D, which is a collection of meshes, in order for the entire model to be framed in the viewport.
I get the object3D size like this:
public getObjectSize(target: THREE.Object3D): Size {
    const box: THREE.Box3 = new THREE.Box3().setFromObject(target);
    const size: Size = {
        depth: box.max.z - box.min.z,
        height: box.max.y - box.min.y,
        width: box.max.x - box.min.x
    };
    return size;
}
Next I use trig in an attempt to determine how far back the camera needs to be based on that box size in order for the entire box to be visible.
private determinCameraDistance(): number {
    let cameraDistance: number;
    let halfFOVInRadians: number = this.geometryService.getRadians(this.FOV / 2);
    let height: number = this.productModelSizeService.getObjectSize(this.viewService.primaryView.scene).height;
    let width: number = this.productModelSizeService.getObjectSize(this.viewService.primaryView.scene).width;
    cameraDistance = ((width / 2) / Math.tan(halfFOVInRadians));
    return cameraDistance;
}
The math all works out on paper, and the length of the adjacent side of the triangle (the camera distance) can be verified using a^2 + b^2 = c^2. However, for some reason the distance returned is 10.4204, while the camera distance I need to show the entire object3D is actually 95 (determined by hard-coding the value). As a result, I can only see a tiny portion of my model.
Any ideas on what I might be doing wrong, or a better way to determine this? It seems to me like there is some kind of unit conversion that I'm missing when going from the box sizing units to camera distance units.
Actual numbers used in the calculation:
FOV = 110 degrees,
Object3D size: {
Depth: 11.6224,
Height: 18.4,
Width: 29.7638
}
So we take half the field of view to create a right triangle with the adjacent side placed along our camera distance: that's 55 degrees. We then use the formula Degrees * PI / 180 to convert 55 degrees into the radian equivalent, which is .9599. Next we take half the object3D width, again to create a right triangle, which is 14.8819. We can now take our half width and divide it by the tangent of the half FOV (in radians), which gives us a length of 10.4204 for the adjacent side, i.e. the camera distance.
To further verify that this is the correct length for this side, I'll get the length of the hypotenuse using SOHCAHTOA again:
Sin(55) = 14.8819 / y
.8192 * y = 14.8819
y = 14.8819 / .8192
y = 18.1664
Now we can use the Pythagorean theorem and solve for b to check our math.
14.8819^2 + b^2 = 18.1664^2
221.4709 + b^2 = 330.0181
b^2 = 108.5472
b = 10.4186 (we're off by about .002, but that's due to rounding)
The issue ended up being that in THREE.js the field of view represents the vertical viewing area. I had been assuming that THREE, like Maya and other applications, uses the field of view as the horizontal viewing area.
Multiplying the FOV that I was getting by the aspect ratio gives me the correct horizontal field of view, which results in a camera distance of ~92.
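One way to fold the vertical FOV and the aspect ratio into the distance calculation is sketched below in Python. It relies on the trigonometric relation tan(h/2) = aspect * tan(v/2) between the horizontal and vertical half-angles. That relation is an assumption about the intended conversion and is not the exact step described above. The function name camera_distance is hypothetical, and the result is the distance to the face of the box nearest the camera, so the box depth is ignored.

```python
import math


def camera_distance(width, height, vertical_fov_deg, aspect):
    """Distance at which a width x height rectangle facing the camera just fits."""
    half_v = math.radians(vertical_fov_deg) / 2
    # Horizontal half-angle derived from the vertical one: tan(h/2) = aspect * tan(v/2).
    tan_half_h = aspect * math.tan(half_v)
    fit_height = (height / 2) / math.tan(half_v)
    fit_width = (width / 2) / tan_half_h
    # The larger of the two distances keeps both dimensions inside the view.
    return max(fit_height, fit_width)


# With aspect = 1 the two angles coincide, which reproduces the question's
# hand calculation for the width alone (about 10.42).
d = camera_distance(29.7638, 0.0, 110.0, 1.0)
```

Taking the maximum of the two fits means the function works for both wide and tall models without the caller having to decide which dimension is limiting.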

D3 force layout: Finding relative center based on current visible view

I have a d3.js graph that uses a force layout. I have allowed users to zoom in and out of the graph, with bounds set so they can't zoom in past 1 or zoom out past 0.1. Right now, when I plot values on the graph, I automatically send them to the center of the graph (based on the height and width of the SVG container). This works fine until I zoom out, then zoom in somewhere else and plot a new node. The new node ends up back at the original center and not at my new relative center.
How I scale when zooming right now:
function onZoom() {
graph.attr("transform", "translate(" + zoom.translate() + ")" + " scale(" + zoom.scale() + ")");
}
I was unable to find any calls to get the current visible coordinates of the graph, but even with those, how would I use them to calculate the relative center of the graph if my SVG graph size always remains static?
I know this post is very old but I found it useful. Below is the update for d3 v5.
var el = d3.select('#canvas').node().getBoundingClientRect();
var z = d3.zoomTransform(svg.node());
var w = el.width;
var h = el.height;
var center = {
x: (z.x / z.k * -1) + (w / z.k * 0.5),
y: (z.y / z.k * -1) + (h / z.k * 0.5)
};
One thing of note, however, is that I found I also needed to divide the pan x/y by the scale factor z.k, which you did not do in your formula.
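The arithmetic behind this formula can be checked outside the browser with a small Python sketch. It assumes the transform has the form translate(tx, ty) scale(k), so that a graph point p is drawn at p * k + t. The function name visible_center and the sample numbers are made up for illustration.

```python
def visible_center(tx, ty, k, width, height):
    """Center of the viewport in graph coordinates for translate(tx, ty) scale(k)."""
    # A graph point p appears on screen at p * k + t, so invert that for the
    # screen-space center (width / 2, height / 2).
    return ((width / 2 - tx) / k, (height / 2 - ty) / k)


center = visible_center(tx=-100.0, ty=50.0, k=2.0, width=800.0, height=600.0)
```

Here the center comes out as (250, 125), which is the same value the expression -z.x / z.k + w / z.k * 0.5 produces for these inputs.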
For simple geometric zoom, it's fairly straightforward to figure out the visible area from the visible area dimensions plus the translation and scale settings. Just remember that the translation setting is the position of the (0,0) origin relative to the top left corner of your display, so if translation is (-100,50), that means that top left corner is at (+100,-50) in your coordinate system. Likewise, if the scale is 2, that means that the visible area covers 1/2 as many units as the original width and height.
How to access the current transformation? graph.attr("transform") will give you the most recently set transform attribute string, but then you'll need to use regular expressions to access the numbers. Easier to query the zoom behaviour directly using zoom.translate() and zoom.scale().
With those together, you get
var viewCenter = [];
viewCenter[0] = (-1)*zoom.translate()[0] + (0.5) * ( width/zoom.scale() );
viewCenter[1] = (-1)*zoom.translate()[1] + (0.5) * ( height/zoom.scale() );
I.e., the position of the center of the visible area is the position of the top-left corner of the visible area, plus half the visible width and height.

How to get the Position & Dimension of a Shape in Powerpoint?

I'm playing around with OpenXmlSDK to see if it's a viable solution for our Powerpoint needs. One requirement is the ability to position shapes in the Powerpoint. I've been searching for a way to get the position of a Shape, but have only come across the MSDN "How To" http://msdn.microsoft.com/en-us/library/cc850828.aspx and a Position class (with no way to get it from a Shape) http://msdn.microsoft.com/en-us/library/office/documentformat.openxml.wordprocessing.position%28v=office.14%29.aspx.
How do I do something like:
PresentationDocument presentationDocument = PresentationDocument.Open("C:\\MyDoc.pptx", true);
IdPartPair pp = presentationDocument.PresentationPart.SlideParts.First().Parts.FirstOrDefault();
var shape = pp.OpenXmlPart;
// How do I get the position and dimensions?
Two properties describe the position and size of the shape:
- Offset gives the position of the top-left corner of your shape
- Extents gives the size of your shape
shape.ShapeProperties.Transform2D.Offset.X // x position of the top-left corner
shape.ShapeProperties.Transform2D.Offset.Y // y position of the top-left corner
shape.ShapeProperties.Transform2D.Extents.Cx // x size of the shape: the width
shape.ShapeProperties.Transform2D.Extents.Cy // y size of the shape: the height
Go through the XML for the slide in question and look for xfrm elements, which should contain off (offset) and ext (extent) sub-elements. The measurements are in EMUs (see last page of Wouter van Vugt's document).
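Since the offsets and extents are expressed in EMUs, a small conversion helper is often useful. The following Python sketch uses the standard definitions of 914400 EMUs per inch, 12700 per point, and 360000 per centimetre. The function names are hypothetical.

```python
EMU_PER_INCH = 914400
EMU_PER_POINT = 12700
EMU_PER_CM = 360000


def emu_to_inches(emu):
    """Convert a length in EMUs to inches."""
    return emu / EMU_PER_INCH


def emu_to_points(emu):
    """Convert a length in EMUs to typographic points (1/72 inch)."""
    return emu / EMU_PER_POINT


def emu_to_cm(emu):
    """Convert a length in EMUs to centimetres."""
    return emu / EMU_PER_CM


# An offset of 914400 EMUs is exactly one inch, i.e. 72 points.
one_inch_in_points = emu_to_points(914400)
```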
Sometimes ShapeProperties is not exposed on the object you have; in that case you must cast it first:
var sP = ((DocumentFormat.OpenXml.Presentation.Shape)shape).ShapeProperties;
After that you can use Transform2D and find the coordinates as Deunz wrote.

Rotating an image with the mouse

I am writing a drawing program, Whyteboard -- http://code.google.com/p/whyteboard/
I have implemented image rotation functionality, except that its behaviour is a little odd. I can't figure out the proper logic to rotate the image in relation to the mouse position.
My code is something similar to this:
(these are called from a mouse event handler)
def resize(self, x, y, direction=None):
    """Rotate the image"""
    self.angle += 1
    if self.angle > 360:
        self.angle = 0
    self.rotate()
def rotate(self, angle=None):
    """Rotate the image (in radians), turn it back into a bitmap"""
    rad = (2 * math.pi * self.angle) / 360
    if angle:
        rad = (2 * math.pi * angle) / 360
    img = self.img.Rotate(rad, (0, 0))
So, basically, the rotation angle keeps increasing while the user moves the mouse. However, this sometimes means you have to "circle" the mouse many times to rotate an image 90 degrees, let alone 360.
I need it to behave as in other programs, where the image is rotated according to the mouse's position relative to the image.
This is the bit I'm having trouble with. I've left the question language-independent: although I'm using Python and wxPython, it could apply to any language.
I'm assuming resize() is called for every mouse movement update. Your problem seems to be the self.angle += 1, which makes you update your angle by 1 degree on each mouse event.
A solution to your problem would be: pick the point on the image where the rotation will be centered (in this case, it's your (0,0) point in self.img.Rotate(), but usually it is the center of the image). The rotation angle should be the angle formed by the line that goes from this point to the mouse cursor, minus the angle formed by the line that goes from this point to the mouse position when the user clicked.
To calculate the angle of the line between two points, use math.atan2(y2-y1, x2-x1), which will give you the angle in radians (you may have to change the order of the subtractions depending on your mouse position axis).
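The idea described in this answer can be sketched in Python as follows. This is a minimal sketch, not the code Whyteboard ended up with: the function name drag_rotation is hypothetical, and the wrap-around step is an addition that keeps the result continuous when the mouse crosses the atan2 branch cut.

```python
import math


def drag_rotation(center, press, current):
    """Angle in radians swept around center from the press point to the current point."""
    start = math.atan2(press[1] - center[1], press[0] - center[0])
    now = math.atan2(current[1] - center[1], current[0] - center[0])
    # Wrap into [-pi, pi) so crossing the atan2 branch cut does not cause a jump.
    return (now - start + math.pi) % (2 * math.pi) - math.pi


# Dragging from the positive x axis to the positive y axis is a quarter turn.
angle = drag_rotation(center=(0, 0), press=(1, 0), current=(0, 1))
```

In screen coordinates, where y grows downwards, a positive result corresponds to a clockwise drag on screen, so the sign may need to be flipped depending on the rotation convention of the drawing library.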
fserb's solution is the way I would go about the rotation too, but something additional to consider is your use of:
img = self.img.Rotate(rad, (0, 0))
If you are performing a bitmap image rotation in response to every mouse drag event, you are going to get a lot of data loss from the combined effect of all the interpolation required for the rotation. For example, rotating by 1 degree 360 times will give you a much blurrier image than the original.
Try having a rotation system something like this:
display_img = self.img.Rotate(rad, pos)
then use the display_img image while you are in rotation mode. When you end rotation mode (onMouseUp maybe), img = display_img.
This type of strategy is good whenever you have a lossy operation with a user preview.
Here's the solution in the end,
def rotate(self, position, origin):
    """ position: mouse x/y position, origin: x/y to rotate around"""
    origin_angle = self.find_angle(origin, self.center)
    mouse_angle = self.find_angle(position, self.center)
    angle = mouse_angle - origin_angle
    # do the rotation here
def find_angle(self, a, b):
    try:
        answer = math.atan2((a[0] - b[0]) , (a[1] - b[1]))
    except:
        answer = 0
    return answer
