I have 4 Point values: TopLeft, TopRight, BottomLeft, BottomRight. These define a 4-sided shape (like a distorted rectangle) on my monitor. These are the points a Tobii gaze device thinks I am looking at when in fact I am looking at the four corners of my monitor.
This picture shows a bitmap on the left representing my monitor, and the points the Tobii device tells me I am looking at when I am in fact looking at the corners of the screen. (It's a representation, not real).
I want to use those four calibration points to take a screen X,Y position that is from an inaccurate gaze position and correct it so that it is positioned as per the image on the right.
Edit: New solution for the edited question is at the end.
This problem is called bilinear interpolation.
Once you grasp the idea, it is very easy, and you will remember it for the rest of your life.
It would be quite long to post every detail here, but I will try.
First, I will call a point on the left (x,y) and the corresponding point on the right (X,Y).
Let (x1,y1), (x1,y2), (x2,y1), (x2,y2) be the corner points on the left rectangle.
Secondly, let's split the problem into 2 bilinear interpolation problems:
one to find X
one to find Y
Let's find them one at a time (X or Y).
Define: the Qxx are the values of X (or Y) at the four corners of the right rectangle.
Suppose that we want to find the value of the unknown function f at
the point (x, y). It is assumed that we know the value of f at the
four points Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1), and Q22 =
(x2, y2).
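The f(x,y) of your problem is X or Y in your question.
First interpolate linearly in the x direction: this gives f(x,y1) from the values at Q11 and Q21, and f(x,y2) from the values at Q12 and Q22. Then you interpolate f(x,y1) and f(x,y2) in the y direction, in the same way, to get f(x,y).
Finally, you will get X or Y = f(x,y).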
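The two-stage interpolation described above can be sketched as follows. This is only a minimal illustration, not code from the wiki page; the function name `bilinear` and its argument order are assumptions, and q11, q12, q21, q22 stand for the known values of f at Q11, Q12, Q21, Q22.

```python
def bilinear(x, y, x1, y1, x2, y2, q11, q12, q21, q22):
    """Interpolate f at (x, y) from its values at the four rectangle corners.

    q11 = f(x1, y1), q12 = f(x1, y2), q21 = f(x2, y1), q22 = f(x2, y2).
    """
    # Step 1: linear interpolation along x, once on each horizontal edge.
    f_x_y1 = ((x2 - x) * q11 + (x - x1) * q21) / (x2 - x1)
    f_x_y2 = ((x2 - x) * q12 + (x - x1) * q22) / (x2 - x1)
    # Step 2: linear interpolation along y between the two intermediate values.
    return ((y2 - y) * f_x_y1 + (y - y1) * f_x_y2) / (y2 - y1)
```

You would call it twice per point: once with the X values of the right-hand corners as q11..q22, and once with their Y values.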
Reference : All pictures/formulas/text here are copied from the wiki link (some with modification).
Edit: After the question was edited, it became very different.
The new problem is the opposite one, and it is called "inverse bilinear interpolation", which is far harder.
For more information, please read http://www.iquilezles.org/www/articles/ibilinear/ibilinear.htm
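For what it's worth, here is a rough sketch of the inverse mapping. It is my own derivation of the usual quadratic, not code copied from the linked article, so treat it as an assumption-laden illustration. The names `bilinear_point` and `inverse_bilinear` are made up, and the corner order a, b, c, d (mapped to (0,0), (1,0), (1,1), (0,1)) is an assumption; for the gaze case that would be TopLeft, TopRight, BottomRight, BottomLeft.

```python
import math

def cross(a, b):
    """2D cross product (z component)."""
    return a[0] * b[1] - a[1] * b[0]

def bilinear_point(u, v, a, b, c, d):
    """Forward map: (u, v) in the unit square to a point inside quad a, b, c, d."""
    return tuple(a[i] + u * (b[i] - a[i]) + v * (d[i] - a[i])
                 + u * v * (a[i] - b[i] + c[i] - d[i]) for i in range(2))

def inverse_bilinear(p, a, b, c, d):
    """Return (u, v) with bilinear_point(u, v, ...) == p, or None if p is outside."""
    e = (b[0] - a[0], b[1] - a[1])
    f = (d[0] - a[0], d[1] - a[1])
    g = (a[0] - b[0] + c[0] - d[0], a[1] - b[1] + c[1] - d[1])
    h = (p[0] - a[0], p[1] - a[1])
    # From h = u*e + v*f + u*v*g, eliminating u gives k2*v^2 + k1*v + k0 = 0.
    k2 = cross(g, f)
    k1 = cross(e, f) + cross(h, g)
    k0 = cross(h, e)
    if abs(k2) < 1e-12:            # quad is a parallelogram: equation is linear
        if k1 == 0:
            return None
        roots = [-k0 / k1]
    else:
        disc = k1 * k1 - 4.0 * k0 * k2
        if disc < 0:
            return None
        s = math.sqrt(disc)
        roots = [(-k1 - s) / (2.0 * k2), (-k1 + s) / (2.0 * k2)]
    eps = 1e-9
    for v in roots:
        dx, dy = e[0] + g[0] * v, e[1] + g[1] * v
        # Recover u from whichever coordinate has the larger denominator.
        if abs(dx) >= abs(dy):
            if dx == 0:
                continue
            u = (h[0] - f[0] * v) / dx
        else:
            u = (h[1] - f[1] * v) / dy
        if -eps <= u <= 1 + eps and -eps <= v <= 1 + eps:
            return (u, v)
    return None
```

With the four calibration points as a, b, c, d and a raw gaze point as p, the corrected screen position would be (u * screen_width, v * screen_height).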
You can define a unique Linear Transform using 6 equations. The 3 points which have to align provide those 6 equations, as each pair of matching points provides two equations in x and y.
If you want to pursue this, I can provide the matrix equation which defines the Linear Transform based on how it maps three points. You invert this matrix and it will provide the linear transform.
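As a rough sketch of that construction (assuming the transform has the 6-unknown form X = a*x + b*y + c, Y = d*x + e*y + f), the coefficients can be solved for with Cramer's rule. The function name `affine_from_3_points` is hypothetical.

```python
def affine_from_3_points(src, dst):
    """Solve X = a*x + b*y + c, Y = d*x + e*y + f from three point pairs.

    src and dst are sequences of three (x, y) tuples. Returns ((a, b, c), (d, e, f)).
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    # Determinant of the 3x3 matrix with rows (xi, yi, 1).
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if det == 0:
        raise ValueError("source points are collinear")

    def solve(r1, r2, r3):
        # Cramer's rule: replace one column at a time with the right-hand side.
        a = (r1 * (y2 - y3) - y1 * (r2 - r3) + (r2 * y3 - r3 * y2)) / det
        b = (x1 * (r2 - r3) - r1 * (x2 - x3) + (x2 * r3 - x3 * r2)) / det
        c = (x1 * (y2 * r3 - y3 * r2) - y1 * (x2 * r3 - x3 * r2)
             + r1 * (x2 * y3 - x3 * y2)) / det
        return (a, b, c)

    row_x = solve(dst[0][0], dst[1][0], dst[2][0])
    row_y = solve(dst[0][1], dst[1][1], dst[2][1])
    return (row_x, row_y)
```

Each of the two `solve` calls uses three of the six equations: one call for the x outputs and one for the y outputs.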
But having done that, the transform is completely specified. You have no control over where the corner points of the original quadrilateral will go. In general, you can't even define a linear transform to map one quadrilateral onto another; this gives 8 equations (2 for each corner) with only 6 unknowns. It's over-specified. In fact a Linear Transform must always map a rectangle to a parallelogram, so in general you can't define a Linear Transform which maps one quadrilateral to another.
So if it can't be a Linear Transform, can it be a non-Linear Transform? Well, yes, but non-Linear Transforms don't necessarily map straight lines to straight lines, so the mapped edges of the quadrilateral won't be straight, and neither will any other lines. And you still have 14 equations (2 for each point and corner) for which you have to invent some non-Linear transform with 14 unknowns.
So the problem as stated cannot be solved with a Linear Transform; it's over-specified. Using a non-Linear transform will require you to devise one which has 14 free variables (vs the 6 in a Linear Transform). This will map the 7 points correctly, but straight lines will no longer be straight. Adding that requirement back in adds an infinite number of constraints (one for every point on the line), and you won't even be able to use continuous functions.
There may be some solution to what you are doing in terms of what you are really trying to do (ie the underlying application need), but as a mathematical problem it is unsolvable.
Let me know if you want the matrix equation to produce a Linear Transform based on how it transforms 3 points.
I have a turtle-graphics-based algorithm for generating a space-filling Hilbert curve in two dimensions. It is recursive and goes like this:
We want to draw a curve of order n, in direction x (where x ∈ {L, R}), and let y be the direction opposite to x. We do as follows:
1. turn in the direction y
2. draw a Hilbert curve of order n-1, direction y
3. move one step forward
4. turn in the direction x
5. draw a Hilbert curve of order n-1, direction x
6. move one step forward
7. draw a Hilbert curve of order n-1, direction x
8. turn in the direction x
9. move one step forward
10. draw a Hilbert curve of order n-1, direction y
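For illustration, the recursion above can be sketched without a graphics library by tracking the turtle's position and heading on an integer grid. The names `hilbert_points`, `turn`, `forward` and `curve` are made up for this sketch. One assumption goes beyond the list: a final turn in direction y is added at the end of each call, as in the common L-system formulation, so that the heading is restored for the caller.

```python
def hilbert_points(order):
    """Return the grid points visited by a 2D Hilbert curve of the given order."""
    pos = [0, 0]
    heading = [1, 0]            # the turtle starts facing +x
    points = [(0, 0)]

    def turn(sign):
        # sign = +1 turns left (counter-clockwise), sign = -1 turns right.
        heading[0], heading[1] = -sign * heading[1], sign * heading[0]

    def forward():
        pos[0] += heading[0]
        pos[1] += heading[1]
        points.append((pos[0], pos[1]))

    def curve(n, x):
        if n == 0:
            return
        y = -x                  # y is the direction opposite to x
        turn(y)
        curve(n - 1, y)
        forward()
        turn(x)
        curve(n - 1, x)
        forward()
        curve(n - 1, x)
        turn(x)
        forward()
        curve(n - 1, y)
        turn(y)                 # assumption: final turn, not in the list above

    curve(order, +1)
    return points
```

A curve of order n should visit 4^n distinct grid points, which is an easy sanity check for any variant of the rules.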
I understand this and was able to implement a working solution. However, I'm now trying to "upgrade" this to 3D, and here's where I basically hit a wall; in 3D, when we reach a vertex, we can turn not in two, but four directions (going straight or backing up is obviously not an option, hence four and not six). Intuitively, I think I should store the plane on which the turtle is "walking" and its general direction in the world, represented by an enum with six values:
Up
Down
Left
Right
In (from the camera's perspective, it goes "inside" the world)
Out (same as above, outside)
The turtle, like in 2D, has a state containing the information outlined above, and when it reaches a vertex (which can be thought of as a "crossing") it has to make a decision about where to go next, based on that state. Whereas in two dimensions this is rather simple, in three, I'm stumped.
Is my approach correct? (i.e., is this what I should store in the turtle's state?)
If it is, how can I use that information to make a decision where to go next?
Because there are many variants of 3D space filling Hilbert curves, I should specify that this is what I'm using as reference and to aid my imagination:
I'm aware that a similar question has already been asked, but the accepted answer links to a website where this problem is solved using a different approach (i.e., not turtle graphics).
Your 2d algorithm can be summarized as “LRFL” or “RLFR” (with “F” being “forward”). Each letter means “turn that direction, draw a (n-1)-curve in that direction, and take a step forward”. (This assumes the x in step 8 should be a y.)
In 3d, you can summarize the algorithm as the 7 turns you would need to go along your reference. This will depend on how you visualize the turtle starting. If it starts at the empty circle, facing the filled circle, and being right-side-up (with its back facing up), then your reference would be “DLLUULL”.
I know it's possible to apply a symbolic perturbation scheme like 'Simulation of Simplicity'(SoS) to geometric predicates like the 4-point orient, to avoid handling degenerate cases. I'm assuming it's also valid to do the same with plane-based geometry, where points are implicitly defined by the intersection of 3 planes, so I can have a similar orient predicate that tells me on which side of a 4th plane the point defined by the first 3 lies. I'd perturb the coefficients of the plane equation instead of the cartesian coordinates of a point.
The problem is that a point could be defined by many different planes. Each vertex in a cube is defined by 3 planes, but the apex of a pyramid has 4. Consistency seems to be everything with schemes like SoS, and I can't figure if it matters which 3 planes I select to define a point. Perhaps it doesn't, as long as every time I refer to that point I use the same 3 planes.
So, the question: Can I choose any 3 planes to represent a point?
Thanks in advance.
For a very similar problem, I represented the planes as perturbed bisectors between couples of points pi and pj:
Pij = {p | d2(pi,p) - ei = d2(pj,p) - ej}
where d2 denotes the squared Euclidean distance
and where ei = epsilon^(2^i) denotes the symbolic perturbation.
Then it is possible to write the equation of the intersection between three planes, inject it into the predicate, separate the numerator from the denominator to avoid divisions, order the ei terms and deduce the symbolic perturbation.
In your case, it would represent the degeneracy with a point on four planes as two points, each of them being on three of the four planes (exactly like order-4 vertices in Voronoi diagrams when using perturbed incircle predicate).
The advantage of this representation is that the symbolic perturbation is reasonably simple to write (only two terms per plane).
The implementation and documentation are available in my GEOGRAM library:
http://alice.loria.fr/software/geogram/doc/html/namespaceGEO_1_1PCK.html
Imagine an enormous 3D grid (procedurally defined, and potentially infinite; at the very least, 10^6 coordinates per side). At each grid coordinate, there's a primitive (e.g., a sphere, a box, or some other simple, easily mathematically defined function).
I need an algorithm to intersect a ray, with origin outside the grid and direction entering it, against the grid's elements. I.e., the ray might travel halfway through this huge grid, and then hit a primitive. Because of the scope of the grid, an iterative method [EDIT: (such as ray marching)] is unacceptably slow. What I need is some closed-form [EDIT: constant time] solution for finding the primitive hit.
One possible approach I've thought of is to determine the amount the ray would converge each time step toward the primitives on each of the eight coordinates surrounding a grid cell in some modular arithmetic space in each of x, y, and z, then divide by the ray's direction and take the smallest distance. I have no evidence other than intuition to think this might work, and Google is unhelpful; "intersecting a grid" means intersecting the grid's faces.
Notes:
I really only care about the surface normal of the primitive (I could easily find that given a distance to intersection, but I don't care about the distance per se).
The type of primitive intersected isn't important at this point. Ideally, it would be a box. Second choice, sphere. However, I'm assuming that whatever algorithm is used might be generalizable to other primitives, and if worst comes to worst, it doesn't really matter for this application anyway.
Here's another idea:
The ray can only hit a primitive when all of the x, y and z coordinates are close to integer values.
If we consider the parametric equation for the ray, where a point on the line is given by
p=p0 + t * v
where p0 is the starting point and v is the ray's direction vector, we can plot the distance from the ray to an integer value on each axis as a function of t. e.g.:
dx = abs( ( p0.x + t * v.x + 0.5 ) % 1 - 0.5 )
This will yield three sawtooth plots whose periods depend on the components of the direction vector (e.g. if the direction vector is (1, 0, 0), the x-plot will vary linearly between 0 and 0.5, with a period of 1, while the other plots will remain constant at whatever value p0 gives them).
You need to find the first value of t for which all three plots are below some threshold level, determined by the size of your primitives. You can thus vastly reduce the number of t values to be checked by considering the plot with the longest (non-infinite) period first, before checking the higher-frequency plots.
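A minimal sketch of the per-axis distance and the "all three plots below the threshold" check might look like this. The function names `lattice_distance` and `near_lattice_point` are made up, and `threshold` stands for the primitive size mentioned above; this only tests a given t, it does not find the first one.

```python
def lattice_distance(value):
    """Distance from value to the nearest integer, in the range [0, 0.5]."""
    # Python's % returns a non-negative result for a positive modulus,
    # so this also works for negative coordinates.
    return abs((value + 0.5) % 1.0 - 0.5)

def near_lattice_point(p0, v, t, threshold):
    """True if p0 + t*v is within `threshold` of an integer on every axis."""
    return all(lattice_distance(p + t * d) <= threshold
               for p, d in zip(p0, v))
```

For example, `near_lattice_point((0.0, 0.0, 0.0), (1.0, 2.0, 0.0), 3.0, 0.1)` is true because the point (3, 6, 0) sits exactly on a grid coordinate.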
I can't shake the feeling that it may be possible to compute the correct value of t based on the periods of the three plots, but I can't come up with anything that isn't scuppered by the starting position not being the origin, and the threshold value not being zero. :-/
Basically, what you'll need to do is to express the line in the form of a function. From there, you just have to calculate mathematically whether the ray intersects each object, and if it does, make sure you keep the hit that is closest to the source.
This isn't fast, so you will have to do a lot of optimization here. The most obvious thing is to use bounding boxes instead of the actual shapes. From there, you can do things like use Octrees or BSP (Binary Space Partitioning) trees.
Well, anyway, there might be something I am overlooking that becomes possible through the extra limitations you have to your system, but that is how I had to make a ray tracer for a course.
You state in the question that an iterative solution is unacceptably slow - I assume you mean iterative in the sense of testing every object in the grid against the line.
Iterate instead over the grid cubes that the line intersects, and for each cube test the 8 objects that the cube intersects. Look to Bresenham's line drawing algorithm for how to find which cubes the line intersects.
Note that Bresenham's will not return absolutely every cube that the ray intersects, but for finding which primitives to test I'm fairly sure that it'll be good enough.
It also has the nice properties:
Extremely simple - this will be handy if you're running it on the GPU
Returns results iteratively along the ray, so you can stop as soon as you find a hit.
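As a rough illustration of walking the cubes along the ray, here is a sketch that uses a floating-point DDA-style traversal rather than Bresenham's integer algorithm (a deliberate substitution: it visits every cube the ray passes through, at the cost of using floats). The name `traverse_cells` and the `max_cells` cut-off are assumptions, and cubes are taken to be unit-sized and aligned to integer coordinates.

```python
import math

def traverse_cells(origin, direction, max_cells):
    """Yield the integer grid cells a ray passes through, in order along the ray."""
    cell = [math.floor(c) for c in origin]
    step = [0, 0, 0]
    t_max = [math.inf] * 3      # ray parameter at which the next boundary is hit
    t_delta = [math.inf] * 3    # ray parameter needed to cross one whole cell
    for i in range(3):
        d = direction[i]
        if d > 0:
            step[i] = 1
            t_max[i] = (cell[i] + 1 - origin[i]) / d
            t_delta[i] = 1.0 / d
        elif d < 0:
            step[i] = -1
            t_max[i] = (cell[i] - origin[i]) / d
            t_delta[i] = -1.0 / d
    for _ in range(max_cells):
        yield tuple(cell)
        axis = t_max.index(min(t_max))   # axis whose boundary is reached first
        if t_max[axis] == math.inf:      # zero direction vector: nowhere to go
            return
        cell[axis] += step[axis]
        t_max[axis] += t_delta[axis]
```

For each yielded cube you would test the primitives at its 8 corners and stop at the first hit.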
Try this approach:
1. Determine the function of the ray.
2. Say the grid is divided into planes along the z axis. The ray will intersect each 'z plane' (the plane in which the grid nodes at the same height lie), and you can easily compute the coordinates (x, y, z) of the intersection points from the ray function.
3. Sweep the z planes; you can easily determine which intersection points lie inside a cube or a sphere.
4. The ray may also intersect cubes/spheres between the z planes, so you need to repeat steps 1-3 along the x and y axes. This ensures that no intersection is missed.
5. Throw out the duplicate cubes/spheres found by the searches in the x, y and z directions.
Problem
Given a set of known cartesian points (set A), and a 2d transformation (rotation, translation, scale) of some subset of those points (set B), find the orientation of the subset (rotation, translation, scale) relative to the original set of points.
I.e., suppose I take a "picture" of a known set of 2d points on a wall. I want to know what position the camera was in relative to "upright and centered" when the picture was taken. Some of the points may not be visible in the picture (they may be occluded). (In this analogy, assume the camera is orthogonal and always pointed directly at the plane of the wall, so you don't need to take distortion or perspective into account.)
Proposed approach:
Step 1: Scale B to the same "range" as A
Don't know how; open to suggestions. Maybe take the area of a convex hull around all the points in B, and scale it to nearly that of the convex hull around A. This is tricky, because points may be missing from B.
Step 2: Match some arbitrary point in "B" to its twin in "A"
Pick some random point in set B. Call this point K. Somehow take a "fingerprint" of K relative to all the other points in B (using distance only). Find its match in A by fingerprinting all points in A and taking the point with the most similar fingerprint of K.
Step 3: Rotate B (around K) until all points in B are aligned with a point in A
Multiple solutions are possible, so keep rotating through 360° looking for solutions.
That's just shooting from the hip, I may be way off base. Anyone have any ideas?
Assuming you don't actually know the correspondence between the points in the two clouds, you could try a statistical approach.
First, compute the mean x0 of the original cloud, then compute the mean x1 of the subset cloud. The difference of the mean vectors, x1-x0, is a good estimate of the required translation.
Now, subtract the relevant mean vector from each set to give two clouds centered at the origin. Compute the covariance matrix for each cloud and find its eigenvalues and eigenvectors. The required rotation can be found from the eigenvectors, while the scaling corresponds to the eigenvalues.
Compose all of this and you should have a good statistical estimate of the desired transform. Obviously, its quality will be a function of how well the subset spans the original set.
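A minimal 2D sketch of this estimate, under the assumption that both clouds are lists of (x, y) tuples, could look like the following. For a 2x2 covariance matrix the sum of the eigenvalues equals the trace and the principal eigenvector direction has a closed form, so no eigen-solver is needed. The function names are made up, and the rotation is only recovered modulo 180 degrees because an eigenvector has no sign.

```python
import math

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def covariance(points, c):
    """Return (cxx, cxy, cyy) of the cloud about its centroid c."""
    n = len(points)
    cxx = sum((p[0] - c[0]) ** 2 for p in points) / n
    cyy = sum((p[1] - c[1]) ** 2 for p in points) / n
    cxy = sum((p[0] - c[0]) * (p[1] - c[1]) for p in points) / n
    return cxx, cxy, cyy

def principal_angle(cxx, cxy, cyy):
    """Angle of the eigenvector with the largest eigenvalue."""
    return 0.5 * math.atan2(2.0 * cxy, cxx - cyy)

def estimate_similarity(a, b):
    """Estimate (translation, rotation, scale) taking cloud a to cloud b."""
    ca, cb = centroid(a), centroid(b)
    axx, axy, ayy = covariance(a, ca)
    bxx, bxy, byy = covariance(b, cb)
    # Variances scale with the square of the scale factor.
    scale = math.sqrt((bxx + byy) / (axx + ayy))
    rotation = (principal_angle(bxx, bxy, byy)
                - principal_angle(axx, axy, ayy)) % math.pi
    # Translation of the centroid, i.e. the difference of the means.
    translation = (cb[0] - ca[0], cb[1] - ca[1])
    return translation, rotation, scale
```

If the cloud is nearly isotropic (both eigenvalues almost equal), the principal direction is ill-defined and the rotation estimate should not be trusted.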
"Give me a place to stand on, and I will move the Earth" (Archimedes)
I think we should follow in the steps of Archimedes.
Arpi's algorithm:
We must choose a point (X1) of set A with coordinates (0, 0). (this will be the place to stand on)
Choose another point (X2) and put it on the OX vector (to simplify things)
All the other points' coordinates from set A will be calculated based on the coordinates of X1(0, 0) and X2(some_Coordinate, 0).
Now, choose a point from set B (Y1), and that will be the center of the B set. Choose another point from set B (Y2) and put it on the OX of the B set. Now we have a scale scalar and a rotation angle. If this is a solution, then Y1 in the B set represents X1 from the A set and Y2 from the B set represents X2 from the A set. If we can find a map between the B set and the A set based on this, using all the points of the B set and with Yi <> Yj if i <> j, where i and j are the indexes of the points in our representation, then we have a potential solution and we store it.
End of Arpi's algorithm
To find all the potential solutions you must do the following:
foreach point in A as X1 do
    foreach point in A as X2 do
        Arpi's algorithm(X1, X2)
Of course, you can optimize this, but for the sake of simplicity I have described it without optimizations (complications). Optimizing it will be your job, and only if you need to.
I would attempt to minimize the deviation between the target points and the found points. Meaning I would pair each target point with a found point, and apply any transformation (rotation, scale or skew) to all the target points which decreases the sum of the deviations. I would repeat this for all potential pairs, eventually taking the match to be the set of pairs and the necessary transformations with the smallest total deviation.
The real question is how you optimize this so that the performance is better than O(n^2). I suppose some sort of heuristic matching, perhaps caching the intermediary results, or finding a method of eliminating some pairs earlier in the process.
As a followup to my previous question about determining camera parameters I have formulated a new problem.
I have two pictures of the same rectangle:
The first is an image without any transformations and shows the rectangle as it is.
The second image shows the rectangle after some 3d transformation (XYZ-rotation, scaling, XY-translation) is applied. This has caused the rectangle to look like a trapezoid.
I hope the following picture describes my problem:
(image: http://wilco.menge.nl/application.data/cms/upload/transformation%20matrix.png)
How do I determine what transformations (more specifically: what transformation matrix) have caused this transformation?
I know the pixel locations of the corners in both images, hence I also know the distances between the corners.
I'm confused. Is this a 2d or a 3d problem?
The way I understand it, you have a flat rectangle embedded in 3d space, and you're looking at two 2d "pictures" of it - one of the original version and one based on the transformed version. Is this correct?
If this is correct, then there is not enough information to solve the problem. For example, suppose the two pictures look exactly the same. This could be because the transformation is the identity, or it could be because the transformation moves the rectangle twice as far away from the camera and doubles its size (thus making it look exactly the same).
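To make the ambiguity concrete, here is a tiny numeric sketch. It assumes an idealised pinhole camera at the origin looking along +z (an assumption of this example, not something stated in the question); the function name `project` and the sample coordinates are made up.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point onto the image plane z = focal_length."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

# A 2x2 rectangle at distance 5, and a 4x4 rectangle at distance 10.
near = [(-1.0, -1.0, 5.0), (1.0, -1.0, 5.0), (1.0, 1.0, 5.0), (-1.0, 1.0, 5.0)]
far = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in near]

near_image = [project(p) for p in near]
far_image = [project(p) for p in far]
```

Both lists contain the same four image points, so the two configurations cannot be told apart from the pictures alone.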
This is a math problem, not a programming one.
You need to define a set of equations (your transformation matrix; my guess is 3 equations) and then solve it for the 4 transformations of the corner points.
I've only ever described this using German words, so the above may sound strange.
Based on the information you have, this is not that easy. I will give you some ideas to play with, however. If you had the 3D coordinates of the corners, you'd have an easier time. Here's the basic idea.
1. Move a corner to the origin. Thereafter, rotations will take place about the origin.
2. Determine vectors of the axes. Do this by subtracting the adjacent corners from the origin point. These will be a local x and y axis for your world.
3. Determine angles using the vectors. You can use the dot and cross products to determine the angle between the local x axis and the global x axis (1, 0, 0).
4. Rotate by the angle in step 3. This will give you a new x axis which should match the global x axis and a new local y axis. You can then determine another rotation about the x axis which will bring the y axis into alignment with the global y axis.
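The angle computation from dot and cross products can be sketched like this. It is a small helper under the assumption that the vectors are 3-tuples; the name `angle_between` is made up. Using atan2 on the cross product's length and the dot product avoids the precision problems acos has for nearly parallel vectors.

```python
import math

def angle_between(u, v):
    """Unsigned angle in radians between 3D vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    cross_length = math.sqrt(cx * cx + cy * cy + cz * cz)
    return math.atan2(cross_length, dot)
```

The direction of the cross product itself gives the axis to rotate about.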
Without the z coordinates, you can see that this will be difficult, but this is the general process. I hope this helps.
The solution will not be unique, as Alex319 points out.
If the second image is really a trapezoid as you say, then this won't be too hard. It is a trapezoid (not a parallelogram) because of perspective, so it must be an isosceles trapezoid.
Draw the two diagonals. They intersect at the center of the rectangle, so that takes care of the translation.
Rotate the trapezoid until its parallel sides are parallel to two sides of the original rectangle. (Which two? It doesn't matter.)
Draw a third parallel through the center. Scale this to the sides of the rectangle you chose.
Now for the rotation out of the plane. Measure the distance from the center to one of the parallel sides and use the law of sines.
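As an illustration of the first of these steps only, the intersection of the two diagonals can be computed as below. The function name `diagonal_intersection` is hypothetical, and the corners are assumed to be given in order around the quadrilateral, so the diagonals are p0-p2 and p1-p3.

```python
def diagonal_intersection(p0, p1, p2, p3):
    """Intersection of diagonals p0-p2 and p1-p3 of a quadrilateral."""
    d1 = (p2[0] - p0[0], p2[1] - p0[1])
    d2 = (p3[0] - p1[0], p3[1] - p1[1])
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if denom == 0:
        raise ValueError("diagonals are parallel")
    # Solve p0 + t*d1 = p1 + s*d2 for t by crossing both sides with d2.
    w = (p1[0] - p0[0], p1[1] - p0[1])
    t = (w[0] * d2[1] - w[1] * d2[0]) / denom
    return (p0[0] + t * d1[0], p0[1] + t * d1[1])
```

Running it on the corners from both images gives the two centers, and their difference is the in-plane translation.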
If it's not a trapezoid, just a quadrilateral, then it'll be harder: you'll have to use the angles between the diagonals to find the axis of rotation.