Is there any algorithm for determining 3D position in such a case? (images below) - algorithm

So first of all I have an image like this (and of course I have all the point coordinates in 2D, so I can regenerate the lines and check where they cross each other):
(source: narod.ru)
But then I have another image of the same lines (I know they are the same) and the new coordinates of my points, like on this image:
(source: narod.ru)
So... now, having the point coordinates on the first image, how can I determine the plane rotation and Z depth on the second image (assuming the first one's center was at point (0,0,0) with no rotation)?

What you're trying to find is called a projection matrix. Determining precise inverse projection usually requires that you have firmly established coordinates in both source and destination vectors, which the images above aren't going to give you. You can approximate using pixel positions, however.
This thread will give you a basic walkthrough of the techniques you need to use.

Let me say this up front: this problem is hard. There is a reason Dan Story's linked question has not been answered. Let me provide an explanation for people who want to take a stab at it. I hope I'm wrong about how hard it is, though.
I will assume that the 2D screen coordinates and the projection/perspective matrix are known to you. You need to know at least this much (if you don't know the projection matrix, you are essentially using a different camera to look at the world). Let's call each pair of 2D screen coordinates (a_i, b_i), and I will assume the projection matrix is of the form
P = [ px  0   0   0  ]
    [ 0   py  0   0  ]
    [ 0   0   pz  pw ]
    [ 0   0   s   0  ],  s = +/-1
Almost any reasonable projection has this form. Working through the rendering pipeline, you find that
a_i = px x_i / (s z_i)
b_i = py y_i / (s z_i)
where (x_i, y_i, z_i) are the original 3D coordinates of the point.
Now, let's assume you know your shape in a set of canonical coordinates (whatever you want), so that the vertices are (x0_i, y0_i, z0_i). We can arrange these as columns of a matrix C. The actual coordinates of the shape are a rigid transformation of these coordinates. Let's similarly organize the actual coordinates as columns of a matrix V. Then these are related by
V = R C + v 1^T (*)
where 1^T is a row vector of ones with the right length, R is an orthogonal rotation matrix of the rigid transformation, and v is the offset vector of the transformation.
Now, you have an expression for each column of V from above: the first column is { s a_1 z_1 / px, s b_1 z_1 / py, z_1 } and so on.
You must solve the set of equations (*) for the set of scalars z_i and the rigid transformation defined by R and v.
Difficulties
- The equation is nonlinear in the unknowns, involving quotients of R and z_i.
- We have assumed up to now that you know which 2D coordinates correspond to which vertices of the original shape (if your shape is a square, this is slightly less of a problem).
- We assume there is even a solution at all; if there are errors in the 2D data, it's hard to say how well equation (*) will be satisfied; the transformation could be nonrigid or nonlinear.
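For the adventurous, here is a minimal numerical sketch of how one might attack (*), assuming the point correspondences and the projection parameters px, py, s are known. The rotation-vector parameterization of R and the use of scipy's least_squares are my illustrative choices, not part of the argument above, and convergence will depend heavily on the initial guess:

import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def solve_pose(ab, C, px, py, s=1.0):
    """Recover R, v and the depths z_i from 2D points ab (n x 2) and
    canonical coordinates C (3 x n) by nonlinear least squares on (*)."""
    n = C.shape[1]

    def residuals(params):
        rvec, v, z = params[:3], params[3:6], params[6:]
        R = Rotation.from_rotvec(rvec).as_matrix()
        # Each column of V is reconstructed from screen coords and depth:
        # { s a_i z_i / px, s b_i z_i / py, z_i }, as derived above.
        V = np.vstack([s * ab[:, 0] * z / px,
                       s * ab[:, 1] * z / py,
                       z])
        return (V - (R @ C + v[:, None])).ravel()

    # Initial guess: no rotation, no offset, unit depths.
    x0 = np.concatenate([np.zeros(3), np.zeros(3), np.ones(n)])
    sol = least_squares(residuals, x0)
    rvec, v, z = sol.x[:3], sol.x[3:6], sol.x[6:]
    return Rotation.from_rotvec(rvec).as_matrix(), v, z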

It's called (digital) photogrammetry. Start Googling.

If you are really interested in this kind of problems (which are common in computer vision, tracking objects with cameras etc.), the following book contains a detailed treatment:
Ma, Soatto, Kosecka, Sastry, An Invitation to 3-D Vision, Springer 2004.
Beware: this is an advanced engineering text, and uses many techniques which are mathematical in nature. Skim through the sample chapters featured on the book's web page to get an idea.

Related

Pose from essential matrix, the order of transform

From what I've understood, the transform order (rotate first or translate first) yields different [R|t].
So I want to know the order of the 4 possible poses you get from the essential matrix SVD.
I've read a code implementation of pose from essential matrix (Hartley and Zisserman's Multiple View Geometry, page 259), and the author seems to interpret it as rotate first, then translate, since he retrieves the camera position using p = -R^T * t.
Also, OpenCV seems to use a translate-first-then-rotate rule, because the t vector I get from calibrating the camera is the position of the camera.
Or maybe I have been wrong and the order doesn't matter?
You shouldn't use SVD to decompose a transformation into rotation and translation components. Viewed as x' = M*x = T*R*x, the translation is just the fourth column of M, and the rotation is in the upper-left 3x3 submatrix.
If you feed the whole 4x4 matrix into SVD, I'm not sure what you'll get out, but it won't be useful. (If nothing else, U and V will not be affine.)
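To illustrate the point, a small numpy sketch (my own, for illustration): read the translation off the fourth column and the rotation off the upper-left 3x3 block, and recover the camera center with the p = -R^T * t formula mentioned in the question:

import numpy as np

def decompose_rigid(M):
    """Split a 4x4 rigid transform M = T @ R into rotation and translation."""
    R = M[:3, :3]   # upper-left 3x3 block is the rotation
    t = M[:3, 3]    # fourth column is the translation
    return R, t

def camera_center(R, t):
    """Camera center for the convention x_cam = R @ x_world + t."""
    return -R.T @ t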

How can I sort a coordinate matrix based on the distance between points in another coordinate matrix in MATLAB?

I am using MATLAB's built-in function procrustes to find the rotation, translation, and scale between two images. I am just using the coordinates of the brightest points in the image and rotating these coordinates about the center of the image. procrustes compares two matrices and gives you the rotation, translation, and scale, but it only works correctly if the rows of the matrices are in the same order for comparison.
I am given an image and a separate comparison coordinate matrix. The end goal is to find how much the image has been rotated, translated, and scaled compared to the coordinate matrix. I can use procrustes for this, but I first need to order the coordinates found in the image to match the order in the comparison coordinate matrix. My thought was to compare the distance between every possible combination of points in the coordinate matrix and compare those to the coordinates I find in the picture. I just do not know how to write this code, because with n coordinates there are n! possible combinations.
Just searching for the shortest distance is not so hard.
A = rand(1E4,2);
B = rand(1E4,2);
tic
idx = nan(1,1E4);
for ct = 1:size(A,1)
    % Squared distance from point A(ct,:) to every point in B
    % (uses implicit expansion; on older MATLAB use bsxfun).
    d = sum((A(ct,:) - B).^2, 2);
    [~, idx(ct)] = min(d);   % index of the nearest point in B
end
toc
plot(A(1:10,1),A(1:10,2),'.r',B(idx(1:10),1),B(idx(1:10),2),'.b')
takes half a second on my PC.
The problems can start when two points in set A are matched to the same location in set B.
length(unique(idx))==length(idx)
This can be solved in several ways. The best (imho) is to determine a probability that point B matches with point A based on the distance (usually something that decreases exponentially), and solve for the most probable situation.
A simpler method (but more error prone) is to remove the matched point from set B.
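If you want a strict one-to-one matching up front instead, the Hungarian algorithm on the pairwise distance matrix is a standard tool. A short Python sketch of the idea (the answer's code above is MATLAB; this just shows the technique with scipy):

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

A = np.random.rand(100, 2)
B = np.random.rand(100, 2)

# Minimize the total matching distance subject to each point in A
# being paired with exactly one distinct point in B.
rows, cols = linear_sum_assignment(cdist(A, B))
# cols[i] is the index in B matched to A[rows[i]].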

Plotting a location into an irregular rectangle

I have 4 Point values: TopLeft, TopRight, BottomLeft, BottomRight. These define a four-sided shape (like a distorted rectangle) on my monitor. They are the points a Tobii gaze device thinks I am looking at when in fact I am looking at the four corners of my monitor.
This picture shows a bitmap on the left representing my monitor, and the points the Tobii device tells me I am looking at when I am in fact looking at the corners of the screen. (It's a representation, not real).
I want to use those four calibration points to take a screen X,Y position that is from an inaccurate gaze position and correct it so that it is positioned as per the image on the right.
Edit: New solution for the edited question is at the end.
This problem is called bilinear interpolation.
Once you grasp the idea, it will be very easy and you will remember it for the rest of your life.
It would be quite long to post all the details here, but I will try.
First, I will name the point on the left to be (x,y) and the right to be (X,Y).
Let (x1,y1), (x1,y2), (x2,y1), (x2,y2) be the corner points on the left rectangle.
Secondly, let's split the problem into 2 bilinear interpolation problems:
- find X
- find Y
Let's find them one by one (X or Y).
Define: Qxx are the values of X (or Y) at the four corners of the right rectangle.
Suppose that we want to find the value of the unknown function f at the point (x, y). It is assumed that we know the value of f at the four points Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1), and Q22 = (x2, y2).
The f(x,y) of your problem is the X or Y in your question.
First interpolate in the x-direction: between Q11 and Q21 to get f(x,y1), and between Q12 and Q22 to get f(x,y2). Then interpolate f(x,y1) and f(x,y2) in the y-direction in the same way to get f(x,y).
Finally, you get X or Y = f(x,y).
Reference : All pictures/formulas/text here are copied from the wiki link (some with modification).
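As a concrete sketch of the two-step interpolation just described (my own illustration; run it once with the corners' X values and once with their Y values):

def bilerp(x, y, x1, x2, y1, y2, q11, q12, q21, q22):
    """Bilinear interpolation; qij is the known value f(xi, yj) at a corner."""
    # Interpolate in the x-direction along the y1 and y2 edges...
    fxy1 = ((x2 - x) * q11 + (x - x1) * q21) / (x2 - x1)
    fxy2 = ((x2 - x) * q12 + (x - x1) * q22) / (x2 - x1)
    # ...then interpolate those two values in the y-direction.
    return ((y2 - y) * fxy1 + (y - y1) * fxy2) / (y2 - y1)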
Edit: After the question was edited, it became very different.
The new problem is the opposite one, and it is called "inverse bilinear interpolation", which is far harder.
For more information, please read http://www.iquilezles.org/www/articles/ibilinear/ibilinear.htm
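For reference, here is a Python transcription of the approach from that article (my adaptation; corners a, b, c, d must be given in order around the quadrilateral). It returns the (u, v) in the unit square that a measured gaze point corresponds to; multiplying by the screen width and height then gives the corrected position:

import math

def cross2(ax, ay, bx, by):
    return ax * by - ay * bx

def inv_bilerp(px, py, a, b, c, d):
    """Invert p = a + u(b-a) + v(d-a) + uv(a-b+c-d) for (u, v)."""
    ex, ey = b[0] - a[0], b[1] - a[1]
    fx, fy = d[0] - a[0], d[1] - a[1]
    gx, gy = a[0] - b[0] + c[0] - d[0], a[1] - b[1] + c[1] - d[1]
    hx, hy = px - a[0], py - a[1]

    k2 = cross2(gx, gy, fx, fy)
    k1 = cross2(ex, ey, fx, fy) + cross2(hx, hy, gx, gy)
    k0 = cross2(hx, hy, ex, ey)

    if abs(k2) < 1e-12:
        v = -k0 / k1                    # opposite edges parallel: linear case
    else:
        disc = math.sqrt(k1 * k1 - 4.0 * k2 * k0)
        v = (-k1 + disc) / (2.0 * k2)   # if outside [0,1], take the other root
    u = (hx - fx * v) / (ex + gx * v)
    return u, v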
You can define a unique Linear Transform using 6 equations. The 3 points which have to align provide those 6 equations, as each pair of matching points provides two equations in x and y.
If you want to pursue this, I can provide the matrix equation which defines the Linear Transform based on how it maps three points. You invert this matrix and it will provide the linear transform.
But having done that, the transform is completely specified: you have no control over where the corner points of the original quadrilateral will go. In general, you can't even define a Linear Transform to map one quadrilateral onto another; that gives 8 equations (2 for each corner) with only 6 unknowns, so it's over-specified. In fact, a Linear Transform always maps a rectangle to a parallelogram, so in general there is no Linear Transform mapping one quadrilateral to another.
So if it can't be a Linear Transform, can it be a non-Linear Transform? Well, yes, but non-Linear Transforms don't necessarily map straight lines to straight lines, so the mapped edges of the quadrilateral (or any other lines) won't be straight. And you still have 14 equations (2 for each point and corner), for which you have to invent some non-Linear Transform with 14 unknowns.
So the problem as stated cannot be solved with a Linear Transform; it's over-specified. Using a non-Linear Transform will require you to devise one with 14 free variables (vs. the 6 in a Linear Transform); this will map the 7 points correctly, but straight lines will no longer be straight. Adding that requirement in adds an infinite number of constraints (one for every point on each line), and you won't even be able to use continuous functions.
There may be some solution to what you are doing in terms of what you are really trying to do (ie the underlying application need), but as a mathematical problem it is unsolvable.
Let me know if you want the matrix equation to produce a Linear Transform based on how it transforms 3 points.
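For completeness, a small numpy sketch (mine, not the answerer's promised equation) of solving the six unknowns of a Linear Transform from three point correspondences:

import numpy as np

def affine_from_3_points(src, dst):
    """Solve the 2D affine map sending three src points to three dst points.

    The six unknowns (a, b, c, d, e, f) of
        x' = a x + b y + c
        y' = d x + e y + f
    give two equations per correspondence, six in total.
    """
    M, rhs = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        M.append([x, y, 1, 0, 0, 0]); rhs.append(xp)
        M.append([0, 0, 0, x, y, 1]); rhs.append(yp)
    a, b, c, d, e, f = np.linalg.solve(np.array(M, float), np.array(rhs, float))
    return np.array([[a, b, c], [d, e, f], [0, 0, 1]])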

Procrustes analysis / Finding the Angle Between two images represented by two sets of 2d points

If I have 2 sets of points, I can rotate one around with Procrustes analysis to align it with the other.
But suppose these 2 sets of points are each attached to images, and I would like to rotate the images as well. Is there any way I can also rotate the image, instead of rotating just the points? The tutorial there uses a dot product for rotation (solve u, s, v = svd(p1' * p2) and then do p2 . v . u', where p' is the transpose of p).
However, that doesn't tell me what the angle between the images is.
The page on Wikipedia calculates an angle between each pair of points, I think.
Maybe what I'm asking is impossible? If I can rotate the first set of points to align it with the second, can't I also rotate the respective image by some angle? Point being, which angle is that?
I noticed that v . u' gives me a 2 x 2 matrix which seems to be the rotation matrix (there's a Wikipedia page, but I can't link to it due to posting privileges). I took the sin and cos from the third and first elements and then used arctan2, but the results I'm getting are kind of weird. I know they have to be converted from radians, but I'm not convinced what I'm doing is right. The rotation it gives me doesn't seem to be what I want when I try it in GIMP, but I'll test some more.
It seems like your approach is mostly correct. Two things which come to mind:
1) The paper you linked to (Procrustes analysis) involves a translation and a scaling in addition to rotation. Depending on whether or not those operations are also performed on your images, you may end up with strange results that don't appear to match.
2) I think you may be overcomplicating your angle calculation. v * u' appears to be the correct rotation matrix, but I believe the correct angle only requires one of the matrix entries in the 2x2 matrix. For instance, just use acos() of the first matrix entry. As you've noticed, this will (depending on the program) give you an answer in radians which you'll have to convert to degrees if you want to try out the rotation in gimp, etc.
Hope this helps!
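As a sketch of the whole pipeline (my own variant: it uses atan2 on two entries of the 2x2 rotation matrix, which keeps the sign of the angle, where acos alone would lose it):

import numpy as np

def rotation_angle(p1, p2):
    """Angle (degrees) rotating point set p1 (n x 2) onto p2."""
    p1 = p1 - p1.mean(axis=0)   # remove the translation first
    p2 = p2 - p2.mean(axis=0)
    u, s, vt = np.linalg.svd(p1.T @ p2)
    R = (u @ vt).T              # 2x2 rotation, as in Procrustes/Kabsch
                                # (det check against reflections omitted)
    return np.degrees(np.arctan2(R[1, 0], R[0, 0]))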

2D Inverse Kinematics Implementation

I am trying to implement inverse kinematics on a 2D arm (made up of three sticks with joints). I am able to rotate the lowest arm to the desired position. Now, I have some questions:
How can I make the upper arm move along with the third one so the end point of the arm reaches the desired point? Do I need to use rotation matrices for both, and if so, can someone give me an example? Is there any other possible way to do this without rotation matrices?
The lowest arm only moves in one direction. I tried Googling it; people say that the cross product of two vectors gives the direction for the arm, but that is for 3D. I am using 2D, and the cross product of two 2D vectors gives a scalar. So, how can I determine its direction?
Any help would be appreciated.
Thanks in advance
Vikram
I'll give it a shot, but since my Robotics are two decades in the past, take it with a grain of salt.
The way I learned it, every joint was described by its own rotation matrix, defined relative to its current position and orientation. The coordinate of the whole arm's endpoint was then calculated by combining the rotation matrices together.
This achieved exactly the effect you are looking for: you could move only one joint (change its orientation), and all the other joints followed automatically.
You won't have much chance in getting around matrices here - in fact, if you use homogeneous coordinates, all joint calculations (rotations as well as translations) can be modeled with matrix multiplications. The advantage is that the full arm position can then be described with a single matrix (plus the arm's origin).
With this transformation matrix, you can tackle the inverse kinematics problem: since the transformation matrix's elements depend on the angles of the joints, you can treat the whole calculation 'endpoint = startpoint x transformation' as a system of equations, and with startpoint and endpoint known, you can solve this system to determine the unknown angles. The difficulty here is that the system may not be solvable, or that there are multiple solutions.
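A tiny numpy illustration of that idea for a planar arm, using 3x3 homogeneous matrices (the angles and link lengths are arbitrary example values):

import numpy as np

def rot(theta):    # rotation at a joint
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def trans(x, y):   # translation along a link
    return np.array([[1, 0, x], [0, 1, y], [0, 0, 1]])

# Two-link arm: chaining the joint matrices gives the endpoint directly.
q1, q2, l1, l2 = 0.3, -0.5, 2.0, 1.5
M = rot(q1) @ trans(l1, 0) @ rot(q2) @ trans(l2, 0)
endpoint = M @ np.array([0.0, 0.0, 1.0])  # homogeneous endpoint coordinates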
I don't quite understand your second question, though - what are you looking for?
Instead of a rotation matrix, the rotation can be represented by its angle or by a complex number on the unit circle, but it's the same thing really. More importantly, you need a representation T of rigid body transformations, so that you can write stuff like t1 * t2 * t3 to compute the position and orientation of the third link.
Use atan2 to compute the angle between the vectors.
As the following Python example shows, those two things are enough to build a small IK solver.
from gameobjects.vector2 import Vector2 as V
from matrix33 import Matrix33 as T
from math import sin, cos, atan2, pi
import random
The gameobjects library does not have 2D transformations, so you have to write matrix33 yourself. Its interface is just like gameobjects.matrix44.
Define the forward kinematics function for the transformation from one joint to the next. We assume the joint rotates by angle and is followed by a fixed transformation joint:
def fk_joint(joint, angle): return T.rotation(angle) * joint
The transformation of the tool is tool == fk(joints, q) where joints are the fixed transformations and q are the joint angles:
def fk(joints, q):
    prev = T.identity()
    for i, joint in enumerate(joints):
        prev = prev * fk_joint(joint, q[i])
    return prev
If the base of the arm has an offset, replace the T.identity() transformation.
The OP is solving the IK problem for position by cyclic coordinate descent. The idea is to move the tool closer to the goal position by adjusting one joint variable at a time. Let q be the angle of a joint and prev be the transformation of the base of the joint. The joint should be rotated by the angle between the vectors to the tool and goal positions:
def ccd_step(q, prev, tool, goal):
    a = tool.get_position() - prev.get_position()
    b = goal - prev.get_position()
    return q + atan2(b.get_y(), b.get_x()) - atan2(a.get_y(), a.get_x())
Traverse the joints and update the tool configuration for every change of a joint value:
def ccd_sweep(joints, tool, q, goal):
    prev = T.identity()
    for i, joint in enumerate(joints):
        next = prev * fk_joint(joint, q[i])
        q[i] = ccd_step(q[i], prev, tool, goal)
        prev = prev * fk_joint(joint, q[i])
        tool = prev * next.get_inverse() * tool
    return prev
Note that fk() and ccd_sweep() are the same for 3D; you just have to rewrite fk_joint() and ccd_step().
Construct an arm with n identical links and run cnt iterations of the CCD sweep, starting from a random arm configuration q:
def ccd_demo(n, cnt):
    q = [random.uniform(-pi, pi) for i in range(n)]
    joints = [T.translation(0, 1)] * n
    tool = fk(joints, q)
    goal = V(0.9, 0.75)  # Some arbitrary goal.
    print "i Error"
    for i in range(cnt):
        tool = ccd_sweep(joints, tool, q, goal)
        error = (tool.get_position() - goal).get_length()
        print "%d %e" % (i, error)
We can try out the solver and compare the rate of convergence for different numbers of links:
>>> ccd_demo(3, 7)
i Error
0 1.671521e-03
1 8.849190e-05
2 4.704854e-06
3 2.500868e-07
4 1.329354e-08
5 7.066271e-10
6 3.756145e-11
>>> ccd_demo(20, 7)
i Error
0 1.504538e-01
1 1.189107e-04
2 8.508951e-08
3 6.089372e-11
4 4.485040e-14
5 2.601336e-15
6 2.504777e-15
In robotics we most often use DH parameters for the forward and reverse kinematics. Wikipedia has a nice introduction.
The DH (Denavit-Hartenberg) notation is part of the solution. It helps you collect a succinct set of values that describe the mechanics of your robot such as link length and joint type.
From there it becomes easier to calculate forward kinematics. The first thing you have to understand is how to translate a coordinate frame from one place to another coordinate frame. For example, given your robot (or its DH table), what set of rotations and translations do you have to apply to one coordinate frame (the world, for example) to know the location of a point (or vector) in the robot's wrist coordinate frame?
As you may already know, homogeneous transform matrices are very useful for such transformations. They are 4x4 matrices that encapsulate rotation and translation. Another very useful property is that if two coordinate frames are chained together, each defined by some rotation and translation, the composite transformation between the end frames is simply the product of the two matrices, so you only need to multiply your target by that single product.
So the DH table will help you build that matrix.
Inverse kinematics is a bit more complicated though and depends on your application. The complication arises from having multiple solutions for the same problem. The greater the number of DOF, the greater the number of solutions.
Think about your arm. Pinch something solid around you. You can move your arm to several locations in the space and still keep your pinching vector unchanged. Solving the inverse kinematics problem involves deciding which solution to choose as well.
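To make the DH idea concrete, here is a hedged numpy sketch of the classic DH link matrix and its chaining into forward kinematics; the parameter names (theta, d, a, alpha) follow the Wikipedia convention referenced above:

import numpy as np

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform for one row of a Denavit-Hartenberg table."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [ 0,       sa,       ca,      d],
        [ 0,        0,        0,      1],
    ])

def forward_kinematics(dh_rows):
    """Chain the per-link transforms: base frame -> end-effector frame."""
    M = np.eye(4)
    for row in dh_rows:
        M = M @ dh_transform(*row)
    return M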
