I dunno if this should go in a Math forum or a Programming forums, but I'll post it in both and see where I get.
I have two computer images... one of them is an "original" image (a large TIF file). The other one is a transformed version of the original image... it's been rotated, sheared and translated in a software program. I need to do some work on the transformed image, but I need the (x-y) coordinates of each pixel in the original image to finish my calculations.
I know that the image was rotated and sheared with a 3x3 Transformation matrix. If I had the matrix, I could derive the second image from the first (or vice-versa) myself. I don't know exactly how much it was rotated, sheared, or translated, so I can't just derive the matrices from a set of known transformations. What I do have is a set of corresponding points (the corners, et al) in each image, and their corresponding (x,y) coordinates. So here's my dilemma:
Using a set of corresponding transformed points ((x,y) -> (x',y'), three or more of them), can I derive the Transformation matrix that was used to turn one image into the other? If I can derive the matrix, I can solve for the original coordinates of all the pixels (all 18-million of 'em) and get the calculations done that I need to do.
Can anyone help? I'm familiar with linear algebra... just not familiar enough to derive this without a whole lotta head scratching. Anything is appreciated!
Mike
Not sure if you want manual or automatic.
Manual
If you specify the transformed coordinates of the four corners of your rectangle, then you can derive the transformation equations:
x' = c1 * x + c2 * y + c3 * x * y + c4
y' = c5 * x + c6 * y + c7 * x * y + c8
(From Pierre Wellner's Interacting with Paper on the DigitalDesk, page 67)
Now you just have to solve for the coefficients of the equation.
With four point pairs, the two sets of four simultaneous linear equations can be quickly solved by Gaussian Elimination to find the values of c1-8.
Lastly, you can turn those equations into the 3x3 matrices you want. The above equations are powerful enough to do non-linear transformations and you can simplify it into the 3x3 affine shear matrix.
But I would just stick with the nonliner equations (above) since they can handle perspective distortion.
Automatic
Same method, but you can use an edge-detector comboined with a line detection algorithm to find a set of 4-ish lines that makeup a rectangle.
If your image rectangles really stand out (whiteish images on a dark background), then you can use corner detection available from libraries like OpenCV's Feature Detection (see cv::cornerHarris).
You can intersect those lines to find the four corners and use the transformation equation.
I think you should start by providing a list of, say 3 points (for 6 unknowns) with X/Y coordinates before and after transformation.
Then somebody more clever than I should pop that into a set of linear equations and then feed it to (say) Wolfram Alpha for solving.
The top of Java's documentation for AffineTransform shows how the matrix needs to be set up:
[ x'] [ m00 m01 m02 ] [ x ] [ m00 x + m01 y + m02 ]
[ y'] = [ m10 m11 m12 ] [ y ] = [ m10 x + m11 y + m12 ]
[ 1 ] [ 0 0 1 ] [ 1 ] [ 1 ]
Removing most of the fluff leaves:
[ x'] [ m00 x + m01 y + m02 ]
[ y'] = [ m10 x + m11 y + m12 ]
Then you just set up a set of 6 x 2 equations like this:
m00 x + m01 y + m02 - x' = 0
m10 x + m11 y + m12 - y' = 0
(repeat for 2 other x/y before/after pairs)
and throw them at an equation solver.
You only need 3 points to define a 3x3 transformation matrix. If you have the points (0,0), (0,1) and (1,0) and transform them by the matrix [a b c d e f 0 0 1], you'll get (c,f), (b,e) and (a,d).
Related
I am working in an application where I need to know the intermediate points between two xy coordinates in CIE 1931 colour space.
In the picture below we can see that a linear transition (straight line) between A and B will go through a series of other colours, and I am struggling to find a mathematical way of describing the transitions between A and B.
Any ideas?
If I got your problem correctly, it's just a matter of finding a parameterization of a point on segment A-B in your illustration. The fact that it is a color is irrelevant.
Let C with coordinates (Xc,Yc) be such a point.
Then C can be written as:
Xc = Xa + a * (Xb-Xa)
Yc = Ya + a * (Yb-Ya)
where a is a fractional number in the range [0 ; 1]
I have two images which one of them is the Original image and the second one is Transformed image.
I have to find out how many degrees Transformed image was rotated using 3x3 transformation matrix. Plus, I need to find how far translated from origin.
Both images are grayscaled and held in matrix variables. Their sizes are same [350 500].
I have found a few lecture notes like this.
Lecture notes say that I should use the following matrix formula for rotation:
For translation matrix the formula is given:
Everything is good. But there are two problems:
I could not imagine how to implement the formulas using MATLAB.
The formulas are shaped to find x',y' values but I already have got x,x',y,y' values. I need to find rotation angle (theta) and tx and ty.
I want to know the equivailence of x, x', y, y' in the the matrix.
I have got the following code:
rotationMatrix = [ cos(theta) sin(theta) 0 ; ...
-sin(theta) cos(theta) 0 ; ...
0 0 1];
translationMatrix = [ 1 0 tx; ...
0 1 ty; ...
0 0 1];
But as you can see, tx, ty, theta variables are not defined before used. How can I calculate theta, tx and ty?
PS: It is forbidden to use Image Processing Toolbox functions.
This is essentially a homography recovery problem. What you are doing is given co-ordinates in one image and the corresponding co-ordinates in the other image, you are trying to recover the combined translation and rotation matrix that was used to warp the points from the one image to the other.
You can essentially combine the rotation and translation into a single matrix by multiplying the two matrices together. Multiplying is simply compositing the two operations together. You would this get:
H = [cos(theta) -sin(theta) tx]
[sin(theta) cos(theta) ty]
[ 0 0 1]
The idea behind this is to find the parameters by minimizing the error through least squares between each pair of points.
Basically, what you want to find is the following relationship:
xi_after = H*xi_before
H is the combined rotation and translation matrix required to map the co-ordinates from the one image to the other. H is also a 3 x 3 matrix, and knowing that the lower right entry (row 3, column 3) is 1, it makes things easier. Also, assuming that your points are in the augmented co-ordinate system, we essentially want to find this relationship for each pair of co-ordinates from the first image (x_i, y_i) to the other (x_i', y_i'):
[p_i*x_i'] [h11 h12 h13] [x_i]
[p_i*y_i'] = [h21 h22 h23] * [y_i]
[ p_i ] [h31 h32 1 ] [ 1 ]
The scale of p_i is to account for homography scaling and vanishing points. Let's perform a matrix-vector multiplication of this equation. We can ignore the 3rd element as it isn't useful to us (for now):
p_i*x_i' = h11*x_i + h12*y_i + h13
p_i*y_i' = h21*x_i + h22*y_i + h23
Now let's take a look at the 3rd element. We know that p_i = h31*x_i + h32*y_i + 1. As such, substituting p_i into each of the equations, and rearranging to solve for x_i' and y_i', we thus get:
x_i' = h11*x_i + h12*y_i + h13 - h31*x_i*x_i' - h32*y_i*x_i'
y_i' = h21*x_i + h22*y_i + h23 - h31*x_i*y_i' - h32*y_i*y_i'
What you have here now are two equations for each unique pair of points. What we can do now is build an over-determined system of equations. Take each pair and build two equations out of them. You will then put it into matrix form, i.e.:
Ah = b
A would be a matrix of coefficients that were built from each set of equations using the co-ordinates from the first image, b would be each pair of points for the second image and h would be the parameters you are solving for. Ultimately, you are finally solving this linear system of equations reformulated in matrix form:
You would solve for the vector h which can be performed through least squares. In MATLAB, you can do this via:
h = A \ b;
A sidenote for you: If the movement between images is truly just a rotation and translation, then h31 and h32 will both be zero after we solve for the parameters. However, I always like to be thorough and so I will solve for h31 and h32 anyway.
NB: This method will only work if you have at least 4 unique pairs of points. Because there are 8 parameters to solve for, and there are 2 equations per point, A must have at least a rank of 8 in order for the system to be consistent (if you want to throw in some linear algebra terminology in the loop). You will not be able to solve this problem if you have less than 4 points.
If you want some MATLAB code, let's assume that your points are stored in sourcePoints and targetPoints. sourcePoints are from the first image and targetPoints are for the second image. Obviously, there should be the same number of points between both images. It is assumed that both sourcePoints and targetPoints are stored as M x 2 matrices. The first columns contain your x co-ordinates while the second columns contain your y co-ordinates.
numPoints = size(sourcePoints, 1);
%// Cast data to double to be sure
sourcePoints = double(sourcePoints);
targetPoints = double(targetPoints);
%//Extract relevant data
xSource = sourcePoints(:,1);
ySource = sourcePoints(:,2);
xTarget = targetPoints(:,1);
yTarget = targetPoints(:,2);
%//Create helper vectors
vec0 = zeros(numPoints, 1);
vec1 = ones(numPoints, 1);
xSourcexTarget = -xSource.*xTarget;
ySourcexTarget = -ySource.*xTarget;
xSourceyTarget = -xSource.*yTarget;
ySourceyTarget = -ySource.*yTarget;
%//Build matrix
A = [xSource ySource vec1 vec0 vec0 vec0 xSourcexTarget ySourcexTarget; ...
vec0 vec0 vec0 xSource ySource vec1 xSourceyTarget ySourceyTarget];
%//Build RHS vector
b = [xTarget; yTarget];
%//Solve homography by least squares
h = A \ b;
%// Reshape to a 3 x 3 matrix (optional)
%// Must transpose as reshape is performed
%// in column major format
h(9) = 1; %// Add in that h33 is 1 before we reshape
hmatrix = reshape(h, 3, 3)';
Once you are finished, you have a combined rotation and translation matrix. If you want the x and y translations, simply pick off column 3, rows 1 and 2 in hmatrix. However, we can also work with the vector of h itself, and so h13 would be element 3, and h23 would be element number 6. If you want the angle of rotation, simply take the appropriate inverse trigonometric function to rows 1, 2 and columns 1, 2. For the h vector, this would be elements 1, 2, 4 and 5. There will be a bit of inconsistency depending on which elements you choose as this was solved by least squares. One way to get a good overall angle would perhaps be to find the angles of all 4 elements then do some sort of average. Either way, this is a good starting point.
References
I learned about homography a while ago through Leow Wee Kheng's Computer Vision course. What I have told you is based on his slides: http://www.comp.nus.edu.sg/~cs4243/lecture/camera.pdf. Take a look at slides 30-32 if you want to know where I pulled this material from. However, the MATLAB code I wrote myself :)
I'm trying to deduct the 2D-transformation parameters from the result.
Given is a large number of samples in an unknown X-Y-coordinate system as well as their respective counterparts in WGS84 (longitude, latitude). Since the area is small, we can assume the target system to be flat, too.
Sadly I don't know which order of scale, rotate, translate was used, and I'm not even sure if there were 1 or 2 translations.
I tried to create a lengthy equation system, but that ended up too complex for me to handle. Basic geometry also failed me, as the order of transformations is unknown and I would have to check every possible combination order.
Is there a systematic approach to this problem?
Figuring out the scaling factor is easy, just choose any two points and find the distance between them in your X-Y space and your WGS84 space and the ratio of them is your scaling factor.
The rotations and translations is a little trickier, but not nearly as difficult when you learn that the result of applying any number of rotations or translations (in 2 dimensions only!) can be reduced to a single rotation about some unknown point by some unknown angle.
Suddenly you have N points to determine 3 unknowns, the axis of rotation (x and y coordinate) and the angle of rotation.
Calculating the rotation looks like this:
Pr = R*(Pxy - Paxis_xy) + Paxis_xy
Pr is your rotated point in X-Y space which then needs to be converted to WGS84 space (if the axes of your coordinate systems are different).
R is the familiar rotation matrix depending on your rotation angle.
Pxy is your unrotated point in X-Y space.
Paxis_xy is the axis of rotation in X-Y space.
To actually find the 3 unknowns, you need to un-scale your WGS84 points (or equivalently scale your X-Y points) by the scaling factor you found and shift your points so that the two coordinate systems have the same origin.
First, finding the angle of rotation: take two corresponding pairs of points P1, P1' and P2, P2' and write out
P1' = R(P1-A) + A
P2' = R(P2-A) + A
where I swapped A = Paxis_xy for brevity. Subtracting the two equations gives:
P2'-P1' = R(P2-P1)
B = R * C
Bx = cos(a) * Cx - sin(a) * Cy
By = cos(a) * Cx + sin(a) * Cy
By + Bx = 2 * cos(a) * Cx
(By + Bx) / (2 * Cx) = cos(a)
...
(By - Bx) / (2 * Cy) = sin(a)
a = atan2(sin(a), cos(a)) <-- to get the right quadrant
And you have your angle, you can also do a quick check that cos(a) * cos(a) + sin(a) * sin(a) == 1 to make sure either you got all the calculations correct or that your system really is an orientation-preserving isometry (consists only of translations and rotations).
Now that we know a we know R and so to find A we do:
P1` = R(P1-A) + A
P1' - R*P1 = (I-R)A
A = (inverse(I-R)) * (P1' - R*P1)
where the inversion of a 2x2 matrix is easy.
EDIT: There is an error in the above, or more specifically one case that needs to be treated separately.
There is one combination of translations and rotations that does not reduce to a single rotation and that is a single translation. You can think of it in terms of fixed points (how many points are unchanged after the operation).
A translation has no fixed points (all points are changed) and a rotation has 1 fixed point (the axis doesn't change). It turns out that two rotations leave 1 fixed point and a translation and a rotation leaves 1 fixed point, which (with a little proof that says the number of fixed points tells you the operation performed) is the reason that arbitrary combinations of these result in a single rotation.
What this means for you is that if your angle comes out as 0 then using the method above will give you A = 0 as well, which is likely incorrect. In this case you have to do A = P1' - P1.
If I understood the question correctly, you have n points (X1,Y1),...,(Xn,Yn), the corresponding points, say, (x1,y1),...,(xn,yn) in another coordinate system, and the former are supposedly obtained from the latter by rotation, scaling and translation.
Note that this data does not determine the fixed point of rotation / scaling, or the order in which the operations "should" be applied. On the other hand, if you know these beforehand or choose them arbitrarily, you will find a rotation, translation and scaling factor that transform the data as supposed to.
For example, you can pick an any point, say, p0 = [X1, Y1]T (column vector) as the fixed point of rotation & scaling and subtract its coordinates from those of two other points to get p2 = [X2-X1, Y2-Y1]T, and p3 = [X3-X1, Y3-Y1]T. Also take the column vectors q2 = [x2-x1, y2-y1]T, q3 = [x3-x1, y3-y1]T. Now [p2 p3] = A*[q2 q3], where A is an unknwon 2x2 matrix representing the roto-scaling. You can solve it (unless you were unlucky and chose degenerate points) as A = [p2 p3] * [q2 q3]-1 where -1 denotes matrix inverse (of the 2x2 matrix [q2 q3]). Now, if the transformation between the coordinate systems really is a roto-scaling-translation, all the points should satisfy Pk = A * (Qk-q0) + p0, where Pk = [Xk, Yk]T, Qk = [xk, yk]T, q0=[x1, y1]T, and k=1,..,n.
If you want, you can quite easily determine the scaling and rotation parameter from the components of A or combine b = -A * q0 + p0 to get Pk = A*Qk + b.
The above method does not react well to noise or choosing degenerate points. If necessary, this can be fixed by applying, e.g., Principal Component Analysis, which is also just a few lines of code if MATLAB or some other linear algebra tools are available.
I have a quadratic bezier curve described as (startX, startY) to (anchorX, anchorY) and using a control point (controlX, controlY).
I have two questions:
(1) I want to determine y points on that curve based on an x point.
(2) Then, given a line-segment on my bezier (defined by two intermediary points on my bezier curve (startX', startY', anchorX', anchorY')), I want to know the control point for that line-segment so that it overlaps the original bezier exactly.
Why? I want this information for an optimization. I am drawing lots of horizontal beziers. When the beziers are larger than the screen, performance suffers because the rendering engine ends up rendering beyond the extents of what is visible. The answers to this question will let me just render what is visible.
Part 1
The formula for a quadratic Bezier is:
B(t) = a(1-t)2 + 2bt(1-t) + ct2
= a(1-2t+t2) + 2bt - 2bt2 + ct2
= (a-2b+c)t2+2(b-a)t + a
where bold indicates a vector. With Bx(t) given, we have:
x = (ax-2bx+cx)t2+2(bx-ax)t + ax
where vx is the x component of v.
According to the quadratic formula,
-2(bx-ax) ± 2√((bx-ax)2 - ax(ax-2bx+cx))
t = -----------------------------------------
2(ax-2bx+cx)
ax-bx ± √(bx2 - axcx)
= ----------------------
ax-2bx+cx
Assuming a solution exists, plug that t back into the original equation to get the other components of B(t) at a given x.
Part 2
Rather than producing a second Bezier curve that coincides with part of the first (I don't feel like crunching symbols right now), you can simply limit the domain of your parametric parameter to a proper sub-interval of [0,1]. That is, use part 1 to find the values of t for two different values of x; call these t-values i and j. Draw B(t) for t ∈ [i,j]. Equivalently, draw B(t(j-i)+i) for t ∈ [0,1].
The t equation is wrong, you need to use eq(1)
(1) x = (ax-2bx+cx)t2+2(bx-ax)t + ax
and solve it using the the quadratic formula for the roots (2).
-b ± √(b^2 - 4ac)
(2) x = -----------------
2a
Where
a = ax-2bx+cx
b = 2(bx-ax)
c = ax - x
Trilinear interpolation approximates the value of a point (x, y, z) inside a cube using the values at the cube vertices. I´m trying to do an "inverse" trilinear interpolation. Knowing the values at the cube vertices and the value attached to a point how can I find (x, y, z)? Any help would be highly appreciated. Thank you!
You are solving for 3 unknowns given 1 piece of data, and as you are using a linear interpolation your answer will typically be a plane (2 free variables). Depending on the cube there may be no solutions or a 3D solution space.
I would do the following. Let v be the initial value. For each "edge" of the 12 edges (pair of adjacent vertices) of the cube look to see if 1 vertex is >=v and the other <=v - call this an edge that crosses v.
If no edges cross v, then there are no possible solutions.
Otherwise, for each edge that crosses v, if both vertices for the edge equal v, then the whole edge is a solution. Otherwise, linearly interpolate on the edge to find the point that has a value of v. So suppose the edge is (x1, y1, z1)->v1 <= v <= (x2, y2, z2)->v2.
s = (v-v1)/(v2-v1)
(x,y,z) = (s*(x2-x1)+x1, (s*(y2-y1)+y1, s*(z2-z1)+z1)
This will give you all edge points that are equal to v. This is a solution, but possibly you want an internal solution - be aware that if there is an internal solution there will always be an edge solution.
If you want an internal solution then just take any point linearly between the edge solutions - as you are linearly interpolating then the result will also be v.
I'm not sure you can for all cases. For example using tri-linear filtering for colours where each colour (C) at each point is identical means that wherever you interpolate to you will still get the colour C returned. In this situation ANY x,y,z could be valid. As such it would be impossible to say for definite what the initial interpolation values were.
I'm sure for some cases you can reverse the maths but, i imagine, there are far too many cases where this is impossible to do without knowing more of the input information.
Good luck, I hope someone will prove me wrong :)
The wikipedia page for trilinear interpolation has link to a NASA page which allegedly describes the inversing process - have you had a look at that?
The problem as you're describing it somewhat ill-defined.
What you're asking for basically translates to this: I have a 3D function and I know its values in 8 known points. I'd like to know what is the point in which the function received value V.
The trouble is that in most likelihood there is an infinite number of such points which make a set of surfaces, lines or points, depending on the data.
One way to find this set is to use an iso-surfacing algorithm like Marching cubes.
Let's start with 2d: think of a bilinear hill over a square km,
with heights say 0 10 20 30 at the 4 corners
and a horizontal plane cutting the hill at height z.
Draw a line from the 0 corner to the 30 corner (whether adjacent or diagonal).
The plane must cut this line, for any z,
so all points x,y,z fall on this one line, right ? Hmm.
OK, there are many solutions -- any z plane cuts the hill in a contour curve.
Say we want solutions to be spread out over the whole hill,
i.e. minimize two things at once:
vertical distance z - bilin(x,y),
distance from x,y to some point in the square.
Scipy.optimize.leastsq is one way of doing this, sample code below;
trilinear is similar.
(Optimizing any two things at once requires an arbitrary tradeoff or weighting:
food vs. money, work vs. play ...
Cf. Bounded rationality
)
""" find x,y so bilin(x,y) ~ z and x,y near the middle """
from __future__ import division
import numpy as np
from scipy.optimize import leastsq
zmax = 30
corners = [ 0, 10, 20, zmax ]
midweight = 10
def bilin( x, y ):
""" bilinear interpolate
in: corners at 0 0 0 1 1 0 1 1 in that order (binary)
see wikipedia Bilinear_interpolation ff.
"""
z00,z01,z10,z11 = corners # 0 .. 1
return (z00 * (1-x) * (1-y)
+ z01 * (1-x) * y
+ z10 * x * (1-y)
+ z11 * x * y)
vecs = np.array([ (x, y) for x in (.25, .5, .75) for y in (.25, .5, .75) ])
def nearvec( x, vecs ):
""" -> (min, nearest vec) """
t = (np.inf,)
for v in vecs:
n = np.linalg.norm( x - v )
if n < t[0]: t = (n, v)
return t
def lsqmin( xy ): # z, corners
x,y = xy
near = nearvec( np.array(xy), vecs )[0] * midweight
return (z - bilin( x, y ), near )
# i.e. find x,y so both bilin(x,y) ~ z and x,y near a point in vecs
#...............................................................................
if __name__ == "__main__":
import sys
ftol = .1
maxfev = 10
exec "\n".join( sys.argv[1:] ) # ftol= ...
x0 = np.array(( .5, .5 ))
sumdiff = 0
for z in range(zmax+1):
xetc = leastsq( lsqmin, x0, ftol=ftol, maxfev=maxfev, full_output=1 )
# (x, {cov_x, infodict, mesg}, ier)
x,y = xetc[0] # may be < 0 or > 1
diff = bilin( x, y ) - z
sumdiff += abs(diff)
print "%.2g %8.2g %5.2g %5.2g" % (z, diff, x, y)
print "ftol %.2g maxfev %d midweight %.2g => av diff %.2g" % (
ftol, maxfev, midweight, sumdiff/zmax)