OK, here is the story: I found this problem on a pizza box a few weeks ago. It said that if you could solve it before finishing the pizza, you would get hired at TripAdvisor. Though I am not looking to get hired, this problem caught my eye and spoiled my focus on pizza and dinner. I worked out something, but with some assumptions. Here is the question:
Assume we know P, Q, R, and S. There is a line connecting the centers of the two rectangles. We need to find the points C and D. I am not sure whether there is some other variable we would need to know to solve this.
EDIT
Looking for a programmatic or pseudo-code explanation; no need to move this to Math Stack Exchange.
Any suggestions ?
It's pretty simple to do step-by-step:
Compute A = (P + Q) / 2 and B = (R + S) / 2 (component-by-component)
An equation for the line between A and B is L(t) = A + t * (B - A). Just solve
this linear equation for a t* such that L(t*).y = Q.y to get C = L(t*). Do the same with L(t).y = R.y to get D.
You can also use the values of t* that you get when solving for C and D to determine pathological cases like overlapping rectangles.
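A minimal MATLAB sketch of those steps (the corner values are assumed for illustration; P,Q are opposite corners of the upper rectangle with Q on its bottom edge, and R,S opposite corners of the lower rectangle with R on its top edge):

P = [0 6]; Q = [3 4]; R = [5 2]; S = [9 0];   % example corners (assumed)

A = (P + Q) / 2;                       % center of rectangle PQ
B = (R + S) / 2;                       % center of rectangle RS

% L(t) = A + t*(B - A); solve L(t).y = y0 for the crossing parameter t
tC = (Q(2) - A(2)) / (B(2) - A(2));    % crossing of the bottom edge of PQ
C  = A + tC * (B - A);
tD = (R(2) - A(2)) / (B(2) - A(2));    % crossing of the top edge of RS
D  = A + tD * (B - A);

% tC or tD outside [0,1] flags pathological cases such as overlapping rectangles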
You actually don't need to find the points C and D to find the distance.
I assume you already know the coordinates of the rectangles. It's trivial to compute the coordinates of the center points and the lengths of the edges.
Now, imagine a vertical line passing through A and a horizontal line passing through B. They intersect at a point; call it X. Also, imagine a vertical line passing through C; call its intersection point with the top edge of rectangle RS C'.
You can trivially compute the length of AX. But the length of AX is half the height of RS + half the height of PQ (both of which you know) + the length of CC'.
So now you know the length of CC' (call it x).
You can also compute the angle (call it n) that AB makes with CC' from A and B's coordinates, since you know CC' is vertical.
Ergo, since the vertical extent of CD is x, the length of the segment CD is x / cos(n).
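A sketch of that shortcut in MATLAB, with the same assumed corners as in the previous answer (PQ above RS, bottom edge of PQ at height Q(2), top edge of RS at height R(2)):

P = [0 6]; Q = [3 4]; R = [5 2]; S = [9 0];   % example corners (assumed)
A  = (P + Q) / 2;  B = (R + S) / 2;
AX = A(2) - B(2);                         % length of AX
x  = AX - (A(2) - Q(2)) - (R(2) - B(2));  % minus both half-heights: CC'
cn = AX / norm(A - B);                    % cos(n), n = angle of AB to vertical
CD = x / cn                               % distance between the rectangles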
I have been querying Google for material about kd-trees and image comparison, but I couldn't make the 'link' between the two: how the techniques for image comparison actually use kd-trees.
Firstly, I found some articles talking about speed improvement with randomized kd-trees, then I was introduced to SIFT. After understanding basically how SIFT works, I read about nearest neighbor search.
My real question is: if I have a set of points from SIFT and I create a kd-tree for every image, how can nearest neighbor search help me compare the images? At first, I thought that comparing images with trees would work with some algorithm checking the tree structure and how near each point from image A is to a point in the same node of image B.
If the question is too dumb, please suggest material or some topic for search.
Thank you!
I'd suggest first understanding slow feature matching, without kd-trees.
input: 1000 reference features, e.g. of faces or flowers; call these F1 .. F1000
a query feature Q: which face or flower feature is most like, i.e. nearest to, Q?
As you know, SIFT reduces an image feature to 128 8-bit numbers, scaled so that
similarity( feature F, feature Q ) = Euclidean distance( SIFT(F), SIFT(Q) ).
The simplest way to find which of F1 .. F1000 is most like Q is just to look at F1, F2 ... one by one:
# find the feature of F1 .. F1000 nearest Q
nearestdistance = infinity
nearestindex = 0
for j in 1 .. 1000:
    distance = Euclideandistance( SIFT(Fj), SIFT(Q) )  # 128 numbers vs. 128 numbers
    if distance < nearestdistance:
        nearestdistance = distance
        nearestindex = j
(Of course one computes the SIFT numbers outside the loop.)
A kd-tree is just a way of finding nearby vectors quickly; it has little to do with what is being matched (vectors of numbers representing ...), or how (Euclidean distance).
Now kd-trees are very fast for 2d, 3d ... up to perhaps 20d, but may be no faster than a linear scan of all the data above about 20d. So how can a kd-tree work for features in 128d? The main trick is to quit searching early.
The paper by Muja and Lowe, Fast approximate nearest neighbors with automatic algorithm configuration, 2009, 10p, describes multiple randomized kd-trees for matching 128d SIFT features. (Lowe is the inventor of SIFT.)
To compare two images I and Q, one finds a set of feature vectors -- several hundred up to a few thousand SIFT vectors -- for each, and looks for near matches of these sets. (One may think of images as molecules and features as atoms; near-matching molecules is much harder than near-matching atoms, but it helps to be able to match atoms quickly.)
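For a concrete picture of the kd-tree step, here is a minimal MATLAB sketch (it assumes the Statistics Toolbox; the descriptors are random stand-ins for real SIFT vectors, and the threshold is made up):

refFeatures   = rand(1000, 128);   % one 128d descriptor per row (stand-ins)
queryFeatures = rand(500,  128);   % descriptors from the query image

tree = KDTreeSearcher(refFeatures);           % build the kd-tree once
[idx, dist] = knnsearch(tree, queryFeatures); % nearest reference per query

% A crude image-level score: many small descriptor distances = similar images
threshold = 0.5;                   % problem-specific, assumed here
score = sum(dist < threshold)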
Hope this helps.
If you are planning on using kd-trees for approximate NN search in higher dimensions, you might want to review the experiments here: http://zach.in.tu-clausthal.de/software/approximate_nn/
I suggest extracting the colour-code values of each image and creating a KD-tree using those feature vectors.
You can use the following MATLAB code to extract the colour-code features.
[im, colourMap] = imread('image.jpg');
if size(im,3) == 1                       % indexed/grayscale image
    im = ind2rgb(im, colourMap);         % expand to RGB via the colour map
    im = uint8(im .* 255);
end
% Quantize each channel into 4 levels (2 bits per channel)
im(logical(  0 <= im & im <=  63)) = 0;
im(logical( 64 <= im & im <= 127)) = 1;
im(logical(128 <= im & im <= 191)) = 2;
im(logical(192 <= im & im <= 255)) = 3;
% Pack the three 2-bit channels into one 6-bit colour code (0..63)
im = im(:,:,1) * 16 + im(:,:,2) * 4 + im(:,:,3);
imHist = histc(double(im(:)), 0:63);     % 64-bin colour-code histogram
I came across a traveling salesman solution that uses a MATLAB script, and in its code I found a representation called City Coordinates, which looks like:
CityCood = [0.4000,0.2439,0.1707,0.2239,0.5171;0.4439,0.1463,0.2293,0.7610,0.9414]
for 5 cities.
At this point, I am really clueless about how the author got this representation, since from what I have seen so far, the information at hand should be a 5*5 symmetric matrix representing the distance between any two of these five cities.
So I would be grateful if anyone could give me an idea of how that coordinate-based representation works. Thanks in advance.
CityCoord (I think there's a letter missing) is a 2-by-5 array. I assume this means that CityCoord contains two coordinates (x,y) for every single city.
To create a 5-by-5 distance matrix, you can call
squareform(pdist(CityCoord'))
If you don't have the Statistics Toolbox, an equivalent form of the solution provided by @Jonas to compute the Euclidean distance is:
%# dist(u,v) = norm(u-v) = sqrt(sum((u-v).^2))
D = cell2mat( arrayfun( ...
    @(i) sqrt( sum( bsxfun(@minus, CityCoord, CityCoord(:,i)).^2 ) ), ...
    (1:size(CityCoord,2))', ...
    'UniformOutput',false) );
Otherwise, we can use the fact that ||u-v||^2 = ||u||^2 + ||v||^2 - 2*u.v to implement an even faster vectorized code:
X = sum(CityCoord.^2);
D = real( sqrt(bsxfun(@plus,X,X')-2*(CityCoord'*CityCoord)) );
Given a set of points, what's the fastest way to fit a parabola to them? Is it doing the least squares calculation or is there an iterative way?
Thanks
Edit:
I think gradient descent is the way to go. The least squares calculation would have been a little more taxing (having to do a QR decomposition or something similar to keep things stable).
If the points have no error associated with them, you may interpolate using three points. Otherwise, least squares or any equivalent formulation is the way to go.
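In MATLAB, the least-squares route is one line with polyfit; a quick sketch on made-up noisy data:

x = linspace(-5, 5, 50);
y = 2*x.^2 - 3*x + 1 + 0.5*randn(size(x));   % noisy samples of a parabola

p = polyfit(x, y, 2);    % p = [a b c] for the fit y = a*x^2 + b*x + c
yfit = polyval(p, x);    % evaluate the fitted parabola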
I recently needed to find a parabola that passes through 3 points.
Suppose you have (x1,y1), (x2,y2) and (x3,y3), and you want the parabola
y-y0 = a*(x-x0)^2
to pass through them: find y0, x0, and a.
You can do some algebra and get this solution (provided the points aren't all on a line):
let c = (y1-y2) / (y2-y3)
x0 = ( -x1^2 + x2^2 + c*( x2^2 - x3^2 ) ) / (2.0*( -x1+x2 + c*x2 - c*x3 ))
a = (y1-y2) / ( (x1-x0)^2 - (x2-x0)^2 )
y0 = y1 - a*(x1-x0)^2
Note that in the equation for c, if y2 == y3 then you've got a problem (division by zero). So in my algorithm I check for this and swap, say, x1, y1 with x2, y2 and then proceed.
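A small MATLAB sketch of those formulas, with assumed example points (it swaps the first two points when y2 == y3, as described):

x1 = 0; y1 = 1; x2 = 1; y2 = 0; x3 = 2; y3 = 1;   % example points (assumed)
if y2 == y3                                   % avoid division by zero in c
    [x1, y1, x2, y2] = deal(x2, y2, x1, y1);
end
c  = (y1 - y2) / (y2 - y3);
x0 = (-x1^2 + x2^2 + c*(x2^2 - x3^2)) / (2*(-x1 + x2 + c*x2 - c*x3));
a  = (y1 - y2) / ((x1 - x0)^2 - (x2 - x0)^2);
y0 = y1 - a*(x1 - x0)^2;                      % parabola: y - y0 = a*(x - x0)^2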
hope that helps!
Paul Probert
A calculated (closed-form) solution is almost always faster than an iterative one. The "exception" would be for low iteration counts and complex calculations.
I would use the least squares method. I've only ever coded it for linear regression fits, but it can be used for parabolas (I had reason to look it up recently; sources included an old edition of "Numerical Recipes", Press et al., and "Engineering Mathematics", Kreyszig).
ALGORITHM FOR PARABOLA
Read the number of data points n and the order of the polynomial Mp.
Read the data values.
If n < Mp, regression is not possible: stop. Otherwise, continue.
Set M = Mp + 1.
Compute the coefficients of the C-matrix.
Compute the coefficients of the B-matrix.
Solve for the coefficients a1, a2, ..., an.
Write out the coefficients.
Estimate the function value at the given values of the independent variables.
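A MATLAB sketch of what those steps amount to for a parabola (Mp = 2): build the normal equations C*a = B from sums of powers of x and solve. The sample data here is assumed:

x = (0:9)';  y = 3 + 2*x + 0.5*x.^2 + 0.1*randn(10,1);   % assumed data

M = 3;                              % Mp + 1 coefficients for a parabola
C = zeros(M);  B = zeros(M,1);
for r = 1:M
    for c = 1:M
        C(r,c) = sum(x.^(r+c-2));   % C-matrix: sums of powers of x
    end
    B(r) = sum(y .* x.^(r-1));      % B-matrix: sums of y times powers of x
end
a = C \ B;                          % coefficients of 1, x, x^2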
Using the free arbitrary-precision math program "PARI" (for Mac or PC):
Here is how I would fit a parabola to a set of 641 points,
and I also show how to find the minimum of that parabola:
Set a high number of digits of precision:
\p 300
Write the data points to a text file, one space between each data point (use ASCII characters in base ten; no space at file start or end; no returns; write extremely large or small floating-point values as, for example, "9.0E-23" but not "9.0D-23").
Make a string pointing to that file:
fileone="./desktop/data.txt"
Read that file into PARI using the following instructions:
fileopen(fileone, "r")
readsplit(file) = my(cmd);cmd="perl -ne \"chomp; print '[' . join(',', split(/ +/)) . ']\n';\"";eval(externstr(Str(cmd," ",file)))
readsplit(fileone)
Label that data with a name:
in = %
V = in[1]
Define a least squares fit function:
lsf(X,Y,n) = my(M=matrix(#X,n+1,i,j,X[i]^(j-1)));fit=Polrev(matsolve(M~*M,M~*Y~))
Apply that lsf function to your 641 data points:
lsf([-320..320],V, 2)
Then if you want to show the minimum of that parabolic fit, enter:
xextreme = solve (x=-1000,1000,eval(deriv(fit)));print (xextreme*(124.5678-123.5678)/640+(124.5678+123.5678)/2);x=xextreme;print(eval(fit))
(I had to adjust for my particular x-axis scaling before the "print" statement in that command line above).
(Note: a sacrifice made to simplify this algorithm causes it to work only when the data set has equally spaced x-axis coordinates.)
I was worried that my last post was too compact to follow and too hard to convert to other environments. I would like to show here how to solve the generalized problem of parabolic data fitting explicitly, without specialized matrix math terminology, and so that each multiplication, division, subtraction and addition can be seen at once.
To save ink, this fit reparameterizes the x-axis as evenly spaced points centered on zero, so that the odd-powered sums all get eliminated (saving a lot of space and time); the x-coordinates of the N data points are effectively labeled by the points of this vector: X=[-(N-1)/2..(N-1)/2]. For example, "xextreme" will be returned versus those integer indices, and so (if desired) a simple linear transformation (consuming very little CPU time) must be applied after the algorithm below to get it versus your problem's particular x-axis labels.
This is written in the language of the free program "PARI", but all the commands are simple to translate to any language.
Step 1: assign a label to the y-axis data:
? V=[5,2,1,2,5]
"PARI" confirms that entry:
%280 = [5, 2, 1, 2, 5]
Then type in the following processing algorithm, which calculates a best-fit parabola through any y-axis data set with constant x-axis separation:
? g=#V;h=(g-1)*g*(g+1)/3;i=h*(3*g*g-7)/5;\
a=sum(i=1,g,V[i]);b=sum(i=1,g,(2*i-1-g)*V[i]);c=sum(i=1,g,(2*i-1-g)*(2*i-1-g)*V[i]);\
A=matdet([a,c;h,i])/matdet([g,h;h,i]);B=b/h*2;C=matdet([g,h;a,c])/matdet([g,h;h,i])*4;\
xextreme=-B/(2*C);yextreme=-B*B/(4*C)+A;fit=Polrev([A,B,C]);\
print("\n","y of extreme is ",yextreme,"\n","which occurs this many data points from center of data: ",xextreme)
(Note for non-PARI users:
the command "matdet([a,c;h,i])"
is just another way of entering "a*i-c*h")
Those commands then produce the following screen output:
y of extreme is 1
which occurs this many data points from center of data: 0
The algorithm stores the polynomial of the fit in the variable "fit":
? fit
%282 = x^2 + 1
?
(Note that to make the algorithm short, the x-axis labels are assigned as X=[-(N-1)/2..(N-1)/2]; thus here they are X=[-2,-1,0,1,2]. To correct that for the same polynomial as parameterized by an x-axis coordinate data set of, say, X=[-1,0,1,2,3], just apply a simple linear transform, in this case: "x^2 + 1" --> "(t - 1)^2 + 1".)
I have two line segments: (X1,Y1,Z1)-(X2,Y2,Z2) and (X3,Y3,Z3)-(X4,Y4,Z4).
I am trying to find the shortest distance between the two segments.
I have been looking for a solution for hours, but all of them seem to work with lines rather than line segments.
Any ideas how to go about this, or any sources of formulae?
I'll answer this in terms of MATLAB, but other programming environments can be used. I'll add that this solution is valid for the problem in any number of dimensions (>= 3).
Assume that we have two line segments in space, PQ and RS. Here are a few random sets of points.
> P = randn(1,3)
P =
-0.43256 -1.6656 0.12533
> Q = randn(1,3)
Q =
0.28768 -1.1465 1.1909
> R = randn(1,3)
R =
1.1892 -0.037633 0.32729
> S = randn(1,3)
S =
0.17464 -0.18671 0.72579
The infinite line PQ(t) is easily defined as
PQ(u) = P + u*(Q-P)
Likewise, we have
RS(v) = R + v*(S-R)
See that for each line, when the parameter is at 0 or 1, we get one of the original endpoints on the line returned. Thus, we know that PQ(0) == P, PQ(1) == Q, RS(0) == R, and RS(1) == S.
This way of defining a line parametrically is very useful in many contexts.
Next, imagine we were looking down along line PQ. Can we find the point of smallest distance from the line segment RS to the infinite line PQ? This is most easily done by a projection into the null space of line PQ.
> N = null(P-Q)
N =
-0.37428 -0.76828
0.9078 -0.18927
-0.18927 0.61149
Thus, null(P-Q) is a pair of basis vectors that span the two dimensional subspace orthogonal to the line PQ.
> r = (R-P)*N
r =
0.83265 -1.4306
> s = (S-P)*N
s =
1.0016 -0.37923
Essentially what we have done is to project the vector RS into the 2 dimensional subspace (plane) orthogonal to the line PQ. By subtracting off P (a point on line PQ) to get r and s, we ensure that the infinite line passes through the origin in this projection plane.
So really, we have reduced this to finding the minimum distance from the line rs(v) to the origin (0,0) in the projection plane. Recall that the line rs(v) is defined by the parameter v as:
rs(v) = r + v*(s-r)
The normal vector to the line rs(v) will give us what we need. Since we have reduced this to 2 dimensions because the original space was 3-d, we can do it simply. Otherwise, I'd just have used null again. This little trick works in 2-d:
> n = (s - r)*[0 -1;1 0];
> n = n/norm(n);
n is now a vector with unit length. The distance from the infinite line rs(v) to the origin is simple.
> d = dot(n,r)
d =
1.0491
See that I could also have used s to get the same distance. The actual distance is abs(d), but as it turns out, d was positive here anyway.
> d = dot(n,s)
d =
1.0491
Can we determine v from this? Yes. Recall that the origin is a distance of d units from the line that connects points r and s. Therefore we can write d*n = r + v*(s-r), for some value of the scalar v. Form the dot product of each side of this equation with the vector (s-r), and solve for v.
> v = dot(s-r,d*n-r)/dot(s-r,s-r)
v =
1.2024
This tells us that the closest approach of the line segment rs to the origin happened outside the end points of the line segment. So really the closest point on rs to the origin was the point rs(1) = s.
Backing out from the projection, this tells us that the closest point on line segment RS to the infinite line PQ was the point S.
There is one more step in the analysis to take. What is the closest point on the line segment PQ? Does this point fall inside the line segment, or does it too fall outside the endpoints?
We project the point S onto the line PQ. (This expression for u is easily enough derived from similar logic as I did before. Note here that I've used \ to do the work.)
> u = (Q-P)'\((S - (S*N)*N') - P)'
u =
0.95903
See that u lies in the interval [0,1]. We have solved the problem. The point on line PQ is
> P + u*(Q-P)
ans =
0.25817 -1.1677 1.1473
And, the distance between closest points on the two line segments was
> norm(P + u*(Q-P) - S)
ans =
1.071
Of course, all of this can be compressed into just a few short lines of code. But it helps to expand it all out to gain understanding of how it works.
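As an example of that compression, here is a sketch using the standard clamped parametric method (as in Ericson's "Real-Time Collision Detection") rather than the projection walkthrough above; it assumes both segments have distinct endpoints:

function [d, Cp, Cr] = segsegdist(P, Q, R, S)
% Closest points and distance between segments P-Q and R-S (row vectors)
d1 = Q - P;  d2 = S - R;  r0 = P - R;
a = dot(d1,d1);  e = dot(d2,d2);
b = dot(d1,d2);  c = dot(d1,r0);  f = dot(d2,r0);
den = a*e - b*b;                          % >= 0; zero iff segments parallel
if den > 0
    u = min(max((b*f - c*e)/den, 0), 1);  % clamp u into [0,1]
else
    u = 0;                                % parallel: any u works, pick 0
end
v = (b*u + f)/e;                          % closest point on R-S to PQ(u)
if v < 0                                  % v fell off an end: clamp it,
    v = 0;  u = min(max(-c/a, 0), 1);     % then recompute and clamp u
elseif v > 1
    v = 1;  u = min(max((b - c)/a, 0), 1);
end
Cp = P + u*d1;  Cr = R + v*d2;            % closest points on each segment
d = norm(Cp - Cr);
end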
One basic approach is the same as computing the shortest distance between 2 lines, with one exception.
If you look at most algorithms for finding the shortest distance between 2 lines, you'll find that it finds the points on each line that are the closest, then computes the distance from them.
The trick to extending this to segments (or rays) is to see if that point is beyond one of the end points of the segment, and if so, use the end point instead of the actual closest point on the infinite line.
For a concrete sample, see:
http://softsurfer.com/Archive/algorithm_0106/algorithm_0106.htm
More specifically:
http://softsurfer.com/Archive/algorithm_0106/algorithm_0106.htm#dist3D_Segment_to_Segment()
I would parameterize both line segments with one parameter each, bound between 0 and 1, inclusive. Then find the difference between both line functions and use that as the objective function in a constrained optimization problem with the parameters as variables.
So say you have a line from (0,0,0) to (1,0,0) and another from (0,1,0) to (0,0,0) (Yeah, I'm using easy ones). The lines can be parameterized like (1*t,0*t,0*t) where t lies in [0,1] and (0*s,1*s,0*s) where s lies in [0,1], independent of t.
Then you need to minimize ||(1*t, -1*s, 0)|| where t and s lie in [0,1]. That's a pretty simple problem to solve (here the minimum is 0 at t = s = 0, since these two segments touch at the origin).
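A sketch of that formulation on the easy example above (assumes the Optimization Toolbox's fmincon):

P1 = [0 0 0];  P2 = [1 0 0];          % first segment
P3 = [0 1 0];  P4 = [0 0 0];          % second segment
gap = @(w) norm((P1 + w(1)*(P2-P1)) - (P3 + w(2)*(P4-P3)));
w = fmincon(gap, [0.5 0.5], [], [], [], [], [0 0], [1 1]);
mindist = gap(w)                      % 0 here, at t = s = 0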
How about extending the line segments into infinite lines and finding the shortest distance between the two lines. Then find the points on each line that are the end points of the shortest-distance segment.
If the point for each line is on the original line segment, then you have the answer. If a point for each line is not on the original segment, then the point is one of the original line segment's end points.
Finding the distance between two finite lines by finding it between the two infinite lines and then bounding the infinite lines to the finite segments doesn't always work. For example, try these points:
Q=[5 2 0]
P=[2 2 0]
S=[3 3.25 0]
R=[0 3 0]
Based on the infinite-line approach, the algorithm selects R and P for the distance calculation (distance = 2.2361), but a point somewhere in the middle of the segment R-S is closer to the point P. Selecting P and the point [2 3.166] on the line from R to S gives the lower distance of 1.1666. Even this answer could be improved by a precise calculation, finding the foot of the perpendicular from P to the line R-S.
First, find the closest approach Line Segment bridging between their extended lines. Let's call this LineSeg BR.
If BR.endPt1 falls on LS1 and BR.endPt2 falls on LS2, you're done...just calculate the length of BR.
If the bridge BR intersects LS1 but not LS2, use the shorter of these two distances: smallerOf(dist(BR.endPt1, LS2.endPt1), dist(BR.endPt1, LS2.endPt2))
If the bridge BR intersects LS2 but not LS1, use the shorter of these two distances: smallerOf(dist(BR.endPt2, LS1.endPt1), dist(BR.endPt2, LS1.endPt2))
If none of these conditions hold, the closest distance is the closest pairing of endpoints on opposite Line Segs.
This question is the topic of the article "On fast computation of distance between line segments" by Vladimir J. Lumelsky, 1985. It goes even further by finding not only the minimal Euclidean distance (MinD) but also a point on each segment separated by that distance.
The general algorithm is as follows:
Compute the global MinD (global means the distance between the two infinite lines containing the segments) and the coordinates of both points (bases) of the line of minimum distance (see skew lines); if both bases lie inside the segments, then the actual MinD is equal to the global MinD; otherwise, continue.
Compute distances between the endpoints of both segments (a total of four distances).
Compute the coordinates of the base points of perpendiculars from the endpoints of one segment onto the other segment; compute the lengths of those perpendiculars whose base points lie inside the corresponding segments (up to four base points, and four distances).
Out of the remaining distances, the smallest is the sought actual MinD.
Altogether, this represents the computation of six points and of nine distances.
The article then describes and proves how to reduce the amount of tests based on the data received in initial steps of the algorithm and how to handle degenerate cases (e.g. equal endpoints of a segment).
A C-language implementation by Eric Larsen can be found here; see the SegPoints() function.