Algorithm for arranging Cartesian points

I have a few Cartesian points of the form (x, y),
where x and y are both non-negative integers.
For example:
(0,0), (1,1), (0,1)
I need an algorithm to arrange the above points
in such a way that going from one point to other
changes either x or y by 1.
In other words, I would like to avoid
diagonal movement.
So the above points would be arranged as:
(0,0), (0,1), (1,1).
Similarly, for (0,0), (1,1), (0,2)
no such arrangement is possible.
I am not sure what to call it,
but I would call it Manhattan ordering.
Can anyone help?

If you are looking for an arrangement that does not repeat vertices:
What you seem to be looking for is a Hamiltonian Path in a Grid Graph.
This is known to be NP-Complete for general Grid Graphs, see Hamilton Paths in Grid Graphs.
So you can probably try your luck with any of the approximate/heuristic/etc algorithms known for Hamiltonian Path/Euclidean Traveling Salesman Problem.
If you are looking for an arrangement that can repeat, but want the minimum possible number of points in the arrangement:
This is again NP-Complete. The above problem can be reduced to it, because the minimum possible walk has n vertices if and only if the graph has a Hamiltonian path.
If you are just looking for some arrangement of points:
Then all you need to do is check whether the graph is connected. If it is not connected, there can be no such arrangement.
You can do a depth first search to figure that out. The depth first search will also give you such an arrangement in case the graph is connected.
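As a minimal Python sketch of that check-and-walk idea (the function name is illustrative, and it assumes the point set is small enough for plain recursion): build the unit-step graph, run a DFS, and record every step, including the backtracking moves, which also change x or y by exactly 1:
def manhattan_walk(points):
    pts = set(points)
    # neighbors are the points one unit away horizontally or vertically
    adj = {p: [q for q in ((p[0]+1, p[1]), (p[0]-1, p[1]),
                           (p[0], p[1]+1), (p[0], p[1]-1)) if q in pts]
           for p in pts}
    start = next(iter(pts))
    walk, seen = [start], {start}
    def dfs(p):
        for q in adj[p]:
            if q not in seen:
                seen.add(q)
                walk.append(q)
                dfs(q)
                walk.append(p)  # the backtracking step also moves by exactly 1
    dfs(start)
    if len(seen) != len(pts):
        return None  # disconnected: no arrangement exists
    return walk
print(manhattan_walk([(0, 0), (1, 1), (0, 1)]))
# one possible output: [(0, 0), (0, 1), (1, 1), (0, 1), (0, 0)]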
If you want something closer to optimal (but in reasonably fast time), you can probably use approximation algorithms for the Euclidean Traveling Salesman problem.

You can construct a graph with the vertices being your points, and the edges being the valid steps.
What you're then looking for is a Hamiltonian path for this graph. This, in its general form, is an NP-complete problem, which means there is no known efficient solution (i.e. one that scales well with the number of points). Wikipedia describes a randomized algorithm that is "fast on most graphs" and might be of use:
Start from a random vertex, and continue if there is a neighbor not visited. If there are no more unvisited neighbors, and the path formed isn't Hamiltonian, pick a neighbor uniformly at random, and rotate using that neighbor as a pivot. (That is, add an edge to that neighbor, and remove one of the existing edges from that neighbor so as not to form a loop.) Then, continue the algorithm at the new end of the path.
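Here is a rough Python sketch of that rotation heuristic, under the assumption that the graph is given as an adjacency dict with no isolated vertices; the function name and step cap are illustrative:
import random

def hamiltonian_path_heuristic(adj, max_steps=100000):
    # adj: dict mapping each vertex to a non-empty list of neighbors
    path = [random.choice(list(adj))]
    on_path = {path[0]}
    for _ in range(max_steps):
        if len(path) == len(adj):
            return path  # visited every vertex: Hamiltonian path found
        end = path[-1]
        fresh = [v for v in adj[end] if v not in on_path]
        if fresh:
            nxt = random.choice(fresh)
            path.append(nxt)
            on_path.add(nxt)
        else:
            # rotate: pick a neighbor v already on the path and reverse
            # the tail after v, exposing a new end of the path
            v = random.choice(adj[end])
            i = path.index(v)
            path[i + 1:] = reversed(path[i + 1:])
    return None  # gave up; this does not prove that no path exists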
A more efficient solution might exist for this particular situation, though.

Think of it as a graph where each node has at most four edges. Then do a depth- or breadth-first search.

This could be simplified as minimizing the distance between each consecutive point. Going from (0,0) to (0,1) is simply 1 unit, but going from (0,0) to (1,1) is actually sqrt(2). So if you arrange the points into a graph, and then perform a simple minimum-total-distance traversal (traveling salesman), it should arrange them correctly.
Edit: If you never want a step that would be larger than 1, simply do not add any edges that are greater than 1. The traversal will still work correctly, and ignore any paths that require a movement > 1.
Edit 2: To clarify further, you can use any edge selection algorithm you wish. If you're ok with it moving 2 spaces, as long as the space is not diagonal, then you may choose to put an edge between (0,2) and (0,4). The minimum distance algorithm will still work. Simply place the edges in a smart way, and you can use any number of selection criteria to determine the outcome.

One way to do this is to create two sorted lists of the coordinates. One sorted by x and one sorted by y. Then, recursively find a solution.
Code coming (not sure what language yet; maybe pseudocode?)... Edit: never mind, since my answer isn't as good as some of the others anyway.

How about a brute force recursive REXX routine... Try all possible
paths and print those that work out.
/* rexx */
point. = 0                      /* Boolean for each existing point */
say 'Enter origin x and y coordinate:'
pull xo yo
point.xo.yo = 1                 /* Point exists... */
say 'Enter destination x and y coordinate:'
pull xd yd
point.xd.yd = 1                 /* Point exists... */
say 'Enter remaining x and y coordinates, one pair per line:'
do forever
  pull x y
  if x = '' then leave
  point.x.y = 1
end
path = ''
call findpath xo yo path
say 'All possible paths have been displayed'
return

findpath: procedure expose point. xd yd
  arg x y path
  if \point.x.y then return                 /* no such point */
  newpoint = '(' || x y || ')'
  if pos(newpoint, path) > 0 then return    /* abandon on cycle */
  if x = xd & y = yd then                   /* found a path */
    say path newpoint
  else do                                   /* keep searching */
    call findpath x+1 y path newpoint
    call findpath x-1 y path newpoint
    call findpath x y+1 path newpoint
    call findpath x y-1 path newpoint
  end
return
Example session:
Path.rex
Enter origin x and y coordinate:
0 0
Enter destination x and y coordinate:
2 2
Enter remaining x and y coordinates, one pair per line:
0 1
1 1
2 1
1 2
(0 0) (0 1) (1 1) (2 1) (2 2)
(0 0) (0 1) (1 1) (1 2) (2 2)
All possible paths have been displayed
Don't try this on anything big though - could take a very long time! But then the question never mentioned anything about being an optimal solution.

Triangulate a set of points with a concave domain

Setup
Given some set of nodes within a convex hull, assume the domain contains one or more concave areas:
where blue dots are points, and the black line illustrates the domain. Assume the points are held as a 2D array points of length n, where n is the number of point-pairs.
Let us then triangulate the points, using something like the Delaunay method from scipy.spatial:
As you can see, one may experience the creation of triangles crossing through the domain.
Question
What is a good algorithmic approach to removing any triangles that span outside of the domain? Ideally but not necessarily, where the simplex edges still preserve the domain shape (i.e., no major gaps where triangles are removed).
Since my question is seeming to continue to get a decent amount of activity, I wanted to follow up with the application that I'm currently using.
Assuming that you have your boundary defined, you can use a ray casting algorithm to determine whether or not a given triangle is inside the domain.
To do this:
Take the centroid of each polygon as C_i = (x_i,y_i).
Then, imagine a line L = [C_i,(+inf,y_i)]: that is, a line that spans east past the end of your domain.
For each boundary segment s_i in boundary S, check for intersections with L. If yes, add +1 to an internal counter intersection_count; else, add nothing.
After the count of all intersections between L and s_i for i=1..N has been calculated:
if intersection_count % 2 == 0:
    return True   # centroid lies outside the domain
else:
    return False  # centroid lies inside the domain
If your boundary is not explicitly defined, I find it helpful to 'map' the shape onto a boolean array and use a neighbor tracing algorithm to define it. Note that this approach assumes a solid domain, and you will need to use a more complex algorithm for domains with 'holes' in them.
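As a concrete (hypothetical) Python sketch of the ray cast above, with the boundary given as a list of segments and degenerate crossings ignored; note this version returns True when the centroid is inside:
def centroid_inside(triangle, boundary_segments):
    cx = sum(p[0] for p in triangle) / 3.0
    cy = sum(p[1] for p in triangle) / 3.0
    intersection_count = 0
    for (x1, y1), (x2, y2) in boundary_segments:
        # does this boundary segment straddle the horizontal line y = cy?
        if (y1 > cy) != (y2 > cy):
            # x coordinate where the segment crosses that horizontal line
            x_cross = x1 + (cy - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > cx:  # the crossing lies east of the centroid
                intersection_count += 1
    return intersection_count % 2 == 1
# keep only triangles whose centroid lies inside the domain:
# kept = [t for t in triangles if centroid_inside(t, boundary_segments)]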
Here is some Python code that does what you want.
First, building the alpha shape (see my previous answer):
import numpy as np
from scipy.spatial import Delaunay

def alpha_shape(points, alpha, only_outer=True):
    """
    Compute the alpha shape (concave hull) of a set of points.
    :param points: np.array of shape (n,2) points.
    :param alpha: alpha value.
    :param only_outer: boolean value to specify if we keep only the outer
        border or also inner edges.
    :return: set of (i,j) pairs representing edges of the alpha-shape.
        (i,j) are the indices in the points array.
    """
    assert points.shape[0] > 3, "Need at least four points"

    def add_edge(edges, i, j):
        """
        Add a line between the i-th and j-th points,
        if not in the list already.
        """
        if (i, j) in edges or (j, i) in edges:
            # already added
            assert (j, i) in edges, "Can't go twice over same directed edge right?"
            if only_outer:
                # if both neighboring triangles are in shape, it's not a boundary edge
                edges.remove((j, i))
            return
        edges.add((i, j))

    tri = Delaunay(points)
    edges = set()
    # Loop over triangles:
    # ia, ib, ic = indices of corner points of the triangle
    for ia, ib, ic in tri.simplices:  # (tri.vertices in older SciPy)
        pa = points[ia]
        pb = points[ib]
        pc = points[ic]
        # Computing radius of triangle circumcircle
        # www.mathalino.com/reviewer/derivation-of-formulas/derivation-of-formula-for-radius-of-circumcircle
        a = np.sqrt((pa[0] - pb[0]) ** 2 + (pa[1] - pb[1]) ** 2)
        b = np.sqrt((pb[0] - pc[0]) ** 2 + (pb[1] - pc[1]) ** 2)
        c = np.sqrt((pc[0] - pa[0]) ** 2 + (pc[1] - pa[1]) ** 2)
        s = (a + b + c) / 2.0
        area = np.sqrt(s * (s - a) * (s - b) * (s - c))
        circum_r = a * b * c / (4.0 * area)
        if circum_r < alpha:
            add_edge(edges, ia, ib)
            add_edge(edges, ib, ic)
            add_edge(edges, ic, ia)
    return edges
To compute the edges of the outer boundary of the alpha shape use the following example call:
edges = alpha_shape(points, alpha=alpha_value, only_outer=True)
Now, after the edges of the outer boundary of the alpha-shape of points have been computed, the following function will determine whether a point (x,y) is inside the outer boundary.
def is_inside(x, y, points, edges, eps=1.0e-10):
    intersection_counter = 0
    for i, j in edges:
        assert abs((points[i, 1] - y) * (points[j, 1] - y)) > eps, \
            'Need to handle these end cases separately'
        y_in_edge_domain = ((points[i, 1] - y) * (points[j, 1] - y) < 0)
        if y_in_edge_domain:
            upper_ind, lower_ind = (i, j) if (points[i, 1] - y) > 0 else (j, i)
            upper_x = points[upper_ind, 0]
            upper_y = points[upper_ind, 1]
            lower_x = points[lower_ind, 0]
            lower_y = points[lower_ind, 1]
            # is_left_turn predicate is evaluated with:
            # sign(cross_product(upper-lower, p-lower))
            cross_prod = (upper_x - lower_x) * (y - lower_y) \
                         - (upper_y - lower_y) * (x - lower_x)
            assert abs(cross_prod) > eps, 'Need to handle these end cases separately'
            point_is_left_of_segment = (cross_prod > 0.0)
            if point_is_left_of_segment:
                intersection_counter = intersection_counter + 1
    return (intersection_counter % 2) != 0
On the input shown in the above figure (taken from my previous answer) the call is_inside(1.5, 0.0, points, edges) will return True, whereas is_inside(1.5, 3.0, points, edges) will return False.
Note that the is_inside function above does not handle degenerate cases. I added two assertions to detect such cases (you can define any epsilon value that fits your application). In many applications this is sufficient, but if not and you encounter these end cases, they need to be handled separately.
See, for example, here on robustness and precision issues when implementing geometric algorithms.
One of the classic DT algorithms first generates a bounding supertriangle, then adds all new triangles sorted by x, and finally prunes out all triangles having a vertex in the supertriangle.
At least from the provided image, one can derive a heuristic that also prunes some triangles having all vertices on the concave hull. Without proof: the triangles to be pruned have a negative area when their vertices are sorted in the same order as the concave hull is defined.
This may require the concave hull itself to be inserted as well, and later pruned out.
You can try a constrained Delaunay algorithm, for example with Sloan's algorithm or the CGAL library.
[1] A Brute-Force Constrained Delaunay Triangulation?
A simple but elegant way is to loop over the triangles and check whether they are within our domain or not. The shapely package could do the trick for you.
For more on this, please check the following post: https://gis.stackexchange.com/a/352442
Note that triangulation in shapely is also implemented, even for MultiPoint objects.
I used it, the performance was amazing, and the code was only about five lines.
Compute each triangle's centroid and check if it's inside the polygon using this algorithm.
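Combining the two suggestions above, a minimal shapely-based sketch (the points and the domain polygon here are made up for illustration):
from shapely.geometry import MultiPoint, Polygon
from shapely.ops import triangulate

points = [(0, 0), (2, 0), (2, 2), (1, 0.5), (0, 2)]
domain = Polygon([(0, 0), (2, 0), (2, 2), (1, 0.5), (0, 2)])  # concave boundary

triangles = triangulate(MultiPoint(points))
# keep only the triangles whose centroid falls inside the domain
inside = [t for t in triangles if domain.contains(t.centroid)]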

Optimal algorithm to solve this maze?

I'm currently stuck on a challenge our lecturer gave us over at our university. We've been looking at the most popular pathfinding algorithms such as Dijkstra and A*. Although, I think this challenge exercise requires something else and it has me stumped.
A visual representation of the maze that needs solving:
Color legend
Blue = starting node
Gray = path
Green = destination node
The way it's supposed to be solved is that when a move is made, it continues until it collides with either the edge of the maze or an obstacle (the black borders). It also needs to be solved in the fewest moves possible (in this case 7).
My question: Could someone push me in the right direction on what algorithm to look at? I think Dijkstra/A* is not the way to go, considering the shortest path is not always the correct path given the assignment.
It is still solvable with Dijkstra/A*; what needs to be changed is the configuration of neighbors.
A little background, first:
Dijkstra and A* are general path finding algorithms formulated on graphs. When, instead of a graph, we have a character moving on a grid, it might not be that obvious where the graph is. But it's still there, and one way to construct a graph would be the following:
the nodes of the graph correspond to the cells of the grid
there are edges between nodes corresponding to neighboring cells.
Actually, in most problems involving some configurations and transitions between them, it is possible to construct a corresponding graph, and apply Dijkstra/A*. Thus, it is also possible to tackle problems such as sliding puzzle, rubik's cube, etc., which apparently differ significantly from a character moving on a grid. But they have states, and transitions between states, thus it is possible to try graph search methods (these methods, especially the uninformed ones such as Dijkstra's algorithm, might not always be feasible due to the large search space, but in principle it's possible to apply them).
In the problem that you mentioned, the graph would not differ much from the one with typical character movements:
the nodes can still be the cells of the grid
there will now be edges from a node to nodes corresponding to valid movements (ending near a border or an obstacle), which, in this case, will not always coincide with the four spatial immediate neighbors of the grid cell.
As pointed out by Tamas Hegedus in the comments section, it's not obvious what heuristic function should be chosen if A* is used.
The standard heuristics based on Manhattan or Euclidean distance would not be valid here, as they might overestimate the distance to the target.
One valid heuristic would be id(row != destination_row) + id(col != destination_col), where id is the indicator function, with id(false) = 0 and id(true) = 1.
Dijkstra/A* are fine. What you need is to carefully think about what you consider graph nodes and graph edges.
Standing in the blue cell (let's call it 5,5), you have three valid moves:
move one cell to the right (to 6,5)
move four cells to the left (to 1,5)
move five cells up (to 5,1)
Note that you can't go from 5,5 to 4,5 or 5,4. Apply the same reasoning to new nodes (e.g. from 5,1 you can go to 1,1, 10,1 and 5,5) and you will get a graph on which to run your Dijkstra/A*; a sketch of this neighbor generation follows.
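A small Python sketch of this neighbor construction (the grid encoding and function names are illustrative, and it assumes the character must come to rest exactly on the destination). Since every slide costs one move, plain BFS already minimizes the number of moves; Dijkstra/A* work the same way with these neighbors:
from collections import deque

def slide_neighbors(grid, r, c):
    # grid: 2D list of booleans, True = free cell, False = wall
    rows, cols = len(grid), len(grid[0])
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r, c
        # slide until the next cell is a wall or off the grid
        while 0 <= nr + dr < rows and 0 <= nc + dc < cols and grid[nr + dr][nc + dc]:
            nr, nc = nr + dr, nc + dc
        if (nr, nc) != (r, c):
            yield nr, nc

def fewest_moves(grid, start, goal):
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            return dist[cell]
        for nxt in slide_neighbors(grid, *cell):
            if nxt not in dist:
                dist[nxt] = dist[cell] + 1
                queue.append(nxt)
    return None  # destination not reachable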
You need to evaluate every possible move and take the move that results with the minimum distance. Something like the following:
int minDistance(int x, int y, int prevX, int prevY, int distance) {
    // NB: pseudocode sketch; a real implementation also needs a visited
    // check (or memoization) so the recursion terminates
    if (CollisionWithBorder(x, y)) // can't take this path
        return Integer.MAX_VALUE;
    if (NoCollisionWithBorder(x, y)) // it's OK to take this path
    {
        // update the distance only when there is a long change in direction
        if (LongDirectionChange(x, y, prevX, prevY))
            distance = distance + 1;
    }
    if (ReachedDestination(x, y)) // we're done
        return distance;
    // find the path with the minimum distance
    return min(minDistance(x, y + 1, x, y, distance),  // go right
               minDistance(x + 1, y, x, y, distance),  // go up
               minDistance(x - 1, y, x, y, distance),  // go down
               minDistance(x, y - 1, x, y, distance)); // go left
}

bool LongDirectionChange(int x, int y, int prevX, int prevY) {
    if ((y - 2 == prevY && x == prevX) || (y == prevY && x - 2 == prevX))
        return true;
    return false;
}
This assumes diagonal moves are not allowed. If they are, add them to the min() call:
minDistance(x + 1, y + 1, x, y, distance), // go diagonally up-right
minDistance(x - 1, y - 1, x, y, distance), // go diagonally down-left
minDistance(x + 1, y - 1, x, y, distance), // go diagonally up-left
minDistance(x - 1, y + 1, x, y, distance), // go diagonally down-right

How to find the closest rotation

Consider points Y given in increasing order from [0,T). We are to consider these points as lying on a circle of circumference T. Now consider points X also from [0,T) and also lying on a circle of circumference T.
We say the distance between X and Y is the sum of the absolute distances between each point in X and its closest point in Y, recalling that both sets are considered to lie on a circle. Write this distance as Delta(X, Y).
I am trying to find a quick way of determining a rotation of X which makes this distance as small as possible.
My code for making some data to test with is
#!/usr/bin/python
import numpy as np
from bisect import bisect_left

def simul(rate, T):
    time = np.random.exponential(rate)
    times = [0]
    newtime = times[-1] + time
    while newtime < T:
        times.append(newtime)
        newtime = newtime + np.random.exponential(rate)
    return times[1:]
For each point I use this function to find its closest neighbor.
def takeClosest(myList, myNumber, T):
    """
    Assumes myList is sorted. Returns the value closest to myNumber
    on a circle of circumference T.
    If two numbers are equally close, return the smallest number.
    """
    pos = bisect_left(myList, myNumber)
    before = myList[pos - 1]           # wraps to the last element when pos == 0
    after = myList[pos % len(myList)]  # wraps to the first element when pos == len
    if (after - myNumber) % T < (myNumber - before) % T:
        return after
    else:
        return before
So the distance between two circles is:
def circle_dist(timesY, timesX):
    dist = 0
    for t in timesX:
        closest_number = takeClosest(timesY, t, T)
        d = abs(closest_number - t) % T
        dist += min(d, T - d)  # measure the distance around the circle
    return dist
So to make some data we just do
#First make some data
T = 5000
timesX = simul(1, T)
timesY = simul(10, T)
Finally, to rotate the circle timesX by offset we can do:
timesX = [(t + offset)%T for t in timesX]
In practice my timesX and timesY will have about 20,000 points each.
Given timesX and timesY, how can I quickly find (approximately) which rotation of timesX gives
the smallest distance to timesY?
Distance along the circle between a single point and a set of points is a piecewise linear function of the rotation. The critical points of this function are the points of the set itself (zero distance) and the points midway between neighbouring points of the set (local maxima of the distance). The linear coefficients of such a function are ±1.
Sum of such functions is again piecewise linear, but now with a quadratic number of critical points. Actually all these functions are the same, except shifted along the argument axis. Linear coefficients of the sum are integers.
To find its minimum one would have to calculate its value in all critical points.
I don't see a way to significantly reduce the amount of work needed, but 1,600,000,000 points is not such a big deal anyway, especially if you can spread the work across several processors.
To calculate sum of two such functions, represent the summands as sequences of critical points and associated coefficients to the left and to the right of each critical point. Then just merge the two point sequences while adding the coefficients.
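For small inputs, a brute-force Python sketch of this idea (reusing circle_dist and the global T from the question): since the midpoints are local maxima of the individual terms, the minima of the summed piecewise linear function generically fall at offsets that align some point of X exactly with some point of Y, so it roughly suffices to evaluate those offsets. This is far too slow for 20,000 points and is meant only as a reference implementation:
def best_offset_bruteforce(timesX, timesY):
    best_d, best_off = float('inf'), 0.0
    for x in timesX:
        for y in timesY:
            off = (y - x) % T  # offset that lands x exactly on y
            rotated = sorted((t + off) % T for t in timesX)
            d = circle_dist(timesY, rotated)
            if d < best_d:
                best_d, best_off = d, off
    return best_off, best_d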
You can solve your (original) problem with a sweep line algorithm. The trick is to use the right "discretization". Imagine cutting your circle up into two strips:
X: x....x....x..........x................x.........x...x
Y: .....x..........x.....x..x.x...........x.............
Now calculate the score = 5+0+1+1+5+9+6.
The key observation is that if we rotate X very slightly (right say), some of the points will improve and some will get worse. We can call this the "differential". In the above example the differential would be 1 - 1 - 1 + 1 + 1 - 1 + 1 because the first point is matched to something on its right, the second point is matched to something under it or to its left etc.
Of course, as we move X more, the differential will change. However only as many times as the matchings change, which is never more than |X||Y| but probably much less.
The proposed algorithm is thus to calculate the initial score and the time (X position) of the next change in differential. Go to that next position and calculate the score again. Continue until you reach your starting position.
This is probably a good example for the iterative closest point (ICP) algorithm:
It repeatedly matches each point with its closest neighbor and moves all points such that the mean squared distance is minimized. (Note that this corresponds to minimizing the sum of squared distances.)
import pylab as pl

T = 10.0
X = pl.array([3, 5.5, 6])
Y = pl.array([1, 1.5, 2, 4])

pl.clf()
pl.subplot(1, 2, 1, polar=True)
pl.plot(X / T * 2 * pl.pi, pl.ones(X.shape), 'r.', ms=10, mew=3)
pl.plot(Y / T * 2 * pl.pi, pl.ones(Y.shape), 'b+', ms=10, mew=3)

# signed distance from X to Y, wrapped around the circle of circumference T
circDist = lambda X, Y: (Y - X + T / 2) % T - T / 2

while True:
    # all pair-wise circular distances between points in X and Y
    D = circDist(pl.reshape(X, (-1, 1)), pl.reshape(Y, (1, -1)))
    closestY = pl.argmin(D**2, axis=1)
    distance = circDist(X, Y[closestY])
    shift = pl.mean(distance)
    if pl.absolute(shift) < 1e-3:
        break
    X = (X + shift) % T

pl.subplot(1, 2, 2, polar=True)
pl.plot(X / T * 2 * pl.pi, pl.ones(X.shape), 'r.', ms=10, mew=3)
pl.plot(Y / T * 2 * pl.pi, pl.ones(Y.shape), 'b+', ms=10, mew=3)
Important properties of the proposed solution are:
The ICP is an iterative algorithm. Thus it depends on an initial approximate solution. Furthermore, it won't always converge to the global optimum. This mainly depends on your data and the initial solution. If in doubt, try evaluating the ICP with different starting configurations and choose the most frequent result.
The current implementation performs a directed match: It looks for the closest point in Y relative to each point in X. It might yield different matches when swapping X and Y.
Computing all pair-wise distances between points in X and points in Y might be intractable for large point clouds (like 20,000 points, as you indicated). Therefore, the line D = circDist(...) might get replaced by a more efficient approach, e.g. not evaluating all possible pairs.
All points contribute to the final rotation. If there are any outliers, they might distort the shift significantly. This can be overcome with a robust average like the median or simply by excluding points with large distance.

Find the closest intersection point in the plane

I was asked the following question in an interview recently:
Suppose you have the following grid on a Cartesian coordinate system (Quadrant I):
o - x - x - x - o
| | | | |
x - x - x - o - x
| | | | |
x - o - o - x - x
where, o => person at intersection and x => no person at intersection
class Point {
int x;
int y;
boolean hasPerson;
}
Point findNearestPointWhereAllPeopleCanMeet(Point[] people) {
}
Implement a routine that, given a list of people's locations (points), finds the location (point) that is closest to all the given points.
How would you solve this problem?
1) One approach is a k-d tree, but do you know any other solution?
If the problem calls for minimizing the Manhattan distance (that is, people only walk parallel to the axes), then the problem is then pretty trivial. First, selecting the x coordinate and the y coordinate are independent problems.
Then, for each coordinate, simply find the median value of the positions of the people along that axis. For many configurations of people, there can be more than one point that minimizes the sum of the walking distances of all people. (Just consider 2 people separated by more than 2 blocks in x and at the same y coordinate; any intersection in between will require the same total walking by the two people.)
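A minimal Python sketch of this Manhattan case: take the median x and the median y independently (for an even number of people, any point between the two middle values is equally good):
def meeting_point(people):
    xs = sorted(p[0] for p in people)
    ys = sorted(p[1] for p in people)
    mid = len(people) // 2
    return xs[mid], ys[mid]

print(meeting_point([(0, 0), (4, 0), (1, 3)]))
# (1, 0): no other intersection has a smaller total walking distance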
If the problem calls for minimizing the Euclidean distance, then the goal is to find the 2-variable L1 median. This is a standard problem, but it is far from trivial. (See here, for instance.) There is a unique answer. Given that this was an interview question, I suspect that this does not apply.
Before going to study k-d trees these are some thoughts that came to my mind:
Iterate every point of your matrix, graph, or whatever it is
Assign to each point (x,y) a value representing the MAX_distance from the Point to any of the people. (See a clarification example below)
Take any of the points that have the lowest MAX_distance
E.g., given Point (0,0):
For (1,0) the distance is: 1
For (2,0) the distance is: 2
For (0,2) the distance is: 2
For (3,1) the distance is: 4
For (4,2) the distance is: 6
The MAX_distance for (0,0) is 6. Given your input the lowest MAX_distance should be 3 and there are many Points with that value like (2,1) for instance.
There should be ways to make this algorithm more efficient.. Maybe with k-d trees :p or with other tweaks like checking the total number of people, their relative location/distance, the MAX_distance value at any iteration, etc..
This will probably give you more of an approximate than the correct answer.
But maybe you can try some sort of clustering (see medical image processing)
What if you project all points onto the Y axis:
3*
4*
3*
Then project onto the X axis:
2* 2* 2* 2* 2*
Legend: 3* means 3 people at this coordinate on the axis
Now find the median, also using the weights (weight at a location = how many people at that location on the axis).
If you find the median for both axis then you could take the meeting points as (medianX, medianY).
You could get the correct closest point if, when you calculate the median on one axis, you also minimize the distance by calculating the median of the other axis. This latter case is harder.

Calculating the shortest distance between two lines (line segments) in 3D

I have two line segments: (X1,Y1,Z1) to (X2,Y2,Z2), and (X3,Y3,Z3) to (X4,Y4,Z4).
I am trying to find the shortest distance between the two segments.
I have been looking for a solution for hours, but all of them seem to work with lines rather than line segments.
Any ideas how to go about this, or any sources of formulae?
I'll answer this in terms of MATLAB, but other programming environments can be used. I'll add that this solution is valid for the problem in any number of dimensions (>= 3).
Assume that we have two line segments in space, PQ and RS. Here are a few random sets of points.
> P = randn(1,3)
P =
-0.43256 -1.6656 0.12533
> Q = randn(1,3)
Q =
0.28768 -1.1465 1.1909
> R = randn(1,3)
R =
1.1892 -0.037633 0.32729
> S = randn(1,3)
S =
0.17464 -0.18671 0.72579
The infinite line through P and Q is easily defined as
PQ(u) = P + u*(Q-P)
Likewise, we have
RS(v) = R + v*(S-R)
See that for each line, when the parameter is at 0 or 1, we get one of the original endpoints on the line returned. Thus, we know that PQ(0) == P, PQ(1) == Q, RS(0) == R, and RS(1) == S.
This way of defining a line parametrically is very useful in many contexts.
Next, imagine we were looking down along line PQ. Can we find the point of smallest distance from the line segment RS to the infinite line PQ? This is most easily done by a projection into the null space of line PQ.
> N = null(P-Q)
N =
-0.37428 -0.76828
0.9078 -0.18927
-0.18927 0.61149
Thus, null(P-Q) is a pair of basis vectors that span the two dimensional subspace orthogonal to the line PQ.
> r = (R-P)*N
r =
0.83265 -1.4306
> s = (S-P)*N
s =
1.0016 -0.37923
Essentially what we have done is to project the vector RS into the 2 dimensional subspace (plane) orthogonal to the line PQ. By subtracting off P (a point on line PQ) to get r and s, we ensure that the infinite line passes through the origin in this projection plane.
So really, we have reduced this to finding the minimum distance from the line rs(v) to the origin (0,0) in the projection plane. Recall that the line rs(v) is defined by the parameter v as:
rs(v) = r + v*(s-r)
The normal vector to the line rs(v) will give us what we need. Since we have reduced this to 2 dimensions because the original space was 3-d, we can do it simply. Otherwise, I'd just have used null again. This little trick works in 2-d:
> n = (s - r)*[0 -1;1 0];
> n = n/norm(n);
n is now a vector with unit length. The distance from the infinite line rs(v) to the origin is simple.
> d = dot(n,r)
d =
1.0491
See that I could also have used s, to get the same distance. The actual distance is abs(d), but as it turns out, d was positive here anyway.
> d = dot(n,s)
d =
1.0491
Can we determine v from this? Yes. Recall that the origin is a distance of d units from the line that connects points r and s. Therefore we can write d*n = r + v*(s-r), for some value of the scalar v. Form the dot product of each side of this equation with the vector (s-r), and solve for v.
> v = dot(s-r,d*n-r)/dot(s-r,s-r)
v =
1.2024
This tells us that the closest approach of the line segment rs to the origin happened outside the end points of the line segment. So really the closest point on rs to the origin was the point rs(1) = s.
Backing out from the projection, this tells us that the closest point on line segment RS to the infinite line PQ was the point S.
There is one more step in the analysis to take. What is the closest point on the line segment PQ? Does this point fall inside the line segment, or does it too fall outside the endpoints?
We project the point S onto the line PQ. (This expression for u is easily enough derived from similar logic as I did before. Note here that I've used \ to do the work.)
> u = (Q-P)'\((S - (S*N)*N') - P)'
u =
0.95903
See that u lies in the interval [0,1]. We have solved the problem. The point on line PQ is
> P + u*(Q-P)
ans =
0.25817 -1.1677 1.1473
And, the distance between closest points on the two line segments was
> norm(P + u*(Q-P) - S)
ans =
1.071
Of course, all of this can be compressed into just a few short lines of code. But it helps to expand it all out to gain understanding of how it works.
One basic approach is the same as computing the shortest distance between 2 lines, with one exception.
If you look at most algorithms for finding the shortest distance between 2 lines, you'll find that it finds the points on each line that are the closest, then computes the distance from them.
The trick to extend this to segments (or rays), is to see if that point is beyond one of the end points of the line, and if so, use the end point instead of the actual closest point on the infinite line.
For a concrete sample, see:
http://softsurfer.com/Archive/algorithm_0106/algorithm_0106.htm
More specifically:
http://softsurfer.com/Archive/algorithm_0106/algorithm_0106.htm#dist3D_Segment_to_Segment()
I would parameterize both line segments with one parameter each, bounded between 0 and 1 inclusive. Then take the difference between the two line functions and use its norm as the objective function in a bounded optimization problem with the parameters as variables.
So say you have a line from (0,0,0) to (1,0,0) and another from (0,1,0) to (0,0,0) (Yeah, I'm using easy ones). The lines can be parameterized like (1*t,0*t,0*t) where t lies in [0,1] and (0*s,1*s,0*s) where s lies in [0,1], independent of t.
Then you need to minimize ||(1*t,1*s,0)|| where t, s lie in [0,1]. That's a pretty simple problem to solve.
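A sketch of that formulation in Python, using scipy's bounded minimizer (strictly speaking the objective is quadratic rather than linear, so this is a small bounded quadratic problem; the example segments are the easy ones from above):
import numpy as np
from scipy.optimize import minimize

P1, Q1 = np.array([0., 0., 0.]), np.array([1., 0., 0.])  # first segment
P2, Q2 = np.array([0., 1., 0.]), np.array([0., 0., 0.])  # second segment

def sq_dist(ts):
    t, s = ts
    diff = (P1 + t * (Q1 - P1)) - (P2 + s * (Q2 - P2))
    return diff.dot(diff)

res = minimize(sq_dist, x0=[0.5, 0.5], bounds=[(0, 1), (0, 1)])
print(np.sqrt(res.fun))  # ~0.0: these two segments touch at the origin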
How about extending the line segments into infinite lines and finding the shortest distance between the two lines? Then find the points on each line that are the endpoints of that shortest-distance segment.
If the point for each line is on the original line segment, then you have the answer. If a point for a line is not on the original segment, then the point is one of the original line segment's endpoints.
Finding the distance between two finite segments by finding the distance between the two infinite lines and then bounding to the finite segments does not always work. For example, try these points:
Q = [5 2 0]
P = [2 2 0]
S = [3 3.25 0]
R = [0 3 0]
Based on the infinite-line approach, the algorithm selects R and P for the distance calculation (distance = 2.2361), but a point somewhere in the middle of R and S is closer to P. Selecting P and approximately [2 3.166] on the R-S line gives a smaller distance of about 1.1666. Even this answer could be improved by a precise calculation, dropping a perpendicular from P onto the R-S line.
First, find the closest approach Line Segment bridging between their extended lines. Let's call this LineSeg BR.
If BR.endPt1 falls on LS1 and BR.endPt2 falls on LS2, you're done...just calculate the length of BR.
If the bridge BR intersects LS1 but not LS2, use the shorter of these two distances: smallerOf(dist(BR.endPt1, LS2.endPt1), dist(BR.endPt1, LS2.endPt2))
If the bridge BR intersects LS2 but not LS1, use the shorter of these two distances: smallerOf(dist(BR.endPt2, LS1.endPt1), dist(BR.endPt2, LS1.endPt2))
If none of these conditions hold, the closest distance is the closest pairing of endpoints on opposite Line Segs.
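A self-contained Python sketch of this clamp-the-parameters approach (following the standard formulation from Ericson's Real-Time Collision Detection, not necessarily the exact procedure described above):
import numpy as np

def closest_points_on_segments(P1, Q1, P2, Q2, eps=1e-12):
    d1, d2, r = Q1 - P1, Q2 - P2, P1 - P2
    a, e, f = d1.dot(d1), d2.dot(d2), d2.dot(r)
    if a <= eps and e <= eps:
        return P1, P2                      # both segments degenerate to points
    if a <= eps:
        s, t = 0.0, np.clip(f / e, 0.0, 1.0)
    else:
        c = d1.dot(r)
        if e <= eps:
            t, s = 0.0, np.clip(-c / a, 0.0, 1.0)
        else:
            b = d1.dot(d2)
            denom = a * e - b * b          # always >= 0
            s = np.clip((b * f - c * e) / denom, 0.0, 1.0) if denom > eps else 0.0
            t = (b * s + f) / e
            if t < 0.0:                    # clamp t, then recompute s
                t, s = 0.0, np.clip(-c / a, 0.0, 1.0)
            elif t > 1.0:
                t, s = 1.0, np.clip((b - c) / a, 0.0, 1.0)
    return P1 + s * d1, P2 + t * d2

# the counterexample above: the distance comes out to about 1.1626
A, B = closest_points_on_segments(np.array([2., 2, 0]), np.array([5., 2, 0]),
                                  np.array([0., 3, 0]), np.array([3., 3.25, 0]))
print(np.linalg.norm(A - B))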
This question is the topic of the article On fast computation of distance between line segments by Vladimir J. Lumelsky, 1985. It goes even further by finding not only the minimal Euclidean distance (MinD) but also a point on each segment separated by that distance.
The general algorithm is as follows:
Compute the global MinD (global means the distance between two infinite lines containing the segments) and coordinates of both points (bases) of the line of minimum distances, see skew lines; if both bases lie inside the segments, then actual MinD is equal to the global MinD; otherwise, continue.
Compute distances between the endpoints of both segments (a total of four distances).
Compute coordinates of the base points of perpendiculars from the endpoints of one segment onto the other segment; compute the lengths of those perpendiculars whose base points lie inside the corresponding segments (up to four base points, and four distances).
Out of the remaining distances, the smallest is the sought actual MinD.
Altogether, this represents the computation of six points and of nine distances.
The article then describes and proves how to reduce the amount of tests based on the data received in initial steps of the algorithm and how to handle degenerate cases (e.g. equal endpoints of a segment).
C-language implementation by Eric Larsen can be found here, see SegPoints() function.
