I'm creating a simple game and come up with this problem while designing AI for my game:
Given a set of N points inside a rectangle in the Cartesian coordinate, i need to find the widest straight path through this rectangle. The path must be empty (i.e not containing any point).
I wonder if are there any efficient algorithm to solve this problem? Can you suggest any keyword/ paper/ anything related to this problem?
EDIT: The rectangle is always defined by 4 points in its corner. I added an image for illustration. the path in the above pictures are the determined by two red lines
This is the widest empty corridor problem. Houle and Maciel gave an O(n2)-time, O(n)-space algorithm in a 1988 tech report entitled "Finding the widest empty corridor through a set of points", which seems not to be available online. Fortunately, Janardan and Preparata describe this algorithm in Section 4 of their paper Widest-corridor problems, which is available.
Loop through all pairs of points. Construct a line l through the pair. (^1) On each side of l, either there are other points, or not. If not, then there is not a path on that side of l. If there are other points, loop through points calculating the perpendicular distance d from l to each such point. Record the minimum d. That is the widest path on that side of l. Continue looping through all pairs, comparing widest path for that pair with the previous widest path.
This algorithm can be considered naive and runs in O(n^3) time.
Edit: The above algorithm misses a case. At ^1 above, insert: "Construct two lines perpendicular to l through each point of the pair. If there is no third point between the lines, then record distance d between the points. This constitutes a path." Continue the algorithm at ^1. With additional case, algorithm is still O(n^3)
Myself, I would start by looking at the Delaunay triangulation of the point set:
http://en.wikipedia.org/wiki/Delaunay_triangulation
There appear to be plenty of resources there on efficient algorithms to build this - Fortune's algorithm, at O(n log n), for starters.
My intuition tells me that your widest path will be defined by one of the edges in this graph (Namely, it would run perpendicular to the edge, and its width would be equal to the length of the edge). How to sort the edges, check the candidates and identify the widest path remains. I like this question, and I'm going to keep thinking about it. :)
EDIT 1: My intuition fails me! A simple equilateral triangle is a counter-example: the widest path is shorter than any of the edges in the triangulation. Still thinking...
EDIT 2: So, we need a black-box algorithm which, given two points in the set, finds the widest path through the point set which is bounded by those two points. (Visualize two parallel lines running through the two points; rotate them in harmony with each other until there are no points between them). Let's call the runtime of this algorithm 'R'.
Given such an algorithm, we can do the following:
Build the Delaunay triangulation of the point set : O(n log n)
Sort the edges by width : O(n log n)
Beginning with the largest edge and moving down, use the black box algorithm to determine the widest path involving those two points; storing it as X : O(nR))
Stop when the edge being examined is shorter than the width of X.
Steps 1 and 2 are nice, but the O(nR) is kind of scary. If R turns out to be O(n), that's already O(n^2) for the whole algorithm. The nice thing is that, for a general set of random points, we would expect that we wouldn't have to go through all the edges.
Related
Imagine I am implementing Dijkstra's algorithm at a park. There are points and connections between those points; these specify valid paths the user can walk on (e.g. sidewalks).
Now imagine that the user is on the grass (i.e. not on a path) and wants to navigate to another location. The problem is not in Dijkstra's algorithm (which works fine), the problem is determining at which vertex to begin.
Here is a picture of the problem: (ignore the dotted lines for now)
Black lines show the edges in Dijkstra's algorithm; likewise, purple circles show the vertices. Sidewalks are in gray. The grass is, you guessed it, green. The user is located at the red star, and wants to get to the orange X.
If I naively look for the nearest vertex and use that as my starting point, the user is often directed to a suboptimal path, that involves walking further away from their destination at the start (i.e. the red solid path).
The blue solid path is the optimal path that my algorithm would ideally come up with.
Notes:
Assume no paths cross over other paths.
When navigating to a starting point, the user should never cross over a path (e.g. sidewalk).
In the image above, the first line segment coming out of the star is created dynamically, simply to assist the user. The star is not a vertex in the graph (since the user can be anywhere inside the grass region). The line segment from the star to a vertex is simply being displayed so that the user knows how to get to the first valid vertex in the graph.
How can I implement this efficiently and correctly?
Idea #1: Find the enclosing polygon
If I find the smallest polygon which surrounds my starting point, I can now create new paths for Dijkstra's algorithm from the starting point (which will be added as a new vertex temporarily) to each of the vertices that make up the polygon. In the example above, the polygon has 6 sides, so this would mean creating 6 new paths to each of its vertices (i.e. the blue dotted lines). I would then be able to run Dijkstra's algorithm and it would easily determine that the blue solid line is the optimal path.
The problem with this method is in determining which vertices comprise the smallest polygon that surrounds my point. I cannot create new paths to each vertex in the graph, otherwise I will end up with the red dotted lines as well, which completely defeats the purpose of using Dijkstra's algorithm (I should not be allowed to cross over a sidewalk). Therefore, I must take care to only create paths to the vertices of the enclosing polygon. Is there an algorithm for this?
There is another complication with this solution: imagine the user now starts at the purple lightning bolt. It has no enclosing polygon, yet the algorithm should still work by connecting it to the 3 points at the top right. Again, once it is connected to those, running Dijkstra's is easy.
Update: the reason we want to connect to one of these 3 points and not walk around everything to reach the orange X directly is because we want to minimize the walking done on unpaved paths. (Note: This is only a constraint if you start outside a polygon. We don't care how long you walk on the grass if it is within a polygon).
If this is the correct solution, then please post its algorithm as an answer.
Otherwise, please post a better solution.
You can start off by running Dijkstra from the target to find its distance to all vertices.
Now let's consider the case where you start "inside" the graph on the grass. We want to find all vertices that we can reach via a straight line without crossing any edge. For that we can throw together all the line segments representing the edges and the line segments connecting the start point to every vertex and use a sweep-line algorithm to find whether the start-vertex lines intersect any edge.
Alternatively you can use any offline algorithm for planar point location, those also work with a sweep line. I believe this is in the spirit of the more abstract algorithm proposed in the question in that it reports the polygon that surrounds the point.
Then we just need to find the vertex whose connection line to the start does not intersect any edge and the sum d(vertex, target) + d(vertex, start) is minimum.
The procedure when the vertex is outside the graph is somewhat underspecified, but I guess the exact same idea would work. Just keep in mind that there is the possibility to walk all around the graph to the target if it is on the border, like in your example.
This could probably be implemented in O((n+m) log m) per query. If you run an all-pairs shortest path algorithm as a preprocessing step and use an online point location algorithm, you can get logarithmic query time at the cost of the space necessary to store the information to speed up shortest path queries (quadratic if you just store all distance pairs).
I believe simple planar point location works just like the sweep line approaches, only with persistent BSTs to store all the sweepline states.
I'm not sure why you are a bothering with trying to find a starting vertex when you already have one. The point you (the user) are standing at is another vertex in of itself. So the real question now is to find the distance from your starting point to any other point in the enclosing polygon graph. And once you have that, you can simply run Dijkstra's or another shortest path algorithm method like A*, BFS, etc, to find the shortest path to your goal point.
On that note, I think you are better off implementing A* for this problem because a park involves things like trees, playgrounds, ponds (sometimes), etc. So you will need to use a shortest path algorithm that takes these into consideration, and A* is one algorithm that uses these factors to determine a path of shortest length.
Finding distance from start to graph:
The problem of finding the distance from your new vertex to other vertices can be done by only looking for points with the closest x or y coordinate to your start point. So this algorithm has to find points that form a sort of closure around the start point, i.e. a polygon of minimum area which contains the point. So as #Niklas B suggested, a planar point algorithm (with some modifications) might be able to accomplish this. I was looking at the sweep-line algorithm, but that only works for line segments so that will not work (still worth a shot, with modifications might be able to give the correct answer).
You can also decide to implement this algorithm in stages, so first, find the points with the closest y coordinate to the current point (Both negative and positive y, so have to use absolute value), then among those points, you find the ones with the closest x coordinate to the current point and that should give you the set of points that form the polygon. Then these are the points you use to find the distance from your start to the graph.
I have been given a task where I have to connects all the points in the 2D plane.
There are four conditions to to be met:
Length of the all segments joined together has to be minimal.
One point can be a part of only one line segment.
Line segments cannot intersect
All points have to be used(one can't be left alone but only if it cannot be avoided)
Image to visualize the problem:
The wrong image connected points correctly, although the total length is bigger that the the one in on the left.
At first I thought about sorting the points and doing it with a sweeping line and building a tree of all possibilities, although it does seem like a way to complicated solution with huge complexity. Therefore I search better approaches. I would appreciate some hints what to do, or how could I approach the problem.
I would start with a Delaunay triangulation of the point set. This should already give you the nearest neighbor connections of each point without any intersections. In the next step I'd look at the triangles that result from the triangulation - the convenient property here is that based on your ruleset you can pick exactly one side from each triangle and remove the remaining two from the selection.
The problem that remains now is to pick those edges that give you the smallest total sum which of course will not always be the smallest side since that one might already have been blocked by a neighboring triangle. I'd start with a greedy approach, always picking the smallest remaining edge that has not been blocked by neighboring triangles yet.
Edit: In the next step you retrieve a list of all the edges in that triangulation and sort them by length. You also make another list in which you count the amount of connections each point has. Now you iterate through the edge list going from the longest edge to the shortest one and check the two points it connects in the connection count list: if each of the points has still more than 1 connection left, you can discard the edge and decrement the connection count for the two points involved. If at least one of the points has only one connection left, you have got yourself one of the edges you are looking for. You repeat the process until there are no edges left and this should hopefully give you the smallest possible edge sum.
If I am not mistaken this problem is loosely related to the knapsack problem which is NP-Hard so I am not sure if this solution really gives you the best possible one.
I'd say this is an extension to the well-known travelling salesman problem.
A good technique (if a little old-fashioned) is to use a simulated annealing optimisation technique.
You'll need to make adjustments to the cost (a.k.a. objective) function to miss out sections of the path. But given a candidate continuous path, it's reasonably trivial to decide which sections to miss out to minimise its length. (You'd first remove the longer of any intersecting lines).
Wow, that's a tricky one. That's a lot of conditions to meet.
I think from a programming standpoint, the "simplest" solution might actually be to just loop through, find all the possibilities that satisfy the last 3 conditions, and record the total length as you loop through, and just choose the one with the shortest length in the end - brute force, guess-and-check. I think this is what you were referring to in your OP when you mentioned a "sweeping line and building a tree of all possibilities". This approach is very computationally expensive, but if the code is written right, it should always work in the end.
If you want the "best" solution, where you want to just solve for the single final answer right away, I'm afraid my math skills aren't strong enough for that - I'm not even sure if there is any single analytical solution to that problem for any arbitrary collection of points. Maybe try checking with the people over at MathOverflow. If someone over there can explain you with the math behind that calculation, and you then you still need help to convert that math into code in a certain programming language, update your question here (maybe with a link to the answer they provide you) and I'm sure someone will be able to help you out from that point.
One of the possible solutions is to use graph theory.
Construct a bipartite graph G, such that each point has its copy in both parts. Now put the edges between the points i and j with the weight = i == j ? infinity : distance[i][j]. The minimal weight maximum matching in the graph will be your desired configuration.
Notice that since this is on a euclidean 2D plane, the resulting "edges" of the matching will not intersect. Let's say that edges AB and XY intersect for points A, B, X, Y. Then the matching is not of the minimum weight, because either AX, BY or AY, BX will produce a smaller total weight without an intersection (this comes from triangle inequality a+b > c)
I'm looking for an algorithm to solve this problem:
Given N rectangles on the Cartesian coordinate, find out if the intersection of those rectangles is empty or not. Each rectangle can lie in any direction (not necessary to have its edges parallel to Ox and Oy)
Do you have any suggestion to solve this problem? :) I can think of testing the intersection of each rectangle pair. However, it's O(N*N) and quite slow :(
Abstract
Either use a sorting algorithm according to smallest X value of the rectangle, or store your rectangles in an R-tree and search it.
Straight-forward approach (with sorting)
Let us denote low_x() - the smallest (leftmost) X value of a rectangle, and high_x() - the highest (rightmost) X value of a rectangle.
Algorithm:
Sort the rectangles according to low_x(). # O(n log n)
For each rectangle in sorted array: # O(n)
Finds its highest X point. # O(1)
Compare it with all rectangles whose low_x() is smaller # O(log n)
than this.high(x)
Complexity analysis
This should work on O(n log n) on uniformly distributed rectangles.
The worst case would be O(n^2), for example when the rectangles don't overlap but are one above another. In this case, generalize the algorithm to have low_y() and high_y() too.
Data-structure approach: R-Trees
R-trees (a spatial generalization of B-trees) are one of the best ways to store geospatial data, and can be useful in this problem. Simply store your rectangles in an R-tree, and you can spot intersections with a straightforward O(n log n) complexity. (n searches, log n time for each).
Observation 1: given a polygon A and a rectangle B, the intersection A ∩ B can be computed by 4 intersection with half-planes corresponding to each edge of B.
Observation 2: cutting a half plane from a convex polygon gives you a convex polygon. The first rectangle is a convex polygon. This operation increases the number of vertices at most per 1.
Observation 3: the signed distance of the vertices of a convex polygon to a straight line is a unimodal function.
Here is a sketch of the algorithm:
Maintain the current partial intersection D in a balanced binary tree in a CCW order.
When cutting a half-plane defined by a line L, find the two edges in D that intersect L. This can be done in logarithmic time through some clever binary or ternary search exploiting the unimodality of the signed distance to L. (This is the part I don't exactly remember.) Remove all the vertices on the one side of L from D, and insert the intersection points to D.
Repeat for all edges L of all rectangles.
This seems like a good application of Klee's measure. Basically, if you read http://en.wikipedia.org/wiki/Klee%27s_measure_problem there are lower bounds on the runtime of the best algorithms that can be found for rectilinear intersections at O(n log n).
I think you should use something like the sweep line algorithm: finding intersections is one of its applications. Also, have a look at answers to this questions
Since the rectangles must not be parallel to the axis, it is easier to transform the problem to an already solved one: compute the intersections of the borders of the rectangles.
build a set S which contains all borders, together with the rectangle they're belonging to; you get a set of tuples of the form ((x_start,y_start), (x_end,y_end), r_n), where r_n is of course the ID of the corresponding rectangle
now use a sweep line algorithm to find the intersections of those lines
The sweep line stops at every x-coordinate in S, i.e. all start values and all end values. For every new start coordinate, put the corresponding line in a temporary set I. For each new end-coordinate, remove the corresponding line from I.
Additionally to adding new lines to I, you can check for each new line whether it intersects with one of the lines currently in I. If they do, the corresponding rectangles do, too.
You can find a detailed explanation of this algorithm here.
The runtime is O(n*log(n) + c*log(n)), where c is the number of intersection points of the lines in I.
Pick the smallest rectangle from the set (or any rectangle), and go over each point within it. If one of it's point also exists in all other rectangles, the intersection is not empty. If all points are free from ALL other rectangles, the intersection is empty.
Given a set of n points, can we find three points that describe a triangle with minimum area in O(n^2)? If yes, how, and if not, can we do better than O(n^3)?
I have found some papers that state that this problem is at least as hard as the problem that asks to find three collinear points (a triangle with area 0). These papers describe an O(n^2) solution to this problem by reducing it to an instance of the 3-sum problem. I couldn't find any solution for what I'm interested in however. See this (look for General Position) for such a paper and more information on 3-sum.
There are O(n2) algorithms for finding the minimum area triangle.
For instance you can find one here: http://www.cs.tufts.edu/comp/163/fall09/CG-lecture9-LA.pdf
If I understood that pdf correctly, the basic idea is as follows:
For each pair of points AB you find the point that is closest to it.
You construct a dual of the points so that lines <-> points.
Line y = mx + c is mapped to point (m,c)
In the dual, for a given point (which corresponds to a segment in original set of points) the nearest line vertically gives us the required point for 1.
Apparently 2 & 3 can be done in O(n2) time.
Also I doubt the papers showed 3SUM-hardness by reducing to 3SUM. It should be the other way round.
There's an algorithm that finds the required area with complexity O(n^2*log(n)).
For each point Pi in set do the following(without loss of generality we can assume that Pi is in the origin or translate the points to make it so).
Then for each points (x1,y1), (x2,y2) the triangle area will be 0.5*|x1*y2-x2*y1| so we need to minimize that value. Instead of iterating through all pairs of remaining points (which gives us O(N^3) complexity) we sort those points using predicate X1 * Y2 < X2 * Y1. It is claimed that to find triangle with minimal area we need to check only the pairs of adjacent points in the sorted array.
So the complexity of this procedure for each point is n*log(n) and the whole algorithm works in O(n^2*log(n))
P.S. Can't quickly find the proof that this algorithm is correct :(, hope will find it it later and post it then.
The problem
Given a set of n points, can we find three points that describe a triangle with minimum area in O(n^2)? If yes, how, and if not, can we do better than O(n^3)
is better resolved in this paper: James King, A Survey of 3sum-Hard Problems, 2004
What is the best algorithm to find if any three points are collinear in a set of points say n. Please also explain the complexity if it is not trivial.
Thanks
Bala
If you can come up with a better than O(N^2) algorithm, you can publish it!
This problem is 3-SUM Hard, and whether there is a sub-quadratic algorithm (i.e. better than O(N^2)) for it is an open problem. Many common computational geometry problems (including yours) have been shown to be 3SUM hard and this class of problems is growing. Like NP-Hardness, the concept of 3SUM-Hardness has proven useful in proving 'toughness' of some problems.
For a proof that your problem is 3SUM hard, refer to the excellent surver paper here: http://www.cs.mcgill.ca/~jking/papers/3sumhard.pdf
Your problem appears on page 3 (conveniently called 3-POINTS-ON-LINE) in the above mentioned paper.
So, the currently best known algorithm is O(N^2) and you already have it :-)
A simple O(d*N^2) time and space algorithm, where d is the dimensionality and N is the number of points (probably not optimal):
Create a bounding box around the set of points (make it big enough so there are no points on the boundary)
For each pair of points, compute the line passing through them.
For each line, compute its two collision points with the bounding box.
The two collision points define the original line, so if there any matching lines they will also produce the same two collision points.
Use a hash set to determine if there are any duplicate collision point pairs.
There are 3 collinear points if and only if there were duplicates.
Another simple (maybe even trivial) solution which doesn't use a hash table, runs in O(n2log n) time, and uses O(n) space:
Let S be a set of points, we will describe an algorithm which finds out whether or not S contains some three collinear points.
For each point o in S do:
Pass a line L parallel to the x-axis through o.
Replace every point in S below L, with its reflection. (For example if L is the x axis, (a,-x) for x>0 will become (a,x) after the reflection). Let the new set of points be S'
The angle of each point p in S', is the right angle of the segment po with the line L. Let us sort the points S' by their angles.
Walk through the sorted points in S'. If there are two consecutive points which are collinear with o - return true.
If no collinear points were found in the loop - return false.
The loop runs n times, and each iteration performs nlog n steps. It is not hard to prove that if there're three points on a line they'll be found, and we'll find nothing otherwise.