Efficient sorting of integer vertices of a convex polygon - performance

I am given as input n pairs of integers which describe points in the 2D plane that are known ahead of time to be vertices of some convex polygon.
I'd like to efficiently sort these points in a clockwise or counter-clockwise fashion.
At first I thought of doing something like the initial step of Graham Scan, but I can't see a simple way to break ties for vertices that make the same angle with the anchor point.
Notice that, as you walk along the sides of the polygon, sometimes these vertices may be getting closer to the anchor point, and sometimes they may be getting farther.
Something that does seem to work is producing a point in the interior of the polygon (for instance, the average of the n points) and using it is an anchor point for radial sorting of the input.
Indeed, because the anchor point lies in the interior, any ray emanating from it contains at most one input point, so there will be no ties.
The overall complexity is not affected: computing the midpoint is an O(n) task, and the bottleneck is the actual sorting.
This does involve a few more operations than the hopeful version of Graham Scan (where we assume there are no ties to be broken), but the bigger loss is leaving integer arithmethic behind by introducing division into the mix.
This in turn can be remedied with scaling everything by a factor of n, but at this point it seems like grasping at straws.
Am I missing something?
Is there a simpler, efficient way to solve this sorting problem, preferrably one that can avoid floating point calculations?

Related

Finding a point that is not inside any rotated rectangles

I'm looking for an algorithm that can quickly (I'm heavily constrained by performance) find a point inside of a circle, where this point is outside of all rectangles in a provided set (these rectangles can be rotated).
Or alternatively, to find a circle A with its center inside a circle B, where circle A does not intersect with a set of line segments.
The only solution I can come up with is to just loop through samples of points and then loop through the rectangles for each of them. But since my space is continuous, that's quite a pain. I'm basically satisfied with just a single point that doesn't intersect, but there will be cases where no such points exist. In the latter case I would ideally try to find a point with the least amount of intersections, or be able to find the answer that no such point exists.
Does anyone know of any algorithms that can accomplish this in something less than O(n^2)? Anything that would help identify good candidate points would be awesome too.
A typical example of the situation is this:
Lots of big rectangles, with small circle in which I hope to find a point (here indicated with blue). It's common that many of the rectangles fall completely outside of the circle, and also common that the circle is completely covered. There's only a small set of lengths and widths that tend to be used for the rectangles.
There are probably several interesting ways to do this. The simplest algorithm I can think of that gives a decent runtime is an algorithm as follows:
Treat all rectangles as a set of line segments.
Use an efficient algorithm to find the intersection of all line segments (for example the Bentley-Ottmann algorithm.)
Create a list of points of interest (POIs) that are either a) the corners of a rectangle or b) the intersection points computed in 2.
Create a finer set of line segments such that each line segment terminates at a POI defined in 3.
Using the POIs and the finer set of line segments from 4, compute a constrained triangulation (for example a Constrained Delaunay Triangulation.)
Pick any (unlabeled) triangle to start. Determine if the triangle lies within at least one rectangle (label it as a COVERED triangle) or not (label it as a FREE triangle). To do this you can use any point in polygon algorithm, for example ray-casting.
Run a Depth or Breadth first search starting at this triangle and expanding to neighbors, taking care not to cross between any triangle pair that would require crossing a line segment defined in 4. For every triangle visited, label it as the same label as the starting triangle.
Repeat 6-7 until all triangles are labeled (or all triangles covering the circle of interest are labeled.)
The union of all FREE triangles intersected with the circled of interest yields precisely the points that are not covered by any rectangle and are within the circle.
Note, this algorithm is a bit general and can be improved by focusing only in the area around the circle (for example a bounding box region can only be considered, with the bounding box encompassing all rectangles intersecting the circle.)
To analyze the runtime, consider the runtime of each key step:
has a runtime of O((n+k) log n) where k is the number of intersections, where n is the number of line segments.
has a runtime of O(m log m) where m is the number of POIs, m is O(n+k)
and 7. should be analyzed together. In the worst case, each triangle would need O(n) computations to check for containment in a rectangle. Given that there would be O(m) triangles this would yield a O(nm) bound. However, the purpose of the triangulation is to reuse the point in polygon computation for the seeding triangle to label as many neighboring triangles as possible. In practice the number of triangles that would require a point in polygon computation should be negligible. Therefore the runtime of this step is O(tn) where t is the number of traingles for which point in polygon computations are performed.
The runtime expected is, therefore, O((n+k) log n + t(n+k)) where k is the number of intersections in step 2 and t is the number of triangles for which point in polygon computations are performed. In the worst case this is O(n^2 log n) as you can create a pathological example with n^2 intersections, but this should be unlikely if not possible. Likewise, the number t should be kept to a minimum to make this as efficient as possible. If both t << n and k << n^2, this would be quite efficient.
One approximation that could yield performance improvement:
Consider approximating the circle by a set of r line segments, and including these line segments in steps 1-5. While this is an approximation, it would potentially improve the runtime, as only triangles inside the circle would ever need to be considered.

polygon union without holes

Im looking for some fairly easy (I know polygon union is NOT an easy operation but maybe someone could point me in the right direction with a relativly easy one) algorithm on merging two intersecting polygons. Polygons could be concave without holes and also output polygon should not have holes in it. Polygons are represented in counter-clockwise manner. What I mean is presented on a picture. As you can see even if there is a hole in union of polygons I dont need it in the output. Input polygons are for sure without holes. I think without holes it should be easier to do but still I dont have an idea.
Remove all the vertices of the polygons which lie inside the other polygon: http://paulbourke.net/geometry/insidepoly/
Pick a starting point that is guaranteed to be in the union polygon (one of the extremes would work)
Trace through the polygon's edges in counter-clockwise fashion. These are points in your union. Trace until you hit an intersection (note that an edge may intersect with more than one edge of the other polygon).
Find the first intersection (if there are more than one). This is a point in your Union.
Go back to step 3 with the other polygon. The next point should be the point that makes the greatest angle with the previous edge.
You can proceed as below:
First, add to your set of points all the points of intersection of your polygons.
Then I would proceed like graham scan algorithm but with one more constraint.
Instead of selecting the point that makes the highest angle with the previous line (have a look at graham scan to see what I mean (*), chose the one with the highest angle that was part of one of the previous polygon.
You will get an envellope (not convex) that will describe your shape.
Note:
It's similar to finding the convex hull of your points.
For example graham scan algorithm will help you find the convex hull of the set of points in O (N*ln (N) where N is the number of points.
Look up for convex hull algorithms, and you can find some ideas.
Remarques:
(*)From wikipedia:
The first step in this algorithm is to find the point with the lowest
y-coordinate. If the lowest y-coordinate exists in more than one point
in the set, the point with the lowest x-coordinate out of the
candidates should be chosen. Call this point P. This step takes O(n),
where n is the number of points in question.
Next, the set of points must be sorted in increasing order of the
angle they and the point P make with the x-axis. Any general-purpose
sorting algorithm is appropriate for this, for example heapsort (which
is O(n log n)). In order to speed up the calculations, it is not
necessary to calculate the actual angle these points make with the
x-axis; instead, it suffices to calculate the cosine of this angle: it
is a monotonically decreasing function in the domain in question
(which is 0 to 180 degrees, due to the first step) and may be
calculated with simple arithmetic.
In the convex hull algorithm you chose the point of the angle that makes the largest angle with the previous side.
To "stick" with your previous polygon, just add the constraint that you must select a side that previously existed.
And you take off the constraint of having angle less than 180°
I don't have a full answer but I'm about to embark on a similar problem. I think there are two step which are fairly important. First would be to find a point on some polygon which lies on the outside edge. Second would be to make a list of bounding boxes for all the vertices and see which of these overlap. This means when you iterate through vertices, you don't have to do tests for all of them, only those which you know have a chance of intersecting (bounding box problems are lightweight).
Since you now have an outside point, you can now iterate through connected points until you detect an intersection. If you know which side is inside and which outside (you may need to do some work on the first vertex to know this), you know which way to go on the intersection. Then it's merely a matter of switching polygons.
This gets a little more interesting if you want to maintain that hole (which I do) in which case, I would probably make sure I had used up all my intersecting bounding boxes. You also didn't specify what should happen if your polygons don't intersect at all. But that's either going to be leave them alone (which could potentially be a problem if you're expecting one polygon out) or return an error.

Sorting of Points in 2D space

Suppose random points P1 to P20 scattered in a plane.
Then is there any way to sort those points in either clock-wise or anti-clock wise.
Here we can’t use degree because you can see from the image many points can have same degree.
E.g, here P4,P5 and P13 acquire the same degree.
If your picture has realistic distance between the points, you might get by with just choosing a point at random, say P1, and then always picking the nearest unvisited neighbour as your next point. Traveling Salesman, kind of.
Are you saying you want an ordered result P1, P2, ... P13?
If that's the case, you need to find the convex hull of the points. Walking around the circumference of the hull will then give you the order of the points that you need.
In a practical sense, have a look at OpenCV's documentation -- calling convexHull with clockwise=true gives you a vector of points in the order that you want. The link is for C++, but there are C and Python APIs there as well. Other packages like Matlab should have a similar function, as this is a common geometrical problem to solve.
EDIT
Once you get your convex hull, you could iteratively collapse it from the outside to get the remaining points. Your iterations would stop when there are no more pixels left inside the hull. You would have to set up your collapse function such that closer points are included first, i.e. such that you get:
and not:
In both diagrams, green is the original convex hull, the other colors are collapsed areas.
Find the right-most of those points (in O(n)) and sort by the angle relative to that point (O(nlog(n))).
It's the first step of graham's convex-hull algorithm, so it's a very common procedure.
Edit: Actually, it's just not possible, since the polygonal representation (i.e. the output-order) of your points is ambiguous. The algorithm above will only work for convex polygons, but it can be extended to work for star-shaped polygons too (you need to pick a different "reference-point").
You need to define the order you actually want more precisely.

Largest triangle from a set of points [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
How to find largest triangle in convex hull aside from brute force search
I have a set of random points from which i want to find the largest triangle by area who's verticies are each on one of those points.
So far I have figured out that the largest triangle's verticies will only lie on the outside points of the cloud of points (or the convex hull) so i have programmed a function to do just that (using Graham scan in nlogn time).
However that's where I'm stuck. The only way I can figure out how to find the largest triangle from these points is to use brute force at n^3 time which is still acceptable in an average case as the convex hull algorithm usually kicks out the vast majority of points. However in a worst case scenario where points are on a circle, this method would fail miserably.
Dose anyone know an algorithm to do this more efficiently?
Note: I know that CGAL has this algorithm there but they do not go into any details on how its done. I don't want to use libraries, i want to learn this and program it myself (and also allow me to tweak it to exactly the way i want it to operate, just like the graham scan in which other implementations pick up collinear points that i don't want).
Don't know if this help, but if you choose two points from the convex hull and rotate all points of the hull so that the connecting line of the two points is parallel to the x-Axis, either the point with the maximum or the one with the minimum y-coordinate forms the triangle with the largest area together with the two points chosen first.
Of course once you have tested one point for all possible base lines, you can remove it from the list.
Here's a thought on how to get it down to O(n2 log n). I don't really know anything about computational geometry, so I'll mark it community wiki; please feel free to improve on this.
Preprocess the convex hull by finding for each point the range of slopes of lines through that point such that the set lies completely on one side of the line. Then invert this relationship: construct an interval tree for slopes with points in leaf nodes, such that when querying with a slope you find the points such that there is a tangent through those points.
If there are no sets of three or more collinear points on the convex hull, there are at most four points for each slope (two on each side), but in case of collinear points we can just ignore the intermediate points.
Now, iterate through all pairs of points (P,Q) on the convex hull. We want to find the point R such that triangle PQR has maximum area. Taking PQ as the base of the triangle, we want to maximize the height by finding R as far away from the line PQ as possible. The line through R parallel to PQ must be such that all points lie on one side of the line, so we can find a bounded number of candidates in time O(log n) using the preconstructed interval tree.
To improve this further in practice, do branch-and-bound in the set of pairs of points: find an upper bound for the height of any triangle (e.g. the maximum distance between two points), and discard any pair of points whose distance multiplied by this upper bound is less than the largest triangle found so far.
I think the rotating calipers method may apply here.
Off the top of my head, perhaps you could do something involving gridding/splitting the collection of points up into groups? Maybe... separating the points into three groups (not sure what the best way to do that in this case would be, though), doing something to discard those points in each group that are closer to the other two groups than other points in the same group, and then using the remaining points to find the largest triangle that can be made having one vertex in each group? This would actually make the case of all points being on a circle a lot simpler, because you'd just focus on the points that are near the center of the arcs contained within each group, as those would be the ones in each group furthest from the other two groups.
I'm not sure if this would give you the proper result for certain triangles/distributions of points, though. There may be situations where the resultant triangle isn't of optimal area, either because the grouping and/or the vertex choosing aren't/isn't optimal. Something like that.
Anyway, those are my thoughts on the problem. I hope I've at least been able to give you ideas for how to work on it.
How about dropping a point at a time from the convex hull? Starting with the convex hull, calculate the area of the triangle formed by each triple of adjacent points (p1p2p3, p2p3p4, etc.). Find the triangle with minimum area, then drop the middle of the three points that formed that triangle. (In other words, if the smallest area triangle is p3p4p5, drop P4.) Now you have a convex polygon with N-1 points. Repeat the same procedure until you are left with three points. This should take O(N^2) time.
I would not be at all surprised if there is some pathological case where this doesn't work, but I expect that it would work for the majority of cases. (In other words, I haven't proven this, and I have no source to cite.)

Faster way to compare two sets of points in N-dimensional space?

List1 contains a high number (~7^10) of N-dimensional points (N <=10), List2 contains the same or fewer number of N-dimensional points (N <=10).
My task is this: I want to check which point in List2 is closest (euclidean distance) to a point in List1 for every point in List1 and subsequently perform some operation on it. I have been doing it the simple- the nested loop way when I didn't have more than 50 points in List1, but with 7^10 points, this obviously takes up a lot of time.
What is the fastest way to do this? Any concepts from Computational Geometry might help?
EDIT: I have the following in place, I have built a kd-tree out of List2 and then now I am doing a nearest-neighborhood search for each point in List1. Now as I originally pointed out, List1 has 7^10 points, and hence though I am saving on the brute force, Euclidean distance method for every pair, the sheer large number of points in List1 is causing a lot of time consumption. Is there any way I can improve this?
Well a good way would be to use something like a kd-tree and perform nearest neighbour searching. Fortunately you do not have to implement this data structure yourself, it has been done before. I recommend this one, but there are others:
http://www.cs.umd.edu/~mount/ANN/
It's not possible to tell you which is the most efficient algorithm without knowing anything about the distribution of points in the two solutions. However, for a first guess...
First algorithm doesn't work — for two reasons: (1) a wrong assumption - I assume the bounding hulls are disjoint, and (2) a misreading of the question - it doesn't find the shortest edge for every pair of points.
...compute the convex hull of the two sets: the closest points must be on the hyperface on the two hulls through which the line between the two centres of gravity passes.
You can compute the convex hull by computing the centre points, the centre of gravity assuming all points have equal mass, and ordering the lists from furthest from the centre to least far. Then take the furthest away point in the list, add this to the convex hull, and then remove all points that are within the so-far computed convex hull (you will need to compute lots of 10d hypertriangles to do this). Repeat unil there is nothing left in the list that is not on the convex hull.
Second algorithm: partial
Compute the convex hull for List2. For each point of List1, if the point is outside the convex hull, then find the hyperface as for first algorithm: the nearest point must be on this face. If it is on the face, likewise. If it is inside, you can still find the hyperface by extending the line past the point from List1: the nearest point must be inside the ball that includes the hyperface to List2's centre of gravity: here, though, you need a new algorithm to get the nearest point, perhaps the kd-tree approach.
Perfomance
When List2 is something like evenly distributed, or normally distributed, through some fairly oblique shape, this will do a good job of reducing the number of points under consideration, and it should be compatible with the kd-tree suggestion.
There are some horrible worts cases, though: if List2 contains only points on the surface of a torus whose geometric centre is the centre of gravity of the list, then the convex hull will be very expensive to calculate, and will not help much in reducing the number of points under consideration.
My evaluation
These kinds of geometric techniques may be a useful complement to the kd-trees approach of other posters, but you need to know a little about the distribution of points before you can determine whether they are worth applying.
kd-tree is pretty fast. I've used the algorithm in this paper and it works well Bentley - K-d trees for semidynamic point sets
I'm sure there are libraries around, but it's nice to know what's going on sometimes - Bentley explains it well.
Basically, there are a number of ways to search a tree: Nearest N neighbors, All neighbors within a given radius, nearest N neighbors within a radius. Sometimes you want to search for bounded objects.
The idea is that the kdTree partitions the space recursively. Each node is split in 2 down the axis in one of the dimensions of the space you are in. Ideally it splits perpendicular to the node's longest dimension. You should keep splitting the space until you have about 4 points in each bucket.
Then for every query point, as you recursively visit nodes, you check the distance from to the partition wall for the particular node you are in. You descend both nodes (the one you are in and its sibling) if the distance to the partition wall is closer than the search radius. If the wall is beyond the radius, just search children of the node you are in.
When you get to a bucket (leaf node), you test the points in there to see if they are within the radius.
If you want the closest point, you can start with a massive radius, and pass a pointer or reference to it as you recurse - and in that way you can shrink the search radius as you find close points - and home in on the closest point pretty fast.
(A year later) kd trees that quit early, after looking at say 1M of all 200M points,
can be much faster in high dimensions.
The results are only statistically close to the absolute nearest, depending on the data and metric;
there's no free lunch.
(Note that sampling 1M points, and kd tree only those 1M, is quite different, worse.)
FLANN does this for image data with dim=128,
and is I believe in opencv. A local mod of the fast and solid
SciPy cKDTree also has cutoff= .

Resources