I'm looking for an algorithm that can quickly (I'm heavily constrained by performance) find a point inside of a circle, where this point is outside of all rectangles in a provided set (these rectangles can be rotated).
Or alternatively, to find a circle A with its center inside a circle B, where circle A does not intersect with a set of line segments.
The only solution I can come up with is to just loop through samples of points and then loop through the rectangles for each of them. But since my space is continuous, that's quite a pain. I'm basically satisfied with just a single point that doesn't intersect, but there will be cases where no such points exist. In the latter case I would ideally try to find a point with the least amount of intersections, or be able to find the answer that no such point exists.
Does anyone know of any algorithms that can accomplish this in something less than O(n^2)? Anything that would help identify good candidate points would be awesome too.
A typical example of the situation is this:
Lots of big rectangles, with small circle in which I hope to find a point (here indicated with blue). It's common that many of the rectangles fall completely outside of the circle, and also common that the circle is completely covered. There's only a small set of lengths and widths that tend to be used for the rectangles.
There are probably several interesting ways to do this. The simplest algorithm I can think of that gives a decent runtime is an algorithm as follows:
Treat all rectangles as a set of line segments.
Use an efficient algorithm to find the intersection of all line segments (for example the Bentley-Ottmann algorithm.)
Create a list of points of interest (POIs) that are either a) the corners of a rectangle or b) the intersection points computed in 2.
Create a finer set of line segments such that each line segment terminates at a POI defined in 3.
Using the POIs and the finer set of line segments from 4, compute a constrained triangulation (for example a Constrained Delaunay Triangulation.)
Pick any (unlabeled) triangle to start. Determine if the triangle lies within at least one rectangle (label it as a COVERED triangle) or not (label it as a FREE triangle). To do this you can use any point in polygon algorithm, for example ray-casting.
Run a Depth or Breadth first search starting at this triangle and expanding to neighbors, taking care not to cross between any triangle pair that would require crossing a line segment defined in 4. For every triangle visited, label it as the same label as the starting triangle.
Repeat 6-7 until all triangles are labeled (or all triangles covering the circle of interest are labeled.)
The union of all FREE triangles intersected with the circled of interest yields precisely the points that are not covered by any rectangle and are within the circle.
Note, this algorithm is a bit general and can be improved by focusing only in the area around the circle (for example a bounding box region can only be considered, with the bounding box encompassing all rectangles intersecting the circle.)
To analyze the runtime, consider the runtime of each key step:
has a runtime of O((n+k) log n) where k is the number of intersections, where n is the number of line segments.
has a runtime of O(m log m) where m is the number of POIs, m is O(n+k)
and 7. should be analyzed together. In the worst case, each triangle would need O(n) computations to check for containment in a rectangle. Given that there would be O(m) triangles this would yield a O(nm) bound. However, the purpose of the triangulation is to reuse the point in polygon computation for the seeding triangle to label as many neighboring triangles as possible. In practice the number of triangles that would require a point in polygon computation should be negligible. Therefore the runtime of this step is O(tn) where t is the number of traingles for which point in polygon computations are performed.
The runtime expected is, therefore, O((n+k) log n + t(n+k)) where k is the number of intersections in step 2 and t is the number of triangles for which point in polygon computations are performed. In the worst case this is O(n^2 log n) as you can create a pathological example with n^2 intersections, but this should be unlikely if not possible. Likewise, the number t should be kept to a minimum to make this as efficient as possible. If both t << n and k << n^2, this would be quite efficient.
One approximation that could yield performance improvement:
Consider approximating the circle by a set of r line segments, and including these line segments in steps 1-5. While this is an approximation, it would potentially improve the runtime, as only triangles inside the circle would ever need to be considered.
Related
This question is an extension on some computation details of this question.
Suppose one has a set of (potentially overlapping) circles, and one wishes to compute the area this set of circles covers. (For simplicity, one can assume some precomputation steps have been made, such as getting rid of circles included entirely in other circles, as well as that the circles induce one connected component.)
One way to do this is mentioned in Ants Aasma's and Timothy's Shields' answers, being that the area of overlapping circles is just a collection of circle slices and polygons, both of which the area is easy to compute.
The trouble I'm encountering however is the computation of these polygons. The nodes of the polygons (consisting of circle centers and "outer" intersection points) are easy enough to compute:
And at first I thought a simple algorithm of picking a random node and visiting neighbors in clockwise order would be sufficient, but this can result in the following "outer" polygon to be constructed, which is not part of the correct polygons.
So I thought of different approaches. A Breadth First Search to compute minimal cycles, but I think the previous counterexample can easily be modified so that this approach results in the "inner" polygon containing the hole (and which is thus not a correct polygon).
I was thinking of maybe running a Las Vegas style algorithm, taking random points and if said point is in an intersection of circles, try to compute the corresponding polygon. If such a polygon exists, remove circle centers and intersection points composing said polygon. Repeat until no circle centers or intersection points remain.
This would avoid ending up computing the "outer" polygon or the "inner" polygon, but would introduce new problems (outside of the potentially high running time) e.g. more than 2 circles intersecting in a single intersection point could remove said intersection point when computing one polygon, but would be necessary still for the next.
Ultimately, my question is: How to compute such polygons?
PS: As a bonus question for after having computed the polygons, how to know which angle to consider when computing the area of some circle slice, between theta and 2PI - theta?
Once we have the points of the polygons in the right order, computing the area is a not too difficult.
The way to achieve that is by exploiting planar duality. See the Wikipedia article on the doubly connected edge list representation for diagrams, but the gist is, given an oriented edge whose right face is inside a polygon, the next oriented edge in that polygon is the reverse direction of the previous oriented edge with the same head in clockwise order.
Hence we've reduced the problem to finding the oriented edges of the polygonal union and determining the correct order with respect to each head. We actually solve the latter problem first. Each intersection of disks gives rise to a quadrilateral. Let's call the centers C and D and the intersections A and B. Assume without loss of generality that the disk centered at C is not smaller than the disk centered at D. The interior angle formed by A→C←B is less than 180 degrees, so the signed area of that triangle is negative if and only if A→C precedes B→C in clockwise order around C, in turn if and only if B→D precedes A→D in clockwise order around D.
Now we determine which edges are actually polygon boundaries. For a particular disk, we have a bunch of angle intervals around its center from before (each sweeping out the clockwise sector from the first endpoint to the second). What we need amounts to a more complicated version of the common interview question of computing the union of segments. The usual sweep line algorithm that increases the cover count whenever it scans an opening endpoint and decreases the cover count whenever it scans a closing endpoint can be made to work here, with the adjustment that we need to initialize the count not to 0 but to the proper cover count of the starting angle.
There's a way to do all of this with no trigonometry, just subtraction and determinants and comparisons.
Is there any algorithm that would allow to approximate a path on the x-y plane (i.e. an ordered suite of points defined by x and y) with a limited number of line segments and arcs of circles (constant curvature)? The resulting curve needs to be C1 (continuity of slope).
The maximum number or segments and arcs could be a parameter. An additional interesting constraint would be to prevent two consecutive circles of arcs without an intermediate line segment joining them.
I do not see any way to do this, and I do not think that there exists a method for it, but any hint towards this objective is welcome.
Example:
Sample file available here
Consider this path. It looks like a line, but is actually an ordered suite of very close points. There is no noise and the order of the sequence of points is well known.
I would like to approximate this curve with a minimum number of succession of line segments and circular arcs (let's say 10 line segments and 10 circular arcs) and a C1 continuity. The number of segments/arcs is not an objective itself but I need any parameter which would allow to reduce/increase this number to attain a certain simplicity of the parametrization, at the cost of accuracy loss.
Solution:
Here is my solution, based on Spektre's answer. Red curve is original data. Black lines are segments and blue curves are circle arcs. Green crosses are arc centers with radii shown and blue ones are points where segments potentially join.
Detect line segments, based on slope max deviation and segment minimal length as parameters. The slope of the new segment step is compared with the average step of the existing segment. I would prefer an optimization-based method, but I do not think that it exists for disjoint segments with unknown number, position and length.
Join segments with tangent arcs. To close the system, the radius is chosen such that the segments extremities are the least possible moved. A minimum radius constraint has been added for my purposes. I believe that there will be some special cases to treat in the inflexion points are far away when (e.g. lines are nearly parallel) and interact with neigboring segments.
so you got a point cloud ... for such Usually points close together are considered connected so:
you need to add info about what points are close to which ones
points close only to 2 neighbors signaling interior of curve/line. Only one neighbor means endpoint of curve/lines and more then 2 means intersection or too close almost or parallel lines/curves. No neighbors means either noise or just a dot.
group path segments together
This is called connected component analysis. So you need to form polylines from your neighbor info table.
detect linear path chunks
these have the same slope among neighboring segments so you can join them to single line.
fit the rest with curves
Here related QAs:
Finding holes in 2d point sets?
Algorithms: Ellipse matching
How approximation search works see the sublinks there are quite a bit of examples of fitting
Trace a shape into a polygon of max n sides
[Edit1] simple line detection from #3 on your data
I used 5.0 deg angle change as threshold for lines and also minimal size fo detected line as 50 samples (too lazy to compute length assuming constant point density). The result looks like this:
dots are detected line endpoints, green lines are the detected lines and white "lines" are the curves so I do not see any problem with this approach for now.
Now the problem is with the points left (curves) I think there should be also geometric approach for this as it is just circular arcs so something like this
Formula to draw arcs ending in straight lines, Y as a function of X, starting slope, ending slope, starting point and arc radius?
And this might help too:
Circular approximation of polygon (or its part)
the C1 requirement demands the you must have alternating straights and arcs. Also realize if you permit a sufficient number of segments you can trivially fit every pair of points with a straight and use a tiny arc to satisfy slope continuity.
I'd suggest this algorithm,
1 best fit with a set of (specified N) straight segments. (surely there are well developed algorithms for that.)
2 consider the straight segments fixed and at each joint place an arc. Treating each joint individually i think you have a tractable problem to find the optimum arc center/radius to satisfy continuity and improve the fit.
3 now that you are pretty close attempt to consider all arc centers and radii (segments being defined by tangency) as a global optimization problem. This of course blows up if N is large.
A typical constraint when approximating a given curve by some other curve is to bound the approximate curve to an epsilon-hose within the original curve (in terms if Minkowski sum with a disk of fixed radius epsilon).
For G1- or C2-continuous approximation (which people from CNC/CAD like) with biarcs (and a straight-line segment could be seen as an arc with infinite radius) former colleagues of mine developed an algorithm that gives solutions like this [click to enlarge]:
The above picture is taken from the project website: https://www.cosy.sbg.ac.at/~held/projects/apx/apx.html
The algorithm is fast, that is, it runs in O(n log n) time and is based on the generalized Voronoi diagram. However, it does not give an approximation with the exact minimum number of elements. If you look for the theoretical optimum I would refer to a paper by Drysdale et al., Approximation of an Open Polygonal Curve with
a Minimum Number of Circular Arcs and Biarcs, CGTA, 2008.
I have an image of which this is a small cut-out:
As you can see it are white pixels on a black background. We can draw imaginary lines between these pixels (or better, points). With these lines we can enclose areas.
How can I find the largest convex black area in this image that doesn't contain a white pixel in it?
Here is a small hand-drawn example of what I mean by the largest convex black area:
P.S.: The image is not noise, it represents the primes below 10000000 ordered horizontally.
Trying to find maximum convex area is a difficult task to do. Wouldn't you just be fine with finding rectangles with maximum area? This problem is much easier and can be solved in O(n) - linear time in number of pixels. The algorithm follows.
Say you want to find largest rectangle of free (white) pixels (Sorry, I have images with different colors - white is equivalent to your black, grey is equivalent to your white).
You can do this very efficiently by two pass linear O(n) time algorithm (n being number of pixels):
1) in a first pass, go by columns, from bottom to top, and for each pixel, denote the number of consecutive pixels available up to this one:
repeat, until:
2) in a second pass, go by rows, read current_number. For each number k keep track of the sums of consecutive numbers that were >= k (i.e. potential rectangles of height k). Close the sums (potential rectangles) for k > current_number and look if the sum (~ rectangle area) is greater than the current maximum - if yes, update the maximum. At the end of each line, close all opened potential rectangles (for all k).
This way you will obtain all maximum rectangles. It is not the same as maximum convex area of course, but probably would give you some hints (some heuristics) on where to look for maximum convex areas.
I'll sketch a correct, poly-time algorithm. Undoubtedly there are data-structural improvements to be made, but I believe that a better understanding of this problem in particular will be required to search very large datasets (or, perhaps, an ad-hoc upper bound on the dimensions of the box containing the polygon).
The main loop consists of guessing the lowest point p in the largest convex polygon (breaking ties in favor of the leftmost point) and then computing the largest convex polygon that can be with p and points q such that (q.y > p.y) || (q.y == p.y && q.x > p.x).
The dynamic program relies on the same geometric facts as Graham's scan. Assume without loss of generality that p = (0, 0) and sort the points q in order of the counterclockwise angle they make with the x-axis (compare two points by considering the sign of their dot product). Let the points in sorted order be q1, …, qn. Let q0 = p. For each 0 ≤ i < j ≤ n, we're going to compute the largest convex polygon on points q0, a subset of q1, …, qi - 1, qi, and qj.
The base cases where i = 0 are easy, since the only “polygon” is the zero-area segment q0qj. Inductively, to compute the (i, j) entry, we're going to try, for all 0 ≤ k ≤ i, extending the (k, i) polygon with (i, j). When can we do this? In the first place, the triangle q0qiqj must not contain other points. The other condition is that the angle qkqiqj had better not be a right turn (once again, check the sign of the appropriate dot product).
At the end, return the largest polygon found. Why does this work? It's not hard to prove that convex polygons have the optimal substructure required by the dynamic program and that the program considers exactly those polygons satisfying Graham's characterization of convexity.
You could try treating the pixels as vertices and performing Delaunay triangulation of the pointset. Then you would need to find the largest set of connected triangles that does not create a concave shape and does not have any internal vertices.
If I understand your problem correctly, it's an instance of Connected Component Labeling. You can start for example at: http://en.wikipedia.org/wiki/Connected-component_labeling
I thought of an approach to solve this problem:
Out of the set of all points generate all possible 3-point-subsets. This is a set of all the triangles in your space. From this set remove all triangles that contain another point and you obtain the set of all empty triangles.
For each of the empty triangles you would then grow it to its maximum size. That is, for every point outside the rectangle you would insert it between the two closest points of the polygon and check if there are points within this new triangle. If not, you will remember that point and the area it adds. For every new point you want to add that one that maximizes the added area. When no more point can be added the maximum convex polygon has been constructed. Record the area for each polygon and remember the one with the largest area.
Crucial to the performance of this algorithm is your ability to determine a) whether a point lies within a triangle and b) whether the polygon remains convex after adding a certain point.
I think you can reduce b) to be a problem of a) and then you only need to find the most efficient method to determine whether a point is within a triangle. The reduction of the search space can be achieved as follows: Take a triangle and increase all edges to infinite length in both directions. This separates the area outside the triangle into 6 subregions. Good for us is that only 3 of those subregions can contain points that would adhere to the convexity constraint. Thus for each point that you test you need to determine if its in a convex-expanding subregion, which again is the question of whether it's in a certain triangle.
The whole polygon as it evolves and approaches the shape of a circle will have smaller and smaller regions that still allow convex expansion. A point once in a concave region will not become part of the convex-expanding region again so you can quickly reduce the number of points you'll have to consider for expansion. Additionally while testing points for expansion you can further cut down the list of possible points. If a point is tested false, then it is in the concave subregion of another point and thus all other points in the concave subregion of the tested points need not be considered as they're also in the concave subregion of the inner point. You should be able to cut down to a list of possible points very quickly.
Still you need to do this for every empty triangle of course.
Unfortunately I can't guarantee that by adding always the maximum new region your polygon becomes the maximum polygon possible.
I have a list of rectangles that don't have to be parallel to the axes. I also have a master rectangle that is parallel to the axes.
I need an algorithm that can tell which rectangle is a point closest to(the point must be in the master rectangle). the list of rectangles and master rectangle won't change during the algorithm and will be called with many points so some data structure should be created to make the lookup faster.
To be clear: distance from a rectangle to a point is the distance between the closest point in the rectangle to the point.
What algorithm/data structure can be used for this? memory is on higher priority on this, n log n is ok but n^2 is not.
You should be able to do this with a Voronoi diagram with O(n log n) preprocessing time with O(log n) time queries. Because the objects are rectangles, not points, the cells may be curved. Nevertheless, a Voronoi diagram should work fine for your purposes. (See http://en.wikipedia.org/wiki/Voronoi_diagram)
For a quick and dirty solution that you could actually get working within a day, you could do something inspired by locality sensitive hashing. For example, if the rectangles are somewhat well-spaced, you could hash them into square buckets with a few different offsets, and then for each query examine each rectangle that falls in one of the handful of buckets that contain the query point.
You should be able to do this in O(n) time and O(n) memory.
Calculate the closest point on each edge of each rectangle to the point in question. To do this, see my detailed answer in the this question. Even though the question has to do with a point inside of the polygon (rather than outside of it), the algorithm still can be applied here.
Calculate the distance between each of these closest points on the edges, and find the closest point on the entire rectangle (for each rectangle) to the point in question. See the link above for more details.
Find the minimum distance between all of the rectangles. The rectangle corresponding with your minimum distance is the winner.
If memory is more valuable than speed, use brute force: for a given point S, compute the distance from S to each edge. Choose the rectangle with the shortest distance.
This solution requires no additional memory, while its execution time is in O(n).
Depending on your exact problem specification, you may have to adjust this solution if the rectangles are allowed to overlap with the master rectangle.
As you described, a distance between one point to a rectangle is the minimum length of all lines through that point which is perpendicular with all four edges of a rectangle and all lines connect that point with one of four vertices of the rectangle.
(My English is not good at describing a math solution, so I think you should think more deeply for understanding my explanation).
For each rectangle, you should save four vertices and four edges function for fast calculation distance between them with the specific point.
This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
How to find largest triangle in convex hull aside from brute force search
I have a set of random points from which i want to find the largest triangle by area who's verticies are each on one of those points.
So far I have figured out that the largest triangle's verticies will only lie on the outside points of the cloud of points (or the convex hull) so i have programmed a function to do just that (using Graham scan in nlogn time).
However that's where I'm stuck. The only way I can figure out how to find the largest triangle from these points is to use brute force at n^3 time which is still acceptable in an average case as the convex hull algorithm usually kicks out the vast majority of points. However in a worst case scenario where points are on a circle, this method would fail miserably.
Dose anyone know an algorithm to do this more efficiently?
Note: I know that CGAL has this algorithm there but they do not go into any details on how its done. I don't want to use libraries, i want to learn this and program it myself (and also allow me to tweak it to exactly the way i want it to operate, just like the graham scan in which other implementations pick up collinear points that i don't want).
Don't know if this help, but if you choose two points from the convex hull and rotate all points of the hull so that the connecting line of the two points is parallel to the x-Axis, either the point with the maximum or the one with the minimum y-coordinate forms the triangle with the largest area together with the two points chosen first.
Of course once you have tested one point for all possible base lines, you can remove it from the list.
Here's a thought on how to get it down to O(n2 log n). I don't really know anything about computational geometry, so I'll mark it community wiki; please feel free to improve on this.
Preprocess the convex hull by finding for each point the range of slopes of lines through that point such that the set lies completely on one side of the line. Then invert this relationship: construct an interval tree for slopes with points in leaf nodes, such that when querying with a slope you find the points such that there is a tangent through those points.
If there are no sets of three or more collinear points on the convex hull, there are at most four points for each slope (two on each side), but in case of collinear points we can just ignore the intermediate points.
Now, iterate through all pairs of points (P,Q) on the convex hull. We want to find the point R such that triangle PQR has maximum area. Taking PQ as the base of the triangle, we want to maximize the height by finding R as far away from the line PQ as possible. The line through R parallel to PQ must be such that all points lie on one side of the line, so we can find a bounded number of candidates in time O(log n) using the preconstructed interval tree.
To improve this further in practice, do branch-and-bound in the set of pairs of points: find an upper bound for the height of any triangle (e.g. the maximum distance between two points), and discard any pair of points whose distance multiplied by this upper bound is less than the largest triangle found so far.
I think the rotating calipers method may apply here.
Off the top of my head, perhaps you could do something involving gridding/splitting the collection of points up into groups? Maybe... separating the points into three groups (not sure what the best way to do that in this case would be, though), doing something to discard those points in each group that are closer to the other two groups than other points in the same group, and then using the remaining points to find the largest triangle that can be made having one vertex in each group? This would actually make the case of all points being on a circle a lot simpler, because you'd just focus on the points that are near the center of the arcs contained within each group, as those would be the ones in each group furthest from the other two groups.
I'm not sure if this would give you the proper result for certain triangles/distributions of points, though. There may be situations where the resultant triangle isn't of optimal area, either because the grouping and/or the vertex choosing aren't/isn't optimal. Something like that.
Anyway, those are my thoughts on the problem. I hope I've at least been able to give you ideas for how to work on it.
How about dropping a point at a time from the convex hull? Starting with the convex hull, calculate the area of the triangle formed by each triple of adjacent points (p1p2p3, p2p3p4, etc.). Find the triangle with minimum area, then drop the middle of the three points that formed that triangle. (In other words, if the smallest area triangle is p3p4p5, drop P4.) Now you have a convex polygon with N-1 points. Repeat the same procedure until you are left with three points. This should take O(N^2) time.
I would not be at all surprised if there is some pathological case where this doesn't work, but I expect that it would work for the majority of cases. (In other words, I haven't proven this, and I have no source to cite.)