Algorithm for finding thickest line? - algorithm

I'm sorry if the topic is confusing; I don't really know what I'm searching for. I have a set P of points (|P| < 10^5). Each point has integer coordinates (between -10^4 and 10^4). How do I find a straight line which goes across all of the points specified on input? The condition is that the line must be the thickest, and you have to output how thick that line is, with accuracy up to 2 places after the decimal point. Any hint, clue, idea or algorithm name would be appreciated.
PS. It's neither homework nor SPOJ. I'm just preparing for a programming contest by doing problems from the last edition. This is one I can't solve, and I don't even know where to start searching for a solution...

You could start by determining the convex hull of this point cloud (see e.g. http://softsurfer.com/Archive/algorithm_0109/algorithm_0109.htm), and try to find the two parallel lines that bound this polygon with the shortest distance.
I think this should be an easier problem because it allows you to base the direction of the parallel lines on the segments of the convex hull (of which there are a limited number).
One implementation could be to process each segment of the convex hull in turn. Per segment, draw a line through it (this is one of the two parallel lines), and then determine the closest other parallel line that encloses the convex hull. Do this for each segment of the convex hull while recording the minimum distance you have found between the parallel lines so far. At the end you should have your optimum result.
Obviously, this still requires an efficient way to determine the closest other parallel line. A (naive, but maybe good enough) way of doing this is to take all vertices of the convex hull that are not on the current segment, and determine their perpendicular distance to the line through the segment (e.g. http://en.wikipedia.org/wiki/Distance_from_a_point_to_a_line). The maximum of these distances is also the minimum distance at which the other parallel line can be placed.
In pseudo-code:
Function FindThinnestLine(PointCloud P)
    CH = ConvexHull(P)
    optS = nothing
    optDist = infinite
    For each segment S in CH
        L = the line through S
        /* Find the minimum distance that the line parallel to L must have in order to enclose CH */
        maxDist = 0
        For each vertex P in CH, except the two that limit S
            dist = The distance between L and P
            maxDist = max(dist, maxDist)
        /* If the current S has a smaller maxDist, it is our new optimum */
        if(maxDist < optDist)
            optS = S
            optDist = maxDist
    Return the line through optS and the line parallel to optS at a distance of optDist as the result
End Function
This is an O(n^2) algorithm, with n being the number of segments in your convex hull.
Edit
Come to think of it, you don't need to iterate over O(n) vertices of the convex hull for every S (in order to find the maxDist), only for the first S. Let's call the vertex that gives maxDist for the first S oppV (opp for opposite to S), and let's say we process the segments of the convex hull in clockwise order. For every subsequent S that we process, the new oppV can be either the same vertex or one of its right neighbours (but never a left neighbour, otherwise the segments wouldn't form a convex polygon).
Hence, processing the segments of the convex hull can then be done in O(n) (but creating the convex hull is still O(n log n)).
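As a rough illustration of the idea above, here is a minimal Python sketch. The hull is built with Andrew's monotone chain (my choice, not something the answer prescribes), and the function names `cross`, `convex_hull` and `min_width` are made up for this example. It assumes integer coordinates as in the question.

```python
from math import hypot

def cross(o, a, b):
    """Twice the signed area of triangle o-a-b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def min_width(points):
    """Smallest distance between two parallel lines that enclose all points."""
    hull = convex_hull(points)
    n = len(hull)
    if n < 3:
        return 0.0  # all points collinear, or fewer than three distinct points
    best = float("inf")
    j = 1  # index of the vertex opposite the current segment (oppV)
    for i in range(n):
        a, b = hull[i], hull[(i + 1) % n]
        # Advance oppV while the next hull vertex is farther from line a-b.
        while cross(a, b, hull[(j + 1) % n]) > cross(a, b, hull[j]):
            j = (j + 1) % n
        # cross / |ab| is the perpendicular distance from oppV to line a-b.
        best = min(best, cross(a, b, hull[j]) / hypot(b[0] - a[0], b[1] - a[1]))
    return best
```

For the contest output format, something like `print(f"{min_width(points):.2f}")` would give two decimal places. Since oppV only ever moves forward, the loop over segments is O(n) after the O(n log n) hull construction.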

The thickest line containing all points of a given subset of P should be the perpendicular bisector of the segment xy, where x and y are those two points of the subset with the highest distance d between them. The thickness of this line would be d as well.

Related

Convex hull consisting of max. n points

Given a set of 2D points X, I would like to find a convex hull consisting of maximum n points. Of course, this is not always possible. Therefore, I am looking for an approximate convex hull consisting of max. n points that maximally covers the set of points X.
Stated more formally, if F(H,X) returns the number of points of X that the convex hull H covers, and |H| is the number of points out of which the hull is constructed, then I seek the following quantity:
H_hat = argmax_H F(H,X), s.t. |H| <= n
Another way to regard the problem is the task of finding the polygon consisting of max. n corners of a set X such that it maximally covers said set X.
The way I've come up with is the following:
X = getSet() \\ Get the set of 2D points
H = convexHull(X) \\ Compute the convex hull
while |H| > n do
    n_max = 0
    for h in H:
        H_ = remove(h,H) \\ Remove one point of the convex hull
        n_c = computeCoverage(X,H_) \\ Compute the coverage of the modified convex hull.
        \\ Save which point can be removed such that the coverage is reduced minimally.
        if n_c > n_max:
            n_max = n_c
            h_toremove = h
    \\ Remove the point and recompute the convex hull.
    X = remove(h_toremove,X)
    H = convexHull(X)
However, this is a very slow way of doing this. I have a large set of points and n (the constraint) is small. This means that the original convex hull contains a lot of points, hence the while-loop iterates for a very long time. Is there a more efficient way of doing this? Or are there any other suggestions for approximate solutions?
A couple of ideas.
The points that we exclude when deleting a vertex lie in the triangle formed by that vertex and the two vertices adjacent to it on the convex hull. Deleting any other vertex does not affect the set of potentially excluded points. Therefore, we only have to recompute coverage twice for each deleted vertex.
Speaking of recomputing coverage, we don't necessarily have to look at every point. This idea doesn't improve the worst case, but I think it should be a big improvement in practice. Maintain an index of points as follows. Pick a random vertex to be the "root" and group the points by which triangle formed by the root and two other vertices contains them (O(m log m) time with a good algorithm). Whenever we delete a non-root vertex, we unite and filter the two point sets for the triangles involving the deleted vertex. Whenever we recompute coverage, we can scan only points in two relevant triangles. If we ever delete this root, choose a new one and redo the index. The total cost of this maintenance will be O(m log^2 m) in expectation where m is the number of points. It's harder to estimate the cost of computing coverage, though.
If the points are reasonably uniformly distributed within the hull, maybe use area as a proxy for coverage. Store the vertices in a priority queue ordered by the area of their triangle formed by their neighbors (ear). Whenever we delete a point, update the ear area of its two neighbors. This is an O(m log m) algorithm.
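A minimal Python sketch of the area-as-proxy idea could look like the following. It assumes `hull` already holds the convex hull vertices in order and that n >= 3. The names `ear_area2` and `simplify_hull` are invented for this example, and stale heap entries are skipped lazily rather than updated in place.

```python
import heapq

def ear_area2(a, b, c):
    """Twice the area of triangle a-b-c."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))

def simplify_hull(hull, n):
    """Greedily drop the vertex with the smallest ear until n vertices remain."""
    m = len(hull)
    prev = [(i - 1) % m for i in range(m)]
    nxt = [(i + 1) % m for i in range(m)]
    alive = [True] * m
    version = [0] * m  # bumped whenever a vertex's ear area changes

    def ear(i):
        return ear_area2(hull[prev[i]], hull[i], hull[nxt[i]])

    heap = [(ear(i), 0, i) for i in range(m)]
    heapq.heapify(heap)
    remaining = m
    while remaining > n and heap:
        _, ver, i = heapq.heappop(heap)
        if not alive[i] or ver != version[i]:
            continue  # stale entry: vertex deleted or its ear has changed
        alive[i] = False
        remaining -= 1
        p, q = prev[i], nxt[i]
        nxt[p], prev[q] = q, p  # unlink vertex i from the polygon
        for v in (p, q):  # only the two neighbours get a new ear area
            version[v] += 1
            heapq.heappush(heap, (ear(v), version[v], v))
    return [hull[i] for i in range(m) if alive[i]]
```

Each deletion pushes two heap entries, so the total work stays within O(m log m) for a hull of m vertices. Whether area tracks coverage well enough depends on how uniformly the points are spread.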
Maybe the following approach could work for you: Initially, compute the convex hull H. Then select a subset of n points at random from H, which constructs a new polygon that might not cover all the points, so let's call it a quasi-convex hull Q. Count how many points are contained in Q (inliers). Repeat this a certain number of times and keep the Q proposal with the most inliers.
This seems a bit related to RANSAC, but for this task we don't really have a notion of what an outlier is, so we can't really estimate the outlier ratio. Hence, I don't know how good the approximation will be or how many iterations you need to get a reasonable result. Maybe you can add some heuristics instead of choosing the n points purely at random, or you can set a threshold for how many points should at least be contained in Q and then stop when you reach that threshold.
Edit
Actually after thinking about it, you could use a RANSAC approach:
max_inliers = 0
best_Q = None
while(true):
    points = sample_n_points(X)
    Q = construct_convex_hull(points)
    n_inliers = count_inliers(Q, X)
    if n_inliers > max_inliers:
        max_inliers = n_inliers
        best_Q = Q
    if some_condition:
        break
The advantage of this is that the creation of the convex hull is faster than in your approach, as it only uses a maximum of n points. Also, checking the number of inliers should be fast, as it can be just a bunch of sign comparisons against each edge of the convex hull.
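The sign-comparison check could be sketched in Python roughly as follows. This is only an illustration of `count_inliers` from the pseudocode; it assumes Q lists the hull vertices in counter-clockwise order and treats points on the boundary as inliers, neither of which is stated in the answer.

```python
def count_inliers(Q, X):
    """Count points of X inside or on the convex polygon Q (counter-clockwise)."""
    def inside(p):
        for i in range(len(Q)):
            a, b = Q[i], Q[(i + 1) % len(Q)]
            # A negative cross product means p lies to the right of edge a->b,
            # i.e. outside a counter-clockwise convex polygon.
            if (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) < 0:
                return False
        return True
    return sum(1 for p in X if inside(p))
```

Each point costs one sign test per edge, so a full count is O(|X| * n) for a hull of at most n vertices.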
The following does not solve the question in the way I have formulated it, but it solved the problem which spawned the question. Therefore I wanted to add it in case anybody else encounters something similar.
I investigated two approaches:
1) Regard the convex hull as a polygon and apply a polygon simplification algorithm to it. The specific algorithm I investigated was the Ramer-Douglas-Peucker algorithm.
2) Apply the algorithm described in the question, without recomputing the convex hull.
Neither approach will give you (as far as I can tell) the desired solution to the optimization problem stated, but for my tasks they worked sufficiently well.

Queries to figure out if point lies inside polygon

I have been given a strictly convex polygon of S sides and Q queries to process.
All points of the polygon and the query points are given as (x,y) pairs. The points of the polygon are given in anti-clockwise order.
The aforementioned variables are limited such that 1<=S<=10^6 and 1<=Q<=10^5 and 1<=|x|,|y|<=10^9.
For each query I should output Yes if the given point lies inside the polygon; otherwise, No.
I tried using an O(S) inclusion test (ray-casting) and it timed out for the bigger test cases but also didn't pass all the preliminary ones.
Obviously, the implementation didn't cover all the edge cases and I got to know about a specific algorithm for this question which could answer each query in O(log S) using binary search but I can't figure out how to implement it from the pseudocode (first time doing computational geometry).
Could anyone provide me with the algorithm which covers all edge cases within the required time complexity (Q log S) or guide me to a page or paper that implements it?
First, you can split your convex polygon into left and right parts both starting with the upper point and ending with the lower point. The points in both parts are already sorted by y-coordinate.
Assume that query point has coordinates (qx, qy). Now you can try to find (using a binary search) a segment from the left part and a segment from the right part that intersect with the line y = qy. If you could find both segments and qx is lying between x-coordinates of the segments' intersections with the line y = qy, it's inside the polygon.
The complexity of the query is O(log(S)).
You can do a scan line algorithm.
You need to sort the Q points by their x coordinate.
Then find the S point with the lowest x and consider a line moving along the x axis. You need to track the two sides of the polygon.
Then move along the polygon and the Q set in ascending x coordinate. For every point you now just have to check if it's between the two lines you are tracking.
Complexity is O(Q logQ + S) if Q is not sorted and O(Q+S) if Q is already sorted.
There is no need to sort: a convex polygon is already sorted!
For a convex polygon, point location is quick and easy: split the polygon in two using a straight line between vertex 0 and vertex S/2. The signed area test will tell you on which side the test point lies and which half to keep (the half is also a convex polygon).
Continue recursively until S=3 and compare against the supporting line of the third side.
O(Log(S)) tests in total per query.
(The numbers show the order of the splits.)
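One common way to realise this kind of binary search is to fan the polygon out from vertex 0 and search for the wedge that contains the query point. The Python sketch below takes that route, which is a variation on the halving described above and not necessarily the exact scheme the answer has in mind. It assumes the vertices are in anti-clockwise order, as the question states, and that a point on the boundary counts as not inside; the names `cross` and `inside_convex` are made up here.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o-a-b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def inside_convex(poly, q):
    """True if q lies strictly inside the convex polygon poly (anti-clockwise)."""
    n = len(poly)
    if n < 3:
        return False  # a point or a segment has no interior
    p0 = poly[0]
    # q must lie strictly between the rays p0->poly[1] and p0->poly[n-1].
    if cross(p0, poly[1], q) <= 0 or cross(p0, poly[n - 1], q) >= 0:
        return False
    # Binary search for the wedge (p0, poly[lo], poly[lo+1]) that contains q.
    lo, hi = 1, n - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if cross(p0, poly[mid], q) >= 0:
            lo = mid
        else:
            hi = mid
    # Finally compare against the supporting line of the far side of the wedge.
    return cross(poly[lo], poly[hi], q) > 0
```

A query would then print `"Yes" if inside_convex(poly, q) else "No"`. Python integers do not overflow, so coordinates up to 10^9 are safe in the cross products; in a fixed-width language 64-bit integers would be needed.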

Orthogonal hull algorithm

I am trying to find a way to determine the rectilinear polygon from a set of integer points (indicated by the red dots in the pictures below). The image below shows what I would like to achieve:
1.
I need only the minimal set of points that define the boundary of the rectilinear polygon. Most hull algorithms I can find do not satisfy the orthogonal nature of this problem, such as the gift-wrapping algorithm, which produces the following result (which is not what I want)...
2.
How can I get the set of points that defines the boundary shown in image 1.?
Updated:
Figure 1 is no longer referred to as convex.
Following the definition from wikipedia, it is rather easy to create a fast algorithm.
1. Start constructing the upper hull from the leftmost point (the uppermost among them if there are several). Add this point to a list.
2. Find the next point: among all the points with both coordinates strictly greater than those of the current point, choose the one with the minimal x coordinate. Add this point to your list and continue from it.
3. Continue adding points as in step 2 as long as you can.
4. Repeat the same from the rightmost point (the uppermost among them), but going to the left. I.e. each time choose the next point with greater y and lesser x, such that the difference in x is minimal.
5. Merge the two lists you got from steps 3 and 4 to get the upper hull.
6. Do steps 1-5 analogously for the lower hull.
7. Merge the upper and lower hulls found at steps 5 and 6.
In order to find the next point quickly, just sort your points by x coordinate. For example, when building the very first right-up chain, you sort by x increasing. Then iterate over all points. For each point check if its y coordinate is greater than the current value. If yes, add the point to the list and make it current.
Overall complexity would be O(N log N) for sorting.
EDIT: The description above only shows how to trace the main vertices of the hull. If you want to have a full rectilinear polygon (with line segments between consecutive points), then you have to add an additional point to your chain each time you find next point. For example, when building the right-up chain, if you find a point (x2, y2) from the current point (x1, y1), you have to add (x2, y1) and (x2, y2) to the current chain list (in this order).
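As an illustration, the first right-up chain (steps 1-3, with the extra corner points from the EDIT) might be sketched in Python like this. The function name `right_up_chain` is invented for this example, and the other three chains would follow by mirroring the coordinates.

```python
def right_up_chain(points):
    """Right-up staircase starting at the leftmost (uppermost) point.

    Corner points are included, so consecutive entries differ in exactly
    one coordinate.
    """
    # Sort by x increasing; for equal x the uppermost point comes first.
    pts = sorted(points, key=lambda p: (p[0], -p[1]))
    chain = [pts[0]]
    for x, y in pts[1:]:
        cx, cy = chain[-1]
        if y > cy:
            chain.append((x, cy))  # horizontal move to the new x first
            chain.append((x, y))   # then the vertical move up to the point
    return chain
```

For example, `right_up_chain([(0, 0), (1, 2), (2, 1), (3, 3)])` skips (2, 1) because its y coordinate does not exceed the current one.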
I think what you want to compute is the Rectilinear Convex Hull (or Orthogonal Convex Hull) of the set of points. The rectilinear convex hull is an ortho-convex shape, that is, the intersection of the shape with any horizontal or vertical line results in an empty set, a point, or a line segment.
The vertices of the rectilinear convex hull are the set of maximal points under vector dominance. The rectilinear convex hull can then be computed in optimal O(n log n) time. A very simple algorithm is presented in Preparata's book on Computational Geometry (see the section 4.1.3).
I don't know of any standard algorithm for this but it doesn't seem too complicated to define:
Assuming each point in the grid has at least 2 neighbors (or else there's no solution):
1. p = a point with only two neighbors.
2. while p isn't null
   2a. Mark p as visited
   2b. next = the unmarked neighbor that has the least amount of neighbors
   2c. next.parent = p
   2d. p = next
3. done

find a point non collinear with all other points in a plane

Given a list of N points in the plane in general position (no three are collinear), find a new point p that is not collinear with any pair of the N original points.
We obviously cannot search every point in the plane. I started by trying to find the points where the lines formed by the given points meet, or by making a circle with them or something. I don't have any clue how to check all the points.
Question found in http://introcs.cs.princeton.edu/java/42sort/
I found this question in a renowned algorithms book, which means it is answerable, but I cannot think of an optimal solution. That's why I am posting it here, so that anyone who knows the answer can post it.
The best I can come up with is an N^2 algorithm. Here goes:
1. Choose a tolerance e to control how close you're willing to come to a line formed from the points in the set.
2. Compute the convex hull of your set of points.
3. Choose a line L parallel to one of the sides of the convex hull, at a distance 3e outside the hull.
4. Choose a point P on L, so that P is outside the projection of the convex hull on L. The projection of the convex hull on L is an interval of L. P must be placed outside this interval.
5. Test each pair of points in the set. If the line M formed by the two test points intersects a disc of radius 2e around P, move P out further along L until M no longer intersects the disc. By the construction of L, there can be no line intersecting the disc parallel to L, so this can always be done.
6. If M crosses L beyond P, move P beyond that intersection, again far enough that M doesn't pass through the disc.
7. After all this is done, choose your point at distance e, on the perpendicular to L at P. It will not be collinear with any line formed from the set.
I'll leave the details of how to choose the next position of P along L in step 5 to you.
There are some obvious trivial rejection tests you can do so that you do the more expensive checks only when the test line M is "parallel enough" to L.
Finally, I should mention that it is probably possible to push P far enough out that numerical problems occur. In that case the best I can suggest is to try another line outside of the convex hull by a distance of at least 3e.
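Whatever construction is used, a candidate point can be validated by brute force against every pair, which costs O(N^2) per candidate. The Python sketch below is such a check, not the construction itself. The name `collinear_with_some_pair` is made up, and exact arithmetic (integer or rational coordinates) is assumed so that comparing the cross product with zero is meaningful.

```python
from itertools import combinations

def collinear_with_some_pair(p, points):
    """True if p lies on the line through at least one pair of the points."""
    return any(
        # The cross product of (b - a) and (p - a) is zero exactly when
        # a, b and p are collinear.
        (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) == 0
        for a, b in combinations(points, 2)
    )
```

With floating-point coordinates the comparison would have to use a tolerance instead, in the spirit of the e above.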
You can actually solve it using a simple O(n log n) algorithm, which we will then improve to O(n). Let A be the bottommost point (in case of a tie, choose the one that has the smaller x coordinate). You can now sort the rest of the points in clockwise order around A using the CCW test. Now, as you process the points in sorted order, you can see that between any two successive points having different angles with point A and the bottom axis (let these angles be U and V) there is no point having an angle c with U < c < V. So we can add any point in this section and it is guaranteed that it won’t be collinear with any other points from the set.
So all you need is to find one pair of adjacent points and you are done. Find the minimum and the second minimum angle with A (these should be different) in O(n) time and select any point in between them.

The minimum perimeter convex hull of a subset of a point set

Given n points on the plane. No 3 are collinear.
Given the number k.
Find the subset of k points, such that the convex hull of the k points has minimum perimeter out of any convex hull of a subset of k points.
I can think of a naive method runs in O(n^k k log k). (Find the convex hull of every subset of size k and output the minimum).
I think this is an NP-hard problem, but I can't find anything suitable to reduce from.
Anyone have ideas on this problem?
An example,
the set of n=4 points {(0,0), (0,1), (1,0), (2,2)} and k=3
Result:
{(0,0),(0,1),(1,0)}
This set contains 3 points, and the perimeter of its convex hull is smaller than that of the convex hull of any other set of 3 points.
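For reference, the naive method can be sketched in Python as below. It is only practical for tiny inputs, and the helper names (`cross`, `convex_hull`, `perimeter`, `min_perimeter_subset`) are invented for this illustration. The hull is built with Andrew's monotone chain, which is just one convenient choice.

```python
from itertools import combinations
from math import dist

def cross(o, a, b):
    """Twice the signed area of triangle o-a-b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def perimeter(poly):
    """Sum of the edge lengths of a closed polygon."""
    return sum(dist(poly[i], poly[(i + 1) % len(poly)]) for i in range(len(poly)))

def min_perimeter_subset(points, k):
    """Try every k-subset and keep the one whose hull has the least perimeter."""
    return min(combinations(points, k), key=lambda s: perimeter(convex_hull(s)))
```

On the example above, `min_perimeter_subset([(0, 0), (0, 1), (1, 0), (2, 2)], 3)` picks the three points (0,0), (0,1), (1,0), whose hull has perimeter 2 + sqrt(2).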
This can be done in O(kn^3) time and O(kn^2) space (or maybe O(kn^3) if you want the actual points).
This paper: http://www.win.tue.nl/~gwoegi/papers/area-k-gons.pdf
by Eppstein et al, has algorithms to solve this problem for minimum perimeter and other weight functions like area, sum of internal angles etc which follow certain constraints, even though the title says minimum area (See Corollary 5.3 for perimeter).
The basic idea is a dynamic programming approach as follows (read the first few paragraphs of Section 4):
Suppose S is the given set of points and Q is the convex hull of k points with minimum perimeter.
Let p1 be the bottom-most point of Q, p2 and p3 are the next points on the hull in counter-clockwise order.
We can decompose Q into a triangle p1p2p3 and a convex hull of k-1 points Q' (which shares the side p1p3 with triangle p1p2p3).
The main observation is that Q' is the optimal for k-1, in which the bottommost point is p1 and the next point is p3 and all the points of Q' lie on the same side of the line p2->p3.
Thus maintaining a 4d array of optimum polygons for each quadruple (pi, pj, pk, m) such that
the polygon is a convex hull of exactly m points of S.
pi is the bottom most point of the polygon.
pj is the next vertex in counter-clockwise order,
all points of the polygon lie to the left of the line pi -> pj.
all points lie on the same side of pj->pk as pi does.
can help us find the optimum polygons for m=k, given the optimum polygons for m <= k-1.
The paper describes exactly how to go about doing that in order to achieve the stated space and time bounds.
Hope that helps.
It's not exactly a pretty solution. In fact, it's quite a pain to implement, but it surely gives polynomial complexity. Although the complexity is also big (n^5*k is my rough estimate), someone may find a way to improve it or find here an idea for a better solution. Or it may be enough for you: even this complexity is much better than brute force.
Note: the optimal solution (set S) with hull H includes all points from the original set that lie inside H. Otherwise, we could throw away one of the border points of H and include that missed point, reducing the perimeter.
(update just like 'optimization' mbeckish posted)
Assumption: no two points from the set form a vertical line. It can be achieved easily by rotating whole set of points by some irrational angle around origin of coordinates.
Due to the assumption above, any convex hull has one leftmost and one rightmost point. Also, these two points divide the hull into top and bottom parts.
Now, let's take one segment from the top part of this hull and one from the bottom part. Let's call those two segments middle segments and perimeter of the right part of this hull - right perimeter.
Note: those two segments are all we need to know about the right part of our convex hull to continue building it to the left. But having just two points instead of 4 is not enough: we could not uphold the condition of 'convexness' this way.
It leads to a solution. For each set of points {p0, p1, p2, p3} and number i (i <= k) we store minimal right perimeter that can be achieved if [p0, p1], [p2, p3] are two middle segments and i is the number of points in the right part of this solution (including the ones inside of it, not only on the border).
We go through all points from right to left. For each new point p we check all combinations of points {p0, p1, p2, p3} such that point p can continue this hull to the left (either on the top or on the bottom part). For each such set and size i, we already store optimal perimeter size (see paragraph above).
Note: if you add point p to a right-hull formed by points {p0, p1, p2, p3}, you'll increment set size i at least by 1. But sometimes this number will be > 1: you'll have to include all points in the triangle {p, p0, p2}. They aren't on the hull, but inside it.
Algorithm is over :) In addition, despite scary complexity, you may note that not all segments [p0, p1], [p2, p3] can be middle segments: it should reduce actual computation time substantially.
update: This provides only the optimal perimeter size, not the set itself. But finding the set is simple: for each 'state' above you store not only the perimeter size, but also the last point added. Then you can 'trace' your solution back. It's quite a standard trick; I suppose it's not a problem for you, you seem to be good at algorithms :)
update2: This is essentially DP (dynamic programming), only a bit bloated.
One possible optimization: You can ignore any subsets whose convex hull contains points that are not in the subset.
Proof:
If your convex hull contains points that are not in your subset, then remove a point from your subset that is on the hull, and replace it with a point in the interior of the hull. This will yield a hull of equal or smaller perimeter.
In the planar case, you can use an algorithm known as the Jarvis march, which has worst case complexity O(n^2). In this algorithm, you start building a hull at an arbitrary point and then check which point needs to be added next. Pseudocode taken from wikipedia:
jarvis(S)
    pointOnHull = leftmost point in S
    i = 0
    repeat
        P[i] = pointOnHull
        endpoint = S[0] // initial endpoint for a candidate edge on the hull
        for j from 1 to |S|-1
            if (S[j] is on left of line from P[i] to endpoint)
                endpoint = S[j] // found greater left turn, update endpoint
        i = i+1
        pointOnHull = endpoint
    until endpoint == P[0] // wrapped around to first hull point
As far as I understand it, convex hulls are unique to each set of points, so there is no need to find a minimum. You just find one, and it will be the smallest one by definition.
Edit
The posted solution solves for the convex hull with the fewest number of points. Any hull with more points will have a longer perimeter, and I misunderstood the question as seeking a minimum perimeter, instead of a minimum perimeter for a set with K points.
This new problem is probably NP-hard as suspected, and most similar to the longest path problem. Unfortunately, I lack the ingenuity to provide a worthwhile reduction.

Resources