Tangents range for all pairs of points in a box - algorithm

Suppose i have a box with a lot of points. I need to be able to calculate min and max angles for all lines which go through all possible pairs of the points. I can do it in O(n^2) times by just enumerating every point with all others. But is there faster algorithm?

Taking the idea of dual plane proposed by Evgeny Kluev, and my comment about finding left-most intersection point, I'll try to give an equivalent direct solution without any dual space.
The solution is simple: sort your points by (x, y) lexicographically. Now draw a line through each two adjacent points in the sorted order. It can be proved that the minimal angle is achieved by one of these lines. In order to get maximal angle, you need to sort by (x, -y) lexicographically, and also check only adjacent pairs of points.
Let's prove by the idea for min angle. Consider the two points A and B which yield the minimal possible angle. Among such points we can choose the pair with minimal difference of x coordinates.
Suppose that they have same y. If there is no other point between them, then they are adjacent. If there are any points between them, then clearly at least one of them is adjacent to A in our order, and all of them yield the same angle.
Suppose that there exists a point P with x-coordinate in between A and B, i.e. Ax < Px < Bx. If P lies on AB, then AP has same angle but less difference of x coordinates, hence a contradiction. When P is not on AB, then either AP or PB would give you less angle, which also gives contradiction.
Now we have points A and B lying on two adjacent vertical lines. There are no other points between these lines. If A and B are the only points on their vertical lines, then the AB pair is clearly adjacent in sorted order and QED. If there many points on these lines, obviously the minimal angle is achieved by taking the highest point on the left vertical line (which must be A) and the lowest point on the right vertical line (which must be B). Since we sort points of equal x by y, these two points are also adjacent.

Sort the points (or use hash map) to find out if there are any horizontal lines.
Then solve this problem on dual plane. Here you only need to find the leftmost and the rightmost intersection points. Use binary searches to find a pair of horizontal coordinates such that all intersection points are between them. (You could quickly find approximate results just by continuing binary searches from these coordinates).
Then sort lines according to their tangents on dual plane. And for pairs of adjacent lines in this sorted order find intersections closest to those horizontal coordinates. This does not guarantee good complexity in the worst case (when some lines on primal plane are almost horizontal). But in most cases time complexity would be determined by sorting: O(N log N) + O(binary_search_complexity).

Related

Closest pair of points across a line

I have two sets of 2D points, separated from each other by a line in the plane. I'd like to efficiently find the pair of points, consisting of one point from each set, with the minimum distance between them. There's a really convenient looking paper by Radu Litiu, Closest Pair for Two Separated Sets of Points, but it uses an L1 (Manhattan) distance metric instead of Euclidean distance.
Does anyone know of a similar algorithm which works with Euclidean distance?
I can just about see an extension of the standard divide & conquer closest pair algorithm -- divide the two sets by a median line perpendicular to the original splitting line, recurse on the two sides, then look for a closer pair consisting of one point from each side of the median. If the minimal distance from the recursive step is d, then the companion for a point on one side of the median must lie within a box of dimensions 2d*d. But unlike with the original algorithm, I can't see any way to bound the number of points within that box, so the algorithm as a whole just becomes O(m*n).
Any ideas?
Evgeny's answer works, but it's a lot of effort without library support: compute a full Voronoi diagram plus an additional sweep line algorithm. It's easier to enumerate for both sets of points the points whose Voronoi cells intersect the separating line, in order, and then test all pairs of points whose cells intersect via a linear-time merge step.
To compute the needed fragment of the Voronoi diagram, assume that the x-axis is the separating line. Sort the points in the set by x-coordinate, discarding points with larger y than some other point with equal x. Begin scanning the points in order of x-coordinate, pushing them onto a stack. Between pushes, if the stack has at least three points, say p, q, r, with r most recently pushed, test whether the line bisecting pq intersects the separating line after the line bisecting qr. If so, discard q, and repeat the test with the new top three. Crude ASCII art:
Case 1: retain q
------1-2-------------- separating line
/ |
p / |
\ |
q-------r
Case 2: discard q
--2---1---------------- separating line
\ /
p X r
\ /
q
For each point of one set find closest point in other set. While doing this, keep only one pair of points having minimal distance between them. This reduces given problem to other one: "Algorithm to find for all points in set A the nearest neighbor in set B", which could be solved using sweep line algorithm over (1) one set of points and (2) Voronoi diagram for other set.
Algorithm complexity is O((M+N) log M). And this algorithm does not use the fact that two sets of points are separated from each other by a line.
well what about this:
determine on which side any point is:
let P be your points (P0,...Pi,...Pn)
let A,B be the separator line start-end points
so: side(Pi) = signum of ((B-A).(Pi-A))
this is based on simple fact that signum of scalar vector multiplication (dot product) depends on the order of points (see triangle/polygon winding rule for more info)
find minimal distance of any (Pi,Pj) where side(Pi)!=side(pj)
so first compute all sides for all points O(N)
then cycle through all Pi and inside that for
cycle through all Pj and search for min distance.
if the Pi and Pj groups aprox. equal size tahn it is O((N/2)^2)
you can further optimize the search by 'sort' the points Pi,Pj by 'distance' from AB
you can use another dot product to do that, this time instead (B-A)
use perpendicular vector to it let say (C-A)
discard all points from Pi2 (and similar Pj2 also)
where ((B-A).(P(i1)-A)) is close to ((B-A).(P(i2)-A))
and |((C-A).(P(i1)-A))| << |((C-A).(P(i2)-A))|
beacuese that means that Pi2 is behind Pi1 (farer from AB)
and close to the normal of AB going near Pi1
complexity after this optimization strongly depend on the dataset.
should be O(N+(Ni*Nj)) where Ni/Nj is number of remaining points Pi/Pj
you need 2N dot products, Ni*Nj distance comparision (do not neet to be sqrt-ed)
A typical approach to this problem is a sweep-line algorithm. Suppose you have a coordinate system that contains all points and the line separating points from different sets. Now imagine a line perpendicular to the separating line hopping from point to point in ascending order. For convenience, you may rotate and translate the point set and the separating line such that the separating line equals the x-axis. The sweep-line is now parallel with the y-axis.
Hopping from point to point with the sweep-line, keep track of the shortest distance of two points from different sets. If the first few points are all from the same set, it's easy to find a formula that will tell you which one you'll have to remember until you hit the first point from the other set.
Suppose you have a total of N points. You will have to sort all points in O(N*log(N)). The sweep-line algorithm itself will run in O(N).
(I'm not sure if this bears any similarity to David's idea...I only saw it now after I logged in to post my thoughts.) For the sake of the argument, let's say we transposed everything so the dividing line is the x axis and sorted our points by the x coordinate. Assuming N is not too large, if we scan along the x-axis (that is, traverse our sorted list of a's and b's), we can keep a record of the overall minimum and two lists of passed points. The current point in the scan is tested against each passed point from the other list while the distance from the point in the list to (x-coordinate of our scan,0) is greater than or equal to the overall min. In the example below, when reaching b2, we can stop testing at a2.
scan ->
Y
| a2
|
| a1 a3
X--------------------------
| b1 b3
| b2

Algorithm to take the union of rectangles and to see if the union is still a rectangle

I have a problem in which I have to test whether the union of given set of rectangles forms
a rectangle or not. I don't have much experience solving computational geometry problems.
What my approach to the problem was that since I know the coordinates of all the rectangles, I can easily sort the points and then deduce the corner points of the largest rectangle possible. Then I could sweep a line and see if all the points on the line falls inside the rectangle. But, this approach is flawed and this would fail because the union may be in the form of a 'U'.
I would be a great help if you could push me in the right direction.
Your own version does not take into account that the edges of the rectangles can be non-parallel to each other. Therefore, there might not be "largest rectangle possible".
I would try this general approach:
1) Find the convex hull. You can find convex hull calculation algorithms here http://en.wikipedia.org/wiki/Convex_hull_algorithms.
2) Check if the convex hull is a rectangle. You can do this by looping through all the points on convex hull and checking if they all form 180 or 90 degree angles. If they do not, union is not a rectangle.
3) Go through all points on the convex hull. For each point check if the middle point between ThisPoint and NextPoint lies on the edge of any initially given rectangle.
If every middle point does, union is a rectangle.
If it does not, union is not a rectangle.
Complexity would be O(n log h) for finding convex hull, O(h) for the second part and O(h*n) for third part, where h is number of points on the convex hull.
Edit:
If the goal is to check if the resulting object is a filled rectangle, not only edges and corners rectangle then add step (4).
4) Find all line segments that are formed by intersecting or touching rectangles. Note - by definition all of these line segments are segments of edges of given rectangles. If a rectangle does not touch/intersect other rectangles, the line segments are it's edges.
For each line segment check if it's middle point is
On the edge of the convex hull
Inside one of given rectangles
On the edge of two non-overlapping given rectangles.
If at least one of these is true for every line segment, resulting object is a filled rectangle.
You could deduce the he corner points of the largest rectangle possible, and then go over all the rectangle that share the border with the largest possible rectangle, for example the bottom, and make sure that the line is entirely contained in their borders. This will also fail if an empty space in the middle of the rectangle is a problem, however. I think the complexity will be O(n2).
I think you are on the right direction. After you get the coordinates of largest possible rectangle,
If the largest possible rectangle is a valid rectangle, then each side of it must be union of sides of original rectangles. You can scan the original rectangle set, find those rectangles that is a part of the largest side we are looking for (this can be done in O(n) by checking if X==largestRectangle.Top.X if you are looking at top side, etc.), lets call them S.
For each side s in S we can create an interval [from,to]. All we need to check is whether the union of all intervals matches the side of the largest Rectangle. This can be done in O(nlog(n)) by standard algorithms, or on average O(n) by some hash trick (see http://www.careercup.com/question?id=12523672 , see my last comment (of the last comment) there for the O(n) algorithm ).
For example, say we got two 1*1 rectangles in the first quadrant, there left bottom coordinates are (0,0) and (1,0). Largest rectangle is 2*1 with left bottom coordinate (0,0). Since [0,1] Union [1,2] is [0,2], top side and bottom side match the largest rectangle, similar for left and right side.
Now suppose we got an U shape. 3*1 at (0,0), 1*1 at (0,1), 1*1 at (2,1), we got largest rectangle 3*2 at (0,0). Since for the top side we got [0,1] Union [1,3] does not match [0,3], the algorithm will output the union of above rectangles is not a rectangle.
So you can do this in O(n) on average, or O(nlog(n)) at least if you don't want to mess with some complex hash bucket algorithm. Much better than O(n^4)!
Edit: We have a small problem if there exists empty space somewhere in the middle of all rectangles. Let me think about it....
Edit2: An easy way to detect empty space is for each corner of a rectangle which is not a point on the largest rectangle, we go outward a little bit for all four directions (diagonal) and check if we are still in any rectangle. This is O(n^2). (Which ruins my beautiful O(nlog(n))! Can anyone can come up a better idea?
I haven't looked at a similar problem in the past, so there maybe far more efficient ways of doing it. The key problem is that you cannot look at containment of one rectangle in another in isolation since they could be adjacent but still form a rectangle, or one rectangle could be contained within multiple.
You can't just look at the projection of each rectangle on to the edges of the bounding rectangle unless the problem allows you to leave holes in the middle of the rectangle, although that is probably a fast initial check that could be performed before the following exhaustive approach:
Running through the list once, calculating the minimum and maximum x and y coordinates and the area of each rectangle
Create an input list containing your input rectangles ordered by descending size.
Create a work list containing the bounding rectangle initially
While there are rectangles in the work list
Take the largest rectangle in the input list R
Create an empty list for fragments
for each rectangle r in the work list, intersect r with R, splitting r into a rectangular portion contained within R (if any) and zero or more rectangles not within R. If r was split, discard the portion contained within R and add the remaining rectangles to the fragment list.
add the contents of the fragment list to the work list
Assuming your rectangles are aligned to the coordinate axis:
Given two rectangles A, B, you can make a function that subtracts B from A returning a set of sub-rectangles of A (that may be the empty set): Set = subtract_rectangle(A, B)
Then, given a set of rectangles R for which you want to know if their union is a rectangle:
Calculate a maximum rectangle Big that covers all the rectangles as ((min_x,min_y)-(max_x,max_y))
make the set S contain the rectangle Big: S = (Big)
for every rectangle B in R:
S1 = ()
for evey rectangle A in S:
S1 = S1 + subtract_rectangle(A, B)
S = S1
if S is empty then the union of the rectangles is a rectangle.
End, S contains the parts of Big not covered by any rectangle from R
If the rectangles are not aligned to the coordinate axis you can use a similar algorithm but that employs triangles instead of rectangles. The only issues are that subtracting triangles is not so simple to implement and that handling numerical errors can be difficult.
A simple approach just came to mind: If two rectangles share an edge[1], then together they form a rectangle which contains both - either the rectangles are adjacent [][ ] or one contains the other [[] ].
So if the list of rectangles forms a larger rectangle, then all you need it to repeatedly iterate over the rectangles, and "unify" pairs of them into a single larger one. If in one iteration you can unify none, then it is not possible to create any larger rectangle than you already have, with those pieces; otherwise, you will keep "unifying" rectangles until a single is left.
[1] Share, as in they have the same edge; it is not enough for one of them to have an edge included in one of the other's edges.
efficiency
Since efficiency seems to be a problem, you could probably speed it up by creating two indexes of rectangles, one with the larger edge size and another with the smaller edge size.
Then compare the edges with the same size, and if they are the same unify the two rectangles, remove them from the indexes and add the new rectangle to the indexes.
You can probably speed it up by not moving to the next iteration when you unify something, but to proceed to the end of the indexes before reiterating. (Stopping when one iteration does no unifications, or there is only one rectangle left.)
Additionally, the edges of a rectangle resulting from unification are by analysis always equal or larger than the edges of the original rectangles.
So if the indexes are ordered by ascending edge size, the new rectangle will be inserted in either the same position as you are checking or in positions yet to be checked, so each unification will not require an extra iteration cycle. (As the new rectangle will assuredly not unify with any rectangle previously checked in this iteration, since its edges are larger than all edges checked.)
For this to hold, in each step of a particular iteration you need to attempt unification on the next smaller edge from either of the indexes:
If you're in index1=3 and index2=6, you check index1 and advance that index;
If next edge on that index is 5, next iteration step will be in index1=5 and index2=6, so it will check index1 and advance that index;
If next edge on that index is 7, next iteration step will be in index1=7 and index2=6, so it will check index2 and advance that index;
If next edge on that index is 10, next iteration step will be in index1=7 and index2=10, so it will check index1 and advance that index;
etc.
examples
[A ][B ]
[C ][D ]
A can be unified with B, C with D, and then AB with CD. One left, ABCD, thus possible.
[A ][B ]
[C ][D ]
A can be unified with B, C with D, but AB cannot be unified with CD. 2 left, AB and CD, thus not possible.
[A ][B ]
[C ][D [E]]
A can be unified with B, C with D, CD with E, CDE with AB. 1 left, ABCDE, thus possible.
[A ][B ]
[C ][D ][E]
A can be unified with B, C with D, CD with AB, but not E. 2 left, ABCD and E, thus not possible.
pitfall
If a rectangle is contained in another but does not share a border, this approach will not unify them.
A way to address this is, when one hits an iteration that does not unify anything and before concluding that it is not possible to unify the set of rectangles, to get the rectangle with the widest edge and discard from the indexes all others that are contained within this largest rectangle.
This still does not address two situations.
First, consider the situation where with this map:
A B C D
E F G H
we have rectangles ACGE and BDFH. These rectangles share no edge and are not contained, but form a larger rectangle.
Second, consider the situation where with this map:
A B C D
E F G H
I J K L
we have rectangles ABIJ, CDHG and EHLI. They do not share edges, are not contained within each-other, and no two of them can be unified into a single rectangle; but form a rectangle, in total.
With these pitfalls this method is not complete. But it can be used to greatly reduce the complexity of the problem and reduce the number of rectangles to analyse.
Maybe...
Gather up all the x-coordinates in a list, and sort them. From this list, create a sequence of adjacent intervals. Do the same thing for the y-coordinates. Now you've got two lists of intervals. For each pair of intervals (A=[x1,x2] from the x-list, B=[y1,y2] from the y-list), make their product rectangle A x B = (x1,y1)-(x2,y2)
If every single product rectangle is contained in at least one of your initial rectangles, then the union must be a rectangle.
Making this efficient (I think I've offered about an O(n4) algorithm) is a different question entirely.
As jva stated, "Your own version does not take into account that the edges of the rectangles can be non-parallel to each other." This answer also assumes "parallel" rectangles.
If you have a grid as opposed to needing infinite precision, depending on the number and sizes of the rectangles and the granularity of the grid, it might be feasible to brute-force it.
Just take your "largest rectangle possible" and test all its points to see whether each point is in at least one of the smaller rectangles.
I finally was able to find the impressive javascript project (thanks to github search :) !)
https://github.com/evanw/csg.js
Also have a look into my answer here with other interesting projects
General case, thinking in images:
| outer_rect - union(inner rectangles) |
Check that result is zero

skyline algorithm for triangles

I am trying to write an algorithm to find the upper envelop (skyline) of triangles.
By following you can see the skyline of rectangles:
I have an algorithm for merging two skylines(L and R) of rectangles as follows:
represent each rectangle by (x1, x2, h) where h is height and x2-x1 is width
represent each skyline by a list of couple (x, h)
for i= min(L[ 1].x, R[ 1].x) to max(L[size of L].x, R[size of R].x) choose max(L[i].h, R[i].h)
Now, my question is how can I represent triangle and how I can merge skylines of two triangles
any idea will be appreciated
In the following I assume that the triangles are with their baseline on the bottom. I'm also assuming that all triangles are such that the upper corner is above the baseline (i.e. if you go straight down from the upper corner, you get inside the triangle, not outside). However I'm not assuming that only symmetric triangles are allowed.
Actually a merging of triangles will give a skyline where simply points are connected with lines. so the representation of a triangle skyline could just be a ordered list of points (x_i, y_i) with the restriction that y_0 = 0 and y_N = 0 where N is the index of the last point. A single triangle would then be represented by the three-element list (x_0, 0), (x_1, h), (x2,0) where x_0 and x_2 are the left and right endpoint (the two points where the triangle reaches 0), x_1 gives the horizontal position of the upper corner, and h gives the height.
The merging of two skylines can then for example be done as follows:
Step 1: For each line segment (x_i, y_i)--(x_{i+1}, y_{i+1}) from skyline 1 and each line segment (x_j,y_j)--(x_{j+1},y_{j+1}) calculate whether they intersect, and if so, where (this means solving a simple system of two linear equations). Collect the intersection points into a new list, intersections. So now you have three lists of points: skyline1, skyline2 and intersections. Since all intersections will be part of the skyline, use that as the basis for the new Skyline. (a special case is when both skylines agree over an interval, but in such intervals the combined skyline is the same as each single one anyway, so just use the start and end points of those intervals as intersection points)
Now for each pair of intersection points (and also left of the first intersection and right of the last intersection), there will be always exactly one skyline which is above the other (unless they agree, but then it doesn't matter which you choose). Add the points in the interval from that skyline to your combined skyline. You find out the larger one by just choosing an arbitrary point of one skyline (except if the intersection point is also a skyline point, that one should not be chosen) and detect the height of the other skyline at its x value (if the other skyline also has a point at the same x value, it's a simple comparison of the y value, otherwise you have to interpolate the y values of the preceding and following points).
After doing that, you should have the correct combined skyline.

Intersection of N rectangles

I'm looking for an algorithm to solve this problem:
Given N rectangles on the Cartesian coordinate, find out if the intersection of those rectangles is empty or not. Each rectangle can lie in any direction (not necessary to have its edges parallel to Ox and Oy)
Do you have any suggestion to solve this problem? :) I can think of testing the intersection of each rectangle pair. However, it's O(N*N) and quite slow :(
Abstract
Either use a sorting algorithm according to smallest X value of the rectangle, or store your rectangles in an R-tree and search it.
Straight-forward approach (with sorting)
Let us denote low_x() - the smallest (leftmost) X value of a rectangle, and high_x() - the highest (rightmost) X value of a rectangle.
Algorithm:
Sort the rectangles according to low_x(). # O(n log n)
For each rectangle in sorted array: # O(n)
Finds its highest X point. # O(1)
Compare it with all rectangles whose low_x() is smaller # O(log n)
than this.high(x)
Complexity analysis
This should work on O(n log n) on uniformly distributed rectangles.
The worst case would be O(n^2), for example when the rectangles don't overlap but are one above another. In this case, generalize the algorithm to have low_y() and high_y() too.
Data-structure approach: R-Trees
R-trees (a spatial generalization of B-trees) are one of the best ways to store geospatial data, and can be useful in this problem. Simply store your rectangles in an R-tree, and you can spot intersections with a straightforward O(n log n) complexity. (n searches, log n time for each).
Observation 1: given a polygon A and a rectangle B, the intersection A ∩ B can be computed by 4 intersection with half-planes corresponding to each edge of B.
Observation 2: cutting a half plane from a convex polygon gives you a convex polygon. The first rectangle is a convex polygon. This operation increases the number of vertices at most per 1.
Observation 3: the signed distance of the vertices of a convex polygon to a straight line is a unimodal function.
Here is a sketch of the algorithm:
Maintain the current partial intersection D in a balanced binary tree in a CCW order.
When cutting a half-plane defined by a line L, find the two edges in D that intersect L. This can be done in logarithmic time through some clever binary or ternary search exploiting the unimodality of the signed distance to L. (This is the part I don't exactly remember.) Remove all the vertices on the one side of L from D, and insert the intersection points to D.
Repeat for all edges L of all rectangles.
This seems like a good application of Klee's measure. Basically, if you read http://en.wikipedia.org/wiki/Klee%27s_measure_problem there are lower bounds on the runtime of the best algorithms that can be found for rectilinear intersections at O(n log n).
I think you should use something like the sweep line algorithm: finding intersections is one of its applications. Also, have a look at answers to this questions
Since the rectangles must not be parallel to the axis, it is easier to transform the problem to an already solved one: compute the intersections of the borders of the rectangles.
build a set S which contains all borders, together with the rectangle they're belonging to; you get a set of tuples of the form ((x_start,y_start), (x_end,y_end), r_n), where r_n is of course the ID of the corresponding rectangle
now use a sweep line algorithm to find the intersections of those lines
The sweep line stops at every x-coordinate in S, i.e. all start values and all end values. For every new start coordinate, put the corresponding line in a temporary set I. For each new end-coordinate, remove the corresponding line from I.
Additionally to adding new lines to I, you can check for each new line whether it intersects with one of the lines currently in I. If they do, the corresponding rectangles do, too.
You can find a detailed explanation of this algorithm here.
The runtime is O(n*log(n) + c*log(n)), where c is the number of intersection points of the lines in I.
Pick the smallest rectangle from the set (or any rectangle), and go over each point within it. If one of it's point also exists in all other rectangles, the intersection is not empty. If all points are free from ALL other rectangles, the intersection is empty.

Dividing a plane of points into two equal halves [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 3 years ago.
Improve this question
Given a 2 dimensional plane in which there are n points. I need to generate the equation of a line that divides the plane such that there are n/2 points on one side and n/2 points on the other.
I have assumed the points are distinct, otherwise there might not even be such a line.
If points are distinct, then such a line always exists and is possible to find using a deterministic O(nlogn) time algorithm.
Say the points are P1, P2, ..., P2n. Assume they are not all on the same line. If they were, then we can easily form the splitting line.
First translate the points so that all the co-ordinates (x and y) are positive.
Now suppose we magically had a point Q on the y-axis such that no line formed by those points (i.e. any infinite line Pi-Pj) passes through Q.
Now since Q does not lie within the convex hull of the points, we can easily see that we can order the points by a rotating line passing through Q. For some angle of rotation, half the points will lie on one side and the other half will lie on the other of this rotating line, or, in other words, if we consider the points being sorted by the slope of the line Pi-Q, we could pick a slope between the (median)th and (median+1)th points. This selection can be done in O(n) time by any linear time selection algorithm without any need for actually sorting the points.
Now to pick the point Q.
Say Q was (0,b).
Suppose Q was collinear with P1 (x1,y1) and P2 (x2,y2).
Then we have that
(y1-b)/x1 = (y2-b)/x2 (note we translated the points so that xi > 0).
Solving for b gives
b = (x1y2 - y1x2)/(x1-x2)
(Note, if x1 = x2, then P1 and P2 cannot be collinear with a point on the Y axis).
Consider |b|.
|b| = |x1y2 - y1x2| / |x1 -x2|
Now let the xmax be the x-coordinate of the rightmost point and ymax the co-ordinate of the topmost.
Also let D be the smallest non-zero x-coordinate difference between two points (this exists, as not all xis are same, as not all points are collinear).
Then we have that |b| <= xmax*ymax/D.
Thus, pick our point Q (0,b) to be such that |b| > b_0 = xmax*ymax/D
D can be found in O(nlogn) time.
The magnitude of b_0 can get quite large and we might have to deal with precision issues.
Of course, a better option is to pick Q randomly! With probability 1, you will find the point you need, thus making the expected running time O(n).
If we could find a way to pick Q in O(n) time (by finding some other criterion), then we can make this algorithm run in O(n) time.
Create an arbitrary line in that plane. Project each point onto that line a.k.a for each point, get the closest point on that line to that point.
Order those points along the line in either direction, and choose a point on that line such that there is an equal number of points on the line in either direction.
Get the line perpendicular to the first line which passes through that point. This line will have half the original points on either side.
There are some cases to avoid when doing this. Most importantly, if all the point are themselves on a single line, don't choose a perpendicular line which passes through it. In fact, choose that line itself so you don't have to worry about projecting the points. In terms of the actual mathematics behind this, vector projections will be very useful.
This is a modification of Dividing a plane of points into two equal halves which allows for the case with overlapping points (in which case, it will say whether or not the answer exists).
If number of points is odd, return "impossible".
Pick a random line (two random points)
Project all points onto this line (`O(N)` operation)
(i.e. we pretend this line is our new X'-axis, and write down the
X'-coordinate of each point)
Perform any median-finding algorithm on the X'-coordinates
(`O(N)` or faster-if-desired operation)
(returns 2 medians if no overlapping points)
Return the line perpendicular to original random line that splits the medians
In rare case of overlapping points, repeat a few times (it would take
a pathological case to prevent a solution from existing).
This is O(N) unlike other proposed solutions.
Assuming a solution exists, the above method will probably terminate, though I don't have a proof.
Try the above algorithm a few times unless you detect overlapping points. If you detect a high number of overlapping points, you may be in for a rough ride, but there is a terribly inefficient brute-force solution that involves checking all possible angles:
For every "critical slope range", perform the above algorithm
by choosing a line with a slope within the range.
If all critical slope ranges fail, the solution is impossible.
A critical angle is defined as the angle which could possibly change the result (imagine the solution to a previous answer, rotate the entire set of points until one or more points swaps position with one or more other points, crossing the dividing line. There are only finitely many of these, and I think they are bounded by the number of points, so you're probably looking at something in the range O(N^2)-O(N^2 log(N)) if you have overlapping points, for a brute-force approach.
I'd guess that a good way is to sort/sequence/order the points (e.g. from left to right), and then choose a line which passes through (or between) the middle point[s] in the sequence.
There are obvious cases where no solution is possible. E.g. when you have three heaps of points. One point at location A, Two points at location B, and five points at location C.
If you expect some decent distribution, you can probably get a good result with tlayton's algorithm. To select the initial line slant, you could determine the extent of the whole point set, and choose the angle of the largest diagonal.
The median equally divides a set of numbers in the manner similar to what you're trying to accomplish, and it can be computed in O(n) time using a selection algorithm (the writeup in Cormen et al is better, so you may want to look there instead). So, find the median of your x values Mx (or your y values My if you prefer) and set x = Mx (or y = My) and that line will be axially aligned and split your points equally.
If the nature of your problem requires that no more than one point lies on the line (if you have an odd number of points in your set, at least one of them will be on the line) and you discover that's what's happened (or you just want to guard against the possibility), rotate all of your points by some random angle, θ, and compute the median of the rotated points. You then rotate the median line you computed by -θ and it will evenly divide points.
The likelihood of randomly choosing θ such that the problem manifests itself again is very small with a finite number of points, but if it does, try again with a different θ.
Here is how I approach this problem (with the assumption that n is even and NO three points are collinear):
1) Pick up the point with smallest Y value. Let's call it point P.
2) Take this point as the new origin point, so that all other points will have positive Y values after this transformation.
3) For every other point (there are n - 1 points remaining), think it under the polar coordinate system. Each other point can be represented with a radius and angle. You could ignore the radius and just focus on the angle.
4) How can you find a line that split the points evenly? Find the median of (n - 1) angles. The line from point P to the point with that median angle will split the points evenly.
Time complexity for this algorithm is O(n).
I dont know how useful this is I have seen a similar problem...
If you already have the directional vector (aka the coefficients of the dimensions of your plane).
You can then find two points inside that plane, and by simply using the midpoint formula you can find the midpoint of that plane.
Then using the coefficients of that plane and the midpoint you can find a plane that is from equal distance from both points, using the general equation of a plane.
A line then would constitute in expressing one variable in terms of the other
so you would find a line with equal distance between both planes.
There are different methods of doing this such as projection using the distance equation from a plane but I believe that would complicate your math a lot.
To add to M's answer: a method to generate a Q (that's not so far away) in O(n log n).
To begin with, let Q be any point on the y-axis ie. Q = (0,b) - some good choices might be (0,0) or (0, (ymax-ymin)/2).
Now check if there are two points (x1, y1), (x2, y2) collinear with Q. The line between any point and Q is y = mx + b; since b is constant, this means two points are collinear with Q if their slopes m are equal. So determine the slopes mi for all points and check if there are any duplicates: (amoritized O(n) using a hash-table)
If all the m's are distinct, we're done; we found Q, and M's algorithm above generates the line in O(n) steps.
If two points are collinear with Q, we'll move Q up just a tiny amount ε, Qnew = (0, b + ε), and show that Qnew will not be collinear with two other points.
The criterion for ε, derived below, is:
ε < mminΔ*xmin
To begin with, our m's look like this:
mi = yi/xi - b/xi
Let's find the minimum difference between any two distinct mi and call it mminΔ (O(n log n) by, for instance, sorting then comparing differences between mi and i+1 for all i)
If we fudge b up by ε, the new equation for m becomes:
mi,new = yi/xi - b/xi - ε/xi
= mi,old - ε/xi
Since ε > 0 and xi > 0, all m's are reduced, and all are reduced by a maximum of ε/xmin. Thus, if
ε/xmin < mminΔ, ie.
ε < mminΔ*xmin
is true, then two mi which were previously unequal will be guaranteed to remain unequal.
All that's left is to show that if m1,old = m2,old, then m1,new =/= m2,new. Since both mi were reduced by an amount ε/xi, this is equivalent to showing x1 =/= x2. If they were equal, then:
y1 = m1,oldx1 + b = m2,oldx2 + b = y2
Contradicting our assumption that all points are distinct. Thus, m1, new =/= m2, new, and no two points are collinear with Q.
I picked up the idea from Moron and andand and
continued to form a deterministic O(n) algorithm.
I also assumed that the points are distinct and
n is even (thought the algorithm can be
changed so that uneven n with one point
on the dividing line are also supported).
The algorithm tries to divide the points with a vertical line between them. This only fails if the points in the middle have the same x value. In that case the algorithm determines how many points with the same x value have to be on the left and lower site and and accordingly rotates the line.
I'll try to explain with an example.
Let's asume we have 16 points on a plane.
First we need to get the point with the 8th greatest x-value
and the point with the 9th greatest x-value.
With a selection algorithm this is possible in O(n),
as pointed out in another answer.
If the x-value of that points is different, we are done.
We create a vertical line between that two points and
that splits the points equal.
Problematically now is if the x-values are equal. So we have 3 sets of points.
That on the left side (x < xa), in the middle (x = xa)
and that on the right side (x > xa).
The idea now is to count the points on the left side and
calculate how many points from the middle needs to go there,
so that half of the points are on that side. We can ignore the right side here
because if we have half of the points on the left side, the over half must be on the right side.
So let's asume we have we have 3 points (c=3) on the left side,
6 in the middle and 7 on the right side
(the algorithm doesn't know the count from the middle or right side,
because it doesn't need it, but we could also determine it in O(n)).
We need 8-3=5 points from the middle to go on the left side.
The points we already got in the first step are useless now,
because they are only determined by the x-value
and can be any of the points in the middle.
We want the 5 points from the middle with the lowest y-value on the left side and
the point with the highest y-value on the right side.
Again using the selection algorithm, we get the point with the 5th greatest y-value
and the point with the 6th greatest y-value.
Both points will have the x-value equal to xa,
else we wouldn't get to this step,
because there would be a vertical line.
Now we can create the point Q in the middle of that two points.
Thats one point from the resulting line.
Another point is needed, so that no points from the left or right side are divided.
To get that point we need the point from the left side,
that has the lowest angle (bh) between the the vertical line at xa
and the line determined by that point and Q.
We also need that point from the right side (with angle ag).
The new point R is between the point with the lower angle
and a point on the vertical line
(if the lower angle is on the left side a point above Q
and if the lower angle is on the right side a point below Q).
The line determined by Q and R divides the points in the middle
so that there are a even number of points on both sides.
It doesn't divide any points on the left or right side,
because if it would that point would have a lower angle and
would have been choosen to calculate R.
From the view of a mathematican that should work well in O(n).
For computer programs it is fairly easy to find a case
where precision becomes a problem. An example with 4 points would be
A(0, 100000000), B(0, 100000001), C(0, 0), D(0.0000001, 0).
In this example Q would be (0, 100000000.5) and R (0.00000005, 0).
Which gives B and C on the left side and A and D on the right side.
But it is possible that A and B are both on the dividing line,
because of rounding errors. Or maybe only one of them.
So it belongs to the input values if this algorithm suits to the requirements.
get that two points Pa(xa, ya) and Pb(xb, yb)
which are the medians based on the x values > O(n)
if xa != xb you can stop here
because a y-axis parallel line between that two points is the result > O(1)
get all points where the x value equals xa > O(n)
count points with x value less than xa as c > O(n)
get the lowest point Pc based on the y values from the points from 3. > O(n)
get the greatest point Pd based on the y values from the points from 3. > O(n)
get the (n/2-c)th greatest point Pe based on the y values from the points from 3. > O(n)
also get the next greatest point Pf based on the y values from the points from 3. > O(n)
create a new point Q (xa, (ye+yf)/2)
between Pe and Pf > O(1)
for all points Pi calculate
the angle ai between Pc, Q and Pi and
the angle bi between Pd, Q and Pi > O(n)
get the point Pg with the lowest angle ag (with ag>0° and ag<180°) > O(n)
get the point Ph with the lowest angle bh (with bh>0° and bh<180°) > O(n)
if there aren't any Pg or Ph (all points have same x value)
create a new point R (xa+1, 0) anywhere but with a different x value than xa
else if ag is lower than bh
create a new point R ((xc+xg)/2, (yc+yg)/2) between Pc and Pg
else
create a new point R ((xd+xh)/2, (yd+yh)/2) between Pd and Ph > O(1)
the line determined by Q and R divides the points > O(1)

Resources