Skyline algorithm for triangles

I am trying to write an algorithm to find the upper envelope (skyline) of a set of triangles.
The following shows the skyline of rectangles:
I have an algorithm for merging two skylines (L and R) of rectangles, as follows:
Represent each rectangle by (x1, x2, h), where h is the height and x2 - x1 is the width.
Represent each skyline by a list of pairs (x, h).
For i = min(L[1].x, R[1].x) to max(L[size of L].x, R[size of R].x), choose max(L[i].h, R[i].h).
Now, my question is: how can I represent a triangle, and how can I merge the skylines of two triangles?
Any ideas will be appreciated.

In the following I assume that the triangles have their baseline at the bottom. I'm also assuming that all triangles are such that the upper corner is above the baseline (i.e. if you go straight down from the upper corner, you get inside the triangle, not outside). However, I'm not assuming that only symmetric triangles are allowed.
Merging triangles gives a skyline in which points are simply connected by straight lines. So the representation of a triangle skyline can just be an ordered list of points (x_i, y_i) with the restriction that y_0 = 0 and y_N = 0, where N is the index of the last point. A single triangle is then represented by the three-element list (x_0, 0), (x_1, h), (x_2, 0), where x_0 and x_2 are the left and right endpoints (the two points where the triangle reaches 0), x_1 gives the horizontal position of the upper corner, and h gives the height.
The merging of two skylines can then for example be done as follows:
Step 1: For each line segment (x_i, y_i)--(x_{i+1}, y_{i+1}) from skyline 1 and each line segment (x_j, y_j)--(x_{j+1}, y_{j+1}) from skyline 2, calculate whether they intersect, and if so, where (this means solving a simple system of two linear equations). Collect the intersection points into a new list, intersections. So now you have three lists of points: skyline1, skyline2 and intersections. Since all intersections will be part of the skyline, use that list as the basis for the new skyline. (A special case is when both skylines agree over an interval, but in such intervals the combined skyline is the same as each single one anyway, so just use the start and end points of those intervals as intersection points.)
Step 2: Between each pair of adjacent intersection points (and also left of the first intersection and right of the last intersection), there will always be exactly one skyline which is above the other (unless they agree, but then it doesn't matter which you choose). Add the points of that skyline in the interval to your combined skyline. You find the higher one by choosing an arbitrary point of one skyline in the interval (except that if an intersection point is also a skyline point, that one should not be chosen) and determining the height of the other skyline at its x value. If the other skyline also has a point at the same x value, it's a simple comparison of the y values; otherwise you have to interpolate between the y values of the preceding and following points.
After doing that, you should have the correct combined skyline.
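A minimal sketch of such a merge, in Python. It is a simplified variant rather than a literal transcription of the steps above: instead of testing every pair of segments, it looks for at most one crossing between consecutive breakpoints, where both skylines are linear. The names `height_at` and `merge_skylines` are made up for illustration, each skyline is assumed to be a list of (x, y) points with strictly increasing x that starts and ends at height 0, and no attempt is made to be efficient or to drop redundant collinear points.

```python
from bisect import bisect_right

def height_at(skyline, x):
    """Linearly interpolated height of a skyline at x (0 outside its range)."""
    if not skyline or x < skyline[0][0] or x > skyline[-1][0]:
        return 0.0
    xs = [p[0] for p in skyline]
    i = bisect_right(xs, x) - 1
    if i == len(skyline) - 1:
        return float(skyline[-1][1])
    (x0, y0), (x1, y1) = skyline[i], skyline[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def merge_skylines(a, b):
    """Upper envelope of two piecewise-linear skylines."""
    # Every breakpoint of either skyline is a candidate point of the result.
    xs = sorted({p[0] for p in a} | {p[0] for p in b})
    candidates = list(xs)
    for left, right in zip(xs, xs[1:]):
        # Both skylines are linear on [left, right], so they cross at most once.
        d0 = height_at(a, left) - height_at(b, left)
        d1 = height_at(a, right) - height_at(b, right)
        if d0 * d1 < 0:  # sign change: a proper crossing inside the interval
            t = d0 / (d0 - d1)
            candidates.append(left + t * (right - left))
    candidates.sort()
    return [(x, max(height_at(a, x), height_at(b, x))) for x in candidates]

# Two overlapping triangles of height 2 that cross at x = 3.
triangle_a = [(0, 0), (2, 2), (4, 0)]
triangle_b = [(2, 0), (4, 2), (6, 0)]
merged = merge_skylines(triangle_a, triangle_b)
```

Merging many triangles can then be done by folding this function over the list, or by divide and conquer as in the rectangle version.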


Find two rectangles with minimum areas that cover all points

You're given n points, unsorted, in an array. You're supposed to find two rectangles that together cover all points and that do not overlap. The edges of the rectangles should be parallel to the x or y axis.
The program should return the minimum total area needed to cover all these points, that is, the area of the first rectangle plus the area of the second rectangle.
I tried to solve this problem. I sorted the points by x-coordinate, so the first one is the leftmost point of the first rectangle. As we go through the points, we track the highest and lowest one. I was thinking that where the difference in x between two consecutive points is the biggest, the first point is the rightmost one of the first rectangle and the second point is the leftmost one of the second rectangle.
It should work when the points are given as in the first example; however, for the second example it doesn't work. It would return something like this, and that's wrong:
This should be correct:
Then I was thinking of sorting twice, the second time by y-coordinate, and then comparing the two total areas: the area when the points are sorted by x and the area when they are sorted by y. The smaller area is the correct answer.
The two rectangles cannot overlap, so one must be either completely to the right of or completely above the other. Your idea to sort the points by x-value and find the biggest gap is good, but you should do that for both directions, as you suggested. That would find the correct rectangles in your example.
The biggest gap isn't necessarily the ideal splitting point, however. Depending on the extent of the bounding boxes in the perpendicular direction, the split may be somewhere else. Consider a rectangular area with four quadrants, where two diagonally opposite quadrants are populated with points:
Here, the ideal split isn't where the largest gap is.
You can find the ideal location by considering all possible splits between points with adjacent x- and y-coordinates.
Sort the points by x-coordinate.
Scan the sorted array in ascending order. Keep track of the minimum rectangle to the left of the current point by storing the minimum and maximum y-coordinates. Store these running top and bottom borders for each point.
Now do the same in descending order, where you keep running top and bottom borders for the right rectangle.
Finally, loop through the points again and calculate the areas of the left and right minimal rectangles for a split between two adjacent nodes. Keep track of the minimum area sum.
Then do the same for minimum top and bottom rectangles. The last two steps can be combined, which will save arrays for the minimum bounds for the right rectangle.
This should be O(n · log n) in time: sorting is O(n · log n) and the individual passes are O(n). You need O(n) additional memory for the running bounding boxes for the first rectangle.
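The scan described above could be sketched in Python as follows. The function names are hypothetical, and two choices are assumptions that the answer does not spell out: splits between points with equal x (or equal y) are skipped so that the rectangles never touch, and if no split exists the single bounding box is used together with an empty second rectangle.

```python
def best_split_area(points):
    """Minimum total area over all splits by a vertical line."""
    pts = sorted(points)  # by x, then y
    n = len(pts)
    # prefix[i] = (min_y, max_y) over pts[0..i]: bounds of the left rectangle
    prefix = []
    lo = hi = pts[0][1]
    for _, y in pts:
        lo, hi = min(lo, y), max(hi, y)
        prefix.append((lo, hi))
    # Fallback: one bounding box for everything, second rectangle empty.
    best = (pts[-1][0] - pts[0][0]) * (hi - lo)
    # Scan from the right, keeping running bounds for the right rectangle.
    lo = hi = pts[-1][1]
    for i in range(n - 1, 0, -1):
        lo, hi = min(lo, pts[i][1]), max(hi, pts[i][1])
        if pts[i - 1][0] == pts[i][0]:
            continue  # no vertical line fits strictly between equal x values
        left = (pts[i - 1][0] - pts[0][0]) * (prefix[i - 1][1] - prefix[i - 1][0])
        right = (pts[-1][0] - pts[i][0]) * (hi - lo)
        best = min(best, left + right)
    return best

def min_two_rect_area(points):
    """Try left/right splits, then top/bottom splits by swapping coordinates."""
    by_x = best_split_area(points)
    by_y = best_split_area([(y, x) for x, y in points])
    return min(by_x, by_y)

# Two diagonally opposite clusters: the best split yields two unit squares.
clusters = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
area = min_two_rect_area(clusters)  # 2
```

Note that degenerate rectangles are allowed here: points lying on two horizontal lines can be covered with total area 0.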
The first observation is that any edge of a rectangle must touch one of the points. Edges that didn't touch a point could be pulled back, resulting in less area.
Given n points, there are thus n choices for each of left1, right1, bottom1, top1, left2, right2, bottom2 and top2. This gives a simple O(n^8) algorithm already: try all possible assignments and remember the one giving the least total area (right1 - left1)(top1 - bottom1) + (right2 - left2)(top2 - bottom2). You can skip any combinations with right < left or top < bottom. This gives a speedup, though it does not change the O(n^8) bound.
Another observation is that the edges should stay within the minimum and maximum X and Y bounds of all points. Find the minimum and maximum X and Y values of any points. Call these minX, maxX, minY and maxY. At least one of your rectangles will need to have its left, right, bottom and top edges, respectively, at those levels.
Because minX, maxX, minY and maxY must each be assigned to one of the two rectangles, and there are exactly 2^4 = 16 ways to do this, you can try each of the 16 possible assignments with the remaining coordinates assigned as above. This gives an O(n^4) algorithm: O(n) to get minX, maxX, minY and maxY, and then O(n^4) to fill in the four unassigned variables for each of the 16 assignments of minX, maxX, minY and maxY to the eight edge coordinates.
We have so far ignored the requirement that rectangles not overlap. To accommodate that, we must ensure at least one of the following four conditions holds true:
there is a horizontal line at Y coordinate h with top1 <= h <= bottom2
there is a horizontal line at Y coordinate h with top2 <= h <= bottom1
there is a vertical line at X coordinate w with right1 <= w <= left2
there is a vertical line at X coordinate w with right2 <= w <= left1
The two rectangles overlap if and only if all four of these conditions are simultaneously false. This allows us to skip over candidate solutions, giving a speedup but not changing the asymptotic bound O(n^4). Note that we need to check this condition specifically since, otherwise, optimal solutions might have overlap (exercise: show such an example).
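As an illustration, the four separation conditions translate directly into a predicate. This is only a sketch; the `Rect` type and the function names are invented here, and rectangles that merely touch along an edge count as non-overlapping, matching the `<=` in the conditions.

```python
from typing import NamedTuple

class Rect(NamedTuple):
    left: float
    right: float
    bottom: float
    top: float

def separated(r1: Rect, r2: Rect) -> bool:
    """True if some horizontal or vertical line separates the two rectangles."""
    return (
        r1.top <= r2.bottom       # condition 1: r1 entirely below r2
        or r2.top <= r1.bottom    # condition 2: r2 entirely below r1
        or r1.right <= r2.left    # condition 3: r1 entirely left of r2
        or r2.right <= r1.left    # condition 4: r2 entirely left of r1
    )

def overlap(r1: Rect, r2: Rect) -> bool:
    """The rectangles overlap exactly when all four conditions are false."""
    return not separated(r1, r2)
```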
Let's try to shave some more time off of this. Assume we have non-overlapping rectangles by condition #1 above. Then there are n choices for h; we can try each of these n choices and then determine the area of the resulting selections by finding the minimum and maximum coordinates of points in the resulting halves. By trying all n selections for h, we can determine the "best case" vertical split. We need not try condition #2, since the only difference is in the ordering of the rectangles which is arbitrary. We must also try condition #3 with a horizontal split. This suggests an O(n^2) algorithm:
For each point, choose h = point.y
Separate the points into groups with point.y <= h and point.y > h.
Find the minimum and maximum X and Y coordinates of both subsets of points.
Compute the sum of the areas of the two rectangles.
Remember the minimum area obtained from the above and the corresponding h.
Repeat, but using w and X coordinates.
Determine whether minimum area was obtained for a vertical or horizontal split
Return the corresponding rectangles as the answer
Can we do even better? This is O(n^2) and not O(n) because for each choice of h and w we need to then find the minimum and maximum coordinates of each subgroup. This assumes a linear scan through both subgroups. We don't actually need to do this for the min and max X/Y coordinates when scanning horizontally/vertically, since those will be known. What we need is a solution to this problem:
Given n points and a value h, find the maximum X coordinate of any point whose Y coordinate is no greater than h.
The obvious solution I give above is O(n^2), but you might be able to find an O(n log n) solution by clever application of sorting or maybe even an O(n) solution by some even more clever method. I will not attempt this.
Our solution is O(n^2); the theoretically optimal solution is Omega(n) since you must at least look at all the points. So we're pretty close but there is room for improvement.

Tangents range for all pairs of points in a box

Suppose I have a box with a lot of points. I need to be able to calculate the min and max angles over all lines which go through all possible pairs of the points. I can do it in O(n^2) time by just pairing every point with all the others. But is there a faster algorithm?
Taking the idea of dual plane proposed by Evgeny Kluev, and my comment about finding left-most intersection point, I'll try to give an equivalent direct solution without any dual space.
The solution is simple: sort your points by (x, y) lexicographically. Now draw a line through each two adjacent points in the sorted order. It can be proved that the minimal angle is achieved by one of these lines. In order to get maximal angle, you need to sort by (x, -y) lexicographically, and also check only adjacent pairs of points.
Let's prove the claim for the minimal angle. Consider the two points A and B which yield the minimal possible angle. Among all such pairs we can choose the one with the minimal difference of x coordinates.
Suppose that they have the same x. If there is no other point between them, then they are adjacent. If there are any points between them, then clearly at least one of them is adjacent to A in our order, and all of them yield the same angle.
Suppose that there exists a point P with x-coordinate in between A and B, i.e. Ax < Px < Bx. If P lies on AB, then AP has same angle but less difference of x coordinates, hence a contradiction. When P is not on AB, then either AP or PB would give you less angle, which also gives contradiction.
Now we have points A and B lying on two adjacent vertical lines. There are no other points between these lines. If A and B are the only points on their vertical lines, then the AB pair is clearly adjacent in sorted order, QED. If there are many points on these lines, the minimal angle is achieved by taking the highest point on the left vertical line (which must be A) and the lowest point on the right vertical line (which must be B). Since we sort points of equal x by y, these two points are also adjacent.
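A short sketch of the adjacent-pairs idea in Python. It works with slopes instead of angles (the angle is a monotone function of the slope for non-vertical lines), assumes integer coordinates so that `Fraction` gives exact comparisons, and simply skips vertical pairs; how a vertical line should rank among the angles is left open here. The name `extreme_slopes` is made up.

```python
from fractions import Fraction

def extreme_slopes(points):
    """Return (min_slope, max_slope) over all pairs of points with distinct x.

    Only pairs that are adjacent in sorted order are examined.
    Either value is None if every point lies on one vertical line.
    """
    def best(ordered, pick):
        result = None
        for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
            if x0 == x1:
                continue  # vertical pair: no finite slope
            slope = Fraction(y1 - y0, x1 - x0)
            result = slope if result is None else pick(result, slope)
        return result

    # Sort by (x, y): highest point of a column is next to the lowest of the next one.
    lowest = best(sorted(points), min)
    # Sort by (x, -y): lowest point of a column is next to the highest of the next one.
    highest = best(sorted(points, key=lambda p: (p[0], -p[1])), max)
    return lowest, highest

pts = [(0, 0), (1, 3), (2, 1), (2, 5), (4, 4), (5, 0)]
lo, hi = extreme_slopes(pts)  # slopes -4 and 3
```

The cost is dominated by the two sorts, so the whole thing runs in O(n log n).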
Sort the points (or use hash map) to find out if there are any horizontal lines.
Then solve this problem on the dual plane. Here you only need to find the leftmost and the rightmost intersection points. Use binary searches to find a pair of horizontal coordinates such that all intersection points are between them. (You could quickly find approximate results just by continuing the binary searches from these coordinates.)
Then sort the lines according to their tangents on the dual plane. For pairs of adjacent lines in this sorted order, find the intersections closest to those horizontal coordinates. This does not guarantee good complexity in the worst case (when some lines on the primal plane are almost horizontal). But in most cases the time complexity would be determined by sorting: O(N log N) + O(binary_search_complexity).

Given a set of rectangles, do any overlap?

Given a set of rectangles represented as tuples (xmin, xmax, ymin, ymax) where xmin and xmax are the left and right edges, and ymin and ymax are the bottom and top edges, respectively - is there any pair of overlapping rectangles in the set?
A straightforward approach is to compare every pair of rectangles for overlap, but this is O(n^2) - it should be possible to do better.
Update: xmin, xmax, ymin, ymax are integers. So a condition for rectangle 1 and rectangle 2 to overlap is xmin_2 <= xmax_1 AND xmax_2 >= xmin_1; similarly for the Y coordinates.
If one rectangle contains another, the pair is considered overlapping.
You can do it in O(N log N) the following way.
Firstly, "squeeze" your y coordinates. That is, sort all y coordinates (tops and bottoms) together in one array, and then replace coordinates in your rectangle description by its index in a sorted array. Now you have all y's being integers from 0 to 2n-1, and the answer to your problem did not change (in case you have equal y's, see below).
Now you can divide the plane into 2n-1 stripes, each of unit height, and each rectangle completely spans several of them. Prepare a segment tree for these stripes. (See this link for segment tree overview.)
Then, sort all x-coordinates in question (both left and right boundaries) in the same array, keeping for each coordinate the information from which rectangle it comes and whether this is a left or right boundary.
Then go through this list, and as you go, maintain a list of all the rectangles that are currently "active", that is, for which you have seen a left boundary but not a right boundary yet.
More exactly, in your segment tree you need to keep for each stripe how many active rectangles cover it. When you encounter a left boundary, you need to add 1 for all stripes between a corresponding rectangle's bottom and top. When you encounter a right boundary, you need to subtract one. Both addition and subtraction can be done in O(log N) using the mass update (lazy propagation) of the segment tree.
And to actually check what you need, when you meet a left boundary, before adding 1, check whether there is at least one stripe between bottom and top that has non-zero coverage. This can be done in O(log N) by performing a sum-on-interval query in the segment tree. If the sum on this interval is greater than 0, then you have an intersection.
squeeze y's
sort all x's
t = segment tree on 2n-1 cells
for all x's
    r = rectangle for which this x is
    if this is left boundary
        if t.sum(r.bottom, r.top) > 0 // O(log N) request
            you have occurrence
        t.add(r.bottom, r.top, 1) // O(log N) request
    else
        t.subtract(r.bottom, r.top, 1) // O(log N) request
You should implement it carefully taking into account whether you consider a touch to be an intersection or not, and this will affect your treatment of equal numbers. If you consider touch an intersection, then all you need to do is, when sorting y's, make sure that of all points with equal coordinates all tops go after all bottoms, and similarly when you sort x's, make sure that of all equal x's all lefts go before all rights.
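For concreteness, here is one way this sweep might look in Python. It is a sketch with a few deliberate departures from the description above, all of them assumptions: the tree answers range-maximum queries instead of range sums (either tells you whether some cell has non-zero coverage), y-coordinates are compressed to their distinct values and each rectangle covers the compressed points from ymin to ymax inclusive rather than stripes, and touching rectangles count as overlapping, as in the question's update. `MaxAddTree` and `any_overlap` are made-up names.

```python
class MaxAddTree:
    """Segment tree with range add and range max (lazy values kept in nodes)."""

    def __init__(self, size):
        self.n = size
        self.mx = [0] * (4 * size)    # max of the node's range, excluding ancestors' adds
        self.lazy = [0] * (4 * size)  # add applied to the node's whole range

    def add(self, lo, hi, delta, node=1, left=0, right=None):
        if right is None:
            right = self.n - 1
        if hi < left or right < lo:
            return
        if lo <= left and right <= hi:
            self.mx[node] += delta
            self.lazy[node] += delta
            return
        mid = (left + right) // 2
        self.add(lo, hi, delta, 2 * node, left, mid)
        self.add(lo, hi, delta, 2 * node + 1, mid + 1, right)
        self.mx[node] = self.lazy[node] + max(self.mx[2 * node], self.mx[2 * node + 1])

    def query(self, lo, hi, node=1, left=0, right=None):
        if right is None:
            right = self.n - 1
        if hi < left or right < lo:
            return float("-inf")
        if lo <= left and right <= hi:
            return self.mx[node]
        mid = (left + right) // 2
        return self.lazy[node] + max(
            self.query(lo, hi, 2 * node, left, mid),
            self.query(lo, hi, 2 * node + 1, mid + 1, right),
        )

def any_overlap(rects):
    """rects: iterable of (xmin, xmax, ymin, ymax); touching counts as overlap."""
    rects = list(rects)
    # "Squeeze" the y-coordinates to indices 0..k-1 of the distinct values.
    ys = sorted({y for _, _, ymin, ymax in rects for y in (ymin, ymax)})
    index = {y: i for i, y in enumerate(ys)}
    events = []
    for xmin, xmax, ymin, ymax in rects:
        lo, hi = index[ymin], index[ymax]
        events.append((xmin, 0, lo, hi))  # 0 = left boundary, sorts before rights
        events.append((xmax, 1, lo, hi))  # 1 = right boundary
    events.sort()
    tree = MaxAddTree(len(ys))
    for _, kind, lo, hi in events:
        if kind == 0:
            if tree.query(lo, hi) > 0:  # some active rectangle covers this y-range
                return True
            tree.add(lo, hi, 1)
        else:
            tree.add(lo, hi, -1)
    return False
```

To treat touching rectangles as disjoint instead, the tie-breaking of equal coordinates would have to change as described above.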
Why don't you try a plane sweep algorithm? Plane sweep is a design paradigm widely used in computational geometry, so it has the advantage that it is well studied and a lot of documentation is available online. Take a look at this. The line segment intersection problem should give you some ideas, as should the area of a union of rectangles.
Read about the Bentley-Ottmann algorithm for line segment intersection. The problem is very similar to yours, and the algorithm runs in O((n+k)logn), where k is the number of intersections. Since your rectangles' sides are parallel to the x and y axes, your case is much simpler, so you can modify Bentley-Ottmann to run in O(nlogn + k): all intersections can be detected once the rectangle is visited and they won't modify the sweep line ordering, so there is no need to update or maintain the event heap. To retrieve all rectangles intersecting the new rectangle, I suggest using a range tree on the ymin and ymax of each rectangle. It will give you all points lying in the interval defined by the ymin and ymax of the new rectangle, and thus the rectangles intersecting it.
If you need more details you should take a look at chapter two of the Computational Geometry book by M. de Berg et al. Also take a look at this paper; they show how to find all intersections between convex polygons in O(nlogn + k). It might prove simpler than my suggestion above, since all the data structures are explained there and your rectangles are convex, a very good thing in this case.
You can do better by building a new list of rectangles that do not overlap. From the set of rectangles, take the first one and add it to the list. It obviously does not overlap with any others because it is the only one in the list. Take the next one from the set and see if it overlaps with the first one in the list. If it does, return true; otherwise, add it to the list. Repeat for all rectangles in the set.
Each time, you are comparing rectangle r with the r-1 rectangles in the list. This takes at most n*(n-1)/2 = (n^2-n)/2 comparisons, which is still O(n^2) in the worst case. You can even apply this algorithm to the original set without having to create a new list.

Algorithm to take the union of rectangles and to see if the union is still a rectangle

I have a problem in which I have to test whether the union of a given set of rectangles forms a rectangle or not. I don't have much experience solving computational geometry problems.
My approach to the problem was this: since I know the coordinates of all the rectangles, I can easily sort the points and then deduce the corner points of the largest rectangle possible. Then I could sweep a line and see if all the points on the line fall inside the rectangle. But this approach is flawed and would fail, because the union may be in the form of a 'U'.
It would be a great help if you could push me in the right direction.
Your own version does not take into account that the edges of the rectangles can be non-parallel to each other. Therefore, there might not be "largest rectangle possible".
I would try this general approach:
1) Find the convex hull. You can find convex hull calculation algorithms here
2) Check if the convex hull is a rectangle. You can do this by looping through all the points on convex hull and checking if they all form 180 or 90 degree angles. If they do not, union is not a rectangle.
3) Go through all points on the convex hull. For each point check if the middle point between ThisPoint and NextPoint lies on the edge of any initially given rectangle.
If every middle point does, union is a rectangle.
If it does not, union is not a rectangle.
Complexity would be O(n log h) for finding convex hull, O(h) for the second part and O(h*n) for third part, where h is number of points on the convex hull.
If the goal is to check whether the resulting object is a filled rectangle, not just a rectangular outline of edges and corners, then add step (4).
4) Find all line segments that are formed by intersecting or touching rectangles. Note: by definition, all of these line segments are segments of edges of the given rectangles. If a rectangle does not touch or intersect other rectangles, the line segments are its edges.
For each line segment, check if its middle point is
On the edge of the convex hull
Inside one of given rectangles
On the edge of two non-overlapping given rectangles.
If at least one of these is true for every line segment, resulting object is a filled rectangle.
You could deduce the corner points of the largest rectangle possible, and then go over all the rectangles that share a border with the largest possible rectangle, for example the bottom, and make sure that the line is entirely contained in their borders. This will also fail if an empty space in the middle of the rectangle is a problem, however. I think the complexity will be O(n^2).
I think you are headed in the right direction. After you get the coordinates of the largest possible rectangle:
If the largest possible rectangle is a valid rectangle, then each side of it must be a union of sides of the original rectangles. You can scan the original rectangle set and find those rectangles that are a part of the largest side we are looking for (this can be done in O(n) by checking if X==largestRectangle.Top.X if you are looking at the top side, etc.). Let's call them S.
For each side s in S we can create an interval [from,to]. All we need to check is whether the union of all intervals matches the side of the largest rectangle. This can be done in O(nlog(n)) by standard algorithms, or on average O(n) by some hash trick (see my last comment (of the last comment) there for the O(n) algorithm).
For example, say we have two 1*1 rectangles in the first quadrant whose bottom-left coordinates are (0,0) and (1,0). The largest rectangle is 2*1 with bottom-left coordinate (0,0). Since [0,1] Union [1,2] is [0,2], the top side and bottom side match the largest rectangle, and similarly for the left and right sides.
Now suppose we have a U shape: 3*1 at (0,0), 1*1 at (0,1), 1*1 at (2,1). The largest rectangle is 3*2 at (0,0). Since for the top side [0,1] Union [2,3] does not match [0,3], the algorithm will report that the union of the above rectangles is not a rectangle.
So you can do this in O(n) on average, or O(nlog(n)) at least if you don't want to mess with some complex hash bucket algorithm. Much better than O(n^4)!
Edit: We have a small problem if there exists empty space somewhere in the middle of all rectangles. Let me think about it....
Edit2: An easy way to detect empty space is this: for each corner of a rectangle which is not a point on the largest rectangle, we go outward a little bit in all four (diagonal) directions and check if we are still in any rectangle. This is O(n^2), which ruins my beautiful O(nlog(n))! Can anyone come up with a better idea?
I haven't looked at a similar problem in the past, so there may be far more efficient ways of doing it. The key problem is that you cannot look at containment of one rectangle in another in isolation, since they could be adjacent but still form a rectangle, or one rectangle could be contained within multiple others.
You can't just look at the projection of each rectangle on to the edges of the bounding rectangle unless the problem allows you to leave holes in the middle of the rectangle, although that is probably a fast initial check that could be performed before the following exhaustive approach:
Run through the list once, calculating the minimum and maximum x and y coordinates and the area of each rectangle
Create an input list containing your input rectangles ordered by descending size.
Create a work list containing the bounding rectangle initially
While there are rectangles in the work list
Take the largest rectangle in the input list R
Create an empty list for fragments
for each rectangle r in the work list, intersect r with R, splitting r into a rectangular portion contained within R (if any) and zero or more rectangles not within R. If r was split, discard the portion contained within R and add the remaining rectangles to the fragment list.
add the contents of the fragment list to the work list
Assuming your rectangles are aligned to the coordinate axis:
Given two rectangles A, B, you can make a function that subtracts B from A returning a set of sub-rectangles of A (that may be the empty set): Set = subtract_rectangle(A, B)
Then, given a set of rectangles R for which you want to know if their union is a rectangle:
Calculate a maximum rectangle Big that covers all the rectangles as ((min_x,min_y)-(max_x,max_y))
make the set S contain the rectangle Big: S = (Big)
for every rectangle B in R:
S1 = ()
for every rectangle A in S:
S1 = S1 + subtract_rectangle(A, B)
S = S1
if S is empty then the union of the rectangles is a rectangle.
End, S contains the parts of Big not covered by any rectangle from R
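A possible Python version of this procedure, for axis-aligned rectangles given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2. The tuple layout, the particular way `subtract_rectangle` cuts A into up to four pieces, and the name `union_is_rectangle` are choices made for this sketch, not something fixed by the description above.

```python
def subtract_rectangle(a, b):
    """Return the parts of rectangle a not covered by b as disjoint rectangles."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # No overlap of positive area: a is returned untouched.
    if bx1 >= ax2 or bx2 <= ax1 or by1 >= ay2 or by2 <= ay1:
        return [a]
    pieces = []
    if bx1 > ax1:  # full-height strip to the left of b
        pieces.append((ax1, ay1, bx1, ay2))
    if bx2 < ax2:  # full-height strip to the right of b
        pieces.append((bx2, ay1, ax2, ay2))
    # Remaining middle column, clipped to b's x-range.
    mx1, mx2 = max(ax1, bx1), min(ax2, bx2)
    if by1 > ay1:  # part of the column below b
        pieces.append((mx1, ay1, mx2, by1))
    if by2 < ay2:  # part of the column above b
        pieces.append((mx1, by2, mx2, ay2))
    return pieces

def union_is_rectangle(rects):
    """True if the union of the rectangles equals their bounding box."""
    big = (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )
    uncovered = [big]  # the set S: parts of Big not yet covered
    for b in rects:
        uncovered = [piece for a in uncovered for piece in subtract_rectangle(a, b)]
    return not uncovered

# The U shape from the earlier answer leaves a 1x1 gap at the top middle.
u_shape = [(0, 0, 3, 1), (0, 1, 1, 2), (2, 1, 3, 2)]
```

Because the list of uncovered pieces can grow by up to three rectangles per subtraction, the running time depends on how fragmented the intermediate result becomes.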
If the rectangles are not aligned to the coordinate axis you can use a similar algorithm but that employs triangles instead of rectangles. The only issues are that subtracting triangles is not so simple to implement and that handling numerical errors can be difficult.
A simple approach just came to mind: If two rectangles share an edge[1], then together they form a rectangle which contains both - either the rectangles are adjacent [][ ] or one contains the other [[] ].
So if the list of rectangles forms a larger rectangle, then all you need is to repeatedly iterate over the rectangles and "unify" pairs of them into a single larger one. If in one iteration you can unify none, then it is not possible to create any larger rectangle than you already have with those pieces; otherwise, you will keep "unifying" rectangles until a single one is left.
[1] Share, as in they have the same edge; it is not enough for one of them to have an edge included in one of the other's edges.
Since efficiency seems to be a problem, you could probably speed it up by creating two indexes of rectangles, one with the larger edge size and another with the smaller edge size.
Then compare the edges with the same size, and if they are the same unify the two rectangles, remove them from the indexes and add the new rectangle to the indexes.
You can probably speed it up by not moving to the next iteration when you unify something, but to proceed to the end of the indexes before reiterating. (Stopping when one iteration does no unifications, or there is only one rectangle left.)
Additionally, the edges of a rectangle resulting from unification are by analysis always equal or larger than the edges of the original rectangles.
So if the indexes are ordered by ascending edge size, the new rectangle will be inserted in either the same position as you are checking or in positions yet to be checked, so each unification will not require an extra iteration cycle. (As the new rectangle will assuredly not unify with any rectangle previously checked in this iteration, since its edges are larger than all edges checked.)
For this to hold, in each step of a particular iteration you need to attempt unification on the next smaller edge from either of the indexes:
If you're in index1=3 and index2=6, you check index1 and advance that index;
If next edge on that index is 5, next iteration step will be in index1=5 and index2=6, so it will check index1 and advance that index;
If next edge on that index is 7, next iteration step will be in index1=7 and index2=6, so it will check index2 and advance that index;
If next edge on that index is 10, next iteration step will be in index1=7 and index2=10, so it will check index1 and advance that index;
[A ][B ]
[C ][D ]
A can be unified with B, C with D, and then AB with CD. One left, ABCD, thus possible.
[A ][B ]
[C ][D ]
A can be unified with B, C with D, but AB cannot be unified with CD. 2 left, AB and CD, thus not possible.
[A ][B ]
[C ][D [E]]
A can be unified with B, C with D, CD with E, CDE with AB. 1 left, ABCDE, thus possible.
[A ][B ]
[C ][D ][E]
A can be unified with B, C with D, CD with AB, but not E. 2 left, ABCD and E, thus not possible.
If a rectangle is contained in another but does not share a border, this approach will not unify them.
A way to address this is, when one hits an iteration that does not unify anything and before concluding that it is not possible to unify the set of rectangles, to get the rectangle with the widest edge and discard from the indexes all others that are contained within this largest rectangle.
This still does not address two situations.
First, consider the situation where with this map:
we have rectangles ACGE and BDFH. These rectangles share no edge and are not contained, but form a larger rectangle.
Second, consider the situation where with this map:
we have rectangles ABIJ, CDHG and EHLI. They do not share edges, are not contained within each-other, and no two of them can be unified into a single rectangle; but form a rectangle, in total.
With these pitfalls this method is not complete. But it can be used to greatly reduce the complexity of the problem and reduce the number of rectangles to analyse.
Gather up all the x-coordinates in a list, and sort them. From this list, create a sequence of adjacent intervals. Do the same thing for the y-coordinates. Now you've got two lists of intervals. For each pair of intervals (A=[x1,x2] from the x-list, B=[y1,y2] from the y-list), make their product rectangle A x B = (x1,y1)-(x2,y2)
If every single product rectangle is contained in at least one of your initial rectangles, then the union must be a rectangle.
Making this efficient (I think I've offered about an O(n^4) algorithm) is a different question entirely.
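A direct, unoptimized sketch of this grid check in Python, assuming axis-aligned rectangles given as (x1, y1, x2, y2) tuples; the function name is made up, and every grid cell is simply tested against every input rectangle.

```python
def union_is_rectangle_grid(rects):
    """Check that every cell of the coordinate grid lies inside some rectangle."""
    xs = sorted({x for x1, _, x2, _ in rects for x in (x1, x2)})
    ys = sorted({y for _, y1, _, y2 in rects for y in (y1, y2)})
    # Adjacent intervals from each list; their products are the grid cells.
    for cx1, cx2 in zip(xs, xs[1:]):
        for cy1, cy2 in zip(ys, ys[1:]):
            covered = any(
                x1 <= cx1 and cx2 <= x2 and y1 <= cy1 and cy2 <= y2
                for x1, y1, x2, y2 in rects
            )
            if not covered:
                return False  # this cell of the bounding box is a hole
    return True
```

No cell can be only partially covered, because every rectangle edge lies on a grid line, so testing whole cells is enough.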
As jva stated, "Your own version does not take into account that the edges of the rectangles can be non-parallel to each other." This answer also assumes "parallel" rectangles.
If you have a grid as opposed to needing infinite precision, depending on the number and sizes of the rectangles and the granularity of the grid, it might be feasible to brute-force it.
Just take your "largest rectangle possible" and test all its points to see whether each point is in at least one of the smaller rectangles.
I finally was able to find the impressive javascript project (thanks to github search :) !)
Also have a look into my answer here with other interesting projects
General case, thinking in images:
| outer_rect - union(inner rectangles) |
Check that result is zero

Dividing a plane of points into two equal halves

Given a 2 dimensional plane in which there are n points. I need to generate the equation of a line that divides the plane such that there are n/2 points on one side and n/2 points on the other.
I have assumed the points are distinct, otherwise there might not even be such a line.
If points are distinct, then such a line always exists and is possible to find using a deterministic O(nlogn) time algorithm.
Say the points are P1, P2, ..., P2n. Assume they are not all on the same line. If they were, then we can easily form the splitting line.
First translate the points so that all the co-ordinates (x and y) are positive.
Now suppose we magically had a point Q on the y-axis such that no line formed by those points (i.e. any infinite line Pi-Pj) passes through Q.
Now since Q does not lie within the convex hull of the points, we can easily see that we can order the points by a rotating line passing through Q. For some angle of rotation, half the points will lie on one side and the other half will lie on the other of this rotating line, or, in other words, if we consider the points being sorted by the slope of the line Pi-Q, we could pick a slope between the (median)th and (median+1)th points. This selection can be done in O(n) time by any linear time selection algorithm without any need for actually sorting the points.
Now to pick the point Q.
Say Q was (0,b).
Suppose Q was collinear with P1 (x1,y1) and P2 (x2,y2).
Then we have that
(y1-b)/x1 = (y2-b)/x2 (note we translated the points so that xi > 0).
Solving for b gives
b = (x1y2 - y1x2)/(x1-x2)
(Note, if x1 = x2, then P1 and P2 cannot be collinear with a point on the Y axis).
Consider |b|.
|b| = |x1y2 - y1x2| / |x1 -x2|
Now let xmax be the x-coordinate of the rightmost point and ymax the y-coordinate of the topmost point.
Also let D be the smallest non-zero x-coordinate difference between two points (this exists, since not all x_i are the same, because the points are not all collinear).
Then we have that |b| <= xmax*ymax/D.
Thus, pick our point Q (0,b) to be such that |b| > b_0 = xmax*ymax/D
D can be found in O(nlogn) time.
The magnitude of b_0 can get quite large and we might have to deal with precision issues.
Of course, a better option is to pick Q randomly! With probability 1, you will find the point you need, thus making the expected running time O(n).
If we could find a way to pick Q in O(n) time (by finding some other criterion), then we can make this algorithm run in O(n) time.
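The deterministic construction can be sketched as follows. This is only an illustration under the stated assumptions (distinct points, even n). It uses exact rational arithmetic to sidestep the precision issue mentioned above, and it sorts the slopes for brevity where a linear-time selection would do. The function name is made up for this example:

```python
from fractions import Fraction

def halving_line_far_q(points):
    """Return (m, c) such that y = m*x + c has n/2 points strictly on each side.

    Assumes the points are distinct and that n is even.
    """
    pts = [(Fraction(x), Fraction(y)) for x, y in points]
    # Translate so that every coordinate is strictly positive.
    dx = 1 - min(x for x, _ in pts)
    dy = 1 - min(y for _, y in pts)
    pts = [(x + dx, y + dy) for x, y in pts]

    xs = sorted({x for x, _ in pts})
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    if gaps:
        # |b| > xmax*ymax/D guarantees no line Pi-Pj passes through Q = (0, b).
        xmax = xs[-1]
        ymax = max(y for _, y in pts)
        b = xmax * ymax / min(gaps) + 1
    else:
        # All x equal: no two points can be collinear with a point on the y-axis.
        b = Fraction(0)

    # Slopes of the lines Q-Pi are pairwise distinct; split at the median.
    slopes = sorted((y - b) / x for x, y in pts)
    half = len(pts) // 2
    m = (slopes[half - 1] + slopes[half]) / 2
    # Undo the translation: y + dy = m*(x + dx) + b.
    return m, m * dx + b - dy
```

Since every translated x is positive, a point lies below the returned line exactly when its slope as seen from Q is smaller than m.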
Create an arbitrary line in that plane. Project each point onto that line a.k.a for each point, get the closest point on that line to that point.
Order those points along the line in either direction, and choose a point on that line such that there is an equal number of points on the line in either direction.
Get the line perpendicular to the first line which passes through that point. This line will have half the original points on either side.
There are some cases to avoid when doing this. Most importantly, if all the point are themselves on a single line, don't choose a perpendicular line which passes through it. In fact, choose that line itself so you don't have to worry about projecting the points. In terms of the actual mathematics behind this, vector projections will be very useful.
This is a modification of Dividing a plane of points into two equal halves which allows for the case with overlapping points (in which case, it will say whether or not the answer exists).
If number of points is odd, return "impossible".
Pick a random line (two random points)
Project all points onto this line (`O(N)` operation)
(i.e. we pretend this line is our new X'-axis, and write down the
X'-coordinate of each point)
Perform any median-finding algorithm on the X'-coordinates
(`O(N)` or faster-if-desired operation)
(returns 2 medians if no overlapping points)
Return the line perpendicular to original random line that splits the medians
In rare case of overlapping points, repeat a few times (it would take
a pathological case to prevent a solution from existing).
This is O(N) unlike other proposed solutions.
Assuming a solution exists, the above method will probably terminate, though I don't have a proof.
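The steps above might look like this in Python. It is a sketch, not a tuned implementation: the function name, the `tries` limit and the `rng` parameter are assumptions, floating-point projections are used, and `sorted` stands in for a linear-time median finder.

```python
import math
import random

def random_projection_split(points, tries=20, rng=random):
    """Try to find a line with n/2 points strictly on each side.

    Returns ((ux, uy), offset), describing the line ux*x + uy*y = offset,
    or None if n is odd or no attempt succeeded.
    """
    n = len(points)
    if n % 2:
        return None  # "impossible" for an odd number of points
    for _ in range(tries):
        # A random direction plays the role of the X'-axis.
        theta = rng.uniform(0.0, math.pi)
        ux, uy = math.cos(theta), math.sin(theta)
        # X'-coordinate of each point (sorting replaces median selection here).
        proj = sorted(x * ux + y * uy for x, y in points)
        lo, hi = proj[n // 2 - 1], proj[n // 2]
        if lo < hi:
            # The perpendicular line halfway between the two medians.
            return (ux, uy), (lo + hi) / 2
    return None
```

A point is on one side if ux*x + uy*y is below the returned offset and on the other side if it is above.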
Try the above algorithm a few times unless you detect overlapping points. If you detect a high number of overlapping points, you may be in for a rough ride, but there is a terribly inefficient brute-force solution that involves checking all possible angles:
For every "critical slope range", perform the above algorithm
by choosing a line with a slope within the range.
If all critical slope ranges fail, the solution is impossible.
A critical angle is defined as an angle at which the result could possibly change (imagine the solution to a previous answer, and rotate the entire set of points until one or more points swap position with one or more other points, crossing the dividing line). There are only finitely many of these, and I think they are bounded by the number of points, so you're probably looking at something in the range O(N^2)-O(N^2 log(N)) if you have overlapping points, for a brute-force approach.
I'd guess that a good way is to sort/sequence/order the points (e.g. from left to right), and then choose a line which passes through (or between) the middle point[s] in the sequence.
There are obvious cases where no solution is possible. E.g. when you have three heaps of points. One point at location A, Two points at location B, and five points at location C.
If you expect some decent distribution, you can probably get a good result with tlayton's algorithm. To select the initial line slant, you could determine the extent of the whole point set, and choose the angle of the largest diagonal.
The median equally divides a set of numbers in the manner similar to what you're trying to accomplish, and it can be computed in O(n) time using a selection algorithm (the writeup in Cormen et al is better, so you may want to look there instead). So, find the median of your x values Mx (or your y values My if you prefer) and set x = Mx (or y = My) and that line will be axially aligned and split your points equally.
If the nature of your problem requires that no more than one point lies on the line (if you have an odd number of points in your set, at least one of them will be on the line) and you discover that's what's happened (or you just want to guard against the possibility), rotate all of your points by some random angle, θ, and compute the median of the rotated points. You then rotate the median line you computed by -θ and it will evenly divide points.
The likelihood of randomly choosing θ such that the problem manifests itself again is very small with a finite number of points, but if it does, try again with a different θ.
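For the selection step, here is a small quickselect sketch (expected O(n)) together with the axis-aligned split. The function names are made up for this example, and the split returns None when the two middle x values coincide, which is the case where the rotation by θ would be applied:

```python
import random

def quickselect(values, k):
    """Return the k-th smallest element (0-based) in expected O(n) time."""
    values = list(values)
    while True:
        pivot = random.choice(values)
        lows = [v for v in values if v < pivot]
        pivots = [v for v in values if v == pivot]
        if k < len(lows):
            values = lows
        elif k < len(lows) + len(pivots):
            return pivot
        else:
            # Discard everything up to and including the pivot group.
            k -= len(lows) + len(pivots)
            values = [v for v in values if v > pivot]

def vertical_split(points):
    """Return x0 so that the line x = x0 has n/2 points strictly on each side.

    Assumes n is even. Returns None if the two middle x values are equal.
    """
    xs = [x for x, _ in points]
    half = len(xs) // 2
    lo = quickselect(xs, half - 1)
    hi = quickselect(xs, half)
    return None if lo == hi else (lo + hi) / 2
```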
Here is how I approach this problem (with the assumption that n is even and NO three points are collinear):
1) Pick up the point with smallest Y value. Let's call it point P.
2) Take this point as the new origin point, so that all other points will have positive Y values after this transformation.
3) For every other point (there are n - 1 points remaining), consider it in polar coordinates. Each of these points can be represented by a radius and an angle. You can ignore the radius and just focus on the angle.
4) How can you find a line that splits the points evenly? Find the median of the (n - 1) angles. The line from point P through the point with that median angle will split the remaining points evenly.
Time complexity for this algorithm is O(n).
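A short sketch of these four steps, under the same assumptions (n even, no three points collinear). The names are illustrative, and the angles are sorted here for brevity; replacing the sort with a selection algorithm gives the stated O(n):

```python
import math

def split_through_lowest_point(points):
    """Return (P, M): the line through P and M leaves (n - 2) / 2 points on each side."""
    # Steps 1 and 2: the lowest point becomes the origin.
    p = min(points, key=lambda q: (q[1], q[0]))
    others = [q for q in points if q != p]
    # Step 3: only the polar angle around P matters (all angles lie in [0, pi)).
    others.sort(key=lambda q: math.atan2(q[1] - p[1], q[0] - p[0]))
    # Step 4: the point with the median angle.
    return p, others[len(others) // 2]

def side(p, m, q):
    """Cross product sign: > 0 if q is left of the directed line p -> m, < 0 if right."""
    return (m[0] - p[0]) * (q[1] - p[1]) - (m[1] - p[1]) * (q[0] - p[0])
```

Note that the resulting line passes through two of the input points, P and the median point, and splits the remaining n - 2 points evenly.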
I don't know how useful this is, but I have seen a similar problem...
If you already have the directional vector (aka the coefficients of the dimensions of your plane).
You can then find two points inside that plane, and by simply using the midpoint formula you can find the midpoint of that plane.
Then, using the coefficients of that plane and the midpoint, you can find a plane that is equidistant from both points, using the general equation of a plane.
A line then would constitute in expressing one variable in terms of the other
so you would find a line with equal distance between both planes.
There are different methods of doing this such as projection using the distance equation from a plane but I believe that would complicate your math a lot.
To add to M's answer: a method to generate a Q (that's not so far away) in O(n log n).
To begin with, let Q be any point on the y-axis ie. Q = (0,b) - some good choices might be (0,0) or (0, (ymax-ymin)/2).
Now check if there are two points (x1, y1), (x2, y2) collinear with Q. The line between any point and Q is y = mx + b; since b is constant, this means two points are collinear with Q if their slopes m are equal. So determine the slopes mi for all points and check whether there are any duplicates (amortized O(n) using a hash table).
If all the m's are distinct, we're done; we found Q, and M's algorithm above generates the line in O(n) steps.
If two points are collinear with Q, we'll move Q up just a tiny amount ε, Qnew = (0, b + ε), and show that Qnew will not be collinear with two other points.
The criterion for ε, derived below, is:
ε < mminΔ*xmin
To begin with, our m's look like this:
mi = yi/xi - b/xi
Let's find the minimum difference between any two distinct mi and call it mminΔ (O(n log n) by, for instance, sorting the slopes and then comparing the differences between consecutive values).
If we fudge b up by ε, the new equation for m becomes:
mi,new = yi/xi - b/xi - ε/xi
= mi,old - ε/xi
Since ε > 0 and xi > 0, all m's are reduced, and all are reduced by a maximum of ε/xmin. Thus, if
ε/xmin < mminΔ, ie.
ε < mminΔ*xmin
is true, then two mi which were previously unequal will be guaranteed to remain unequal.
All that's left is to show that if m1,old = m2,old, then m1,new =/= m2,new. Since both mi were reduced by an amount ε/xi, this is equivalent to showing x1 =/= x2. If they were equal, then:
y1 = m1,oldx1 + b = m2,oldx2 + b = y2
Contradicting our assumption that all points are distinct. Thus, m1, new =/= m2, new, and no two points are collinear with Q.
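The perturbation can be sketched as below. It assumes the points are distinct and already translated so that every x is positive, and it uses exact fractions so that the equality test on slopes is reliable. The function name is hypothetical, and ε is taken as half of the bound mminΔ*xmin:

```python
from fractions import Fraction

def pick_q(points, b=0):
    """Return b' such that no two points are collinear with Q = (0, b').

    Assumes distinct points with x > 0.
    """
    b = Fraction(b)
    pts = [(Fraction(x), Fraction(y)) for x, y in points]
    slopes = sorted((y - b) / x for x, y in pts)
    gaps = [t - s for s, t in zip(slopes, slopes[1:])]
    if all(g > 0 for g in gaps):
        return b  # all slopes distinct: Q = (0, b) already works
    positive = [g for g in gaps if g > 0]
    xmin = min(x for x, _ in pts)
    if positive:
        # Any eps with 0 < eps < mminΔ * xmin works; take half of the bound.
        eps = min(positive) * xmin / 2
    else:
        # All slopes are equal, so there is no bound to respect.
        eps = Fraction(1)
    return b + eps
```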
I picked up the idea from Moron and andand and
continued to form a deterministic O(n) algorithm.
I also assumed that the points are distinct and
n is even (though the algorithm can be
changed so that odd n with one point
on the dividing line is also supported).
The algorithm tries to divide the points with a vertical line between them. This only fails if the points in the middle have the same x value. In that case the algorithm determines how many points with the same x value have to be on the left and lower side and rotates the line accordingly.
I'll try to explain with an example.
Let's assume we have 16 points on a plane.
First we need to get the point with the 8th smallest x-value
and the point with the 9th smallest x-value.
With a selection algorithm this is possible in O(n),
as pointed out in another answer.
If the x-values of those two points differ, we are done.
We create a vertical line between the two points,
and it splits the points equally.
The problematic case is when the x-values are equal (call that value xa). Then we have 3 sets of points:
those on the left side (x < xa), those in the middle (x = xa)
and those on the right side (x > xa).
The idea now is to count the points on the left side and
calculate how many points from the middle need to go there,
so that half of the points are on that side. We can ignore the right side here,
because if half of the points are on the left side, the other half must be on the right side.
So let's assume we have 3 points (c=3) on the left side,
6 in the middle and 7 on the right side
(the algorithm doesn't know the counts for the middle or right side,
because it doesn't need them, but we could also determine them in O(n)).
We need 8-3=5 points from the middle to go on the left side.
The points we already got in the first step are useless now,
because they are only determined by the x-value
and can be any of the points in the middle.
We want the 5 points from the middle with the lowest y-values on the left side and
the point with the highest y-value on the right side.
Again using the selection algorithm, we get the point with the 5th lowest y-value
and the point with the 6th lowest y-value among the middle points.
Both points will have an x-value equal to xa,
otherwise we wouldn't get to this step,
because a vertical line would already have worked.
Now we can create the point Q midway between those two points.
That's one point on the resulting line.
Another point is needed, chosen so that no points from the left or right side end up on the wrong side.
To get that point we need the point from the left side
that has the lowest angle (bh) between the vertical line at xa
and the line determined by that point and Q.
We also need the corresponding point from the right side (with angle ag).
The new point R is between the point with the lower angle
and a point on the vertical line
(if the lower angle is on the left side, a point above Q,
and if the lower angle is on the right side, a point below Q).
The line determined by Q and R divides the points in the middle
so that there is an equal number of points on both sides.
It doesn't cut off any points on the left or right side,
because if it did, that point would have a lower angle and
would have been chosen to calculate R.
From a mathematician's point of view this should work well in O(n).
For computer programs it is fairly easy to find a case
where precision becomes a problem. An example with 4 points would be
A(0, 100000000), B(0, 100000001), C(0, 0), D(0.0000001, 0).
In this example Q would be (0, 100000000.5) and R (0.00000005, 0).
This puts A and C on the left side and B and D on the right side.
But it is possible that A and B both end up on the dividing line
because of rounding errors, or maybe only one of them.
So whether this algorithm meets the requirements depends on the input values.
1. Get the two points Pa(xa, ya) and Pb(xb, yb) which are the medians based on the x values. > O(n)
2. If xa != xb you can stop here, because a line parallel to the y-axis between those two points is the result. > O(1)
3. Get all points where the x value equals xa. > O(n)
4. Count the points with x value less than xa as c. > O(n)
5. Get the lowest point Pc based on the y values from the points from 3. > O(n)
6. Get the greatest point Pd based on the y values from the points from 3. > O(n)
7. Get the (n/2-c)th lowest point Pe based on the y values from the points from 3. > O(n)
8. Also get the next greatest point Pf based on the y values from the points from 3. > O(n)
9. Create a new point Q (xa, (ye+yf)/2) between Pe and Pf. > O(1)
10. For all points Pi calculate the angle ai between Pc, Q and Pi and the angle bi between Pd, Q and Pi. > O(n)
11. Get the point Pg with the lowest angle ag (with ag>0° and ag<180°). > O(n)
12. Get the point Ph with the lowest angle bh (with bh>0° and bh<180°). > O(n)
13. If there aren't any Pg or Ph (all points have the same x value),
    create a new point R (xa+1, 0) anywhere but with a different x value than xa;
    else if ag is lower than bh,
    create a new point R ((xc+xg)/2, (yc+yg)/2) between Pc and Pg;
    else
    create a new point R ((xd+xh)/2, (yd+yh)/2) between Pd and Ph. > O(1)
14. The line determined by Q and R divides the points. > O(1)
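The following is a sketch of the same idea, not a literal transcription of the steps above. It swaps the angle comparison for an equivalent bound on how far the line through Q may tilt away from vertical. It sorts the points lexicographically for brevity (a selection algorithm would be needed to reach O(n)) and uses exact fractions to avoid the rounding problem described earlier. All names are made up for this example:

```python
from fractions import Fraction

def halving_line(points):
    """Return (Q, R): the line through Q and R has n/2 points strictly on each side.

    Assumes the points are distinct and n is even.
    """
    pts = sorted((Fraction(x), Fraction(y)) for x, y in points)  # by x, then y
    half = len(pts) // 2
    (xa, ye), (xb, yf) = pts[half - 1], pts[half]
    if xa != xb:
        # A vertical line between the two median x values is enough.
        xm = (xa + xb) / 2
        return (xm, Fraction(0)), (xm, Fraction(1))
    # Both medians are in the column x = xa and adjacent in y; Q sits between them.
    yq = (ye + yf) / 2
    # Tilted line x = xa - t*(y - yq) with t > 0: lower column points end up on
    # the left, upper ones on the right. Bound t so that no point with x != xa
    # is crossed (this plays the role of the "lowest angle" in the steps).
    limits = []
    for x, y in pts:
        if x < xa and y > yq:
            limits.append((xa - x) / (y - yq))
        elif x > xa and y < yq:
            limits.append((x - xa) / (yq - y))
    t = min(limits) / 2 if limits else Fraction(1)
    return (xa, yq), (xa - t, yq + 1)

def side(q, r, p):
    """> 0 if p is left of the directed line q -> r, < 0 if right, 0 if on it."""
    return (r[0] - q[0]) * (p[1] - q[1]) - (r[1] - q[1]) * (p[0] - q[0])
```

With exact arithmetic the four-point example A, B, C, D from above is split with no point on the line, at the cost of slower arithmetic than floats.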
