Find minimum number of triangles enclosing all points in the point cloud - algorithm

Input
You have a points list which represents a 2D point cloud.
Output
You have to generate a list of triangles (should be as less as possible triangles) so the following restrictions are fulfilled:
Each point from the cloud should be a vertex of a triangle or be
inside a triangle.
Triangles can be build only on the points from
the original point cloud.
Triangles should not intersect with each
other.
One point of the cloud can be a vertex for several triangles.
If triangle vertex lies on the side of another triangle we assume such triangles do not intersect.
If point lies on the side of triangle we assume the point is inside a triangle.
For example
Investigation
I invented the way to find a convex hull of given set of points and divide that convex hull into triangles but it is not right solution.
Any guesses how to solve it?

Here is my opinion.
Create a Delaunay Triangulation of the point cloud.
Do a Mesh Simplification by Half Edge Collapse.
For step 1, the boundary of the triangulation will be the convex hull. You can also use a Constrained Delaunay Triangulation (CDT) if you need to honor a non-convex boundary.
For step 2 half-edge collapse operation is going to preserve existing vertices, so no new vertices will be added. Note that in your case the collapses are not removing vertices, they are only removing edges. Before applying an edge collapse you should check that you are not introducing triangle inversions (which produce self intersection) and that no point is outside a triangle. The order of collapses matter but you can follow the usual rule that measures the "cost" of collapses in terms of introducing poor quality triangles (i.e. triangles with acute angles). So you should choose collapses that produce the most isometric triangles as possible.
Edit:
The order of collapses guide the simplification to different results. It can be guided by other criteria than minimize acute angles. I think the most empty triangles can be minimized by choosing collapses that produce triangles most filled vs triangles most empty. Still all criteria are euristics.

Some musings about triangles and convex hulls
Ignoring any set with 2 or less points and 3 points gives always gives 1 triangle.
Make a convex hull.
Select any random internal point.
unless all points are in hull ...
All point in the hull must be part of an triangle as they per definition of convex hull can't be internal.
Now we have an upper bound of triangles, namely the number of points in the hull.
An upper bound is also number of points / 3 rounded up as you can make that many independent triangles.
so the upper bound is the minimum of the two above
We can also guess at the lower bound roundup(hull points / 3) as each 3 neighboring points can make a triangle and any surplus can reuse 1-2.
Now the difficult part reducing the upper bound
walk through the inner points using them as center for all triangles.
If any triangle is empty we can save a triangle by removing the hull edge.
if two or more adjacent triangles are empty we will have to keep every other triangle or join the 3 points to a new triangle, as the middle point can be left out.
note the best result.
Is this prof that no better result exist? no.
If there exist a triangle that envelop all remaining points that would this be better.
N = number of points
U = upper bound
L = lower bound
T = set of triangles
R = set of remaining points
A = set of all points
B = best solution
BestSolution(A)
if A < 3 return NoSolution
if A == 3 return A
if not Sorted(A) // O(N)
SortByX(A) // O(n lg n) or radex if possible O(N)
H = ConvexHull(A)
noneHull = A - H
B = HullTriangles(H, noneHull) // removing empty triangles
U = size B
if noneHull == 0
return U // make triangles of 3 successive points in H and add the remaining to the last
if U > Roundup(N/3)
U = Roundup(N/3)
B = MakeIndepenTriangles(A)
AddTriangle(empty, A)
return // B is best solution, size B is number of triangles.
AddTriangle(T, R)
if size T+1 >= U return // no reason to test if we just end up with another U solution
ForEach r in R // O(N)
ForEach p2 in A-r // O(N)
ForEach p3 in A-r-p2 // O(N)
t = Triangle(r, p2, p3)
c = Candidate(t, T, R)
if c < 0
return c+1 // found better solution
return 0
Candidate(t, T, R)
if not Overlap(t, T) // pt. 3, O(T), T < U
left = R-t
left -= ContainedPoints(t) // O(R) -> O(N)
if left is empty
u = U
U = size T + 1
B = T+t
return U-u // found better solution
return AddTriangle(T+t, left)
return 0
So ... total runtime ...
Candidate O(N)
AddTriangle O(N^3)
recursion is limited to the current best solution U
O((N N^3)^U) -> O((N^4)^U)
space is O(U N)
So reducing U before we go to brute force is essential.
- Reducing R quickly should reduce recursion
- so starting with bigger and hopefully more enclosing triangles would be good
- any 3 points in the hull should make some good candidates
- these split the remaining points in 3 parts which can be investigated independently
- treat each part as a hull where its 2 base points are part of a triangle but the 3rd is not in the set.
- if possible make this a BFS so we can select the most enclosing first
- space migth be a problem
- O(H U N)
- else start with points that are a 1/3 around the hull relative to each other first.
AddTriangle really sucks performance so how many triangles can we really make
Selecting 3 out of N is
N!/(N-3)!
And we don't care about order so
N!/(3!(N-3)!)
N!/(6(N-3)!)
N (N-1) (n-2) / 6
Which is still O(N^3) for the loops, but it makes us feel better. The loops might still be faster if the permutation takes too long.
AddTriangle(T, R)
if size T+1 >= U return // no reason to test if we just end up with another U solution
while t = LazySelectUnordered(3, R, A) // always select one from R first O(R (N-1)(N-2) / 6) aka O(N^3)
c = Candidate(t, T, R)
if c < 0
return c+1 // found better solution
return 0

Related

Efficient algorithm to find the largest rectangle from a set of points

I have an array of points, and my goal is to pick two so that I maximize the area of the rectangle formed by the two points (one representing the low left corner and the other one the right top corner).
I could do this in O(n^2) by just doing two for loops and calculating every single possible area, but I think there must be a more efficient solution:
max_area = 0
for p1 in points:
for p2 in points:
area = p2[0]p2[1] + p1[0]p1[1] - p2[1]p1[0] - p2[0]p1[1]
if area > max_area:
max_area = area
It's clear that I want to maximize the area of the second point with the origin (0,0) (so p2[0]p2[1]), but I'm not sure how to go forward with that.
Yes, there's an O(n log n)-time algorithm (that should be matched by an element distinctness lower bound).
It suffices to find, for each p1, the p2 with which it has the largest rectangular area, then return the overall largest. This can be expressed as a 3D extreme point problem: each p2 gives rise to a 3D point (p2[0], p2[1], p2[0] p2[1]), and each p1 gives rise to a 3D vector (-p1[0], -p1[1], 1), and we want to maximize the dot product (technically plus p1[0] p1[1], but this constant offset doesn't affect the answer). Then we "just" have to follow Kirkpatrick's 1983 construction.
Say you have a rectangle formed by four points: A (top left), B (top right), C (bottom right) and D (bottom left).
The idea is to find two points p1 and p2 that are the closest to B and D respectively. This means that p1 and p2 are the furthest possible from each other.
def nearest_point(origin, points):
nearest = None
mindist = dist(origin, points[0])
for p in points[1:]:
d = dist(origin, p)
if mindist > d:
mindist = d
nearest = p
return nearest
Call it for B and D as origins:
points = [...]
p1 = nearest_point(B, points) # one for loop
p2 = nearest_point(D, points) # one for loop
Note that there can be multiples closest points which are equally distant from the origin (B or D). In this case, nearest_point() should return an array of points. You have to do two nested for loops to find the furthest two points.
Divide and conquer.
Note: This algorithm presumes that the rectangle is axis-aligned.
Step 1: Bucket the points into a grid of 4x4 buckets. Some buckets
may get empty.
Step 2: Using the corners of the buckets, calculate
maximum areas by opposite corners between not empty buckets. This may result in
several pairs of buckets, because your work with corners, not points. Notice also that you use left corners for left buckets, and so for bottom, right, top corners for those b,r,t buckets. That's why an even number is used for the size of the grid.
Step 3: Re-bucket each bucket selected in step 2 as a new, smaller, 4x4 grid.
Repeat steps 2 & 3 until you get only a pair of buckets that contain only a point in each bucket.
I have not calculated the complexity of this algorithm. Seems O(n log(n)).

How to find independent points in a unit square in O(n log n)?

Consider a unit square containing n 2D points. We say that two points p and q are independent in a square, if the Euclidean distance between them is greater than 1. A unit square can contain at most 3 mutually independent points. I would like to find those 3 mutually independent points in the given unit square in O(n log n). Is it possible? Please help me.
Can this problem be solved in O(n^2) without using any spatial data structures such as Quadtree, kd-tree, etc?
Use a spatial data structure such as a Quadtree to store your points. Each node in the quadtree has a bounding box and a set of 4 child nodes, and a list of points (empty except for the leaf nodes). The points are stored in the leaf nodes.
The point quadtree is an adaptation of a binary tree used to represent two-dimensional point data. It shares the features of all quadtrees but is a true tree as the center of a subdivision is always on a point. The tree shape depends on the order in which data is processed. It is often very efficient in comparing two-dimensional, ordered data points, usually operating in O(log n) time.
For each point, maintain a set of all points that are independent of that point.
Insert all your points into the quadtree, then iterate through the points and use the quadtree to find the points that are independent of each:
main()
{
for each point p
insert p into quadtree
set p's set to empty
for each point p
findIndependentPoints(p, root node of quadtree)
}
findIndependentPoints(Point p, QuadTreeNode n)
{
Point f = farthest corner of bounding box of n
if distance between f and p < 1
return // none of the points in this node or
// its children are independent of p
for each point q in n
if distance between p and q > 1
find intersection r of q's set and p's set
if r is non-empty then
p, q, r are the 3 points -> ***SOLVED***
add p to q's set of independent points
add q to p's set of independent points
for each subnode m of n (up 4 of them)
findIndependentPoints(p, m)
}
You could speed up this:
find intersection r of q's set and p's set
by storing each set as a quadtree. Then you could find the intersection by searching in q's quadtree for a point independent of p using the same early-out technique:
// find intersection r of q's set and p's set:
// r = findMututallyIndependentPoint(p, q's quadtree root)
Point findMututallyIndependentPoint(Point p, QuadTreeNode n)
{
Point f = farthest corner of bounding box of n
if distance between f and p < 1
return // none of the points in this node or
// its children are independent of p
for each point r in n
if distance between p and r > 1
return r
for each subnode m of n (up 4 of them)
findMututallyIndependentPoint(p, m)
}
An alternative to using Quadtrees is using K-d trees, which produces more balanced trees where each leaf node is a similar depth from the root. The algorithm for finding independent points in that case would be the same, except that there would only be up to 2 and not 4 child nodes for each node in the data structure, and the bounding boxes at each level would be of variable size.
You might want to try this out.
Pick the top left point (Y) with coordinate (0,1). Calculate distance from each point from the List to point Y.
Sort the result in increasing order into SortedPointList (L)
If the first point (A) and the last point (B) in list L are independent:
Foreach point P in list L:
if P is independent to both A and B:
Return A, B, P
Pick the top right point (X) with coordinate (1,1). Calculate distance from each point from the List to point X.
Sort the result in increasing order into SortedPointList (S)
If the first point (C) and the last point (D) in list L are independent:
Foreach point O in list S:
if P is independent to both C and D:
Return C, D, O
Return null
This is a wrong solution. Kept it just for comments. If one finds another solution based on smallest enclosing circle, please put a link as a comment.
Solve the Smallest-circle problem.
If diameter of a circle <= 1, return null.
If the circle is determined by 3 points, check which are "mutually independent". If there are only two of them, try to find the third by iteration.
If the circle is determined by 2 points, they are "mutually independent". Try to find the third one by iteration.
Smallest-sircle problem can be solved in O(N), thus the whole problem complexity is also O(N).

Designing a linear time algorithm to check whether an n-vertex polygon is convex

And also i shouldn't assume that the polygon is non-intersecting
I know that a convex polygon is a non intersecting polygon with the property that all the angles are less than PI in another words , the orientation is always clockwise or anticlockwise.
So i am thinking to run the graham scan which is known to be a linear time algorithm and modify it .
SO here is my algorithm
we sort the vertices by orientation (using determinants)
Select the right most vertex in the Polygon P (call it r)
Let q and p be the next and second next vertex of Polygon P (based on orientation)
while(there is a vertex in the Polygon P)
if orientation(p, q, r) == CW (clock wise , that means we changed directions)
return false
else
r = p
p = q
q = next vertex
return true
is that algorithm accurate (returns false means its not convex)
Indeed, a check whether an angle is less or more than PI can be done in constant time using determinants.
This totals to O (N).
If all the angles have the same orientation, we should still check whether the polygon is also not self-intersecting.
Here, it will suffice to add up all supplementary angles and check whether the sum is 2 * PI.
It will be fine to use floating point and check for approximate equality.
This is also O (N).
However, we should visit the vertices in the order they follow on the polygon border.
If you are given a polygon, you perhaps have that order already.
On the other hand, if you are given just a set of points on the plane, a polygon with vertices in these points is not unique in the general case.
So, there is no need to sort the vertices by anything: not only it is O (N log N), but we also lose important order information by doing so.
We can start from any vertex.
your algorithm does not work
(e.g. if the polygonal chain turns twice around the origin)
see http://hal.inria.fr/inria-00413179/en

Finding a square side length is R in 2D plane ?

I was at the high frequency Trading firm interview, they asked me
Find a square whose length size is R with given n points in the 2D plane
conditions:
--parallel sides to the axis
and it contains at least 5 of the n points
running complexity is not relative to the R
they told me to give them O(n) algorithm
Interesting problem, thanks for posting! Here's my solution. It feels a bit inelegant but I think it meets the problem definition:
Inputs: R, P = {(x_0, y_0), (x_1, y_1), ..., (x_N-1, y_N-1)}
Output: (u,v) such that the square with corners (u,v) and (u+R, v+R) contains at least 5 points from P, or NULL if no such (u,v) exist
Constraint: asymptotic run time should be O(n)
Consider tiling the plane with RxR squares. Construct a sparse matrix, B defined as
B[i][j] = {(x,y) in P | floor(x/R) = i and floor(y/R) = j}
As you are constructing B, if you find an entry that contains at least five elements stop and output (u,v) = (i*R, j*R) for i,j of the matrix entry containing five points.
If the construction of B did not yield a solution then either there is no solution or else the square with side length R does not line up with our tiling. To test for this second case we will consider points from four adjacent tiles.
Iterate the non-empty entries in B. For each non-empty entry B[i][j], consider the collection of points contained in the tile represented by the entry itself and in the tiles above and to the right. These are the points in entries: B[i][j], B[i+1][j], B[i][j+1], B[i+1][j+1]. There can be no more than 16 points in this collection, since each entry must have fewer than 5. Examine this collection and test if there are 5 points among the points in this collection satisfying the problem criteria; if so stop and output the solution. (I could specify this algorithm in more detail, but since (a) such an algorithm clearly exists, and (b) its asymptotic runtime is O(1), I won't go into that detail).
If after iterating the entries in B no solution is found then output NULL.
The construction of B involves just a single pass over P and hence is O(N). B has no more than N elements, so iterating it is O(N). The algorithm for each element in B considers no more than 16 points and hence does not depend on N and is O(1), so the overall solution meets the O(N) target.
Run through set once, keeping the 5 largest x values in a (sorted) local array. Maintaining the sorted local array is O(N) (constant time performed N times at most).
Define xMin and xMax as the x-coordinates of the two points with largest and 5th largest x values respectively (ie (a[0] and a[4]).
Sort a[] again on Y value, and set yMin and yMax as above, again in constant time.
Define deltaX = xMax- xMin, and deltaY as yMax - yMin, and R = largest of deltaX and deltaY.
The square of side length R located with upper-right at (xMax,yMax) meets the criteria.
Observation if R is fixed in advance:
O(N) complexity means no sort is allowed except on a fixed number of points, as only a Radix sort would meet the criteria and it requires a constraint on the values of xMax-xMin and of yMax-yMin, which was not provided.
Perhaps the trick is to start with the point furthest down and left, and move up and right. The lower-left-most point can be determined in a single pass of the input.
Moving up and right in steps and counitng points in the square requries sorting the points on X and Y in advance, which to be done in O(N) time requiress that the Radix sort constraint be met.

Algorithm to find the closest 3 points that when triangulated cover another point

Picture a canvas that has a bunch of points randomly dispersed around it. Now pick one of those points. How would you find the closest 3 points to it such that if you drew a triangle connecting those points it would cover the chosen point?
Clarification: By "closest", I mean minimum sum of distances to the point.
This is mostly out of curiosity. I thought it would be a good way to estimate the "value" of a point if it is unknown, but the surrounding points are known. With 3 surrounding points you could extrapolate the value. I haven't heard of a problem like this before, doesn't seem very trivial so I thought it might be a fun exercise, even if it's not the best way to estimate something.
Your problem description is ambiguous. Which triangle are you after in this figure, the red one or the blue one?
The blue triangle is closer based on lexicographic comparison of the distances of the points, while the red triangle is closer based on the sum of the distances of the points.
Edit: you clarified it to make it clear that you want the sum of distances to be minimized (the red triangle).
So, how about this sketch algorithm?
Assume that the chosen point is at the origin (makes description of algorithm easy).
Sort the points by distance from the origin: P(1) is closest, P(n) is farthest.
Start with i = 3, s = ∞.
For each triple of points P(a), P(b), P(i) with a < b < i, if the triangle contains the origin, let s = min(s, |P(a)| + |P(b)| + |P(i)|).
If s ≤ |P(1)| + |P(2)| + |P(i)|, stop.
If i = n, stop.
Otherwise, increment i and go back to step 4.
Obviously this is O(n³) in the worst case.
Here's a sketch of another algorithm. Consider all pairs of points (A, B). For a third point to make a triangle containing the origin, it must lie in the grey shaded region in this figure:
By representing the points in polar coordinates (r, θ) and sorting them according to θ, it is straightforward to examine all these points and pick the closest one to the origin.
This is also O(n³) in the worst case, but a sensible order of visiting pairs (A, B) should yield an early exit in many problem instances.
Just a warning on the iterative method. You may find a triangle with 3 "near points" whose "length" is greater than another resulting by adding a more distant point to the set. Sorry, can't post this as a comment.
See Graph.
Red triangle has perimeter near 4 R while the black one has 3 Sqrt[3] -> 5.2 R
Like #thejh suggests, sort your points by distance from the chosen point.
Starting with the first 3 points, look for a triangle covering the chosen point.
If no triangle is found, expand you range to include the next closest point, and try all combinations.
Once a triangle is found, you don't necessarily have the final answer. However, you have now limited the final set of points to check. The furthest possible point to check would be at a distance equal to the sum of the distances of the first triangle found. Any further than this, and the sum of the distances is guaranteed to exceed the first triangle that was found.
Increase your range of points to include the last point whose distance <= the sum of the distances of the first triangle found.
Now check all combinations, and the answer is the triangle found from this set with the minimal sum of distances.
second shot
subsolution: (analytic geometry basics, skip if you are familiar with this) finding point of the opposite half-plane
Example: Let's have two points: A=[a,b]=[2,3] and B=[c,d]=[4,1]. Find vector u = A-B = (2-4,3-1) = (-2,2). This vector is parallel to AB line, so is the vector (-1,1). The equation for this line is defined by vector u and point in AB (i.e. A):
X = 2 -1*t
Y = 3 +1*t
Where t is any real number. Get rid of t:
t = 2 - X
Y = 3 + t = 3 + (2 - X) = 5 - X
X + Y - 5 = 0
Any point that fits in this equation is in the line.
Now let's have another point to define the half-plane, i.e. C=[1,1], we get:
X + Y - 5 = 1 + 1 - 5 < 0
Any point with opposite non-equation sign is in another half-plane, which are these points:
X + Y - 5 > 0
solution: finding the minimum triangle that fits the point S
Find the closest point P as min(sqrt( (Xp - Xs)^2 + (Yp - Ys)^2 ))
Find perpendicular vector to SP as u = (-Yp+Ys,Xp-Xs)
Find two closest points A, B from the opposite half-plane to sigma = pP where p = Su (see subsolution), such as A is on the different site of line q = SP (see final part of the subsolution)
Now we have triangle ABP that covers S: calculate sum of distances |SP|+|SA|+|SB|
Find the second closest point to S and continue from 1. If the sum of distances is smaller than that in previous steps, remember it. Stop if |SP| is greater than the smallest sum of distances or no more points are available.
I hope this diagram makes it clear.
This is my first shot:
split the space into quadrants
with picked point at the [0,0]
coords
find the closest point
from each quadrant (so you have 4
points)
any triangle from these
points should be small enough (but not necesarilly the smallest)
Take the closest N=3 points. Check whether the triange fits. If not, increment N by one and try out all combinations. Do that until something fits or nothing does.

Resources