Algorithm to find the closest 3 points that when triangulated cover another point

Picture a canvas that has a bunch of points randomly dispersed around it. Now pick one of those points. How would you find the closest 3 points to it such that if you drew a triangle connecting those points it would cover the chosen point?
Clarification: By "closest", I mean minimum sum of distances to the point.
This is mostly out of curiosity. I thought it would be a good way to estimate the "value" of a point if it is unknown, but the surrounding points are known. With 3 surrounding points you could interpolate the value. I haven't heard of a problem like this before, and it doesn't seem trivial, so I thought it might be a fun exercise, even if it's not the best way to estimate something.

Your problem description is ambiguous. Which triangle are you after in this figure, the red one or the blue one?
The blue triangle is closer based on lexicographic comparison of the distances of the points, while the red triangle is closer based on the sum of the distances of the points.
Edit: you clarified it to make it clear that you want the sum of distances to be minimized (the red triangle).
So, how about this sketch algorithm?
1. Assume that the chosen point is at the origin (this makes the algorithm easy to describe).
2. Sort the points by distance from the origin: P(1) is closest, P(n) is farthest.
3. Start with i = 3, s = ∞.
4. For each pair of points P(a), P(b) with a < b < i, if the triangle P(a), P(b), P(i) contains the origin, let s = min(s, |P(a)| + |P(b)| + |P(i)|).
5. If s ≤ |P(1)| + |P(2)| + |P(i)|, stop.
6. If i = n, stop.
7. Otherwise, increment i and go back to step 4.
Obviously this is O(n³) in the worst case.
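A minimal Python sketch of the sorted-scan algorithm above, for illustration only. The function names (cross, covers, closest_covering_triangle) are made up here, the chosen point is assumed not to be in the list of candidate points, and a point on the triangle's boundary is counted as covered.

```python
import math


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); its sign gives the turn direction."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def covers(a, b, c, p):
    """True if p lies inside or on the boundary of the non-degenerate triangle abc."""
    if cross(a, b, c) == 0:
        return False  # collinear vertices: not a real triangle
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def closest_covering_triangle(points, target):
    """Return (sum_of_distances, triangle), or (inf, None) if nothing covers target."""
    pts = sorted(points, key=lambda q: math.dist(q, target))
    r = [math.dist(q, target) for q in pts]
    best_sum, best = math.inf, None
    for i in range(2, len(pts)):
        # Try every pair of closer points together with the new point pts[i].
        for a in range(i):
            for b in range(a + 1, i):
                s = r[a] + r[b] + r[i]
                if s < best_sum and covers(pts[a], pts[b], pts[i], target):
                    best_sum, best = s, (pts[a], pts[b], pts[i])
        # Any triangle using a later point has a sum of at least this bound.
        if best_sum <= r[0] + r[1] + r[i]:
            break
    return best_sum, best
```

For example, closest_covering_triangle([(1, 0), (-1, 1), (-1, -1), (10, 0)], (0, 0)) stops after the first three points, because no triangle that uses a farther point can have a smaller sum.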
Here's a sketch of another algorithm. Consider all pairs of points (A, B). For a third point to make a triangle containing the origin, it must lie in the grey shaded region in this figure:
By representing the points in polar coordinates (r, θ) and sorting them according to θ, it is straightforward to examine all these points and pick the closest one to the origin.
This is also O(n³) in the worst case, but a sensible order of visiting pairs (A, B) should yield an early exit in many problem instances.

Just a warning on the iterative method. You may find a triangle with 3 "near points" whose "length" is greater than another resulting by adding a more distant point to the set. Sorry, can't post this as a comment.
See Graph.
The red triangle has a perimeter near 4 R, while the black one has 3√3 R ≈ 5.2 R.

Like @thejh suggests, sort your points by distance from the chosen point.
Starting with the first 3 points, look for a triangle covering the chosen point.
If no triangle is found, expand your range to include the next closest point, and try all combinations.
Once a triangle is found, you don't necessarily have the final answer. However, you have now limited the final set of points to check. The furthest possible point to check would be at a distance equal to the sum of the distances of the first triangle found. Any further than this, and the sum of the distances is guaranteed to exceed the first triangle that was found.
Increase your range of points to include the last point whose distance <= the sum of the distances of the first triangle found.
Now check all combinations, and the answer is the triangle found from this set with the minimal sum of distances.

second shot
subsolution: (analytic geometry basics, skip if you are familiar with this) finding point of the opposite half-plane
Example: Let's have two points: A=[a,b]=[2,3] and B=[c,d]=[4,1]. Find vector u = A-B = (2-4,3-1) = (-2,2). This vector is parallel to AB line, so is the vector (-1,1). The equation for this line is defined by vector u and point in AB (i.e. A):
X = 2 -1*t
Y = 3 +1*t
Where t is any real number. Get rid of t:
t = 2 - X
Y = 3 + t = 3 + (2 - X) = 5 - X
X + Y - 5 = 0
Any point that fits in this equation is in the line.
Now let's have another point to define the half-plane, i.e. C=[1,1], we get:
X + Y - 5 = 1 + 1 - 5 < 0
Any point that gives the opposite inequality sign lies in the other half-plane, i.e. the points satisfying:
X + Y - 5 > 0
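The half-plane test from this subsolution can be sketched in a few lines of Python. The helper names (line_through, side, opposite_half_planes) are hypothetical. The line coefficients come out as a multiple of X + Y - 5 = 0 for the example points, which does not change the sign test.

```python
def line_through(a, b):
    """Coefficients (p, q, r) of the line p*x + q*y + r = 0 through points a and b."""
    p = a[1] - b[1]
    q = b[0] - a[0]
    r = -(p * a[0] + q * a[1])
    return p, q, r


def side(line, point):
    """Sign of the line expression at point: -1, 0 (on the line) or +1."""
    p, q, r = line
    value = p * point[0] + q * point[1] + r
    return (value > 0) - (value < 0)


def opposite_half_planes(line, c, d):
    """True if c and d lie strictly on different sides of the line."""
    return side(line, c) * side(line, d) < 0


line_ab = line_through((2, 3), (4, 1))   # 2x + 2y - 10 = 0, i.e. X + Y - 5 = 0
print(side(line_ab, (1, 1)))             # C = [1,1] gives a negative sign
print(opposite_half_planes(line_ab, (1, 1), (4, 4)))
```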
solution: finding the minimum triangle that fits the point S
1. Find the closest point P as min(sqrt( (Xp - Xs)^2 + (Yp - Ys)^2 ))
2. Find the vector perpendicular to SP as u = (-Yp+Ys,Xp-Xs)
3. Find the two closest points A, B in the half-plane opposite to sigma = pP, where p = Su is the line through S with direction u (see subsolution), such that A and B lie on different sides of the line q = SP (see the final part of the subsolution)
4. Now we have a triangle ABP that covers S: calculate the sum of distances |SP|+|SA|+|SB|
5. Take the next closest point to S as P and continue from step 2. If the sum of distances is smaller than in the previous steps, remember it. Stop if |SP| is greater than the smallest sum of distances or no more points are available.
I hope this diagram makes it clear.

This is my first shot:
split the space into quadrants with the picked point at the coordinates [0,0]
find the closest point from each quadrant (so you have 4 points)
any triangle from these points should be small enough (but not necessarily the smallest)

Take the closest N=3 points. Check whether the triangle fits. If not, increment N by one and try out all combinations. Do that until something fits or nothing does.

Related

Efficient algorithm to find the largest rectangle from a set of points

I have an array of points, and my goal is to pick two so that I maximize the area of the rectangle formed by the two points (one representing the low left corner and the other one the right top corner).
I could do this in O(n^2) by just doing two for loops and calculating every single possible area, but I think there must be a more efficient solution:
    max_area = 0
    for p1 in points:
        for p2 in points:
            area = p2[0]*p2[1] + p1[0]*p1[1] - p2[1]*p1[0] - p2[0]*p1[1]
            if area > max_area:
                max_area = area
It's clear that I want to maximize the area of the second point with the origin (0,0) (so p2[0]*p2[1]), but I'm not sure how to go forward with that.
Yes, there's an O(n log n)-time algorithm (that should be matched by an element distinctness lower bound).
It suffices to find, for each p1, the p2 with which it has the largest rectangular area, then return the overall largest. This can be expressed as a 3D extreme point problem: each p2 gives rise to a 3D point (p2[0], p2[1], p2[0]*p2[1]), and each p1 gives rise to a 3D vector (-p1[1], -p1[0], 1), and we want to maximize the dot product (technically plus p1[0]*p1[1], but this constant offset doesn't affect the answer). Then we "just" have to follow Kirkpatrick's 1983 construction.
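As a sanity check on the lifting, here is a small Python sketch (all function names are illustrative). Expanding (p2[0]-p1[0])*(p2[1]-p1[1]) shows that each coordinate of p1 multiplies the opposite coordinate of p2, so the query vector is written here as (-p1[1], -p1[0], 1). The sketch only verifies the identity and picks the best partner by brute force. It does not implement the O(n log n) extreme-point query.

```python
def area(p1, p2):
    """Signed area of the axis-aligned rectangle with opposite corners p1, p2."""
    return (p2[0] - p1[0]) * (p2[1] - p1[1])


def lifted_point(p2):
    """3D point associated with a candidate second corner."""
    return (p2[0], p2[1], p2[0] * p2[1])


def query_vector(p1):
    """3D direction associated with a fixed first corner."""
    return (-p1[1], -p1[0], 1)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def best_partner(p1, points):
    """Brute-force extreme point query: the p2 maximizing the dot product."""
    q = query_vector(p1)
    return max(points, key=lambda p2: dot(q, lifted_point(p2)))


points = [(0, 0), (1, 5), (4, 2), (3, 3)]
for p1 in points:
    for p2 in points:
        # The dot product differs from the area only by the constant p1[0]*p1[1].
        assert area(p1, p2) == dot(query_vector(p1), lifted_point(p2)) + p1[0] * p1[1]
```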
Say you have a rectangle formed by four points: A (top left), B (top right), C (bottom right) and D (bottom left).
The idea is to find two points p1 and p2 that are the closest to B and D respectively. This means that p1 and p2 are the furthest possible from each other.
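    from math import dist  # Euclidean distance, Python 3.8+

    def nearest_point(origin, points):
        # Start from the first point so it is returned if nothing is closer.
        nearest = points[0]
        mindist = dist(origin, nearest)
        for p in points[1:]:
            d = dist(origin, p)
            if mindist > d:
                mindist = d
                nearest = p
        return nearest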
Call it for B and D as origins:
points = [...]
p1 = nearest_point(B, points) # one for loop
p2 = nearest_point(D, points) # one for loop
Note that there can be multiple closest points which are equally distant from the origin (B or D). In this case, nearest_point() should return an array of points, and you have to do two nested for loops over those arrays to find the two points furthest from each other.
Divide and conquer.
Note: This algorithm presumes that the rectangle is axis-aligned.
Step 1: Bucket the points into a grid of 4x4 buckets. Some buckets may be empty.
Step 2: Using the corners of the buckets, calculate the maximum areas between opposite corners of non-empty buckets. This may result in several pairs of buckets, because you work with corners, not points. Notice also that you use the left corners for left buckets, and likewise the bottom, right and top corners for the bottom, right and top buckets. That's why an even number is used for the size of the grid.
Step 3: Re-bucket each bucket selected in step 2 as a new, smaller, 4x4 grid.
Repeat steps 2 & 3 until you are left with a single pair of buckets, each containing only one point.
I have not calculated the complexity of this algorithm. It seems to be O(n log(n)).

Find minimum number of triangles enclosing all points in the point cloud

Input
You have a points list which represents a 2D point cloud.
Output
You have to generate a list of triangles (with as few triangles as possible) so that the following restrictions are fulfilled:
1. Each point from the cloud should be a vertex of a triangle or be inside a triangle.
2. Triangles can be built only on the points from the original point cloud.
3. Triangles should not intersect each other.
4. One point of the cloud can be a vertex of several triangles.
5. If a triangle vertex lies on the side of another triangle, we assume such triangles do not intersect.
6. If a point lies on the side of a triangle, we assume the point is inside the triangle.
For example
Investigation
I invented the way to find a convex hull of given set of points and divide that convex hull into triangles but it is not right solution.
Any guesses how to solve it?
Here is my opinion.
1. Create a Delaunay Triangulation of the point cloud.
2. Do a Mesh Simplification by Half Edge Collapse.
For step 1, the boundary of the triangulation will be the convex hull. You can also use a Constrained Delaunay Triangulation (CDT) if you need to honor a non-convex boundary.
For step 2, the half-edge collapse operation preserves existing vertices, so no new vertices will be added. Note that in your case the collapses are not removing vertices; they are only removing edges. Before applying an edge collapse you should check that you are not introducing triangle inversions (which produce self-intersection) and that no point ends up outside every triangle. The order of collapses matters, but you can follow the usual rule that measures the "cost" of a collapse in terms of introducing poor-quality triangles (i.e. triangles with very acute angles). So you should choose collapses that produce the most isometric (close to equilateral) triangles possible.
Edit:
The order of collapses guides the simplification to different results. It can be guided by criteria other than minimizing acute angles. I think the number of empty triangles can be minimized by preferring collapses that produce well-filled triangles over ones that produce mostly empty triangles. Still, all these criteria are heuristics.
Some musings about triangles and convex hulls
Ignoring any set with 2 or fewer points; 3 points always give 1 triangle.
Make a convex hull.
Select any random internal point.
unless all points are in hull ...
All points in the hull must be part of a triangle since, by definition of the convex hull, they can't be internal.
Now we have an upper bound of triangles, namely the number of points in the hull.
An upper bound is also number of points / 3 rounded up as you can make that many independent triangles.
so the upper bound is the minimum of the two above
We can also guess at the lower bound roundup(hull points / 3) as each 3 neighboring points can make a triangle and any surplus can reuse 1-2.
Now the difficult part reducing the upper bound
walk through the inner points using them as center for all triangles.
If any triangle is empty we can save a triangle by removing the hull edge.
if two or more adjacent triangles are empty we will have to keep every other triangle or join the 3 points to a new triangle, as the middle point can be left out.
note the best result.
Is this proof that no better result exists? No.
If there exists a triangle that envelops all remaining points, that would be better.
N = number of points
U = upper bound
L = lower bound
T = set of triangles
R = set of remaining points
A = set of all points
B = best solution
BestSolution(A)
    if A < 3 return NoSolution
    if A == 3 return A
    if not Sorted(A) // O(N)
        SortByX(A) // O(n lg n) or radix if possible O(N)
    H = ConvexHull(A)
    noneHull = A - H
    B = HullTriangles(H, noneHull) // removing empty triangles
    U = size B
    if noneHull == 0
        return U // make triangles of 3 successive points in H and add the remaining to the last
    if U > Roundup(N/3)
        U = Roundup(N/3)
        B = MakeIndepenTriangles(A)
    AddTriangle(empty, A)
    return // B is best solution, size B is number of triangles.

AddTriangle(T, R)
    if size T+1 >= U return // no reason to test if we just end up with another U solution
    ForEach r in R // O(N)
        ForEach p2 in A-r // O(N)
            ForEach p3 in A-r-p2 // O(N)
                t = Triangle(r, p2, p3)
                c = Candidate(t, T, R)
                if c < 0
                    return c+1 // found better solution
    return 0

Candidate(t, T, R)
    if not Overlap(t, T) // pt. 3, O(T), T < U
        left = R-t
        left -= ContainedPoints(t) // O(R) -> O(N)
        if left is empty
            u = U
            U = size T + 1
            B = T+t
            return U-u // found better solution
        return AddTriangle(T+t, left)
    return 0
So ... total runtime ...
Candidate O(N)
AddTriangle O(N^3)
recursion is limited to the current best solution U
O((N N^3)^U) -> O((N^4)^U)
space is O(U N)
So reducing U before we go to brute force is essential.
- Reducing R quickly should reduce recursion
- so starting with bigger and hopefully more enclosing triangles would be good
- any 3 points in the hull should make some good candidates
- these split the remaining points in 3 parts which can be investigated independently
- treat each part as a hull where its 2 base points are part of a triangle but the 3rd is not in the set.
- if possible make this a BFS so we can select the most enclosing first
- space might be a problem
- O(H U N)
- else start with points that are a 1/3 around the hull relative to each other first.
AddTriangle really sucks performance so how many triangles can we really make
Selecting 3 out of N is
N!/(N-3)!
And we don't care about order so
N!/(3!(N-3)!)
N!/(6(N-3)!)
N (N-1) (N-2) / 6
Which is still O(N^3) for the loops, but it makes us feel better. The loops might still be faster if the permutation takes too long.
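To make the count concrete, here is a tiny Python check using the standard library. The sample points and the helper name triangle_count are invented for illustration. itertools.combinations yields each unordered triple exactly once, which is the unordered selection that the LazySelectUnordered pseudocode relies on.

```python
import math
from itertools import combinations


def triangle_count(n):
    """Number of unordered triples among n points: N (N-1) (N-2) / 6."""
    return n * (n - 1) * (n - 2) // 6


points = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
triples = list(combinations(points, 3))  # each unordered triple appears once
print(len(triples), triangle_count(len(points)), math.comb(len(points), 3))
```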
AddTriangle(T, R)
    if size T+1 >= U return // no reason to test if we just end up with another U solution
    while t = LazySelectUnordered(3, R, A) // always select one from R first O(R (N-1)(N-2) / 6) aka O(N^3)
        c = Candidate(t, T, R)
        if c < 0
            return c+1 // found better solution
    return 0

Given a convex polygon, add a point and recalculate the area

Assume you have a convex polygon P (defined by an array of points p), and a set of points S (all of them outside of P). How do you choose a point s in S such that it increases the area of P the most?
Example
I have a O(|P|) formula to calculate the area of the polygon, but I can't do this for every point in S given that
3 ≤ |P|, |S| ≤ 10^5
The big dots are the points in S
No 3 points in P u S are collinear
Given fixed points p = (px, py), q = (qx, qy) and a variable point s = (sx, sy), the signed area of the triangle ∆pqs is
        | px py 1 |
    ½ · | qx qy 1 |
        | sx sy 1 | ,
which is a linear polynomial in sx, sy.
One approach is to compute cumulative sums of these polynomials where p, q are the edges in clockwise order. Use binary search to find the sublist of edges that remain in the convex hull with a given point s, add the polynomials, and evaluate for s.
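The cumulative-sum idea can be sketched as follows in Python. The names edge_poly, prefix_polys and range_area are hypothetical, and the binary search for the visible edge range is left out. Each edge (p, q) contributes a polynomial a*sx + b*sy + c obtained by expanding the determinant, and prefix sums let any contiguous range of edges be evaluated for a given s in O(1).

```python
def edge_poly(p, q):
    """Coefficients (a, b, c) with signed area of triangle pqs = a*sx + b*sy + c."""
    return (0.5 * (p[1] - q[1]),
            0.5 * (q[0] - p[0]),
            0.5 * (p[0] * q[1] - p[1] * q[0]))


def prefix_polys(polygon):
    """prefix[j] holds the coefficient sums over edges 0 .. j-1 of the polygon."""
    prefix = [(0.0, 0.0, 0.0)]
    n = len(polygon)
    for i in range(n):
        a, b, c = edge_poly(polygon[i], polygon[(i + 1) % n])
        pa, pb, pc = prefix[-1]
        prefix.append((pa + a, pb + b, pc + c))
    return prefix


def range_area(prefix, i, j, s):
    """Sum of signed areas of the triangles (p_k, p_k+1, s) for edges k in [i, j)."""
    a = prefix[j][0] - prefix[i][0]
    b = prefix[j][1] - prefix[i][1]
    c = prefix[j][2] - prefix[i][2]
    return a * s[0] + b * s[1] + c


square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # counter-clockwise unit square
prefix = prefix_polys(square)
print(range_area(prefix, 0, 4, (5, 7)))     # all edges: the polygon's own area
print(range_area(prefix, 0, 1, (0.5, -1)))  # bottom edge only, seen from below
```

Summing over all edges gives the polygon's area for any s, which is a handy check that the coefficients are right. With this counter-clockwise vertex order, the edges that face s are the ones whose triangle with s has negative signed area.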
You have a method to calculate the exact area that is added by a point n (and David Eisenstat posted another), but their complexity depends on the number of sides of the polygon. Ideally you'd have a method that can quickly approximate the additional area, and you'd only have to run the exact method for a limited number of points.
As Paul pointed out in a comment, such an approximation should give a result that is consistently larger than the real value; this way, if the approximation tells you that a point adds less area than the current maximum (and with randomly ordered input this will be true for a large majority of points), you can discard it without needing the exact method.
The simplest method would be one where you only measure the distance from each point to one point in the polygon; this could be done e.g. like this:
Start by calculating the area of the polygon, and then find the smallest circle that contains the whole polygon, with center point c and radius r.
Then for each point n, calculate the distance d from n to c, and approximate the additional area as:
the triangle with area r × (d - r)
plus the rectangle with area 2 × r² (pre-calculated)
plus the half circle with area r² × π / 2 (pre-calculated)
minus the area of the polygon (pre-calculated)
This area is indicated in blue on the image below, with the real additional area slightly darker and the excess area added by the approximation slightly lighter:
So for each point, you need to calculate a distance using √((xn - xc)² + (yn - yc)²) and then multiply this distance by a constant and add a constant.
Of course, the precision of this approximation depends on how irregular the shape of the polygon is; if it does not resemble a circle at all, you may be better off creating a larger simple polygon (like a triangle or rectangle) that contains the original polygon, and use the precise method on the larger polygon as an approximation.
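A rough Python sketch of the circle-based estimate, under stated assumptions: the rectangle term is read as 2r² and the half-circle term as πr²/2 (the usual area formulas), and the enclosing circle's center and radius are taken as given rather than computed. The name make_circle_bound is made up for this example.

```python
import math


def make_circle_bound(center, radius, polygon_area):
    """Return a function that over-estimates the area added by a point."""
    # Everything that does not depend on the point is computed once.
    constant = 2 * radius**2 + math.pi * radius**2 / 2 - polygon_area

    def bound(point):
        d = math.dist(point, center)          # distance from the point to c
        return radius * (d - radius) + constant

    return bound


bound = make_circle_bound(center=(0.0, 0.0), radius=1.0, polygon_area=2.0)
print(bound((3.0, 0.0)))  # triangle 1*(3-1), plus 2, plus pi/2, minus 2
```

In a filtering loop you would call the exact area function only for points whose bound exceeds the best exact value found so far.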
UPDATE
In a simple test where the polygon is a 1x1 square in the middle of a 100x100 square space, with 100,000 points randomly placed around it, the method described above reduces the number of calls to the precise measuring function from 100,000 to between 150 and 200, and between 10 and 20 of these calls result in a new maximum.
While writing the precise measuring function for the square I used in the test, I realised that using an axis-aligned rectangle instead of a circle around the polygon leads to a much simpler approximation method:
Create a rectangle around the polygon, with sides A and B and center point c, and calculate the areas of the rectangle and the polygon. Then, for each point n, the approximation of the additional area is the sum of:
the triangle with base A and height abs(yn - yc) - B/2
the triangle with base B and height abs(xn - xc) - A/2
the area of the rectangle minus the area of the polygon
(If the point is above, below or next to the rectangle, then one of the triangles has a height < 0, and only the other triangle is added.)
So the steps needed for the approximation are:
abs(xn - xc) × X + abs(yn - yc) × Y + Z
where X, Y and Z are constants, i.e. 2 subtractions, 2 additions, 2 multiplications and 2 absolute values. This is even simpler than the circle method, and a rectangle is also better suited for oblong polygons. The reduction in the number of calls to the precise measuring function should be similar to the test results mentioned above.
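The rectangle-based estimate might look like this in Python. It assumes side A is the horizontal extent (width) and side B the vertical extent (height), and it clamps negative triangle heights to zero, as described in the parenthetical remark. The name make_rect_bound is invented for this sketch.

```python
def make_rect_bound(center, width, height, polygon_area):
    """Return a function that over-estimates the area added by a point."""
    xc, yc = center
    slack = width * height - polygon_area    # rectangle area minus polygon area

    def bound(point):
        xn, yn = point
        # Triangle on the side of length A (width); height measured along y.
        tri_y = 0.5 * width * max(0.0, abs(yn - yc) - height / 2)
        # Triangle on the side of length B (height); height measured along x.
        tri_x = 0.5 * height * max(0.0, abs(xn - xc) - width / 2)
        return tri_y + tri_x + slack

    return bound


bound = make_rect_bound(center=(0.0, 0.0), width=2.0, height=4.0, polygon_area=5.0)
print(bound((3.0, 5.0)))  # both triangles contribute
print(bound((0.0, 5.0)))  # point directly above: only one triangle contributes
```

When both heights are positive this reduces to the linear form abs(xn - xc) × X + abs(yn - yc) × Y + Z with X = B/2, Y = A/2 and Z = A×B/2 minus the polygon area.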

Maximum minimum manhattan distance

Input:
A set of points
Coordinates are non-negative integer type.
Integer k
Output:
A point P(x, y) (in or not in the given set) whose Manhattan distance to the closest point of the set is maximal, with max(x, y) <= k
My (naive) solution:
For every (x, y) in the grid which contain given set
BFS to find closest point to (x, y)
...
return maximum;
But I feel it runs very slowly for a large grid. Please help me design a better algorithm (or the code / pseudocode) to solve this problem.
Instead of looping over every (x, y) in the grid, would it be enough to loop over every median x, y?
P.S: Sorry for my English
EDIT:
example:
Given P1(x1,y1), P2(x2,y2), P3(x3,y3). Find P(x,y) such that min{dist(P,P1), dist(P,P2),
dist(P,P3)} is maximal
Yes, you can do it better. I'm not sure if my solution is optimal, but it's better than yours.
Instead of doing separate BFS for every point in the grid. Do a 'cumulative' BFS from all the input points at once.
You start with 2-dimensional array dist[k][k] with cells initialized to +inf and zero if there is a point in the input for this cell, then from every point P in the input you try to go in every possible direction. The further you are from the start point the bigger integer you put in the array dist. If there is a value in dist for a specific cell, but you can get there with a smaller amount of steps (smaller integer) you overwrite it.
In the end, when no more moves can be done, you scan the array dist to find the cell with maximum value. This is your point.
I think this would work quite well in practice.
For k = 3, assuming 1 <= x,y <= k, P1 = (1,1), P2 = (1,3), P3 = (2,2)
dist would be equal in the beginning
0, +inf, +inf,
+inf, 0, +inf,
0, +inf, +inf,
in the next step it would be:
0, 1, +inf,
1, 0, 1,
0, 1, +inf,
and in the next step it would be:
0, 1, 2,
1, 0, 1,
0, 1, 2,
so the output is P = (3,1) or (3,3)
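The cumulative BFS could be sketched like this in Python. It follows the example's convention 1 <= x,y <= k, and the function name farthest_cell is an assumption made for this sketch. Because every source starts in the queue at distance 0, each cell is reached first along a shortest path, so no value ever needs to be overwritten.

```python
from collections import deque


def farthest_cell(points, k):
    """Multi-source BFS on the grid 1..k x 1..k; returns (distance, cell)."""
    INF = float("inf")
    dist = [[INF] * (k + 1) for _ in range(k + 1)]  # dist[x][y], index 0 unused
    queue = deque()
    for x, y in points:
        dist[x][y] = 0
        queue.append((x, y))
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 1 <= nx <= k and 1 <= ny <= k and dist[nx][ny] == INF:
                dist[nx][ny] = dist[x][y] + 1   # first visit is the shortest
                queue.append((nx, ny))
    # Scan the grid for the cell with the largest distance to its nearest point.
    return max((dist[x][y], (x, y))
               for x in range(1, k + 1) for y in range(1, k + 1))


d, cell = farthest_cell([(1, 1), (1, 3), (2, 2)], 3)
print(d, cell)
```

The whole run takes O(k²) time and memory, independent of the number of input points.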
If k is not too large and you need a point with integer coordinates, you should do as another answer suggested: calculate the minimum distances for all points on the grid using a BFS that starts from all given points at once.
A faster solution for large k, and probably the only one that can find a point with non-integer coordinates, is the following. It has a complexity of O(n log n log k).
Search for the resulting maximum distance by bisection. You have to check whether there is any point inside the square [0, k] X [0, k] that is at least a given distance away from all points in the given set. Suppose you can check that fast enough for any distance. Clearly, if such a point exists for some distance R, one also exists for every smaller distance r < R (the same point works, for example). Thus you can search for the maximum distance using binary search.
Now, how to quickly check for the existence of (and also find) a point that is at least r units away from all given points? Draw "Manhattan spheres of radius r" around all given points. These are the sets of points at most r units away from a given point. They are squares tilted by 45 degrees with a diagonal equal to 2r. Now turn the picture by 45 degrees, and all squares become axis-parallel. You can then check for the existence of any point outside these squares using a sweep-line algorithm. Sort all vertical edges of the squares and process them one by one from left to right. Left borders add a segment mark to the sweep line, and right borders erase it. You have to check whether there is any unmarked point on the line, which you can implement using a segment tree. Then you have to check whether any unmarked point on the line lies inside the initial square [0,k]X[0,k].
So, again, the overall solution is a binary search for r. Inside it you have to check whether there is any point at least r units away from all given points. Do that by constructing the "Manhattan spheres of radius r" and then scanning them with a diagonal line from the top-left corner to the bottom-right. While moving the line, store the number of open spheres at each point of the line in the segment tree. Between the opening and closing of any spheres the line does not change, and if there is any free point on it, you have found a point for distance r.
The binary search contributes log k to the complexity. Each check is n log n for sorting the borders of the squares, and n log k (n log n?) for processing them all.
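The outer binary search can be illustrated with the Python sketch below. To keep it short, the sweep-line and segment-tree check is replaced by a brute-force scan over integer cells, so this shows the monotonicity argument only, not the O(n log n log k) bound. All function names are made up, and the given points are assumed to lie inside the grid.

```python
def min_manhattan(cell, points):
    """Manhattan distance from cell to the closest given point."""
    return min(abs(cell[0] - x) + abs(cell[1] - y) for x, y in points)


def find_cell_at_least(r, points, k):
    """Brute-force stand-in for the sweep-line check: a cell at distance >= r, or None."""
    for x in range(k + 1):
        for y in range(k + 1):
            if min_manhattan((x, y), points) >= r:
                return (x, y)
    return None


def max_min_distance(points, k):
    """Binary search for the largest r for which a feasible cell exists."""
    lo, hi = 0, 2 * k            # r = 0 always works; 2k is the grid diameter
    best = find_cell_at_least(0, points, k)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        cell = find_cell_at_least(mid, points, k)
        if cell is not None:
            lo, best = mid, cell     # feasible: try a larger distance
        else:
            hi = mid - 1             # infeasible: every larger r fails too
    return lo, best


print(max_min_distance([(1, 1), (1, 3), (2, 2)], 3))
```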
Voronoi diagram would be another fast solution and could also find non integer answer. But it is much much harder to implement even for Manhattan measure.
First try
We can turn a 2D problem into a 1D problem by projecting onto the lines y=x and y=-x. If the points are (x1,y1) and (x2,y2) then the Manhattan distance is abs(x1-x2)+abs(y1-y2). Change coordinates to a u-v system with basis U = (1,1), V = (1,-1). Up to a constant scale factor, the coordinates of the two points in this basis are u1 = x1-y1, v1 = x1+y1, u2 = x2-y2, v2 = x2+y2, and the Manhattan distance is the larger of abs(u1-u2) and abs(v1-v2).
How this helps: we can just work with the 1D u-values of the points. Sort by u-value, loop through the points and find the largest difference between pairs of points. Do the same for the v-values.
Calculating the u,v coordinates is O(n), quick sorting is O(n log n), and looping through the sorted list is O(n).
Alas, this does not work well. It fails if we have the points (-10,0), (10,0), (0,-10), (0,10). Let's try a
Voronoi diagram
Construct a Voronoi diagram using Manhattan distance. This can be calculated in O(n log n) using https://en.wikipedia.org/wiki/Fortune%27s_algorithm
The vertices in the diagram are points which have maximum distance from their nearest sites. There is pseudo-code for the algorithm on the Wikipedia page. You might need to adapt this for Manhattan distance.

Minimize maximum manhattan distance of a point to a set of points

For 3 points in 2D :
P1(x1,y1),
P2(x2,y2),
P3(x3,y3)
I need to find a point P(x,y), such that the maximum of the manhattan distances
max(dist(P,P1),
dist(P,P2),
dist(P,P3))
will be minimal.
Any ideas about the algorithm?
I would really prefer an exact algorithm.
There is an exact, noniterative algorithm for the problem; as Knoothe pointed out, the Manhattan distance is rotationally equivalent to the Chebyshev distance, and P is trivially computable for the Chebyshev distance as the mean of the extreme coordinates.
The points reachable from P within the Manhattan distance x form a diamond around P. Therefore, we need to find the minimum diamond that encloses all points, and its center will be P.
If we rotate the coordinate system by 45 degrees, the diamond is a square. Therefore, the problem can be reduced to finding the smallest enclosing square of the points.
The center of a smallest enclosing square can be found as the center of the smallest enclosing rectangle (which is trivially computed as the max and min of the coordinates). There is an infinite number of smallest enclosing squares, since you can shift the center along the shorter edge of the minimum rectangle and still have a minimal enclosing square. For our purposes, we can simply use the one whose center coincides with the enclosing rectangle.
So, in algorithmic form:
1. Rotate and scale the coordinate system by assigning x' = x/sqrt(2) - y/sqrt(2), y' = x/sqrt(2) + y/sqrt(2)
2. Compute x'_c = (max(x'_i) + min(x'_i))/2, y'_c = (max(y'_i) + min(y'_i))/2
3. Rotate back with x_c = x'_c/sqrt(2) + y'_c/sqrt(2), y_c = - x'_c/sqrt(2) + y'_c/sqrt(2)
Then x_c and y_c give the coordinates of P.
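A direct Python transcription of these steps, as a sketch: the function names minimax_manhattan_center and max_manhattan are invented here, and it works for any number of points, not just three.

```python
import math


def minimax_manhattan_center(points):
    """Center of a smallest enclosing diamond, via a 45 degree rotation."""
    s = math.sqrt(2)
    # Rotate: x' = x/sqrt(2) - y/sqrt(2), y' = x/sqrt(2) + y/sqrt(2)
    xr = [x / s - y / s for x, y in points]
    yr = [x / s + y / s for x, y in points]
    # Center of the axis-aligned bounding rectangle in the rotated system.
    xc_r = (max(xr) + min(xr)) / 2
    yc_r = (max(yr) + min(yr)) / 2
    # Rotate back to the original coordinates.
    return xc_r / s + yc_r / s, -xc_r / s + yc_r / s


def max_manhattan(p, points):
    """Largest Manhattan distance from p to any of the points."""
    return max(abs(p[0] - x) + abs(p[1] - y) for x, y in points)


pts = [(0, 0), (0, 20), (10, 10)]
center = minimax_manhattan_center(pts)
print(center, max_manhattan(center, pts))
```

For these three points the center comes out at approximately (0, 10), where all three Manhattan distances are equal to 10.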
If an approximate solution is okay, you could try a simple optimization algorithm. Here's an example, in Python:
    import random

    def opt(*points):
        # Random local search: keep the best point found so far.
        best, dist = (0, 0), 99999999
        for i in range(10000):
            # Perturb the current best point with Gaussian noise.
            new = best[0] + random.gauss(0, .5), best[1] + random.gauss(0, .5)
            dist_new = max(abs(new[0] - qx) + abs(new[1] - qy) for qx, qy in points)
            if dist_new < dist:
                best, dist = new, dist_new
                print(new, dist_new)
        return best, dist
Explanation: We start with the point (0, 0), or any other random point, and modify it a few thousand times, each time keeping the better of the new and the previously best point. Gradually, this will approximate the optimum.
Note that simply picking the mean or median of the three points, or solving for x and y independently does not work when minimizing the maximum manhattan distance. Counter-example: Consider the points (0,0), (0,20) and (10,10), or (0,0), (0,1) and (0,100). If we pick the mean of the most separated points, this would yield (10,5) for the first example, and if we take the median this would be (0,1) for the second example, which both have a higher maximum manhattan distance than the optimum.
Update: Looks like solving for x and y independently and taking the mean of the most distant points does in fact work, provided that one does some pre- and postprocessing, as pointed out by thiton.

Resources