Find the point minimizing the distance from a set of N lines - algorithm

Given multiple (N) lines in 3D space, find the point minimizing the total distance to all the lines.
Given that the shortest distance between a line [aX + b] and a point [P] lies along the segment from [P] perpendicular to the line, I can express the minimal squared distance as the sum of squared line distances, e.g. ([aX+b]1 - [P])^2 + ... + ([aX+b]N - [P])^2.
Since these connecting segments are perpendicular to the lines, I can use the dot product to express [P] in terms of each line.
I have considered using least squares to estimate the point minimizing the distance. The problem is that standard least squares approximates the best-fitting line/curve for a given set of points; what I need is the opposite: given a set of lines, estimate the best-fitting point.
How should this be approached?

From Wikipedia, we read that the squared distance between the line a'x + b = 0 and a point p is (a'p+b)^2 / (a'a). We can therefore see that the point that minimizes the sum of squared distances is the solution of a weighted linear regression problem with one observation per line. The regression model has the following properties:
Sample data a for each line ax+b=0
Sample outcome -b for each line ax+b=0
Sample weight 1/(a'a) for each line ax+b=0
You should be able to solve this problem with any standard statistical software.
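As a sketch of this in the 2-D case (using Python/NumPy rather than statistical software; the three example lines are made up), the weighted least-squares problem can be solved directly through its normal equations:

```python
import numpy as np

# Hypothetical example: each 2-D line is given as (a, b) with a'x + b = 0.
lines = [
    (np.array([1.0, 0.0]), -1.0),  # the line x = 1
    (np.array([0.0, 1.0]), -2.0),  # the line y = 2
    (np.array([1.0, 1.0]), -4.0),  # the line x + y = 4
]

# Weighted normal equations: (sum_i w_i a_i a_i') p = -sum_i w_i b_i a_i,
# with weights w_i = 1/(a_i' a_i) as described above.
A = np.zeros((2, 2))
rhs = np.zeros(2)
for a, b in lines:
    w = 1.0 / (a @ a)
    A += w * np.outer(a, a)
    rhs -= w * b * a
p = np.linalg.solve(A, rhs)  # point minimizing the sum of squared distances
```

The same construction works in any dimension, though note that in 3-D an equation a'x + b = 0 describes a plane rather than a line.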

An approach:
- form the equations giving the distance from the point to each line
- these equations give you N distances
- optimize the set of distances by the criterion you want (least squares, minimax, etc.)
This reduces to a simple optimization problem once you have the N equations. Of course, the difficulty of the last step depends heavily on the criterion you choose (least squares is simple, minimax not so simple).
One thing that might help you forward is to find the simplest form of equation giving the distance from a point to a line. Your thinking is correct in your #1, but you will need to think a bit more (or check "distance from a point to a line" with any search engine).

I have solved the same problem using hill climbing. Consider a single point and its 26 neighbours one step away from it (points on a cube centered at the current point). If the distance from the current point is better than the distance from all neighbours, divide the step by 2; otherwise make the neighbour with the best distance the new current point. Continue until the step is small enough.
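A sketch of this hill-climbing scheme in Python (the two example lines, given as point plus direction, the starting point, and the stopping tolerance are all made up):

```python
import numpy as np
from itertools import product

# Hypothetical 3-D lines, each given as (point_on_line, direction).
lines = [
    (np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])),  # the x-axis
    (np.array([0.0, 1.0, 0.0]), np.array([0.0, 0.0, 1.0])),  # x=0, y=1, parallel to z
]

def total_dist(p):
    # Sum of point-to-line distances: |d x (p - p0)| / |d| for each line.
    return sum(np.linalg.norm(np.cross(d, p - p0)) / np.linalg.norm(d)
               for p0, d in lines)

p = np.array([3.0, -2.0, 5.0])  # arbitrary starting point
step = 1.0
while step > 1e-6:
    # The 26 neighbours: all offsets in {-step, 0, +step}^3 except staying put.
    neighbours = [p + step * np.array(off)
                  for off in product((-1.0, 0.0, 1.0), repeat=3)
                  if off != (0.0, 0.0, 0.0)]
    best = min(neighbours, key=total_dist)
    if total_dist(best) < total_dist(p):
        p = best       # move to the best neighbour
    else:
        step /= 2      # no neighbour improves: refine the step
```

Since the objective (a sum of distances) is convex, this pattern search settles near a global minimizer; for these two example lines the optimal value is 1, the distance between the lines.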

Following is a solution using calculus:
F(x,y) = sum((y - mi*x - ci)^2 / (1 + mi^2))
Using partial differentiation:
dF(x,y)/dx = sum(-2*mi*(y - mi*x - ci) / (1 + mi^2))
dF(x,y)/dy = sum(2*(y - mi*x - ci) / (1 + mi^2))
To minimize F(x,y), set:
dF(x,y)/dx = dF(x,y)/dy = 0
Use gradient descent with a suitable learning rate (and random restarts if desired) to find the minimum.
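A minimal gradient-descent sketch of the above (the example lines y = m*x + c are made up; since F is a convex quadratic here, a single start with a fixed learning rate suffices and restarts are not actually needed):

```python
import numpy as np

# Hypothetical 2-D lines y = m*x + c, given as (m, c) pairs.
lines = [(1.0, 0.0), (-1.0, 4.0), (0.0, 1.0)]

def grad(x, y):
    # Partial derivatives of F(x,y) = sum((y - m*x - c)^2 / (1 + m^2)).
    gx = sum(-2.0 * m * (y - m * x - c) / (1.0 + m * m) for m, c in lines)
    gy = sum( 2.0 * (y - m * x - c) / (1.0 + m * m) for m, c in lines)
    return np.array([gx, gy])

p = np.zeros(2)   # starting guess
lr = 0.1          # learning rate
for _ in range(5000):
    p -= lr * grad(*p)
```

For these three lines the stationarity conditions give x = 2, y = 1.5, which the iteration converges to.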

You can apply the following answer (which talks about finding the point that is closest to a set of planes) to this problem, since just as a plane can be defined by a point on the plane and a normal to the plane, a line can be defined by a point the line passes through and a "normal" vector orthogonal to the line:
https://math.stackexchange.com/a/3483313/365886
You can solve the resulting quadratic form by observing that the minimizer of 1/2 x^T A x - b^T x + c is x_min = A^{-1} b.
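A sketch of that construction for lines given as a point plus a unit direction (the three example lines are made up): the squared distance from x to a line is (x - p)' (I - d d') (x - p), so summing over lines gives a quadratic form whose minimizer solves a small linear system.

```python
import numpy as np

# Hypothetical 3-D lines, each as (point_on_line, unit_direction).
lines = [
    (np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])),
    (np.array([0.0, 1.0, 0.0]), np.array([0.0, 0.0, 1.0])),
    (np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0])),
]

# The sum of squared distances is (up to scaling) 1/2 x' A x - b' x + c with
# A = sum(I - d d') and b = sum((I - d d') p0), minimized at x = A^-1 b.
A = np.zeros((3, 3))
b = np.zeros(3)
for p0, d in lines:
    M = np.eye(3) - np.outer(d, d)  # projector onto the line's normal space
    A += M
    b += M @ p0
x_min = np.linalg.solve(A, b)
```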

Related

Build a linear approximation for an unknown function

I have an unknown function f(x), and I am using MATLAB to calculate 2000 points on the function's graph. I need a piecewise linear function g with 20 to 30 segments that fits the original function best. How could I do this in an acceptable way? The solution space is impossible to traverse exhaustively, and I can't think of a good heuristic to shrink it effectively.
Here is the code from which the function is derived:
x = sym('x', 'real');
inventory = sym('inventory', 'real');
demand = sym('demand', 'real');
f1 = 1/(sqrt(2*pi))*(-x)*exp(-(x - (demand - inventory)).^2./2);
f2 = 20/(sqrt(2*pi))*(x)*exp(-(x - (demand - inventory)).^2./2);
expectation_expression = int(f1, x, -inf, 0) + int(f2, x, 0, inf);
Depending on what your idea of a good approximation is, there may be a dynamic programming solution for this.
For example, given 2000 points and corresponding values, we wish to find the piecewise linear approximation with 20 segments which minimizes the sum of squared deviations between the true value at each point and the result of the linear approximation.
Work along the 2000 points from left to right, and at each point calculate for i=1 to 20 the total error from the far left to that point for the best piecewise linear approximation using i segments.
You can work out the values at position n+1 using the values calculated for points to its left - points 1..n. For each value of i, consider all points to its left - say points j < n+1. Work out the error contributions resulting from a linear segment running from point j to point n+1. Add to that the value you have worked out for the best possible error using i-1 segments at point j (or possibly point j-1, depending on exactly how you define your piecewise linear approximation). If you now take the minimum such value over all possible j, you have calculated the error of the best possible piecewise linear approximation using i segments for the first n+1 points.
When you have worked out the best value for the first 2000 points using 20 segments you have solved the problem, and you can work back along this table to find out where the segments are - or, if this is inconvenient, you can save extra information as you go along to make this easier.
I believe similar approaches will minimize the sum of absolute deviations, or minimize the maximum deviation at any point, subject to you being able to solve the corresponding problems for a single line. I have implicitly assumed you can fit a straight line to minimize the sum of squared errors, which is of course a standard least-squares line fit. Minimizing the absolute deviations from a straight line is an exercise in convex optimization, which I would attempt by iteratively reweighted least squares. Minimizing the maximum absolute deviation is linear programming.
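A minimal sketch of this dynamic program for the sum-of-squared-deviations criterion (np.polyfit stands in for the single-segment least-squares fit; building the full table of segment errors naively is O(n^3), so for 2000 points you would want to compute segment errors incrementally with prefix sums):

```python
import numpy as np

def fit_error(x, y):
    # Sum of squared residuals of the least-squares line through (x, y).
    if len(x) < 3:
        return 0.0  # two or fewer points are fitted exactly
    coeffs = np.polyfit(x, y, 1)
    r = y - np.polyval(coeffs, x)
    return float(r @ r)

def segmented_least_squares(x, y, k):
    # dp[i][m]: best total error for points 0..m using exactly i segments.
    n = len(x)
    err = [[fit_error(x[j:m + 1], y[j:m + 1]) for m in range(n)]
           for j in range(n)]
    INF = float('inf')
    dp = [[INF] * n for _ in range(k + 1)]
    dp[1] = err[0][:]  # one segment covering a whole prefix
    for i in range(2, k + 1):
        for m in range(n):
            for j in range(1, m + 1):
                # last segment covers points j..m, the rest uses i-1 segments
                dp[i][m] = min(dp[i][m], dp[i - 1][j - 1] + err[j][m])
    return dp[k][n - 1]
```

Recording the minimizing j for each (i, m) lets you walk back through the table to recover the segment boundaries, as the answer describes.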

Count acute triangles

You're given the coordinates (x,y) of N different points in the plane. Count the acute triangles formed by the given points. Brute force is simple, but its complexity is the binomial coefficient C(n,3). Does anyone have a faster idea?
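For reference, the simple O(C(n,3)) brute force the question mentions can be written as follows: a triangle is acute iff at every vertex the dot product of the two edge vectors is strictly positive.

```python
from itertools import combinations

def is_acute(a, b, c):
    # Check the angle at each vertex via the dot product of its edge vectors.
    pts = (a, b, c)
    for i in range(3):
        p, q, r = pts[i], pts[(i + 1) % 3], pts[(i + 2) % 3]
        dot = (q[0] - p[0]) * (r[0] - p[0]) + (q[1] - p[1]) * (r[1] - p[1])
        if dot <= 0:  # right or obtuse angle (or a degenerate triple)
            return False
    return True

def count_acute(points):
    return sum(1 for tri in combinations(points, 3) if is_acute(*tri))
```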
Here is my solution:
Start with one point p1. Now, find the slopes of lines formed with the current point and all other points. Sort the points accordingly.
Consider the first point a1 in this array. Let the slope be denoted by m. Now, find the slope of the line that is perpendicular to this line.
m_p = tan(90° + arctan(m))
Perform a binary search for m_p in the array and take the index of that slope that is less than or equal to m_p. This gives the count of tuples which form acute angle in which two of the points are p1 and a1. Now, consider the next point in the array and do the same operation.
Repeat the above procedure for each and every point.
Time Complexity Analysis:
For sorting: O(N log N)
For binary search: O(log N); repeating this for every point in the array takes O(N log N)
Repeating the above steps for each point takes:
O(N * N log N) = O(N^2 log N)
EDIT:
I had assumed tan is an increasing function on the range (0, 2*PI), which it is not. It's better to find the angle each line makes with the positive x-axis and then sort the points with respect to these values. Now, for each point pi, consider the number of points between angles ai and ai+90°. For these points, the angle made at the main point is always acute.
This will cover all the cases no matter what the angle is.
EDIT 2:
My solution is only half-correct. It only ensures that the angle at the main point is acute, but doesn't guarantee anything about the other two points.
What you need to do in addition to the above procedure is form another set of points (xi,yi) for each point, where xi denotes the angle of the line through the main point and the current point with the x-axis, and yi denotes the distance between the main point and the current one.
Construct a kd-tree with this new set of points. Now for each point ai, search in the kd-tree for those points whose angle lies between mi and mi+90°, and whose distance lies between 0 and the distance between ai and the main point.
This additional constraint forces the other two angles to be acute. I leave the details as an exercise.
Now the time complexity is O(N^2 log N) in the average case and O(N^3) in the worst case (because of the kd-tree we are using).

Parabola fitting with two given points and a cost function

Suppose that there is a parabola Y = aX^2 + bX + c, and it might be rotated as follows:
X = x*sin(phi) + y*cos(phi)
Y = x*cos(phi) - y*sin(phi)
phi = rotation angle
We wish to fit it to a border (e.g. the inner border of an eyelid, figure below). The problem is how we can change the parabola in each iteration such that it minimizes a cost function. We know that the parabola can be at different rotations and its origin may vary within the search region. Note that there are two given points which the fitted parabola should pass through (e.g. the white squares in the figure below). So, in each iteration we can compute a, b and c from the two given points and the origin point (three equations and three variables).
The question is how we can reach the goal in the minimum number of iterations (i.e. without testing all possibilities: all angles and all positions in the search region).
Any idea will be appreciated.
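The coefficient step mentioned above (computing a, b, c from three known points on the parabola) is just a 3x3 linear system. A sketch with made-up points, assuming the points are already expressed in the parabola's rotated frame:

```python
import numpy as np

# Hypothetical: three points (X, Y) the parabola must pass through.
pts = [(-1.0, 2.0), (0.0, 1.0), (2.0, 7.0)]

# Y = a*X^2 + b*X + c gives one linear equation per point.
M = np.array([[X * X, X, 1.0] for X, Y in pts])
rhs = np.array([Y for X, Y in pts])
a, b, c = np.linalg.solve(M, rhs)
```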
#woodchips: I think this is a programming problem, and he asked for a solution for the implementation. I definitely disagree with you.
A possible solution would be to first search along the vertical line that is orthogonal to the line between the two given points. You can also vary the angle in this interval. Given the nature of your problem (the border of an eyelid), you can limit the angle variation to between -pi/4 and pi/4. After you find the minimum cost for a position on this vertical line, you can search along the horizontal line and do similar tasks.
Why not use regression to fit a parabola to several points in the target shape? Then you could use whichever algorithm you wanted to get an approximate solution. Newton's method converges pretty fast. The optimization here is on the coefficients of the approximating parabolas.

How to find the point most distant from a given set and its bounding box

I have a bounding box, and a number of points inside of it. I'd like to add another point whose location is farthest away from any previously-added points, as well as far away from the edges of the box.
Is there a common solution for this sort of thing? Thanks!
Here is a little Mathematica program.
Although it is only two lines of code (!) you'll probably need more in a conventional language, as well as a math library able to find maximum of functions.
I assume you are not fluent in Mathematica, so I'll explain and comment line by line.
First we create a table with 10 random points in {0,1}x{0,1}, and name it p.
p = Table[{RandomReal[], RandomReal[]}, {10}];
Now we create a function to maximize:
f[x_, y_] = Min[ x^2,
y^2,
(1 - x)^2,
(1 - y)^2,
((x - #[[1]])^2 + (y - #[[2]])^2) & /@ p];
Ha! Syntax got tricky! Let's explain:
The function gives you, for any point in {0,1}x{0,1}, the minimum distance from that point to our set p AND the edges. The first four terms are the distances to the edges, and the last (difficult to read, I know) is a set containing the distances to all points.
What we will do next is maximize this function, so we will get THE point where the minimum distance to our targets is maximal.
But first let's take a look at f[]. If you look at it critically, you'll see that it is not really the distance, but the distance squared. I defined it so because that way the function is much easier to maximize, and the maximizing point is the same.
Also note that f[] is not a "pretty" function. If we plot it over {0,1}x{0,1}, we get a bumpy surface full of ridges where the nearest target changes (plot not shown here).
That's why you will need a nice math package to find the maximum.
Mathematica is such a nice package, that we can maximize the thing straightforward:
max = Maximize[{f[x, y], {0 <= x <= 1, 0 <= y <= 1}}, {x, y}];
And that is it. The Maximize function returns the maximal value (the squared distance to the nearest border/point) together with the point attaining it.
HTH! If you need help translating to another language, leave a comment.
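For readers without Mathematica, here is a rough Python equivalent (random points as in the example; since f is non-smooth, this sketch simply maximizes over a dense grid instead of calling a symbolic optimizer):

```python
import random

random.seed(0)
p = [(random.random(), random.random()) for _ in range(10)]

def f(x, y):
    # Squared distance from (x, y) to the nearest box edge or point of p.
    return min([x * x, y * y, (1 - x) ** 2, (1 - y) ** 2] +
               [(x - a) ** 2 + (y - b) ** 2 for a, b in p])

# Dense grid search over {0,1}x{0,1}; a finer grid gives a better approximation.
grid = [i / 200 for i in range(201)]
best = max(((x, y) for x in grid for y in grid), key=lambda q: f(*q))
```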
Edit
Although I'm not a C# person, after looking for references on SO and googling, I came to this:
One candidate package is DotNumerics
You should follow the following example provided in the package:
file: \DotNumerics Samples\Samples\Optimization.cs
Example header:
[Category("Constrained Minimization")]
[Title("Simplex method")]
[Description("The Nelder-Mead Simplex method. ")]
public void OptimizationSimplexConstrained()
HTH!
The name of the problem you're solving is the largest empty sphere problem.
It can easily be solved in O(n^4) time in the plane. Just consider all O(n^3) triples of points and compute their circumcenter. One of these points is your desired point. (Well, in your case, you also have to consider "a side" as one of your three points, so you not only find circumcenters but slightly more general points, like ones equidistant from two points and a side.)
As the Wikipedia link above indicates, the problem can also be solved in O(n log n) time by computing a Voronoi diagram. More specifically, then your desired point is the circumcenter of one of the triangles in the Delaunay triangulation of your points (which is the dual of the Voronoi diagram), of which there are only O(n). (Again, to adapt exactly to your problem, you'll have to consider the effects of the sides of the box.)
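The circumcenter computation that the O(n^4) brute force needs can be sketched with the standard determinant formula (the test points below are made up):

```python
def circumcenter(a, b, c):
    # Center of the circle through a, b, c (assumes non-collinear points).
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax * ax + ay * ay) * (by - cy) + (bx * bx + by * by) * (cy - ay)
          + (cx * cx + cy * cy) * (ay - by)) / d
    uy = ((ax * ax + ay * ay) * (cx - bx) + (bx * bx + by * by) * (ax - cx)
          + (cx * cx + cy * cy) * (bx - ax)) / d
    return ux, uy
```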

Minimum area triangle from a given set of points

Given a set of n points, can we find three points that describe a triangle with minimum area in O(n^2)? If yes, how, and if not, can we do better than O(n^3)?
I have found some papers that state that this problem is at least as hard as the problem that asks to find three collinear points (a triangle with area 0). These papers describe an O(n^2) solution to this problem by reducing it to an instance of the 3-sum problem. I couldn't find any solution for what I'm interested in however. See this (look for General Position) for such a paper and more information on 3-sum.
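For comparison, the O(n^3) brute force the question wants to beat can be sketched as follows (the cross product gives twice the signed area; zero-area, i.e. collinear, triples are skipped):

```python
from itertools import combinations

def min_area_triangle(points):
    # Try every triple; area = |cross product of two edge vectors| / 2.
    best = float('inf')
    for (x1, y1), (x2, y2), (x3, y3) in combinations(points, 3):
        area = abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2.0
        if 0.0 < area < best:  # skip collinear (zero-area) triples
            best = area
    return best
```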
There are O(n^2) algorithms for finding the minimum area triangle.
For instance you can find one here: http://www.cs.tufts.edu/comp/163/fall09/CG-lecture9-LA.pdf
If I understood that pdf correctly, the basic idea is as follows:
1. For each pair of points AB, find the point that is closest to it.
2. Construct a dual of the points so that lines <-> points: the line y = mx + c is mapped to the point (m,c).
3. In the dual, for a given point (which corresponds to a segment in the original set of points), the vertically nearest line gives us the required point for step 1.
Apparently steps 2 & 3 can be done in O(n^2) time.
Also I doubt the papers showed 3SUM-hardness by reducing to 3SUM. It should be the other way round.
There's an algorithm that finds the required area with complexity O(n^2*log(n)).
For each point Pi in the set do the following (without loss of generality we can assume that Pi is at the origin, or translate the points to make it so).
Then for each pair of points (x1,y1), (x2,y2) the triangle area will be 0.5*|x1*y2 - x2*y1|, so we need to minimize that value. Instead of iterating through all pairs of remaining points (which gives us O(N^3) complexity), we sort those points using the predicate x1*y2 < x2*y1. It is claimed that to find the triangle with minimal area we only need to check pairs of adjacent points in the sorted array.
So the complexity of this procedure for each point is O(n log n), and the whole algorithm works in O(n^2 log n).
P.S. I can't quickly find the proof that this algorithm is correct :( I hope to find it later and post it then.
The problem
Given a set of n points, can we find three points that describe a triangle with minimum area in O(n^2)? If yes, how, and if not, can we do better than O(n^3)?
is better resolved in this paper: James King, A Survey of 3sum-Hard Problems, 2004
