Parabola fitting with two given points and a cost function - algorithm

Suppose that there is a parabola Y = aX^2 + bX + c, and it might be rotated as follow:
X = x.sin(phi) + y.cos(phi)
Y = x.cos(phi) - y.sin(phi)
phi = rotation angle
We wish to fit it on a border (e.g. inner border of an eyelid, figure below). The problem is that how we can change the parabola in each iteration such that it minimizes a cost function. We know that the parabola can be in different rotation and its origin may vary in the search region. Note that the there are two given points which the fitted parabola should passes through them (e.g. the white squares in fig below). So, In each iteration we can compute a, b and c by the two given points and the origin point (three equations and three variables).
The question is how we can reach the goal in minimum iteration (not to test all the possibilities, i.e. all angles and all positions in the search region).
Any idea will be appreciated.

#woodchips: I think this is a programming problem, and he asked a solution for the implementation.I definitely disagree with you.
A possible solution would be to first search along the vertical line which is orthogonal to the line between the two given points. And also you can vary the angle in this interval. As the nature of your problem (the border of eyelid), you can limit the angle variation between -pi/4 and pi/4. After you find the minimum cost for a position in this vertical line, you can search along the horizontal line and do similar tasks.

Why not use regression to fit a parabola to several points in the target shape? Then you could use which ever algorithm you wanted to get an approximate solution. Newton's method converges pretty fast. The optimization here is on the coefficients in the approximate parabolas.

Related

How to find the farthest x, y coordinates from many points?

I have a set of random points in a plane and I want to put another point at the most "sparse" position.
For example, if there are some points in 0 < x < 10 and 0 < y < 10:
# this python snippet just generates the plot blow.
import matplotlib.pyplot as plt
# there are actually a lot more, ~10000 points.
xs = [8.36, 1.14, 0.93, 8.55, 7.49, 6.55, 5.13, 8.49, 0.15, 3.48]
ys = [0.65, 6.32, 2.04, 0.51, 4.5, 7.05, 1.07, 5.23, 0.66, 2.54]
plt.xlim([0, 10])
plt.ylim([0, 10])
plt.plot(xs, ys, 'o')
plt.show()
Where should I put a new point in this plane so that the new point becomes the most distant from the others? Please note that I want to maximise the minimum distance to another point, but not to maximise the average distance to all other points (Thanks to user985366's comment).
"How can I find the farthest point from a set of existing points?" is the one at least I could find, but I'm not sure if the page directly solves my situation (actually the linked case looks more complicated than my case).
[edit] By the way, I noticed that general constrained global optimization can find a possible solution (if I add a point at each corner) [4.01, 5.48] in this case, but I think it doesn't work if there are a lot more, say ~10000 points.
Your problem can be solved by computing the Voronoi diagram of the set of points. This is a division of the plane into regions such that there is one region per point in the original set, and within that region, the corresponding point is closer than other points from the set.
The boundaries of these regions are straight lines such that any point on that line is equidistant from the two points corresponding to the regions which meet at that boundary. The vertices where multiple boundaries meet are therefore equidistant to at least three points from the original set.
The sparsest point in the plane is either a vertex in the Voronoi diagram, or the intersection of an edge in the Voronoi diagram with the boundary of the plane, or one of the corners of the plane. The Voronoi diagram can be computed by standard algorithms in O(n log n) time; after this, the sparsest point can then be found in linear time, since you know which Voronoi regions each vertex/edge is adjacent to, and hence which point from the original set to measure the distance to.

Find the diameter of a set of n points in d-dimensional space

I am interesting in finding the diameter of two points sets, in 128 dimensions. The first has 10000 points and the second 1000000. For that reason I would like to do something better than the naive approach which takes O(n²). The algorithm will be able to handle any number of points and dimensions, but I am currently very interested in these two particular data sets.
I am very interesting in gaining speed over accuracy, thus, based on this, I would find the (approximate) bounding box of the point set, by computing the min and max value per coordinate, thus O(n*d) time. Then, if I find the diameter of this box, the problem is solved.
In the 3d case, I could find the diameter of the one side, since I know the two edges and then, I could apply the Pythagorean theorem on the other, which is vertical to this side. I am not sure for this however and for sure, I can't see how to generalize it to d dimensions.
An interesting answer can be found here, but it seems to be specific for 3 dimensions and I want a method for d dimensions.
Interesting paper: On computing the diameter of a point set in high dimensional Euclidean space. Link. However, implementing the algorithm seems too much for me in this phase.
The classic 2-approximation algorithm for this problem, with running time O(nd), is to choose an arbitrary point and then return the maximum distance to another point. The diameter is no smaller than this value and no larger than twice this value.
I would like to add a comment, but not enough reputation for that...
I just want to warn other readers that the "bounding box" solution is very inaccurate. Take for example the Euclidean ball of radius one. This set has diameter two, but its bounding box is [-1, 1]^d, which has diameter twice the square root of d. For d = 128, this is already a very bad approximation.
For a crude estimate, I would stay with David Eisenstat's answer.
There is a precision based algorithm which performs very well on any dimension, which is based on computing the dimension of an axial bounding box.
The idea is that it's possible to find the lower and upper boundaries of the axis bounding box length function since it's partial derivatives are limited, and depend on the angle between the axises.
The limit of the local maxima derivatives between two axises in 2d space can be computed as:
sin(a/2)*(1 + tan(a/2))
That means that, for example, for 90deg between axises the boundary is 1.42 (sqrt(2))
Which reduces to a/2 when a => 0, so the upper boundary is proportional to the angle.
For a multidimensional case the formula varies slightly, but still it's easy to compute.
So, the search of local minima convolves in logarithmic time.
The good news is that we can run the search of such local maxima in parallel.
Also, we can filter out both the regions of the search based on the best achieved result so far, as well as the points themselves, which are belo the lower limit of the search in the worst region.
The worst case of the algorithm is where all of the points are placed on the surface of a sphere.
This can be firther improved: when we detect a local search which operates on just few points, we swap to bruteforce for this particular axis. It works fast, because we need only the points which are subject to that particular local search, which can be determined as points actually bound by two opposite spherical cones of a particular angle sharing the same axis.
It's hard to figure out the big O notation, because it depends on desired precision and the distribution of points (bad when most of the points are on a sphere's surface).
The algorithm i use is here:
Set the initial angle a = pi/2.
Take one axis for each dimension. The angle and the axises form the initial 'bucket'
For each axis, compute the span on that axis by projecting all the points onto the axis, and finding min and max of the coordinates on the axis.
Compute the upper and lower bounds of the diameter which is interesting. It's based on the formula: sin(a/2)*(1 + tan(a/2)) and multiplied by assimetry cooficient, computed from the length of the current axis projections.
For the next step, kill all of the points which fall under the lower bound in each dimension at the same time.
For each exis, If the amount of points above the upper bound is less then some reasonable amount (experimentally computed) then compute using a bruteforce (N^2) on the set of the points in question, and adjust the lower bound, and kill the axis for the next step.
For the next step, Kill all of the axises, which have all of their points under the lower bound.
If the precision is satisfactory (upper bound - lower bound) < epsilon, then return the upper bound as the result.
For all of the survived axises, there is a virtual cone on that axis (actually, the two opposite cones), which covers some area on a virtual sphere which encloses a face of the cube. If i'm not mistaken, it's angle would be a * sqrt(2). Set the new angle to a / sqrt(2). Create a whole bucket of new axises (2 * number of dimensions), so the new cone areas would cover the initial cone area. It's the hard part for me, as i have not enough imagination for n>3-dimensional case.
Continue from step (3).
You can paralellize the procedure, synchronizing the limits computed so far for the points from (5) through (7).
I'm going to summarize the algorithm proposed by Timothy Shields.
Pick random point x.
Pick point y furthest from x.
If not done, let x = y, and go to step 2
The more times you repeat, the more accurate the result will be... ??
EDIT: actually this algorithm is not very good. Think about a 2D rectangle with vertices ABCD. There are two maxima: between AC and BD, which are separated by a sizable valley. This algorithm will get stuck at one or the other 50/50. If AC is slightly larger than BD, you'll be getting the wrong answer 50% of the time no matter how many times you iterate. Other regular polygons have the same issue, and in higher dimensions it is even worse.

Find the point minimizing the distance from a set of N lines

Given Multiple (N) lines in 3d space, find the point minimizing the distance to all lines.
Given that the Shortest distance between a line [aX + b] and a point [P] will be on the perpendicular line [aX+b]–[P] I can express the minimal squared distance as the sum of squared line distances, eg. ([aX+b]–[P])^2 +…+ ([aX+b]n–[P])^2 .
Since the lines are perpendicular I can use Dot Product to express [P] in the line terms
I have considered using Least Squares for estimating the point minimizing the distance, the problem is that the standard least squares will approximate the best fitting line/curve given a set of points, What I need is the opposite, given a set of lines estimate the best fitting point.
How should this be approached ?
From wikipedia, we read that the squared distance between line a'x + b = 0 and point p is (a'p+b)^2 / (a'a). We can therefore see that the point that minimizes the sum of squared distances is a weighted linear regression problem with one observation for each line. The regression model has the following properties:
Sample data a for each line ax+b=0
Sample outcome -b for each line ax+b=0
Sample weight 1/(a'a) for each line ax+b=0
You should be able to solve this problem with any standard statistical software.
An approach:
form the equations giving the distance from the point to each line
these equations give you N distances
optimize the set of distances by the criterion you want (least squares, minimax, etc.)
This reduces into a simple optimization question once you have the N equations. Of course, the difficulty of the last step depends heavily on the criterion you choose (least squares is simple, minimax not that simple.)
One thing that might help you forward is to find the simplest form of equation giving the distance from a point to line. Your thinking is correct in your #1, but you will need to think a bit more (or then check "distance from a point to line" with any search engine).
I have solved the same problem using hill climbing. Consider a single point and 26 neighbours step away from it(points on a cube centered at the current point). If the distance from the point is better than the distance from all neighbours, divide step by 2, otherwise make the neighbor with best distance new current point. Continue until step is small enough.
Following is solution using calculus :-
F(x,y) = sum((y-mix-ci)^2/(1+mi^2))
Using Partial differentiation :-
dF(x,y)/dx = sum(2*(y-mix-ci)*mi/(1+mi^2))
dF(x,y)/dy = sum(2*(y-mix-ci)/(1+mi^2))
To Minimize F(x,y) :-
dF(x,y)/dy = dF(x,y)/dx = 0
Use Gradient Descent using certain learning rate and random restarts to solve find minimum as much as possible
You can apply the following answer (which talks about finding the point that is closest to a set of planes) to this problem, since just as a plane can be defined by a point on the plane and a normal to the plane, a line can be defined by a point the line passes through and a "normal" vector orthogonal to the line:
https://math.stackexchange.com/a/3483313/365886
You can solve the resulting quadratic form by observing that the solution to 1/2 x^T A x - b x + c is x_min = A^{-1} b.

finding saddle points in 3d heightmap

Given a 3d heightmap (from a laser scanner), how do I find the saddle points?
I.e. given something like this:
I am looking for all points where the curvature is positive in one direction and negative in the other.
(These directions should not need to be aligned with the X and Y axis.
I know how to check whether the curvature in X direction has the opposite sign as the curvature in Y direction, but that does not cover all cases. To make matters worse, the resolution in X is different from the resolution in Y)
Ideally I am looking for an algorithm that can tolerate some amount of noise and only mark "significant" saddle points.
I've been exploring a similar problem for a computational topology class and have had some success with the method outlined below.
First you will need a comparison function that will evaluate the height at two input points and will return < or > (not equal) for any input. One way to do this is that if the points are equal height you use some position-based or random index to find the greater point. You can think of this as adding an infinitesimal perturbation to the height.
Now, for each point, you will compare the height at all the surrounding neighbors (there will be 8 neighbors on a 2D rectangular grid). The lower link for a point will be the set of all neighbors for which the height is less than the point.
If all the neighboring values are in the lower link, you are at a local maximum. If none of the points are in the lower link you are at a local minimum. Otherwise, if the lower link is a single connected set, you are at a regular point on a slope. But if the lower link is two unconnected sets, you are at a saddle.
In 2D you can construct a list of the 8 neighboring point in cyclic order around the point you are checking. You assign a value of +/-1 for each neighbor depending on your comparison function. You can then step through that list (remember to compare the two end points) and count how many times the sign changes to determine the number of connected components in the lower link.
Determining which saddles are "important" is a more difficult analysis. You may wish to look at this: http://www.cs.jhu.edu/~misha/ReadingSeminar/Papers/Gyulassy08.pdf for some guidance.
-Michael
(From a guess at the maths rather than practical experience)
Fit a quadratic to the surface in a small patch around each candidate point, e.g. with least squares. How big the patch is is one way of controlling noise, and you might gain by weighting points depending on their distance from the candidate point. In matrix notation, you can represent the quadratic as x'Ax + b'x + c, where A is symmetric.
The quadratic will have zero gradient at x = (A^-1)b/2. If this not within the patch, discard it.
If A has both +ve and -ve eigenvalues you have a saddle point at x. Since A is only 2x2 and so has at most two eigenvalues, you can ignore the case when it as a zero eigenvalue and so you couldn't invert it at the previous stage.

Closest grid square to a point in spherical coordinates

I am programming an algorithm where I have broken up the surface of a sphere into grid points (for simplicity I have the grid lines parallel and perpendicular to the meridians). Given a point A, I would like to be able to efficiently take any grid "square" and determine the point B in the square with the least spherical coordinate distance AB. In the degenerate case the "squares" are actually "triangles".
I am actually only using it to bound which squares I am searching, so I can also accept a lower bound if it is only a tiny bit off. For this reason, I need the algorithm to be extremely quick otherwise it would be better to just take the loss of accuracy and search a few more squares.
I decided to repost this question to Math Overflow: https://mathoverflow.net/questions/854/closest-grid-square-to-a-point-in-spherical-coordinates. More progress has been made here
For points on a sphere, the points closest in the full 3D space will also be closest when measured along the surface of the sphere. The actual distances will be different, but if you're just after the closest point it's probably easiest to minimize the 3D distance rather than worry about great circle arcs, etc.
To find the actual great-circle distance between two (latitidude, longitude) points on the sphere, you can use the first formula in this link.
A few points, for clarity.
Unless you specifically wish these squares to be square (and hence to not fit exactly in this parallel and perpendicular layout with regards to the meridians), these are not exactly squares. This is particularly visible if the dimensions of the square are big.
The question speaks of a [perfect] sphere. Matters would be somewhat different if we were considering the Earth (or other planets) with its flattened poles.
Following is a "algorithm" that would fit the bill, I doubt it is optimal, but could offer a good basis. EDIT: see Tom10's suggestion to work with the plain 3D distance between the points rather than the corresponding great cirle distance (i.e. that of the cord rather than the arc), as this will greatly reduce the complexity of the formulas.
Problem layout: (A, B and Sq as defined in the OP's question)
A : a given point the the surface of the sphere
Sq : a given "square" from the grid
B : solution to problem : point located within Sq which has the shortest
distance to A.
C : point at the center of Sq
Tentative algorithm:
Using the formulas associated with [Great Circle][1], we can:
- find the equation of the circle that includes A and C
- find the distance between A and C. See the [formula here][2] (kindly lifted
from Tom10's reply).
- find the intersect of the Great Circle arc between these points, with the
arcs of parallel or meridian defining the Sq.
There should be only one such point, unless this finds a "corner" of Sq,
or -a rarer case- if the two points are on the same diameter (see
'antipodes' below).
Then comes the more algorithmic part of this procedure (so far formulas...):
- find, by dichotomy, the point on Sq's arc/seqment which is the closest from
point A. We're at B! QED.
Optimization:
It is probably possible make a good "guess" as to the location
of B, based on the relative position of A and C, hence cutting the number of
iterations for the binary search.
Also, if the distance A and C is past a certain threshold the intersection
of the cicles' arcs is probably a good enough estimate of B. Only when A
and C are relatively close will B be found a bit further on the median or
parallel arc in these cases, projection errors between A and C (or B) are
smaller and it may be ok to work with orthogonal coordinates and their
simpler formulas.
Another approach is to calculate the distance between A and each of the 4
corners of the square and to work the dichotomic search from two of these
points (not quite sure which; could be on the meridian or parallel...)
( * ) *Antipodes case*: When points A and C happen to be diametrically
opposite to one another, all great circle lines between A and C have the same
length, that of 1/2 the circonference of the sphere, which is the maximum any
two points on the surface of a sphere may be. In this case, the point B will
be the "square"'s corner that is the furthest from C.
I hope this helps...
The lazy lower bound method is to find the distance to the center of the square, then subtract the half diagonal distance and bound using the triangle inequality. Given these aren't real squares, there will actually be two diagonal distances - we will use the greater. I suppose that it will be reasonably accurate as well.
See Math Overflow: https://mathoverflow.net/questions/854/closest-grid-square-to-a-point-in-spherical-coordinates for an exact solution

Resources