Evaluating/Fitting an ellipse from scattered points

Here is the deal. I have multiple points (X,Y) that form an 'ellipse like' shape.
I would like to evaluate/fit the 'best' ellipse possible and get its properties (a,b,F1,F2), or just the center of the ellipse.
Any ideas/leads would be appreciated.
Gilad.

There's a Matlab function fit_ellipse that can do the job. There's also this paper on methods for orthogonal distance fitting of ellipses. A web search for orthogonal ellipse fit will probably turn up a lot of other resources as well.

The ellipse fitting method proposed by:
Z. L. Szpak, W. Chojnacki, and A. van den Hengel.
Guaranteed ellipse fitting with a confidence region and an uncertainty measure for centre, axes, and orientation.
J. Math. Imaging Vision, 2015.
may be of interest to you. They provide estimates of both algebraic and geometric ellipse
parameters, together with covariance matrices that express the uncertainty of the parameter estimates.
They also provide a means of computing a planar 95% confidence region associated with the estimate
that allows one to visualise the uncertainty in the ellipse fit.
A pre-print version of the paper is available on the authors' website (http://cs.adelaide.edu.au/~wojtek/publicationsWC.html).
A MATLAB implementation of the method is also available for download:
https://sites.google.com/site/szpakz/source-code/guaranteed-ellipse-fitting-with-a-confidence-region-and-an-uncertainty-measure-for-centre-axes-and-orientation

I will explain how I would approach the problem. I would suggest a hill climbing approach. First compute the center of gravity of the points as a starting point and choose two values for a and b in some way (arbitrary positive values will probably do). You need a fit function, and I would suggest that it return the number of points lying on (or close enough to) a given ellipse:
int fit(x, y, a, b)
    int res := 0
    for point in points
        if point_almost_on_ellipse(x, y, a, b, point)
            res = res + 1
        end_if
    end_for
    return res
Now start with some step. I would choose a value big enough to be sure the best center of the ellipse will never be more than step away from the first point. Choosing such a big value is not necessary, but the slowest part of the algorithm is the time it takes to get close to the best center, so a bigger value is better, I think.
So now we have some initial point (x, y), some initial values of a and b, and an initial step. The algorithm iteratively moves to the best neighbour of the current point if any neighbour is better than it, and halves step otherwise. Here 'best' is measured by the fit function. A position is defined by four values (x, y, a, b) and it has 8 neighbours: (x+-step, y, a, b), (x, y+-step, a, b), (x, y, a+-step, b), (x, y, a, b+-step). If the results are not good enough, you can add more neighbours by also moving diagonally, for instance (x+-step, y+-step, a, b) and so on. Here is how you do that:
neighbours = [[-1, 0, 0, 0], [1, 0, 0, 0], [0, -1, 0, 0], [0, 1, 0, 0],
              [0, 0, -1, 0], [0, 0, 1, 0], [0, 0, 0, -1], [0, 0, 0, 1]]
iterate (cx, cy, ca, cb, step)
    current_fit = fit(cx, cy, ca, cb)
    best_neighbour = []
    best_fit = current_fit
    for neighbour in neighbours
        tx = cx + neighbour[0]*step
        ty = cy + neighbour[1]*step
        ta = ca + neighbour[2]*step
        tb = cb + neighbour[3]*step
        tfit = fit(tx, ty, ta, tb)
        if (tfit > best_fit)
            best_fit = tfit
            best_neighbour = [tx,ty,ta,tb]
        endif
    end_for
    if best_neighbour.size == 4
        cx := best_neighbour[0]
        cy := best_neighbour[1]
        ca := best_neighbour[2]
        cb := best_neighbour[3]
    else
        step = step * 0.5
    end_if
You continue iterating until the value of step is smaller than a given threshold (for instance 1e-6). I have written everything in pseudocode as I am not sure which language you want to use.
It is not guaranteed that the answer found this way will be optimal, but I am pretty sure it will be a good enough approximation.
Here is an article about hill climbing.
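The pseudocode above can be sketched in Python roughly as follows. This is only an illustration under assumptions: point_almost_on_ellipse is a hypothetical helper (the answer does not define it) that checks the implicit ellipse equation to within a tolerance tol, the ellipse is assumed axis-aligned as in the pseudocode, and neighbours with non-positive a or b are skipped to avoid division by zero.

```python
# Hill climbing over (x, y, a, b); a sketch, not a tuned implementation.

def point_almost_on_ellipse(x, y, a, b, point, tol=0.1):
    # Hypothetical helper: ((px-x)/a)^2 + ((py-y)/b)^2 is within tol of 1.
    px, py = point
    return abs(((px - x) / a) ** 2 + ((py - y) / b) ** 2 - 1.0) <= tol

def fit(points, x, y, a, b):
    # Number of points lying (almost) on the axis-aligned ellipse.
    return sum(point_almost_on_ellipse(x, y, a, b, p) for p in points)

NEIGHBOURS = [(-1, 0, 0, 0), (1, 0, 0, 0), (0, -1, 0, 0), (0, 1, 0, 0),
              (0, 0, -1, 0), (0, 0, 1, 0), (0, 0, 0, -1), (0, 0, 0, 1)]

def hill_climb(points, a, b, step, min_step=1e-6):
    # Start from the centre of gravity of the points.
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    current = (cx, cy, a, b)
    current_fit = fit(points, *current)
    while step >= min_step:
        best, best_fit = None, current_fit
        for d in NEIGHBOURS:
            cand = tuple(c + di * step for c, di in zip(current, d))
            if cand[2] <= 0 or cand[3] <= 0:
                continue  # keep both semi-axes positive
            f = fit(points, *cand)
            if f > best_fit:
                best, best_fit = cand, f
        if best is None:
            step *= 0.5  # no better neighbour: refine the step
        else:
            current, current_fit = best, best_fit
    return current, current_fit
```

Because the fit value is an integer count, the search can stall on plateaus, so trying several starting values of a and b may help.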

I think that the Wild Magic library contains a function for ellipse fitting. There is an article with a description of the method.

The problem is to define "best". What is best in your case? The ellipse with the smallest area which contains n% of the points?
If you define "best" in terms of probability, you can simply use the covariance matrix of your points, and compute the error ellipse.
An error ellipse for this "multivariate Gaussian distribution" would then contain the points corresponding to whatever confidence interval you decide.
Many computing packages can compute the covariance, with its corresponding eigenvalues and eigenvectors. The angle of the ellipse is the angle between the x axis and the eigenvector corresponding to the largest eigenvalue. The semi-axes are proportional to the square roots of the eigenvalues, with the largest eigenvalue giving the semi-major axis.
If your routine returns everything normalized (which it should), then you can decide by what factor to multiply everything to obtain an alpha-confidence interval.
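As a rough sketch of the covariance approach, assuming NumPy is available: the function name error_ellipse and its return layout are made up for illustration. The scale factor uses the closed form of the chi-square quantile with 2 degrees of freedom, -2 ln(1 - p), which is what turns the eigenvalues into semi-axes for a confidence level p under the Gaussian assumption.

```python
import numpy as np

def error_ellipse(points, confidence=0.95):
    """Return (center, semi_major, semi_minor, angle) of the error ellipse.

    Assumes the points are roughly 2-D Gaussian; angle is in radians,
    measured from the x axis to the major axis.
    """
    pts = np.asarray(points, dtype=float)
    center = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False)          # 2x2 sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    # Chi-square quantile for 2 degrees of freedom has a closed form.
    scale = -2.0 * np.log(1.0 - confidence)
    semi_minor, semi_major = np.sqrt(scale * eigvals)
    vx, vy = eigvecs[:, 1]                   # eigenvector of the largest eigenvalue
    angle = np.arctan2(vy, vx)
    return center, semi_major, semi_minor, angle
```

Note that the eigenvector sign is arbitrary, so the returned angle is only defined up to a half turn.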

Related

How do I calculate the area of a non-convex polygon?

Assuming that the polygon does not self-intersect, what would be the most efficient way to do this? The polygon has N vertices.
I know that it can be calculated with the coordinates but is there another general way?

The signed area, A(T), of the triangle T = ((x1, y1), (x2, y2), (x3, y3)) is defined to be 1/2 times the determinant of the following matrix:
|x1 y1 1|
|x2 y2 1|
|x3 y3 1|
The determinant is -y1*x2 + x1*y2 + y1*x3 - y2*x3 - x1*y3 + x2*y3.
Given a polygon (convex or concave) defined by the vertices p[0], p[1], ..., p[N - 1], you can compute the area of the polygon as follows.
area = 0
for i in [0, N - 2]:
    area += A((0, 0), p[i], p[i + 1])
area += A((0, 0), p[N - 1], p[0])
area = abs(area)
Using the expression for the determinant above, you can compute A((0, 0), p, q) efficiently as 0.5 * (-p.y*q.x + p.x*q.y). A further improvement is to do the multiplication by 0.5 only once:
area = 0
for i in [0, N - 2]:
    area += -p[i].y * p[i+1].x + p[i].x * p[i+1].y
area += -p[N-1].y * p[0].x + p[N-1].x * p[0].y
area = 0.5 * abs(area)
This is a linear time algorithm, and it is trivial to parallelize. Note also that it is an exact algorithm when the coordinates of your vertices are all integer-valued.
Link to Wikipedia article on this algorithm
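For reference, a minimal Python version of the loop above might look like this. The function name and the list-of-tuples input format are choices made here, not part of the answer.

```python
def polygon_area(vertices):
    """Area of a simple (non-self-intersecting) polygon.

    vertices is a list of (x, y) tuples in order around the polygon,
    either clockwise or counter-clockwise.
    """
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wraps back to the first vertex
        twice_area += x1 * y2 - x2 * y1  # twice the signed triangle area
    return abs(twice_area) / 2.0
```

For example, the concave L-shaped polygon (0,0), (2,0), (2,1), (1,1), (1,2), (0,2) has area 3.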

The best way to approach this problem that I can think of is to consider the polygon as several triangles, find their areas separately, and sum them for the total area. All polygons, regular or irregular, are essentially just a bunch of triangles (cut a quadrilateral diagonally to make two triangles, a pentagon in two cuts from one corner to the two most opposite ones, and the pattern continues on). This is quite simple to put into code.
A general algorithm for this can be coded as follows:
function polygonArea(Xcoords, Ycoords) {
  var numPoints = Xcoords.length;
  var area = 0;           // Accumulates area in the loop
  var j = numPoints - 1;  // The last vertex is the 'previous' one to the first
  for (var i = 0; i < numPoints; i++) {
    area = area + (Xcoords[j] + Xcoords[i]) * (Ycoords[j] - Ycoords[i]);
    j = i;                // j is previous vertex to i
  }
  return Math.abs(area / 2); // the sign of the raw sum depends on the vertex order
}
Xcoords and Ycoords are arrays, where Xcoords stores the X coordinates, and Ycoords the Y coordinates.
The algorithm iteratively constructs the triangles from previous vertices.
I modified this from the algorithm provided Here by Math Open Ref
It should be relatively painless to adapt this to whatever form you are storing your coordinates in, and whatever language you are using for your project.

The "Tear one ear at a time" algorithm works, provided the triangle you remove does not contain "holes" (other vertices of the polygon).
That is, you need to choose the green triangle below, not the red one:
However, it is always possible to do so (I can't prove it mathematically right now, but you'll have to trust me). You just need to walk the polygon's vertices and perform some inclusion tests until you find a suitable triple.
Source: I once implemented a triangulation of arbitrary, non-intersecting polygons based on what I read in Computational Geometry in C by Joseph O'Rourke.

Take 3 consecutive points from the polygon.
Calculate the area of the resulting triangle.
Remove the middle of the 3 points from the polygon.
Do a test to see if the removed point is inside the remaining polygon or not. If it's inside subtract the triangle area from the total, otherwise add it.
Repeat until the polygon consists of a single triangle, and add that triangle's area to the total.
Edit: to solve the problem given by @NicolasMiari, simply make two passes: on the first pass only process the vertices that are inside the remainder polygon, on the second pass process the remainder.

How to find the closest rotation

Consider points Y given in increasing order from [0,T). We are to consider these points as lying on a circle of circumference T. Now consider points X also from [0,T) and also lying on a circle of circumference T.
We say the distance between X and Y is the sum of the absolute distances between each point in X and its closest point in Y, recalling that both are considered to be lying on a circle. Write this distance as Delta(X, Y).
I am trying to find a quick way of determining a rotation of X which makes this distance as small as possible.
My code for making some data to test with is
#!/usr/bin/python
import random
import numpy as np
from bisect import bisect_left
def simul(rate, T):
    time = np.random.exponential(rate)
    times = [0]
    newtime = times[-1]+time
    while (newtime < T):
        times.append(newtime)
        newtime = newtime+np.random.exponential(rate)
    return times[1:]
For each point I use this function to find its closest neighbor.
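def takeClosest(myList, myNumber, T):
    """
    Assumes myList is sorted. Returns closest value to myNumber in a circle of circumference T.
    If two numbers are equally close, return the smallest number.
    """
    pos = bisect_left(myList, myNumber)
    before = myList[pos - 1]            # wraps to the last element when pos == 0
    after = myList[pos % len(myList)]   # wraps to the first element when pos == len(myList)
    # Measure both gaps along the circle so the wrap-around cases compare correctly
    if (after - myNumber) % T < (myNumber - before) % T:
        return after
    else:
        return before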
So the distance between two circles is:
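def circle_dist(timesY, timesX):
    # Uses the global circumference T
    dist = 0
    for t in timesX:
        closest_number = takeClosest(timesY, t, T)
        d = np.abs(closest_number - t)
        dist += min(d, T - d)  # distance along the circle, allowing wrap-around
    return dist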
So to make some data we just do
#First make some data
T = 5000
timesX = simul(1, T)
timesY = simul(10, T)
Finally to rotate circle timesX by offset we can
timesX = [(t + offset)%T for t in timesX]
In practice my timesX and timesY will have about 20,000 points each.
Given timesX and timesY, how can I quickly find (approximately) which rotation of timesX gives
the smallest distance to timesY?

Distance along the circle between a single point and a set of points is a piecewise linear function of rotation. The critical points of this function are the points of the set itself (zero distance) and points midway between neighbouring points of the set (local maximums of distance). Linear coefficients of such function are ±1.
Sum of such functions is again piecewise linear, but now with a quadratic number of critical points. Actually all these functions are the same, except shifted along the argument axis. Linear coefficients of the sum are integers.
To find its minimum one would have to calculate its value in all critical points.
I don't see a way to significantly reduce the amount of work needed, but 1,600,000,000 points is not such a big deal anyway, especially if you can spread the work between several processors.
To calculate sum of two such functions, represent the summands as sequences of critical points and associated coefficients to the left and to the right of each critical point. Then just merge the two point sequences while adding the coefficients.
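To make the idea of checking critical points concrete, here is a brute-force sketch in Python with NumPy. It is not the merge procedure described above. It relies on the observation that a minimum of the summed function is attained at an offset that puts some X point exactly on a Y point, so it only tries the offsets (y - x) mod T. The function names are made up for illustration, and the cost grows like |X|^2 |Y| log |Y|, so this is only meant for checking small inputs.

```python
import numpy as np

def circular_distance_sum(xs, ys_sorted, T):
    """Sum over xs of the circular distance to the nearest point in ys_sorted."""
    xs = np.asarray(xs, dtype=float) % T
    # Pad Y with wrapped copies so every x has a neighbour on both sides.
    padded = np.concatenate(([ys_sorted[-1] - T], ys_sorted, [ys_sorted[0] + T]))
    idx = np.clip(np.searchsorted(padded, xs), 1, len(padded) - 1)
    left = xs - padded[idx - 1]
    right = padded[idx] - xs
    return float(np.minimum(left, right).sum())

def best_rotation_bruteforce(xs, ys, T):
    """Try every offset that aligns some x with some y; return (offset, distance)."""
    ys_sorted = np.sort(np.asarray(ys, dtype=float) % T)
    xs = np.asarray(xs, dtype=float)
    candidates = np.unique((ys_sorted[None, :] - xs[:, None]) % T)
    scores = [circular_distance_sum(xs + off, ys_sorted, T) for off in candidates]
    best = int(np.argmin(scores))
    return float(candidates[best]), scores[best]
```

With 20,000 points in each set this is far too slow as written, but it can serve as a reference to validate a faster implementation on small samples.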

You can solve your (original) problem with a sweep line algorithm. The trick is to use the right "discretization". Imagine cutting your circle up into two strips:
X: x....x....x..........x................x.........x...x
Y: .....x..........x.....x..x.x...........x.............
Now calculate the score = 5+0+5+1+1+9+6 = 27 (the last point of X is matched to the first point of Y by wrapping around).
The key observation is that if we rotate X very slightly (right say), some of the points will improve and some will get worse. We can call this the "differential". In the above example the differential would be 1 - 1 - 1 + 1 + 1 - 1 + 1 because the first point is matched to something on its right, the second point is matched to something under it or to its left etc.
Of course, as we move X more, the differential will change. However only as many times as the matchings change, which is never more than |X||Y| but probably much less.
The proposed algorithm is thus to calculate the initial score and the time (X position) of the next change in differential. Go to that next position and calculate the score again. Continue until you reach your starting position.

This is probably a good example for the iterative closest point (ICP) algorithm:
It repeatedly matches each point with its closest neighbor and moves all points such that the mean squared distance is minimized. (Note that this corresponds to minimizing the sum of squared distances.)
import pylab as pl
T = 10.0
X = pl.array([3, 5.5, 6])
Y = pl.array([1, 1.5, 2, 4])
pl.clf()
pl.subplot(1, 2, 1, polar=True)
pl.plot(X / T * 2 * pl.pi, pl.ones(X.shape), 'r.', ms=10, mew=3)
pl.plot(Y / T * 2 * pl.pi, pl.ones(Y.shape), 'b+', ms=10, mew=3)
circDist = lambda X, Y: (Y - X + T / 2) % T - T / 2
while True:
    D = circDist(pl.reshape(X, (-1, 1)), pl.reshape(Y, (1, -1)))
    closestY = pl.argmin(D**2, axis = 1)
    distance = circDist(X, Y[closestY])
    shift = pl.mean(distance)
    if pl.absolute(shift) < 1e-3:
        break
    X = (X + shift) % T
pl.subplot(1, 2, 2, polar=True)
pl.plot(X / T * 2 * pl.pi, pl.ones(X.shape), 'r.', ms=10, mew=3)
pl.plot(Y / T * 2 * pl.pi, pl.ones(Y.shape), 'b+', ms=10, mew=3)
Important properties of the proposed solution are:
The ICP is an iterative algorithm. Thus it depends on an initial approximate solution. Furthermore, it won't always converge to the global optimum. This mainly depends on your data and the initial solution. If in doubt, try evaluating the ICP with different starting configurations and choose the most frequent result.
The current implementation performs a directed match: It looks for the closest point in Y relative to each point in X. It might yield different matches when swapping X and Y.
Computing all pair-wise distances between points in X and points in Y might be intractable for large point clouds (like 20,000 points, as you indicated). Therefore, the line D = circDist(...) might get replaced by a more efficient approach, e.g. not evaluating all possible pairs.
All points contribute to the final rotation. If there are any outliers, they might distort the shift significantly. This can be overcome with a robust average like the median or simply by excluding points with large distance.

Mapping coordinates from plane given by normal vector to XY plane

So, I have this algorithm to calculate the cross-section of a 3D shape with a plane given by a normal vector.
However, my current problem is that the cross-section is a set of 3D points (all lying on that given plane) and to display it I need to map these coordinates to the XY plane.
This works perfectly if the plane normal is something like (0,0,c): I just copy the x and y coordinates, discarding z.
And here is my question: since I have no idea how to convert any other plane, could anybody give me a hint as to what I should do now?

Your plane is defined by a normal vector
n=(xn,yn,zn)
For the coordinate transformation we need 2 base vectors and a zero point for the plane.
Base vectors
We choose those "naturally" fitting to the x/y plane (see later for the edge case):
b1=(1,0,zb1)
b2=(0,1,zb2)
And we want
(b1 x b2)*c = n (c const scalar)
to make sure these two are really bases, i.e. that both lie in the plane.
Now solve this:
b1 x b2 = (0*zb2-zb1*1, zb1*0-1*zb2, 1*1-0*0) = (-zb1,-zb2,1)
-zb1*c=xn
-zb2*c=yn
1*c=zn
c=zn,
zb1=-xn/c=-xn/zn
zb2=-yn/c=-yn/zn
b1=(1,0,-xn/zn)
b2=(0,1,-yn/zn)
and normalize them
bv1=(1,0,-xn/zn)/sqrt(1+(xn/zn*xn/zn))
bv2=(0,1,-yn/zn)/sqrt(1+(yn/zn*yn/zn))
Note that bv1 and bv2 are in general not perpendicular to each other (b1.b2 = xn*yn/zn^2), so distances and angles are only preserved if you orthogonalize them, e.g. by replacing bv2 with the normalized n x bv1.
An edge case is zn=0: in this case the normal vector is parallel to the x/y plane and no natural base vectors exist. In this case you have to choose the base vectors b1 and b2 from an aesthetic point of view and go through the same solution process, or just choose bv1 and bv2 directly.
Zero point
You spoke of no anchor point for your plane in the OQ, but it is necessary to differentiate your plane from the infinite family of parallel planes.
If your anchor point is (0,0,0) this is a perfect anchor point for the coordinate transformation and your plane has
x*xn+y*yn+z*zn=0,
(x0,y0,z0)=(0,0,0)
If not, I assume you have an anchor point of (xa,ya,za) and your plane has
x*xn+y*yn+z*zn=d
with d const scalar. A natural fit would be the point of the plane that is defined by normal projection of the original zero point onto the plane:
P0=(x0,y0,z0)
with
(x0, y0, z0) = c * (xn,yn,zn)
Solving this against
x*xn+y*yn+z*zn=d
gives
c*xn*xn+c*yn*yn+c*zn*zn=d
and
c=d/(xn*xn+yn*yn+zn*zn)
thus
P0=(x0,y0,z0)=c*(xn,yn,zn)
is found.
Final transformation
is achieved by representing every point of your plane (i.e. those points you want to show) as
P0+x'*bv1+y'*bv2
with x' and y' being the new coordinates. Since we know P0, bv1 and bv2 this is quite trivial. If we are not on the edge case, we have zeroes in bv1.y and bv2.x further reducing the problem.
x' and y' are the new coordinates you want.

I would like to add to Eugen's answer, a suggestion for the case where zn=0 extending his answer and also offer an alternative solution (which is similar).
In the case of zn=0, you can actually think of all the planes as points in a circle around the z-axis and the radius depends on the parameters of the plane.
Any vector orthogonal to the radius should be parallel to the plane, while the radius being the normal of the plane.
So in some way, the problem is reduced to a 2D-space.
The normal to the plane is (xn, yn, 0).
By using a technique to find orthogonal vectors in 2D, we get that a base vector could therefore be (-yn, xn, 0).
The second base vector is (0, 0, 1), which is just the normalized vector of their cross product. We can see that by developing the following expression:
cross_product((-yn, xn, 0), (xn, yn, 0)) =
(xn*0 - 0*yn, 0*xn - (-yn)*0, (-yn)*yn - xn*xn) =
(0, 0, -(xn^2 + yn^2)).
Which after normalizing and negating becomes (0, 0, 1).
From here, I suggest b1=normalize(-yn, xn, 0) and b2=(0, 0, 1).
Now, there's an even more general solution using this approach.
If you develop the dot product of (-yn, xn, 0) and (xn, yn, zn), you'll see that they are orthogonal for any zn, while (-yn, xn, 0) is also part of the plane in question (when d=0). Thus, this actually works as long as at least one of xn and yn is not zero (because otherwise (-yn, xn, 0) is actually just (0, 0, 0)).
Just to make sure it's clear, the second base vector is again their cross product, that is: b1=(-yn, xn, 0) and b2=cross_product(b1, n).
Well then, what about the case where both xn and yn are zero? In this case the plane is parallel to the xy plane. Now that's an easy one, just choose b1=(1, 0, 0) and b2=(0, 1, 0).
And as the other approach, use an anchor vector when d is not 0, exactly as it is described there, no changes needed.
Summary: 2 different solutions:
Use Eugen's answer and for the case of zn=0, take: b1=(-yn, xn, 0) and b2=(0, 0, 1).
A different approach: If both xn and yn equal 0, take b1=(1, 0, 0) and b2=(0, 1, 0), otherwise take b1=(-yn, xn, 0) and b2=cross_product(b1, n).
In both solutions, use an anchor vector P0 as described by the aforementioned answer.
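The second approach can be sketched with NumPy as follows. The function names are invented for this illustration, the plane is assumed to be given as n.p = d, and both base vectors are normalized so that the resulting 2D coordinates preserve distances.

```python
import numpy as np

def plane_basis(n):
    """Two orthonormal vectors spanning the plane with normal n."""
    n = np.asarray(n, dtype=float)
    xn, yn, zn = n
    if xn == 0 and yn == 0:
        # Plane is parallel to the xy plane.
        b1 = np.array([1.0, 0.0, 0.0])
        b2 = np.array([0.0, 1.0, 0.0])
    else:
        b1 = np.array([-yn, xn, 0.0])  # orthogonal to n by construction
        b2 = np.cross(b1, n)           # orthogonal to both b1 and n
    return b1 / np.linalg.norm(b1), b2 / np.linalg.norm(b2)

def to_plane_coords(points, n, d=0.0):
    """Map 3D points lying on the plane n.p = d to 2D coordinates (x', y')."""
    n = np.asarray(n, dtype=float)
    p0 = d / n.dot(n) * n              # projection of the origin onto the plane
    b1, b2 = plane_basis(n)
    rel = np.asarray(points, dtype=float) - p0
    return np.column_stack((rel @ b1, rel @ b2))
```

Each input point can be recovered as P0 + x'*b1 + y'*b2, which is a quick way to check the mapping on your own data.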

What algorithm determines the nearness of a point to a Bezier curve?

I wish to determine when a point (mouse position) is on, or near, a curve defined by a series of B-Spline control points.
The information I will have for the B-Spline is the list of n control points (in x,y coordinates). The list of control points can be of any length (>= 4) and defines a B-spline consisting of (n−1)/3 cubic Bezier curves. The Bezier curves are all cubic. I wish to set a parameter k (in pixels) for the distance defined to be "near" the curve. If the mouse position is within k pixels of the curve then I need to return true, otherwise false.
Is there an algorithm that gives me this information? The solution does not need to be precise: I am working to a tolerance of 1 pixel (or coordinate).
I have found the following questions seem to offer some help, but do not answer my exact question. In particular the first reference seems to be a solution only for 4 control points, and does not take into account the nearness factor I wish to define.
Position of a point relative to a Bezier curve
Intersection between bezier curve and a line segment
EDIT:
An example curve:
e, 63.068, 127.26
29.124, 284.61
25.066, 258.56
20.926, 212.47
34, 176
38.706, 162.87
46.556, 149.82
54.393, 138.78
The description of the format is: "Every edge is assigned a pos attribute, which consists of a list of 3n + 1 locations. These are B-spline control points: points p0, p1, p2, p3 are the first Bezier spline, p3, p4, p5, p6 are the second, etc. Points are represented by two integers separated by a comma, representing the X and Y coordinates of the location specified in points (1/72 of an inch). In the pos attribute, the list of control points might be preceded by a start point ps and/or an end point pe. These have the usual position representation with a "s," or "e," prefix, respectively."
EDIT2: Further explanation of the "e" point (and s if present).
In the pos attribute, the list of control points might be preceded by a start
point ps and/or an end point pe. These have the usual position representation with a
"s," or "e," prefix, respectively. A start point is present if there is an arrow at p0.
In this case, the arrow is from p0 to ps, where ps is actually on the node’s boundary.
The length and direction of the arrowhead is given by the vector (ps −p0). If there
is no arrow, p0 is on the node’s boundary. Similarly, the point pe designates an
arrow at the other end of the edge, connecting to the last spline point.

You may do this analytically, but a little math is needed.
A Bezier curve can be expressed in terms of the Bernstein Basis. Here I'll use Mathematica, that provides good support for the math involved.
So if you have the points:
pts = {{0, -1}, {1, 1}, {2, -1}, {3, 1}};
The eq. for the Bezier curve is:
f[t_] := Sum[pts[[i + 1]] BernsteinBasis[3, i, t], {i, 0, 3}];
Keep in mind that I am using the Bernstein basis for convenience, but ANY parametric representation of the Bezier curve would do.
Which gives:
Now to find the minimum distance to a point (say {3,-1}, for example) you have to minimize the function:
d[t_] := Norm[{3, -1} - f[t]];
For doing that you need a minimization algorithm. I have one handy, so:
NMinimize[{d[t], 0 <= t <= 1}, t]
gives:
{1.3475, {t -> 0.771653}}
And that is it.
HTH!
Edit Regarding your edit "B-spline with consisting of (n−1)/3 cubic Bezier curves."
If you constructed a piecewise B-spline representation you should iterate on all segments to find the minima. If you joined the pieces on a continuous parameter, then this same approach will do.
Edit
Solving your curve. I disregard the first point because I really didn't understand what it is.
I solved it using standard Bsplines instead of the mathematica features, for the sake of clarity.
Clear["Global`*"];
(*first define the points *)
pts = {{
29.124, 284.61}, {
25.066, 258.56}, {
20.926, 212.47}, {
34, 176}, {
38.706, 162.87}, {
46.556, 149.82}, {
54.393, 138.78}};
(*define a bspline template function *)
b[t_, p0_, p1_, p2_, p3_] :=
(1-t)^3 p0 + 3 (1-t)^2 t p1 + 3 (1-t) t^2 p2 + t^3 p3;
(* define two bsplines *)
b1[t_] := b[t, pts[[1]], pts[[2]], pts[[3]], pts[[4]]];
b2[t_] := b[t, pts[[4]], pts[[5]], pts[[6]], pts[[7]]];
(* Lets see the curve *)
Show[Graphics[{Red, Point[pts], Green, Line[pts]}, Axes -> True],
ParametricPlot[BSplineFunction[pts][t], {t, 0, 1}]]
(Plot of the control points and the curve; rotated to save screen space.)
(*Now define the distance from any point u to a point in our Bezier*)
d[u_, t_] := If[(0 <= t <= 1), Norm[u - b1[t]], Norm[u - b2[t - 1]]];
(*Define a function that find the minimum distance from any point u \
to our curve*)
h[u_] := NMinimize[{d[u, t], 0.0001 <= t <= 1.9999}, t];
(*Lets test it ! *)
Plot3D[h[{x, y}][[1]], {x, 20, 55}, {y, 130, 300}]
This plot is the (minimum) distance from any point in space to our curve (of course the value over the curve is zero):

First, render the curve to a bitmap (black and white) with your favourite algorithm. Then, whenever you need, determine the nearest pixel to the mouse position using information from this question. You can modify the searching function so that it returns the distance, so you can easily compare it with your requirements. This method gives you the distance with a tolerance of 1-2 pixels, which will do, I guess.

Definition: distance from a point to a line segment = distance from the original point to the closest point still on the segment.
Assumption: an algo to compute the distance from a point to a segment is known (e.g. compute the intercept with the segment of the normal to the segment passing through the original point. If the intersection is outside the segment, pick the closest end-point of the segment)
1. Use the deCasteljau algo and subdivide your cubics until getting to a good enough daisy-chain of linear segments. Supplementary info: see the "Bezier curve flattening" section.
2. Consider the minimum of the distances between your point and the resulting segments as the distance from your point to the curve. Repeat for all the curves in your set.
Refinement at point 2: don't compute the actual distance, but the square of it, getting the minimum square distance is good enough - saves a sqrt call/segment.
Computation effort: empirically a cubic curve with a maximum extent (i.e. bounding box) of 200-300 results in about 64 line segments when flattened to a maximum tolerance of 0.5 (approx good enough for the naked eye).
Each deCasteljau step requires 12 division-by-2 and 12 additions.
Flatness evaluation - 8 multiplications + 4 additions (if using the TaxiCab distance to evaluate a distance)
the evaluation of point-to-segment distance requires at max 12 multiplications and 11 additions - but this will be a rare case in the context of Bezier flattening, I'd expect an average of 6 multiplications and 9 additions.
So, assuming a very bad case (100 straight segments/cubic), you finish in finding your distance with a cost of approx 2600 multiplications + 2500 additions per considered cubic.
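The flatten-then-measure steps above could be sketched in Python like this. It is an illustration only, not the Java implementation the answer refers to. The names is_near, flatten and the tol parameter are assumptions, and the flatness test used here (both inner control points within tol of the chord) is one simple choice among several. Control points are taken as a list of 3n + 1 (x, y) tuples, without any "s," or "e," arrow point.

```python
def _mid(a, b):
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)

def _split(p0, p1, p2, p3):
    """de Casteljau split of a cubic Bezier at t = 0.5 into two cubics."""
    p01, p12, p23 = _mid(p0, p1), _mid(p1, p2), _mid(p2, p3)
    p012, p123 = _mid(p01, p12), _mid(p12, p23)
    m = _mid(p012, p123)
    return (p0, p01, p012, m), (m, p123, p23, p3)

def _seg_dist2(p, a, b):
    """Squared distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    len2 = dx * dx + dy * dy
    t = 0.0
    if len2 > 0:
        t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / len2))
    cx, cy = a[0] + t * dx, a[1] + t * dy
    return (p[0] - cx) ** 2 + (p[1] - cy) ** 2

def flatten(cubic, tol=0.5, depth=0):
    """Approximate a cubic by line segments that stay within tol of the curve."""
    p0, p1, p2, p3 = cubic
    flat = max(_seg_dist2(p1, p0, p3), _seg_dist2(p2, p0, p3)) <= tol * tol
    if flat or depth >= 16:
        return [(p0, p3)]
    left, right = _split(p0, p1, p2, p3)
    return flatten(left, tol, depth + 1) + flatten(right, tol, depth + 1)

def is_near(point, control_points, k, tol=0.5):
    """True if point is within about k of the piecewise cubic Bezier curve."""
    best = float("inf")
    for i in range(0, len(control_points) - 3, 3):
        for a, b in flatten(tuple(control_points[i:i + 4]), tol):
            best = min(best, _seg_dist2(point, a, b))  # squared, no sqrt needed
    return best <= k * k
```

Since the segments only approximate the curve to within tol, the answer is accurate to about that tolerance, which should fit the 1 pixel requirement if tol is kept at 0.5 or below.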
Disclaimers:
don't ask me for a demonstration on the numbers in
the computational effort evaluation above,
I'll answer with "Use the source-code" (note: Java implementation).
other approaches may be possible and maybe less costly.
Regards,
Adrian Colomitchi

"Center of Mass" between a set of points on a Toroidally-Wrapped Map that minimizes average distance to all points

edit As someone has pointed out, what I'm looking for is actually the point minimizing total geodesic distance between all other points
My map is topologically similar to the ones in Pac Man and Asteroids. Going past the top will warp you to the bottom, and going past the left will warp you to the right.
Say I have two points (of the same mass) on the map and I wanted to find their center of mass. I could use the classical definition, which basically is the midpoint.
However, let's say the two points are on opposite ends of the map. There is another center of mass, so to speak, formed by wrapping "around". Basically, it is the point equidistant to both other points, but linked by "wrapping around" the edge.
Example
b . O . . a . . O .
Two points O. Their "classical" midpoint/center of mass is the point marked a. However, another midpoint is also at b (b is equidistant to both points, by wrapping around).
In my situation, I want to pick the one that has lower average distance between the two points. In this case, a has an average distance between the two points of three steps. b has an average distance of two steps. So I would pick b.
One way to solve for the two-point situation is to simply test both the classical midpoint and the shortest wrapped-around midpoint, and use the one that has a shorter average distance.
However! This does not easily generalize to 3 points, or 4, or 5, or n points.
Is there a formula or algorithm that I could use to find this?
(Assume that all points will always be of equal mass. I only use "center of mass" because it is the only term I knew to loosely describe what I was trying to do)
If my explanation is unclear, I will try to explain it better.

The notion of center of mass is a notion relevant on affine spaces. The n-dimensional torus has no affine structure.
What you want is a point which minimizes (geodesic) distance to all the other points.
I suggest the following: let x_1...x_n be a collection of points on the d-dimensional torus (or any other metric space for that purpose).
Your problem:
find a point mu such that sum(dist(mu, x_k)^2) is minimal.
In the affine-Euclidean case, you get the usual notion of center of mass back.
This is a problem you will be able to solve (for instance, there are probably better options) with the conjugate gradient algorithm, which performs well in this case. Beware that you need moderate n (say n < 10^3) since the algorithm needs n^2 in space and n^3 in time.
Perhaps better suited is the Levenberg-Marquardt algorithm, which is tailored for minimization of sum of squares.
Note that if you have a good initial guess (eg. the usual center of mass of the points seen as points in R^d instead of the torus) the method will converge faster.
Edit:
If (x1...xd) and (y1...yd) are points on the torus, the distance is given by
dist(x, y)^2 = alpha1^2 + ... + alphad^2
where alphai = min((xi - yi) mod 1, (yi - xi) mod 1)
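As a small illustration of this distance and of the objective sum(dist(mu, x_k)^2), here is a Python sketch. It generalizes the formula from a unit torus to side lengths given in size, and it replaces the suggested gradient-based solvers with a plain grid search over candidate points, which is an assumption made here to keep the example short and to sidestep local minima on small maps. The function names are invented for this example.

```python
import itertools

def torus_dist2(p, q, size):
    """Squared geodesic distance between p and q on a torus with the given side lengths."""
    total = 0.0
    for pi, qi, si in zip(p, q, size):
        d = abs(pi - qi) % si
        d = min(d, si - d)   # go around whichever way is shorter
        total += d * d
    return total

def grid_search_center(points, size, cells_per_axis):
    """Grid point minimizing the sum of squared torus distances to all points."""
    axes = [[s * k / cells_per_axis for k in range(cells_per_axis)] for s in size]
    return min(itertools.product(*axes),
               key=lambda mu: sum(torus_dist2(mu, x, size) for x in points))
```

For two points at positions 2 and 8 on a one-dimensional map of width 10, grid_search_center([(2.0,), (8.0,)], (10.0,), 10) picks position 0, the wrapped-around midpoint, rather than the classical midpoint 5.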

I made a little program to check the behaviour of the functions involved and found that you should be very careful with the minimization process.
Below you can see two sets of plots showing the distribution of the points, the function to minimize in the Euclidean case, and the one corresponding to the "toric metric".
As you may see, the Euclidean distance is very well-behaved, while the toric one presents several local minima that make finding the global minimum difficult. Also, the global minimum in the toric case is not unique.
Just in case, the program in Mathematica is:
Clear["Global`*"];
(*Define non wrapping distance for dimension n*)
nwd[p1_, p2_, n_] := (p1[[n]] - p2[[n]])^2;
(*Define wrapping distance for dimension n *)
wd[p1_, p2_, max_,n_] := (max[[n]] - Max[p1[[n]], p2[[n]]] + Min[p1[[n]], p2[[n]]])^2;
(*Define minimal distance*)
dist[p1_, p2_, max_] :=
Min[nwd[p1, p2, 1], wd[p1, p2, max, 1]] +
Min[nwd[p1, p2, 2], wd[p1, p2, max, 2]];
(*Define Euclidean distance*)
euclDist[p1_, p2_, max_] := nwd[p1, p2, 1] + nwd[p1, p2, 2];
(*Set torus dimensions *)
MaxX = 20;
MaxY = 15;
(*Examples of Points sets *)
lCircle =
Table[{10 Cos[fi] + 10, 5 Sin[fi] + 10}, {fi, 0, 2 Pi - .0001, Pi/20}];
lRect = Join[
Table[{3, y}, {y, MaxY - 1}],
Table[{MaxX - 1, y}, {y, MaxY - 1}],
Table[{x, MaxY/2}, {x, MaxY - 1}],
Table[{x, MaxY - 1}, {x, MaxX - 1}],
Table[{x, 1}, {x, MaxX - 1}]];
(*Find Euclidean Center of mass *)
feucl = FindMinimum[{Total[
euclDist[#, {a, b}, {MaxX, MaxY}] & /@ lRect], 0 <= a <= MaxX,
0 <= b <= MaxY}, {{a, 10}, {b, 10}}]
(*Find Toric Center of mass *)
ftoric = FindMinimum[{Total[dist[#, {a, b}, {MaxX, MaxY}] & /@ lRect],
0 <= a <= MaxX, 0 <= b <= MaxY}, {{a, 10}, {b, 10}}]

In the 1-dimensional case, your problem would be analogous to finding a mean angle.
The mean of angles a and b can be computed by
mean = remainder( a + remainder( b-a, C)/2.0, C)
where C is the measure of a whole circle (ie 2*PI if you're using radians).
If you have n angles a[], the mean can be computed by
mean = a[0];
for i=1..n mean=remainder( mean + remainder( a[i]-mean, C)/(i+1), C)
So I reckon
meanX = X[0]; meanY = Y[0]
for i=1..n
meanX = remainder( meanX + remainder( X[i]-meanX, W)/(i+1), W)
meanY = remainder( meanY + remainder( Y[i]-meanY, H)/(i+1), H)
might do the job.
But note that this will result in -W/2<=meanX
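The incremental mean above translates fairly directly to Python, assuming remainder means the IEEE-style remainder that math.remainder provides (Python 3.7+). The function names are made up here, and for more than two points the result can depend on the order in which the points are processed, so treat it as a heuristic to experiment with.

```python
import math

def circular_mean(values, C):
    """Incremental mean of positions on a circle of circumference C."""
    mean = values[0]
    for i, v in enumerate(values[1:], start=1):
        # Step towards v along the shorter way round, weighted by 1/(i+1).
        mean = math.remainder(mean + math.remainder(v - mean, C) / (i + 1), C)
    return mean  # math.remainder returns a value in [-C/2, C/2]

def torus_mean(points, W, H):
    """Apply the circular mean independently to the x and y coordinates."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return circular_mean(xs, W), circular_mean(ys, H)
```

Taking the result modulo C maps it back into the range [0, C) if that is what the map uses.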

IANATopologist, and I don't know how clear I'm making myself in this, but for what it's worth, these are some thoughts on the matter:
Using mass and gravity to calculate this sort of thing might indeed be elegant -- ISTR that there are a number of libraries and efficient algorithms to find the gravity vectors for any number of masses.
If you were using a spherical map, I'd suggest finding within the sphere the actual center of gravity for your N mass points. You then draw a line from the center outward through this inner center of gravity to find the point on the sphere's surface where your mass points wish to congregate.
However, a toroidal map makes this difficult.
My suggestion, then, is to flatten and copy your map to give you a 3 x 3 quilt of maps (using an infinite field of maps will give better results, but might be overkill). I'll assign coordinates (0, 0) to (2, 2) to them, with (1, 1) being your source map. Find the point(s) to which the mass points of your inner map (1, 1) are attracted -- if they all go towards the middle of your map, fine: you found your center of gravity. If not, if one of the points close to the edge is going towards some mass accumulation outside of your inner map, say into map (2, 1), then discard this mass point when calculating your center of gravity. Instead you use the mass point from the opposite map ((0, 1) in this case) that wants to wander over into your middle map.
Adding the acceleration vectors for these mass points gives you the center of gravity on your torus.
Done.
