Smallest enclosing regular hexagon - algorithm

Is there any algorithm / method to find the smallest regular hexagon around a set of points (x, y).
And by smallest I mean smallest area.
My current idea was to find the smallest circle enclosing the points, and then create a hexagon from there and check if all the points are inside, but that is starting to sound like a never ending problem.

Requirements
First of all, let's define a hexagon as quadruple [x0, y0, t0, s], where (x0, y0), t0 and s are its center, rotation and side-length respectively.
Next, we need to find whether an arbitrary point is inside the hexagon. The following functions do this:
function getHexAlpha(t, hex)
t = t - hex.t0;
t = t - 2*pi * floor(t / (2*pi));
return pi/2 - abs(rem(t, pi/3) - (pi/6));
end
function getHexRadious( P, hex )
x = P.x - hex.x0;
y = P.y - hex.y0;
t = atan2(y, x);
return hex.s * cos(pi/6) / sin(getHexAlpha(t, hex));
end
function isInHex(P, hex)
r = getHexRadious(P, hex);
d = sqrt((P.x - hex.x0)^2 + (P.y - hex.y0)^2);
return r >= d;
end
Long story short, the getHexRadious function formulates the hexagon in polar form and returns distance from center of hexagon to its boundary at each angle. Read this post for more details about getHexRadious and getHexRadious functions. This is how these work for a set of random points and an arbitrary hexagon:
The Algorithm
I suggest a two-stepped algorithm:
1- Guess an initial hexagon that covers most of points :)
2- Tune s to cover all points
Chapter 1: (2) Following Tarantino in Kill Bill Vol.1
For now, let's assume that our arbitrary hexagon is a good guess. Following functions keep x0, y0, t0 and tune s to cover all points:
function getHexSide( P, hex )
x = P.x - hex.x0;
y = P.y - hex.y0;
r = sqrt(x^2 + y^2);
t = atan2(y, x);
return r / (cos(pi/6) / sin(getHexAlpha(t, hex)));
end
function findMinSide( P[], hex )
for all P[i] in P
S[i] = getHexSide(P, hex);
end
return max(S[]);
end
The getHexSide function is reverse of getHexRadious. It returns the minimum required side-length for a hexagon with x0, y0, t0 to cover point P. This is the outcome for previous test case:
Chapter 2: (1)
As a guess, we can find two points furthest away from each other and fit one of hexagon diameters' on them:
function guessHex( P[] )
D[,] = pairwiseDistance(P[]);
[i, j] = indexOf(max(max(D[,])));
[~, j] = max(D(i, :));
hex.x0 = (P[i].x + P[j].x) / 2;
hex.y0 = (P[i].y + P[j].y) / 2;
hex.s = D[i, j]/2;
hex.t0 = atan2(P.y(i)-hex.y0, P.x(i)-hex.x0);
return hex;
end
Although this method can find a relatively small polygon, but as a greedy approach, it never guarantees to find the optimum solutions.
Chapter 3: A Better Guess
Well, this problem is definitely an optimization problem with its objective being to minimize area of hexagon (or s variable). I don't know if it has an analytical solution, and SO is not the right place to discuss it. But any optimization algorithm can be used to provide a better initial guess. I used GA to solve this with findMinSide as its cost function. In fact GA generates many guesses about x0, y0, and t0 and the best one will be selected. It finds better results but is more time consuming. Still no guarantee to find the optimum!
Optimization of Optimization
When it comes to optimization algorithms, performance is always an issue. Keep in mind that hexagon only needs to enclose the convex-hall of points. If you are dealing with large sets of points, it's better to find the convex-hall and get rid of the rest of the points.

Related

Find tangent points in a circle from a point

Circle center : Cx,Cy
Circle radius : a
Point from which we need to draw a tangent line : Px,Py
I need the formula to find the two tangents (t1x, t1y) and (t2x,t2y) given all the above.
Edit: Is there any simpler solution using vector algebra or something, rather than finding the equation of two lines and then solving equation of two straight lines to find the two tangents separately? Also this question is not off-topic because I need to write a code to find this optimally
Here is one way using trigonometry. If you understand trig, this method is easy to understand, though it may not give the exact correct answer when one is possible, due to the lack of exactness in trig functions.
The points C = (Cx, Cy) and P = (Px, Py) are given, as well as the radius a. The radius is shown twice in my diagram, as a1 and a2. You can easily calculate the distance b between points P and C, and you can see that segment b forms the hypotenuse of two right triangles with side a. The angle theta (also shown twice in my diagram) is between the hypotenuse and adjacent side a so it can be calculated with an arccosine. The direction angle of the vector from point C to point P is also easily found by an arctangent. The direction angles of the tangency points are the sum and difference of the original direction angle and the calculated triangle angle. Finally, we can use those direction angles and the distance a to find the coordinates of those tangency points.
Here is code in Python 3.
# Example values
(Px, Py) = (5, 2)
(Cx, Cy) = (1, 1)
a = 2
from math import sqrt, acos, atan2, sin, cos
b = sqrt((Px - Cx)**2 + (Py - Cy)**2) # hypot() also works here
th = acos(a / b) # angle theta
d = atan2(Py - Cy, Px - Cx) # direction angle of point P from C
d1 = d + th # direction angle of point T1 from C
d2 = d - th # direction angle of point T2 from C
T1x = Cx + a * cos(d1)
T1y = Cy + a * sin(d1)
T2x = Cx + a * cos(d2)
T2y = Cy + a * sin(d2)
There are obvious ways to combine those calculations and make them a little more optimized, but I'll leave that to you. It is also possible to use the angle addition and subtraction formulas of trigonometry with a few other identities to completely remove the trig functions from the calculations. However, the result is more complicated and difficult to understand. Without testing I do not know which approach is more "optimized" but that depends on your purposes anyway. Let me know if you need this other approach, but the other answers here give you other approaches anyway.
Note that if a > b then acos(a / b) will throw an exception, but this means that point P is inside the circle and there is no tangency point. If a == b then point P is on the circle and there is only one tangency point, namely point P itself. My code is for the case a < b. I'll leave it to you to code the other cases and to decide the needed precision to decide if a and b are equal.
Here's another way using complex numbers.
If a is the direction (a complex number of length 1) of the tangent point on the circle from the centre c, and d is the (real) length along the tangent to get to p, then (because the direction of the tangent is I*a)
p = c + r*a + d*I*a
rearranging
(r+I*d)*a = p-c
But a has length 1 so taking the length we get
|r+I*d| = |p-c|
We know everything but d, so we can solve for d:
d = +- sqrt( |p-c|*|p-c| - r*r)
and then find the a's and the points on the circle, one of each for each value of d above:
a = (p-c)/(r+I*d)
q = c + r*a
Hmm not really an algorithm question (people tend to mistake algorithm and equation) If you want to write a code then do (you did not specify language nor what prevents you from doing this which is the reason of close votes)... Without this info your OP is just asking for math equation which is indeed off-topic here and by answering this I risk (right-full) down-votes too (but this is/was asked a lot here with much less info and 4 reopen votes against 1 close put my decision weight on reopen and answering this anyway).
You can exploit the fact that you are in 2D as in 2D perpendicular vectors to vector a(x,y) are computed like this:
c = (-y, x)
d = ( y,-x)
c = -d
so you swap x,y and negate one (which one determines if the perpendicular vector is CW or CCW). It is really a rotation formula but as we rotate by 90deg the cos,sin are just +1 and -1.
Now normal n to any circumference point on circle lies in the line going through that point and circles center. So putting all this together your tangents are:
// normal
nx = Px-Cx
ny = Py-Cy
// tangent 1
tx = -ny
ty = +nx
// tangent 2
tx = +ny
ty = -nx
If you want unit vectors than just divide by radius a (not sure why you do not call it r like the rest of the math world) so:
// normal
nx = (Px-Cx)/a
ny = (Py-Cy)/a
// tangent 1
tx = -ny
ty = +nx
// tangent 2
tx = +ny
ty = -nx
Let's go through derivation process:
As you can see, if the interior of the square is < 0 it's because the point is interior to the circumferemce. When the point is outside of the circumference there are two solutions, depending on the sign of the square.
The rest is easy. Take atan(solution) and be carefull here with the signs, you may better do some checks.
Use (2) and then undo (1) transformations and that's all.
c# implementation of dmuir's answer:
static void FindTangents(Vector2 point, Vector2 circle, float r, out Line l1, out Line l2)
{
var p = new Complex(point.x, point.y);
var c = new Complex(circle.x, circle.y);
var cp = p - c;
var d = Math.Sqrt(cp.Real * cp.Real + cp.Imaginary * cp.Imaginary - r * r);
var q = GetQ(r, cp, d, c);
var q2 = GetQ(r, cp, -d, c);
l1 = new Line(point, new Vector2((float) q.Real, (float) q.Imaginary));
l2 = new Line(point, new Vector2((float) q2.Real, (float) q2.Imaginary));
}
static Complex GetQ(float r, Complex cp, double d, Complex c)
{
return c + r * (cp / (r + Complex.ImaginaryOne * d));
}
Move the circle to the origin, rotate to bring the point on X and downscale by R to obtain a unit circle.
Now tangency is achieved when the origin (0, 0), the (reduced) given point (d, 0) and an arbitrary point on the unit circle (cos t, sin t) form a right triangle.
cos t (cos t - d) + sin t sin t = 1 - d cos t = 0
From this, you draw
cos t = 1 / d
and
sin t = ±√(1-1/d²).
To get the tangency points in the initial geometry, upscale, unrotate and untranslate. (These are simple linear algebra operations.) Notice that there is no need to perform the direct transform explicitly. All you need is d, ratio of the distance center-point over the radius.

Find indices of polygon vertices nearest to a point

Heading
I need to find the indices of the polygon nearest to a point
So in this case the ouput would be 4 and 0. Such that if the red point is added I know to where to place the vertex in the array. Does anyone know where to start?
(Sorry if the title is misleading, I wasnt sure how to phrase it properly)
In this case the ouput would be 0 and 1, rather than the closest 4.
Point P lies on the segment AB, if two simple conditions are met together:
AP x PB = 0 //cross product, vectors are collinear or anticollinear, P lies on AB line
AP . PB > 0 //scalar product, exclude anticollinear case to ensure that P is inside the segment
So you can check all sequential vertice pairs (pseudocode):
if (P.X-V[i].X)*(V[i+1].Y-P.Y)-(P.Y-V[i].Y)*(V[i+1].X-P.X)=0 then
//with some tolerance if point coordinates are float
if (P.X-V[i].X)*(V[i+1].X-P.X)+(P.Y-V[i].Y)*(V[i+1].Y-P.Y)>0
then P belongs to (i,i+1) segment
This is fast direct (brute-force) method.
Special data structures exist in computer geometry to quickly select candidate segments - for example, r-tree. But these complicated methods will gain for long (many-point) polylines and for case where the same polygon is used many times (so pre-treatment is negligible)
I'll assume that the new point is to be added to an edge. So you are given the coordinates of a point a = (x, y) and you want to find the indices of the edge on which it lies. Let's call the vertices of that edge b, c. Observe that the area of the triangle abc is zero.
So iterate over all edges and choose the one that minimizes area of triangle abc where a is your point and bc is current edge.
a = input point
min_area = +infinity
closest_edge = none
n = number of vertices in polygon
for(int i = 1; i <= n; i++)
{ b = poly[ i - 1 ];
c = poly[ i % n ];
if(area(a, b, c) < min_area)
{ min_area = area(a, b, c);
closest_edge = bc
}
}
You can calculate area using:
/* Computes area x 2 */
int area(a, b, c)
{ int ans = 0;
ans = (a.x*b.y + b.x*x.y + c.x*a.y) - (a.y*b.x + b.y*c.x + c.y*a.x);
return ABS(ans);
}
I think you would be better off trying to compare the distance from the actual point to a comparable point on the line. The closest comparable point would be the one that forms a perpendicular line like this. a is your point in question and b is the comparable point on the line line between the two vertices that you will check distance to.
However there's another method which I think might be more optimal for this case (as it seems most of your test points lie pretty close to the desired line already). Instead of find the perpendicular line point we can simply check the point on the line that has the same X value like this. b in this case is a lot easier to calculate:
X = a.X - 0.X;
Slope = (1.Y - 0.Y) / (1.X - 0.X);
b.X = 0.X + X;
b.Y = 0.Y + (X * Slope);
And the distance is simply the difference in Y values between a and b:
distance = abs(a.Y - b.Y);
One thing to keep in mind is that this method will become more inaccurate as the slope increases as well as become infinite when the slope is undefined. I would suggest flipping it when the slope > 1 and checking for a b that lies at the same y rather than x. That would look like this:
Y = a.Y - 0.Y;
Inverse_Slope = (1.X - 0.X) / (1.Y - 0.Y);
b.Y = 0.Y + Y;
b.X = 0.Y + (Y * Inverse_Slope);
distance = abs(a.X - b.X);
Note: You should also check whether b.X is between 0.X and 1.X and b.Y is between 0.Y and 1.Y in the second case. That way we are not checking against points that dont lie on the line segment.
I admit I don't know the perfect terminology when it comes to this kind of thing so it might be a little confusing, but hope this helps!
Rather than checking if the point is close to an edge with a prescribed tolerance, as MBo suggested, you can fin the edge with the shortest distance to the point. The distance must be computed with respect to the line segment, not the whole line.
How do you compute this distance ? Let P be the point and Q, R two edge endpoints.
Let t be in range [0,1], you need to minimize
D²(P, QR) = D²(P, Q + t QR) = (PQ + t QR)² = PQ² + 2 t PQ.QR + t² QR².
The minimum is achieved when the derivative cancels, i.e. t = - PQ.QR / QR². If this quantity exceeds the range [0,1], just clamp it to 0 or 1.
To summarize,
if t <= 0, D² = PQ²
if t >= 1, D² = PR²
otherwise, D² = PQ² - t² QR²
Loop through all the vertices, calculate the distance of that vertex to the point, find the minimum.
double min_dist = Double.MAX_VALUE;
int min_index=-1;
for(int i=0;i<num_vertices;++i) {
double d = dist(vertices[i],point);
if(d<min_dist) {
min_dist = d;
min_index = i;
}
}

Area between two lines inside the square [-1,+1] x [-1,+1]

I'm working on a project in Matlab and need to find the area between two lines inside of the square [-1,+1]x[-1,+1] intersecting in a point (xIntersection,yIntersection). So the idea is to subtract the two lines and integrate between [-1, xIntersection] and [xIntersection, +1], sum the results and if it's negative, change its sign.
For details on how I find the intersection of the two lines check this link.
I'm using Matlab's function integral(), here a snippet of my code:
xIntersection = ((x_1 * y_2 - y_1 * x_2) * (x_3 - x_4) - (x_1 - x_2) * (x_3 * y_4 - y_3 * x_4) ) / ((x_1 - x_2) * (y_3 - y_4) - (y_1 - y_2) * (x_3 - x_4));
d = #(x) g(x) - f(x);
result = integral(d, -1, xIntersection) - int( d, xIntersection, 1)
if(result < 0),
result = result * -1;
end
Note that I defined previously in the code g(x) and f(x) but haven't reported it in the snippet.
The problem is that I soon realized that the lines could intersect either inside or outside of the square, furthermore they could intersect the square on any of its sides and the number of possible combinations grows up very quickly.
I.e.:
These are just 4 cases, but considering that f(+1), f(-1), g(+1), g(-1) could be inside the interval [-1,+1], above it or under it and that the intersection could be inside or outside of the square the total number is 3*3*3*3*2 = 162.
Obviously in each case the explicit function to integrate in order to get the area between the two lines is different, but I can't possibly think of writing a switch case for each one.
Any ideas?
I think my answer to your previous question still applies, for the most part.
If you want to compute the area of the region bounded by the smaller angle of the two lines and the boundaries of the square, then you can forget about the intersection, and forget about all the different cases.
You can just use the fact that
the area of the square S is equal to 4
the value of this integral
A = quadgk(#(x) ...
abs( max(min(line1(x),+1),-1) - max(min(line2(x),+1),-1) ), -1, +1);
gives you the area between the lines (sometimes the large angle, sometimes the small angle)
the value of of min(A, S-A) is the correct answer (always the small angle).
Assuming “between the lines” means “inside the smaller angle formed by the lines”:
With the lines l and h, S := [-1,+1]x[-1,+1] and B as the border of S.
Transform l to the form l_1 + t*l_2 with l_1 and l_2 beeing vectors. Do the same for h.
If the intersection is not inside S, find the 4 intersections of l and h with B. Sort them so you get a convex quadrilateral. Calculate its area.
Else:
Find the intersection point p and find the intersection angle α between l_2 and h_2. and then check:
If α in [90°,180°] or α in [270°,360°], swap l and h.
If α > 180°, set l_2 = −l_2
Set l_1 := h_1 := p. Do once for positive t and negative t
(forwards and backwards along l_2 and h_2 from p):
Find intersections s_l and s_h of l and h with B.
If on same border of B: compute area of triangle s_l, s_h, p
If on adjacent borders of B: find the corner c of B in between the hit borders and once again sort the four points s_l, s_h, p and c so you get an convex quadrilateral and calculate it’s area.
If on opposite borders, find the matching side of B (you can do this by looking at the direction of s_l-p). Using its two corner points c_1 and c_2, you now have 5 points that form a polygon. Sort them so the polygon is convex and calculate its area.
This is still quite a few cases, but I don’t think there’s a way around that. You can simplify this somewhat by also using the polygon formula for the triangle and the quadrilateral.
How to sort the points so you get a convex polygon/quadrilateral: select any one of them as p_1 and then sort the rest of the points according to angle to p_1.
If you define an intersection function and a polygon_area that takes a list of points, sorts them and returns the area, this algorithm should be rather simple to implement.
edit: Image to help explain the comment:
Hey guys thanks for your answers, I also thought of an empirical method to find the area between the lines and wanted to share it for the sake of discussion and completeness.
If you take a large number of random points within the square [-1,+1]x[-1,+1] you can measure the area as the fraction of the points which fall in the area between the two lines.
Here's a little snippet and two images to show the different accuracy of empirical result obtained with different number of points.
minX = -1;
maxX = +1;
errors = 0;
size = 10000;
for j=1:size,
%random point in [-1,+1]
T(j,:) = minX + (maxX - minX).*rand(2,1);
%equation of the two lines is used to compute the y-value
y1 = ( ( B(2) - A(2) ) / ( B(1) - A(1) ) ) * (T(j,1) - A(1)) + A(2);
y2 = (- W(1) / W(2)) * T(j,1) -tresh / W(2);
if(T(j,2) < y1),
%point is under line one
Y1 = -1;
else
%point is above line one
Y1 = +1;
end
if(T(j,2) < y2),
%point is under line two
Y2 = -1;
else
%point is above line two
Y2 = +1;
end
if(Y1 * Y2 < 0),
errors = errors + 1;
scatter(T(j,1),T(j,2),'fill','r')
else
scatter(T(j,1),T(j,2),'fill','g')
end
end
area = (errors / size) / 4;
And here are two images, it sure takes longer than the solution posted by #Rody but as you can see you can make it accurate.
Number of points = 2000
Number of points = 10000

Curve (spline, bezier path, etc) described by a set of base points

As an input I have set of 'base' points (e.g. 9 points), and as an output I must return another set of points, which describe a curve.
A1-A9 is an input; these are the 'base' points. My task is to return a set of points, from which the user can build the depicted curve, the black line from A1-A9
My mathematical skills are low, and Googling is not very helpful. As I understand, this can be a cubic spline. I have found some C-based source code, but this code loops endlessly when I try to build spline parts, where nextPoint.x < currentPoint.x.
Please, explain me, what kind of splines, bezier paths, or other constructs I should use for my task. It will be very good if you point me to code, an algorithm, or a good manual for dummies.
Use Interpolation methods to generate the intermediate points on your curve.
For example, given the CubicInterpolate function:
double CubicInterpolate(
double y0,double y1,
double y2,double y3,
double mu)
{
double a0,a1,a2,a3,mu2;
mu2 = mu*mu;
a0 = y3 - y2 - y0 + y1;
a1 = y0 - y1 - a0;
a2 = y2 - y0;
a3 = y1;
return(a0*mu*mu2+a1*mu2+a2*mu+a3);
}
to find the point halfway between point[1] and point[2] on a cubic spline, you would use:
newPoint.X = CubicInterpolate(point[0].X, point[1].X, point[2].X, point[3].X, 0.5);
newPoint.Y = CubicInterpolate(point[0].Y, point[1].Y, point[2].Y, point[3].Y, 0.5);
point[0] and point[3] do affect the section of the curve between point[1] and point[2]. At either end of the curve, simply use the end point again.
To ensure a roughly equal distance between points, you can calculate the distance between input points to determine how many intermediate points (and mu values) to generate. So, for points that are further apart, you would use many more mu values between 0 and 1. Conversely, for points that are very close together, you may not need to add intermediate points at all.
Thank you all. I found the solution. For build 2D curve by basic points I do follow:
I found this article about a cubic spline with C++ and C# examples. This example allows to find interpolation values of 'one dimension' cubic spline by base points. Because I need a two dimension cubic spline - I create two one dimension splines - for 'x' and 'y' axes. Next, Next, I was run a cycle from first point to end point with some step and in each iteration of cycle I found interpolation value. From interpolation value I was make a point. So, when cycle has be ended I get a curve.
pseudo code (using spline class from article pointed above):
- (array*)splineByBasePoints:(array*)basePoints
{
int n = basePoints.count;
cubic_spline xSpline, ySpline;
xSpline.build_spline(basePoints.pointNumbers, basePoints.XValuesOfPoints, n);
ySpline.build_spline(basePoints.pointNumbers, basePoints.YValuesOfPoints, n);
array curve;
int t = 1; //t - intermediate point. '1' because number of point, not index
for (; t <= n; t += step)
{
[curve addToArray:PontWithXY([xSpline f:t], [ySpline f:t])];
}
return array;
}
If you have MATLAB license,
x = -4:4;
y = [0 .15 1.12 2.36 2.36 1.46 .49 .06 0];
cs = spline(x,[0 y 0]);
xx = linspace(-4,4,101);
y=ppval(cs,xx);

Line Passing Through Given Points

I am trying to find the angle of the outer line of the object in the green region of the image as shown in the image above…
For that, I have scanned the green region and get the points (dark blue points as shown in the image)...
As you can see, the points are not making straight line so I can’t find angle easily.
So I think I have to find a middle way and
that is to find the line so that the distance between each point and line remain as minimum as possible.
So how can I find the line so that each point exposes minimum distance to it……?
Is there any algorithm for this or is there any good way other than this?
The obvious route would be to do a least-squares linear regression through the points.
The standard least squares regression formulae for x on y or y on x assume there is no error in one coordinate and minimize the deviations in the coordinate from the line.
However, it is perfectly possible to set up a least squares calculation such that the value minimized is the sum of squares of the perpendicular distances of the points from the lines. I'm not sure whether I can locate the notebooks where I did the mathematics - it was over twenty years ago - but I did find the code I wrote at the time to implement the algorithm.
With:
n = ∑ 1
sx = ∑ x
sx2 = ∑ x2
sy = ∑ y
sy2 = ∑ y2
sxy = ∑ x·y
You can calculate the variances of x and y and the covariance:
vx = sx2 - ((sx * sx) / n)
vy = sy2 - ((sy * sy) / n)
vxy = sxy - ((sx * sy) / n)
Now, if the covariance is 0, then there is no semblance of a line. Otherwise, the slope and intercept can be found from:
slope = quad((vx - vy) / vxy, vxy)
intcpt = (sy - slope * sx) / n
Where quad() is a function that calculates the root of quadratic equation x2 + b·x - 1 with the same sign as c. In C, that would be:
double quad(double b, double c)
{
double b1;
double q;
b1 = sqrt(b * b + 4.0);
if (c < 0.0)
q = -(b1 + b) / 2;
else
q = (b1 - b) / 2;
return (q);
}
From there, you can find the angle of your line easily enough.
Obviously the line will pass through averaged point (x_average,y_average).
For direction you may use the following algorithm (derived directly from minimizing average square distance between line and points):
dx[i]=x[i]-x_average;
dy[i]=y[i]-y_average;
a=sum(dx[i]^2-dy[i]^2);
b=sum(2*dx[i]*dy[i]);
direction=atan2(b,a);
Usual linear regression will not work here, because it assumes that variables are not symmetric - one depends on other, so if you will swap x and y, you will have another solution.
The hough transform might be also a good option:
http://en.wikipedia.org/wiki/Hough_transform
You might try searching for "total least squares", or "least orthogonal distance" but when I tried that I saw nothing immediately applicable.
Anyway suppose you have points x[],y[], and the line is represented by a*x+b*y+c = 0, where hypot(a,b) = 1. The least orthogonal distance line is the one that minimises Sum{ (a*x[i]+b*y[i]+c)^2}. Some algebra shows that:
c is -(a*X+b*Y) where X is the mean of the x's and Y the mean of the y's.
(a,b) is the eigenvector of C corresponding to it's smaller eigenvalue, where C is the covariance matrix of the x's and y's

Resources