Equally divide area under a curve - algorithm

I have a y = sin(x) curve, with x between 0 and π (first quadrant, so no negative values).
I want to divide the area under the curve into n equal pieces and get the (largest) x value of each piece.
Any ideas would be appreciated for an algorithm.

The area under the curve is its integral. The integral of sin(x) from 0 to u is 1 - cos(u), so the integral from 0 to π is 2. Inverting that formula gives the point u at which the accumulated area reaches a given value. So, we're looking for the values t = arccos(1 - u) for the values of u that divide [0, 2] into n equal parts.
In code:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-0.2, 3.3, 500)
y = np.sin(x)
plt.plot(x, y)
n = 7
u = np.linspace(0, 2, n + 1, endpoint=True)
t = np.arccos(1 - u)
print("The limits of the areas are:", list(t))
colors = plt.cm.Set2.colors
for i in range(n):
    mask = (x > t[i]) & (x <= t[i + 1])
    plt.fill_between(x[mask], 0, y[mask], color=colors[i])
plt.xticks(t)
plt.gca().spines['bottom'].set_position('zero')
plt.gca().spines['top'].set_color('none')
plt.gca().spines['right'].set_color('none')
plt.tight_layout()
plt.show()

Though this is not an entirely rigorous answer, it may be of interest.
Dividing the area in question into three equal parts: on the base OA, taken as hypotenuse, build a right-angled triangle with leg ratio OC:AC = 4:5. Raise a vertical from the point C, and mirror that vertical symmetrically onto the right side. Division completed; the error is about 1%.
Now on the merits. One must use the recurrence X_0 = 0, X_(i+1) = X_i + Δ_(i+1), with
Δ_(i+1) = arccos( sqrt(p_i^2 - q_i) - p_i ),
where p_i = cos(X_i) * (2/n - cos(X_i)) and q_i = cos(2*X_i) + (4/n) * (1/n - cos(X_i)).
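Not part of the original answer, but here is a minimal Python sketch of that recurrence; its output can be checked against the closed form t = arccos(1 - 2k/n) from the first answer:
import numpy as np

n = 7
X = [0.0]
for _ in range(n):
    c = np.cos(X[-1])
    p = c * (2.0/n - c)
    q = np.cos(2.0 * X[-1]) + (4.0/n) * (1.0/n - c)
    # max(...) and clip(...) guard against tiny round-off outside the valid domains
    delta = np.arccos(np.clip(np.sqrt(max(p*p - q, 0.0)) - p, -1.0, 1.0))
    X.append(X[-1] + delta)

print(X)  # should match [np.arccos(1 - 2*k/n) for k in range(n + 1)]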

Related

Sample two random variables uniformly, in region where sum is greater than zero

I am trying to figure out how to sample for two random variables uniformly in the region where the sum of the two is greater than zero. I thought a solution might be to sample for X~U(-1,1) and then sample for Y~U(-x,1) where x would be the current sample for X.
But the resulting distribution doesn't look uniform: the density of points at the top left is higher and keeps reducing as we move to the right. Can someone point out where the flaw in my reasoning is and how to possibly fix this?
Thank you
You just need to adjust the density of x points away from the "top-left" corner appropriately. I'd also suggest generating in [0,1] and then transforming into [-1,1] afterwards.
For example:
import numpy as np
# generate points, sqrt takes care of moving points away from zero
n = 50000
x = np.sqrt(np.random.uniform(size=n))
y = np.random.uniform(1-x)  # low = 1-x, high defaults to 1, i.e. y in [1-x, 1)
# transform to -1,1
x = x * 2 - 1
y = y * 2 - 1
Plotting these gives a triangle filling the half of the square above the x + y = 0 line, which looks reasonable to me. Note I've colored the [-1,1] square to show where it should fit.
Could you please elaborate a bit on how you arrived at the answer?
Well, the main problem consists in getting a fair way to sample the non-uniform distribution of coordinate X.
From elementary geometry, the area of the part of the upper triangle with x < x0 is: (1/2) * (x0 + 1)^2. As the total area of this upper triangle is equal to 2, it follows that the cumulative probability P of (X < x0) within the upper triangle is: P = (1/4) * (x0 + 1)^2.
So, inverting the last formula, we have: x0 = 2*sqrt(P) - 1
Now, from the Inverse Transform Sampling theorem, we know that we can generate a fair sampling of X by reinterpreting P as a random variable U0 uniformly distributed between 0 and 1.
In Python, this gives us:
u0 = random.uniform(0.0, 1.0)
x = (2*math.sqrt(u0)) - 1.0
or equivalently:
u0 = random.random()
x = (2 * math.sqrt(u0)) - 1.0
Note that this is essentially the same maths as in the excellent answer by @SamMason. It follows from a general statistical principle: the same argument can be used to prove that a fair sampling of the latitude on a 3D sphere is given by arcsin(2*u - 1).
So now we have x, but we still need y. The underlying 2D density is a uniform one, so for a given x, all possible values of y are equidistributed.
The interval of possible values for y is [-x, 1]. So if U1 is yet another independent random variable uniformly distributed between 0 and 1, y can be drawn from the equation:
y = (1+x) * u1 - x
which in Python is rendered by:
u1 = random.random()
y = (1+x)*u1 - x
Overall, the Python code can be written like this:
import math
import random
import matplotlib.pyplot as plt
def mySampler():
    u0 = random.random()
    u1 = random.random()
    x = 2*math.sqrt(u0) - 1.0
    y = (1+x)*u1 - x
    return (x, y)
#--- Main program:
points = (mySampler() for _ in range(10000)) # an iterator object
xx, yy = zip(*points)
plt.scatter(xx, yy, s=0.2)
plt.show()
Graphically, the result looks good enough.
Side note: a cheaper, ad hoc solution:
There is always the possibility of sampling uniformly in the whole square, and rejecting the points whose x+y sum happens to be negative. But this is a bit wasteful. We can have a more elegant solution by noting that the “bad” region has the same shape and area as the “good” region.
So if we get a "bad" point, instead of just rejecting it, we can replace it by its symmetric point with respect to the x+y=0 dividing line. This can be done using the following Python code:
def mySampler2():
    x0 = random.uniform(-1.0, 1.0)
    y0 = random.uniform(-1.0, 1.0)
    s = x0 + y0
    if (s >= 0):
        return (x0, y0)        # good point
    else:
        return (x0-s, y0-s)    # symmetric of bad point
This works fine too. And this is probably the cheapest possible solution regarding CPU time, as we reject nothing, and we don't need to compute a square root.
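As a quick sanity check (not in the original answer), this sampler can be plotted the same way as the first one, reusing the imports above:
points = [mySampler2() for _ in range(10000)]
xx, yy = zip(*points)
plt.scatter(xx, yy, s=0.2)
plt.show()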
Following Generate random locations within a triangular domain, here is code to sample uniformly in any triangle (Python 3.9.4, Win 10 x64):
import math
import random
import matplotlib.pyplot as plt
def trisample(A, B, C):
    """
    Given three vertices A, B, C,
    sample point uniformly in the triangle
    """
    r1 = random.random()
    r2 = random.random()
    s1 = math.sqrt(r1)
    x = A[0] * (1.0 - s1) + B[0] * (1.0 - r2) * s1 + C[0] * r2 * s1
    y = A[1] * (1.0 - s1) + B[1] * (1.0 - r2) * s1 + C[1] * r2 * s1
    return (x, y)
random.seed(312345)
A = (1, 0)
B = (1, 1)
C = (0, 1)
points = [trisample(A, B, C) for _ in range(10000)]
xx, yy = zip(*points)
plt.scatter(xx, yy, s=0.2)
plt.show()

Results from my thin plate spline interpolation implementation depend on the scale of the independent variables

I implemented the thin plate spline algorithm (see also this description) in order to interpolate scattered data using Python.
My algorithm seems to work correctly when the bounding box of the initial scattered data has an aspect ratio close to 1. However, scaling one of the data points coordinates changes the interpolation result. I created a minimal working example that is representative of what I am trying to accomplish. Below are two plots showing the results of the interpolation of 50 random points.
First, the interpolation of z = x^2 on the domain x = [0, 3], y = [0, 120]:
As you can see, the interpolation fails. Now, executing the same process but after scaling the x values by a factor of 40, I get:
This time, the result looks better. Choosing a slightly different scaling factor would have resulted in a slightly different interpolation. This shows that something is wrong in my algorithm but I can't find what exactly. Here is the algorithm:
import numpy as np
import numba as nb
# pts1 = Mx2 matrix (original coordinates)
# z1 = Mx1 column vector (original values)
# pts2 = Nx2 matrix (interpolation coordinates)
def gen_K(n, pts1):
    K = np.zeros((n,n))
    for i in range(0,n):
        for j in range(0,n):
            if i != j:
                r = ( (pts1[i,0] - pts1[j,0])**2.0 + (pts1[i,1] - pts1[j,1])**2.0 )**0.5
                K[i,j] = r**2.0*np.log(r)
    return K
def compute_z2(m, n, pts1, pts2, coeffs):
    z2 = np.zeros((m,1))
    x_min = np.min(pts1[:,0])
    x_max = np.max(pts1[:,0])
    y_min = np.min(pts1[:,1])
    y_max = np.max(pts1[:,1])
    for k in range(0,m):
        pt = pts2[k,:]
        # If point is located inside bounding box of pts1
        if (pt[0] >= x_min and pt[0] <= x_max and pt[1] >= y_min and pt[1] <= y_max):
            z2[k,0] = coeffs[-3,0] + coeffs[-2,0]*pts2[k,0] + coeffs[-1,0]*pts2[k,1]
            for i in range(0,n):
                r2 = ( (pts1[i,0] - pts2[k,0])**2.0 + (pts1[i,1] - pts2[k,1])**2.0 )**0.5
                if r2 != 0:
                    z2[k,0] += coeffs[i,0]*( r2**2.0*np.log(r2) )
        else:
            z2[k,0] = np.nan
    return z2
gen_K_nb = nb.jit(nb.float64[:,:](nb.int64, nb.float64[:,:]), nopython = True)(gen_K)
compute_z2_nb = nb.jit(nb.float64[:,:](nb.int64, nb.int64, nb.float64[:,:], nb.float64[:,:], nb.float64[:,:]), nopython = True)(compute_z2)
def TPS(pts1, z1, pts2, factor):
    n, m = pts1.shape[0], pts2.shape[0]
    P = np.hstack((np.ones((n,1)), pts1))
    Y = np.vstack((z1, np.zeros((3,1))))
    K = gen_K_nb(n, pts1)
    K += factor*np.identity(n)
    L = np.zeros((n+3,n+3))
    L[0:n, 0:n] = K
    L[0:n, n:n+3] = P
    L[n:n+3, 0:n] = P.T
    L_inv = np.linalg.inv(L)
    coeffs = L_inv.dot(Y)
    return compute_z2_nb(m, n, pts1, pts2, coeffs)
Finally, here is the code snippet I used to create the two plots:
import matplotlib.pyplot as plt
import numpy as np
N = 50 # Number of random points
pts = np.random.rand(N,2)
pts[:,0] *= 3.0 # initial x values
pts[:,1] *= 120.0 # initial y values
z1 = (pts[:,0])**2.0
for scale in [1.0, 40.0]:
    pts1 = pts.copy()
    pts1[:,0] *= scale
    x2 = np.linspace(np.min(pts1[:,0]), np.max(pts1[:,0]), 40)
    y2 = np.linspace(np.min(pts1[:,1]), np.max(pts1[:,1]), 40)
    x2, y2 = np.meshgrid(x2, y2)
    pts2 = np.vstack((x2.flatten(), y2.flatten())).T
    z2 = TPS(pts1, z1.reshape(z1.shape[0], 1), pts2, 0.0)

    # Display
    fig = plt.figure(figsize=(4,3))
    ax = fig.add_subplot(111)
    C = ax.contourf(x2, y2, z2.reshape(x2.shape), np.linspace(0,9,10), extend='both')
    ax.plot(pts1[:,0], pts1[:,1], 'ok')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    plt.colorbar(C, extendfrac=0)
    plt.tight_layout()
    plt.show()
Thin plate spline is invariant under uniform scaling, which means that if you scale x and y by the same factor, the result is the same. However, if you scale x and y differently, then the result will be different. This is a common characteristic among radial basis functions; some radial basis functions are not even invariant under uniform scaling.
When you say it "fails", what do you mean? The big question is: does it still interpolate exactly at the construction points? Assuming your code is correct and you do not have ill-conditioning, it should, in which case it does not fail.
What I think is happening is that the scaling makes the behavior in the x direction more dominant, so you do not see the wiggles that come naturally from the interpolation.
As an aside, you can greatly speed up your code without using Numba by vectorizing.
import scipy.spatial.distance
import scipy.special
def gen_K(n, pts1):
    # No need for n, but kept to maintain compatibility
    pts1 = np.atleast_2d(pts1)
    r = scipy.spatial.distance.cdist(pts1, pts1)
    return scipy.special.xlogy(r**2, r)
That anisotropy means you can get horrible ridges running through the surface, resulting in a sub-optimal model fit. Your model is experiencing the same effect, although plotted in 2D.
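Not part of the original answers, but the scaling experiment above suggests a common workaround: normalize both coordinates to comparable ranges before fitting. A minimal sketch, written as a hypothetical wrapper around the TPS() function defined earlier:
def TPS_normalized(pts1, z1, pts2, factor):
    # Rescale each coordinate of the scattered data to [0, 1] so that
    # neither axis dominates the radial distances.
    lo = pts1.min(axis=0)
    span = pts1.max(axis=0) - lo
    return TPS((pts1 - lo) / span, z1, (pts2 - lo) / span, factor)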

Algorithm: Calculate pseudo-random point inside an ellipse

For a simple particle system I'm making, I need to, given an ellipse with width and height, calculate a random point X, Y which lies in that ellipse.
Now I'm not the best at maths, so I wanted to ask here if anybody could point me in the right direction.
Maybe the right way is to choose a random float in the range of the width, take it for X and from it calculate the Y value?
Generate a random point inside a circle of radius 1. This can be done by taking a random angle phi in the interval [0, 2*pi) and a random value rho in the interval [0, 1), and computing
x = sqrt(rho) * cos(phi)
y = sqrt(rho) * sin(phi)
The square root in the formula ensures a uniform distribution inside the circle.
Scale x and y to the dimensions of the ellipse
x = x * width/2.0
y = y * height/2.0
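Combining both steps, a minimal Python sketch (the function name and wrapper are mine, not from the original answer):
import math
import random

def random_point_in_ellipse(width, height):
    # uniform point in the unit disc; sqrt(rho) makes the density uniform
    phi = random.uniform(0, 2 * math.pi)
    rho = random.random()
    x = math.sqrt(rho) * math.cos(phi)
    y = math.sqrt(rho) * math.sin(phi)
    # scale the unit disc to the ellipse's semi-axes
    return x * width / 2.0, y * height / 2.0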
Use rejection sampling: choose a random point in the rectangle around the ellipse. Test whether the point is inside the ellipse by checking the sign of (x-x0)^2/a^2+(y-y0)^2/b^2-1. Repeat if the point is not inside. (This assumes that the ellipse is aligned with the coordinate axes. A similar solution works in the general case but is more complicated, of course.)
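For completeness, a minimal sketch of that rejection loop (not in the original answer), for an axis-aligned ellipse centered at (x0, y0) with semi-axes a and b:
import random

def random_point_rejection(x0, y0, a, b):
    while True:
        x = random.uniform(x0 - a, x0 + a)
        y = random.uniform(y0 - b, y0 + b)
        # keep the point only if it falls inside the ellipse
        if (x - x0)**2 / a**2 + (y - y0)**2 / b**2 <= 1:
            return x, y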
It is possible to generate points within an ellipse without using rejection sampling too, by carefully considering its definition in polar form. From Wikipedia, the polar form of an ellipse (relative to its center) is

r(θ) = a b / sqrt((b cos θ)^2 + (a sin θ)^2)

Intuitively speaking, we should sample the polar angle θ more often where the radius is larger. Put more mathematically, our PDF for the random variable θ should be p(θ) dθ = dA / A, where dA is the area of a single segment at angle θ with width dθ. Using the equation for polar angle area dA = (1/2) r^2 dθ and the area of an ellipse being π a b, the PDF becomes

p(θ) = r(θ)^2 / (2 π a b)

To randomly sample from this PDF, one direct method is the inverse CDF technique. This requires calculating the cumulative distribution function (CDF) and then inverting this function. Using Wolfram Alpha to get the indefinite integral, then inverting it, gives an inverse CDF of

θ = arctan( (b/a) tan(2 π u) )

(up to the quadrant corrections handled in the code below), where u runs between 0 and 1. So to sample a random angle θ, you just generate a uniform random number u between 0 and 1 and substitute it into this equation for the inverse CDF.
To get the random radius, the same technique that works for a circle can be used (see for example Generate a random point within a circle (uniformly)).
Here is some sample Python code which implements this algorithm:
import numpy
import matplotlib.pyplot as plt
import random
# Returns theta in [-pi/2, 3pi/2]
def generate_theta(a, b):
    u = random.random() / 4.0
    theta = numpy.arctan(b/a * numpy.tan(2*numpy.pi*u))
    v = random.random()
    if v < 0.25:
        return theta
    elif v < 0.5:
        return numpy.pi - theta
    elif v < 0.75:
        return numpy.pi + theta
    else:
        return -theta

def radius(a, b, theta):
    return a * b / numpy.sqrt((b*numpy.cos(theta))**2 + (a*numpy.sin(theta))**2)

def random_point(a, b):
    random_theta = generate_theta(a, b)
    max_radius = radius(a, b, random_theta)
    random_radius = max_radius * numpy.sqrt(random.random())
    return numpy.array([
        random_radius * numpy.cos(random_theta),
        random_radius * numpy.sin(random_theta)
    ])
a = 2
b = 1
points = numpy.array([random_point(a, b) for _ in range(2000)])
plt.scatter(points[:,0], points[:,1])
plt.show()
I know this is an old question, but I think none of the existing answers are good enough.
I was looking for a solution to exactly the same problem and got directed here by Google. I found that all the existing answers were not what I wanted, so I implemented my own solution entirely by myself, using information found here: https://en.wikipedia.org/wiki/Ellipse
Any point on the ellipse must satisfy the equation x^2/a^2 + y^2/b^2 = 1. So how do we make a point inside the ellipse?
Just scale a and b with two random numbers between 0 and 1.
I will post my code here; I just want to help.
import math
import matplotlib.pyplot as plt
import random
from matplotlib.patches import Ellipse
a = 4
b = a*math.tan(math.radians((random.random()+0.5)/2*45))
def random_point(a, b):
    d = math.radians(random.random()*360)
    return (a * math.cos(d) * random.random(), b * math.sin(d) * random.random())
points = [random_point(a, b) for i in range(360)]
x, y = zip(*points)
fig = plt.figure(frameon=False)
ax = fig.add_subplot(111)
ax.set_axis_off()
ax.add_patch(Ellipse((0, 0), 2*a, 2*b, edgecolor='k', fc='None', lw=2))
ax.scatter(x, y)
fig.subplots_adjust(left=0, bottom=0, right=1, top=1, wspace=0, hspace=0)
plt.axis('scaled')
plt.box(False)
ax = plt.gca()
ax.set_xlim([-a, a])
ax.set_ylim([-b, b])
plt.set_cmap('rainbow')
plt.show()

Does anyone know how to do an "inverse" trilinear interpolation?

Trilinear interpolation approximates the value of a point (x, y, z) inside a cube using the values at the cube vertices. I'm trying to do an "inverse" trilinear interpolation. Knowing the values at the cube vertices and the value attached to a point, how can I find (x, y, z)? Any help would be highly appreciated. Thank you!
You are solving for 3 unknowns given 1 piece of data, and as the interpolation is trilinear your answer will typically be a surface (2 free variables). Depending on the cube there may be no solutions or a 3D solution space.
I would do the following. Let v be the initial value. For each of the 12 "edges" (pairs of adjacent vertices) of the cube, look to see if one vertex is >= v and the other <= v - call this an edge that crosses v.
If no edges cross v, then there are no possible solutions.
Otherwise, for each edge that crosses v, if both vertices for the edge equal v, then the whole edge is a solution. Otherwise, linearly interpolate on the edge to find the point that has a value of v. So suppose the edge is (x1, y1, z1)->v1 <= v <= (x2, y2, z2)->v2.
s = (v - v1) / (v2 - v1)
(x, y, z) = (s*(x2-x1) + x1, s*(y2-y1) + y1, s*(z2-z1) + z1)
This will give you all edge points that are equal to v. This is a solution, but possibly you want an internal solution - be aware that if there is an internal solution there will always be an edge solution.
If you want an internal solution then just take any point linearly between the edge solutions - as you are linearly interpolating then the result will also be v.
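Here is a minimal Python sketch of that edge search (not from the original answer). verts is a hypothetical dict mapping each 0/1 corner triple to its vertex value; the degenerate whole-edge case (v1 == v2 == v) is skipped for brevity:
from itertools import product

def edge_solutions(verts, v):
    corners = list(product((0, 1), repeat=3))
    # the 12 cube edges are the corner pairs differing in exactly one coordinate
    edges = [(p1, p2) for p1 in corners for p2 in corners
             if p1 < p2 and sum(a != b for a, b in zip(p1, p2)) == 1]
    points = []
    for p1, p2 in edges:
        v1, v2 = verts[p1], verts[p2]
        if min(v1, v2) <= v <= max(v1, v2) and v1 != v2:
            s = (v - v1) / (v2 - v1)  # linear interpolation along the edge
            points.append(tuple(a + s * (b - a) for a, b in zip(p1, p2)))
    return points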
I'm not sure you can in all cases. For example, using trilinear filtering for colours where the colour (C) at every vertex is identical means that wherever you interpolate you will still get the colour C returned. In this situation ANY (x, y, z) could be valid, so it would be impossible to say for certain what the initial interpolation values were.
I'm sure for some cases you can reverse the maths but, I imagine, there are far too many cases where this is impossible to do without knowing more of the input information.
Good luck, I hope someone will prove me wrong :)
The wikipedia page for trilinear interpolation has a link to a NASA page which allegedly describes the inverting process - have you had a look at that?
The problem as you're describing it is somewhat ill-defined.
What you're asking for basically translates to this: I have a 3D function and I know its values in 8 known points. I'd like to know what is the point in which the function received value V.
The trouble is that in all likelihood there is an infinite number of such points, which form a set of surfaces, lines or points, depending on the data.
One way to find this set is to use an iso-surfacing algorithm like Marching cubes.
Let's start with 2D: think of a bilinear hill over a square km, with heights say 0 10 20 30 at the 4 corners, and a horizontal plane cutting the hill at height z. Draw a line from the 0 corner to the 30 corner (whether adjacent or diagonal). The plane must cut this line, for any z, so all points x,y,z fall on this one line, right? Hmm.

OK, there are many solutions -- any z plane cuts the hill in a contour curve. Say we want the solutions to be spread out over the whole hill, i.e. to minimize two things at once: the vertical distance z - bilin(x,y), and the distance from x,y to some point in the square. Scipy.optimize.leastsq is one way of doing this, sample code below; trilinear is similar.

(Optimizing any two things at once requires an arbitrary tradeoff or weighting: food vs. money, work vs. play ... Cf. Bounded rationality.)
""" find x,y so bilin(x,y) ~ z and x,y near the middle """
from __future__ import division
import numpy as np
from scipy.optimize import leastsq
zmax = 30
corners = [ 0, 10, 20, zmax ]
midweight = 10
def bilin( x, y ):
""" bilinear interpolate
in: corners at 0 0 0 1 1 0 1 1 in that order (binary)
see wikipedia Bilinear_interpolation ff.
"""
z00,z01,z10,z11 = corners # 0 .. 1
return (z00 * (1-x) * (1-y)
+ z01 * (1-x) * y
+ z10 * x * (1-y)
+ z11 * x * y)
vecs = np.array([ (x, y) for x in (.25, .5, .75) for y in (.25, .5, .75) ])
def nearvec( x, vecs ):
""" -> (min, nearest vec) """
t = (np.inf,)
for v in vecs:
n = np.linalg.norm( x - v )
if n < t[0]: t = (n, v)
return t
def lsqmin( xy ): # z, corners
x,y = xy
near = nearvec( np.array(xy), vecs )[0] * midweight
return (z - bilin( x, y ), near )
# i.e. find x,y so both bilin(x,y) ~ z and x,y near a point in vecs
#...............................................................................
if __name__ == "__main__":
import sys
ftol = .1
maxfev = 10
exec "\n".join( sys.argv[1:] ) # ftol= ...
x0 = np.array(( .5, .5 ))
sumdiff = 0
for z in range(zmax+1):
xetc = leastsq( lsqmin, x0, ftol=ftol, maxfev=maxfev, full_output=1 )
# (x, {cov_x, infodict, mesg}, ier)
x,y = xetc[0] # may be < 0 or > 1
diff = bilin( x, y ) - z
sumdiff += abs(diff)
print "%.2g %8.2g %5.2g %5.2g" % (z, diff, x, y)
print "ftol %.2g maxfev %d midweight %.2g => av diff %.2g" % (
ftol, maxfev, midweight, sumdiff/zmax)

3D Least Squares Plane

What's the algorithm for computing a least squares plane in (x, y, z) space, given a set of 3D data points? In other words, if I had a bunch of points like (1, 2, 3), (4, 5, 6), (7, 8, 9), etc., how would one go about calculating the best fit plane f(x, y) = ax + by + c? What's the algorithm for getting a, b, and c out of a set of 3D points?
If you have n data points (x[i], y[i], z[i]), compute the 3x3 symmetric matrix A whose entries are:
sum_i x[i]*x[i], sum_i x[i]*y[i], sum_i x[i]
sum_i x[i]*y[i], sum_i y[i]*y[i], sum_i y[i]
sum_i x[i], sum_i y[i], n
Also compute the 3 element vector b:
{sum_i x[i]*z[i], sum_i y[i]*z[i], sum_i z[i]}
Then solve Ax = b for the given A and b. The three components of the solution vector are the coefficients to the least-square fit plane {a,b,c}.
Note that this is the "ordinary least squares" fit, which is appropriate only when z is expected to be a linear function of x and y. If you are looking more generally for a "best fit plane" in 3-space, you may want to learn about "geometric" least squares.
Note also that this will fail if your points are in a line, as your example points are.
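Not in the original answer, but here is a minimal numpy sketch of these normal equations (pts is assumed to be an N-by-3 array of points):
import numpy as np

def fit_plane(pts):
    # Build the 3x3 matrix A and vector b described above, then solve A x = b
    # for the coefficients (a, b, c) of z = a*x + b*y + c.
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    A = np.array([[np.sum(x*x), np.sum(x*y), np.sum(x)],
                  [np.sum(x*y), np.sum(y*y), np.sum(y)],
                  [np.sum(x),   np.sum(y),   len(pts)]])
    b = np.array([np.sum(x*z), np.sum(y*z), np.sum(z)])
    # np.linalg.solve raises LinAlgError for degenerate (e.g. collinear) points
    return np.linalg.solve(A, b)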
The equation for a plane is: ax + by + c = z. So set up matrices like this with all your data:
    | x_0  y_0  1 |
A = | x_1  y_1  1 |
    |     ...     |
    | x_n  y_n  1 |

and

    | a |
x = | b |
    | c |

and

    | z_0 |
B = | z_1 |
    | ... |
    | z_n |

In other words: Ax = B. Now solve for x, which contains your coefficients. But since (I assume) you have more than 3 points, the system is over-determined, so you need to use the left pseudo-inverse. So the answer is:

| a |
| b | = (A^T A)^-1 A^T B
| c |
And here is some simple Python code with an example:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
N_POINTS = 10
TARGET_X_SLOPE = 2
TARGET_y_SLOPE = 3
TARGET_OFFSET = 5
EXTENTS = 5
NOISE = 5
# create random data
xs = [np.random.uniform(-EXTENTS, EXTENTS) for i in range(N_POINTS)]
ys = [np.random.uniform(-EXTENTS, EXTENTS) for i in range(N_POINTS)]
zs = []
for i in range(N_POINTS):
    zs.append(xs[i]*TARGET_X_SLOPE + \
              ys[i]*TARGET_y_SLOPE + \
              TARGET_OFFSET + np.random.normal(scale=NOISE))
# plot raw data
plt.figure()
ax = plt.subplot(111, projection='3d')
ax.scatter(xs, ys, zs, color='b')
# do fit
tmp_A = []
tmp_b = []
for i in range(len(xs)):
    tmp_A.append([xs[i], ys[i], 1])
    tmp_b.append(zs[i])
b = np.matrix(tmp_b).T
A = np.matrix(tmp_A)
fit = (A.T * A).I * A.T * b
errors = b - A * fit
residual = np.linalg.norm(errors)
print("solution:")
print("%f x + %f y + %f = z" % (fit[0], fit[1], fit[2]))
print("errors:")
print(errors)
print("residual:")
print(residual)
# plot plane
xlim = ax.get_xlim()
ylim = ax.get_ylim()
X, Y = np.meshgrid(np.arange(xlim[0], xlim[1]),
                   np.arange(ylim[0], ylim[1]))
Z = np.zeros(X.shape)
for r in range(X.shape[0]):
    for c in range(X.shape[1]):
        Z[r,c] = fit[0] * X[r,c] + fit[1] * Y[r,c] + fit[2]
ax.plot_wireframe(X,Y,Z, color='k')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
Unless someone tells me how to type equations here, let me just write down the final computations you have to do:
First, given points r_i \in \R^3, i = 1..N, calculate the center of mass of all points:
r_G = (1/N) \sum_{i=1}^N r_i
Then, calculate the normal vector n that, together with the base vector r_G, defines the plane, by computing the 3x3 matrix A as
A = \sum_{i=1}^N (r_i - r_G)(r_i - r_G)^T
With this matrix, the normal vector n is given by the eigenvector of A corresponding to the minimal eigenvalue of A.
To find out about the eigenvector/eigenvalue pairs, use any linear algebra library of your choice.
This solution is based on the Rayleigh-Ritz theorem for the Hermitian matrix A.
See 'Least Squares Fitting of Data' by David Eberly for how I came up with this one to minimize the geometric fit (orthogonal distance from points to the plane).
bool Geom_utils::Fit_plane_direct(const arma::mat& pts_in, Plane& plane_out)
{
    bool success(false);
    int K(pts_in.n_cols);
    if(pts_in.n_rows == 3 && K > 2)  // check for bad sizing and indeterminate case
    {
        plane_out._p_3 = (1.0/static_cast<double>(K))*arma::sum(pts_in,1);
        arma::mat A(pts_in);
        A.each_col() -= plane_out._p_3;           //[x1-p, x2-p, ..., xk-p]
        arma::mat33 M(A*A.t());
        arma::vec3 D;
        arma::mat33 V;
        if(arma::eig_sym(D,V,M))
        {
            // diagonalization succeeded
            plane_out._n_3 = V.col(0);            // in ascending order by default
            if(plane_out._n_3(2) < 0)
            {
                plane_out._n_3 = -plane_out._n_3; // upward pointing
            }
            success = true;
        }
    }
    return success;
}
Timed at 37 microseconds fitting a plane to 1000 points (Windows 7, i7, 32-bit program).
This reduces to the Total Least Squares problem, which can be solved using an SVD decomposition.
C++ code using OpenCV:
float fitPlaneToSetOfPoints(const std::vector<cv::Point3f> &pts, cv::Point3f &p0, cv::Vec3f &nml) {
    const int SCALAR_TYPE = CV_32F;
    typedef float ScalarType;

    // Calculate centroid
    p0 = cv::Point3f(0,0,0);
    for (int i = 0; i < pts.size(); ++i)
        p0 = p0 + conv<cv::Vec3f>(pts[i]);
    p0 *= 1.0/pts.size();

    // Compose data matrix subtracting the centroid from each point
    cv::Mat Q(pts.size(), 3, SCALAR_TYPE);
    for (int i = 0; i < pts.size(); ++i) {
        Q.at<ScalarType>(i,0) = pts[i].x - p0.x;
        Q.at<ScalarType>(i,1) = pts[i].y - p0.y;
        Q.at<ScalarType>(i,2) = pts[i].z - p0.z;
    }

    // Compute SVD decomposition and the Total Least Squares solution, which is the
    // eigenvector corresponding to the least eigenvalue
    cv::SVD svd(Q, cv::SVD::MODIFY_A|cv::SVD::FULL_UV);
    nml = svd.vt.row(2);

    // Calculate the actual RMS error
    float err = 0;
    for (int i = 0; i < pts.size(); ++i)
        err += powf(nml.dot(pts[i] - p0), 2);
    err = sqrtf(err / pts.size());

    return err;
}
As with any least-squares approach, you proceed like this:
Before you start coding
Write down an equation for a plane in some parameterization, say 0 = ax + by + z + d in three parameters (a, b, d).
Find an expression D(\vec{v}; a, b, d) for the distance from an arbitrary point \vec{v} to the plane.
Write down the sum S = \sum_{i=0}^{n} D^2(\vec{v}_i), and simplify until it is expressed in terms of simple sums of the components of v like \sum v_x, \sum v_y^2, \sum v_x v_z ...
Write down the per-parameter minimization expressions dS/da = 0, dS/db = 0, dS/dd = 0, which gives you a set of three equations in the three parameters and the sums from the previous step.
Solve this set of equations for the parameters.
(Or for simple cases, just look up the form.) Using a symbolic algebra package (like Mathematica) could make your life much easier.
The coding
Write code to form the needed sums and find the parameters from the last set above.
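As an illustration of steps 1-5 (not from the original answer), here is a small sympy sketch using the vertical residual D = ax + by + z + d and three arbitrary, non-collinear sample points:
import sympy as sp

a, b, d, x, y, z = sp.symbols('a b d x y z')

# Steps 1-2: plane 0 = a*x + b*y + z + d; D is its vertical residual at (x, y, z)
D = a*x + b*y + z + d

# Step 3: sum of squared residuals over arbitrary non-collinear sample points
pts = [(1, 2, 3), (4, 5, 6), (7, 9, 10)]
S = sum(D.subs({x: xi, y: yi, z: zi})**2 for xi, yi, zi in pts)

# Steps 4-5: set each partial derivative to zero and solve for the parameters
sol = sp.solve([sp.diff(S, v) for v in (a, b, d)], (a, b, d))
print(sol)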
Alternatives
Note that if you actually had only three points, you'd be better just finding the plane that goes through them.
Also, if the analytic solution in unfeasible (not the case for a plane, but possible in general) you can do steps 1 and 2, and use a Monte Carlo minimizer on the sum in step 3.
CGAL::linear_least_squares_fitting_3
Function linear_least_squares_fitting_3 computes the best fitting 3D
line or plane (in the least squares sense) of a set of 3D objects such
as points, segments, triangles, spheres, balls, cuboids or tetrahedra.
http://www.cgal.org/Manual/latest/doc_html/cgal_manual/Principal_component_analysis_ref/Function_linear_least_squares_fitting_3.html
It sounds like all you want to do is linear regression with 2 regressors. The wikipedia page on the subject should tell you all you need to know and then some.
All you'll have to do is to solve the system of equations.
If those are your points:
(1, 2, 3), (4, 5, 6), (7, 8, 9)
That gives you the equations:
3=a*1 + b*2 + c
6=a*4 + b*5 + c
9=a*7 + b*8 + c
So your question actually should be: How do I solve a system of equations?
Therefore I recommend reading this SO question.
If I've misunderstood your question let us know.
EDIT:
Ignore my answer as you probably meant something else.
We first present a linear least-squares plane fitting method that minimizes the residuals of the plane equation over the provided points.
Recall that the equation for a plane passing through origin is Ax + By + Cz = 0, where (x, y, z) can be any point on the plane and (A, B, C) is the normal vector perpendicular to this plane.
The equation for a general plane (that may or may not pass through origin) is Ax + By + Cz + D = 0, where the additional coefficient D represents how far the plane is away from the origin, along the direction of the normal vector of the plane. [Note that in this equation (A, B, C) forms a unit normal vector.]
Now, we can apply a trick here and fit the plane using only the provided point coordinates. Divide both sides by D and move the constant term to the right-hand side. This leads to A/D x + B/D y + C/D z = -1. [Note that in this equation (A/D, B/D, C/D) forms a normal vector with length 1/D.]
We can set up a system of linear equations accordingly, and then solve it by an Eigen solver in C++ as follows.
// Example for 5 points
Eigen::Matrix<double, 5, 3> matA; // row: 5 points; column: xyz coordinates
Eigen::Matrix<double, 5, 1> matB = -1 * Eigen::Matrix<double, 5, 1>::Ones();
// (fill matA with the point coordinates, one point per row, before solving)

// Find the plane normal
Eigen::Vector3d normal = matA.colPivHouseholderQr().solve(matB);

// Check if the fitting is healthy
double D = 1 / normal.norm();
normal.normalize(); // normal is a unit vector from now on
bool planeValid = true;
for (int i = 0; i < 5; ++i) { // compare Ax + By + Cz + D with 0.2 (ideally Ax + By + Cz + D = 0)
    if ( fabs( normal(0)*matA(i, 0) + normal(1)*matA(i, 1) + normal(2)*matA(i, 2) + D) > 0.2) {
        planeValid = false; // 0.2 is an experimental threshold; can be tuned
        break;
    }
}
We then discuss its equivalence to the typical SVD-based method and their comparison.
The aforementioned linear least-squares (LLS) method fits the general plane equation Ax + By + Cz + D = 0, whereas the SVD-based method replaces D with D = - (Ax0 + By0 + Cz0) and fits the plane equation A(x-x0) + B(y-y0) + C(z-z0) = 0, where (x0, y0, z0) is the mean of all points that serves as the origin of the new local coordinate frame.
Comparison between two methods:
The LLS fitting method is much faster than the SVD-based method, and is suitable for use when points are known to be roughly in a plane shape.
The SVD-based method is more numerically stable when the plane is far away from origin, because the LLS method would require more digits after decimal to be stored and processed in such cases.
The LLS method can detect outliers by checking the dot product residual between each point and the estimated normal vector, whereas the SVD-based method can detect outliers by checking if the smallest eigenvalue of the covariance matrix is significantly smaller than the two larger eigenvalues (i.e. checking the shape of the covariance matrix).
We finally provide a test case in C++ and MATLAB.
// Test case in C++ (using LLS fitting method)
matA(0,0) = 5.4637; matA(0,1) = 10.3354; matA(0,2) = 2.7203;
matA(1,0) = 5.8038; matA(1,1) = 10.2393; matA(1,2) = 2.7354;
matA(2,0) = 5.8565; matA(2,1) = 10.2520; matA(2,2) = 2.3138;
matA(3,0) = 6.0405; matA(3,1) = 10.1836; matA(3,2) = 2.3218;
matA(4,0) = 5.5537; matA(4,1) = 10.3349; matA(4,2) = 1.8796;
// With this sample data, LLS fitting method can produce the following result
// fitted normal vector = (-0.0231143, -0.0838307, -0.00266429)
// unit normal vector = (-0.265682, -0.963574, -0.0306241)
// D = 11.4943
% Test case in MATLAB (using SVD-based method)
points = [5.4637 10.3354 2.7203;
5.8038 10.2393 2.7354;
5.8565 10.2520 2.3138;
6.0405 10.1836 2.3218;
5.5537 10.3349 1.8796]
covariance = cov(points)
[V, D] = eig(covariance)
normal = V(:, 1) % pick the eigenvector that corresponds to the smallest eigenvalue
% normal = (0.2655, 0.9636, 0.0306)
