Uniform sampling around hemisphere - probability

Suppose that rand() generates a random value in [0, 1] uniformly. Is the direction of the ray generated by the following method uniformly distributed? (I am doing Monte Carlo integration.)
X = rand() * 2 - 1
Y = rand() * 2 - 1
Z = rand()
vec3 dir = vec3(X, Y, Z).normalized()

I figured it out myself.
This method produces sample points uniformly in the cube, but after normalization, different numbers of points get projected onto different parts of the hemisphere's surface. Thus p(x) is not constant.
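For reference, a minimal sketch of a sampler that is uniform over the hemisphere (numpy assumed; the function name is just for illustration): draw each component from a Gaussian, normalize, then reflect into z >= 0.

import numpy as np

def uniform_hemisphere_dir():
    # an isotropic Gaussian is rotationally symmetric, so the
    # normalized vector is uniform over the full sphere
    v = np.random.normal(size=3)
    v /= np.linalg.norm(v)
    v[2] = abs(v[2])  # fold into the upper hemisphere (z >= 0)
    return v          # p(omega) = 1 / (2*pi), constant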

Related

Obtaining a region of evenly distributed points on a sphere

There are several questions on this site about distributing points on the surface of a sphere, but all of these are based on actually generating all of the points on that sphere. My favorite thus far is the golden spiral discussed in Evenly distributing n points on a sphere.
I need to cover a sphere in trillions of points, but only ever need to actually generate a tiny region of the surface (earth down to ~10 meters, looking at a roughly 1 km^2 area). The points generated for that region must match the points that would be generated for the entire sphere (i.e., stitching small regions together must yield the same result as generating a larger region), and generation should be pretty fast.
My attempts to use the golden spiral with such a large number of points have been thwarted by floating point precision issues.
The best I've managed to come up with is generating points at equally spaced latitudes and calculating longitudinal spacing based on the circumference at that latitude. The result is far from satisfactory however (especially the resulting horizontal rings of points).
Does anyone have a suggestion for generating a small region of distributed points on the surface of a sphere?
The vertices of a geodesic sphere would work well in this application.
You start with an icosahedron, divide each face into a triangular mesh of whatever resolution you like, and project the points onto the surface of the sphere.
The Fibonacci sphere approximation is quite easy to generalize to computing only a subset of the points efficiently, as the analytic formulas are very straightforward.
The code below computes such a subset for a trillion-point sphere in a few seconds of runtime on my weak laptop, with a relatively under-optimised Python implementation.
It includes a means to verify that the subset computation is exactly the same as a brute-force computation (but don't try that for a trillion points; it will never finish unless you have a supercomputer!).
Please note, the use of long doubles (np.longdouble) is an absolute requirement when you do the computation over more than about a billion points, as there are major quantisation artefacts otherwise!
Runtime scales with r' * N, where r' is the ratio of the subset to the full sphere. Thus, a very small r' can be computed very efficiently.
#!/usr/bin/env python3
import argparse
import mpl_toolkits.mplot3d.axes3d as ax3d
import matplotlib.pyplot as plt
import numpy as np

def fibonacci_sphere_pts(num_pts):
    ga = (3 - np.sqrt(5)) * np.pi  # golden angle
    # Create a list of golden angle increments along the range of number of points
    theta = ga * np.arange(num_pts)
    # Z is split into a range of -1 to 1 in order to create a unit circle
    z = np.linspace(1 / num_pts - 1, 1 - 1 / num_pts, num_pts)
    # a list of the radii at each height step of the unit circle
    radius = np.sqrt(1 - z * z)
    # Determine where xy fall on the sphere, given the azimuthal and polar angles
    y = radius * np.sin(theta)
    x = radius * np.cos(theta)
    return np.asarray(list(zip(x, y, z)))

def fibonacci_sphere(num_pts):
    x, y, z = zip(*fibonacci_sphere_pts(num_pts))
    # Display points in a scatter plot
    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    ax.scatter(x, y, z)
    plt.show()

def fibonacci_sphere_subset_pts(num_pts, p0, r0):
    """
    Get a subset of a full fibonacci_sphere
    """
    ga = (3 - np.sqrt(5)) * np.pi  # golden angle
    x0, y0, z0 = p0
    z_s = 1 / num_pts - 1
    z_e = 1 - 1 / num_pts
    # linspace formula for range [z_s, z_e] for N points is
    #   z_k = z_s + (z_e - z_s) / (N - 1) * k, for k in [0, N)
    # therefore
    #   k = (z_k - z_s) * (N - 1) / (z_e - z_s)
    # would be the closest value of k
    k = int(np.round((z0 - z_s) * (num_pts - 1) / (z_e - z_s)))
    # here a sufficient number of "layers" of the fibonacci sphere must be
    # selected to obtain enough points to be a superset of the subset given the
    # radius; we use a heuristic to determine the number, but it can be obtained
    # exactly by the correct formula instead (by choosing an upper bound)
    dz = (z_e - z_s) / (num_pts - 1)
    n_dk = int(np.ceil(r0 / dz))
    dk = np.arange(k - n_dk, k + n_dk + 1)
    dk = dk[np.where((dk >= 0) & (dk < num_pts))[0]]
    # NOTE: *must* use long double over regular doubles below, otherwise there
    # are major quantization errors in the output for large numbers of points
    theta = ga * dk.astype(np.longdouble)
    z = z_s + (z_e - z_s) / (num_pts - 1) * dk
    radius = np.sqrt(1 - z * z)
    y = radius * np.sin(theta)
    x = radius * np.cos(theta)
    idx = np.where(np.square(x - x0) + np.square(y - y0) + np.square(z - z0) <= r0 * r0)[0]
    return x[idx], y[idx], z[idx]

def fibonacci_sphere_subset(num_pts, p0, r0, do_compare=False):
    """
    Display fib sphere subset points and optionally compare against bruteforce computation
    """
    x, y, z = fibonacci_sphere_subset_pts(num_pts, p0, r0)
    if do_compare:
        subset = list(zip(x, y, z))
        subset_bf = fibonacci_sphere_pts(num_pts)
        x0, y0, z0 = p0
        subset_bf = [(x, y, z) for (x, y, z) in subset_bf
                     if np.square(x - x0) + np.square(y - y0) + np.square(z - z0) <= r0 * r0]
        subset_bf = np.asarray(subset_bf)
        if np.allclose(subset, subset_bf):
            print('PASS: subset and bruteforce computation agree completely')
        else:
            print('FAIL: subset and bruteforce computation DO NOT agree completely')
    # Display points in a scatter plot
    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    ax.scatter(x, y, z)
    plt.show()

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="fibonacci sphere")
    parser.add_argument(
        "numpts", type=int, help="number of points to distribute along sphere"
    )
    args = parser.parse_args()
    # hard-coded query point with a tiny fixed radius;
    # note all coordinates fall between -1 and 1
    p0 = (.5, .5, np.sqrt(1. - .5 * .5 - .5 * .5))  # center of the subset
    r0 = .00001  # radius of the subset; very small since the full sphere has radius 1.0
    fibonacci_sphere_subset(int(args.numpts), p0, r0, do_compare=False)

Convert a bivariate draw into a univariate draw in Matlab

I have in mind the following experiment to run in Matlab, and I am asking for help to implement step (3). Any suggestion would be very appreciated.
(1) Consider the random variables X and Y, both uniformly distributed on [0,1].
(2) Draw N realisations from the joint distribution of X and Y, assuming that X and Y are independent (meaning that X and Y are uniformly jointly distributed on [0,1]x[0,1]). Each draw will be in [0,1]x[0,1].
(3) Transform each draw in [0,1]x[0,1] into a draw in [0,1] using the Hilbert space-filling curve: under the Hilbert curve mapping, the draw in [0,1]x[0,1] should be the image of one (or more, because of surjectivity) point(s) in [0,1]. I want to pick one of these points. Is there any pre-built package in Matlab doing this?
I found this answer which I don't think does what I want as it explains how to obtain the Hilbert value of the draw (curve length from the start of curve to the picked point)
On Wikipedia I found this code in the C language (from (x,y) to d) which, again, does not answer my question.
EDIT This answer does not address updated version of the question, which explicitly asks about constructing Hilbert curve. Instead, this answer addresses a related question on construction of bijective mapping, and the relation to uniform distribution.
Your problem is not really well defined. If you only need the resulting distribution to be uniform, nothing is stopping you from simply picking f:(X,Y)->X. The result would be uniform regardless of whether X and Y are correlated. From your post I can only presume that what you want, in fact, is for the resulting transformation to be bijective, or as close to it as possible given machine precision limitations.
Worth noting that unless you need the algorithm that is best in preserving locality (which is clearly not required for resulting distribution to be bijective, not to mention uniform), there's no need to bother constructing Hilbert curves that you mention in your question. They have just as much to do with the solution as any other space-filling curve, and are incredibly computationally intensive.
So assuming you're looking for a bijective mapping, your question is equivalent to asking whether the set of points in a [unit] square has the same cardinality as the set of points in a [unit] line segment, and if it is, how to construct that bijection, i.e. 1-to-1 correspondence. The intuition says the square should have a higher cardinality, and Cantor spent 3 years trying to prove that, eventually proving quite the opposite - that these sets are in fact equinumerous. He was so surprised at his discovery that he wrote:
I see it, but I don't believe it!
The most commonly referred-to bijection, fulfilling** this criterion, is the following. Represent x and y in their decimal form, i.e. x=0. x1 x2 x3 x4 x5..., and y=0. y1 y2 y3 y4 y5..., and let f:(X,Y)->Z be z=0. x1 y1 x2 y2 x3 y3 x4 y4 x5 y5..., i.e. alternating the decimals of the two numbers. The idea behind the bijection is trivial, though a rigorous proof requires quite a bit of prior knowledge.
** The caveat is that if we take e.g. x = 1/3 = 0.33333... and y = 1/5 = 0.199999... = 0.200000..., we can see there are two sequences corresponding to them: z = 0.313939393939... and z = 0.323030303030.... To overcome this obstacle we have to prove that adding a countable set to an uncountable one does not change the cardinality of the latter.
In reality we have to deal with machine precision and not pure math, which strictly speaking means both sets are actually finite and hence not equinumerous (assuming you store the result with the same precision as the original numbers). This means we're simply forced to make some assumptions and lose some information, such as, in this case, the last half of the significant digits of x and y. That is, unless we use a different data type that allows storing the result with double the precision of the original variables.
Finally, sample implementation in Matlab:
x = rand();
y = rand();
chars = [num2str(x, '%.17f'); num2str(y, '%.17f')];
z = str2double(['0.' reshape(chars(:,3:end), 1, [])]);
>> cellstr(['x=' num2str(x, '%.17f'); 'y=' num2str(y, '%.17f'); 'z=' num2str(z, '%.17f')])
ans =
'x=0.65549803980353738'
'y=0.10975505072305158'
'z=0.61505947958500362'
Edit This answers the original request for a transformation f(x,y) -> t ~ U[0,1] given x,y ~ U[0,1], and additionally for x and y correlated. The updated question asks specifically for a Hilbert curve, H(x,y) -> t ~ U[0,1] and only for x,y ~ U[0,1] so this answer is no longer relevant.
Consider a random uniform sequence in [0,1] r1, r2, r3, .... You are assigning this sequence to pairs of numbers (x1,y1), (x2,y2), .... What you are asking for is a transformation on pairs (x,y) which yield a uniform random number in [0,1].
Consider the random subsequence r1, r3, ... corresponding to x1, x2, .... If you trust that your number generator is random and uncorrelated in [0,1], then the subsequence x1, x2, ... should also be random and uncorrelated in [0,1]. So the rather simple answer to the first part of your question is a projection onto either the x or y axis. That is, just pick x.
Next consider correlations between x and y. Since you haven't specified the nature of the correlation, let's assume a simple scaling of the axes,
such as x' => [0, 0.5], y' => [0, 3.0], followed by a rotation. The scaling doesn't introduce any correlation since x' and y' are still independent. You can generate it easily enough with a matrix multiply:
M1*p = [x_scale, 0; 0, y_scale] * [x; y]
for matrix M1 and point p. You can introduce a correlation by taking this stretched form and rotating it by theta:
M2*M1*p = [cos(theta), sin(theta); -sin(theta), cos(theta)]*M1*p
Putting it all together with theta = pi/4, or 45 degrees, you can see that larger values of y are correlated with larger values of x:
cos_t = cos(pi/4); sin_t = cos_t; % at 45 degrees, sin(t) = cos(t) = 1/sqrt(2)
M2 = [cos_t, sin_t; -sin_t, cos_t];
M1 = [0.5, 0.0; 0.0, 3.0];
p = rand(2, 1000);
p_prime = M2*M1*p;
plot(p_prime(1,:), p_prime(2,:), '.');
axis('equal');
The resulting plot* shows a band of uniformly distributed numbers at a 45 degree angle:
Further transformations are possible with shear, and if you are clever about it, translation (OpenGL uses 4x4 transformation matrices so that translation can be represented as a linear transform matrix, with an extra dimension added before the transformation steps and removed before they are done).
Given a known affine correlation structure, you can transform back from random points (x',y') to points (x,y) where x and y are independent in [0,1] by solving Mk*...*M1 p = p_prime for p, or equivalently, by setting p = inv(Mk*...*M1) * p_prime, where p=[x;y]. Again, just pick x, which will be uniform in [0,1]. This doesn't work if the transformation matrix is singular, e.g., if you introduce a projection matrix Mj into the mix (though if the projection is the first step you can still recover).
* You may notice that the plot is from python rather than matlab. I don't have matlab or octave sitting in front of me right now, so I hope I got the syntax details right.
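To make the recovery step concrete, here is a small numpy sketch of the same scale-then-rotate construction and its inversion (illustrative only; the variable names mirror the snippet above):

import numpy as np

theta = np.pi / 4
M2 = np.array([[ np.cos(theta), np.sin(theta)],
               [-np.sin(theta), np.cos(theta)]])  # rotation
M1 = np.diag([0.5, 3.0])                          # axis scaling

p = np.random.rand(2, 1000)                # independent uniforms in [0,1]
p_prime = M2 @ M1 @ p                      # correlated samples

p_rec = np.linalg.solve(M2 @ M1, p_prime)  # invert the known affine transform
x = p_rec[0]                               # uniform in [0,1] again; just pick x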
You could compute the Hilbert curve from f(x,y)=z. Basically it's a Hamiltonian path traversal. You can find a good description at Nick's spatial index Hilbert curve quadtree blog. Or take a look at monotonic n-ary Gray code. I've written an implementation based on Nick's blog in PHP: http://monstercurves.codeplex.com.
I will focus only on your last point
(3) Transform each draw in [0,1]x[0,1] into a draw in [0,1] using the Hilbert space-filling curve: under the Hilbert curve mapping, the draw in [0,1]x[0,1] should be the image of one (or more, because of surjectivity) point(s) in [0,1]. I want to pick one of these points. Is there any pre-built package in Matlab doing this?
As far as I know, there aren't pre-built packages in Matlab doing this, but the good news is that the code on wikipedia can be called from MATLAB, and it is as simple as putting together the conversion routine with a gateway function in a xy2d.c file:
#include "mex.h"
// source: https://en.wikipedia.org/wiki/Hilbert_curve
// rotate/flip a quadrant appropriately
void rot(int n, int *x, int *y, int rx, int ry) {
if (ry == 0) {
if (rx == 1) {
*x = n-1 - *x;
*y = n-1 - *y;
}
//Swap x and y
int t = *x;
*x = *y;
*y = t;
}
}
// convert (x,y) to d
int xy2d (int n, int x, int y) {
int rx, ry, s, d=0;
for (s=n/2; s>0; s/=2) {
rx = (x & s) > 0;
ry = (y & s) > 0;
d += s * s * ((3 * rx) ^ ry);
rot(s, &x, &y, rx, ry);
}
return d;
}
/* The gateway function */
void mexFunction( int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
int n; /* input scalar */
int x; /* input scalar */
int y; /* input scalar */
int *d; /* output scalar */
/* check for proper number of arguments */
if(nrhs!=3) {
mexErrMsgIdAndTxt("MyToolbox:arrayProduct:nrhs","Three inputs required.");
}
if(nlhs!=1) {
mexErrMsgIdAndTxt("MyToolbox:arrayProduct:nlhs","One output required.");
}
/* get the value of the scalar inputs */
n = mxGetScalar(prhs[0]);
x = mxGetScalar(prhs[1]);
y = mxGetScalar(prhs[2]);
/* create the output */
plhs[0] = mxCreateDoubleScalar(xy2d(n,x,y));
/* get a pointer to the output scalar */
d = mxGetPr(plhs[0]);
}
and compile it with mex('xy2d.c').
The above implementation
[...] assumes a square divided into n by n cells, for n a power of 2, with integer coordinates, with (0,0) in the lower left corner, (n-1,n-1) in the upper right corner.
In practice, a discretization step is required before applying the mapping. As in every discretization problem, it is crucial to choose the precision wisely. The snippet below puts everything together.
close all; clear; clc;
% number of random samples
NSAMPL = 100;
% unit square divided into n-by-n cells
% has to be a power of 2
n = 2^2;
% quantum
d = 1/n;
N = 0:d:1;
% generate random samples
x = rand(1,NSAMPL);
y = rand(1,NSAMPL);
% discretization
bX = floor(x/d);
bY = floor(y/d);
% 2d to 1d mapping
dd = zeros(1,NSAMPL);
for iid = 1:length(dd)
    dd(iid) = xy2d(n, bX(iid), bY(iid));
end
figure;
hold on;
axis equal;
plot(x, y, '.');
plot(repmat([0;1], 1, length(N)), repmat(N, 2, 1), '-r');
plot(repmat(N, 2, 1), repmat([0;1], 1, length(N)), '-r');
figure;
plot(1:NSAMPL, dd);
xlabel('# of sample')

Algorithm to generate random 2D polygon

I'm not sure how to approach this problem, or how complex a task it is. My aim is to have an algorithm that generates any polygon. My only requirement is that the polygon is not complex (i.e. sides do not intersect). I'm using Matlab for doing the maths, but anything abstract is welcome.
Any aid/direction?
EDIT:
I was thinking more of code that could generate any polygon, even things like this:
I took @MitchWheat's and @templatetypedef's idea of sampling points on a circle and took it a bit farther.
In my application I need to be able to control how weird the polygons are, i.e. start with regular polygons and, as I crank up the parameters, they get increasingly chaotic. The basic idea is as stated by @templatetypedef: walk around the circle taking a random angular step each time, and at each step put a point at a random radius. In equations, I'm generating the angular steps and radii as

theta_{i+1} = theta_i + U(2*pi/n - epsilon, 2*pi/n + epsilon)
r_i = clip(N(r_ave, sigma), 0, 2*r_ave)

where theta_i and r_i give the angle and radius of each point relative to the centre, U(min, max) pulls a random number from a uniform distribution, N(mu, sigma) pulls a random number from a Gaussian distribution, and clip(x, min, max) thresholds a value into a range. This gives us two really nice parameters to control how wild the polygons are: epsilon, which I'll call irregularity, controls whether or not the points are uniformly spaced angularly around the circle, and sigma, which I'll call spikeyness, controls how much the points can vary from the circle of radius r_ave. If you set both of these to 0 then you get perfectly regular polygons; if you crank them up then the polygons get crazier.
I whipped this up quickly in python and got stuff like this:
Here's the full python code:
import math, random
from typing import List, Tuple

def generate_polygon(center: Tuple[float, float], avg_radius: float,
                     irregularity: float, spikiness: float,
                     num_vertices: int) -> List[Tuple[float, float]]:
    """
    Start with the center of the polygon at center, then creates the
    polygon by sampling points on a circle around the center.
    Random noise is added by varying the angular spacing between
    sequential points, and by varying the radial distance of each
    point from the centre.

    Args:
        center (Tuple[float, float]):
            a pair representing the center of the circumference used
            to generate the polygon.
        avg_radius (float):
            the average radius (distance of each generated vertex to
            the center of the circumference) used to generate points
            with a normal distribution.
        irregularity (float):
            variance of the spacing of the angles between consecutive
            vertices.
        spikiness (float):
            variance of the distance of each vertex to the center of
            the circumference.
        num_vertices (int):
            the number of vertices of the polygon.

    Returns:
        List[Tuple[float, float]]: list of vertices, in CCW order.
    """
    # Parameter check
    if irregularity < 0 or irregularity > 1:
        raise ValueError("Irregularity must be between 0 and 1.")
    if spikiness < 0 or spikiness > 1:
        raise ValueError("Spikiness must be between 0 and 1.")

    irregularity *= 2 * math.pi / num_vertices
    spikiness *= avg_radius
    angle_steps = random_angle_steps(num_vertices, irregularity)

    # now generate the points
    points = []
    angle = random.uniform(0, 2 * math.pi)
    for i in range(num_vertices):
        radius = clip(random.gauss(avg_radius, spikiness), 0, 2 * avg_radius)
        point = (center[0] + radius * math.cos(angle),
                 center[1] + radius * math.sin(angle))
        points.append(point)
        angle += angle_steps[i]

    return points

def random_angle_steps(steps: int, irregularity: float) -> List[float]:
    """Generates the division of a circumference in random angles.

    Args:
        steps (int):
            the number of angles to generate.
        irregularity (float):
            variance of the spacing of the angles between consecutive vertices.

    Returns:
        List[float]: the list of the random angles.
    """
    # generate n angle steps
    angles = []
    lower = (2 * math.pi / steps) - irregularity
    upper = (2 * math.pi / steps) + irregularity
    cumsum = 0
    for i in range(steps):
        angle = random.uniform(lower, upper)
        angles.append(angle)
        cumsum += angle

    # normalize the steps so that point 0 and point n+1 are the same
    cumsum /= (2 * math.pi)
    for i in range(steps):
        angles[i] /= cumsum
    return angles

def clip(value, lower, upper):
    """
    Given an interval, values outside the interval are clipped to the interval
    edges.
    """
    return min(upper, max(value, lower))
@MateuszKonieczny here is code to create an image of a polygon from a list of vertices.

from PIL import Image, ImageDraw

vertices = generate_polygon(center=(250, 250),
                            avg_radius=100,
                            irregularity=0.35,
                            spikiness=0.2,
                            num_vertices=16)

black = (0, 0, 0)
white = (255, 255, 255)
img = Image.new('RGB', (500, 500), white)
im_px_access = img.load()
draw = ImageDraw.Draw(img)

# either use .polygon(), if you want to fill the area with a solid colour
draw.polygon(vertices, outline=black, fill=white)

# or .line() if you want to control the line thickness, or use both methods together!
draw.line(vertices + [vertices[0]], width=2, fill=black)

img.show()

# now you can save the image (img), or do whatever else you want with it.
There's a neat way to do what you want by taking advantage of the MATLAB classes DelaunayTri and TriRep and the various methods they employ for handling triangular meshes. The code below follows these steps to create an arbitrary simple polygon:
Generate a number of random points equal to the desired number of sides plus a fudge factor. The fudge factor ensures that, regardless of the result of the triangulation, we should have enough facets to be able to trim the triangular mesh down to a polygon with the desired number of sides.
Create a Delaunay triangulation of the points, resulting in a convex polygon that is constructed from a series of triangular facets.
If the boundary of the triangulation has more edges than desired, pick a random triangular facet on the edge that has a unique vertex (i.e. the triangle only shares one edge with the rest of the triangulation). Removing this triangular facet will reduce the number of boundary edges.
If the boundary of the triangulation has fewer edges than desired, or the previous step was unable to find a triangle to remove, pick a random triangular facet on the edge that has only one of its edges on the triangulation boundary. Removing this triangular facet will increase the number of boundary edges.
If no triangular facets can be found matching the above criteria, post a warning that a polygon with the desired number of sides couldn't be found and return the x and y coordinates of the current triangulation boundary. Otherwise, keep removing triangular facets until the desired number of edges is met, then return the x and y coordinates of triangulation boundary.
Here's the resulting function:
function [x, y, dt] = simple_polygon(numSides)

    if numSides < 3
        x = [];
        y = [];
        dt = DelaunayTri();
        return
    end

    oldState = warning('off', 'MATLAB:TriRep:PtsNotInTriWarnId');

    fudge = ceil(numSides/10);
    x = rand(numSides+fudge, 1);
    y = rand(numSides+fudge, 1);
    dt = DelaunayTri(x, y);
    boundaryEdges = freeBoundary(dt);
    numEdges = size(boundaryEdges, 1);

    while numEdges ~= numSides
        if numEdges > numSides
            triIndex = vertexAttachments(dt, boundaryEdges(:,1));
            triIndex = triIndex(randperm(numel(triIndex)));
            keep = (cellfun('size', triIndex, 2) ~= 1);
        end
        if (numEdges < numSides) || all(keep)
            triIndex = edgeAttachments(dt, boundaryEdges);
            triIndex = triIndex(randperm(numel(triIndex)));
            triPoints = dt([triIndex{:}], :);
            keep = all(ismember(triPoints, boundaryEdges(:,1)), 2);
        end
        if all(keep)
            warning('Couldn''t achieve desired number of sides!');
            break
        end
        triPoints = dt.Triangulation;
        triPoints(triIndex{find(~keep, 1)}, :) = [];
        dt = TriRep(triPoints, x, y);
        boundaryEdges = freeBoundary(dt);
        numEdges = size(boundaryEdges, 1);
    end

    boundaryEdges = [boundaryEdges(:,1); boundaryEdges(1,1)];
    x = dt.X(boundaryEdges, 1);
    y = dt.X(boundaryEdges, 2);

    warning(oldState);

end
And here are some sample results:
The generated polygons could be either convex or concave, but for larger numbers of desired sides they will almost certainly be concave. The polygons are also generated from points randomly generated within a unit square, so polygons with larger numbers of sides will generally look like they have a "squarish" boundary (such as the lower right example above with the 50-sided polygon). To modify this general bounding shape, you can change the way the initial x and y points are randomly chosen (i.e. from a Gaussian distribution, etc.).
For a convex 2D polygon (totally off the top of my head):
Generate a random radius, R
Generate N random points on the circumference of a circle of Radius R
Move around the circle and draw straight lines between adjacent points on the circle.
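A minimal Python sketch of that recipe (numpy assumed; the function name is mine, not from the answer). Visiting the circle points in sorted angular order is what keeps the polygon simple and convex:

import numpy as np

def random_convex_polygon(n):
    R = np.random.uniform(0.5, 2.0)                    # random radius
    angles = np.sort(np.random.uniform(0, 2*np.pi, n))
    # adjacent points on the circle, visited in angular order,
    # form a convex polygon inscribed in the circle
    return np.column_stack((R*np.cos(angles), R*np.sin(angles)))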
As @templatetypedef and @MitchWheat said, it is easy to do so by generating N random angles and radii. It is important to sort the angles, otherwise it will not be a simple polygon. Note that I am using a neat trick to draw closed curves - I described it here. By the way, the polygons might be concave.
Note that all of these polygons will be star shaped. Generating a more general polygon is not a simple problem at all.
Just to give you a taste of the problem - check out
http://www.cosy.sbg.ac.at/~held/projects/rpg/rpg.html
and http://compgeom.cs.uiuc.edu/~jeffe/open/randompoly.html.
function CreateRandomPoly()
    figure();
    colors = {'r','g','b','k'};
    for i=1:5
        [x,y] = CreatePoly();
        c = colors{mod(i-1, numel(colors)) + 1};
        plotc(x, y, c);
        hold on;
    end
end

function [x,y] = CreatePoly()
    numOfPoints = randi(30);
    theta = randi(360, [1 numOfPoints]);
    theta = theta * pi / 180;
    theta = sort(theta);
    rho = randi(200, size(theta));
    [x,y] = pol2cart(theta, rho);
    xCenter = randi([-1000 1000]);
    yCenter = randi([-1000 1000]);
    x = x + xCenter;
    y = y + yCenter;
end

function plotc(x, y, varargin)
    x = [x(:); x(1)];
    y = [y(:); y(1)];
    plot(x, y, varargin{:})
end
Here is a working MATLAB port of Mike Ounsworth's solution. I did not optimize it for MATLAB; I might update the solution later to do that.
function [points] = generatePolygon(ctrX, ctrY, aveRadius, irregularity, spikeyness, numVerts)
%{
Start with the centre of the polygon at ctrX, ctrY,
then creates the polygon by sampling points on a circle around the centre.
Random noise is added by varying the angular spacing between sequential points,
and by varying the radial distance of each point from the centre.

Params:
ctrX, ctrY   - coordinates of the "centre" of the polygon
aveRadius    - in px, the average radius of this polygon, this roughly controls how large the polygon is, really only useful for order of magnitude.
irregularity - [0,1] indicating how much variance there is in the angular spacing of vertices. [0,1] will map to [0, 2pi/numberOfVerts]
spikeyness   - [0,1] indicating how much variance there is in each vertex from the circle of radius aveRadius. [0,1] will map to [0, aveRadius]
numVerts     - self-explanatory

Returns a list of vertices, in CCW order.

Website: https://stackoverflow.com/questions/8997099/algorithm-to-generate-random-2d-polygon
%}

    irregularity = clip(irregularity, 0, 1) * 2*pi / numVerts;
    spikeyness = clip(spikeyness, 0, 1) * aveRadius;

    % generate n angle steps
    angleSteps = [];
    lower = (2*pi / numVerts) - irregularity;
    upper = (2*pi / numVerts) + irregularity;
    sum = 0;
    for i = 1:numVerts
        tmp = unifrnd(lower, upper);
        angleSteps(i) = tmp;
        sum = sum + tmp;
    end

    % normalize the steps so that point 0 and point n+1 are the same
    k = sum / (2*pi);
    for i = 1:numVerts
        angleSteps(i) = angleSteps(i) / k;
    end

    % now generate the points
    points = [];
    angle = unifrnd(0, 2*pi);
    for i = 1:numVerts
        r_i = clip(normrnd(aveRadius, spikeyness), 0, 2*aveRadius);
        x = ctrX + r_i * cos(angle);
        y = ctrY + r_i * sin(angle);
        points(i,:) = [x, y];
        angle = angle + angleSteps(i);
    end

end

function value = clip(x, min, max)
    if( min > max ); value = x; return; end
    if( x < min )  ; value = min; return; end
    if( x > max )  ; value = max; return; end
    value = x;
end

random unit vector in multi-dimensional space

I'm working on a data mining algorithm where I want to pick a random direction from a particular point in the feature space.
If I pick a random number for each of the n dimensions from [-1,1] and then normalize the vector to a length of 1, will I get an even distribution across all possible directions?
I'm speaking only theoretically here, since computer-generated random numbers are not actually random.
One simple trick is to select each dimension from a gaussian distribution, then normalize:
from random import gauss

def make_rand_vector(dims):
    vec = [gauss(0, 1) for i in range(dims)]
    mag = sum(x**2 for x in vec) ** .5
    return [x/mag for x in vec]
For example, if you want a 7-dimensional random vector, select 7 random values (from a Gaussian distribution with mean 0 and standard deviation 1). Then, compute the magnitude of the resulting vector using the Pythagorean formula (square each value, add the squares, and take the square root of the result). Finally, divide each value by the magnitude to obtain a normalized random vector.
If your number of dimensions is large, this has the strong benefit of always working immediately. Generating random vectors until one happens to have magnitude less than one, by contrast, will cause your computer to simply hang beyond a dozen or so dimensions, because the probability of any sample qualifying becomes vanishingly small.
You will not get a uniformly distributed ensemble of angles with the algorithm you described. The angles will be biased toward the corners of your n-dimensional hypercube.
This can be fixed by eliminating any points with distance greater than 1 from the origin. Then you're dealing with a spherical rather than a cubical (n-dimensional) volume, and your set of angles should then be uniformly distributed over the sample space.
Pseudocode:
Let n be the number of dimensions, K the desired number of vectors:

vec_count = 0
while vec_count < K
    generate n uniformly distributed values a[0..n-1] over [-1, 1]
    r_squared = sum over i=0,n-1 of a[i]^2
    if 0 < r_squared <= 1.0
        b[i] = a[i]/sqrt(r_squared)  ; normalize to length of 1
        add vector b[0..n-1] to output list
        vec_count = vec_count + 1
    else
        reject this sample
end while
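A direct Python rendering of that pseudocode might look like this (numpy assumed; the function name is just for illustration):

import numpy as np

def rejection_sample_directions(n, K):
    out = []
    while len(out) < K:
        a = np.random.uniform(-1.0, 1.0, size=n)
        r_squared = np.dot(a, a)
        if 0 < r_squared <= 1.0:
            out.append(a / np.sqrt(r_squared))  # normalize to length 1
        # otherwise reject this sample and draw again
    return np.array(out)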
There is a Boost implementation of the algorithm that samples from normal distributions: random::uniform_on_sphere.
I had the exact same question when developing an ML algorithm.
I came to the same conclusion as Jim Lewis after drawing samples for the 2-d case and plotting the resulting distribution of the angle.
Furthermore, if you try to derive the density distribution for the direction in 2d when you draw at random from [-1,1] for the x- and y-axis, you will see that:
f_X(x) = 1/(4*cos²(x)) if 0° < x < 45°
and
f_X(x) = 1/(4*sin²(x)) if x > 45°
where x is the angle, and f_X is the probability density distribution.
I have written about this here:
https://aerodatablog.wordpress.com/2018/01/14/random-hyperplanes/
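As a quick empirical check of that bias (my own illustrative snippet, not from the post): histogram the angles produced by the square-then-normalize method, and the counts peak toward the square's corners rather than being flat.

import numpy as np

pts = np.random.uniform(-1, 1, size=(100000, 2))
angles = np.degrees(np.arctan2(pts[:, 1], pts[:, 0]))

# a flat histogram would mean uniform directions; instead the counts
# peak near +-45 and +-135 degrees, i.e. toward the square's corners
hist, _ = np.histogram(angles, bins=36, range=(-180, 180))
print(np.round(hist / hist.mean(), 2))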
// Marsaglia (1972)-style mapping from two uniform variates to the unit sphere.
// For a uniform spherical distribution, (u,v) must be uniform over the unit
// disk, so samples from the square with u*u + v*v >= 1 are rejected.
// unitrand in [-1,1].
double u, v, s;
do {
    u = unitrand();
    v = unitrand();
    s = u*u + v*v;
} while (s >= 1.0);
double w = 2.0 * sqrt(1.0 - s);
double x = w * u;
double y = w * v;
double z = 1.0 - 2.0 * s;

3D Least Squares Plane

What's the algorithm for computing a least squares plane in (x, y, z) space, given a set of 3D data points? In other words, if I had a bunch of points like (1, 2, 3), (4, 5, 6), (7, 8, 9), etc., how would one go about calculating the best fit plane f(x, y) = ax + by + c? What's the algorithm for getting a, b, and c out of a set of 3D points?
If you have n data points (x[i], y[i], z[i]), compute the 3x3 symmetric matrix A whose entries are:
sum_i x[i]*x[i], sum_i x[i]*y[i], sum_i x[i]
sum_i x[i]*y[i], sum_i y[i]*y[i], sum_i y[i]
sum_i x[i], sum_i y[i], n
Also compute the 3 element vector b:
{sum_i x[i]*z[i], sum_i y[i]*z[i], sum_i z[i]}
Then solve Ax = b for the given A and b. The three components of the solution vector are the coefficients {a, b, c} of the least-squares fit plane.
Note that this is the "ordinary least squares" fit, which is appropriate only when z is expected to be a linear function of x and y. If you are looking more generally for a "best fit plane" in 3-space, you may want to learn about "geometric" least squares.
Note also that this will fail if your points are in a line, as your example points are.
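For illustration, a minimal numpy sketch of this recipe (the function name is mine, not from the answer):

import numpy as np

def fit_plane_ols(x, y, z):
    # build the 3x3 normal-equation matrix A and vector b described
    # above, then solve for (a, b, c) in z = a*x + b*y + c
    x, y, z = map(np.asarray, (x, y, z))
    A = np.array([[np.sum(x*x), np.sum(x*y), np.sum(x)],
                  [np.sum(x*y), np.sum(y*y), np.sum(y)],
                  [np.sum(x),   np.sum(y),   len(x)  ]])
    b = np.array([np.sum(x*z), np.sum(y*z), np.sum(z)])
    return np.linalg.solve(A, b)  # (a, b, c)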
The equation for a plane is: ax + by + c = z. So set up matrices like this with all your data:
    [x_0 y_0 1]
A = [x_1 y_1 1]
    [   ...   ]
    [x_n y_n 1]

and

    [a]
x = [b]
    [c]

and

    [z_0]
B = [z_1]
    [...]
    [z_n]

In other words: Ax = B. Now solve for x, which contains your coefficients. But since (I assume) you have more than 3 points, the system is over-determined, so you need to use the left pseudoinverse. So the answer is:

[a]
[b] = (A^T A)^-1 A^T B
[c]
And here is some simple Python code with an example:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np

N_POINTS = 10
TARGET_X_SLOPE = 2
TARGET_y_SLOPE = 3
TARGET_OFFSET = 5
EXTENTS = 5
NOISE = 5

# create random data
xs = [np.random.uniform(0, 2*EXTENTS) - EXTENTS for i in range(N_POINTS)]
ys = [np.random.uniform(0, 2*EXTENTS) - EXTENTS for i in range(N_POINTS)]
zs = []
for i in range(N_POINTS):
    zs.append(xs[i]*TARGET_X_SLOPE + \
              ys[i]*TARGET_y_SLOPE + \
              TARGET_OFFSET + np.random.normal(scale=NOISE))

# plot raw data
plt.figure()
ax = plt.subplot(111, projection='3d')
ax.scatter(xs, ys, zs, color='b')

# do fit
tmp_A = []
tmp_b = []
for i in range(len(xs)):
    tmp_A.append([xs[i], ys[i], 1])
    tmp_b.append(zs[i])
b = np.matrix(tmp_b).T
A = np.matrix(tmp_A)
fit = (A.T * A).I * A.T * b
errors = b - A * fit
residual = np.linalg.norm(errors)

print("solution:")
print("%f x + %f y + %f = z" % (fit[0, 0], fit[1, 0], fit[2, 0]))
print("errors:")
print(errors)
print("residual:")
print(residual)

# plot plane
xlim = ax.get_xlim()
ylim = ax.get_ylim()
X, Y = np.meshgrid(np.arange(xlim[0], xlim[1]),
                   np.arange(ylim[0], ylim[1]))
Z = np.zeros(X.shape)
for r in range(X.shape[0]):
    for c in range(X.shape[1]):
        Z[r, c] = fit[0, 0] * X[r, c] + fit[1, 0] * Y[r, c] + fit[2, 0]
ax.plot_wireframe(X, Y, Z, color='k')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
Unless someone tells me how to type equations here, let me just write down the final computations you have to do:
First, given points r_i \in \R^3, i = 1..N, calculate the center of mass of all points:
r_G = \frac{\sum_{i=1}^N r_i}{N}
Then, calculate the normal vector n that, together with the base point r_G, defines the plane, by calculating the 3x3 matrix A as
A = \sum_{i=1}^N (r_i - r_G)(r_i - r_G)^T
With this matrix, the normal vector n is now given by the eigenvector of A corresponding to the minimal eigenvalue of A.
To find out about the eigenvector/eigenvalue pairs, use any linear algebra library of your choice.
This solution is based on the Rayleigh-Ritz Theorem for the Hermitian matrix A.
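A minimal numpy sketch of that computation (assuming the points are the rows of an N x 3 array; names are mine):

import numpy as np

def fit_plane_eig(points):
    r_G = points.mean(axis=0)   # center of mass
    d = points - r_G
    A = d.T @ d                 # 3x3 scatter matrix
    w, V = np.linalg.eigh(A)    # eigenvalues in ascending order
    n = V[:, 0]                 # eigenvector of the smallest eigenvalue
    return r_G, n               # base point and plane normal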
See 'Least Squares Fitting of Data' by David Eberly for how I came up with this one to minimize the geometric fit (orthogonal distance from points to the plane).
bool Geom_utils::Fit_plane_direct(const arma::mat& pts_in, Plane& plane_out)
{
    bool success(false);
    int K(pts_in.n_cols);
    if(pts_in.n_rows == 3 && K > 2) // check for bad sizing and indeterminate case
    {
        plane_out._p_3 = (1.0/static_cast<double>(K))*arma::sum(pts_in,1);
        arma::mat A(pts_in);
        A.each_col() -= plane_out._p_3; // [x1-p, x2-p, ..., xk-p]
        arma::mat33 M(A*A.t());
        arma::vec3 D;
        arma::mat33 V;
        if(arma::eig_sym(D, V, M))
        {
            // diagonalization succeeded
            plane_out._n_3 = V.col(0); // in ascending order by default
            if(plane_out._n_3(2) < 0)
            {
                plane_out._n_3 = -plane_out._n_3; // upward pointing
            }
            success = true;
        }
    }
    return success;
}
Timed at 37 microseconds fitting a plane to 1000 points (Windows 7, i7, 32-bit program).
This reduces to the Total Least Squares problem, which can be solved using the SVD.
C++ code using OpenCV:
float fitPlaneToSetOfPoints(const std::vector<cv::Point3f> &pts, cv::Point3f &p0, cv::Vec3f &nml) {
    const int SCALAR_TYPE = CV_32F;
    typedef float ScalarType;

    // Calculate centroid
    p0 = cv::Point3f(0,0,0);
    for (int i = 0; i < pts.size(); ++i)
        p0 = p0 + conv<cv::Vec3f>(pts[i]);
    p0 *= 1.0/pts.size();

    // Compose data matrix subtracting the centroid from each point
    cv::Mat Q(pts.size(), 3, SCALAR_TYPE);
    for (int i = 0; i < pts.size(); ++i) {
        Q.at<ScalarType>(i,0) = pts[i].x - p0.x;
        Q.at<ScalarType>(i,1) = pts[i].y - p0.y;
        Q.at<ScalarType>(i,2) = pts[i].z - p0.z;
    }

    // Compute SVD decomposition and the Total Least Squares solution, which is
    // the eigenvector corresponding to the least eigenvalue
    cv::SVD svd(Q, cv::SVD::MODIFY_A|cv::SVD::FULL_UV);
    nml = svd.vt.row(2);

    // Calculate the actual RMS error
    float err = 0;
    for (int i = 0; i < pts.size(); ++i)
        err += powf(nml.dot(pts[i] - p0), 2);
    err = sqrtf(err / pts.size());

    return err;
}
As with any least-squares approach, you proceed like this:
Before you start coding
Write down an equation for a plane in some parameterization, say 0 = ax + by + z + d in three parameters (a, b, d).
Find an expression D(\vec{v}; a, b, d) for the distance from an arbitrary point \vec{v} to the plane.
Write down the sum S = \sum_{i=0}^{n} D^2(\vec{x}_i), and simplify until it is expressed in terms of simple sums of the components of v like \sum v_x, \sum v_y^2, \sum v_x v_z ...
Write down the per-parameter minimization expressions dS/da = 0, dS/db = 0, dS/dd = 0, which gives you a set of three equations in three parameters and the sums from the previous step.
Solve this set of equations for the parameters.
(or for simple cases, just look up the form). Using a symbolic algebra package (like Mathematica) could make your life much easier.
The coding
Write code to form the needed sums and find the parameters from the last set above.
Alternatives
Note that if you actually had only three points, you'd be better off just finding the plane that goes through them.
Also, if the analytic solution is infeasible (not the case for a plane, but possible in general) you can do steps 1 and 2, and use a Monte Carlo minimizer on the sum in step 3.
CGAL::linear_least_squares_fitting_3
Function linear_least_squares_fitting_3 computes the best fitting 3D
line or plane (in the least squares sense) of a set of 3D objects such
as points, segments, triangles, spheres, balls, cuboids or tetrahedra.
http://www.cgal.org/Manual/latest/doc_html/cgal_manual/Principal_component_analysis_ref/Function_linear_least_squares_fitting_3.html
It sounds like all you want to do is linear regression with 2 regressors. The wikipedia page on the subject should tell you all you need to know and then some.
All you'll have to do is to solve the system of equations.
If those are your points:
(1, 2, 3), (4, 5, 6), (7, 8, 9)
That gives you the equations:
3=a*1 + b*2 + c
6=a*4 + b*5 + c
9=a*7 + b*8 + c
So your question actually should be: How do I solve a system of equations?
Therefore I recommend reading this SO question.
If I've misunderstood your question let us know.
EDIT:
Ignore my answer as you probably meant something else.
We first present a linear least-squares plane fitting method that minimizes the residuals between the estimated normal vector and provided points.
Recall that the equation for a plane passing through origin is Ax + By + Cz = 0, where (x, y, z) can be any point on the plane and (A, B, C) is the normal vector perpendicular to this plane.
The equation for a general plane (that may or may not pass through origin) is Ax + By + Cz + D = 0, where the additional coefficient D represents how far the plane is away from the origin, along the direction of the normal vector of the plane. [Note that in this equation (A, B, C) forms a unit normal vector.]
Now, we can apply a trick here and fit the plane using only provided point coordinates. Divide both sides by D and rearrange this term to the right-hand side. This leads to A/D x + B/D y + C/D z = -1. [Note that in this equation (A/D, B/D, C/D) forms a normal vector with length 1/D.]
We can set up a system of linear equations accordingly, and then solve it by an Eigen solver in C++ as follows.
// Example for 5 points
Eigen::Matrix<double, 5, 3> matA; // row: 5 points; column: xyz coordinates
Eigen::Matrix<double, 5, 1> matB = -1 * Eigen::Matrix<double, 5, 1>::Ones();
// (fill matA with the point coordinates, one row per point, as in the test case below)

// Find the plane normal
Eigen::Vector3d normal = matA.colPivHouseholderQr().solve(matB);

// Check if the fitting is healthy
double D = 1 / normal.norm();
normal.normalize(); // normal is a unit vector from now on
bool planeValid = true;
for (int i = 0; i < 5; ++i) { // compare Ax + By + Cz + D with 0.2 (ideally Ax + By + Cz + D = 0)
    if (fabs(normal(0)*matA(i, 0) + normal(1)*matA(i, 1) + normal(2)*matA(i, 2) + D) > 0.2) {
        planeValid = false; // 0.2 is an experimental threshold; can be tuned
        break;
    }
}
We then discuss its equivalence to the typical SVD-based method and their comparison.
The aforementioned linear least-squares (LLS) method fits the general plane equation Ax + By + Cz + D = 0, whereas the SVD-based method replaces D with D = - (Ax0 + By0 + Cz0) and fits the plane equation A(x-x0) + B(y-y0) + C(z-z0) = 0, where (x0, y0, z0) is the mean of all points that serves as the origin of the new local coordinate frame.
Comparison between two methods:
The LLS fitting method is much faster than the SVD-based method, and is suitable for use when points are known to be roughly in a plane shape.
The SVD-based method is more numerically stable when the plane is far away from origin, because the LLS method would require more digits after decimal to be stored and processed in such cases.
The LLS method can detect outliers by checking the dot product residual between each point and the estimated normal vector, whereas the SVD-based method can detect outliers by checking if the smallest eigenvalue of the covariance matrix is significantly smaller than the two larger eigenvalues (i.e. checking the shape of the covariance matrix).
We finally provide a test case in C++ and MATLAB.
// Test case in C++ (using LLS fitting method)
matA(0,0) = 5.4637; matA(0,1) = 10.3354; matA(0,2) = 2.7203;
matA(1,0) = 5.8038; matA(1,1) = 10.2393; matA(1,2) = 2.7354;
matA(2,0) = 5.8565; matA(2,1) = 10.2520; matA(2,2) = 2.3138;
matA(3,0) = 6.0405; matA(3,1) = 10.1836; matA(3,2) = 2.3218;
matA(4,0) = 5.5537; matA(4,1) = 10.3349; matA(4,2) = 1.8796;
// With this sample data, LLS fitting method can produce the following result
// fitted normal vector = (-0.0231143, -0.0838307, -0.00266429)
// unit normal vector = (-0.265682, -0.963574, -0.0306241)
// D = 11.4943
% Test case in MATLAB (using SVD-based method)
points = [5.4637 10.3354 2.7203;
5.8038 10.2393 2.7354;
5.8565 10.2520 2.3138;
6.0405 10.1836 2.3218;
5.5537 10.3349 1.8796]
covariance = cov(points)
[V, D] = eig(covariance)
normal = V(:, 1) % pick the eigenvector that corresponds to the smallest eigenvalue
% normal = (0.2655, 0.9636, 0.0306)
