I am concerned with the following algorithm:
As input, it takes n points in n-dimensional space in rectangular coordinates. These n points define an (n-1)-dimensional hyperplane (we can ignore the infinitesimal probability that they don't). As output, I would like the equation of this hyperplane.
Is there a known algorithm - or at least a known complexity class - for this problem?
Thanks in advance.
The equation you're looking for is
A_1 x_1 + A_2 x_2 + ... + A_n x_n + C = 0
for some coefficients A_1, ..., A_n and C, where the x_i are the rectangular coordinates of a point on the hyperplane. Substituting each input point gives a set of n simultaneous linear equations in these n+1 unknowns, which you can solve up to a common scale factor (for example by Gaussian elimination, in O(n^3) time).
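As a rough sketch of that substitution step (the function name hyperplane_through is made up for illustration, and exact rational arithmetic is an assumption, not something the answer prescribes), one can row-reduce the n x (n+1) system whose rows are [x_1, ..., x_n, 1] and read off the one-dimensional null space:

```python
from fractions import Fraction

def hyperplane_through(points):
    """Return [A_1, ..., A_n, C] such that A.x + C = 0 for every input point.

    Hypothetical helper: Gauss-Jordan elimination on the n x (n+1) system,
    with the solution fixed up to scale by setting the free unknown to 1.
    """
    n = len(points)
    # Each row holds the coefficients multiplying (A_1, ..., A_n, C).
    rows = [[Fraction(v) for v in p] + [Fraction(1)] for p in points]
    pivots = []
    r = 0
    for col in range(n + 1):
        pivot = next((i for i in range(r, n) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot here: this column stays free
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][col]
        rows[r] = [v / lead for v in rows[r]]
        for i in range(n):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
        if r == n:
            break
    if r < n:
        raise ValueError("degenerate input: the hyperplane is not unique")
    free = next(c for c in range(n + 1) if c not in pivots)
    coeffs = [Fraction(0)] * (n + 1)
    coeffs[free] = Fraction(1)
    for row, col in zip(rows, pivots):
        coeffs[col] = -row[free]
    return coeffs
```

The elimination costs O(n^3) arithmetic operations; with floating-point input a pivoting strategy or an SVD would be the more usual choice.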
Assume we have m data points.
What polynomial degree is needed for curve fitting if we want the adjusted R^2 value to be 1? (Theoretically it will be 1, but in practice it is only nearly 1 because of round-off errors.)
What is the reason for the chosen number?
8 points (2 0 0 3 8 5 3 3 ) example shown below, but you have to answer with m data points. If you use 8 data points your score will be reduced.
A polynomial of degree m-1 will exactly fit (R^2 = 1) m data points with different x values.
A polynomial of degree m-1 has m degrees of freedom, its coefficients a_i:
y(x) = a_1 + a_2 x^1 + a_3 x^2 + ... + a_m x^(m-1)
These m degrees of freedom allow a polynomial of degree m-1 to fit m data points with distinct x values uniquely.
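A small sketch can illustrate the exact fit, assuming the eight values from the question are y values at x = 0, 1, ..., 7 (the question does not state the x values, so that part is a guess). It evaluates the unique degree m-1 polynomial in Lagrange form with exact fractions, so the residual at every data point is exactly zero:

```python
from fractions import Fraction

def lagrange_eval(xs, ys, x):
    """Evaluate at x the unique polynomial of degree len(xs) - 1 through the data.

    Assumes all xs are distinct; uses exact rational arithmetic.
    """
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= (Fraction(x) - xj) / (Fraction(xi) - xj)
        total += term
    return total

ys = [2, 0, 0, 3, 8, 5, 3, 3]   # values from the question
xs = list(range(len(ys)))       # assumed x values 0..7
residuals = [lagrange_eval(xs, ys, x) - y for x, y in zip(xs, ys)]
```

With all residuals equal to zero, the residual sum of squares is zero and R^2 is 1, which is the statement above for m = 8.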
N segments in 3D space are given. Each segment is represented by its 2 endpoints. The problem is to find the point with the minimal possible sum of distances to all segments.
Let the segments be p1 q1, …, pn qn. We formulate an optimization problem:
minimize ∑i ‖x − ((1 − yi) pi + yi qi)‖
subject to
x ∈ 𝐑³
∀i, yi ∈ [0, 1]
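The variable x is the point we’re looking for. The variables yi exploit the fact that we’re minimizing a sum of minima: the distance from x to segment i is the minimum over yi ∈ [0, 1] of the distance from x to the convex combination (1 − yi) pi + yi qi, so minimizing jointly over x and the yi gives the same optimum.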
This is a convex problem, so either cvxpy or scipy.optimize should be able to handle it nicely.
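For readers without a solver at hand, here is a dependency-free sketch. Note that it swaps in a different technique from the cvxpy/scipy.optimize route suggested above: it alternates between projecting x onto each segment (the optimal yi for a fixed x) and a Weiszfeld-style weighted average (a geometric-median update for fixed closest points). The function names, the iteration count and the eps guard are illustrative assumptions.

```python
import math

def closest_point_on_segment(x, p, q):
    """Nearest point to x of the form (1 - y) p + y q with y clamped to [0, 1]."""
    d = [qi - pi for pi, qi in zip(p, q)]
    dd = sum(v * v for v in d)
    if dd == 0:
        return list(p)  # degenerate segment: a single point
    y = sum((xi - pi) * di for xi, pi, di in zip(x, p, d)) / dd
    y = min(1.0, max(0.0, y))
    return [pi + y * di for pi, di in zip(p, d)]

def total_distance(x, segments):
    """Objective: sum of distances from x to every segment."""
    return sum(math.dist(x, closest_point_on_segment(x, p, q))
               for p, q in segments)

def min_sum_point(segments, iterations=500, eps=1e-12):
    """Weiszfeld-style iteration, started at the centroid of all endpoints."""
    endpoints = [e for seg in segments for e in seg]
    x = [sum(c) / len(endpoints) for c in zip(*endpoints)]
    for _ in range(iterations):
        num = [0.0] * len(x)
        den = 0.0
        for p, q in segments:
            c = closest_point_on_segment(x, p, q)
            w = 1.0 / max(math.dist(x, c), eps)  # eps avoids division by zero
            num = [n + w * ci for n, ci in zip(num, c)]
            den += w
        x = [n / den for n in num]
    return x
```

Convergence can be slow when the optimum lies on one of the segments; a general-purpose convex solver is the more robust choice in that case.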
I am working on a genetic algorithm. Here is how it works :
Input : a list of 2D points
Input : the degree of the curve
Output : the equation of the curve that best fits the points (it tries to minimize the sum of vertical distances from the points' y values to the curve)
The algorithm finds good equations for simple straight lines and for 2-degree equations.
But for 4 points and 3 degree equations and more, it gets more complicated. I cannot find the right combination of parameters : sometimes I have to wait 5 minutes and the curve found is still very bad. I tried modifying many parameters, from population size to number of parents selected...
Are there well-known parameter combinations or theorems in GA programming that can help me?
Thank you ! :)
Based on what is given, you need a polynomial interpolation, in which the degree of the polynomial is the number of points minus 1.
n = (Number of points) - 1
Now having said that, let's assume you have 5 points that need to be fitted and I am going to define them in a variable:
var points = [[0,0], [2,3], [4,-1], [5,7], [6,9]]
Note that the array of points is ordered by x value, which you need to do.
Then the equation would be:
f(x) = a1*x^4 + a2*x^3 + a3*x^2 + a4*x + a5
Now, following the definition (https://en.wikipedia.org/wiki/Polynomial_interpolation#Constructing_the_interpolation_polynomial), use the construction described on that page to compute the coefficients.
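One way to carry out that construction, sketched here in Python rather than the JavaScript used above, is to solve the Vandermonde system directly. The function name is hypothetical, exact fractions are an assumption made to avoid round-off, and the x values are assumed to be distinct:

```python
from fractions import Fraction

def interpolation_coefficients(points):
    """Return [a1, ..., am], highest power first, so that f(x) = y at each point.

    Hypothetical helper: Gauss-Jordan elimination on the augmented
    Vandermonde system; assumes all x values are distinct.
    """
    m = len(points)
    # Row i is [1, x_i, x_i^2, ..., x_i^(m-1) | y_i].
    aug = [[Fraction(x) ** k for k in range(m)] + [Fraction(y)]
           for x, y in points]
    for col in range(m):
        pivot = next(i for i in range(col, m) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for i in range(m):
            if i != col and aug[i][col] != 0:
                f = aug[i][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    lowest_first = [row[m] for row in aug]
    return lowest_first[::-1]

points = [[0, 0], [2, 3], [4, -1], [5, 7], [6, 9]]
a = interpolation_coefficients(points)  # a[0] multiplies x^4, a[4] is the constant
```

For more than a handful of points the Vandermonde system becomes ill-conditioned in floating point, which is one reason the Lagrange and Newton forms on the referenced page are often preferred.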
It is not that complicated, for the polynomial interpolation of degree n you get the following equation:
p(x) = c0 + c1 * x + c2 * x^2 + ... + cn * x^n = y
This means we need n + 1 genes for the coefficients c0 to cn.
The fitness function is the sum of the squared vertical distances from the points to the curve; the squared distance for a single point is given below. With this definition a smaller value is better; if you want larger to be better, take the inverse (1 / sum of squared distances):
d_squared(xi, yi) = (yi - p(xi))^2
I think for faster convergence you could limit the mutation, e.g. when mutating, with 20% probability choose a new value between min and max (e.g. -1000 and 1000), and with 80% probability multiply the old value by a random factor between 0.8 and 1.2.
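The fitness and mutation described above could look roughly like this. It is only a sketch: the function names, the choice to mutate a single randomly picked gene, and the use of a random.Random instance are assumptions that the answer does not specify.

```python
import random

def evaluate(coeffs, x):
    """p(x) = c0 + c1*x + ... + cn*x^n, evaluated with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def fitness(coeffs, points):
    """Sum of squared vertical distances; smaller is better."""
    return sum((y - evaluate(coeffs, x)) ** 2 for x, y in points)

def mutate(coeffs, rng, low=-1000.0, high=1000.0):
    """Return a mutated copy: one gene is either reset or gently scaled."""
    child = list(coeffs)
    gene = rng.randrange(len(child))
    if rng.random() < 0.2:
        child[gene] = rng.uniform(low, high)   # 20%: brand-new value
    else:
        child[gene] *= rng.uniform(0.8, 1.2)   # 80%: small relative change
    return child

rng = random.Random(0)
child = mutate([1.0, 2.0, 3.0], rng)
```

Note that purely multiplicative mutation can never move a gene away from exactly 0, so the occasional full reset is what keeps such genes reachable.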
Given n points in the 2-D plane, like (0,0), (1,1), ..., we can select any three of them to form an angle. For example, if we choose A(0, 0), B(1, 1), C(1, 0), then we get angle ABC = 45 degrees, ACB = 90 degrees and CAB = 45 degrees.
My question is how to calculate the maximum or minimum angle determined by three points selected from the n points.
Obviously, we can use a brute-force algorithm: calculate all angles and find the maximal and minimal values, using the Law of Cosines to calculate angles and the Pythagorean theorem to calculate distances. But does an efficient algorithm exist?
If I'm correct, brute-force runs in O(n^3): you basically take every triplet, compute the 3 angles, and store the overall max.
You can improve slightly to O(n^2 * log(n)) but it's trickier:
best_angle = 0
for each point p1:
    for each point p2:
        compute vector (p1, p2), and the signed angle it makes with the X-axis
        store it in an array A
    sort array A # O(n * log(n))
    # Traverse A to find the best choice:
    for each alpha in A:
        look for the element beta in A closest to alpha+Pi
        # Takes O(log n) because it's sorted. Don't forget to take into account the fact that A represents a circle: both ends touch...
        best_angle = max(best_angle, abs(beta - alpha))
The complexity is O(n * (n + nlog(n) + n * log(n))) = O(n^2 * log(n))
Of course you can also retrieve the p1, p2 that gave the best angle during the loops.
There is probably a better approach still; this feels like doing too much work overall, even if you re-use your previous computations for p1 when processing p2, ..., pn ...
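The pseudocode above can be turned into a short Python sketch. Assumptions: the points are distinct, the angle is measured at the vertex p1 and reported in radians within [0, Pi], and the wrap-around of the sorted array is handled with modular indexing; the function name is illustrative.

```python
import math
from bisect import bisect_left

def max_angle(points):
    """Largest angle (radians) at any vertex formed by three of the points."""
    best = 0.0
    for i, (px, py) in enumerate(points):
        # Directions from the vertex to every other point, sorted.
        angles = sorted(
            math.atan2(qy - py, qx - px)
            for j, (qx, qy) in enumerate(points) if j != i
        )
        m = len(angles)
        if m < 2:
            continue
        for alpha in angles:
            # Look for the direction closest to the opposite of alpha.
            target = alpha + math.pi
            if target > math.pi:
                target -= 2 * math.pi
            k = bisect_left(angles, target)
            # The circular neighbours of the insertion position.
            for beta in (angles[k % m], angles[(k - 1) % m]):
                diff = abs(beta - alpha)
                best = max(best, min(diff, 2 * math.pi - diff))
    return best
```

Each vertex costs one sort plus n binary searches, which gives the O(n^2 * log(n)) total stated above.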
Assume there are three groups of high-dimensional vectors:
{a_1, a_2, ..., a_N},
{b_1, b_2, ... , b_N},
{c_1, c_2, ..., c_N}.
Each of my vectors can be represented as x = a_i + b_j + c_k, where 1 <= i, j, k <= N. The vector is then encoded as (i, j, k), which can be decoded as x = a_i + b_j + c_k.
My question is: given two vectors x = (i_1, j_1, k_1) and y = (i_2, j_2, k_2), is there a method to compute the Euclidean distance between them without decoding x and y?
Square root of the sum of squares of the differences between components. There's no other way to do it.
You should scale the values to guard against overflow/underflow issues. Search for the max difference and divide all the components by it before squaring, summing, and taking the square root.
Let's assume you have only two groups. You are trying to compute the scalar product
(a_i1 + b_j1, a_i2 + b_j2)
= (a_i1,a_i2) + (b_j1,b_j2) + (a_i1,b_j2) + (a_i2,b_j1) # <- elementary scalar products
So if you know the necessary elementary scalar products between the elements of your vectors a_i, b_j, c_k, then, you do not need to "decode" x and y and can compute the scalar product directly.
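A sketch of this idea for the three-group case follows. Assumptions: codes are 0-based here (the question uses 1-based indices), the Gram matrix of all 3N basis vectors is precomputed once, and the function names are made up for illustration. The squared distance is the squared norm of a_i1 + b_j1 + c_k1 - a_i2 - b_j2 - c_k2, which expands into 36 signed Gram entries:

```python
import math
from itertools import product

def gram_matrix(vectors):
    """All pairwise scalar products of the basis vectors (computed once)."""
    return [[sum(u * v for u, v in zip(a, b)) for b in vectors]
            for a in vectors]

def encoded_distance(gram, n, x, y):
    """Euclidean distance between two codes, using only Gram lookups.

    gram is built from the concatenated list a + b + c, so a_i sits at
    index i, b_j at n + j and c_k at 2 * n + k.
    """
    (i1, j1, k1), (i2, j2, k2) = x, y
    # The difference vector as a signed sum of six basis vectors.
    terms = [(i1, 1), (n + j1, 1), (2 * n + k1, 1),
             (i2, -1), (n + j2, -1), (2 * n + k2, -1)]
    squared = sum(s * t * gram[u][v]
                  for (u, s), (v, t) in product(terms, repeat=2))
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative round-off
```

Each query costs a constant number of lookups regardless of the vector dimension, at the price of storing a 3N x 3N table.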
Note that this is exactly what happens when you compute an ordinary Euclidean distance in a non-orthogonal basis.
If you are happy with an approximate result, you could project your high-dimensional basis vectors into a low-dimensional space using a random projection. The Johnson-Lindenstrauss lemma says that you can reduce the dimension to O(log N / ε^2) so that, with high probability, all pairwise distances are preserved up to a factor of 1 ± ε.
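A minimal sketch of such a projection, assuming a dense Gaussian matrix scaled by 1/sqrt(k) (one common construction; the target dimension k and the function names are illustrative). Because the projection is linear, it is enough to project each a_i, b_j and c_k once: the projection of a decoded vector is the sum of the projected basis vectors.

```python
import math
import random

def random_projection_matrix(k, d, rng):
    """k x d matrix with i.i.d. N(0, 1/k) entries."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)]
            for _ in range(k)]

def project(matrix, v):
    """Map a d-dimensional vector to k dimensions."""
    return [sum(r * x for r, x in zip(row, v)) for row in matrix]

rng = random.Random(42)
R = random_projection_matrix(8, 100, rng)
u = [rng.uniform(-1.0, 1.0) for _ in range(100)]
v = [rng.uniform(-1.0, 1.0) for _ in range(100)]
# Approximate distance, computed entirely in the 8-dimensional space.
approx = math.dist(project(R, u), project(R, v))
```

How small k can be for a given error tolerance depends on the constants in the lemma, so in practice it is usually chosen empirically.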