This must be surely well known, being a particular linear programming problem. What I want is a specific easy to implement efficient algorithm adapted to this very case, for relatively small sizes (about, say, ten vectors of dimension less than twenty).
I have vectors v(1), ..., v(m) of the same dimension. Want an
algorithm that produces strictly positive numbers c(1), ..., c(m)
such that c(1)v(1) + ... + c(m)v(m) is the zero vector, or tells for
sure that no such numbers exist.
What I found (in some clever code by a colleague) gives an approximate algorithm like this:
start with, say, c(1) = ... = c(m) = 1/m;
at each stage, given current approximation v = c(1)v(1) + ... + c(m)v(m), seek for j such that v - v(j) is longer than v(j).
If no such j exists then output "no solution" (or c(1), ..., c(m) if v is zero).
If such j exists, change v to the new approximation (1 - c)v + cv(j) with some small positive c.
This changes c(j) to (1 - c)c(j) + c and each other c(i) to (1 - c)c(i), so that the new coefficients will remain positive and strictly less than 1 (in fact they will sum to 1 all the time, i. e. we will remain in the convex hull of the v(i)).
Moreover the new v will have strictly smaller length, so eventually the algorithm will either discover that there is no solution or will produce arbitrarily small v.
Clearly this is incomplete and not satisfactory from several points of view. Can one do better?
Update
There are by now two useful answers; however one final step is missing.
They both boil down to the following (unless I miss some essential point).
Take a basis of the nullspace of v(1), ..., v(m).
One obtains a collection of not necessarily strictly positive solutions c(1), ..., c(m), c'(1), ..., c'(m), c''(1), ..., c''(m), ... such that any such solution is their linear combination (in a unique way). So we are reduced to the question whether this new collection of m-dimensional vectors admits a linear combination with strictly positive entries.
Example: take four 2d-vectors (2,1), (3,-1), (-1,2), (-3,-3). Their nullspace has a basis consisting of two solutions c = (12,-3,0,5), c' = (-1,1,1,0). None of these are strictly positive but their combination c + 4c' = (8,1,4,5) is. So the latter is the desired solution. But in general it might be not so easy to find out whether a strictly positive solution exists and if yes, how to find it.
As suggested in the answer by btilly one might use Fourier-Motzkin elimination for that, but again, I would be grateful for more details about it.
This is doable as follows.
First write your vectors as columns. Put them into a matrix. Now create a single column with entries c(1), c(2), ..., c(m_)). If you multiply that matrix times that column, you get your linear combination.
Now consider the elementary row operations. Multiply a row by a constant, swap two rows, add a multiple of one row to another. If you do an elementary row operation to the matrix, your linear combination after the row operation will be 0 if and only if it was before the row operation. Therefore doing elementary row operations DOESN'T CHANGE the coefficients that you're looking for.
Therefore you may simplify life by doing elementary row operations to put the matrix into reduced row echelon form. Once it is in reduced row echelon form, life gets easier. Columns which do not contain a pivot correspond to free coefficients. Columns which do contain a pivot correspond to coefficients that must be a specific linear combination of free coefficients. This reduces your problem being to find positive values for the free coefficients that make the others also positive. So you're now just solving a system of inequalities (and generally in far fewer variables).
Whether a system of linear inequalities has a solution can be answered with the FME method.
Denoting by A the matrix where the ith row is v(i) and by x the vector whose ith index is c(i), your problem can be describes as Ax = b where b=0 is the zero vector. The problem of Ax=b when b is not equal to zero is called the least squares problem (or the inhomogeneous least squares) and has a close form solution in the sense of Minimal Mean Square Error (MMSE). In your case however, b = 0 therefore we are in the homogeneous least squares problem. In Linear Algebra this can be looked as an eigenvalue problem, whose solution is the eigenvector x of the matrix A^TA whose eigenvalue is equal to 0. If no such eigenvalue exists, the MMSE solution will the the eigenvalue x whose matching eigenvalue is the smallest (closest to 0). A nice discussion on this topic is given here.
The solution is, as stated above, will be the eigenvector of A^TA with the lowest matching eigenvalue. This can be done using Singular Value Decomposition (SVD), which will decompose the matrix A into
The column of V matching with the lowest eigenvalue in the diagonal matrix Sigma will be your solution.
Explanation
When we want to minimize the Ax = 0 in the MSE sense, we can compute the vector derivative w.r.t x as follows:
Therefore, the eigenvector of A^TA matching the smallest eigenvalue will solve your problem.
Practical solution example
In python, you can use numpy.linalg.svd to perform the SVD decomposition. numpy orders the matrices U and V^T such that the leftmost column matches the largest eigenvalue and the rightmost column matches the lowest eigenvalue. Thus, you need to compute the SVD and take the rightmost column of the resulting V:
from numpy.linalg import svd
[_, _, vt] = svd(A)
x = vt[-1] # we take the last row since this is a transposed matrix, so the last column of V is the last row of V^T
One zero eigenvalue
In this case there is only one non trivial vector who solves the problem and the only way to satisfy the strictly positive condition will be if the values in the vector are all positive or all negative (multiplying the vector by -1 will not change the result)
Multiple zero eigenvalues
In the case where we have multiple zero eigenvalues, any of their matching eigenvectors is a possible solution and any linear combination of them. In this case one would have to check if there is a linear combination of these eigenvectors which creates a vector where all the values are strictly positive in order to satisfy the strictly positive condition.
How do we find the solution if one exists? once we are left with the basis of eigenvectors matching zero eigenvalue (also known as null-space) what we need to do is to solve a system of linear inequalities. I'll explain by example, since it will be clearer this way. Suppose we have the following matrix:
import numpy as np
A = np.array([[ 2, 3, -1, -3],
[ 1, -1, 2, -3]])
[_, Sigma, Vt] = np.linalg.svd(A) # Sigma has only 2 non-zero values, meaning that the null-space have a dimension of 2
We can extract the eigenvectors as explained above:
C = Vt[len(Sigma):]
# array([[-0.10292809, 0.59058542, 0.75313786, 0.27092073],
# [ 0.89356997, -0.15289589, 0.09399548, 0.4114856 ]])
What we want to find are two real coefficients, noted as x and y such that:
-0.10292809*x + 0.89356997*y > 0
0.59058542*x - 0.15289589*y > 0
0.75313786*x + 0.09399548*y > 0
0.27092073*x + 0.4114856*y > 0
We have a system of 4 inequalities with 2 variables, therefore in this case a solution is not promised. A solution can be found in many ways but I will propose the following. We can start with an initial guess and go over each hyperplane to check if the initial guess satisfies the inequality. if not we can reflect the guess to the other side of the hyperplane. After passing all the hyperplanes we check for a solution. (explanation of hot to reflect a point w.r.t a line can be found here). An example for python implementation will be:
import numpy as np
def get_strictly_positive(A):
[_, Sigma, Vt] = np.linalg.svd(A)
if len(Sigma[np.abs(Sigma) > 1e-5]) == Vt.shape[0]: # No zero eigenvalues, taking MMSE solution if exists
c = Vt[-1]
if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
return c if np.sum(c) == np.sum(abs(c)) else -1 * c
else:
return -1
# This means we have a zero solution
# Building matrix C of all the null-space basis vectors
C = Vt[len(Sigma[np.abs(Sigma) > 1e-5]):]
# 1. What we have here is a set of linear system of inequalities. Each equation inequality is a hyperplane and for
# each equation there is a valid half-space. We want to find the intersection of all the half-spaces, if it exists.
# 2. A vey important observations is that the basis of the null-space that we found using SVD is ORTHOGONAL!
coeffs = np.ones(C.shape[0]) # initial guess
for hyperplane in C.T:
if coeffs.dot(hyperplane) <= 0: # the guess is on the wrong side of the hyperplane
orthogonal_part = coeffs - (coeffs.dot(hyperplane) / hyperplane.dot(hyperplane)) * hyperplane
# reflecting the coefficients to the other side of the hyperplane
coeffs = 2 * orthogonal_part - coeffs
# If this yielded a solution, we return it
c = C.T.dot(coeffs)
if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
return c if np.sum(c) == np.sum(abs(c)) else -1 * c
else:
return -1
The equations are taken from one of my summaries and therefore I do not have a link to the source
I am trying to use Cholesky decomposition to generate a multivariate matrix with this: Y = U + X*L
U is the mean vector: n x m
L from cholesky: m x m
X is a matrix with univariate normal vectors: n x m
After calculating the mean of the simulated matrix, I realized it was off. The reason is that the mean vector is very close to zero, so when adding it to L*X, L*X dominated the U. Anyone know how to work around this issue?
I am working on a genetic algorithm. Here is how it works :
Input : a list of 2D points
Input : the degree of the curve
Output : the equation of the curve that passes through points the best way (try to minimize the sum of vertical distances from point's Ys to the curve)
The algorithm finds good equations for simple straight lines and for 2-degree equations.
But for 4 points and 3 degree equations and more, it gets more complicated. I cannot find the right combination of parameters : sometimes I have to wait 5 minutes and the curve found is still very bad. I tried modifying many parameters, from population size to number of parents selected...
Do famous combinations/theorems in GA programming can help me ?
Thank you ! :)
Based on what is given, you would need a polynomial interpolation in which, the degree of the equation is number of points minus 1.
n = (Number of points) - 1
Now having said that, let's assume you have 5 points that need to be fitted and I am going to define them in a variable:
var points = [[0,0], [2,3], [4,-1], [5,7], [6,9]]
Please be noted the array of the points have been ordered by the x values which you need to do.
Then the equation would be:
f(x) = a1*x^4 + a2*x^3 + a3*x^2 + a4*x + a5
Now based on definition (https://en.wikipedia.org/wiki/Polynomial_interpolation#Constructing_the_interpolation_polynomial), the coefficients are computed like this:
Now you need to used the referenced page to come up with the coefficient.
It is not that complicated, for the polynomial interpolation of degree n you get the following equation:
p(x) = c0 + c1 * x + c2 * x^2 + ... + cn * x^n = y
This means we need n + 1 genes for the coefficients c0 to cn.
The fitness function is the sum of all squared distances from the points to the curve, below is the formula for the squared distance. Like this a smaller value is obviously better, if you don't want that you can take the inverse (1 / sum of squared distances):
d_squared(xi, yi) = (yi - p(xi))^2
I think for faster conversion you could limit the mutation, e.g. when mutating choose a new value with 20% probability between min and max (e.g. -1000 and 1000) and with 80% probabilty a random factor between 0.8 and 1.2 with which you multiply the old value.
I am working on a little puzzle-game-project. The basic idea is built around projecting multi-dimensonal data down to 2D. My only problem is how to generate the randomized scenario data. Here is the problem:
I got muliple randomized vectors v_i and a target vector t, all 2D. Now I want to randomize scalar values c_i that:
t = sum c_i v_i
Because there are more than two v_i this is a overdetermined system. I also took care that the linear combination of v_i is actual able to reach t.
How can I create (randomized) values for my c_i?
Edit: After finding this Question I can additionally state, that it is possible for me also (slightly) change the v_i.
All values are based on double
Let's say your v_i form a matrix V with 2 rows and n columns, each vector is a column. The coefficients c_i form a column vector c. Then the equation can be written in matrix form as
V×c = t
Now apply a Singular Value Decomposition to matrix V:
V = A×D×B
with A being an orthogonal 2×2 matrix, D is a 2×n matrix and B an orthogonal n×n matrix. The original equation now becomes
A×D×B×c = t
multiply this equation with the inverse of A, the inverse is the same as the transposed matrix AT:
D×B×c = AT×t
Let's introduce new symbols c'=B×c and t'=AT×t:
D×c' = t'
The solution of this equation is simple, because Matrix D looks like this:
u 0 0 0 ... // n columns
0 v 0 0 ...
The solution is
c1' = t1' / u
c2' = t2' / v
And because all the other columns of D are zero, the remaining components c3'...cn' can be chosen freely. This is the place where you can create random numbers for c3'...cn. Having vector c' you can calculate c as
c = BT×c'
with BT being the inverse/transposed of B.
Since the v_i are linearly dependent there are non trivial solutions to 0 = sum l_i v_i.
If you have n vectors you can find n-2 independent such solutions.
If you have now one solution to t = sum c_i v_i you can add any multiple of l_i to c_i and you will still have a solution: c_i' = p l_i + c_i.
For each independent solution of the homogenous problem determine a random p_j and calculate
c_i'' = c_i + sum p_j l_i_j.
and thank you for the attention you're paying to my question :)
My question is about finding an (efficient enough) algorithm for finding orthogonal polynomials of a given weight function f.
I've tried to simply apply the Gram-Schmidt algorithm but this one is not efficient enough. Indeed, it requires O(n^2) integrals. But my goal is to use this algorithm in order to find Hankel determinants of a function f. So a "direct" computation wich consists in simply compute the matrix and take its determinants requires only 2*n - 1 integrals.
But I want to use the theorem stating that the Hankel determinant of order n of f is a product of the n first leading coefficients of the orthogonal polynomials of f. The reason is that when n gets larger (say about 20), Hankel determinant gets really big and my goal is to divided it by an other big constant (for n = 20, the constant is of order 10^103). My idea is then to "dilute" the computation of the constant in the product of the leading coefficients.
I hope there is a O(n) algorithm to compute the n first orthogonal polynomials :) I've done some digging and found nothing in that direction for general function f (f can be any smooth function, actually).
EDIT: I'll precise here what the objects I'm talking about are.
1) A Hankel determinant of order n is the determinant of a square matrix which is constant on the skew diagonals. Thus for example
a b c
b c d
c d e
is a Hankel matrix of size 3 by 3.
2) If you have a function f : R -> R, you can associate to f its "kth moment" which is defined as (I'll write it in tex) f_k := \int_{\mathbb{R}} f(x) x^k dx
With this, you can create a Hankel matrix A_n(f) whose entries are (A_n(f)){ij} = f{i+j-2}, that is something of the like
f_0 f_1 f_2
f_1 f_2 f_3
f_2 f_3 f_4
With this in mind, it is easy to define the Hankel determinant of f which is simply
H_n(f) := det(A_n(f)). (Of course, it is understood that f has sufficient decay at infinity, this means that all the moments are well defined. A typical choice for f could be the gaussian f(x) = exp(-x^2), or any continuous function on a compact set of R...)
3) What I call orthogonal polynomials of f is a set of polynomials (p_n) such that
\int_{\mathbb{R}} f(x) p_j(x) p_k(x) is 1 if j = k and 0 otherwize.
(They are called like that since they form an orthonormal basis of the vector space of polynomials with respect to the scalar product
(p|q) = \int_{\mathbb{R}} f(x) p(x) q(x) dx
4) Now, it is basic linear algebra that from any basis of a vector space equipped with a scalar product, you can built a orthonormal basis thanks to the Gram-Schmidt algorithm. This is where the n^2 integrations comes from. You start from the basis 1, x, x^2, ..., x^n. Then you need n(n-1) integrals for the family to be orthogonal, and you need n more in order to normalize them.
5) There is a theorem saying that if f : R -> R is a function having sufficient decay at infinity, then we have that its Hankel determinant H_n(f) is equal to
H_n(f) = \prod_{j = 0}^{n-1} \kappa_j^{-2}
where \kappa_j is the leading coefficient of the j+1th orthogonal polynomial of f.
Thank you for your answer!
(PS: I tagged octave because I work in octave so, with a bit of luck (but I doubt it), there is a built-in function or a package already done managing this kind of think)
Orthogonal polynomials obey a recurrence relation, which we can write as
P[n+1] = (X-a[n])*P[n] - b[n-1]*P[n-1]
P[0] = 1
P[1] = X-a[0]
and we can compute the a, b coefficients by
a[n] = <X*P[n]|P[n]> / c[n]
b[n-1] = c[n-1]/c[n]
where
c[n] = <P[n]|P[n]>
(Here < | > is your inner product).
However I cannot vouch for the stability of this process at large n.