Related
This must be surely well known, being a particular linear programming problem. What I want is a specific easy to implement efficient algorithm adapted to this very case, for relatively small sizes (about, say, ten vectors of dimension less than twenty).
I have vectors v(1), ..., v(m) of the same dimension. Want an
algorithm that produces strictly positive numbers c(1), ..., c(m)
such that c(1)v(1) + ... + c(m)v(m) is the zero vector, or tells for
sure that no such numbers exist.
What I found (in some clever code by a colleague) gives an approximate algorithm like this:
start with, say, c(1) = ... = c(m) = 1/m;
at each stage, given current approximation v = c(1)v(1) + ... + c(m)v(m), seek for j such that v - v(j) is longer than v(j).
If no such j exists then output "no solution" (or c(1), ..., c(m) if v is zero).
If such j exists, change v to the new approximation (1 - c)v + cv(j) with some small positive c.
This changes c(j) to (1 - c)c(j) + c and each other c(i) to (1 - c)c(i), so that the new coefficients will remain positive and strictly less than 1 (in fact they will sum to 1 all the time, i. e. we will remain in the convex hull of the v(i)).
Moreover the new v will have strictly smaller length, so eventually the algorithm will either discover that there is no solution or will produce arbitrarily small v.
Clearly this is incomplete and not satisfactory from several points of view. Can one do better?
Update
There are by now two useful answers; however one final step is missing.
They both boil down to the following (unless I miss some essential point).
Take a basis of the nullspace of v(1), ..., v(m).
One obtains a collection of not necessarily strictly positive solutions c(1), ..., c(m), c'(1), ..., c'(m), c''(1), ..., c''(m), ... such that any such solution is their linear combination (in a unique way). So we are reduced to the question whether this new collection of m-dimensional vectors admits a linear combination with strictly positive entries.
Example: take four 2d-vectors (2,1), (3,-1), (-1,2), (-3,-3). Their nullspace has a basis consisting of two solutions c = (12,-3,0,5), c' = (-1,1,1,0). None of these are strictly positive but their combination c + 4c' = (8,1,4,5) is. So the latter is the desired solution. But in general it might be not so easy to find out whether a strictly positive solution exists and if yes, how to find it.
As suggested in the answer by btilly one might use Fourier-Motzkin elimination for that, but again, I would be grateful for more details about it.
This is doable as follows.
First write your vectors as columns. Put them into a matrix. Now create a single column with entries c(1), c(2), ..., c(m_)). If you multiply that matrix times that column, you get your linear combination.
Now consider the elementary row operations. Multiply a row by a constant, swap two rows, add a multiple of one row to another. If you do an elementary row operation to the matrix, your linear combination after the row operation will be 0 if and only if it was before the row operation. Therefore doing elementary row operations DOESN'T CHANGE the coefficients that you're looking for.
Therefore you may simplify life by doing elementary row operations to put the matrix into reduced row echelon form. Once it is in reduced row echelon form, life gets easier. Columns which do not contain a pivot correspond to free coefficients. Columns which do contain a pivot correspond to coefficients that must be a specific linear combination of free coefficients. This reduces your problem being to find positive values for the free coefficients that make the others also positive. So you're now just solving a system of inequalities (and generally in far fewer variables).
Whether a system of linear inequalities has a solution can be answered with the FME method.
Denoting by A the matrix where the ith row is v(i) and by x the vector whose ith index is c(i), your problem can be describes as Ax = b where b=0 is the zero vector. The problem of Ax=b when b is not equal to zero is called the least squares problem (or the inhomogeneous least squares) and has a close form solution in the sense of Minimal Mean Square Error (MMSE). In your case however, b = 0 therefore we are in the homogeneous least squares problem. In Linear Algebra this can be looked as an eigenvalue problem, whose solution is the eigenvector x of the matrix A^TA whose eigenvalue is equal to 0. If no such eigenvalue exists, the MMSE solution will the the eigenvalue x whose matching eigenvalue is the smallest (closest to 0). A nice discussion on this topic is given here.
The solution is, as stated above, will be the eigenvector of A^TA with the lowest matching eigenvalue. This can be done using Singular Value Decomposition (SVD), which will decompose the matrix A into
The column of V matching with the lowest eigenvalue in the diagonal matrix Sigma will be your solution.
Explanation
When we want to minimize the Ax = 0 in the MSE sense, we can compute the vector derivative w.r.t x as follows:
Therefore, the eigenvector of A^TA matching the smallest eigenvalue will solve your problem.
Practical solution example
In python, you can use numpy.linalg.svd to perform the SVD decomposition. numpy orders the matrices U and V^T such that the leftmost column matches the largest eigenvalue and the rightmost column matches the lowest eigenvalue. Thus, you need to compute the SVD and take the rightmost column of the resulting V:
from numpy.linalg import svd
[_, _, vt] = svd(A)
x = vt[-1] # we take the last row since this is a transposed matrix, so the last column of V is the last row of V^T
One zero eigenvalue
In this case there is only one non trivial vector who solves the problem and the only way to satisfy the strictly positive condition will be if the values in the vector are all positive or all negative (multiplying the vector by -1 will not change the result)
Multiple zero eigenvalues
In the case where we have multiple zero eigenvalues, any of their matching eigenvectors is a possible solution and any linear combination of them. In this case one would have to check if there is a linear combination of these eigenvectors which creates a vector where all the values are strictly positive in order to satisfy the strictly positive condition.
How do we find the solution if one exists? once we are left with the basis of eigenvectors matching zero eigenvalue (also known as null-space) what we need to do is to solve a system of linear inequalities. I'll explain by example, since it will be clearer this way. Suppose we have the following matrix:
import numpy as np
A = np.array([[ 2, 3, -1, -3],
[ 1, -1, 2, -3]])
[_, Sigma, Vt] = np.linalg.svd(A) # Sigma has only 2 non-zero values, meaning that the null-space have a dimension of 2
We can extract the eigenvectors as explained above:
C = Vt[len(Sigma):]
# array([[-0.10292809, 0.59058542, 0.75313786, 0.27092073],
# [ 0.89356997, -0.15289589, 0.09399548, 0.4114856 ]])
What we want to find are two real coefficients, noted as x and y such that:
-0.10292809*x + 0.89356997*y > 0
0.59058542*x - 0.15289589*y > 0
0.75313786*x + 0.09399548*y > 0
0.27092073*x + 0.4114856*y > 0
We have a system of 4 inequalities with 2 variables, therefore in this case a solution is not promised. A solution can be found in many ways but I will propose the following. We can start with an initial guess and go over each hyperplane to check if the initial guess satisfies the inequality. if not we can reflect the guess to the other side of the hyperplane. After passing all the hyperplanes we check for a solution. (explanation of hot to reflect a point w.r.t a line can be found here). An example for python implementation will be:
import numpy as np
def get_strictly_positive(A):
[_, Sigma, Vt] = np.linalg.svd(A)
if len(Sigma[np.abs(Sigma) > 1e-5]) == Vt.shape[0]: # No zero eigenvalues, taking MMSE solution if exists
c = Vt[-1]
if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
return c if np.sum(c) == np.sum(abs(c)) else -1 * c
else:
return -1
# This means we have a zero solution
# Building matrix C of all the null-space basis vectors
C = Vt[len(Sigma[np.abs(Sigma) > 1e-5]):]
# 1. What we have here is a set of linear system of inequalities. Each equation inequality is a hyperplane and for
# each equation there is a valid half-space. We want to find the intersection of all the half-spaces, if it exists.
# 2. A vey important observations is that the basis of the null-space that we found using SVD is ORTHOGONAL!
coeffs = np.ones(C.shape[0]) # initial guess
for hyperplane in C.T:
if coeffs.dot(hyperplane) <= 0: # the guess is on the wrong side of the hyperplane
orthogonal_part = coeffs - (coeffs.dot(hyperplane) / hyperplane.dot(hyperplane)) * hyperplane
# reflecting the coefficients to the other side of the hyperplane
coeffs = 2 * orthogonal_part - coeffs
# If this yielded a solution, we return it
c = C.T.dot(coeffs)
if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
return c if np.sum(c) == np.sum(abs(c)) else -1 * c
else:
return -1
The equations are taken from one of my summaries and therefore I do not have a link to the source
This is the sample code I've found for calculating the determinant of an (nxn) matrix and it works fine and all but I'm having trouble understanding what's happening in the conversion to triangular form section. Can someone explain what's happening in the "Conversion to upper triangular part"?
I don't have trouble calculating the determinant on my own or doing any upper triangular form conversions but I just don't get how this all translates here in this program.
ii) What is happening with integers (i,j,k,l)? Specifically, what is k and l doing? What is happening inside the IF construct? For a matrix A, I know that something like A(i,j) indicates its position in the matrix and that's all I've ever needed for any matrix programs I've worked with in the past.
========================================================================
!Function to find the determinant of a square matrix
!Description: The subroutine is based on two key points:
!1] A determinant is unaltered when row operations are performed: Hence,
using this principle,
!row operations (column operations would work as well) are used
!to convert the matrix into upper traingular form
!2]The determinant of a triangular matrix is obtained by finding the
product of the diagonal elements
REAL FUNCTION FindDet(matrix, n)
IMPLICIT NONE
REAL, DIMENSION(n,n) :: matrix
INTEGER, INTENT(IN) :: n
REAL :: m, temp
INTEGER :: i, j, k, l
LOGICAL :: DetExists = .TRUE.
l = 1
!Convert to upper triangular form
DO k = 1, n-1
IF (matrix(k,k) == 0) THEN
DetExists = .FALSE.
DO i = k+1, n
IF (matrix(i,k) /= 0) THEN
DO j = 1, n
temp = matrix(i,j)
matrix(i,j)= matrix(k,j)
matrix(k,j) = temp
END DO
DetExists = .TRUE.
l=-l
EXIT
ENDIF
END DO
IF (DetExists .EQV. .FALSE.) THEN
FindDet = 0
return
END IF
ENDIF
DO j = k+1, n
m = matrix(j,k)/matrix(k,k)
DO i = k+1, n
matrix(j,i) = matrix(j,i) - m*matrix(k,i)
END DO
END DO
END DO
!Calculate determinant by finding product of diagonal elements
FindDet = l
DO i = 1, n
FindDet = FindDet * matrix(i,i)
END DO
END FUNCTION FindDet
In this implementation, you can interpret the indices i, j, k, l as the following:
i and j: will be used interchangeably (there is no consistence in the code in this regard) to represent the row and column coordinates of an element of the matrix.
k: will iterate over the "dimension" of the matrix, without meaning a coordinated position. Or, in a different abstraction, iterates over the diagonal.
l: will be +1 or -1, alternating its value anytime the algorithm performs a line-switch. It takes into account that switching any two rows of a matrix reverses the sign of its determinant.
So, the interpretation of the code is:
At each iteration over the dimension of the matrix:
First, check if this diagonal element is zero. If it is zero:
ALARM: maybe the matrix is degenerate.
Let's find it out. Iterate downwards over the rest of this row, trying to find a non-zero element.
If you find a row with a non-zero element in this column, if was a false alarm. Perform row-switch and invert the sign of the determinant. Go on.
If there were all zeroes, then the matrix is degenerate. There is nothing else to do, determinant is zero.
Going on. For each row below this diagonal:
Perform sums and subtractions on the rest of the rows, constructing the diagonal matrix.
Finally, calculate the determinant by multiplying all the elements on the diagonal, taking the sign changes into acount.
How can i write a pseudocode to calculate elements of matrix ?
The following algorithm finds the sum S of two n-by-n matrices A[0..n-1, 0..n-1] and B[0..n-1, 0..n-1].
To add two matrices, we add their corresponding elements. The resulting matrix S[0..n-1, 0..n-1] is an
n-by-n matrix with elements computed by the formula:
S[i, j] = A[i, j] + B[i, j]
Write the pseudocode for calculating the elements of matrix S.
ALGORITHM addMatrices(A[0..n-1,0..n-1], B[0..n-1, 0..n-1])
// Input: Two n-by-n matrices A and B
// Output: Matrix S = A + B
(i) What is the algorithm’s basic operation?
(ii) How many times is the basic operation executed?
(iii)What is the class O(…) the algorithm belongs to?
Algorithms basic operation is to access each element using two loops and add each element of 'A' matrix to each element of 'B' matrix designated by 'i' and 'j', and construct another matrix 'S' using it.
'S' denotes the sum matrix of 'A' and 'B' matrices.
The basic operation is executed n2 times, where 'n' is the order of the matrices.
For each i ∈ {0 , ... , n1-1} and each j ∈ {0 , ... , n2-1} where 'n1' and 'n2' are matrices order.
From 2nd part we denote the Big-O Notation complexity to be n2 or n1·n2 depending upon the order of the matrices.
I have a large matrix, n! x n!, for which I need to take the determinant. For each permutation of n, I associate
a vector of length 2n (this is easy computationally)
a polynomial of in 2n variables (a product of linear factors computed recursively on n)
The matrix is the evaluation matrix for the polynomials at the vectors (thought of as points). So the sigma,tau entry of the matrix (indexed by permutations) is the polynomial for sigma evaluated at the vector for tau.
Example: For n=3, if the ith polynomial is (x1 - 4)(x3 - 5)(x4 - 4)(x6 - 1) and the jth point is (2,2,1,3,5,2), then the (i,j)th entry of the matrix will be (2 - 4)(1 - 5)(3 - 4)(2 - 1) = -8. Here n=3, so the points are in R^(3!) = R^6 and the polynomials have 3!=6 variables.
My goal is to determine whether or not the matrix is nonsingular.
My approach right now is this:
the function point takes a permutation and outputs a vector
the function poly takes a permutation and outputs a polynomial
the function nextPerm gives the next permutation in lexicographic order
The abridged pseudocode version of my code is this:
B := [];
P := [];
w := [1,2,...,n];
while w <> NULL do
B := B append poly(w);
P := P append point(w);
w := nextPerm(w);
od;
// BUILD A MATRIX IN MAPLE
M := Matrix(n!, (i,j) -> eval(B[i],P[j]));
// COMPUTE DETERMINANT IN MAPLE
det := LinearAlgebra[Determinant]( M );
// TELL ME IF IT'S NONSINGULAR
if det = 0 then return false;
else return true; fi;
I'm working in Maple using the built in function LinearAlgebra[Determinant], but everything else is a custom built function that uses low level Maple functions (e.g. seq, convert and cat).
My problem is that this takes too long, meaning I can go up to n=7 with patience, but getting n=8 takes days. Ideally, I want to be able to get to n=10.
Does anyone have an idea for how I could improve the time? I'm open to working in a different language, e.g. Matlab or C, but would prefer to find a way to speed this up within Maple.
I realize this might be hard to answer without all the gory details, but the code for each function, e.g. point and poly, is already optimized, so the real question here is if there is a faster way to take a determinant by building the matrix on the fly, or something like that.
UPDATE: Here are two ideas that I've toyed with that don't work:
I can store the polynomials (since they take a while to compute, I don't want to redo that if I can help it) into a vector of length n!, and compute the points on the fly, and plug these values into the permutation formula for the determinant:
The problem here is that this is O(N!) in the size of the matrix, so for my case this will be O((n!)!). When n=10, (n!)! = 3,628,800! which is way to big to even consider doing.
Compute the determinant using the LU decomposition. Luckily, the main diagonal of my matrix is nonzero, so this is feasible. Since this is O(N^3) in the size of the matrix, that becomes O((n!)^3) which is much closer to doable. The problem, though, is that it requires me to store the whole matrix, which puts serious strain on memory, nevermind the run time. So this doesn't work either, at least not without a bit more cleverness. Any ideas?
It isn't clear to me if your problem is space or time. Obviously the two trade back and forth. If you only wish to know if the determinant is positive or not, then you definitely should go with LU decomposition. The reason is that if A = LU with L lower triangular and U upper triangular, then
det(A) = det(L) det(U) = l_11 * ... * l_nn * u_11 * ... * u_nn
so you only need to determine if any of the main diagonal entries of L or U is 0.
To simplify further, use Doolittle's algorithm, where l_ii = 1. If at any point the algorithm breaks down, the matrix is singular so you can stop. Here's the gist:
for k := 1, 2, ..., n do {
for j := k, k+1, ..., n do {
u_kj := a_kj - sum_{s=1...k-1} l_ks u_sj;
}
for i = k+1, k+2, ..., n do {
l_ik := (a_ik - sum_{s=1...k-1} l_is u_sk)/u_kk;
}
}
The key is that you can compute the ith row of U and the ith column of L at the same time, and you only need to know the previous row/column to move forward. This way you parallel process as much as you can and store as little as you need. Since you can compute the entries a_ij as needed, this requires you to store two vectors of length n while generating two more vectors of length n (rows of U, columns of L). The algorithm takes n^2 time. You might be able to find a few more tricks, but that depends on your space/time trade off.
Not sure if I've followed your problem; is it (or does it reduce to) the following?
You have two vectors of n numbers, call them x and c, then the matrix element is product over k of (x_k+c_k), with each row/column corresponding to distinct orderings of x and c?
If so, then I believe the matrix will be singular whenever there are repeated values in either x or c, since the matrix will then have repeated rows/columns. Try a bunch of Monte Carlo's on a smaller n with distinct values of x and c to see if that case is in general non-singular - it's quite likely if that's true for 6, it'll be true for 10.
As far as brute-force goes, your method:
Is a non-starter
Will work much more quickly (should be a few seconds for n=7), though instead of LU you might want to try SVD, which will do a much better job of letting you know how well behaved your matrix is.
I have to representing a matrix as a list of the matrix rows with the term
like this [[a,b],[c,d]] with representing numbers in Peano notation.
I have to obtain a row of matrix
ow(X,N,C): C is the N-th row of matrix X. and the column of matrix
column(X,N,C): C es the N-th column of matrix X.
alse this One to decompose a matrix in its first column and the rest of the matrix
(which is exactly the same matrix but without the first column):
first_column(X,C,R): matrix X is formed by a first column C in
front of matrix R.
Could somebody help me?
Peano' notation simplifies expressing recursive algorithms.
I assume matrix indexing is 0 based, and will show you how to get the simplest task of your assignment.
row([Row|_], 0, Row).
row([_|Rows], succ(N), Row) :- row(Rows, N, Row).
test (get the second row, with index 1):
?- row([[a,b], [c,d]], succ(0), R).
R = [c, d] ;
false.
This shows the essential ingredients you'll use to answer the other two tasks.