Given a matrix A and a vector d that represents the diagonal of a diagonal matrix D, what would be the best (i.e. simplest without compromising performance) Eigen expression for D*A in terms of only d and A?
Constructing D (as a dense matrix) and doing D*A seems inefficient as it would involve unnecessary multiplications by zero. The rows of A simply need to be scaled by the corresponding elements of d.
Should I convert to array and scale the rows or does Eigen provide for diagonal matrices to be constructed and multiplied in a way that avoids unnecessary overhead?
You can use an Eigen::DiagonalMatrix, as Damien suggested. Alternatively, if you're already given an Eigen::Vector (or Map), you can use d.asDiagonal() as follows:
Eigen::VectorXf d;
Eigen::MatrixXf a, b;
b = d.asDiagonal() * a;
Related
This must be surely well known, being a particular linear programming problem. What I want is a specific easy to implement efficient algorithm adapted to this very case, for relatively small sizes (about, say, ten vectors of dimension less than twenty).
I have vectors v(1), ..., v(m) of the same dimension. Want an
algorithm that produces strictly positive numbers c(1), ..., c(m)
such that c(1)v(1) + ... + c(m)v(m) is the zero vector, or tells for
sure that no such numbers exist.
What I found (in some clever code by a colleague) gives an approximate algorithm like this:
start with, say, c(1) = ... = c(m) = 1/m;
at each stage, given current approximation v = c(1)v(1) + ... + c(m)v(m), seek for j such that v - v(j) is longer than v(j).
If no such j exists then output "no solution" (or c(1), ..., c(m) if v is zero).
If such j exists, change v to the new approximation (1 - c)v + cv(j) with some small positive c.
This changes c(j) to (1 - c)c(j) + c and each other c(i) to (1 - c)c(i), so that the new coefficients will remain positive and strictly less than 1 (in fact they will sum to 1 all the time, i. e. we will remain in the convex hull of the v(i)).
Moreover the new v will have strictly smaller length, so eventually the algorithm will either discover that there is no solution or will produce arbitrarily small v.
Clearly this is incomplete and not satisfactory from several points of view. Can one do better?
Update
There are by now two useful answers; however one final step is missing.
They both boil down to the following (unless I miss some essential point).
Take a basis of the nullspace of v(1), ..., v(m).
One obtains a collection of not necessarily strictly positive solutions c(1), ..., c(m), c'(1), ..., c'(m), c''(1), ..., c''(m), ... such that any such solution is their linear combination (in a unique way). So we are reduced to the question whether this new collection of m-dimensional vectors admits a linear combination with strictly positive entries.
Example: take four 2d-vectors (2,1), (3,-1), (-1,2), (-3,-3). Their nullspace has a basis consisting of two solutions c = (12,-3,0,5), c' = (-1,1,1,0). None of these are strictly positive but their combination c + 4c' = (8,1,4,5) is. So the latter is the desired solution. But in general it might be not so easy to find out whether a strictly positive solution exists and if yes, how to find it.
As suggested in the answer by btilly one might use Fourier-Motzkin elimination for that, but again, I would be grateful for more details about it.
This is doable as follows.
First write your vectors as columns. Put them into a matrix. Now create a single column with entries c(1), c(2), ..., c(m_)). If you multiply that matrix times that column, you get your linear combination.
Now consider the elementary row operations. Multiply a row by a constant, swap two rows, add a multiple of one row to another. If you do an elementary row operation to the matrix, your linear combination after the row operation will be 0 if and only if it was before the row operation. Therefore doing elementary row operations DOESN'T CHANGE the coefficients that you're looking for.
Therefore you may simplify life by doing elementary row operations to put the matrix into reduced row echelon form. Once it is in reduced row echelon form, life gets easier. Columns which do not contain a pivot correspond to free coefficients. Columns which do contain a pivot correspond to coefficients that must be a specific linear combination of free coefficients. This reduces your problem being to find positive values for the free coefficients that make the others also positive. So you're now just solving a system of inequalities (and generally in far fewer variables).
Whether a system of linear inequalities has a solution can be answered with the FME method.
Denoting by A the matrix where the ith row is v(i) and by x the vector whose ith index is c(i), your problem can be describes as Ax = b where b=0 is the zero vector. The problem of Ax=b when b is not equal to zero is called the least squares problem (or the inhomogeneous least squares) and has a close form solution in the sense of Minimal Mean Square Error (MMSE). In your case however, b = 0 therefore we are in the homogeneous least squares problem. In Linear Algebra this can be looked as an eigenvalue problem, whose solution is the eigenvector x of the matrix A^TA whose eigenvalue is equal to 0. If no such eigenvalue exists, the MMSE solution will the the eigenvalue x whose matching eigenvalue is the smallest (closest to 0). A nice discussion on this topic is given here.
The solution is, as stated above, will be the eigenvector of A^TA with the lowest matching eigenvalue. This can be done using Singular Value Decomposition (SVD), which will decompose the matrix A into
The column of V matching with the lowest eigenvalue in the diagonal matrix Sigma will be your solution.
Explanation
When we want to minimize the Ax = 0 in the MSE sense, we can compute the vector derivative w.r.t x as follows:
Therefore, the eigenvector of A^TA matching the smallest eigenvalue will solve your problem.
Practical solution example
In python, you can use numpy.linalg.svd to perform the SVD decomposition. numpy orders the matrices U and V^T such that the leftmost column matches the largest eigenvalue and the rightmost column matches the lowest eigenvalue. Thus, you need to compute the SVD and take the rightmost column of the resulting V:
from numpy.linalg import svd
[_, _, vt] = svd(A)
x = vt[-1] # we take the last row since this is a transposed matrix, so the last column of V is the last row of V^T
One zero eigenvalue
In this case there is only one non trivial vector who solves the problem and the only way to satisfy the strictly positive condition will be if the values in the vector are all positive or all negative (multiplying the vector by -1 will not change the result)
Multiple zero eigenvalues
In the case where we have multiple zero eigenvalues, any of their matching eigenvectors is a possible solution and any linear combination of them. In this case one would have to check if there is a linear combination of these eigenvectors which creates a vector where all the values are strictly positive in order to satisfy the strictly positive condition.
How do we find the solution if one exists? once we are left with the basis of eigenvectors matching zero eigenvalue (also known as null-space) what we need to do is to solve a system of linear inequalities. I'll explain by example, since it will be clearer this way. Suppose we have the following matrix:
import numpy as np
A = np.array([[ 2, 3, -1, -3],
[ 1, -1, 2, -3]])
[_, Sigma, Vt] = np.linalg.svd(A) # Sigma has only 2 non-zero values, meaning that the null-space have a dimension of 2
We can extract the eigenvectors as explained above:
C = Vt[len(Sigma):]
# array([[-0.10292809, 0.59058542, 0.75313786, 0.27092073],
# [ 0.89356997, -0.15289589, 0.09399548, 0.4114856 ]])
What we want to find are two real coefficients, noted as x and y such that:
-0.10292809*x + 0.89356997*y > 0
0.59058542*x - 0.15289589*y > 0
0.75313786*x + 0.09399548*y > 0
0.27092073*x + 0.4114856*y > 0
We have a system of 4 inequalities with 2 variables, therefore in this case a solution is not promised. A solution can be found in many ways but I will propose the following. We can start with an initial guess and go over each hyperplane to check if the initial guess satisfies the inequality. if not we can reflect the guess to the other side of the hyperplane. After passing all the hyperplanes we check for a solution. (explanation of hot to reflect a point w.r.t a line can be found here). An example for python implementation will be:
import numpy as np
def get_strictly_positive(A):
[_, Sigma, Vt] = np.linalg.svd(A)
if len(Sigma[np.abs(Sigma) > 1e-5]) == Vt.shape[0]: # No zero eigenvalues, taking MMSE solution if exists
c = Vt[-1]
if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
return c if np.sum(c) == np.sum(abs(c)) else -1 * c
else:
return -1
# This means we have a zero solution
# Building matrix C of all the null-space basis vectors
C = Vt[len(Sigma[np.abs(Sigma) > 1e-5]):]
# 1. What we have here is a set of linear system of inequalities. Each equation inequality is a hyperplane and for
# each equation there is a valid half-space. We want to find the intersection of all the half-spaces, if it exists.
# 2. A vey important observations is that the basis of the null-space that we found using SVD is ORTHOGONAL!
coeffs = np.ones(C.shape[0]) # initial guess
for hyperplane in C.T:
if coeffs.dot(hyperplane) <= 0: # the guess is on the wrong side of the hyperplane
orthogonal_part = coeffs - (coeffs.dot(hyperplane) / hyperplane.dot(hyperplane)) * hyperplane
# reflecting the coefficients to the other side of the hyperplane
coeffs = 2 * orthogonal_part - coeffs
# If this yielded a solution, we return it
c = C.T.dot(coeffs)
if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
return c if np.sum(c) == np.sum(abs(c)) else -1 * c
else:
return -1
The equations are taken from one of my summaries and therefore I do not have a link to the source
Assuming there exist two matrixes A and B that are both m * n, is there a method or algorithm that can be used to obtain a n * n matrix C which satisfies the equation A * C = B' (B' can be obtained by performing several steps of row swap on B), where C satisfies the minimum sum of squared error.
Or A * C = D * B, where D(m*m) is a row swap transform matrix.
Thanks.
If I read your question correctly, you have two matrixes A and B, and you’re looking for C such that A * C = B + epsilon where you want to minimize epsilon’s sum of squares.
Your question seems to suggest you have some constraint on C but it’s not obvious what that is. But as you indicate in your answer, a linear solver will find a C that minimizes epsilon’s sum of squares. The solver doesn’t care what the ordering of the rows of B are: it will combine row-swapping operators (like D that you mention) into the C that it finds.
There are many different linear solvers and a simple function like solve has to choose which to use—you can always explicitly choose a specific solver if you know you want it. An expensive but very useful solver is the Moore–Penrose pseudoinverse: with C = pinv(A) * B, C is guaranteed to minimize the sum of squares of epsilon but also minimize C’s sum of squares. Wikipedia explains when solve might return something different than this min-norm solution via pseudoinverse.
I am working on algorithm to perform linear regression for one or more independent variables.
that is: (if I have m real world values and in the case of two independent variables a and b)
C + D*a1 + E* b1 = y1
C + D*a2 + E* b2 = y2
...
C + D*am + E* bm = ym
I would like to use the least squares solution to find best fitting straight line.
I will be using the matrix notation
so
where Beta is the vector [C, D, E] where these values will be the best fit line.
Question
What is the best way to solve this formula? Should I compute the inverse of
or should I use the LU factorization/decmposition of the matrix. What is the performance of each on large amount of data (i.e a big value of m , could be in order of 10^8 ...)
EDIT
If the answer was to use Cholesky decomposition or QR decomposition, are there any implementation hints/ simple libraries to use.
I am coding in C/ C++.
Two straightforward approaches spring to mind for solving a dense overdetermined system Ax=b:
Form A^T A x = A b, then Cholesky-factorise A^T A = L L^T, then do two back-solves. This usually gets you an answer precise to about sqrt(machine epsilon).
Compute the QR factorisation A = Q*R, where Q's columns are orthogonal and R is square and upper-triangular, using something like Householder elimination. Then solve Rx = Q^T b for x by back-substitution. This usually gets you an answer precise to about machine epsilon --- twice the precision as the Cholesky method, but it takes about twice as long.
For sparse systems, I'd usually prefer the Cholesky method because it takes better advantage of sparsity.
Your X^TX matrix should have a Cholesky decomposition. I'd look into this decomposition before LU. It is faster: http://en.wikipedia.org/wiki/Cholesky_decomposition
I have a large matrix, n! x n!, for which I need to take the determinant. For each permutation of n, I associate
a vector of length 2n (this is easy computationally)
a polynomial of in 2n variables (a product of linear factors computed recursively on n)
The matrix is the evaluation matrix for the polynomials at the vectors (thought of as points). So the sigma,tau entry of the matrix (indexed by permutations) is the polynomial for sigma evaluated at the vector for tau.
Example: For n=3, if the ith polynomial is (x1 - 4)(x3 - 5)(x4 - 4)(x6 - 1) and the jth point is (2,2,1,3,5,2), then the (i,j)th entry of the matrix will be (2 - 4)(1 - 5)(3 - 4)(2 - 1) = -8. Here n=3, so the points are in R^(3!) = R^6 and the polynomials have 3!=6 variables.
My goal is to determine whether or not the matrix is nonsingular.
My approach right now is this:
the function point takes a permutation and outputs a vector
the function poly takes a permutation and outputs a polynomial
the function nextPerm gives the next permutation in lexicographic order
The abridged pseudocode version of my code is this:
B := [];
P := [];
w := [1,2,...,n];
while w <> NULL do
B := B append poly(w);
P := P append point(w);
w := nextPerm(w);
od;
// BUILD A MATRIX IN MAPLE
M := Matrix(n!, (i,j) -> eval(B[i],P[j]));
// COMPUTE DETERMINANT IN MAPLE
det := LinearAlgebra[Determinant]( M );
// TELL ME IF IT'S NONSINGULAR
if det = 0 then return false;
else return true; fi;
I'm working in Maple using the built in function LinearAlgebra[Determinant], but everything else is a custom built function that uses low level Maple functions (e.g. seq, convert and cat).
My problem is that this takes too long, meaning I can go up to n=7 with patience, but getting n=8 takes days. Ideally, I want to be able to get to n=10.
Does anyone have an idea for how I could improve the time? I'm open to working in a different language, e.g. Matlab or C, but would prefer to find a way to speed this up within Maple.
I realize this might be hard to answer without all the gory details, but the code for each function, e.g. point and poly, is already optimized, so the real question here is if there is a faster way to take a determinant by building the matrix on the fly, or something like that.
UPDATE: Here are two ideas that I've toyed with that don't work:
I can store the polynomials (since they take a while to compute, I don't want to redo that if I can help it) into a vector of length n!, and compute the points on the fly, and plug these values into the permutation formula for the determinant:
The problem here is that this is O(N!) in the size of the matrix, so for my case this will be O((n!)!). When n=10, (n!)! = 3,628,800! which is way to big to even consider doing.
Compute the determinant using the LU decomposition. Luckily, the main diagonal of my matrix is nonzero, so this is feasible. Since this is O(N^3) in the size of the matrix, that becomes O((n!)^3) which is much closer to doable. The problem, though, is that it requires me to store the whole matrix, which puts serious strain on memory, nevermind the run time. So this doesn't work either, at least not without a bit more cleverness. Any ideas?
It isn't clear to me if your problem is space or time. Obviously the two trade back and forth. If you only wish to know if the determinant is positive or not, then you definitely should go with LU decomposition. The reason is that if A = LU with L lower triangular and U upper triangular, then
det(A) = det(L) det(U) = l_11 * ... * l_nn * u_11 * ... * u_nn
so you only need to determine if any of the main diagonal entries of L or U is 0.
To simplify further, use Doolittle's algorithm, where l_ii = 1. If at any point the algorithm breaks down, the matrix is singular so you can stop. Here's the gist:
for k := 1, 2, ..., n do {
for j := k, k+1, ..., n do {
u_kj := a_kj - sum_{s=1...k-1} l_ks u_sj;
}
for i = k+1, k+2, ..., n do {
l_ik := (a_ik - sum_{s=1...k-1} l_is u_sk)/u_kk;
}
}
The key is that you can compute the ith row of U and the ith column of L at the same time, and you only need to know the previous row/column to move forward. This way you parallel process as much as you can and store as little as you need. Since you can compute the entries a_ij as needed, this requires you to store two vectors of length n while generating two more vectors of length n (rows of U, columns of L). The algorithm takes n^2 time. You might be able to find a few more tricks, but that depends on your space/time trade off.
Not sure if I've followed your problem; is it (or does it reduce to) the following?
You have two vectors of n numbers, call them x and c, then the matrix element is product over k of (x_k+c_k), with each row/column corresponding to distinct orderings of x and c?
If so, then I believe the matrix will be singular whenever there are repeated values in either x or c, since the matrix will then have repeated rows/columns. Try a bunch of Monte Carlo's on a smaller n with distinct values of x and c to see if that case is in general non-singular - it's quite likely if that's true for 6, it'll be true for 10.
As far as brute-force goes, your method:
Is a non-starter
Will work much more quickly (should be a few seconds for n=7), though instead of LU you might want to try SVD, which will do a much better job of letting you know how well behaved your matrix is.
I'm having a hard time following the ray-plane intersection described in the following page.
SIGGRAPH Ray-Plane Intersection
Here is my understanding.
The plane is described as Ax + By + Cz + D = 0
or
The Vector ( A, B, C, D ), Where A, B, C define a normal plan. If A, B, and C define a unit normal, then the distance from the origin [0, 0, 0] to the plan is D.
My question is shouldn't D be a vector? Since it represents the distants between two points. I guess I just don't understand how you can represent the distance between to points as a non vector.
Any help is much appreciated.
Distance between two points is ALWAYS a scalar, a single number. Think of the vectors as points in space, right? So, when you say distance between two vectors, you are finding the distance between those two points which is a number. Distance between two vectors is the magnitude of the difference vector of the two vectors. So, you subtract the 2 vectors, get the difference vector and find its magnitude. That is your distance which is a SCALAR and NOT a vector.
Distance is a scalar value, not a vector. It is, in fact, the length of a vector.
You can think of a vector as a set of values describing a point in space in relation to the origin. In R3, you need a minimum of 3 pieces of information to describe the location of that point. These pieces of information give you a direction and a distance.
If you were to tell me that a city is 50 miles away, that would be you describing a distance. Of course, you will not have told me which direction that city was. When you give me 2 pieces of information, you have given me a vector, as opposed to scalar value.
Also recall the formula for distance:
D = sqrt(x^2 + y^2 + z^2)
Scalar value ;)