Eigenvalues calculation in Maxima - matrix

Let's say I have matrix A in following form
then I have matrix C in following form
and finally matrix L in following form
My goal is to find the formulas for the elements of the matrix L so that the eigenvalues of the matrix A-LC will be "K" times greater than the eigenvalues of the matrix A. The "K" is a
parameter.
I have started with the definitions of the matrices:
A: matrix(
[-a,0,b,c],
[0,a,-c,b],
[d,0,-e,-1],
[0,d,1,-e]
);
C: matrix(
[1,0,0,0],
[0,1,0,0]
);
L: matrix(
[l1,-l2],
[l2,l1],
[l3,-l4],
[l4,l3]
);
Then I have found the formula for the characteristic polynomial of the matrix A (its roots are the eigenvalues of the matrix A)
char_pol_system : ratsimp(expand(charpoly(A, x)));
x^4+2*e*x^3+(e^2-2*b*d-a^2+1)*x^2+((-2*b*d-2*a^2)*e-2*c*d)*x-a^2*e^2+(c^2+b^2)*d^2-a^2
and I have also found the formula for the characteristic polynomial of the matrix (A-LC) (its roots are the eigenvalues of the matrix A-LC). The requirement that the eigenvalues of the matrix (A-LC) has to be "K" times greater than the eigenvalues of the matrix A is reflected
by following substitution y = Kx
char_pol_observer : subst((K*x), y, ratsimp(expand(charpoly(A-L.C,y))));
K^4*x^4+K^3*(2*l1+2*e)*x^3+K^2*(2*c*l4+2*b*l3+l2^2+l1^2+4*e*l1+e^2-2*b*d-a^2+1)*x^2+K*((2*b*l2+2*c*l1+2*c*e-2*b)*l4+(-2*c*l2+2*b*l1+2*b*e+2*c)*l3+2*e*l2^2+2*c*d*l2+2*e*l1^2+(2*e^2-2*b*d+2)*l1+(-2*b*d-2*a^2)*e-2*c*d)*x+(c^2+b^2)*l4^2+((2*b*e+2*c)*l2+(2*c*e-2*b)*l1)*l4+(c^2+b^2)*l3^2+((2*b-2*c*e)*l2+(2*b*e+2*c)*l1+(-2*c^2-2*b^2)*d)*l3+(e^2+1)*l2^2+(2*c*d*e-2*b*d)*l2+(e^2+1)*l1^2+(-2*b*d*e-2*c*d)*l1-a^2*e^2+(c^2+b^2)*d^2-a^2
So I have two polynomials in x. My idea how to find the formulas for the unknowns l1 - l4 was to write down the equations based on comparison of the coefficients at the same powers of x.
My question is:
how can I eliminate the K^4 coefficient at the highest power of x in the second polynomial?
how can I write the equations based on comparison of the coefficients at the same powers of x in both polynomials?

To eliminate K^4, divide by K^4:
normalized: expand(char_pol_observer / K^4);
To equate the coefficients, first find the the difference of two polynomials:
difference: char_pol_system - normalized;
Then equate the coefficient of each power of x to 0. You can get the coefficients of x^n using ratcoef.
system_of_eqns: makelist(ratcoef(difference, x, n) = 0, n, 3, 0, -1);
You can find l1 from the first equation (coefficient of x^3), but other equations are not linear in l2, l3, l4. algsys fails to find a solution. Solving them one by one for each variable and substituting could maybe give a closed formula for the other variables.

Related

Algorithm for finding a linear dependence with strictly positive coefficients

This must be surely well known, being a particular linear programming problem. What I want is a specific easy to implement efficient algorithm adapted to this very case, for relatively small sizes (about, say, ten vectors of dimension less than twenty).
I have vectors v(1), ..., v(m) of the same dimension. Want an
algorithm that produces strictly positive numbers c(1), ..., c(m)
such that c(1)v(1) + ... + c(m)v(m) is the zero vector, or tells for
sure that no such numbers exist.
What I found (in some clever code by a colleague) gives an approximate algorithm like this:
start with, say, c(1) = ... = c(m) = 1/m;
at each stage, given current approximation v = c(1)v(1) + ... + c(m)v(m), seek for j such that v - v(j) is longer than v(j).
If no such j exists then output "no solution" (or c(1), ..., c(m) if v is zero).
If such j exists, change v to the new approximation (1 - c)v + cv(j) with some small positive c.
This changes c(j) to (1 - c)c(j) + c and each other c(i) to (1 - c)c(i), so that the new coefficients will remain positive and strictly less than 1 (in fact they will sum to 1 all the time, i. e. we will remain in the convex hull of the v(i)).
Moreover the new v will have strictly smaller length, so eventually the algorithm will either discover that there is no solution or will produce arbitrarily small v.
Clearly this is incomplete and not satisfactory from several points of view. Can one do better?
Update
There are by now two useful answers; however one final step is missing.
They both boil down to the following (unless I miss some essential point).
Take a basis of the nullspace of v(1), ..., v(m).
One obtains a collection of not necessarily strictly positive solutions c(1), ..., c(m), c'(1), ..., c'(m), c''(1), ..., c''(m), ... such that any such solution is their linear combination (in a unique way). So we are reduced to the question whether this new collection of m-dimensional vectors admits a linear combination with strictly positive entries.
Example: take four 2d-vectors (2,1), (3,-1), (-1,2), (-3,-3). Their nullspace has a basis consisting of two solutions c = (12,-3,0,5), c' = (-1,1,1,0). None of these are strictly positive but their combination c + 4c' = (8,1,4,5) is. So the latter is the desired solution. But in general it might be not so easy to find out whether a strictly positive solution exists and if yes, how to find it.
As suggested in the answer by btilly one might use Fourier-Motzkin elimination for that, but again, I would be grateful for more details about it.
This is doable as follows.
First write your vectors as columns. Put them into a matrix. Now create a single column with entries c(1), c(2), ..., c(m_)). If you multiply that matrix times that column, you get your linear combination.
Now consider the elementary row operations. Multiply a row by a constant, swap two rows, add a multiple of one row to another. If you do an elementary row operation to the matrix, your linear combination after the row operation will be 0 if and only if it was before the row operation. Therefore doing elementary row operations DOESN'T CHANGE the coefficients that you're looking for.
Therefore you may simplify life by doing elementary row operations to put the matrix into reduced row echelon form. Once it is in reduced row echelon form, life gets easier. Columns which do not contain a pivot correspond to free coefficients. Columns which do contain a pivot correspond to coefficients that must be a specific linear combination of free coefficients. This reduces your problem being to find positive values for the free coefficients that make the others also positive. So you're now just solving a system of inequalities (and generally in far fewer variables).
Whether a system of linear inequalities has a solution can be answered with the FME method.
Denoting by A the matrix where the ith row is v(i) and by x the vector whose ith index is c(i), your problem can be describes as Ax = b where b=0 is the zero vector. The problem of Ax=b when b is not equal to zero is called the least squares problem (or the inhomogeneous least squares) and has a close form solution in the sense of Minimal Mean Square Error (MMSE). In your case however, b = 0 therefore we are in the homogeneous least squares problem. In Linear Algebra this can be looked as an eigenvalue problem, whose solution is the eigenvector x of the matrix A^TA whose eigenvalue is equal to 0. If no such eigenvalue exists, the MMSE solution will the the eigenvalue x whose matching eigenvalue is the smallest (closest to 0). A nice discussion on this topic is given here.
The solution is, as stated above, will be the eigenvector of A^TA with the lowest matching eigenvalue. This can be done using Singular Value Decomposition (SVD), which will decompose the matrix A into
The column of V matching with the lowest eigenvalue in the diagonal matrix Sigma will be your solution.
Explanation
When we want to minimize the Ax = 0 in the MSE sense, we can compute the vector derivative w.r.t x as follows:
Therefore, the eigenvector of A^TA matching the smallest eigenvalue will solve your problem.
Practical solution example
In python, you can use numpy.linalg.svd to perform the SVD decomposition. numpy orders the matrices U and V^T such that the leftmost column matches the largest eigenvalue and the rightmost column matches the lowest eigenvalue. Thus, you need to compute the SVD and take the rightmost column of the resulting V:
from numpy.linalg import svd
[_, _, vt] = svd(A)
x = vt[-1] # we take the last row since this is a transposed matrix, so the last column of V is the last row of V^T
One zero eigenvalue
In this case there is only one non trivial vector who solves the problem and the only way to satisfy the strictly positive condition will be if the values in the vector are all positive or all negative (multiplying the vector by -1 will not change the result)
Multiple zero eigenvalues
In the case where we have multiple zero eigenvalues, any of their matching eigenvectors is a possible solution and any linear combination of them. In this case one would have to check if there is a linear combination of these eigenvectors which creates a vector where all the values are strictly positive in order to satisfy the strictly positive condition.
How do we find the solution if one exists? once we are left with the basis of eigenvectors matching zero eigenvalue (also known as null-space) what we need to do is to solve a system of linear inequalities. I'll explain by example, since it will be clearer this way. Suppose we have the following matrix:
import numpy as np
A = np.array([[ 2, 3, -1, -3],
[ 1, -1, 2, -3]])
[_, Sigma, Vt] = np.linalg.svd(A) # Sigma has only 2 non-zero values, meaning that the null-space have a dimension of 2
We can extract the eigenvectors as explained above:
C = Vt[len(Sigma):]
# array([[-0.10292809, 0.59058542, 0.75313786, 0.27092073],
# [ 0.89356997, -0.15289589, 0.09399548, 0.4114856 ]])
What we want to find are two real coefficients, noted as x and y such that:
-0.10292809*x + 0.89356997*y > 0
0.59058542*x - 0.15289589*y > 0
0.75313786*x + 0.09399548*y > 0
0.27092073*x + 0.4114856*y > 0
We have a system of 4 inequalities with 2 variables, therefore in this case a solution is not promised. A solution can be found in many ways but I will propose the following. We can start with an initial guess and go over each hyperplane to check if the initial guess satisfies the inequality. if not we can reflect the guess to the other side of the hyperplane. After passing all the hyperplanes we check for a solution. (explanation of hot to reflect a point w.r.t a line can be found here). An example for python implementation will be:
import numpy as np
def get_strictly_positive(A):
[_, Sigma, Vt] = np.linalg.svd(A)
if len(Sigma[np.abs(Sigma) > 1e-5]) == Vt.shape[0]: # No zero eigenvalues, taking MMSE solution if exists
c = Vt[-1]
if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
return c if np.sum(c) == np.sum(abs(c)) else -1 * c
else:
return -1
# This means we have a zero solution
# Building matrix C of all the null-space basis vectors
C = Vt[len(Sigma[np.abs(Sigma) > 1e-5]):]
# 1. What we have here is a set of linear system of inequalities. Each equation inequality is a hyperplane and for
# each equation there is a valid half-space. We want to find the intersection of all the half-spaces, if it exists.
# 2. A vey important observations is that the basis of the null-space that we found using SVD is ORTHOGONAL!
coeffs = np.ones(C.shape[0]) # initial guess
for hyperplane in C.T:
if coeffs.dot(hyperplane) <= 0: # the guess is on the wrong side of the hyperplane
orthogonal_part = coeffs - (coeffs.dot(hyperplane) / hyperplane.dot(hyperplane)) * hyperplane
# reflecting the coefficients to the other side of the hyperplane
coeffs = 2 * orthogonal_part - coeffs
# If this yielded a solution, we return it
c = C.T.dot(coeffs)
if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
return c if np.sum(c) == np.sum(abs(c)) else -1 * c
else:
return -1
The equations are taken from one of my summaries and therefore I do not have a link to the source

Finding integral solution of an equation

This is part of a bigger question. Its actually a mathematical problem. So it would be really great if someone can direct me to any algorithm to obtain the solution of this problem or a pseudo code will be of help.
The question. Given an equation check if it has an integral solution.
For example:
(26a+5)/32=b
Here a is an integer. Is there an algorithm to predict or find if b can be an integer. I need a general solution not specific to this question. The equation can vary. Thanks
Your problem is an example of a linear Diophantine equation. About that, Wikipedia says:
This Diophantine equation [i.e., a x + b y = c] has a solution (where x and y are integers) if and only if c is a multiple of the greatest common divisor of a and b. Moreover, if (x, y) is a solution, then the other solutions have the form (x + k v, y - k u), where k is an arbitrary integer, and u and v are the quotients of a and b (respectively) by the greatest common divisor of a and b.
In this case, (26 a + 5)/32 = b is equivalent to 26 a - 32 b = -5. The gcd of the coefficients of the unknowns is gcd(26, -32) = 2. Since -5 is not a multiple of 2, there is no solution.
A general Diophantine equation is a polynomial in the unknowns, and can only be solved (if at all) by more complex methods. A web search might turn up specialized software for that problem.
Linear Diophantine equations take the form ax + by = c. If c is the greatest common divisor of a and b this means a=z'c and b=z''c then this is Bézout's identity of the form
with a=z' and b=z'' and the equation has an infinite number of solutions. So instead of trial searching method you can check if c is the greatest common divisor (GCD) of a and b
If indeed a and b are multiples of c then x and y can be computed using extended Euclidean algorithm which finds integers x and y (one of which is typically negative) that satisfy Bézout's identity
(as a side note: this holds also for any other Euclidean domain, i.e. polynomial ring & every Euclidean domain is unique factorization domain). You can use Iterative Method to find these solutions:
Integral solution to equation `a + bx = c + dy`

Algorithm for orthogonal polynomials

and thank you for the attention you're paying to my question :)
My question is about finding an (efficient enough) algorithm for finding orthogonal polynomials of a given weight function f.
I've tried to simply apply the Gram-Schmidt algorithm but this one is not efficient enough. Indeed, it requires O(n^2) integrals. But my goal is to use this algorithm in order to find Hankel determinants of a function f. So a "direct" computation wich consists in simply compute the matrix and take its determinants requires only 2*n - 1 integrals.
But I want to use the theorem stating that the Hankel determinant of order n of f is a product of the n first leading coefficients of the orthogonal polynomials of f. The reason is that when n gets larger (say about 20), Hankel determinant gets really big and my goal is to divided it by an other big constant (for n = 20, the constant is of order 10^103). My idea is then to "dilute" the computation of the constant in the product of the leading coefficients.
I hope there is a O(n) algorithm to compute the n first orthogonal polynomials :) I've done some digging and found nothing in that direction for general function f (f can be any smooth function, actually).
EDIT: I'll precise here what the objects I'm talking about are.
1) A Hankel determinant of order n is the determinant of a square matrix which is constant on the skew diagonals. Thus for example
a b c
b c d
c d e
is a Hankel matrix of size 3 by 3.
2) If you have a function f : R -> R, you can associate to f its "kth moment" which is defined as (I'll write it in tex) f_k := \int_{\mathbb{R}} f(x) x^k dx
With this, you can create a Hankel matrix A_n(f) whose entries are (A_n(f)){ij} = f{i+j-2}, that is something of the like
f_0 f_1 f_2
f_1 f_2 f_3
f_2 f_3 f_4
With this in mind, it is easy to define the Hankel determinant of f which is simply
H_n(f) := det(A_n(f)). (Of course, it is understood that f has sufficient decay at infinity, this means that all the moments are well defined. A typical choice for f could be the gaussian f(x) = exp(-x^2), or any continuous function on a compact set of R...)
3) What I call orthogonal polynomials of f is a set of polynomials (p_n) such that
\int_{\mathbb{R}} f(x) p_j(x) p_k(x) is 1 if j = k and 0 otherwize.
(They are called like that since they form an orthonormal basis of the vector space of polynomials with respect to the scalar product
(p|q) = \int_{\mathbb{R}} f(x) p(x) q(x) dx
4) Now, it is basic linear algebra that from any basis of a vector space equipped with a scalar product, you can built a orthonormal basis thanks to the Gram-Schmidt algorithm. This is where the n^2 integrations comes from. You start from the basis 1, x, x^2, ..., x^n. Then you need n(n-1) integrals for the family to be orthogonal, and you need n more in order to normalize them.
5) There is a theorem saying that if f : R -> R is a function having sufficient decay at infinity, then we have that its Hankel determinant H_n(f) is equal to
H_n(f) = \prod_{j = 0}^{n-1} \kappa_j^{-2}
where \kappa_j is the leading coefficient of the j+1th orthogonal polynomial of f.
Thank you for your answer!
(PS: I tagged octave because I work in octave so, with a bit of luck (but I doubt it), there is a built-in function or a package already done managing this kind of think)
Orthogonal polynomials obey a recurrence relation, which we can write as
P[n+1] = (X-a[n])*P[n] - b[n-1]*P[n-1]
P[0] = 1
P[1] = X-a[0]
and we can compute the a, b coefficients by
a[n] = <X*P[n]|P[n]> / c[n]
b[n-1] = c[n-1]/c[n]
where
c[n] = <P[n]|P[n]>
(Here < | > is your inner product).
However I cannot vouch for the stability of this process at large n.

How is the complexity of PCA O(min(p^3,n^3))?

I've been reading a paper on Sparse PCA, which is:
http://stats.stanford.edu/~imj/WEBLIST/AsYetUnpub/sparse.pdf
And it states that, if you have n data points, each represented with p features, then, the complexity of PCA is O(min(p^3,n^3)).
Can someone please explain how/why?
Covariance matrix computation is O(p2n); its eigen-value decomposition is O(p3). So, the complexity of PCA is O(p2n+p3).
O(min(p3,n3)) would imply that you could analyze a two-dimensional dataset of any size in fixed time, which is patently false.
Assuming your dataset is $X \in \R^{nxp}$ where n: number of samples, d: dimensions of a sample, you are interested in the eigenanalysis of $X^TX$ which is the main computational cost of PCA. Now matrices $X^TX \in \R^{pxp}$ and $XX^T \in \R^{nxn}$ have the same min(n, p) non negative eigenvalues and eigenvectors. Assuming p less than n you can solve the eigenanalysis in $O(p^3)$. If p greater than n (for example in computer vision in many cases the dimensionality of sample -number of pixels- is greater than the number of samples available) you can perform eigenanalysis in $O(n^3)$ time. In any case you can get the eigenvectors of one matrix from the eigenvalues and eigenvectors of the other matrix and do that in $O(min(p, n)^3)$ time.
$$X^TX = V \Lambda V^T$$
$$XX^T = U \Lambda U^T$$
$$U = XV\Lambda^{-1/2}$$
Below is michaelt's answer provided in both the original LaTeX and rendered as a PNG.
LaTeX code:
Assuming your dataset is $X \in R^{n\times p}$ where n: number of samples, p: dimensions of a sample, you are interested in the eigenanalysis of $X^TX$ which is the main computational cost of PCA. Now matrices $X^TX \in \R^{p \times p}$ and $XX^T \in \R^{n\times
n}$ have the same min(n, p) non negative eigenvalues and eigenvectors. Assuming p less than n you can solve the eigenanalysis in $O(p^3)$. If p greater than n (for example in computer vision in many cases the dimensionality of sample -number of pixels- is greater than the number of samples available) you can perform eigenanalysis in $O(n^3)$ time. In any case you can get the eigenvectors of one matrix from the eigenvalues and eigenvectors of the other matrix and do that in $O(min(p, n)^3)$ time.

Finding the number of possible arithmetic series of 3 among a given set of numbers

Given a set integers, the problem consists of finding the number of possible arithmetic series of length 3. The set of integers may or may not be sorted.
I could implement a simple bruteforce algorithm taking time O(n^3) but time efficiency is important and the set of integers can be as large as 10^5. This means bruteforce obviously won't work. Can anyone suggest some algorithm/pseudocode/code in c++?
An example: there are 4 numbers 5,2,7,8 . Clearly there is only one such possibility - (2,5,8) in which the common difference is 3, so our answer is 1.
EDIT:I forgot to mention one important property - each number of set given is between 1 to 30000 (inclusive).
You can do it in O(N^2) as follows: create a hash set of your integers so that you could check a presence or absence of an element in O(1). After that, make two nested loops over all pairs of set elements {X, Y}. This is done in O(N^2).
For each pair {X, Y}, assume that X < Y, and calculate two numbers:
Z1 = X - (Y-X)
Z2 = Y + (Y-X)
A triple {X, Y, Zi} form an arithmetic sequence if Zi != X && Zi != Y && set.contains(Zi)
Check both triples {X, Y, Z1} and {X, Y, Z2}. You can do it in O(1) using a hash set, for a total running time of the algorithm of O(N^2).
An alternative solution that is O(N+BlogB) (where B is the maximum size of the integers - in your case 30,000) is to consider the histogram H, where H[x] is the number of times x is present in the sequence.
This histogram can be computed in time N.
You are seeking elements a,b,c such that b-a=c-b. This is equivalent to 2b=a+c.
So the idea is to compute a second histogram G[x] for a+c and then loop through all elements b and add H[b]*G[2b] to the total. This takes time O(B).
(G[x] is the number of times in the sequence there are a pair of values a,b such that x=a+b.)
The only difficulty is computing G[x], but this can be done using the Fast Fourier Transform to convolve H[x] with itself in time O(BlogB).

Resources