The column sum of symmetric matrix inverse? - matrix-inverse

Given a symmetric matrix L whose inverse is expensive to compute, is there another way to calculate sum( inverse(L)(:,i) )?

It can be shown that
sum ( inverse(L)(:,i) ) = x(i)
where the vector x is the solution to the simultaneous equations
L x = (1,1,...,1)'
(' denotes transposition). Since L is symmetric, so is its inverse, so the i-th column sum of inverse(L) equals its i-th row sum, which is exactly the i-th entry of inverse(L)*(1,1,...,1)'. Solving a single linear system is cheaper than forming the full inverse (a constant factor of several times fewer operations for a dense matrix, and far less when L is sparse or otherwise structured), so this should improve the speed of the computation.
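A minimal NumPy/SciPy sketch of this, assuming L is symmetric positive definite so a Cholesky solve applies (for a merely symmetric L, np.linalg.solve does the same job); the matrix here is made up for illustration:

import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
L = B @ B.T + 5 * np.eye(5)                        # made-up symmetric positive definite matrix

# Column sums of inverse(L) without forming the inverse: solve L x = ones.
x = cho_solve(cho_factor(L), np.ones(L.shape[0]))

# Check against the explicit inverse (only sensible for this tiny example).
print(np.allclose(x, np.linalg.inv(L).sum(axis=0)))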

Related

Eigenvalues calculation in Maxima

Let's say I have a matrix A, a matrix C, and finally a matrix L, in the forms given below.
My goal is to find formulas for the elements of the matrix L so that the eigenvalues of the matrix A-LC are "K" times the eigenvalues of the matrix A, where "K" is a parameter.
I have started with the definitions of the matrices:
A: matrix(
[-a,0,b,c],
[0,a,-c,b],
[d,0,-e,-1],
[0,d,1,-e]
);
C: matrix(
[1,0,0,0],
[0,1,0,0]
);
L: matrix(
[l1,-l2],
[l2,l1],
[l3,-l4],
[l4,l3]
);
Then I have found the formula for the characteristic polynomial of the matrix A (its roots are the eigenvalues of the matrix A)
char_pol_system : ratsimp(expand(charpoly(A, x)));
x^4+2*e*x^3+(e^2-2*b*d-a^2+1)*x^2+((-2*b*d-2*a^2)*e-2*c*d)*x-a^2*e^2+(c^2+b^2)*d^2-a^2
and I have also found the formula for the characteristic polynomial of the matrix (A-LC) (its roots are the eigenvalues of the matrix A-LC). The requirement that the eigenvalues of the matrix (A-LC) have to be "K" times the eigenvalues of the matrix A is reflected by the following substitution y = K*x
char_pol_observer : subst((K*x), y, ratsimp(expand(charpoly(A-L.C,y))));
K^4*x^4+K^3*(2*l1+2*e)*x^3+K^2*(2*c*l4+2*b*l3+l2^2+l1^2+4*e*l1+e^2-2*b*d-a^2+1)*x^2+K*((2*b*l2+2*c*l1+2*c*e-2*b)*l4+(-2*c*l2+2*b*l1+2*b*e+2*c)*l3+2*e*l2^2+2*c*d*l2+2*e*l1^2+(2*e^2-2*b*d+2)*l1+(-2*b*d-2*a^2)*e-2*c*d)*x+(c^2+b^2)*l4^2+((2*b*e+2*c)*l2+(2*c*e-2*b)*l1)*l4+(c^2+b^2)*l3^2+((2*b-2*c*e)*l2+(2*b*e+2*c)*l1+(-2*c^2-2*b^2)*d)*l3+(e^2+1)*l2^2+(2*c*d*e-2*b*d)*l2+(e^2+1)*l1^2+(-2*b*d*e-2*c*d)*l1-a^2*e^2+(c^2+b^2)*d^2-a^2
So I have two polynomials in x. My idea for finding formulas for the unknowns l1 - l4 was to write down equations comparing the coefficients of the same powers of x in both polynomials.
My question is:
how can I eliminate the K^4 coefficient at the highest power of x in the second polynomial?
how can I write the equations based on comparison of the coefficients at the same powers of x in both polynomials?
To eliminate K^4, divide by K^4:
normalized: expand(char_pol_observer / K^4);
To equate the coefficients, first find the difference of the two polynomials:
difference: char_pol_system - normalized;
Then equate the coefficient of each power of x to 0. You can get the coefficients of x^n using ratcoef.
system_of_eqns: makelist(ratcoef(difference, x, n) = 0, n, 3, 0, -1);
You can find l1 from the first equation (the coefficient of x^3), but the other equations are not linear in l2, l3, l4, and algsys fails to find a solution. Solving them one at a time for a single variable and substituting back might still give closed formulas for the remaining unknowns.
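For reference, here is a minimal SymPy sketch of the same coefficient-comparison idea (the question itself is in Maxima; the Python translation and names are mine):

import sympy as sp

a, b, c, d, e, K, x = sp.symbols('a b c d e K x')
l1, l2, l3, l4 = sp.symbols('l1 l2 l3 l4')

A = sp.Matrix([[-a, 0, b, c], [0, a, -c, b], [d, 0, -e, -1], [0, d, 1, -e]])
C = sp.Matrix([[1, 0, 0, 0], [0, 1, 0, 0]])
L = sp.Matrix([[l1, -l2], [l2, l1], [l3, -l4], [l4, l3]])

# Characteristic polynomial of A in x, and of A - L*C evaluated at K*x,
# the latter divided by K^4 so that both are monic in x.
p_system = sp.expand((A - x * sp.eye(4)).det())
p_observer = sp.expand(((A - L * C) - (K * x) * sp.eye(4)).det() / K**4)

# One equation per power of x: the coefficients must agree.
eqs = [sp.Eq(sp.expand(p_system - p_observer).coeff(x, n), 0) for n in range(4)]

# The x^3 equation is linear in l1 and gives a closed formula for it.
print(sp.solve(eqs[3], l1))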

How to calculate time complexity of given algorithm ( ridge regression)?

I have the following algorithm and I need to work out its time complexity. Could anybody help me get the correct time complexity of this algorithm?
% save a matrix-vector multiply
Atb = A'*b;
% cache the factorization (using cholesky factorization)
[L U] = factor(A, a);
for( k = 0; k < maxiter; k++)
{
x^k+1 = (A^TA + a* I)^-1 (A^Tb + a (z^k - u^k))^T
}
where A is an m x n matrix with n >> m; b, u, z are n x 1 vectors; I is the identity matrix; and a = 0.001
The most computationally intensive operation here is the matrix inversion, so the cost depends on how you implement that operation. If we assume you implement it with a Gauss–Jordan algorithm, which takes O(n^3), then the overall complexity is O(maxiter * n^3). Here I take into account that n is bigger than m (forming A^T*A takes O(m*n^2)).
If you calculate (A^T*A + a*I)^-1 and A^Tb outside the loop, then you are left with
Inv * (Atb + a(z^k - u^k))^T
which is O(n^2) per iteration, because you multiply an n x n matrix by an n x 1 vector, while the additions and subtractions take O(n).
Still, you have some inconsistencies in the sizes, which I described in the comments on the question.
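For illustration, a minimal NumPy/SciPy sketch of the "factor once, solve each iteration" pattern (the function name is mine, the z/u updates of the full algorithm are omitted, and b is taken as m x 1 so that A'*b is well defined):

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def ridge_x_updates(A, b, z, u, a=0.001, maxiter=50):
    n = A.shape[1]
    Atb = A.T @ b                                  # O(m*n), done once
    factor = cho_factor(A.T @ A + a * np.eye(n))   # O(m*n^2) + O(n^3), done once
    x = np.zeros(n)
    for _ in range(maxiter):
        # Each iteration is only a pair of triangular solves plus
        # vector operations: O(n^2) per iteration.
        x = cho_solve(factor, Atb + a * (z - u))
        # (a full iterative scheme would update z and u here)
    return x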

Generate multivariate normal matrix issue with accuracy

I am trying to use a Cholesky decomposition to generate a multivariate normal matrix with this: Y = U + X*L
U is the mean vector: n x m
L from cholesky: m x m
X is a matrix with univariate normal vectors: n x m
After calculating the mean of the simulated matrix, I realized it was off. The reason is that the mean vector is very close to zero, so when it is added to X*L, the X*L term dominates U. Does anyone know how to work around this issue?
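For reference, a minimal NumPy sketch of the construction described above (the sizes, mean, and covariance below are made up; NumPy's cholesky returns the lower factor, hence the transpose, whereas Octave's chol returns the upper factor, so X*L plays the same role there):

import numpy as np

rng = np.random.default_rng(0)
n, m = 100_000, 3                                 # made-up sizes

U = np.full((n, m), 1e-3)                         # near-zero mean, one row per sample
Sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])               # made-up target covariance

L = np.linalg.cholesky(Sigma)                     # lower triangular, Sigma = L @ L.T
X = rng.standard_normal((n, m))                   # univariate standard normal draws

Y = U + X @ L.T                                   # rows of Y ~ N(mean row of U, Sigma)

print(Y.mean(axis=0))                             # approaches the mean only for large n
print(np.cov(Y, rowvar=False))                    # close to Sigma

Note that with n samples the sample mean of the X*L term is itself of order 1/sqrt(n) times the standard deviations, which is why a mean vector much smaller than that is hard to see in the simulated data.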

Algorithm for orthogonal polynomials

Thank you for the attention you're paying to my question :)
My question is about finding an (efficient enough) algorithm for finding orthogonal polynomials of a given weight function f.
I've tried simply applying the Gram-Schmidt algorithm, but it is not efficient enough: it requires O(n^2) integrals. My goal is to use this algorithm to find Hankel determinants of a function f, and a "direct" computation, which consists in simply computing the matrix and taking its determinant, requires only 2*n - 1 integrals.
But I want to use the theorem stating that the Hankel determinant of order n of f is a product built from the first n leading coefficients of the orthogonal polynomials of f. The reason is that when n gets larger (say about 20), the Hankel determinant gets really big, and my goal is to divide it by another big constant (for n = 20, the constant is of order 10^103). My idea is then to "dilute" the computation of the constant in the product of the leading coefficients.
I hope there is an O(n) algorithm to compute the first n orthogonal polynomials :) I've done some digging and found nothing in that direction for a general function f (f can be any smooth function, actually).
EDIT: Let me make precise what the objects I'm talking about are.
1) A Hankel determinant of order n is the determinant of a square matrix which is constant on the skew diagonals. Thus for example
a b c
b c d
c d e
is a Hankel matrix of size 3 by 3.
2) If you have a function f : R -> R, you can associate to f its "kth moment" which is defined as (I'll write it in tex) f_k := \int_{\mathbb{R}} f(x) x^k dx
With this, you can create a Hankel matrix A_n(f) whose entries are (A_n(f))_{ij} = f_{i+j-2}, that is, something like
f_0 f_1 f_2
f_1 f_2 f_3
f_2 f_3 f_4
With this in mind, it is easy to define the Hankel determinant of f which is simply
H_n(f) := det(A_n(f)). (Of course, it is understood that f has sufficient decay at infinity, this means that all the moments are well defined. A typical choice for f could be the gaussian f(x) = exp(-x^2), or any continuous function on a compact set of R...)
3) What I call orthogonal polynomials of f is a set of polynomials (p_n) such that
\int_{\mathbb{R}} f(x) p_j(x) p_k(x) dx is 1 if j = k and 0 otherwise.
(They are called that because they form an orthonormal basis of the vector space of polynomials with respect to the scalar product
(p|q) = \int_{\mathbb{R}} f(x) p(x) q(x) dx.)
4) Now, it is basic linear algebra that from any basis of a vector space equipped with a scalar product, you can build an orthonormal basis thanks to the Gram-Schmidt algorithm. This is where the O(n^2) integrations come from: you start from the basis 1, x, x^2, ..., x^n; then you need n(n-1) integrals to make the family orthogonal, and n more in order to normalize them.
5) There is a theorem saying that if f : R -> R is a function having sufficient decay at infinity, then we have that its Hankel determinant H_n(f) is equal to
H_n(f) = \prod_{j = 0}^{n-1} \kappa_j^{-2}
where \kappa_j is the leading coefficient of the j+1th orthogonal polynomial of f.
Thank you for your answer!
(PS: I tagged octave because I work in Octave, so, with a bit of luck (but I doubt it), there is a built-in function or a package already managing this kind of thing.)
Orthogonal polynomials obey a recurrence relation, which we can write as
P[n+1] = (X-a[n])*P[n] - b[n-1]*P[n-1]
P[0] = 1
P[1] = X-a[0]
and we can compute the a, b coefficients by
a[n] = <X*P[n]|P[n]> / c[n]
b[n-1] = c[n]/c[n-1]
where
c[n] = <P[n]|P[n]>
(Here < | > is your inner product).
However I cannot vouch for the stability of this process at large n.
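A small Python/SciPy sketch of that recurrence, just to make the bookkeeping concrete (the question is tagged octave; the translation, the function name, and the Gaussian test weight are mine). It needs only about two integrals per degree, so O(n) integrals in total:

import numpy as np
from scipy.integrate import quad

def monic_orthogonal_polys(f, n_max, lo=-np.inf, hi=np.inf):
    # inner product <p|q> = integral of f(t) p(t) q(t) dt
    inner = lambda p, q: quad(lambda t: f(t) * p(t) * q(t), lo, hi)[0]
    X = np.poly1d([1.0, 0.0])                    # the polynomial "x"

    P = [np.poly1d([1.0])]                       # P[0] = 1
    c = [inner(P[0], P[0])]                      # c[n] = <P[n]|P[n]>
    P.append(X - inner(X * P[0], P[0]) / c[0])   # P[1] = X - a[0]
    c.append(inner(P[1], P[1]))

    for n in range(1, n_max):
        a_n = inner(X * P[n], P[n]) / c[n]
        b_nm1 = c[n] / c[n - 1]
        P.append((X - a_n) * P[n] - b_nm1 * P[n - 1])
        c.append(inner(P[n + 1], P[n + 1]))
    return P, c

# Gaussian weight as a test case; the P[n] should match monic Hermite polynomials.
P, c = monic_orthogonal_polys(lambda t: np.exp(-t**2), 6)

Since these P[n] are monic, the orthonormal polynomials are P[n]/sqrt(c[n]), so the leading coefficients from point 5 are \kappa_n = 1/sqrt(c[n]) and the product \prod_j \kappa_j^{-2} is just \prod_j c[j], which the recurrence already computes along the way.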

How is the complexity of PCA O(min(p^3,n^3))?

I've been reading a paper on Sparse PCA, which is:
http://stats.stanford.edu/~imj/WEBLIST/AsYetUnpub/sparse.pdf
And it states that, if you have n data points, each represented with p features, then the complexity of PCA is O(min(p^3, n^3)).
Can someone please explain how/why?
Covariance matrix computation is O(p^2 n); its eigenvalue decomposition is O(p^3). So the complexity of PCA is O(p^2 n + p^3).
O(min(p^3, n^3)) would imply that you could analyze a two-dimensional dataset of any size in fixed time, which is patently false.
Assuming your dataset is $X \in \mathbb{R}^{n \times p}$, where n is the number of samples and p the dimension of a sample, you are interested in the eigenanalysis of $X^TX$, which is the main computational cost of PCA. Now the matrices $X^TX \in \mathbb{R}^{p \times p}$ and $XX^T \in \mathbb{R}^{n \times n}$ have the same min(n, p) nonnegative eigenvalues, and their eigenvectors are related as below. Assuming p < n, you can solve the eigenanalysis in $O(p^3)$. If p > n (for example, in computer vision the dimensionality of a sample, i.e. the number of pixels, is often greater than the number of samples available), you can perform the eigenanalysis in $O(n^3)$ time. In either case you can get the eigenvectors of one matrix from the eigenvalues and eigenvectors of the other matrix, and do that in $O(\min(p, n)^3)$ time.
$$X^TX = V \Lambda V^T$$
$$XX^T = U \Lambda U^T$$
$$U = XV\Lambda^{-1/2}$$
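A small NumPy sketch of this trick (the sizes and random data are mine), recovering the eigenvectors of the large n x n matrix from the small p x p one:

import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 20                          # many samples, few features (the p < n case)
X = rng.standard_normal((n, p))

# Eigendecompose the small p x p matrix: O(p^3) instead of O(n^3).
evals, V = np.linalg.eigh(X.T @ X)       # X^T X = V diag(evals) V^T

# Recover the corresponding eigenvectors of the n x n matrix X X^T.
U = X @ V / np.sqrt(evals)               # U = X V Lambda^{-1/2}

# Check: the columns of U are orthonormal eigenvectors of X X^T.
print(np.allclose((X @ X.T) @ U, U * evals))
print(np.allclose(U.T @ U, np.eye(p)))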
