How can we use CUR decomposition in place of SVD decomposition?

I understand how CUR and SVD work, but I am not able to understand how we can use CUR in place of the SVD.
Do the C and R matrices in CUR have the same properties as the U and V matrices in the SVD?
If we want to reduce the dimension of the original matrix, say from n to k, which matrix of CUR can we use to project the original matrix so that we get k-dimensional data points?

There is a paper called Finding Structure with Randomness that addresses some points about all of these decompositions, as well as the SVD, which is covered in Trefethen and Bau.
The interpolative decomposition is used in different places; a paper that explores it is here.
In the SVD, U and V are unitary matrices. In CUR, C is a matrix containing a subset of the columns of A, and R a subset of its rows.
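For illustration only, here is a minimal sketch of a naive CUR in R, assuming uniform sampling of rows and columns (practical CUR algorithms usually sample by leverage scores); all names below are made up for this example. Since A ≈ C U R, the n x k matrix C %*% U plays the role that U_k %*% diag(d_k) plays in a truncated SVD, so it gives the k-dimensional representation of the data points, while R plays the role of t(V_k).

library(MASS)                            # for ginv (Moore-Penrose pseudo-inverse)
set.seed(1)
A  <- matrix(rnorm(200 * 50), 200, 50)   # toy data: 200 points in 50 dimensions
k  <- 5                                  # target dimension
jc <- sample(ncol(A), k)                 # sampled column indices
ir <- sample(nrow(A), k)                 # sampled row indices
C  <- A[, jc, drop = FALSE]              # n x k: actual columns of A
R  <- A[ir, , drop = FALSE]              # k x p: actual rows of A
U  <- ginv(A[ir, jc, drop = FALSE])      # pseudo-inverse of the k x k intersection
A_k    <- C %*% U %*% R                  # rank-k approximation of A
points <- C %*% U                        # n x k coordinates of the data points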

Related

How to get thin QR decomposition in Julia?

When I perform QR decomposition on a 3x2 matrix A in Julia, it gives a 3x3 matrix Q. Is there any way I can get a "thin" version of this QR, where it returns a Q that is 3x2 (same dimensions as matrix A)? My goal is just to get an orthonormal basis for the column space of A, so I don't need a 3x3 matrix Q.
This can be achieved with Matrix(qr(A).Q). qr doesn't return matrices, but rather a factorization object that can multiply by other matrices or from which you can easily extract the thin or full Q matrix.

Efficient Product of 3 Sparse Matrices that creates a dense intermediate

I have three matrices that are all sparse: A, B, and C.
I need the matrix product AB, which results in a dense matrix.
After that, I need the element-wise product of AB and C (element-wise *).
C is sparse, so the element-wise multiplication will zero out most of the dense product AB, resulting in a sparse matrix again.
Knowing that, I am trying to figure out a strategy that avoids materializing all of the dense entries of AB.
If C_{i,j} is 0, then I should not materialize AB_{i,j}. This means I can skip the dot product of row i of A and column j of B. But it seems very inefficient to write a for loop over rows of A to pick out the rows I want to materialize.
Could there be another way to do this multiplication intelligently?
Here is an example data generator in R, although my real product AB is denser than what this generator produces. FWIW, help from any programming language would be useful, not necessarily R. (Eigen would be great though!)
require(Matrix)
n = 10000
p = 100
A = rsparsematrix(n, p, .1)
B = rsparsematrix(p, p, .1)
C = rsparsematrix(n, p, .1)
This is pretty closely related to triangle counting. If A, B, and C were all binary matrices, then you could interpret them as the adjacency matrices of a tripartite graph and count, for each edge in C, how many triangles it belongs to.
Perhaps there is a triangle-counting or community-detection library in R that could be adapted to your use case.
Underneath such a library is likely the following trick (which I should have a citation for, but don't offhand). It involves sorting the nodes of the graph by degree and directing every edge from its low-degree endpoint to its high-degree one. Then, for each node, you test each pair of outgoing edges (a wedge) for the edge that would complete the triangle.
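For what it's worth, the strategy described in the question (skip every dot product where C is zero) can also be organized column by column with the Matrix package, so that only the entries of AB lying on the nonzero pattern of C are ever evaluated. This is only a sketch, not a tuned implementation, and masked_product is a made-up name for this example.

library(Matrix)
masked_product <- function(A, B, C) {
  # compute (A %*% B) * C without materializing the dense product A %*% B
  trip <- summary(C)                      # nonzeros of C in triplet form (i, j, x)
  vals <- numeric(nrow(trip))
  for (j in unique(trip$j)) {             # loop over columns of C that contain nonzeros
    idx  <- which(trip$j == j)
    rows <- trip$i[idx]
    # only the needed dot products: selected rows of A against column j of B
    vals[idx] <- as.vector(A[rows, , drop = FALSE] %*% B[, j]) * trip$x[idx]
  }
  sparseMatrix(i = trip$i, j = trip$j, x = vals, dims = dim(C))
}
D <- masked_product(A, B, C)              # same sparsity pattern as C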

How to solve the SVD problem with the constraint that the subspaces spanned by the first K left and right singular vectors are the same?

I'm trying to solve the problem of finding the K leading 'singular vectors' of a matrix. Unlike the standard SVD problem, I need the subspace spanned by the left 'singular vectors' and the subspace spanned by the right 'singular vectors' to be the same. The objective function I consider is
my objective function
which is equivalent to
equivalent objective function
Here M is a matrix with real-valued entries that is not necessarily symmetric, K is smaller than n, and $\|\cdot\|_*$ denotes the nuclear norm.

Solving a singular system

I am trying to write a small unwrapper for meshes. It uses a finite-element method to solve for minimal linear stress between the flattened and the raw surface. At the moment some vertices are pinned to get a result; without this, the triangles end up arbitrarily rotated and translated...
But as this pinning isn't really part of the problem, the better solution would be to solve the singular system directly. PETSc provides some methods to solve a singular system when you supply information about its null space: http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf#section.4.6 I wonder if there is any alternative for this in Eigen. If not, are there other possibilities to solve this problem without fixing/pinning vertices?
Thanks and kind regards.
See also this link for further information:
dev history
Eigen provides an algorithm for SVD decomposition: Jacobi SVD.
The SVD gives the null space. Following the notation of the Wikipedia article, let M = U D V^* be the SVD of M, where D is a diagonal matrix of the singular values. Then, from the section Range, null space and rank:
The right-singular vectors [V] corresponding to vanishing singular values of M span the null space of M
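As a small language-agnostic illustration (written here in R; the same idea carries over to Eigen's JacobiSVD), a null-space basis can be read off from the right singular vectors whose singular values are numerically zero. The tolerance below is a common heuristic, not anything specific to Eigen.

M   <- rbind(c(1, 2, 3),
             c(2, 4, 6),                           # 2 * the first row, so M is rank deficient
             c(1, 0, 1))
s   <- svd(M)
tol <- max(dim(M)) * .Machine$double.eps * s$d[1]  # threshold for 'vanishing' singular values
N   <- s$v[, s$d < tol, drop = FALSE]              # these columns span the null space of M
# M %*% N is numerically zero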

How to accelerate the computation of leverage (diagonals of the hat matrix) in least squares regression?

For a robust fitting problem, I want to find outliers by their leverage values, which are the diagonal elements of the 'hat' matrix. Let the data matrix be X (n * p); the hat matrix is:
Hat = X(X'X)^{-1}X'
where X' is the transpose of X.
When n is large, the hat matrix is huge (n * n), so computing it is time-consuming. Is there a faster way to compute just the leverage values?
You did not specify a programming language, so I will only focus on the algorithm part.
If you have fitted your least-squares problem with orthogonal methods like QR factorization or SVD, then the hat matrix has a simple form. You may check out my answer Compute projection / hat matrix via QR factorization, SVD (and Cholesky factorization?) for the explicit form of the hat matrix (written in LaTeX). Note that the OP there wants the complete hat matrix, so I did not demonstrate how to efficiently compute only the diagonal elements, but it is really straightforward. For orthogonal methods the hat matrix ends up in the form QQ'. The diagonal entries are row-wise inner products; cross products between different rows give the off-diagonal entries. In R, such row-wise inner products can be computed as rowSums(Q ^ 2).
My answer How to compute diag(X %*% solve(A) %*% t(X)) efficiently without taking matrix inverse? is set in a more general framework; the hat matrix is the special case with A = X'X. That answer focuses on triangular factorizations like the Cholesky and LU factorizations, and shows how to compute only the diagonal elements. You will see colSums rather than rowSums there, because in that setting the hat matrix ends up in the form Q'Q.
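To make the two routes concrete, here is a small sketch in R, assuming X has full column rank (the data below are made up):

set.seed(42)
n <- 10000; p <- 20
X <- matrix(rnorm(n * p), n, p)

# QR route: H = QQ', so the leverages are the squared row norms of the thin Q
Q  <- qr.Q(qr(X))
h1 <- rowSums(Q ^ 2)

# Cholesky route: X'X = R'R and H = (X R^{-1})(X R^{-1})',
# so the leverages are the squared column norms of R'^{-1} X'
R  <- chol(crossprod(X))
h2 <- colSums(forwardsolve(t(R), t(X)) ^ 2)

all.equal(h1, h2)                      # both routes agree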
Finally, I would like to point out something statistical: high leverage alone does not signal an outlier. It is the combination of high leverage and a large residual (i.e., a high Cook's distance) that signals outliers.
