SVD with missing terms - algorithm

Given a real matrix A such that:
A is symmetric
All the off-diagonal terms are known and positive
All the diagonal terms are missing
A has rank k
I would like to find the best possible completion of A, called Ac, such that (approximately) rank(Ac)=k.
The matrix A can be huge (say n > 100000), so I need a method that runs in at most O(n^3) time.
To do this, I am thinking of an SVD decomposition with missing terms:
I decompose A, then reconstruct it from the first k singular vectors.
My question is: is there any reliable result about SVD when the matrix to be decomposed has missing terms?
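For concreteness, here is a minimal numpy sketch of the scheme described above (guess the diagonal, take the rank-k truncated SVD, copy its diagonal back into A, repeat until the diagonal stabilises); the function name, the initial guess and the stopping rule are my own choices, not part of the question:

    import numpy as np

    def complete_diagonal(A_off, k, iters=50, tol=1e-8):
        # A_off: symmetric matrix with the unknown diagonal set to zero
        # k: target rank
        Ac = A_off.astype(float).copy()
        n = Ac.shape[0]
        # crude initial guess: mean of each row's off-diagonal entries
        np.fill_diagonal(Ac, A_off.sum(axis=1) / (n - 1))
        for _ in range(iters):
            # rank-k truncated SVD of the current completion
            U, s, Vt = np.linalg.svd(Ac, full_matrices=False)
            Ak = (U[:, :k] * s[:k]) @ Vt[:k, :]
            new_diag = np.diag(Ak)
            if np.max(np.abs(new_diag - np.diag(Ac))) < tol:
                break
            # keep the known off-diagonal entries, update only the diagonal
            np.fill_diagonal(Ac, new_diag)
        return Ac

For n on the order of 10^5 a dense SVD is of course out of reach; the same loop can be run with a truncated solver such as scipy.sparse.linalg.svds or eigsh, which only needs the top k factors at each iteration.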

There's a close connection between this and the maximum cut problem. The usual semidefinite relaxation of the maximum cut problem involves minimising the trace of such an Ac subject to the constraint that Ac is positive semidefinite. I have found Christoph Helmberg's ConicBundle code particularly well-suited for computing numerical solutions to these problems.
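For illustration only, the trace-minimisation problem mentioned above can be written down directly in cvxpy (fix the known off-diagonal entries, require Ac to be positive semidefinite, minimise the trace). This toy sketch will not scale to n > 100000, which is exactly where a bundle code such as ConicBundle comes in:

    import numpy as np
    import cvxpy as cp

    def min_trace_completion(A_off):
        # A_off: symmetric matrix holding the known off-diagonal entries (diagonal arbitrary)
        n = A_off.shape[0]
        M = 1.0 - np.eye(n)                    # mask selecting the off-diagonal positions
        X = cp.Variable((n, n), PSD=True)      # the completion, constrained to be PSD
        constraints = [cp.multiply(M, X) == M * A_off]
        prob = cp.Problem(cp.Minimize(cp.trace(X)), constraints)
        prob.solve()
        return X.value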

Related

Combinatoric Vector Addition Minimization Problem

I'm working on a problem, and it feels like it might be analogous to an existing problem in mathematical programming, but I'm having trouble finding any such problem.
The problem goes like this: We have n sets of d-dimensional vectors, such that each set contains exactly d+1 vectors. Within each set, all vectors have the same length (furthermore, the angle between any two vectors in a set is the same for every set, but I'm not sure whether this is relevant). We then need to choose exactly one vector out of every set and compute the sum of these vectors. Our objective is to make our choices so that the norm of that sum is minimized.
It feels like the problem is sort of related to the Shortest Vector Problem, or a variant of job scheduling, where scheduling a job on a machine affects all machines, or a partition problem.
Does this problem ring a bell? Specifically, I'm looking for research into solving this problem, as currently my best bet is using an ILP, but I feel there must be something more clever that can be done.
I think this is an MIQP (Mixed Integer Quadratic Programming) or MISOCP (Mixed Integer Second-Order Cone Programming) problem:
Let
v(i,j) be the i-th vector in group j (data)
x(i,j) in {0,1}: binary decision variables
w: sum of selected vectors (decision variable)
Then the problem can be stated as:
min ||w||
sum(i, x(i,j)) = 1 for all j
w = sum((i,j), x(i,j)*v(i,j))
If you want, you can substitute out w. Note that I don't use your angle restriction (it is a restriction on the data and not on the decision variables of the model). The x variables are chosen such that we select exactly one vector from each group.
Minimizing the 2-norm can be replaced by minimizing the sum of the squares of the elements (i.e. minimizing the square of the norm).
Assuming the 2-norm, this is a MISOCP or convex MIQP problem for which quite a few solvers are available. For the 1-norm and infinity-norm we can formulate a linear MIP problem; MIP solvers are widely available.
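Here is a small cvxpy sketch of the model above, assuming the data is packed into a numpy array V of shape (number of groups, d+1, dimension); the function name is mine, and a mixed-integer-capable solver has to be installed for the solve call to succeed:

    import numpy as np
    import cvxpy as cp

    def pick_one_per_group(V):
        # V[j, i, :] is the i-th vector of group j
        n_groups, n_per_group, dim = V.shape
        x = cp.Variable((n_groups, n_per_group), boolean=True)
        # w = sum of the selected vectors
        w = sum(x[j, i] * V[j, i]
                for j in range(n_groups) for i in range(n_per_group))
        constraints = [cp.sum(x, axis=1) == 1]   # pick exactly one vector from each group
        prob = cp.Problem(cp.Minimize(cp.sum_squares(w)), constraints)
        prob.solve()   # needs a MIQP/MISOCP-capable solver (e.g. SCIP or GUROBI)
        return x.value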

Computing the inverse of a polynomial matrix

I am working with a system of the following structure:
L(k,m) = A2 k^2 + A1 k + A0 - m B
I have the matrices (A2, A1, A0, and B) numerically and would like to compute coefficient matrices for L^-1 so that I can evaluate L^-1 for a given combination (k,m) without computing a matrix inverse each time. Could someone point me in the right direction for this type of algorithm / manipulation? I'm not even sure I know the correct terms to search the linear algebra literature for. I'm using MATLAB.
You can see from http://en.wikipedia.org/wiki/Invertible_matrix#Analytic_solution that the inverse of a matrix can be written as a matrix of sub-determinants divided by the determinant, so its entries are rational functions - one polynomial divided by another. Given that you know this, and that you can work out the order of the polynomials involved, it should in theory be possible to recover them, for example by fitting a rational function of the correct order to inverses computed at a finite number of points. You could then work out more inverses by evaluating the rational functions you found, instead of doing an explicit inverse.
However, note that, as in the three-by-three example worked out on that page, the determinant is a sum of products of entries, so in your case it will be a polynomial of degree six in k, with cross terms like k^4 m. I suspect that this will save little or no time over computing the inverse as usual, and be numerically unstable to boot. However, it does point out that any formula doing this will also be quite complex, as it amounts to working out a rational function of quite high degree.
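To make the fitting idea concrete, here is a rough numpy sketch for the one-parameter slice L(k) = A2 k^2 + A1 k + A0 (adding the m-dependence means fitting in the monomials k^a m^b instead of powers of k alone). The degree bounds follow from the adjugate/determinant discussion above, the function names are mine, and the conditioning caveat from the previous paragraph applies in full:

    import numpy as np

    def fit_inverse_1param(A2, A1, A0, n_samples=None):
        # Fit polynomial coefficients so that inv(L(k)) ~= adj_poly(k) / det_poly(k),
        # where L(k) = A2*k**2 + A1*k + A0.
        n = A0.shape[0]
        deg_det = 2 * n            # det(L) has degree at most 2n in k
        deg_adj = 2 * (n - 1)      # each adjugate entry has degree at most 2(n-1)
        m = n_samples or deg_det + 1
        ks = np.linspace(-1.0, 1.0, m)           # sample points (rescale to your k range)
        dets = np.empty(m)
        adjs = np.empty((m, n, n))
        for idx, k in enumerate(ks):
            L = A2 * k**2 + A1 * k + A0
            dets[idx] = np.linalg.det(L)
            adjs[idx] = dets[idx] * np.linalg.inv(L)   # adjugate = det * inverse
        det_coef = np.polyfit(ks, dets, deg_det)
        adj_coef = np.array([[np.polyfit(ks, adjs[:, i, j], deg_adj)
                              for j in range(n)] for i in range(n)])
        return adj_coef, det_coef

    def eval_inverse(adj_coef, det_coef, k):
        n = adj_coef.shape[0]
        num = np.array([[np.polyval(adj_coef[i, j], k) for j in range(n)]
                        for i in range(n)])
        return num / np.polyval(det_coef, k)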
There are some matrix identities used to avoid recalculation of matrix inverses, such as http://en.wikipedia.org/wiki/Binomial_inverse_theorem. I don't think this is directly applicable to your case, but there might be something there, especially if your A and B matrices are not of full rank.

Efficient Computation of The Least Fixed Point of A Polynomial

Let P(x) denote the polynomial in question. The least fixed point (LFP) of P is the lowest value of x such that x = P(x). The polynomial has real coefficients. There is no guarantee in general that an LFP will exist, although one is guaranteed to exist if the degree is odd and ≥ 3. I know of an efficient solution if the degree is 3: from x = P(x) we get 0 = P(x) - x, and since there is a closed-form cubic formula, solving for x is somewhat trivial and can be hardcoded. Degrees 2 and 1 are similarly easy. It's the more complicated cases that I'm having trouble with, since I can't seem to come up with a good algorithm for arbitrary degree.
EDIT:
I'm only considering real fixed points and taking the least among them, not necessarily the fixed point with the least absolute value.
Just find a root of f(x) = P(x) - x using your favorite numerical method. For example, you could iterate
x_{n + 1} = x_n - P(x_n) / (P'(x_n) - 1).
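As a concrete sketch of that iteration (assuming P is given by its coefficient array, highest power first, and a starting guess x0 of your choosing):

    import numpy as np

    def fixed_point_newton(p_coeffs, x0, tol=1e-12, max_iter=100):
        # Newton iteration for x = P(x), i.e. a root of f(x) = P(x) - x
        dp_coeffs = np.polyder(p_coeffs)
        x = x0
        for _ in range(max_iter):
            step = (np.polyval(p_coeffs, x) - x) / (np.polyval(dp_coeffs, x) - 1.0)
            x -= step
            if abs(step) < tol:
                break
        return x

Note that this converges to some fixed point depending on the starting guess, not necessarily the least one, which is why the root-finding discussion below matters.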
You won't find a closed-form formula in general, because there is no closed-form formula for the roots of quintic and higher-degree polynomials. Thus, for quintic and higher degree you have to use a numerical method of some sort.
Since you want the least fixed point, you can't get away without finding all real roots of P(x) - x and selecting the smallest.
Finding all the roots of a polynomial is a tricky subject. If you have a black box routine, then by all means use it. Otherwise, consider the following trick:
Form M, the companion matrix of P(x) - x
Find all eigenvalues of M
but this requires that you have access to a routine for finding eigenvalues (which is another tricky problem, but there are plenty of good libraries); see the sketch below.
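numpy's roots routine already packages exactly these two steps (it builds the companion matrix and calls the eigenvalue solver), so a minimal sketch of the whole recipe, for a polynomial of degree at least 1 given by its coefficients with the highest power first, is:

    import numpy as np

    def least_fixed_point(p_coeffs, imag_tol=1e-10):
        # Least real fixed point of P, or None if there is no real fixed point
        f = np.array(p_coeffs, dtype=float)
        f[-2] -= 1.0                      # subtract x: coefficients of P(x) - x
        roots = np.roots(f)               # eigenvalues of the companion matrix
        real = roots[np.abs(roots.imag) < imag_tol].real
        return real.min() if real.size else None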
Otherwise, you can implement the Jenkins-Traub algorithm, which is a highly non-trivial piece of code.
I don't really recommend finding a zero (with e.g. Newton's method) and deflating until you reach degree one: it is very unstable if not done properly, you'll lose a lot of accuracy, and it is very difficult to handle multiple roots with it. The proper way to do it is, in fact, the above-mentioned Jenkins-Traub algorithm.
This problem amounts to finding the "least" root of a polynomial (here I'm not sure if you mean least in magnitude or actually the smallest, which could be the most negative). There is no closed-form solution for polynomials of large degree, but there are myriad numerical approaches to finding roots.
As is often the case, Wikipedia is a good place to begin your search.
If you want to find the smallest root, then you can use Descartes' rule of signs to pin down an interval where it must lie and then use some numerical method to find the roots in that interval.

Optimization problem - vector mapping

A and B are sets of N-dimensional vectors (N = 10), |B| >= |A| (|A| = 10^2, |B| = 10^5). The similarity measure sim(a,b) is the dot product (required). The task is the following: for each vector a in A, find a vector b in B (each b used at most once), such that the sum of similarities ss over all pairs is maximal.
My first attempt was a greedy algorithm:
1. find the pair with the highest similarity and remove that pair from A and B
2. repeat step 1 until A is empty
But such a greedy algorithm is suboptimal, as in this case:
a_1=[1, 0]
a_2=[.5, .4]
b_1=[1, 1]
b_2=[.9, 0]
sim(a_1,b_1)=1
sim(a_1,b_2)=.9
sim(a_2,b_1)=.9
sim(a_2, b_2)=.45
The greedy algorithm returns [a_1, b_1] and [a_2, b_2] with ss = 1.45, but the optimal solution [a_1, b_2], [a_2, b_1] yields ss = 1.8.
Is there an efficient algorithm to solve this problem? Thanks
This is essentially a matching problem in a weighted bipartite graph. Just take the weight function to be the dot product (a·b).
I don't think the special structure of your weight function will simplify the problem a lot, so you're pretty much down to finding a maximum-weight matching.
You can find some basic algorithms for this problem in this Wikipedia article. Although at first glance they don't seem viable for your data (V = 10^5, E = 10^7), I would still research them: some of them might allow you to take advantage of your unbalanced vertex sets, with one part orders of magnitude smaller than the other.
This article also seems relevant, although doesn't list any algorithms.
Not exactly a solution, but hope it helps.
I second Nikita here: it is an assignment (or matching) problem. I'm not sure this is computationally feasible for your problem, but you could use the Hungarian algorithm, also known as Munkres' assignment algorithm, where the cost of assignment (i,j) is the negative of the dot product of a_i and b_j. Unless you happen to know how the elements of A and B are formed, I think this is the most efficient known algorithm for your problem.
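For what it's worth, SciPy ships an assignment-problem solver, and recent versions accept rectangular cost matrices, so the cost matrix for your sizes (100 x 100000, about 10^7 entries) fits in memory. A sketch, with the function name being my own choice:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def best_mapping(A, B):
        # A: (nA, N) array, B: (nB, N) array with nB >= nA.
        # Returns, for each row of A, the index of its assigned row of B,
        # maximising the total dot-product similarity.
        cost = -A @ B.T                               # negate: the solver minimises
        row_ind, col_ind = linear_sum_assignment(cost)
        total_similarity = (A[row_ind] * B[col_ind]).sum()
        return col_ind, total_similarity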

Algorithm that takes 2 'similar' matrices and 'aligns' one to another

First of all, the title is very bad, due to my lack of a concise vocabulary. I'll try to describe what I'm doing and then ask my question again.
Background Info
Let's say I have 2 matrices of size n x m, where n is the number of experimental observation vectors, each of length m (the time series over which the observations were collected). One of these matrices, called S, is the original; the other, called Y, is a reconstructed version of S.
Let's assume that Y properly reconstructs S. However due to the limitations of the reconstruction algorithm, Y can't determine the true amplitude of the vectors in S, nor is it guaranteed to provide the proper sign for those vectors (the vectors might be flipped). Also, the order of the observation vectors in Y might not match the original ordering of the corresponding vectors in S.
My Question
Is there an algorithm or technique to generate a new matrix which is a 'realignment' of Y to S, so that when Y and S are normalized, the algorithm can (1) find the vectors in Y that match the vectors in S and restore the original ordering of the vectors and (2) likewise match the signs of the vectors?
As always, I really appreciate all help given. Thanks!
How about simply calculating the normalized form of each vector in both matrices and comparing? That should give you an exact one-to-one match for each vector in each matrix.
The normalized form of a vector is:
v_norm = v / ||v||
where ||v|| is the Euclidean norm of the vector. For v = (v1, v2, ..., vn) we have ||v|| = sqrt(v1^2 + ... + vn^2).
From there you can reconstruct their order and restore each vector's original length and sign (the vector or its opposite).
The algorithm should be fairly simple from here on; just decide on your implementation. This method should be of quadratic complexity, although per the comment you can get it down to O(n log n). If you need something better than that, i.e. linear complexity, you're going to need a much more complicated algorithm, which I can't think of right now.
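A numpy/SciPy sketch of one way to do the matching and sign fixing (normalise the rows, match on the absolute value of the cosine similarity using an assignment solver, then flip the signs of the matched rows); the helper name is mine and the assignment step is one choice among several:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def realign(S, Y):
        # Reorder and sign-flip the rows of Y so that they line up with the rows of S
        S_n = S / np.linalg.norm(S, axis=1, keepdims=True)
        Y_n = Y / np.linalg.norm(Y, axis=1, keepdims=True)
        corr = S_n @ Y_n.T                    # cosine similarity between all row pairs
        # match on |correlation| so a flipped vector still matches its original
        row, col = linear_sum_assignment(-np.abs(corr))
        signs = np.sign(corr[row, col])       # flip rows whose best match is negative
        return signs[:, None] * Y[col], col, signs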
