I'm working on a problem, and it feels like it might be analogous to an existing problem in mathematical programming, but I'm having trouble finding any such problem.
The problem goes like this: we have n sets of d-dimensional vectors, such that each set contains exactly d+1 vectors. Within each set, all vectors have the same length (furthermore, the angle between any two vectors in a set is the same across all sets, but I'm not sure whether this is relevant). We then need to choose exactly one vector out of every set and compute the sum of these vectors. Our objective is to make our choices so that the length of the resulting sum is minimized.
It feels like the problem is sort of related to the Shortest Vector Problem, or a variant of job scheduling, where scheduling a job on a machine affects all machines, or a partition problem.
Does this problem ring a bell? Specifically, I'm looking for research into solving this problem, as currently my best bet is using an ILP, but I feel there must be something more clever that can be done.
I think this is an MIQP (Mixed-Integer Quadratic Programming) or MISOCP (Mixed-Integer Second-Order Cone Programming) problem:
Let

    v(i,j)          : vector i in group j (data)
    x(i,j) in {0,1} : binary decision variables
    w               : sum of selected vectors (decision variable)

Then the problem can be stated as:

    min  ||w||
    s.t. sum(i, x(i,j)) = 1            for all j
         w = sum((i,j), x(i,j)*v(i,j))
If you want you can substitute out w. Note that I don't use your angle restriction: it is a restriction on the data, not on the decision variables of the model. The x variables are chosen such that we select exactly one vector from each group.
Minimizing the 2-norm can be replaced by minimizing the sum of the squares of the elements (i.e. minimizing the square of the norm).
Assuming the 2-norm, this is a MISOCP or a convex MIQP problem, for which quite a few solvers are available. For the 1-norm and the infinity-norm we can formulate a linear MIP problem; MIP solvers are widely available.
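To make this concrete, here is a minimal sketch of the squared-2-norm (convex MIQP) variant in Python with cvxpy. The data and dimensions are made up for illustration, and a mixed-integer-capable solver must be installed:

    import numpy as np
    import cvxpy as cp

    # Made-up data: n groups, each containing d+1 vectors in R^d
    rng = np.random.default_rng(0)
    n, d = 5, 3
    V = rng.standard_normal((n, d + 1, d))   # V[j, i] is vector i of group j

    x = cp.Variable((n, d + 1), boolean=True)   # x[j, i] = 1 iff vector i of group j is chosen
    w = sum(V[j].T @ x[j] for j in range(n))    # sum of the selected vectors

    prob = cp.Problem(cp.Minimize(cp.sum_squares(w)),   # minimize ||w||^2
                      [cp.sum(x, axis=1) == 1])         # exactly one vector per group
    prob.solve()   # requires a mixed-integer solver, e.g. SCIP or GUROBI
    print(prob.value, np.argmax(x.value, axis=1))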
I have a set of N (N is very large) linear equations with W variables.
For efficiency's sake, I need to find the smallest set of linear equations that is solvable (has a unique solution). It can be assumed that a set of X equations containing Y variables has a unique solution when X == Y.
For example, if I have the following as input:
2a = b - c
a = 0.5b
b = 2 + a
I want to return the equation set:
a = 0.5b
b = 2 + a
Currently, I have an implementation that uses some heuristics. I create a matrix whose columns are variables and whose rows are equations. I search the matrix for a set of fully connected equations, then try removing equations one by one to see whether the remaining set is still solvable; if it is, I continue removing, and if not, I return the set of equations.
Is there a known algorithm for this, or am I trying to reinvent the wheel?
Does anyone have input on how to better approach this?
Thanks.
Short answer is "yes", there are known algorithms. For example, you could add a single equation and then compute the rank of the matrix. Then add the next equation and compute the rank again. If it hasn't gone up, the new equation isn't adding information and you can discard it. Once the rank equals the number of variables you have a unique solution and you're done. There are libraries (e.g. Colt, JAMA, la4j, etc.) that will do this for you.
Longer answer is that this is surprisingly difficult to do correctly, especially if your matrix gets big. You end up with lots of numerical stability issues and so on. I'm not a numerical linear algebra expert but I know enough to know there are dragons here if you're not careful. Having said that, if your matrices are small and "well conditioned" (the rows/columns aren't almost parallel) then you should be in good shape. It depends on your application.
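For what it's worth, here is a minimal numpy sketch of the greedy rank check from the short answer; the tolerance hidden inside matrix_rank is exactly where the conditioning issues above come in:

    import numpy as np

    def minimal_solvable_subset(A):
        # A: coefficient matrix, rows = equations, columns = variables.
        # Greedily keep an equation (row) only if it raises the rank; returns
        # row indices of a subset with a unique solution, if the full system has one.
        kept = []
        for i in range(A.shape[0]):
            if np.linalg.matrix_rank(A[kept + [i]]) > len(kept):
                kept.append(i)              # rank went up: equation adds information
            if len(kept) == A.shape[1]:     # rank == number of variables: done
                break
        return kept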
Premise
I have a system of linear equations
dot(A,x) = y
whose solutions have many degrees of freedom: indeed, the number of linearly independent equations (E) is less than the dimension of x, i.e. the number of variables (N).
The remaining degrees of freedom constrain the solutions to an (N-E)-dimensional hyperplane of the overall space R^N. Given the (unimportant) characteristics of A, I am always able to write the solutions x (an N x 1 vector) as
x=dot(B,t)+q
where B is an N x (N-E) matrix, t an (N-E) x 1 vector and q an N x 1 vector. This defines, in parametric form, the hyperplane of solutions of my original problem A x = y.
I need to extract a random solution, with uniform probability over every possible point of the hyperplane, such that all components of x are positive (we will refer to this as a positive solution). Note that, for the specific problem I am dealing with, the space of positive solutions exists and is bounded (that is what makes the notion of uniform probability reasonable here, to clarify as suggested by Petr's comment). In the beginning, once I was able to write x = Bt + q, I thought this would be extremely simple. Now I am starting to doubt it.
Proposed Solution
For now, I do something like this:

1. For each dimension i in range(N-E), I compute the minimum and maximum values of t[i]: t_min[i] and t_max[i]. These intervals are big enough not to exclude any possible positive solution; they can always be computed algebraically and define a bounded box.
2. I extract N-E uniform random values t[i], each between t_min[i] and t_max[i].
3. I compute x = dot(B,t) + q.
4. If all x[j] are positive, I accept the solution. If some x[j] is negative, I go back to step 2.
An example for a two-dimensional N-E space is shown in the figure below.

[Figure: a problem in N dimensions reduced to an N-E = 2 space. The yellow diamond is the set of positive solutions of the N-dimensional problem; points are sampled uniformly in the orange box between (t1_min, t2_min) and (t1_max, t2_max) until one falls inside the yellow diamond.]
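In code, the loop above looks roughly like this (a minimal numpy sketch; B, q, t_min and t_max are assumed to be given as described in steps 1-4):

    import numpy as np

    def sample_positive_solution(B, q, t_min, t_max, max_tries=1_000_000):
        # x = B @ t + q parametrises the solution hyperplane;
        # t_min, t_max are per-dimension bounds enclosing all positive solutions
        rng = np.random.default_rng()
        for _ in range(max_tries):
            t = rng.uniform(t_min, t_max)   # step 2: uniform draw in the box
            x = B @ t + q                   # step 3: map back to x-space
            if np.all(x > 0):               # step 4: accept only positive solutions
                return x
        raise RuntimeError("acceptance ratio too low; no positive sample found")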
I think it is a good enough solution, but...
Problem
When N-E is big, the fraction of the bounding hyper-box occupied by the diamond of solutions can be small. In general the acceptance probability scales like r^(N-E) for some ratio r < 1, so it can become vanishingly small. How small?
While an infinite number of positive solutions to the original problem certainly exists, their set can have measure zero in the (N-E)-dimensional space. This happens when every positive solution of the original problem has some component of x equal to 0: the borders of the diamond make contact, collapsing the diamond of solutions into a line. Of course you will never randomly pick a point EXACTLY on a line in 2D, let alone in 5D.
An obvious idea would be to further reduce the dimensionality from N-E to a smaller number, i.e. to sample points directly from the aforementioned line instead of from the square. The algebra is not easy, but I'm working on it; I'm not positive I will be able to solve it.
Note that first choosing one dimension (for example t1), computing the new limits of t2 conditional on the extracted value of t1, and then extracting a value of t2 within those limits is much faster, but does not give uniform probability over all the possible solutions.
I know that the problem is very specific, but even some general ideas or thoughts would be gladly received. I wonder whether there is some computational technique to sample the solution directly inside the diamond...
I am working with a system of the following structure:
L(k,m) = A2 k^2 + A1 k + A0 - m B
I have the matrices (A2, A1, A0, and B) numerically and would like to compute coefficient matrices for L^-1 so that I can evaluate L^-1(k,m) for a given combination (k,m) without computing a matrix inverse each time. Could someone point me in the right direction for this type of algorithm/manipulation? I'm not even sure I know the correct terms to search the linear algebra literature for. I'm using MATLAB.
You can see from http://en.wikipedia.org/wiki/Invertible_matrix#Analytic_solution that the inverse of a matrix can be written as a matrix of sub-determinants divided by the determinant, so its entries are rational functions - one polynomial divided by another. Given that you know this, and that you can work out the order of the polynomials involved, it should in theory be possible to recover them, for example by fitting a rational function of the correct order to inverses computed at a finite number of points. You could then work out more inverses by evaluating the rational functions you found, instead of doing an explicit inverse.
However, note that the determinant (see the three-by-three example worked out on that page) is a sum of triple products, so in your case it will be a polynomial of degree six in k, with cross terms like k^4 m. I suspect that this will save little or no time over computing the inverse as usual, and be numerically unstable to boot. However it does point out that any formula doing this will be quite complex, as it amounts to working out a rational function of quite high degree.
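If you'd rather not fit anything, a computer algebra system can recover those rational functions exactly by inverting L symbolically once. You mentioned MATLAB, where the Symbolic Math Toolbox would play the same role; here is a minimal Python/sympy sketch with small made-up matrices, keeping in mind that expression size and numerical stability degrade quickly as the matrices grow:

    import numpy as np
    import sympy as sp

    # Made-up numerical data standing in for the real A2, A1, A0, B
    n = 3
    rng = np.random.default_rng(1)
    A2, A1, A0, B = (rng.standard_normal((n, n)) for _ in range(4))

    k, m = sp.symbols('k m')
    L = sp.Matrix(A2) * k**2 + sp.Matrix(A1) * k + sp.Matrix(A0) - m * sp.Matrix(B)

    L_inv = L.inv()                                 # entries: rational functions of (k, m)
    L_inv_fn = sp.lambdify((k, m), L_inv, 'numpy')  # compile once for fast numeric evaluation

    print(L_inv_fn(0.7, 1.3))                       # evaluate without a per-call matrix inverse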
There are some matrix identities used to avoid recalculation of matrix inverses, such as http://en.wikipedia.org/wiki/Binomial_inverse_theorem. I don't think this is directly applicable to your case, but there might be something there, especially if your A and B matrices are not of full rank.
I have a problem involving 3d positioning - sort of like GPS. Given a set of known 3d coordinates (x,y,z) and their distances d from an unknown point, I want to find the unknown point. There can be any number of reference points, however there will be at least four.
So, for example, points are in the format (x,y,z,d). I might have:
(1,0,0,1)
(0,2,0,2)
(0,0,3,3)
(0,3,4,5)
And here the unknown point would be (0,0,0).
What would be the best way to go about this? Is there an existing library that supports 3d multilateration? (I have been unable to find one). Since it's unlikely that my data will have an exact solution (all of the 4+ spheres probably won't have a single perfect intersect point), the algorithm would need to be capable of approximating it.
So far, I was thinking of taking each subset of three points, triangulating the unknown based on those three, and then averaging all of the results. Is there a better way to do this?
You could take a non-linear optimisation approach, by defining a "cost" function that incorporates the distance error from each of your observation points.
Setting the unknown point at (x,y,z), and considering a set of N observation points (xi,yi,zi,di) the following function could be used to characterise the total distance error:
C(x,y,z) = sum( ((x-xi)^2 + (y-yi)^2 + (z-zi)^2 - di^2)^2 )

where the sum runs over all observation points i = 1 to N.
This is the sum of the squared distance errors for all points in the set. (It's actually based on the error in the squared distance, so that there are no square roots!)
When this function is at a minimum the target point (x,y,z) would be at an optimal position. If the solution gives C(x,y,z) = 0 all observations would be exactly satisfied.
One approach to minimising this type of function is Newton's method. You'd have to provide an initial starting point for the iteration - possibly the mean of the observation points (if they encircle (x,y,z)) or possibly an initial triangulated value from any three observations.
Edit: Newton's method is an iterative algorithm that can be used for optimisation. A simple version would work along these lines:
    H(X(k)) * dX = G(X(k));   // solve a system of linear equations for the
                              // increment dX in the solution vector X
    X(k+1) = X(k) - dX;       // update the solution vector by dX
The G(X(k)) denotes the gradient vector evaluated at X(k), in this case:
    G(X(k)) = [ dC/dx
                dC/dy
                dC/dz ]
The H(X(k)) denotes the Hessian matrix evaluated at X(k), in this case the symmetric 3x3 matrix:
    H(X(k)) = [ d^2C/dx^2   d^2C/dxdy   d^2C/dxdz
                d^2C/dydx   d^2C/dy^2   d^2C/dydz
                d^2C/dzdx   d^2C/dzdy   d^2C/dz^2 ]
You should be able to differentiate the cost function analytically, and therefore end up with analytical expressions for G,H.
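For example, with r_i = (x-xi)^2 + (y-yi)^2 + (z-zi)^2 - di^2, differentiation gives G = 4 * sum_i r_i (X - P_i) and H = 8 * sum_i (X - P_i)(X - P_i)^T + 4 * (sum_i r_i) * I. A minimal numpy sketch of the iteration (derivatives worked out by hand here, so double-check them; there is no damping or line search, which a robust implementation would add):

    import numpy as np

    def locate(points, x0, iters=20):
        # points: (N, 4) rows (xi, yi, zi, di); x0: (3,) initial guess
        pts = np.asarray(points, float)
        P, d = pts[:, :3], pts[:, 3]
        x = np.asarray(x0, float)
        for _ in range(iters):
            r = np.sum((x - P)**2, axis=1) - d**2          # residuals of squared distances
            G = 4.0 * ((x - P) * r[:, None]).sum(axis=0)   # gradient of C at x
            H = (8.0 * np.einsum('ni,nj->ij', x - P, x - P)
                 + 4.0 * r.sum() * np.eye(3))              # Hessian of C at x
            x = x - np.linalg.solve(H, G)                  # H * dX = G; X <- X - dX
        return x

    pts = [(1,0,0,1), (0,2,0,2), (0,0,3,3), (0,3,4,5)]
    # start from the mean of the observation points; Newton can diverge from poor starts
    print(locate(pts, x0=np.mean(np.asarray(pts)[:, :3], axis=0)))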
Another approach - if you don't like derivatives - is to approximate G,H numerically using finite differences.
Hope this helps.
Non-linear solution procedures are not required. You can easily linearise the system. If you take pair-wise differences
$(x-x_i)^2-(x-x_j)^2+(y-y_i)^2-(y-y_j)^2+(z-z_i)^2-(z-z_j)^2=d_i^2-d_j^2$
then a bit of algebra yields the linear equations
$(x_i-x_j)\,x + (y_i-y_j)\,y + (z_i-z_j)\,z = -\tfrac{1}{2}\left(d_i^2 - d_j^2 - ds_i^2 + ds_j^2\right)$,
where $ds_i$ is the distance from the $i^{th}$ sensor to the origin. These are the equations of the planes defined by intersecting the $i^{th}$ and the $j^{th}$ spheres.
For four sensors you obtain an over-determined linear system with $\binom{4}{2} = 6$ equations. If $A$ is the resulting matrix and $b$ the corresponding right-hand-side vector, then you can solve the normal equations
$A^T A r = A^T b$
for the position vector $r$. This will work as long as your sensors are not coplanar.
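A minimal numpy sketch of this construction (lstsq solves the same least-squares problem as the normal equations, a bit more stably; the function name is mine):

    import numpy as np
    from itertools import combinations

    def multilaterate(points):
        # points: iterable of (x, y, z, d) tuples; returns the least-squares position
        pts = np.asarray(points, float)
        S, d = pts[:, :3], pts[:, 3]
        ds2 = np.sum(S**2, axis=1)                   # squared sensor-to-origin distances
        rows, rhs = [], []
        for i, j in combinations(range(len(S)), 2):  # one plane equation per sensor pair
            rows.append(S[i] - S[j])
            rhs.append(0.5 * (ds2[i] - ds2[j] - d[i]**2 + d[j]**2))
        r, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
        return r

    print(multilaterate([(1,0,0,1), (0,2,0,2), (0,0,3,3), (0,3,4,5)]))  # ~ [0 0 0]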
If you can spend the time, an iterative solution should approach the correct answer pretty quickly. Pick any point at the correct distance from site A, then go round the set: work out the distance from the current point to each site and adjust the point so that it lies in the same direction from that site but at the correct distance. Continue until your required precision is met (or until the point stops moving far enough per iteration to ever meet your precision, which can happen with approximate input data).
For an analytic approach, I can't think of anything better than what you already propose.
Let P(x) denote the polynomial in question. The least fixed point (LFP) of P is the lowest value of x such that x = P(x). The polynomial has real coefficients. There is no guarantee in general that an LFP exists, although one is guaranteed to exist if the degree is odd and ≥ 3. I know of an efficient solution when the degree is 3: x = P(x) means 0 = P(x) - x, and the closed-form cubic formula makes solving for x somewhat trivial and easy to hardcode. Degrees 2 and 1 are similarly easy. It's the more complicated cases that I'm having trouble with, since I can't seem to come up with a good algorithm for arbitrary degree.
EDIT:
I'm only considering real fixed points and taking the least among them, not necessarily the fixed point with the least absolute value.
Just solve f(x) = P(x) - x using your favorite numerical method. For example, you could iterate
x_{n+1} = x_n - (P(x_n) - x_n) / (P'(x_n) - 1).
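For instance, a minimal sketch with numpy's poly1d (the polynomial here is made up; note this finds *a* fixed point near the starting guess, not necessarily the least one):

    import numpy as np

    P = np.poly1d([1, 0, -6, 1])   # example: P(x) = x^3 - 6x + 1
    dP = P.deriv()

    x = 0.0                        # starting guess
    for _ in range(100):
        step = (P(x) - x) / (dP(x) - 1.0)   # Newton step on f(x) = P(x) - x
        x -= step
        if abs(step) < 1e-12:
            break
    print(x, P(x))                 # at convergence, P(x) ≈ x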
You won't find a closed-form formula in general because there is none for the roots of quintic and higher-degree polynomials. Thus, for quintic and higher degrees you have to use a numerical method of some sort.
Since you want the least fixed point, you can't get away without finding all real roots of P(x) - x and selecting the smallest.
Finding all the roots of a polynomial is a tricky subject. If you have a black-box routine, then by all means use it. Otherwise, consider the following trick:

1. Form M, the companion matrix of P(x) - x.
2. Find all eigenvalues of M; these are the roots.

but this requires access to a routine for finding eigenvalues (which is another tricky problem, but there are plenty of good libraries).
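numpy's roots does exactly this under the hood (it builds the companion matrix and computes its eigenvalues), so a minimal sketch of the whole least-fixed-point computation is:

    import numpy as np

    def least_fixed_point(coeffs):
        # coeffs: coefficients of P, highest degree first; assumes degree >= 1
        f = np.array(coeffs, float)
        f[-2] -= 1.0                                  # form P(x) - x
        roots = np.roots(f)                           # companion-matrix eigenvalues
        real = roots[np.abs(roots.imag) < 1e-9].real  # keep numerically real roots
        return real.min() if real.size else None      # least real fixed point, if any

    print(least_fixed_point([1, 0, -6, 1]))  # example: P(x) = x^3 - 6x + 1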
Otherwise, you can implement the Jenkins-Traub algorithm, which is a highly non-trivial piece of code.
I don't really recommend finding a zero (e.g. with Newton's method) and deflating until you reach degree one: it is very unstable if not done properly, and you'll lose a lot of accuracy (and it is very difficult to tackle multiple roots this way). The proper way to do it is, in fact, the above-mentioned Jenkins-Traub algorithm.
This problem amounts to finding the "least" root of the polynomial P(x) - x (I'm not sure whether you mean least in magnitude or actually the smallest, which could be the most negative). There is no closed-form solution for polynomials of large degree, but there are myriad numerical approaches to finding roots.
As is often the case, Wikipedia is a good place to begin your search.
If you want to find the smallest root, then you can use Descartes' rule of signs to pin down an interval containing it and then use some numerical method to find the roots in that interval.