Inverse of a matrix in 2n^3 operations in Matlab? - algorithm

I'm trying to find an algorithm in Matlab which can return the inverse of a matrix with at most 2n^3 arithmetic operations (thus not counting assignments or comparisons). I've tried the usual Gauss elimination algorithm, since finding the inverse is no different from solving the linear systems A[X1 | X2 | ... | Xn] = [e1 | ... | en], with ei being the ith column of the identity matrix. But this yields (2/3)n^3 + 2n^3 operations, and I need 2n^3. I think I should exploit the fact that each right-hand-side column is all zeros except for its ith element. This way I should be able to reduce the current (8/3)n^3 cost to only 2n^3.
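For reference, a minimal NumPy sketch of the approach described above (Python rather than MATLAB, purely illustrative; it still costs roughly (2/3)n^3 + 2n^3 operations):

import numpy as np

n = 4
A = np.random.rand(n, n)            # assumed nonsingular, for illustration only
I = np.eye(n)
A_inv = np.linalg.solve(A, I)       # one elimination/factorisation + n triangular solves
print(np.allclose(A @ A_inv, I))    # True (up to rounding)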
Any idea on how I can manage that?

Related

Algorithm for finding a linear dependence with strictly positive coefficients

This must surely be well known, being a particular linear programming problem. What I want is a specific, easy-to-implement, efficient algorithm adapted to this very case, for relatively small sizes (say, about ten vectors of dimension less than twenty).
I have vectors v(1), ..., v(m) of the same dimension. I want an algorithm that produces strictly positive numbers c(1), ..., c(m) such that c(1)v(1) + ... + c(m)v(m) is the zero vector, or tells for sure that no such numbers exist.
What I found (in some clever code by a colleague) gives an approximate algorithm like this:
start with, say, c(1) = ... = c(m) = 1/m;
at each stage, given the current approximation v = c(1)v(1) + ... + c(m)v(m), look for a j such that v - v(j) is longer than v(j).
If no such j exists then output "no solution" (or c(1), ..., c(m) if v is zero).
If such j exists, change v to the new approximation (1 - c)v + cv(j) with some small positive c.
This changes c(j) to (1 - c)c(j) + c and each other c(i) to (1 - c)c(i), so the new coefficients remain positive and strictly less than 1 (in fact they sum to 1 at all times, i.e. we remain in the convex hull of the v(i)).
Moreover the new v will have strictly smaller length, so eventually the algorithm will either discover that there is no solution or will produce arbitrarily small v.
Clearly this is incomplete and not satisfactory from several points of view. Can one do better?
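For concreteness, here is a minimal NumPy sketch of that heuristic; the function name, step size and tolerances are my own choices, not taken from the colleague's code:

import numpy as np

def shrink_towards_zero(vectors, step=0.01, tol=1e-9, max_iter=100_000):
    m = len(vectors)
    V = np.asarray(vectors, dtype=float)      # rows are v(1), ..., v(m)
    c = np.full(m, 1.0 / m)                   # start with c(i) = 1/m
    for _ in range(max_iter):
        v = c @ V                             # current combination
        if np.linalg.norm(v) < tol:
            return c                          # v has become (numerically) zero
        # look for j with |v - v(j)| > |v(j)|; moving towards such a v(j) shortens v
        gaps = np.linalg.norm(v - V, axis=1) - np.linalg.norm(V, axis=1)
        j = np.argmax(gaps)
        if gaps[j] <= 0:
            return None                       # "no solution" (heuristically)
        c *= (1 - step)                       # c(i) -> (1 - step) c(i)
        c[j] += step                          # c(j) -> (1 - step) c(j) + step
    return None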
Update
There are by now two useful answers; however one final step is missing.
They both boil down to the following (unless I miss some essential point).
Take a basis of the nullspace of v(1), ..., v(m).
One obtains a collection of not necessarily strictly positive solutions c(1), ..., c(m), c'(1), ..., c'(m), c''(1), ..., c''(m), ... such that any such solution is a linear combination of them (in a unique way). So we are reduced to the question of whether this new collection of m-dimensional vectors admits a linear combination with strictly positive entries.
Example: take four 2d-vectors (2,1), (3,-1), (-1,2), (-3,-3). Their nullspace has a basis consisting of two solutions c = (12,-3,0,5), c' = (-1,1,1,0). None of these are strictly positive but their combination c + 4c' = (8,1,4,5) is. So the latter is the desired solution. But in general it might be not so easy to find out whether a strictly positive solution exists and if yes, how to find it.
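A quick NumPy check of this example (illustrative only):

import numpy as np

V = np.array([[2, 1], [3, -1], [-1, 2], [-3, -3]])   # v(1)..v(4) as rows
c  = np.array([12, -3, 0, 5])
cp = np.array([-1, 1, 1, 0])
print(c @ V, cp @ V)        # both are the zero vector
print(c + 4 * cp)           # [8 1 4 5] -- strictly positive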
As suggested in the answer by btilly one might use Fourier-Motzkin elimination for that, but again, I would be grateful for more details about it.
This is doable as follows.
First write your vectors as columns. Put them into a matrix. Now create a single column with entries c(1), c(2), ..., c(m). If you multiply that matrix by that column, you get your linear combination.
Now consider the elementary row operations. Multiply a row by a constant, swap two rows, add a multiple of one row to another. If you do an elementary row operation to the matrix, your linear combination after the row operation will be 0 if and only if it was before the row operation. Therefore doing elementary row operations DOESN'T CHANGE the coefficients that you're looking for.
Therefore you may simplify life by doing elementary row operations to put the matrix into reduced row echelon form. Once it is in reduced row echelon form, life gets easier. Columns which do not contain a pivot correspond to free coefficients. Columns which do contain a pivot correspond to coefficients that must be a specific linear combination of the free coefficients. This reduces your problem to finding positive values for the free coefficients that also make the pivot coefficients positive. So you're now just solving a system of inequalities (and generally in far fewer variables).
Whether a system of linear inequalities has a solution can be answered with the FME (Fourier-Motzkin elimination) method.
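As a hedged illustration of that reduction (SymPy used for convenience; the matrix takes the four vectors from the question's update as columns):

import sympy as sp

M = sp.Matrix([[2, 3, -1, -3],
               [1, -1, 2, -3]])   # columns are v(1)..v(4)

rref, pivots = M.rref()
print(pivots)                     # pivot columns; the remaining columns are free coefficients
print(M.nullspace())              # a basis of {c : M*c = 0}, parametrised by the free coefficients
# What remains is to choose the free coefficients so that every entry of the
# resulting c is strictly positive -- a (smaller) system of linear inequalities.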
Denoting by A the matrix whose ith column is v(i) and by x the vector whose ith entry is c(i), your problem can be described as Ax = b where b = 0 is the zero vector. The problem Ax = b when b is not equal to zero is called the least squares problem (or the inhomogeneous least squares problem) and has a closed-form solution in the sense of Minimal Mean Square Error (MMSE). In your case, however, b = 0, so we are dealing with the homogeneous least squares problem. In linear algebra this can be viewed as an eigenvalue problem, whose solution is the eigenvector x of the matrix A^T A whose eigenvalue is equal to 0. If no such eigenvalue exists, the MMSE solution is the eigenvector x whose matching eigenvalue is the smallest (closest to 0). A nice discussion on this topic is given here.
The solution, as stated above, is the eigenvector of A^T A with the lowest matching eigenvalue. This can be found using the Singular Value Decomposition (SVD), which decomposes the matrix A into A = U * Sigma * V^T.
The column of V matching the lowest singular value in the diagonal matrix Sigma (equivalently, the lowest eigenvalue of A^T A) will be your solution.
Explanation
When we want to minimize ||Ax||^2 in the MSE sense (under the normalisation ||x|| = 1), we can compute the vector derivative w.r.t. x: d/dx (x^T A^T A x) = 2 A^T A x, and the constrained minimisation leads to the eigenvalue problem A^T A x = lambda x.
Therefore, the eigenvector of A^T A matching the smallest eigenvalue will solve your problem.
Practical solution example
In Python, you can use numpy.linalg.svd to perform the SVD. numpy orders U and V^T so that the leftmost column of U (and the topmost row of V^T) matches the largest singular value and the last one matches the smallest. Thus, you compute the SVD and take the rightmost column of the resulting V:
from numpy.linalg import svd
[_, _, vt] = svd(A)
x = vt[-1] # we take the last row since this is a transposed matrix, so the last column of V is the last row of V^T
One zero eigenvalue
In this case there is only one non-trivial vector that solves the problem, and the only way to satisfy the strictly positive condition is if the values in that vector are all positive or all negative (multiplying the vector by -1 does not change the result).
Multiple zero eigenvalues
In the case where we have multiple zero eigenvalues, any of their matching eigenvectors is a possible solution, as is any linear combination of them. In this case one has to check whether there is a linear combination of these eigenvectors that creates a vector whose values are all strictly positive, in order to satisfy the strictly positive condition.
How do we find the solution if one exists? Once we are left with the basis of eigenvectors matching the zero eigenvalue (also known as the null-space), what we need to do is solve a system of linear inequalities. I'll explain by example, since it will be clearer this way. Suppose we have the following matrix:
import numpy as np
A = np.array([[ 2,  3, -1, -3],
              [ 1, -1,  2, -3]])
[_, Sigma, Vt] = np.linalg.svd(A)  # Sigma has only 2 non-zero values, meaning that the null-space has a dimension of 2
We can extract the eigenvectors as explained above:
C = Vt[len(Sigma):]
# array([[-0.10292809,  0.59058542,  0.75313786,  0.27092073],
#        [ 0.89356997, -0.15289589,  0.09399548,  0.4114856 ]])
What we want to find are two real coefficients, noted as x and y such that:
-0.10292809*x + 0.89356997*y > 0
0.59058542*x - 0.15289589*y > 0
0.75313786*x + 0.09399548*y > 0
0.27092073*x + 0.4114856*y > 0
We have a system of 4 inequalities with 2 variables, so in this case a solution is not guaranteed to exist. A solution can be found in many ways, but I will propose the following. We can start with an initial guess and go over each hyperplane to check whether the guess satisfies its inequality; if not, we reflect the guess to the other side of the hyperplane. After passing all the hyperplanes we check for a solution. (An explanation of how to reflect a point w.r.t. a line can be found here.) An example Python implementation:
import numpy as np

def get_strictly_positive(A):
    [_, Sigma, Vt] = np.linalg.svd(A)
    if len(Sigma[np.abs(Sigma) > 1e-5]) == Vt.shape[0]:  # No zero eigenvalues, taking the MMSE solution if it exists
        c = Vt[-1]
        if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
            return c if np.sum(c) == np.sum(abs(c)) else -1 * c
        else:
            return -1
    # Reaching here means there are zero eigenvalues (a non-trivial null-space)
    # Build the matrix C of all the null-space basis vectors
    C = Vt[len(Sigma[np.abs(Sigma) > 1e-5]):]
    # 1. What we have here is a system of linear inequalities. Each inequality is a hyperplane, and for
    #    each inequality there is a valid half-space. We want the intersection of all the half-spaces, if it exists.
    # 2. A very important observation is that the basis of the null-space that we found using SVD is ORTHOGONAL!
    coeffs = np.ones(C.shape[0])  # initial guess
    for hyperplane in C.T:
        if coeffs.dot(hyperplane) <= 0:  # the guess is on the wrong side of the hyperplane
            orthogonal_part = coeffs - (coeffs.dot(hyperplane) / hyperplane.dot(hyperplane)) * hyperplane
            # reflect the coefficients to the other side of the hyperplane
            coeffs = 2 * orthogonal_part - coeffs
    # If this yielded a solution, we return it
    c = C.T.dot(coeffs)
    if np.sum(c > 0) == len(c) or np.sum(c < 0) == len(c):
        return c if np.sum(c) == np.sum(abs(c)) else -1 * c
    else:
        return -1
The equations are taken from one of my summaries and therefore I do not have a link to the source

dynamic programming and the use of matrices

I'm always confused about how dynamic programming uses a matrix to solve a problem. I understand roughly that the matrix is used to store the results from previous subproblems, so that they can be used in the later computation of a bigger problem.
But how does one determine the dimensions of the matrix, and how do we know what value each row/column of the matrix should represent? I.e., is there a generic procedure for constructing the matrix?
For example, if we're interested in making change for an amount of money S using coins of value c1, c2, ..., cn, what should the dimensions of the matrix be, and what should each column/row represent?
Any directional guidance will help. Thank you!
A problem becomes eligible for dynamic programming when it exhibits both Overlapping Sub-problems as well as Optimal Substructure.
Secondly, dynamic programming comes in two variations:
Tabulation or the Bottom-up approach
Memoization or the Top-down approach (not MemoRization!)
Dynamic programming stems from the idea that a large problem can be broken down into sub-problems. The bottom-up version simply starts by solving these sub-problems first and gradually builds up the target solution. The top-down approach relies on auxiliary storage to do away with re-computation.
is there like a generic procedure of constructing the matrix?
It really depends on what problem you're solving and how you're solving it! Matrices are typically used in tabulation, but it need not always be a matrix. The main goal here is to have the solutions to the sub-problems readily available on demand; they could be stored in an array, a matrix or even a hash-table.
The classic book Introduction to Algorithms demonstrates the solution to the rod-cutting problem in both ways where a 1D array is used as auxiliary storage.
For example, if we're interested in making change for an amount of money S using coins of value c1, c2, ..., cn, what should the dimensions of the matrix be and what should each column/row represent?
If I'm not wrong, you're referring to the "total unique ways to make change" variant of the coin-change problem: you need to find the total number of ways a given amount can be constructed using a given set of coins.
There is a great video on this that breaks it down pretty well. It uses a bottom-up approach: https://www.youtube.com/watch?v=DJ4a7cmjZY0
Assume you need to construct amount n = 10 from the given subset of coins c = {1, 2, 10}
Take an empty set and keep adding the coins from c to it, one per row: for every next row, one more coin from c is added to the set. The columns represent the sub-problems: for i in 0 : 10, the ith column holds the total number of ways amount i can be constructed using the coins in that row's set:
------------------------------------------------------------
|          | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
------------------------------------------------------------
|{}        |   |   |   |   |   |   |   |   |   |   |    |
------------------------------------------------------------
|{1}       |   | X |   |   |   |   |   |   |   |   |    |
------------------------------------------------------------
|{1, 2}    |   |   |   |   |   |   |   |   |   |   |    |
------------------------------------------------------------
|{1, 2, 10}|   |   |   | Y |   |   |   |   |   |   | Z  |
------------------------------------------------------------
In this table, X represents the number of ways amount 1 can be constructed using the coin {1}, Y represents the number of ways amount 3 can be represented using the coins {1, 2, 10} and Z represents the number of ways amount 10 can be represented using the coins {1, 2, 10}.
How are the cells populated?
Initially, the entire first column (headed by 0) is filled with 1s, because no matter how many coins you have, there is exactly one way to make change for the amount 0: use no coins at all.
The rest of the first row, with the empty subset {}, is filled with 0s because you can't make change for any positive amount with no coins.
Now the matrix looks like this:
------------------------------------------------------------
|          | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
------------------------------------------------------------
|{}        | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0  |
------------------------------------------------------------
|{1}       | 1 | X |   |   |   |   |   |   |   |   |    |
------------------------------------------------------------
|{1, 2}    | 1 |   |   |   |   |   |   |   |   |   |    |
------------------------------------------------------------
|{1, 2, 10}| 1 |   |   | Y |   |   |   |   |   |   | Z  |
------------------------------------------------------------
Now, how do we fill X? You have two alternatives: either use the coin 1 from this new super set, or don't use it. If you don't use the coin, the number of ways is the same as in the row above, which is 0. Since the coin 1 can be used to make change for the amount 1, we also count the ways that use it: subtract 1 from the amount 1 to be left with 0, and look up the number of ways for 0 in the same row, i.e. the column just before X, which is 1. Adding the two alternatives gives 0 + 1 = 1, so you fill this cell with 1.
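A minimal bottom-up sketch of this table-filling (variable names are mine; it mirrors the rule "ways without the new coin + ways with it"):

def count_ways(coins, amount):
    # table[r][a] = number of ways to make amount a using the first r coins
    table = [[0] * (amount + 1) for _ in range(len(coins) + 1)]
    for r in range(len(coins) + 1):
        table[r][0] = 1                            # one way to make 0: use no coins
    for r, coin in enumerate(coins, start=1):
        for a in range(1, amount + 1):
            table[r][a] = table[r - 1][a]          # ways without this coin
            if a >= coin:
                table[r][a] += table[r][a - coin]  # plus ways that use it at least once
    return table[len(coins)][amount]

print(count_ways([1, 2, 10], 10))                  # 7 for the example above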
But how does one determine the dimensions of the matrix, and how do we know what value each row/column of the matrix should represent? I.e., is there a generic procedure for constructing the matrix?
You need to find the recurrence relation and the state (number of parameters) required to represent a subproblem. The whole idea of DP is to avoid re-computation of a subproblem: you compute a subproblem only once, the first time you require it, store it in memory and refer to the stored value when required. So if you want to refer to the stored result of a subproblem later, you need a key that uniquely identifies the subproblem. The state of the subproblem is usually a good choice for this key. If a subproblem has 3 parameters x, y, z, then the tuple (value of x, value of y, value of z) is a good key under which to store the result of the subproblem in a hash table, for example. If these values are positive integers, you can use a matrix, i.e. a multi-dimensional array, instead of a hash table. Let's develop the ideas of finding the recurrence relation and identifying the state required to uniquely represent a subproblem, so that your confusion about the matrix dimensions is cleared.
The most important step in being able to solve a DP problem (or any recursive problem in general) is identifying and being able to write down the recurrence relation. Once the recurrence relation is identified, I'd say 90% of the work is done. Let's first see how to write down the recurrence relation.
Three important ideas in any recursive problem are:
identifying the trivial cases (the base cases whose answers are known),
identifying how to divide the problem into subproblems
knowing how to combine the results of the subproblems.
Let's take merge sort as an example. It is not a DP problem, as there are no overlapping subproblems, but for the purpose of introducing recurrence relations it is a good choice, being famous and easy to understand. As you might already know, the trivial case in merge sort is an array of size 0 or 1. The recursion step is to divide the problem into two subproblems of half the size of the current problem, and the combination step is the merging algorithm. Finally, we can write the recurrence relation for merge sort as follows:
sort(0, n) = merge(sort(0, n/2), sort(n/2, n))
In the above recurrence relation for the sort algorithm, the problem of range (0, n) is divided into two subproblems (0, n/2) and (n/2, n). The combination step is the merge algorithm.
Now let's try to deduce the recurrence relation for some DP problems. You should be able to derive the dimensions of the state(and hence your confusion about dimensions of matrix) from the recurrence relation.
Remember that to find the recurrence relation, we need to identify the subproblems. Identifying subproblems is not always straightforward; only practising more DP problems, trial and error, etc. will give you better intuition for these problems and for spotting the patterns.
Let's identify the recurrence relations for two problems that look almost similar but require different approaches. I chose these problems only because the question was about confusion regarding the dimensions of the matrix.
Given coins of different denominations and an amount, find the minimum number of coins required to make the amount.
Let's represent the problem/algorithm of finding the minimum number of coins required for a given amount n as F(n), and say the denominations are p, q and r.
If we know the answer for F(n-p), F(n-q) and F(n-r) i.e., the minimum number of coins required to make amounts n-p, n-q and n-r respectively, we can take the minimum of these and 1 to get the number of coins required to make the amount n.
The subproblems here are F(n-p), F(n-q) and F(n-r), and the combination step is to take the minimum of these values and add one.
So the recurrence relation is:
F(n) = min(F(n-p), F(n-q), F(n-r)) + 1
# Base conditions
F(0) = 0
F(n) = infinity if n < 0
There is optimal substructure and there are repeated subproblems (if it is not obvious, take a sample problem and draw the recursion tree), so we can use some storage to avoid repeated computation. Each subproblem is a node in the recursion tree.
From the recurrence relation you can see that the function F takes only one parameter, i.e., one parameter is enough to represent a subproblem/node in the recursion tree, and hence a 1D array or a hash table keyed by a single value can be used to store the results of the subproblems.
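A hedged 1D sketch of that recurrence (names are mine):

def min_coins(denominations, n):
    INF = float('inf')
    F = [0] + [INF] * n                    # F[0] = 0; F[a] = min coins needed for amount a
    for a in range(1, n + 1):
        for d in denominations:
            if d <= a and F[a - d] + 1 < F[a]:
                F[a] = F[a - d] + 1        # F(a) = min over d of F(a - d) + 1
    return F[n]                            # INF means the amount cannot be made

print(min_coins([1, 5, 7], 11))            # 3  (e.g. 5 + 5 + 1)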
Given coins of different denominations and an amount, find total number of combination of coins required to make the amount.
This problem is more subtle. Pause and think for a moment and try to identify the recurrence relation.
Let's use the same terminology as in the above problem, i.e., the amount is n and p, q, r are the denominations.
Does the same recurrence as in the above problem work? If F(n) represents the total number of combinations of coins that make n out of the given denominations, can we combine F(n-p), F(n-q) and F(n-r) in some way to get F(n)? How about just adding them? Does F(n) = F(n-p) + F(n-q) + F(n-r) hold?
Take n = 3 and two denominations p, q = 1, 2
With the above recurrence relation we get the answer 3, corresponding to the splits [1, 1, 1], [1, 2], [2, 1], which is incorrect, as [1, 2] and [2, 1] are the same combination of denominations. The above recurrence is counting permutations instead of combinations. To avoid the repeated results, we need to impose an order on the coins. We can choose it ourselves by mandating that p comes before q and q comes before r, and then focus on the number of combinations with each denomination while enforcing this order among the available denominations [p, q, r].
Let's start with p and solve the following recurrence.
F(n, only p allowed) = F(n-p, only p allowed)
## Base condition
F(0) = 1 # There is only one way to select 0 coins, which is not selecting any coins
Now let's allow the next denomination q and then solve the following recurrence.
F(n, p and q allowed) = F(n-q, p and q allowed) + F(n, only p allowed)
Finally,
F(n, p q and r allowed) = F(n-r, p q and r allowed) + F(n, p and q allowed)
The above three recurrence relations can in general be written as follows, where i is the index into the denominations.
# F(n, i) = with denominations[i] + without denominations[i]
F(n, i) = F(n - denominations[i], i) + F(n, i-1)
## Base conditions
F(n, i) = 1 if n == 0
F(n, i) = 0 if n < 0 or i < 0
From the recurrence relation, we can see that you need two state variables to represent a subproblem, and hence a 2D array or a hash table keyed by the combination of these two values (a tuple, for example) is needed to cache the results of the subproblems.
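And a hedged sketch of this two-state recurrence, tabulated as a 2D array indexed by (amount, index of the last allowed denomination); names are mine:

def count_combinations(denominations, n):
    m = len(denominations)
    # F[a][i] = number of combinations for amount a using only denominations[0..i]
    F = [[0] * m for _ in range(n + 1)]
    F[0] = [1] * m                                   # base case: amount 0
    for a in range(1, n + 1):
        for i, d in enumerate(denominations):
            without = F[a][i - 1] if i > 0 else 0    # F(a, i-1)
            with_d = F[a - d][i] if a >= d else 0    # F(a - d, i)
            F[a][i] = without + with_d
    return F[n][m - 1]

print(count_combinations([1, 2], 3))                 # 2: [1, 1, 1] and [1, 2]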
Also see Thought process for arriving at dynamic programming solution of Coins change problem
This chapter explains it very well: http://www.cs.berkeley.edu/~vazirani/algorithms/chap6.pdf
On page 178 it gives some approaches to identifying the sub-problems that allow you to apply dynamic programming.
An array used by a DP solution is almost always based on the dimensions of the state space of the problem - that is, the valid values for each of its parameters
For example
fib[i+2] = fib[i+1] + fib[i]
Is the same as
def fib(i):
    return fib(i-1) + fib(i-2)
You can make this more apparent by implementing memoization in your recursive functions
def fib(i):
    # memo[0] and memo[1] are assumed to be pre-filled with the base cases
    if memo[i] is None:
        memo[i] = fib(i-1) + fib(i-2)
    return memo[i]
If your recursive function has K parameters, you'll likely need a K-dimensional matrix.

Maximise sum of "non-overlapping" numbers from matrix

Just looking for a bit of direction. I realise that the example given is possible to solve with brute-force iteration, but I am looking for a more elegant (i.e. mathematical?) solution which could potentially handle significantly larger examples (say 20x20 or 30x30). It is entirely possible that this cannot be done, and I have had very little success in coming up with an approach which does not rely on brute force...
I have a matrix (call it A) which is nxn. I wish to select a subset (call it B) of points from matrix A. The subset will consist of n elements, where one and only one element is taken from each row and from each column of A. The output should provide a solution (B) such that the sum of the elements that make up B is the maximum possible value, given these constraints (e.g. 25 in the example below). If multiple instances of B are found (i.e. different selections which give the same maximum sum), the solution for B which has the largest minimum element should be selected.
B could also be a selection matrix which is nxn, but where only the n desired elements are non-zero.
For example:
if A =
|5 4 3 2 1|
|4 3 2 1 5|
|3 2 1 5 4|
|2 1 5 4 3|
|1 5 4 3 2|
=> B would be
|5 5 5 5 5|
However, if A =
|5 4 3|
|4 3 2|
|3 2 1|
B =
|3 3 3|
As the minimum element of B is 3 which is larger than for
|5 3 1|
or
|4 4 1|
which also both sum to 9
Your problem is almost identical to the Assignment problem, which can e.g. be solved by the Hungarian algorithm in polynomial time.
Note that the assignment problem is usually a minimization problem, but multiplying your matrix by -1 and adding some constant should make the method applicable. Further, there is no formal tie-breaking condition for the case of multiple optimal solutions; the method just yields some solution having the optimal sum. One remedy: let m be the minimum summand of that solution, modify your matrix by setting all entries less than or equal to m to zero, and solve again. Either you get a solution with the same sum that is better than the last one, or, if not, the previous solution was already optimal.
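If a library is acceptable, SciPy's linear_sum_assignment solves exactly this assignment problem; a hedged sketch using the 3x3 example from the question:

import numpy as np
from scipy.optimize import linear_sum_assignment

A = np.array([[5, 4, 3],
              [4, 3, 2],
              [3, 2, 1]])
rows, cols = linear_sum_assignment(A, maximize=True)   # equivalently, minimize -A
print(A[rows, cols], A[rows, cols].sum())              # one optimal selection; the sum is 9
# Note: this returns some optimal assignment; the "largest minimum element"
# tie-break from the question still has to be handled separately.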
As Matthias indicated you should use backtracking.
Find a reasonable solution. Select the max values from each row and see if they are non-overlapping. If not, perturb part of the solution so that the result becomes non-overlapping.
Define the fitness of a partial solution. Let us say you are picking a value for each row iteratively and you have already picked values from the first k rows. The fitness of this partial solution equals the sum of the already picked values plus the max values from the remaining rows, restricted to the still unselected columns.
Now recursively start searching for a solution. Select the values from the first row, calculate their fitness and insert them into a priority queue. Remove all the solutions whose fitness is lower than the current optimal solution (initialized in step 1). Pick the solution at the head of the queue, calculate the next level of solutions and insert them back into the priority queue. Once you have selected values from all columns and rows, calculate the sum, and if it is higher than the current optimum, replace it.
Ouch. This algorithm is wrong; there is no proof because it's wrong and therefore it's impossible to prove that it's correct. ;) I'm leaving it here because I'm too attached to delete it entirely, and it's a good demonstration of why you should formally prove algorithms instead of saying "this looks right! There's no possible way this could fail to work!"
I'm giving this solution without proof, for the time being. I have a proof sketch but I'm having trouble proving optimal substructure for this problem. Anyway...
Problem
Given a square array of numbers, select as many "non-overlapping" numbers as possible so that the sum of the selected numbers is maximised. "Non-overlapping" means that no two numbers can be from the same row or the same column.
Algorithm
Let A be a square array of n by n numbers.
Let Aij denote the element of A in the ith row and jth column.
Let S( i1:i2, j1:j2 ) denote the optimal sum of non-overlapping numbers for a square subarray of A containing the intersection of rows i1 to i2 and columns j1 to j2.
Then the optimal sum of non-overlapping numbers is denoted S( 1:n , 1:n ) and is given as follows:
S( 1:n , 1:n ) = max { [ S( 2:n ,   2:n )   + A11 ],
                       [ S( 2:n ,   1:n-1 ) + A1n ],
                       [ S( 1:n-1 , 2:n )   + An1 ],
                       [ S( 1:n-1 , 1:n-1 ) + Ann ] }
                 (recursively)
Note that S( i:i, j:j ) is simply Aij.
That is, the optimal sum for a square array of size n can be determined by separately computing the optimal sum for each of the four sub-arrays of size n-1, and then maximising the sum of the sub-array and the element that was "left out".
S for |# # # #|
      |# # # #|
      |# # # #|
      |# # # #|
Is the best of the sums S for:
|#      |  |      #|  |# # #  |  |  # # #|
|  # # #|  |# # #  |  |# # #  |  |  # # #|
|  # # #|  |# # #  |  |# # #  |  |  # # #|
|  # # #|  |# # #  |  |      #|  |#      |
Implementation
The recursive algorithm above suggests a recursive solution:
def S(A, i1, i2, j1, j2):
    if (i1 == i2) and (j1 == j2):
        return A[i1][j1]
    else:
        return max(S(A, i1+1, i2, j1+1, j2) + A[i1][j1],
                   S(A, i1+1, i2, j1,   j2-1) + A[i1][j2],
                   S(A, i1,   i2-1, j1+1, j2) + A[i2][j1],
                   S(A, i1,   i2-1, j1,   j2-1) + A[i2][j2])
Note that this will make O(4^n) calls to S()!! This is much better than the factorial O(n!) time complexity of the "brute force" solution, but still awful performance.
The important thing to note here is that many of the calls are repeated with the same parameters. For example, in solving a 3*3 array, each 2*2 array is solved many times.
This suggests two possible solutions for a speedup:
Make the recursive function S() cache results so that it only needs to compute S(A,i1,i2,j1,j2) once for each i1,i2,j1,j2. This means that S() only needs to calculate O(n^3) results - all other requests will be fulfilled from the cache. (This is called memoising; a minimal sketch follows this list.)
Instead of starting at the top, with the large n*n array, and working down through successively smaller subproblems, start at the bottom with the smallest possible subproblems and build up to the n*n case. This is called dynamic programming. This is also O(n^3), but it's a much faster O(n^3) because you don't have to hit a cache all the time.
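A hedged sketch of option 1, memoising the recursive S() above with functools.lru_cache (the tuple-of-tuples conversion just makes A hashable; keep in mind the disclaimer above that the underlying recurrence itself is flawed):

from functools import lru_cache

def max_nonoverlapping_sum(A):
    A = tuple(map(tuple, A))                 # hashable copy for the cache

    @lru_cache(maxsize=None)
    def S(i1, i2, j1, j2):
        if i1 == i2 and j1 == j2:
            return A[i1][j1]
        return max(S(i1+1, i2, j1+1, j2) + A[i1][j1],
                   S(i1+1, i2, j1,   j2-1) + A[i1][j2],
                   S(i1,   i2-1, j1+1, j2) + A[i2][j1],
                   S(i1,   i2-1, j1,   j2-1) + A[i2][j2])

    n = len(A)
    return S(0, n-1, 0, n-1)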
The dynamic programming solution proceeds somewhat like:
Find optimal solutions to all 1x1 sub-arrays. (Trivial.)
Find optimal solutions for all 2x2 sub-arrays.
Find optimal solutions for all 3x3 sub-arrays.
...
Find optimal solutions for all n-1 * n-1 sub-arrays.
Find optimal solutions for the complete n*n sub-array.
Notes on this solution:
No proof yet. I'm working on it.
You'll note the algorithm above only gives you S(), the optimal sum. It doesn't tell you which numbers actually make up that sum. You get to add in your own method of backtracing your path to the solution.
The algorithm above doesn't guarantee the property that ties like 2,2 vs. 1,3 will be broken in favour of having all the individual numbers be as large as possible (so that 2,2 wins.) I believe you can define max() to break ties in favour of the largest numbers possible, and that will do what you want, but I can't prove it.
General notes:
Dynamic programming is a powerful technique for devising fast algorithms for any problem which exhibits two properties:
Optimal substructure: A problem can be broken down into slightly smaller parts, each of which can be used as part of the solution to the original problem.
Overlapping subproblems means that there are few actual subproblems to solve, and the solutions to the subproblems are re-used many times.
If the problem has optimal substructure, and the problem breaks down into slightly smaller problems - say a problem of size n breaks down into subproblems of size n-1 - then the problem can be solved by dynamic programming.
If you can split the problem into much smaller chunks - say chopping a problem of size n into halves, each of size n/2 - that's divide and conquer, not dynamic programming. Divide and conquer solutions are generally very fast - for example binary search will find an element in a sorted array in O(log n) time.
This is related to the n-Queens problem, except that you do not care about the diagonals and you have weighted solutions. As with the Queens problem, you can solve it by (multiple) backtracking.
I.e., once you find a solution you remember its weight, mark the solution as invalid, and start over. The (or a) solution with the highest weight wins.

Algorithm to find points that are furthest apart -- better than O(n^2)?

In my program, I have a set of points. For the purposes of rescaling, I am searching for the two nodes that are furthest apart, and then computing a factor by which to multiply all coordinates so that the maximum distance is equal to some predefined value I define.
The algorithm I am using to find the two points furthest apart is, however, problematic for large sets of points, as it is O(n^2); in pseudocode (distances that were already calculated are skipped):
for each point in points:
    for each other point in points:
        if distance between point and other point > max:
            max = distance between point and other point
Is there something faster?
If you just need the scale and not the exact points, you can do this in O(n) time with some error margin. Think about the simple case of making a bounding box: calculate the minimum x value over all the points, the maximum x, the minimum y and the maximum y. These four numbers give you a bounding box around your points with a max error of 1 - (1/sqrt(2)), about 30%. You can reduce this by adding more sides to your square. Think about the case of an octagon: to calculate the min and max values for the other sides you have to rotate your coordinate system.
Error vs run time breaks down like this:
shape    - run time - max error
square   - 2N       - 30%
octagon  - 4N       - 16%
16 sides - 8N       - 4%
32 sides - 16N      - 1%
Here's the equation for the max error I came up with.
angle = 180 / sides
max_error = (1 / cos angle) - cos angle
Let me know if I should add a diagram explaining this.
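A hedged NumPy sketch of the rotated-axes idea (the function name is mine, and the exact error depends on the number of directions used):

import numpy as np

def approx_max_distance(points, sides=8):
    P = np.asarray(points, dtype=float)
    angles = np.linspace(0, np.pi, sides // 2, endpoint=False)
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # unit directions
    proj = P @ dirs.T                                           # O(n * sides) work
    widths = proj.max(axis=0) - proj.min(axis=0)
    # the largest width underestimates the true maximum distance; the error
    # shrinks as more directions (sides) are used
    return widths.max()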
The following may help to put the average case linear-time algorithms (for the diameter of a finite set) in a clearer light, as well as contrast multi-dimensional and plane geometry problems.
In 1983 Megiddo gave a deterministic linear-time algorithm for the smallest enclosing circle (or sphere in higher dimensions).
In general position the enclosing circle will have two or three points on its boundary, and thus finding the two farthest apart can be done "on average" in constant time once the bounding circle is known. In higher dimensions the number of points in general position needed on the boundary of the sphere increases (D+1 points for dimension D), and indeed the cost of computing distance between a single pair of points rises linearly with dimension.
The subset of points lying on the bounding circle or sphere is also found in linear time. In the theoretical worst-case all points would lie on the bounding circle or sphere, but this at least is stricter than just having all points on the convex hull. If the points on the sphere are independently perturbed, say along radial lines, then general position is assured with probability 1, and an approximate diameter can be found from just D+1 points on the revised enclosing sphere. This randomized approximation has quadratic dependence on dimension but only linear complexity in number of points.
If the points lying on a bounding circle are "sorted" (cyclically, of course), finding the pair farthest apart can be done in linear time, relying upon the "unimodality" of the circle (meaning that distances from a fixed point rise monotonically until the antipode and then fall) to amortize the cost of computing distances. Unfortunately that sorting would introduce a step with O(n log n) time complexity, and this proves to be worst-case optimal for exact deterministic methods in the planar case.
In 2001 Ramos succeeded in showing an O(n log n) deterministic algorithm for three-dimensional sets, but the technique is so involved that an implementation may be impractical or slower than brute force all-pairs search up to very large datasets.
For higher dimensions many authors have considered randomized or approximate algorithms. See Piotr Indyk's thesis (2000) for approximate methods with only polynomial dependence on dimension for various proximity problems.
As mentioned in this answer, you are seeking the "diameter" of the set of N points, a well known problem in computational geometry. There are basically two steps:
Find the convex hull of the points. Algorithms exist that are O(N ln N) in the worst case. In practice, QuickHull is usually a fast choice, although it is potentially O(N^2) in the worst case. The QHull implementation is convenient to call from the command line; the CGAL library provides a C++ implementation.
Antipodal pairs on the convex hull are candidates for the furthest points. One can search over the antipodal pairs using an algorithm like rotating calipers in O(N) time (a small code sketch follows this list).
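For instance, a hedged SciPy sketch of these two steps, using a brute-force pass over hull vertices instead of true rotating calipers (usually fine in practice, since the hull is much smaller than the full point set):

import numpy as np
from itertools import combinations
from scipy.spatial import ConvexHull

def diameter(points):
    P = np.asarray(points, dtype=float)
    hull = P[ConvexHull(P).vertices]          # step 1: convex hull vertices
    # step 2: the furthest pair must be a pair of hull vertices
    return max(np.linalg.norm(a - b) for a, b in combinations(hull, 2))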
The problem can be generalized to an "all-farthest pairs" problem: for each point i, find the most distant point j---we're now seeking N pairs of points. The solution again uses the convex hull, but now the second part can be done with a matrix searching algorithm.
Not really - a common approach is to group points into clusters and then store the distances between clusters.
That way you don't need to check whether a certain house in New York is furthest from Paris if you have already determined that Australia is further away.
The distance from A to B is the same as the distance from B to A. You can easily modify the algorithm to eliminate half of the computations this way. It'll still be O(n^2) but will be twice as fast.
That is, instead of computing all off-diagonal elements of the distance matrix P x P:
P = {A, B, C, D, ...}
    + A + B + C + D + ...
  A |   | * | * | * | ...
  B | * |   | * | * | ...
  C | * | * |   | * | ...
  D | * | * | * |   | ...
    |   |   |   |   |
you can compute either the upper triangle:
    + A + B + C + D + ...
  A |   | * | * | * | ...
  B |   |   | * | * | ...
  C |   |   |   | * | ...
  D |   |   |   |   | ...
    |   |   |   |   |
or the lower triangle:
    + A + B + C + D + ...
  A |   |   |   |   | ...
  B | * |   |   |   | ...
  C | * | * |   |   | ...
  D | * | * | * |   | ...
    |   |   |   |   |
If you perform this query often but the points do not change much, you can do precalculations that speed things up.
Each point can store the farthest point from it, and on every point addition you recheck whether the new point is farther.
When you query, you just go through all the points and look at their cached farthest points.
You end up with O(n) for a new point insertion and O(n) for the farthest-apart query.
I'm not sure if putting the points into a spatial index and querying it leads to an O(n log n) algorithm.

How to calculate the inverse key matrix in Hill Cipher algorithm?

I am finding it very hard to understand the way the inverse of the matrix is calculated in the Hill Cipher algorithm. I get the idea of it all being done in modulo arithmetic, but somehow things are not adding up. I would really appreciate a simple explanation!
Consider the following Hill Cipher key matrix:
5 8
17 3
Please use the above matrix for illustration.
You must study the Linear congruence theorem and the extended GCD algorithm, which belong to Number Theory, in order to understand the maths behind modulo arithmetic.
The inverse of matrix K for example is (1/det(K)) * adjoint(K), where det(K) <> 0.
I assume that you don't understand how to calculate 1/det(K) in modulo arithmetic, and here is where linear congruences and the GCD come into play.
Your K has det(K) = -121. Let's say that the modulus m is 26. We want x*(-121) = 1 (mod 26). [Here a = b (mod m) means that a - b = N*m for some integer N.]
We can easily find that for x = 3 the above congruence is true, because 26 divides (3*(-121) - 1) exactly. Of course, the correct way is to use the GCD in reverse to calculate x, but I don't have time to explain how to do it. Check the extended GCD algorithm :)
Now, inv(K) = 3*([3 -8], [-17 5]) (mod 26) = ([9 -24], [-51 15]) (mod 26) = ([9 2], [1 15]).
Update: check out Basics of Computational Number Theory to see how to calculate modular inverses with the extended Euclidean algorithm. Note that -121 mod 26 = 9 and gcd(9, 26) = 1; the extended Euclidean algorithm gives 1 = 3*9 - 1*26, so the inverse of 9 (and hence of -121) mod 26 is 3.
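A small Python check of this 2x2 example (pow(x, -1, m) needs Python 3.8+):

K = [[5, 8],
     [17, 3]]
m = 26

det = K[0][0]*K[1][1] - K[0][1]*K[1][0]           # -121
det_inv = pow(det % m, -1, m)                     # modular inverse of det mod 26 -> 3
adj = [[ K[1][1], -K[0][1]],
       [-K[1][0],  K[0][0]]]                      # adjugate of a 2x2 matrix
K_inv = [[(det_inv * a) % m for a in row] for row in adj]
print(K_inv)                                      # [[9, 2], [1, 15]]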
In my very humble opinion it is much easier to calculate the inverse matrix (modular or otherwise) by using the Gauss-Jordan method. That way you don't have to calculate the determinant, and the method scales very simply to arbitrarily large systems.
Just look up 'Gauss-Jordan matrix inverse' - but to summarise, you simply adjoin a copy of the identity matrix to the right of the matrix to be inverted, then use row operations to reduce the matrix until it itself becomes the identity matrix. At this point, the adjoined identity matrix has become the inverse of the original matrix. Voila!
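A hedged sketch of that procedure adapted to modular arithmetic (the helper is my own; it assumes that at every step some remaining row has a pivot entry that is invertible mod m, and it needs Python 3.8+ for pow(x, -1, m)):

import math

def gauss_jordan_inverse_mod(M, m):
    # Invert a square matrix modulo m by row-reducing the augmented matrix [M | I]
    n = len(M)
    aug = [[M[i][j] % m for j in range(n)] + [1 if i == j else 0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        # find a pivot row whose pivot entry is invertible mod m
        pivot = next(r for r in range(col, n) if math.gcd(aug[r][col], m) == 1)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, m)
        aug[col] = [(x * inv) % m for x in aug[col]]       # scale pivot row so the pivot is 1
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(a - factor * b) % m for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]                        # the right half is now the inverse

print(gauss_jordan_inverse_mod([[5, 8], [17, 3]], 26))     # [[9, 2], [1, 15]]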
