Given a 2D matrix, find the minimum sum of elements such that exactly one element is chosen from each row and each column

Find the minimum sum of elements of an n*n 2D matrix such that exactly one element is chosen from each row and each column.
Eg
4 12
6 6
If I choose 4 from row 1, I cannot choose 12 (same row) or the 6 in row 2, column 1 (same column);
I can only choose the 6 from row 2, column 2.
So the minimum sum is 4 + 6 = 10, where 6 is from the second row, second column,
and not 6 + 12 = 18, where 6 is from the second row, first column.
4 + 12 is also not allowed, since both are from the same row.
I thought of brute force, where once I pick an element I cannot pick another from its row or column, but this approach is O(n!).
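For very small n, the O(n!) brute force can be written directly with `itertools.permutations`; a sketch (the function name is mine):

```python
from itertools import permutations

def min_assignment_brute(matrix):
    """Try every way of assigning one column to each row: O(n!) permutations."""
    n = len(matrix)
    return min(sum(matrix[r][c] for r, c in enumerate(cols))
               for cols in permutations(range(n)))

print(min_assignment_brute([[4, 12], [6, 6]]))  # 4 + 6 = 10
```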

Theorem: If a number is added to or subtracted from all
of the entries of any one row or column of the matrix,
then the elements selected to achieve the minimum sum in the resulting matrix
are the same elements that achieve the minimum sum in the original matrix.
The Hungarian algorithm (the assignment problem it solves is a special case of the min-cost flow problem) uses this theorem to select elements satisfying the constraints in your problem:
Step 1. Subtract the smallest entry in each row from all the entries of its row.
Step 2. Subtract the smallest entry in each column from all the entries of its column.
Step 3. Draw lines through appropriate rows and columns so that all the
zero entries of the cost matrix are covered and the minimum number of
such lines is used.
Step 4. Test for optimality:
i. If the minimum number of covering lines is n, an optimal assignment of zeros is possible and we are finished.
ii. If the minimum number of covering lines is less than n, an optimal
assignment of zeros is not yet possible. In that case, proceed to Step 5.
Step 5. Determine the smallest entry not covered by any line. Subtract
this entry from each uncovered row, and then add it to each covered
column. Return to Step 3.
See this example for better understanding.
Both the O(n^4) implementation (easy and fast to implement) and the O(n^3) one (harder to implement) are explained in detail here.
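If implementing the Hungarian algorithm is too much effort, a much simpler exact method (not Hungarian, but it gives the same answer) is a DP over subsets of used columns, O(n^2 * 2^n), practical up to n of about 20. A sketch (the function name is mine):

```python
from functools import lru_cache

def min_assignment_dp(matrix):
    """Exact minimum assignment via DP over subsets of used columns.
    O(n^2 * 2^n): far better than O(n!), though not as good as Hungarian."""
    n = len(matrix)

    @lru_cache(maxsize=None)
    def best(row, used):
        # used: bitmask of columns already taken by rows 0..row-1
        if row == n:
            return 0
        return min(matrix[row][c] + best(row + 1, used | (1 << c))
                   for c in range(n) if not used >> c & 1)

    return best(0, 0)
```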


What is the correct approach to solve this matrix

We are given following matrix
5 7 9
7 8 4
4 2 9
We need to find the row or column with the maximum sum, subtract 1 from each element of that row or column, and then repeat this operation 3 times.
I will try to explain.
The matrix is n*n and the process is repeated k times.
An O(n^2 + k log n) algorithm is possible.
If the current maximum sum belongs to a row, then after the operation:
its row sum decreases by n;
every column sum decreases by 1.
The same two rules apply, with rows and columns swapped, when a column is chosen.
For rule one, store all row and column sums in two AVL trees
(or any other data structure that supports O(log n) insert and remove).
For rule two, store the number of operations applied to rows and to columns (just two integers).
Now take the max over both trees, where the two counters act as offsets when comparing the two data structures. Update the chosen sum, adjust the counters, and insert it back.
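A sketch of this bookkeeping in code, using binary heaps in place of AVL trees (same O(log n) bounds) and assuming, as in the question, that each operation subtracts 1 from every element of the chosen line; ties are broken toward rows, then lower indices. The function name is mine:

```python
import heapq

def simulate_fast(matrix, k):
    """Repeat k times: pick the row or column with the largest current sum
    and subtract 1 from each of its elements. Returns the list of picks.

    Heap keys store base_sum - n * (ops applied to that line); the effect of
    operations on the *other* kind of line is a uniform offset, kept in two
    counters, so each step costs O(log n)."""
    n = len(matrix)
    row = [sum(r) for r in matrix]          # initial row sums
    col = [sum(c) for c in zip(*matrix)]    # initial column sums
    row_ops = [0] * n
    col_ops = [0] * n
    total_row_ops = 0                       # each row op lowers every column sum by 1
    total_col_ops = 0                       # each col op lowers every row sum by 1
    rheap = [(-row[i], i) for i in range(n)]
    cheap = [(-col[i], i) for i in range(n)]
    heapq.heapify(rheap)
    heapq.heapify(cheap)
    picks = []
    for _ in range(k):
        rkey, ri = rheap[0]
        ckey, ci = cheap[0]
        rsum = -rkey - total_col_ops        # true current sum of the best row
        csum = -ckey - total_row_ops        # true current sum of the best column
        if rsum >= csum:                    # prefer rows on ties
            heapq.heappop(rheap)
            row_ops[ri] += 1
            total_row_ops += 1
            heapq.heappush(rheap, (-(row[ri] - n * row_ops[ri]), ri))
            picks.append(('row', ri))
        else:
            heapq.heappop(cheap)
            col_ops[ci] += 1
            total_col_ops += 1
            heapq.heappush(cheap, (-(col[ci] - n * col_ops[ci]), ci))
            picks.append(('col', ci))
    return picks
```

On the question's matrix with k = 3, this picks column 2 (sum 22), then row 0, then row 1.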

Algorithm X to Solve the Exact Cover: Fat Matrices

As I was reading about Knuth's Algorithm X to solve the exact cover problem, I thought of an edge case that I wanted some clarification on.
Here are my assumptions:
Given a matrix A, Algorithm X's "goal is to select a subset of the rows so that the digit 1 appears in each column exactly once."
If the matrix is empty, the algorithm terminates successfully and the solution is then the subset of rows logged in the partial solution up to that point.
If there is a column of 0's, the algorithm terminates unsuccessfully.
For reference: http://en.wikipedia.org/wiki/Algorithm_X
Consider the matrix A:
[[1 1 0]
[0 1 1]]
Steps I took:
Given Matrix A:
1. Choose a column, c, with the least number of 1's. I choose: column 1
2. Choose a row, r, that contains a 1 in column c. I choose: row 1
3. Add r to the partial solution.
4. For each column j such that A(r, j) = 1:
For each row i such that A(i, j) = 1:
delete row i
delete column j
5. Matrix A is empty. Algorithm terminates successfully and solution is allegedly: {row 1}.
However, this is clearly not the case as row 1 only consists of [1 1 0] and clearly does not cover the 3rd column.
I would assume that the algorithm should at some point reduce the matrix to the point where there is only a single 0 and terminate unsuccessfully.
Could someone please explain this?
I think the confusion here is simply in the use of the term empty matrix. If you read Knuth's original paper (linked on the Wikipedia article you cited), you can see that he was treating the rows and columns as doubly-linked lists. When he says that the matrix is empty, he doesn't mean that it has no entries, he means that all the row and column objects have been deleted.
To clarify, I'll label the rows with lower case letters and the columns with upper case letters, as follows:
| A | B | C
---------------
a | 1 | 1 | 0
---------------
b | 0 | 1 | 1
The algorithm states that you choose a column deterministically (using any rule you wish), and he suggests choosing a column with the fewest number of 1's. So, we'll proceed as you suggest and choose column A. The only row with a 1 in column A is row a, so we choose row a and add it to the possible solution { a }. Now, row a has 1s in columns A and B, so we must delete those columns, and any rows containing 1s in those columns, that is, rows a and b, just as you did. The resulting matrix has a single column C and no rows:
| C
-------
This is not an empty matrix (it has a column remaining). However, column C has no 1s in it, so we terminate unsuccessfully, as the algorithm indicates.
This may seem odd, but it is a very important case if we intend to use an incidence matrix for the Exact Cover Problem, because columns represent elements of the set X that we wish to cover and rows represents subsets of X. So a matrix with some columns and no rows represents the exact covering problem where the collection of subsets to choose from is empty (but there are still points to cover).
If this description causes problems for your implementation, there is a simple workaround: just include the empty set in every problem. The empty set (containing no points of X) is represented by a row of all zeros. It is never selected by your algorithm as part of a solution, never collides with any other selected rows, but always ensures that the matrix is nonempty (there is at least one row) until all the columns have been deleted, which is really all you care about since you need to make sure that each column is covered by some row.
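To see this concretely, here is a standard dictionary-of-sets rendering of Algorithm X (a sketch, not Knuth's dancing-links code): `X` maps each remaining column to the set of rows covering it, and `Y` maps each row to its columns. "The matrix is empty" here means `X` has no columns left, which is exactly the termination condition discussed above, and on the 2x3 example the algorithm correctly yields no solution:

```python
def solve(X, Y, solution):
    """Knuth's Algorithm X over dicts of sets.
    X: column -> set of rows with a 1 there; Y: row -> list of its columns.
    Succeeds only when *all columns* have been removed, per Knuth's paper."""
    if not X:
        yield list(solution)
        return
    c = min(X, key=lambda col: len(X[col]))   # column with the fewest 1s
    if not X[c]:                              # a column with no 1s: dead end
        return
    for r in list(X[c]):
        solution.append(r)
        removed = select(X, Y, r)
        yield from solve(X, Y, solution)
        deselect(X, Y, r, removed)
        solution.pop()

def select(X, Y, r):
    """Remove every column covered by row r, and every row clashing with r."""
    cols = []
    for j in Y[r]:
        for i in X[j]:
            for k in Y[i]:
                if k != j:
                    X[k].remove(i)
        cols.append(X.pop(j))
    return cols

def deselect(X, Y, r, cols):
    """Undo select(X, Y, r), restoring columns in reverse order."""
    for j in reversed(Y[r]):
        X[j] = cols.pop()
        for i in X[j]:
            for k in Y[i]:
                if k != j:
                    X[k].add(i)

def exact_cover(matrix):
    """All exact covers of a 0-1 matrix, as lists of row indices."""
    Y = {i: [j for j, v in enumerate(row) if v] for i, row in enumerate(matrix)}
    X = {j: {i for i in Y if j in Y[i]} for j in range(len(matrix[0]))}
    return list(solve(X, Y, []))
```

Running `exact_cover([[1, 1, 0], [0, 1, 1]])` returns an empty list: after selecting row a, column C survives with no rows, so the search terminates unsuccessfully.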

Hungarian algorithm matching one set to itself

I'm looking for a variation on the Hungarian algorithm (I think) that will pair N people with each other, excluding self-pairs and reverse-pairs, where N is even.
E.g. given N0 - N5 and a matrix C of costs for each pair, how can I obtain the set of 3 lowest-cost pairs?
C = [ [ - 5 6 1 9 4 ]
[ 5 - 4 8 6 2 ]
[ 6 4 - 3 7 6 ]
[ 1 8 3 - 8 9 ]
[ 9 6 7 8 - 5 ]
[ 4 2 6 9 5 - ] ]
In this example, the resulting pairs would be:
N0, N3
N1, N4
N2, N5
Having typed this out I'm now wondering if I can just increase the cost values in the "bottom half" of the matrix... or even better, remove them.
Is there a variation of Hungarian that works on a non-square matrix?
Or, is there another algorithm that solves this variation of the problem?
Increasing the values of the bottom half can result in an incorrect solution. You can see this because the corner coordinates of the upper half (in your example coordinates (0,1) and (5,6)) will always be considered to be among the minimum X pairs, where X is the size of the matrix.
My Solution for finding the minimum X pairs
Take the standard Hungarian algorithm
You can set the diagonal to a value greater than the sum of the elements in the unaltered matrix; depending on how your implementation handles nulls, this step may let you speed it up.
1) The first step of the standard algorithm is to go through each row, and then each column, reducing each row and column individually such that the minimum of each row and column is zero. This is unchanged.
The general principle of this solution, is to mirror every subsequent step of the original algorithm around the diagonal.
2) The next step of the algorithm is to select rows and columns so that every zero is included within the selection, using the minimum number of rows and columns.
My alteration to the algorithm means that when selecting a row/column, also select the column/row mirrored around that diagonal, but count it as one row or column selection for all purposes, including counting the diagonal (which will be the intersection of these mirrored row/column selection pairs) as only being selected once.
3) The next step is to check whether you have the right solution. In the standard algorithm this means checking whether the number of rows and columns selected equals the size of the matrix; in your example, whether six rows and columns have been selected.
For this variation however, when calculating when to end the algorithm treat each row/column mirrored pair of selections as a single row or column selection. If you have the right solution then end the algorithm here.
4) If the number of rows and columns is less than the size of the matrix, then find the smallest unselected element, and call it k. Subtract k from all uncovered elements, and add it to all elements that are covered twice (again, counting the mirrored row/column selection as a single selection).
My alteration of the algorithm means that when altering values, you will alter their mirrored values identically (this should happen naturally as a result of the mirrored selection process).
Then go back to step 2 and repeat steps 2-4 until step 3 indicates the algorithm is finished.
This will result in pairs of mirrored answers (which are the coordinates; to get the values at these coordinates, refer back to the original matrix). You can safely delete half of each pair arbitrarily.
To alter this algorithm to find the minimum R pairs, where R is less than the size of the matrix, reduce the stopping point in step 3 to R. This alteration is essential to answering your question.
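Because the mirrored variant is easy to get subtly wrong, an exhaustive baseline is useful for validating it. The sketch below (names are mine) enumerates all (N-1)!! ways to pair up the set, which is feasible for small N; the diagonal entries of C are irrelevant because self-pairs never arise:

```python
def min_cost_pairing(C):
    """Exhaustive minimum-cost pairing of {0, ..., n-1} (n even) into n/2
    unordered pairs. Enumerates (n-1)!! pairings: fine for small n, and
    handy as a correctness check for any faster variant."""
    best_cost = float('inf')
    best_pairs = None

    def rec(items, pairs, cost):
        nonlocal best_cost, best_pairs
        if not items:
            if cost < best_cost:
                best_cost, best_pairs = cost, list(pairs)
            return
        a = items[0]                      # always pair off the smallest item first
        for j in range(1, len(items)):
            b = items[j]
            rec(items[1:j] + items[j + 1:], pairs + [(a, b)], cost + C[a][b])

    rec(list(range(len(C))), [], 0)
    return best_cost, best_pairs
```

For the question's cost matrix (with the dashes on the diagonal replaced by 0), this baseline reports a minimum total cost of 10.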
As @Niklas B. stated, you are solving the weighted perfect matching problem.
Take a look at this; here is part of a document describing the primal-dual algorithm for weighted perfect matching.
Please read it all and let me know if it is useful to you.

Partitioning a Matrix

Given an N x N matrix, where N <= 25, with a positive integer in each cell, how can you partition it with at most K lines (each a straight horizontal or vertical line extending from one side to the other) so that the maximum block value (as determined by the partition) is minimized?
For example, given the following matrix
1 1 2
1 1 2
2 2 4
and we are allowed to use 2 lines to partition it, we should draw a line between column 2 and 3, as well as a line between rows 2 and 3, which gives the minimized maximum value, 4.
My first thought would be a bitmask representing the state of the lines, with 2 integers to represent it. However, this is too slow; I think the complexity is O(2^(2N)). Maybe you could solve it for the rows, then solve it for the columns?
Anyone have any ideas?
Edit: Here is the problem after I googled it: http://www.sciencedirect.com/science/article/pii/0166218X94001546
another paper: http://cis.poly.edu/suel/papers/pxp.pdf
I'm trying to read that^
You can try all subsets for vertical lines, and then do dynamic programming for horizontal lines.
Let's say you have fixed the set of vertical lines as S. Denote by D(i, S) the answer for the subproblem consisting of the first i rows of the matrix with the fixed set of vertical lines S. It is then straightforward to find a recurrence solving D(i, S) from subproblems of smaller size.
Overall complexity should be O(2^N * N^2) if you precompute the sizes of each submatrix in the beginning.
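Under the assumption that a block's value is the sum of its cells (which matches the example, where the answer is 4), this scheme can be sketched as follows; the function name and the exact DP layout are mine:

```python
def min_max_block(mat, K):
    """Split mat with at most K full horizontal/vertical lines so that the
    largest block sum is as small as possible: enumerate vertical cut sets
    by bitmask, then run a DP over horizontal cuts."""
    n, m = len(mat), len(mat[0])
    # 2D prefix sums: P[i][j] = sum of mat[0..i-1][0..j-1]
    P = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            P[i + 1][j + 1] = mat[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]

    def rect(r1, r2, c1, c2):              # sum over rows r1..r2-1, cols c1..c2-1
        return P[r2][c2] - P[r1][c2] - P[r2][c1] + P[r1][c1]

    best = float('inf')
    for vmask in range(1 << (m - 1)):      # vertical cuts after these columns
        vcuts = [c + 1 for c in range(m - 1) if vmask >> c & 1]
        if len(vcuts) > K:
            continue
        cols = [0] + vcuts + [m]
        hbudget = K - len(vcuts)           # horizontal cuts still allowed

        def blockmax(r1, r2):              # largest block sum in this row band
            return max(rect(r1, r2, cols[t], cols[t + 1])
                       for t in range(len(cols) - 1))

        INF = float('inf')
        # dp[i][h] = best value for the first i rows using at most h horizontal cuts
        dp = [[INF] * (hbudget + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for h in range(hbudget + 1):
                dp[i][h] = blockmax(0, i)  # no cut before row i
                for j in range(1, i):
                    if h >= 1:             # place the last cut after row j
                        dp[i][h] = min(dp[i][h],
                                       max(dp[j][h - 1], blockmax(j, i)))
        best = min(best, dp[n][hbudget])
    return best
```

On the 3x3 example with K = 2 this returns 4, matching the partition described in the question.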

How to find minimal subrectangle with at least K 1's in a 0-1 matrix

I have run into a problem: given an N x M 0-1 matrix and a number K (<= N x M), I have to find a subrectangle of that matrix with at least K 1's inside it. Furthermore, its area (the product of its two dimensions) should be minimized.
For example:
00000
01010
00100
01010
00000
K = 3
So I can find a subrectangle with minimal area 6 that contains 3 1's inside.
10
01
10
NOTE that the target subrectangle must consist of consecutive rows and columns of the original 0-1 matrix.
Compute cumulative sum of rows R[i,j] and columns C[i,j].
For top-left corner (i,j) of each possible sub-rectangle:
Starting from a single-row sub-rectangle (n=i),
Search the last possible column for this sub-rectangle (m).
While m>=j:
While there are at least K "ones" in this sub-rectangle:
If this is the smallest sub-rectangle so far, remember it.
Remove column (--m).
This decreases the number of "ones" by C[m+1,n]-C[m+1,j-1].
Add next row (++n).
This increases the number of "ones" by R[m,n]-R[i-1,n].
Time complexity is O(NM(N+M)).
Two nested loops may be optimized by changing linear search to binary search (to process skinny sub-rectangles faster).
Also it is possible (after adding a row/a column to the sub-rectangle) to decrease in O(1) time the number of columns/rows in such a way that the area of this sub-rectangle is not larger than the area of the best-so-far sub-rectangle.
Both these optimizations require calculation of any sub-rectangle weight in O(1). To make it possible, pre-calculate cumulative sum of all elements for sub-rectangles [1..i,1..j] (X[i,j]). Then the weight of any sub-rectangle [i..m,j..n] is computed as X[m,n]-X[i-1,n]-X[m,j-1]+X[i-1,j-1].
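The X table and the O(1) weight query can be sketched as follows (1-based indices, matching the formula above; function names are mine):

```python
def build_prefix(mat):
    """X[i][j] = number of ones in the sub-rectangle [1..i, 1..j] (1-based)."""
    n, m = len(mat), len(mat[0])
    X = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            X[i + 1][j + 1] = mat[i][j] + X[i][j + 1] + X[i + 1][j] - X[i][j]
    return X

def weight(X, i, j, m, n):
    """Ones in the sub-rectangle [i..m, j..n], all indices 1-based inclusive."""
    return X[m][n] - X[i - 1][n] - X[m][j - 1] + X[i - 1][j - 1]
```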
Compute cumulative sum of columns C[i,j].
For any starting row (k) of possible sub-rectangle:
For any ending row (l) of possible sub-rectangle:
Starting column (m = 1).
Ending column (n = 1).
While n is not out-of-bounds
While there are fewer than K "ones" in sub-rectangle [k..l,m..n]:
Add column (++n).
This increases the number of "ones" by C[l,n]-C[k-1,n].
If this is the smallest sub-rectangle so far, remember it.
Remove column (++m).
This decreases the number of "ones" by C[l,m-1]-C[k-1,m-1].
Time complexity is O(N^2 M).
Loop by 'l' may be terminated when all sub-rectangles, processed inside it, are single-column sub-rectangles (too many rows) or when no sub-rectangles, processed inside it, contain enough "ones" (not enough rows).
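The second scheme translates almost line for line into code. A sketch (names are mine; K is the required number of ones, and the function returns None when the whole matrix holds fewer than K ones):

```python
def min_area_with_k_ones(mat, K):
    """Minimal area of a contiguous subrectangle containing >= K ones.
    For each pair of rows, slide a two-pointer column window: O(N^2 M)."""
    n, m = len(mat), len(mat[0])
    # C[i][j]: ones in column j among rows 0..i-1 (cumulative column sums)
    C = [[0] * m for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            C[i + 1][j] = C[i][j] + mat[i][j]
    best = None
    for top in range(n):
        for bot in range(top, n):
            rows = bot - top + 1
            colcnt = lambda j: C[bot + 1][j] - C[top][j]  # ones in column j, rows top..bot
            left = 0
            ones = 0
            for right in range(m):
                ones += colcnt(right)
                while ones >= K:          # shrink from the left while still feasible
                    area = rows * (right - left + 1)
                    if best is None or area < best:
                        best = area
                    ones -= colcnt(left)
                    left += 1
    return best
```

On the 5x5 example with K = 3 this returns 6, matching the subrectangle shown in the question.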
Note: if the subrectangle were allowed to consist of arbitrary (not necessarily consecutive) subsets of rows and columns, the problem would be NP-hard, because the clique decision problem can be reduced to that variant; for it, there is no algorithm more efficient than brute force over all possible submatrices (unless P=NP). This does not contradict the polynomial algorithms above, which rely on the rows and columns being contiguous.
The clique decision problem can be reduced to your problem in the following way:
Let the matrix be the adjacency matrix of the graph.
Set K=L^2, where L is the size of the clique we are looking for.
Solve your problem on this input. The graph contains an L-clique iff the solution to your problem is an LxL submatrix containing only ones (which can be checked in polynomial time).
Off the top of my head: make a list of the coordinates of all ones in the matrix, find the smallest containing subrectangle for each K-combination of them*, then pick the smallest of those.
* The containing subrectangle is the one defined by the smallest and largest row and column indices appearing in the K-combination.
