What is the correct approach to solve this matrix - algorithm

We are given following matrix
5 7 9
7 8 4
4 2 9
We need to find maximum sum row or column and then we need to subtract 1 from each element of that row or column and then we need to repeat this operation for 3 times.

I will try to explain.
The matrix is n*n and the increment process is repeated for k times.
An o(n^2+k×log(n)) algorithm is possible.
If the sum of the biggest row/columns is a row so:
The row sum is increased by n.
All columns sum is increased by 1.
The two rules apply for columns as well.
For rule one store all rows/columns sum in 2 AVL Trees
(or every other data structure that support o(log(n)) insert and remove)
For rule two store rows/columns number of operations. (just two integers)
Now take the max of both trees where the two integers play a role for difference of the data staractures. Change it and change the other it and insert back.

Related

Given 2d matrix find minimum sum of elements such that element is chosen one from each row and column?

Find minimum sum of elements of n*n 2D matrix such that I have to choose one and only one element from each row and column ?
Eg
4 12
6 6
If I choose 4 from row 1 I cannot choose 12 from row 1 also from column 1 ,
I can only choose 6 from row 2 column 2.
So likewise minimum sum would be 4 + 6 = 10 where 6 is from second row second column
and not 6 + 12 = 18 where 6 is from second row first column
also 4 + 12 is not allowed since both are from same row
I thought of brute force where once i pick element from row and column I cannot pick another but this approach is O(n!)
.
Theorem: If a number is added to or subtracted from all
of the entries of any one row or column of the matrix,
then elements selected to get required minimum sum of the resulting matrix
are the same elements selected to get required minimum sum of the original matrix.
The Hungarian Algorithm (a special case of min-cost flow problem) uses this theorem to select those elements satisfying the constraints given in your problem :
Subtract the smallest entry in each row from all the entries of its
row.
Subtract the smallest entry in each column from all the entries
of its column.
Draw lines through appropriate rows and columns so that all the
zero entries of the cost matrix are covered and the minimum number of
such lines is used.
Test for Optimality
i. If the minimum number of covering lines is n, an optimal assignment of zeros is possible and we are finished.
ii. If the minimum number of covering lines is less than n, an optimal
assignment of zeros is not yet possible. In that case, proceed to Step 5.
Determine the smallest entry not covered by any line. Subtract
this entry from each uncovered row, and then add it to each covered
column. Return to Step 3.
See this example for better understanding.
Both O(n4) (easy and fast to implement) and O(n3) (harder to implement) implementations are explained in detail here.

Hungarian algorithm matching one set to itself

I'm looking for a variation on the Hungarian algorithm (I think) that will pair N people to themselves, excluding self-pairs and reverse-pairs, where N is even.
E.g. given N0 - N6 and a matrix C of costs for each pair, how can I obtain the set of 3 lowest-cost pairs?
C = [ [ - 5 6 1 9 4 ]
[ 5 - 4 8 6 2 ]
[ 6 4 - 3 7 6 ]
[ 1 8 3 - 8 9 ]
[ 9 6 7 8 - 5 ]
[ 4 2 6 9 5 - ] ]
In this example, the resulting pairs would be:
N0, N3
N1, N4
N2, N5
Having typed this out I'm now wondering if I can just increase the cost values in the "bottom half" of the matrix... or even better, remove them.
Is there a variation of Hungarian that works on a non-square matrix?
Or, is there another algorithm that solves this variation of the problem?
Increasing the values of the bottom half can result in an incorrect solution. You can see this as the corner coordinates (in your example coordinates 0,1 and 5,6) of the upper half will always be considered to be in the minimum X pairs, where X is the size of the matrix.
My Solution for finding the minimum X pairs
Take the standard Hungarian algorithm
You can set the diagonal to a value greater than the sum of the elements in the unaltered matrix — this step may allow you to speed up your implementation, depending on how your implementation handles nulls.
1) The first step of the standard algorithm is to go through each row, and then each column, reducing each row and column individually such that the minimum of each row and column is zero. This is unchanged.
The general principle of this solution, is to mirror every subsequent step of the original algorithm around the diagonal.
2) The next step of the algorithm is to select rows and columns so that every zero is included within the selection, using the minimum number of rows and columns.
My alteration to the algorithm means that when selecting a row/column, also select the column/row mirrored around that diagonal, but count it as one row or column selection for all purposes, including counting the diagonal (which will be the intersection of these mirrored row/column selection pairs) as only being selected once.
3) The next step is to check if you have the right solution — which in the standard algorithm means checking if the number of rows and columns selected is equal to the size of the matrix — in your example if six rows and columns have been selected.
For this variation however, when calculating when to end the algorithm treat each row/column mirrored pair of selections as a single row or column selection. If you have the right solution then end the algorithm here.
4) If the number of rows and columns is less than the size of the matrix, then find the smallest unselected element, and call it k. Subtract k from all uncovered elements, and add it to all elements that are covered twice (again, counting the mirrored row/column selection as a single selection).
My alteration of the algorithm means that when altering values, you will alter their mirrored values identically (this should happen naturally as a result of the mirrored selection process).
Then go back to step 2 and repeat steps 2-4 until step 3 indicates the algorithm is finished.
This will result in pairs of mirrored answers (which are the coordinates — to get the value of these coordinates refer back to the original matrix) — you can safely delete half of each pair arbitrarily.
To alter this algorithm to find the minimum R pairs, where R is less than the size of the matrix, reduce the stopping point in step 3 to R. This alteration is essential to answering your question.
As #Niklas B, stated you are solving Weighted perfect matching problem
take a look at this
here is part of document describing Primal-dual algorithm for weighted perfect matching
Please read all and let me know if is useful to you

Randomly choosing from a list with weighted probabilities

I have an array of N elements (representing the N letters of a given alphabet), and each cell of the array holds an integer value, that integer value meaning the number of occurrences in a given text of that letter. Now I want to randomly choose a letter from all of the letters in the alphabet, based on his number of appearances with the given constraints:
If the letter has a positive (nonzero) value, then it can be always chosen by the algorithm (with a bigger or smaller probability, of course).
If a letter A has a higher value than a letter B, then it has to be more likely to be chosen by the algorithm.
Now, taking that into account, I've come up with a simple algorithm that might do the job, but I was just wondering if there was a better thing to do. This seems to be quite fundamental, and I think there might be more clever things to do in order to accomplish this more efficiently. This is the algorithm i thought:
Add up all the frequencies in the array. Store it in SUM
Choosing up a random value from 0 to SUM. Store it in RAN
[While] RAN > 0, Starting from the first, visit each cell in the array (in order), and subtract the value of that cell from RAN
The last visited cell is the chosen one
So, is there a better thing to do than this? Am I missing something?
I'm aware most modern computers can compute this so fast I won't even notice if my algorithm is inefficient, so this is more of a theoretical question rather than a practical one.
I prefer an explained algorithm rather than just code for an answer, but If you're more comfortable providing your answer in code, I have no problem with that.
The idea:
Iterate through all the elements and set the value of each element as the cumulative frequency thus far.
Generate a random number between 1 and the sum of all frequencies
Do a binary search on the values for this number (finding the first value greater than or equal to the number).
Example:
Element A B C D
Frequency 1 4 3 2
Cumulative 1 5 8 10
Generate a random number in the range 1-10 (1+4+3+2 = 10, the same as the last value in the cumulative list), do a binary search, which will return values as follows:
Number Element returned
1 A
2 B
3 B
4 B
5 B
6 C
7 C
8 C
9 D
10 D
The Alias Method has amortized O(1) time per value generated, but requires two uniforms per lookup. Basically, you create a table where each column contains one of the values to be generated, a second value called an alias, and a conditional probability of choosing between the value and its alias. Use your first uniform to pick any of the columns with equal likelihood. Then choose between the primary value and the alias based on your second uniform. It takes a O(n log n) work to initially set up a valid table for n values, but after the table's built generating values is constant time. You can download this Ruby gem to see an actual implementation.
Two other very fast methods by Marsaglia et al. are described here. They have provided C implementations.

Generating Random Matrix With Pairwise Distinct Rows and Columns

I need to randomly generate an NxN matrix of integers in the range 1 to K inclusive such that all rows and columns individually have the property that their elements are pairwise distinct.
For example for N=2 and K=3
This is ok:
1 2
2 1
This is not:
1 3
1 2
(Notice that if K < N this is impossible)
When K is sufficiently larger than N an efficient enough algorithm is just to generate a random matrix of 1..K integers, check that each row and each column is pairwise distinct, and if it isn't try again.
But what about the case where K is not much larger than N?
This is not a full answer, but a warning about an intuitive solution that does not work.
I am assuming that by "randomly generate" you mean with uniform probability on all existing such matrices.
For N=2 and K=3, here are the possible matrices, up to permutations of the set [1..K]:
1 2 1 2 1 2
2 1 2 3 3 1
(since we are ignoring permutations of the set [1..K], we can assume wlog that the first line is 1 2).
Now, an intuitive (but incorrect) strategy would be to draw the matrix entries one by one, ensuring for each entry that it is distinct from the other entries on the same line or column.
To see why it's incorrect, consider that we have drawn this:
1 2
x .
and we are now drawing x. x can be 2 or 3, but if we gave each possibility the probability 1/2, then the matrix
1 2
3 1
would get probability 1/2 of being drawn at the end, while it should have only probability 1/3.
Here is a (textual) solution. I don't think it provides good randomness, but nevertherless it could be ok for your application.
Let's generate a matrix in the range [0;K-1] (you will do +1 for all elements if you want to) with the following algorithm:
Generate the first line with any random method you want.
Each number will be the first element of a random sequence calculated in such a manner that you are guarranteed to have no duplicate in subsequent rows, that is for any distinct column x and y, you will have x[i]!=y[i] for all i in [0;N-1].
Compute each row for the previous one.
All the algorithm is based on the random generator with the property I mentioned. With a quick search, I found that the Inversive congruential generator meets this requirement. It seems to be easy to implement. It works if K is prime; if K is not prime, see on the same page 'Compound Inversive Generators'. Maybe it will be a little tricky to handle with perfect squares or cubic numbers (your problem sound like sudoku :-) ), but I think it is possible by creating compound generators with prime factors of K and different parametrization. For all generators, the first element of each column is the seed.
Whatever the value of K, the complexity is only depending on N and is O(N^2).
Deterministically generate a matrix having the desired property for rows and columns. Provided K > N, this can easily be done by starting the ith row with i, and filling in the rest of the row with i+1, i+2, etc., wrapping back to 1 after K. Other algorithms are possible.
Randomly permute columns, then randomly permute rows.
Let's show that permuting rows (i.e. picking up entire rows and assembling a new matrix from them in some order, with each row possibly in a different vertical position) leaves the desired properties intact for both rows and columns, assuming they were true before. The same reasoning then holds for column permutations, and for any sequence of permutations of either kind.
Trivially, permuting rows cannot change the property that, within each row, no element appears more than once.
The effect of permuting rows on a particular column is to reorder the elements within that column. This holds for any column, and since reordering elements cannot produce duplicate elements where there were none before, permuting rows cannot change the property that, within each column, no element appears more than once.
I'm not certain whether this algorithm is capable of generating all possible satisfying matrices, or if it does, whether it will generate all possible satisfying matrices with equal probability. Another interesting question that I don't have an answer for is: How many rounds of row-permutation-then-column-permutation are needed? More precisely, is any finite sequence of row-perm-then-column-perm rounds equivalent to a bounded number of (or in particular, one) row-perm-then-column-perm round? If so then nothing is gained by further permutations after the first row and column permutations. Perhaps someone with a stronger mathematics background can comment. But it may be good enough in any case.

Arranging groups of people optimally

I have this homework assignment that I think I managed to solve, but am not entirely sure as I cannot prove my solution. I would like comments on what I did, its correctness and whether or not there's a better solution.
The problem is as follows: we have N groups of people, where group ihas g[i]people in it. We want to put these people on two rows of S seats each, such that: each group can only be put on a single row, in a contiguous sequence, OR if the group has an even number of members, we can split them in two and put them on two rows, but with the condition that they must form a rectangle (so they must have the same indices on both rows). What is the minimum number of seats S needed so that nobody is standing up?
Example: groups are 4 11. Minimum S is 11. We put all 4 in one row, and the 11 on the second row. Another: groups are 6 2. We split the 6 on two rows, and also the two. Minimum is therefore 4 seats.
This is what I'm thinking:
Calculate T = (sum of all groups + 1) / 2. Store the group numbers in an array, but split all the even values x in two values of x / 2 each. So 4 5 becomes 2 2 5. Now run subset sum on this vector, and find the minimum value higher than or equal to T that can be formed. That value is the minimum number of seats per row needed.
Example: 4 11 => 2 2 11 => T = (15 + 1) / 2 = 8. Minimum we can form from 2 2 11 that's >= 8 is 11, so that's the answer.
This seems to work, at least I can't find any counter example. I don't have a proof though. Intuitively, it seems to always be possible to arrange the people under the required conditions with the number of seats supplied by this algorithm.
Any hints are appreciated.
I think your solution is correct. The minimum number of seats per row in an optimal distribution would be your T (which is mathematically obvious).
Splitting even numbers is also correct, since they have two possible arrangements; by logically putting all the "rectangular" groups of people on one end of the seat rows you can also guarantee that they will always form a proper rectangle, so that this condition is met as well.
So the question boils down to finding a sum equal or as close as possible to T (e.g. partition problem).
Minor nit: I'm not sure if the proposed solution above works in the edge case where each group has 0 members, because your numerator in T = SUM ALL + 1 / 2 is always positive, so there will never be a subset sum that is greater than or equal to T.
To get around this, maybe a modulus operation might work here. We know that we need at least n seats in a row if n is the maximal odd term, so maybe the equation should have a max(n * (n % 2)) term in it. It will come out to max(odd) or 0. Since the maximal odd term is always added to S, I think this is safe (stated boldly without proof...).
Then we want to know if we should split the even terms or not. Here's where the subset sum approach might work, but with T simply equal to SUM ALL / 2.

Resources