Median of areas given in a matrix - algorithm

Given a matrix (n x n) of 1s and 0s, where 1 represents land and 0 represents water,
how can I find the median of the areas of the islands in the most efficient way?
For Example:
1 1 0 0 0
1 0 0 1 1
1 0 1 0 0
There are three islands with areas [1,2,4], and the median is 2.
An island consists of contiguous cells containing 1 that are connected horizontally or vertically, not diagonally.
For example:
1 0 1
0 1 0
This matrix contains three islands with areas [1,1,1].
My solution finds the areas recursively and then sorts them to find the median, which takes O(n^2 log(n^2)). Is there a more efficient way to do that?

First step, run DFS recursively on the grid and discover all the islands & calculate areas in O(n^2) time.
Second step, you can use the Median of Medians algorithm to find the median of the unsorted array of island areas in worst-case O(m) time, where m is the number of islands (a simpler quickselect gives expected O(m)).
Overall time complexity O(n^2).
If you need further help, I can provide my implementation.
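A minimal sketch of the two steps, not the answerer's own implementation. It assumes the grid is a list of lists of 0/1, uses an explicit stack instead of recursion to avoid Python's recursion limit, and swaps in randomized quickselect (expected linear time) for Median of Medians to keep the example short. For an even number of islands it returns the lower of the two middle values, which is an assumption since the question does not say how to break that tie. The function names are illustrative.

```python
import random

def island_areas(grid):
    """Return the area of every island (4-connected group of 1s)."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Flood fill from (r, c) with an explicit stack.
            seen[r][c] = True
            stack = [(r, c)]
            area = 0
            while stack:
                y, x = stack.pop()
                area += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            areas.append(area)
    return areas

def quickselect(values, index):
    """Return the element that would sit at `index` if `values` were sorted."""
    items = list(values)
    while True:
        pivot = random.choice(items)
        lows = [v for v in items if v < pivot]
        pivots = [v for v in items if v == pivot]
        if index < len(lows):
            items = lows
        elif index < len(lows) + len(pivots):
            return pivot
        else:
            index -= len(lows) + len(pivots)
            items = [v for v in items if v > pivot]

def median_island_area(grid):
    """Lower median of the island areas, or None if there is no land."""
    areas = island_areas(grid)
    if not areas:
        return None
    return quickselect(areas, (len(areas) - 1) // 2)
```

For the first example in the question, `median_island_area` returns 2.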

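Using a disjoint set (union-find), each union or find costs amortized O(A(N)), where A is the inverse Ackermann function and N = n^2 is the number of cells, so finding the islands costs O(N A(N)) overall. Then use nth_element (typically implemented as introselect) to pick the middle of the m island sizes in O(m) and so find the median.
sets = DisjointSet(matrix)
median = nth_element(sets, len(sets)/2)
The total is O(N A(N)), which is effectively linear in the number of cells. Every cell still has to be read, so this is not asymptotically below O(n^2).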
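A minimal union-find sketch of the disjoint-set approach, under the assumption that the grid is a list of lists of 0/1. `DisjointSet` in the pseudocode above is not a standard library class, so the structure is written out here with path halving and union by size; the function name is illustrative.

```python
def island_areas_union_find(grid):
    """Return island areas by merging adjacent land cells in a disjoint set."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    parent = list(range(rows * cols))   # cell (r, c) has id r * cols + c
    size = [1] * (rows * cols)

    def find(i):
        # Path halving: point every other node at its grandparent.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra          # attach the smaller tree under the larger
        size[ra] += size[rb]

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            # Looking only down and right visits each adjacent pair once.
            if r + 1 < rows and grid[r + 1][c] == 1:
                union(r * cols + c, (r + 1) * cols + c)
            if c + 1 < cols and grid[r][c + 1] == 1:
                union(r * cols + c, r * cols + c + 1)

    # Each land cell that is its own root represents one island.
    return [size[i] for i in range(rows * cols)
            if grid[i // cols][i % cols] == 1 and parent[i] == i]
```

Python has no built-in counterpart of C++'s `nth_element`, so the median would then come from any selection routine applied to the returned list.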

Related

Sort matrix elements around the diagonal

I am looking for an algorithm that can sort the rows of a matrix so that the elements will cumulate around the diagonal.
I will have a square matrix (around 80 rows/ columns) containing only the values 0 and 1. There are algorithms that sort the rows in a way that most of the elements with the value 1 are below the diagonal.
I need an algorithm that sorts the rows so as to minimize the mean distance of the elements to the diagonal.
Like so:
from:
0 1 0
1 0 1
1 1 0
to:
1 1 0
0 1 0
1 0 1
Since I am not familiar with this topic I hope that someone can help me. I am not looking for a complete solution. The name of such algorithm if it exists or a pseudo code would be sufficient.
Thanks a lot!
There is probably a more efficient way, but you could treat this problem as an assignment problem (trying to assign each row to a diagonal element).
This can be done in three steps:
1) Create a new matrix M where each entry M(i,j) contains the cost of assigning row i of your input matrix to the diagonal element j. For your example this matrix will be the following (average distance to the diagonal element):
1 0 1
1 1 1
0.5 0.5 1.5
Example: M(0,0) = 1 is the average distance when assigning row 0 of the input matrix (0 1 0) to the diagonal element positioned at 0.
2) Run an algorithm to find the best assignment (e.g., the Hungarian algorithm). This will give you an optimal 1:1 matching between rows and diagonal positions, minimizing the sum of the selected costs in the matrix.
The result will be the elements (0,1), (1,2) and (2,0)
3) Rearrange your input matrix using this knowledge. So
row 0 -> row 1
row 1 -> row 2
row 2 -> row 0
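The three steps can be sketched as follows. This is only an illustration under stated assumptions: the cost is taken to be the average absolute column distance of a row's 1s to the diagonal position, a row without any 1s is given cost 0, and the assignment is found by brute force over all permutations, which only works for tiny matrices. For the roughly 80 rows mentioned in the question, a real Hungarian-algorithm implementation such as `scipy.optimize.linear_sum_assignment` would be needed instead. The function names are illustrative.

```python
from itertools import permutations

def cost_matrix(rows):
    """costs[i][j] = average distance of the 1s in row i to column j."""
    n = len(rows)
    costs = []
    for row in rows:
        ones = [c for c, value in enumerate(row) if value == 1]
        costs.append([
            sum(abs(c - j) for c in ones) / len(ones) if ones else 0.0
            for j in range(n)
        ])
    return costs

def best_assignment(costs):
    """Brute-force stand-in for the Hungarian algorithm (tiny inputs only).

    Returns a tuple `target` where input row i goes to position target[i].
    """
    n = len(costs)
    return min(
        permutations(range(n)),
        key=lambda perm: sum(costs[i][perm[i]] for i in range(n)),
    )

def rearrange(rows):
    """Move every input row to the diagonal position it was assigned."""
    target = best_assignment(cost_matrix(rows))
    result = [None] * len(rows)
    for source, destination in enumerate(target):
        result[destination] = rows[source]
    return result
```

Calling `rearrange([[0, 1, 0], [1, 0, 1], [1, 1, 0]])` reproduces the "to" matrix from the question.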

algorithm to find consecutive wins (adjusted binomial distribution)

Imagine I have a roulette wheel and I want to feed my algorithm three parameters
p := probability to win one game
m := number of times the wheel is spun
k := number of consecutive wins I am interested in
and I am interested in the probability P that after spinning the wheel m times I win at least k consecutive games.
Let's go through an example where m = 5 and k = 3 and let's say 1 is a win and 0 a loss.
1 1 1 1 1
0 1 1 1 1
1 1 1 1 0
1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
As I understand it, these would be all the outcomes that win at least 3 consecutive games. For every k, I have (m-k+1) possible winning outcomes.
First question, is this true? Or would also 1 1 1 0 1 and 1 0 1 1 1 be possible solutions?
Next, what would a practical computation for this problem look like? First, I thought about the binomial distribution to solve this problem, where I just iterate over all k:
\textstyle {n \choose k}\,p^{k}(1-p)^{n-k}
But this somehow doesn't guarantee to have consecutive wins. Is the binomial distribution somehow adjustable to produce the output P I am looking for?
The following is an option you might want to consider:
Generate an array with entries 0...m, where entry i holds the probability that a run of k consecutive 1s has occurred somewhere within the first i spins.
All slots before k have probability 0, since there is no chance of k consecutive wins yet.
Slot k has the probability p^k.
All later positions are computed with a dynamic programming approach: for each position i from k+1 onwards, the value is the value at position i-1 plus p^k * (1-p) * (1 - the value at position i-1-k). The added term is the probability that the first run of k wins ends exactly at spin i: no run within the first i-k-1 spins, then a loss, then k wins.
This way you iterate through the array, and the last position holds the probability of at least k consecutive 1s.
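The recurrence above can be sketched in a few lines. The function names are illustrative, spins are assumed to be independent with the same win probability p, and the brute-force enumeration is included only as a cross-check for small m.

```python
from itertools import product

def prob_run_of_wins(p, m, k):
    """Probability of at least k consecutive wins within m spins."""
    if k <= 0:
        return 1.0
    if k > m:
        return 0.0
    prob = [0.0] * (m + 1)       # prob[i]: a run of k wins occurred in the first i spins
    prob[k] = p ** k
    for i in range(k + 1, m + 1):
        # Either the run already existed after i-1 spins, or it ends exactly at
        # spin i: no run in the first i-k-1 spins, a loss, then k wins.
        prob[i] = prob[i - 1] + p ** k * (1 - p) * (1 - prob[i - k - 1])
    return prob[m]

def prob_run_of_wins_brute_force(p, m, k):
    """Same probability by enumerating all 2^m outcomes (small m only)."""
    total = 0.0
    for outcome in product((0, 1), repeat=m):
        run = best = 0
        for spin in outcome:
            run = run + 1 if spin else 0
            best = max(best, run)
        if best >= k:
            wins = sum(outcome)
            total += p ** wins * (1 - p) ** (m - wins)
    return total
```

With p = 0.5, m = 5 and k = 3 both functions give 0.25, that is 8 of the 32 equally likely outcomes: the six listed in the question plus 1 1 1 0 1 and 1 0 1 1 1.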
Or would also 1 1 1 0 1 and 1 0 1 1 1 be possible solutions?
Yes, they would, since they also contain at least k consecutive wins.
First, I thought about the binomial distribution to solve this problem, where I just iterate over all k: \textstyle {n \choose k}\,p^{k}(1-p)^{n-k}
That might work if you combine it with an acceptance-rejection method: generate an outcome from the binomial, check for at least k winnings, accept if yes, drop otherwise.
Is the binomial distribution somehow adjustable to produce the output P I am looking for?
Frankly, I would take a look at the geometric distribution, which describes the number of wins before the first loss (or vice versa).

Count all possible contiguous submatrices with exactly K 1's - Binary Matrix

The problem:
Given a NxN 0-1 matrix and a number K, count all sub matrices (any size) with exactly K 1's.
Constraints: 2 <= N <= 600, 1 <= K <= 6
Example (the count below corresponds to K = 2)
1 0 1
0 0 0
1 0 1
Count: 8
With memoization of the sums of all possible submatrices, my algorithm has complexity O(n^4). I tried to combine various solutions for different but similar problems, with no luck. I can't think of a better way to reduce the time complexity. Could it be done in O(n^3)?
I have read the following:
Kadane's algorithm
Submatrix with maximum sum
Minimum area submatrix with sum K
Minimal subrectangle with at least K 1's
This is homework.
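On the O(n^3) question, one commonly used technique is sketched below; it is a possible approach rather than a confirmed answer from this thread. For every pair of top and bottom rows, the rows in between are collapsed into per-column sums, and the number of column ranges whose sum is exactly K is counted with a table of prefix-sum frequencies. The function name is illustrative, and the dictionary lookups are assumed to be O(1) on average.

```python
def count_submatrices_with_k_ones(matrix, k):
    """Count contiguous submatrices of a 0/1 matrix containing exactly k ones."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    total = 0
    for top in range(n_rows):
        col_sums = [0] * n_cols              # ones per column in rows top..bottom
        for bottom in range(top, n_rows):
            for c in range(n_cols):
                col_sums[c] += matrix[bottom][c]
            # Count column ranges with sum k: a range (left, right] qualifies
            # when prefix[right] - prefix[left] == k.
            seen = {0: 1}                    # prefix sum -> number of occurrences
            running = 0
            for c in range(n_cols):
                running += col_sums[c]
                total += seen.get(running - k, 0)
                seen[running] = seen.get(running, 0) + 1
    return total
```

For the 3x3 example in the question, this returns 8 when called with k = 2.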

The Puzzle on the graph

Given an undirected graph G=(V,E), each node i is associated with 'Ci' objects. At each step, for every node i, its Ci objects are divided up equally among i's neighbors. After K steps, output the number of objects held by the five nodes that have the most objects.
Here is one example of what happens in one step:
The objects of A are divided equally between B and C.
The objects of B are divided equally between A and C.
The objects of C are divided equally between A and B.
Some constraints:
|V|<10^5, |E|<2*10^5, K<10^7, Ci<1000
My current idea is: represent the transformation in each step with a matrix.
This problem is converted to the calculation of the power of matrix. But this solution is much too slow considering |V| can be 10^5.
Is there any faster way to do it?
The matrix equation for a single step is like M x = x', where x is a vector of current node contents, and x' is the contents after one step. That is, x' = M x. The contents at the step after that are x" = M x' = M(M x). An example of M follows, where the graph's adjacency matrix is shown at left. The column headed #nbr is the number of neighbors of nodes a, b ... e. Matrix M is formed from the adjacency matrix by replacing each 1 with the reciprocal of the number of ones in the same column.
     a  b  c  d  e   #nbr   matrix M
a    0  0  1  1  0    2     0    0    1/3  1/4  0
b    0  0  0  1  0    1     0    0    0    1/4  0
c    1  0  0  1  1    3     1/2  0    0    1/4  1/2
d    1  1  1  0  1    4     1/2  1    1/3  0    1/2
e    0  0  1  1  0    2     0    0    1/3  1/4  0
To do K steps starting with initial contents x, just compute (M^K) x. Use an exponentiation method that requires lg K matrix multiplications, lg representing logarithms to base 2. As matrix multiplication typically is of O(n^3) complexity, this method is O(lg K * n^3) if straightforwardly implemented, or O(lg K * n^2.376) if using Coppersmith/Winograd algorithm. The complexity can be reduced to O(n^2.376) – that is, we can drop the lg K multiplier – by diagonalizing M into form (P^-1)AP, from which M^K = (P^-1)(A^K)P, and A^K is an O(n lg K) operation, giving O(n^2.376) overall. Diagonalization typically costs O(n^3), but is O(n^2.376) using Coppersmith/Winograd algorithm.
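A small sketch of building M from the adjacency matrix and computing (M^K) x by repeated squaring, using the five-node example above. It uses dense lists and exact fractions for clarity, so it only illustrates the method on small graphs and is not meant to scale to the |V| in the question. It also assumes every node has at least one neighbor. The names `ADJ`, `transition_matrix`, `mat_pow` and `distribute` are illustrative.

```python
from fractions import Fraction

# Adjacency matrix of the example graph (nodes a, b, c, d, e).
ADJ = [
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 0, 1],
    [0, 0, 1, 1, 0],
]

def transition_matrix(adj):
    """Replace each 1 by the reciprocal of the number of ones in its column."""
    n = len(adj)
    degree = [sum(adj[r][c] for r in range(n)) for c in range(n)]
    return [[Fraction(1, degree[c]) if adj[r][c] else Fraction(0)
             for c in range(n)] for r in range(n)]

def mat_mul(a, b):
    """Plain O(n^3) product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, k):
    """Compute m^k with O(lg k) matrix multiplications (repeated squaring)."""
    n = len(m)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    base = m
    while k > 0:
        if k & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        k >>= 1
    return result

def distribute(adj, contents, steps):
    """Contents of every node after `steps` steps: (M^steps) x."""
    m_k = mat_pow(transition_matrix(adj), steps)
    n = len(contents)
    return [sum(m_k[i][j] * contents[j] for j in range(n)) for i in range(n)]
```

Because every column of M sums to 1, the total number of objects is the same after any number of steps, which gives a quick sanity check on the result.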

algorithm - How to sort a 0/1 array with 2n/3 comparisons?

In The Algorithm Design Manual, there is the following exercise:
4-26 Consider the problem of sorting a sequence of n 0’s and 1’s using
comparisons. For each comparison of two values x and y, the algorithm
learns which of x < y, x = y, or x > y holds.
(a) Give an algorithm to sort in n − 1 comparisons in the worst case.
Show that your algorithm is optimal.
(b) Give an algorithm to sort in 2n/3 comparisons in the average case
(assuming each of the n inputs is 0 or 1 with equal probability). Show
that your algorithm is optimal.
For (a), I think it is fairly easy. I can choose a[n-1] as the pivot, then do something like the quicksort partition: scan 0 to n - 2 and find the middle point where the left side is all 0 and the right side is all 1. This takes n - 1 comparisons.
But for (b), I can't get a clue. It says "each of the n inputs is 0 or 1 with equal probability", so I guess I can assume the numbers of 0s and 1s are equal? But how can I get a result that is related to 1/3? Divide the whole array into 3 groups?
Thanks
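The pivot idea described for part (a) can be sketched as follows. Each loop iteration models one three-way comparison against the last element, even though Python spells it as two tests; the function name and the returned comparison counter are illustrative additions.

```python
def sort_bits_with_pivot(a):
    """Sort a sequence of 0s and 1s with n - 1 three-way comparisons.

    Returns (sorted_list, number_of_comparisons).
    """
    if len(a) <= 1:
        return list(a), 0
    pivot = a[-1]
    less, equal, greater = [], [pivot], []
    comparisons = 0
    for x in a[:-1]:
        comparisons += 1              # one three-way comparison of x with the pivot
        if x < pivot:
            less.append(x)            # can only happen when pivot is 1
        elif x > pivot:
            greater.append(x)         # can only happen when pivot is 0
        else:
            equal.append(x)
    return less + equal + greater, comparisons
```

Since the values are only 0 and 1, at most one of `less` and `greater` is non-empty, so concatenating the three groups yields a sorted sequence.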
"0 or 1 with equal probability" is the condition for "average" case. Other cases may have worse timing.
Hint 1: 2/3 = 1/2 + 1/8 + 1/32 + 1/128 + ...
Hint 2: Consider the sequence as a sequence of pairs and compare the items in each pair. Half will return equal; half will not. Of the half that are unequal you know which item in the pair is 0 and which is 1, so those need no more comparisons.
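One way to read the two hints together, as a sketch rather than a full proof: suppose each round compares the surviving items in pairs, and only the pairs that come out equal (half of them on average) carry over to the next round as a single representative. Round one then uses n/2 comparisons and leaves about n/4 representatives, round two uses n/8 comparisons, and so on, so the expected total is the geometric series from Hint 1:

```latex
\frac{n}{2} + \frac{n}{8} + \frac{n}{32} + \cdots
  = \frac{n}{2} \sum_{i=0}^{\infty} \left(\frac{1}{4}\right)^{i}
  = \frac{n}{2} \cdot \frac{1}{1 - \frac{1}{4}}
  = \frac{2n}{3}
```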
No, it means that at any position, the input value has the same chance (probability) of being 0 or 1. This gives you a first clue: the number of comparisons your algorithm makes will be a random variable.
The runtime will depend on that random variable, and you need to take its expected value to obtain the average-case complexity. Note that in this case you have to be detailed in the complexity analysis, as a precise constant is required (2n/3 rather than simply O(n)).
Edit:
Hint: in the sorted array (the one you get at the end), what is the only thing that varies, given that there are only 2 possible elements?

Resources