Binary matrix initialization algorithm - algorithm

I have to initialize a matrix (of size Nx(N-1)) with 0's and 1's, where every time I put a 0 (or 1) in the [i, j] position, a 1 (or 0) is placed in the matrix too, in the [k, j] position, where k is a random row different to i (notice that the column is the same).
The condition to satisfy is that there cannot be more than 2 consecutive 0's or 1's in the same row.
So, the next matrix would be invalid:
0 1 1
1 1 1
1 0 1
0 1 1
And this one would be valid:
0 1 1
1 0 0
1 1 0
0 0 1
Any ideas on how I can implement this initialization? Suppose the matrix starts with -1's in it.

First of all, for such a matrix to exist (every (i,j) tuple having a unique opposite (k,j) tuple), N has to be a multiple of 2.
Step 1: Create a fully random (N/2) x (N-1) matrix.
Step 2: Correct this matrix so that no row contains three equal consecutive values, using the following algorithm:
for row in matrix:
    for i in range(0, len(row) - 2):
        if row[i] == row[i + 1] == row[i + 2]:
            row[i + 2] = 1 if row[i + 2] == 0 else 0
Step 3: For every row, append the inverse row, in order (N/2 new rows in total).
for row_index in range(N // 2):
    row = matrix[row_index]
    new_row = [0 if x == 1 else 1 for x in row]
    matrix.append(new_row)
Step 4: For every (i, j), generate a random p in [0, N), swap (i, j) with (p, j), and check whether the swap creates three equal consecutive values in either row; if it does, swap them back.
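The four steps can be put together roughly as follows. This is only a sketch: the names init_matrix and has_run_of_three and the optional rng argument are made up for illustration and are not part of the answer.

```python
import random


def has_run_of_three(row):
    """Return True if the row holds three equal consecutive values."""
    return any(row[i] == row[i + 1] == row[i + 2] for i in range(len(row) - 2))


def init_matrix(n, rng=None):
    """Build an n x (n-1) matrix of 0's and 1's following steps 1 to 4."""
    if n % 2:
        raise ValueError("n must be even")
    rng = rng or random.Random()
    cols = n - 1

    # Step 1: random top half.
    half = [[rng.randint(0, 1) for _ in range(cols)] for _ in range(n // 2)]

    # Step 2: flip the third element of every run of three equal values.
    for row in half:
        for i in range(len(row) - 2):
            if row[i] == row[i + 1] == row[i + 2]:
                row[i + 2] = 1 - row[i + 2]

    # Step 3: append the inverse of each row, in order.
    matrix = half + [[1 - x for x in row] for row in half]

    # Step 4: swap entries within a column, undoing swaps that break a row.
    for i in range(n):
        for j in range(cols):
            p = rng.randrange(n)
            matrix[i][j], matrix[p][j] = matrix[p][j], matrix[i][j]
            if has_run_of_three(matrix[i]) or has_run_of_three(matrix[p]):
                matrix[i][j], matrix[p][j] = matrix[p][j], matrix[i][j]
    return matrix
```

Because step 4 only swaps entries inside one column, every column keeps exactly N/2 zeros and N/2 ones, so each 0 still has a matching 1 in some other row of the same column.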

Related

Most efficient way of finding submatrices of a matrix [matlab]

Say we have a matrix of zeros and ones
0 1 1 1 0 0 0
1 1 1 1 0 1 1
0 0 1 0 0 1 0
0 1 1 0 1 1 1
0 0 0 0 0 0 1
0 0 0 0 0 0 1
and we want to find all the submatrices (we just need the row indices and column indices of the corners) with these properties:
contain at least L ones and L zeros
contain max H elements
For example, take the previous matrix with L=1 and H=5. The submatrix 1 2 1 4 (row indices 1 2 and column indices 1 4)
0 1 1 1
1 1 1 1
satisfies the property 1 but has 8 elements (bigger than 5) so it is not good;
the matrix 4 5 1 2
0 1
0 0
is good because it satisfies both properties.
The objective is then to find all the submatrices with min area 2*L and max area H that contain at least L ones and L zeros.
If we consider a matrix as a rectangle, it is easy to find all the possible subrectangles with max area H and min area 2*L by looking at the divisors of all the numbers from H down to 2*L.
For example, with H=5 and L=1 all the possible subrectangles/submatrices are given by the divisors of
H=5 -> divisors [1 5] -> possible rectangles of area 5 are 1x5 and 5x1
4 -> divisors [1 2 4] -> possible rectangles of area 4 are 1x4, 4x1 and 2x2
3 -> divisors [1 3] -> possible rectangles of area 3 are 3x1 and 1x3
2*L=2 -> divisors [1 2] -> possible rectangles of area 2 are 2x1 and 1x2
I wrote this code, which finds the divisors of each number and cycles over them to find the submatrices. Take a 1x5 submatrix as an example: the code fixes the first row of the matrix and moves the submatrix one column at a time from the left edge to the right edge, then it fixes the second row and moves the submatrix from left to right again, and so on until it reaches the last row.
It does this for all the 1x5 submatrices, then it considers the 5x1 submatrices, then the 1x4, then the 4x1, then the 2x2, etc.
The code does the job in 2 seconds (it finds all the submatrices), but for big matrices, e.g. 200x200, it needs many minutes. So I wonder whether there are more efficient ways to do the job, and which one is the most efficient.
This is my code:
clc;clear all;close all
%% INPUT
P= [0 1 1 1 0 0 0 ;
1 1 1 1 0 1 1 ;
0 0 1 0 0 1 0 ;
0 1 1 0 1 1 1 ;
0 0 0 0 0 0 1 ;
0 0 0 0 0 0 1];
L=1; % a submatrix has to contain at least L ones and L zeros
H=5; % max area of a submatrix
[R,C]=size(P); % rows and columns of P
sub=zeros(1,6); % initializing the matrix containing the indexes of each submatrix (columns 1-4), their area (5) and the counter (6)
counter=1; % no. of submatrices found
%% FIND ALL RECTANGLES OF AREA >= 2*L & <= H
%
% idea: all rectangles of a certain area can be found using the area's divisors
% e.g. divisors(6)=[1 2 3 6] -> rectangles: 1x6 6x1 2x3 and 3x2
tic
for sH = H:-1:2*L % find rectangles of area H, H-1, ..., 2*L
    div_sH=divisors(sH); % find all divisors of sH
    disp(['_______AREA ', num2str(sH), '_______'])
    for i = 1:round(length(div_sH)/2) % cycle over all couples of divisors
        div_small=div_sH(i);
        div_big=div_sH(end-i+1);
        if div_small <= R && div_big <= C % rectangle with long side <= C and short side <= R
            for j = 1:R-div_small+1 % cycle over all possible rows
                for k = 1:C-div_big+1 % cycle over all possible columns
                    no_of_ones=length(find(P(j:j-1+div_small,k:k-1+div_big))); % no. of ones in the current submatrix
                    if no_of_ones >= L && no_of_ones <= sH-L % if the submatrix contains at least L ones AND L zeros
                        % row indexes columns indexes area position
                        sub(counter,:)=[j,j-1+div_small , k,k-1+div_big , div_small*div_big , counter]; % save the submatrix
                        counter=counter+1;
                    end
                end
            end
            disp([' [', num2str(div_small), 'x', num2str(div_big), '] submatrices: ', num2str(size(sub,1))])
        end
        if div_small~=div_big % if the submatrix is a square, skip this part (otherwise there will be duplicates in sub)
            if div_small <= C && div_big <= R % rectangle with long side <= R and short side <= C
                for j = 1:C-div_small+1 % cycle over all possible columns
                    for k = 1:R-div_big+1 % cycle over all possible rows
                        no_of_ones=length(find(P(k:k-1+div_big,j:j-1+div_small)));
                        if no_of_ones >= L && no_of_ones <= sH-L
                            sub(counter,:)=[k,k-1+div_big,j,j-1+div_small , div_big*div_small, counter];
                            counter=counter+1;
                        end
                    end
                end
                disp([' [', num2str(div_big), 'x', num2str(div_small), '] submatrices: ', num2str(size(sub,1))])
            end
        end
    end
end
fprintf('\ntime: %2.2fs\n\n',toc)
Here is a solution centered around 2D matrix convolution. The rough idea is to convolve P, once per submatrix shape, with a second matrix such that each element of the result holds the number of ones in the submatrix whose top left corner is at that element. This way you get all solutions for a single shape in one go, without looping over rows and columns, which speeds things up greatly (it takes less than a second for a 200x200 matrix on my 8-year-old laptop).
P= [0 1 1 1 0 0 0
1 1 1 1 0 1 1
0 0 1 0 0 1 0
0 1 1 0 1 1 1
0 0 0 0 0 0 1
0 0 0 0 0 0 1];
L=1; % a submatrix has to contain at least L ones and L zeros
H=5; % max area of a submatrix
submats = [];
for sH = H:-1:2*L
    div_sH=divisors(sH); % find all divisors of sH
    for i = 1:length(div_sH) % cycle over all couples of divisors
        % number of rows of the current submatrix
        nrows=div_sH(i);
        % number of columns of the current submatrix
        ncols=div_sH(end-i+1);
        % prepare matrix to convolve P with
        m = zeros(nrows*2-1,ncols*2-1);
        m(1:nrows,1:ncols) = 1;
        % number of ones in each submatrix, stored at its top left corner
        submatsums = conv2(P,m,'same');
        % mark positions where the submatrix would go outside P as invalid
        validsums = zeros(size(P))-1;
        validsums(1:(end-nrows+1),1:(end-ncols+1)) = submatsums(1:(end-nrows+1),1:(end-ncols+1));
        % get the indexes where the number of ones and zeros is >= L
        topLeftIdx = find(validsums >= L & validsums<=sH-L);
        % save submatrices in the following format: [index, nrows, ncols]
        % You can of course use something different, but it seemed the simplest way to me
        submats = [submats ; [topLeftIdx bsxfun(@times,[nrows ncols],ones(length(topLeftIdx),1))]];
    end
end
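For readers without MATLAB, the same "count the ones of every window in one pass" idea can be sketched in plain Python. Note the swap: this uses a summed-area table (2D prefix sums) instead of conv2, and find_submatrices with its 0-based (top, bottom, left, right) output is an assumed interface, not the answer's.

```python
def find_submatrices(p, low, high):
    """Return 0-based inclusive corners (top, bottom, left, right) of every
    submatrix with area <= high that holds at least `low` ones and `low` zeros."""
    rows, cols = len(p), len(p[0])
    # pre[i][j] = number of ones in the first i rows and first j columns of p.
    pre = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            pre[i + 1][j + 1] = p[i][j] + pre[i][j + 1] + pre[i + 1][j] - pre[i][j]

    found = []
    for h in range(1, min(rows, high) + 1):
        for w in range(1, min(cols, high // h) + 1):
            area = h * w
            if area < 2 * low:
                continue  # too small to hold `low` ones and `low` zeros
            for top in range(rows - h + 1):
                for left in range(cols - w + 1):
                    # Number of ones in the h x w window, in O(1).
                    ones = (pre[top + h][left + w] - pre[top][left + w]
                            - pre[top + h][left] + pre[top][left])
                    if low <= ones <= area - low:
                        found.append((top, top + h - 1, left, left + w - 1))
    return found
```

With the example matrix, L=1 and H=5, the result contains (3, 4, 0, 1), which is the "4 5 1 2" submatrix of the question in 0-based indices.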
First, I suggest that you combine finding the allowable sub-matrix sizes.
for smaller = 1:floor(sqrt(H))
    for larger = max(smaller, ceil(2*L/smaller)):floor(H/smaller)
        % add smaller x larger and larger x smaller to your shapes list
Next, start with the smallest rectangles in the shapes. Note that any solution to a small rectangle can be extended in any direction, to the area limit of H, and the added elements will not invalidate the solution you found. This will identify many solutions without bothering to check the populations within.
Keep track of the solutions you've found. As you work your way toward larger rectangles, you can avoid checking anything already in your solutions set. If you keep that in a hash table, checking membership is O(1). All you'll need to check thereafter will be larger blocks of mostly-1 adjacent to mostly-0. This should speed up the processing somewhat.
Is that enough of a nudge to help?
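The shape enumeration suggested in this answer might look like the sketch below. The function name allowed_shapes is hypothetical, and the lower bound on the longer side (at least as long as the shorter side, and long enough to reach area 2*L) is my reading of the intent.

```python
import math


def allowed_shapes(low, high):
    """List every (rows, cols) shape with 2*low <= rows*cols <= high,
    sorted by area so the smallest rectangles come first."""
    shapes = set()
    for smaller in range(1, math.isqrt(high) + 1):
        # Shortest admissible long side: ceil(2*low / smaller), at least `smaller`.
        first = max(smaller, -((-2 * low) // smaller))
        for larger in range(first, high // smaller + 1):
            shapes.add((smaller, larger))
            shapes.add((larger, smaller))
    return sorted(shapes, key=lambda s: (s[0] * s[1], s))
```

For L=1 and H=5 this yields the nine shapes listed in the question (1x2, 2x1, 1x3, 3x1, 1x4, 2x2, 4x1, 1x5, 5x1).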

Counting subrows in each row of a matrix in Matlab?

I need an algorithm in Matlab which counts, without using loops, how many adjacent and non-overlapping (1,1) pairs there are in each row of an m x (n*2) matrix A. E.g.
A=[1 1 1 0 1 1 0 0 0 1; 1 0 1 1 1 1 0 0 1 1] %m=2, n=5
Then I want
B=[2;3] %mx1
Specific case
Assuming A to have ones and zeros only, this could be one way -
B = sum(reshape(sum(reshape(A',2,[]))==2,size(A,2)/2,[]))
General case
If you are looking for a general approach that must work for all integers and a case where you can specify the pattern of numbers, you may use this -
patt = [0 1] %// pattern to be found
B = sum(reshape(ismember(reshape(A',2,[])',patt,'rows'),[],size(A,1)))
Output
With patt = [1 1], B = [2 3]
With patt = [0 1], B = [1 0]
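As a cross-check of the outputs above, here is the same counting written out in plain Python. It is only a sketch: count_pairs is a name chosen for illustration, and pairs are taken at fixed positions (columns 1-2, 3-4, ...), as the reshape does.

```python
def count_pairs(a, patt=(1, 1)):
    """Count, for each row, the non-overlapping pairs at fixed positions
    (columns 0-1, 2-3, ...) that are equal to patt."""
    patt = tuple(patt)
    return [
        sum(1 for k in range(0, len(row) - 1, 2) if (row[k], row[k + 1]) == patt)
        for row in a
    ]
```

Calling count_pairs(A) with the example A gives [2, 3], and count_pairs(A, (0, 1)) gives [1, 0].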
You can use transpose, then reshape, so that each pair of consecutive values ends up in one column; then compare the top and bottom rows (boolean compare, or compare the sum of each column to 2), sum the result of the comparison, and reshape the result to your liking.
In code, it would look like:
A=[1 1 1 0 1 1 0 0 0 1; 1 0 1 1 1 1 0 0 1 1] ;
m = size(A,1) ;
n = size(A,2)/2 ;
Atemp = reshape(A.' , 2 , [] , m ) ;
B = squeeze(sum(sum(Atemp)==2))
You could pack everything in one line of code if you want, but several lines is usually easier for comprehension. For clarity, the Atemp matrix looks like that:
Atemp(:,:,1) =
1 1 1 0 0
1 0 1 0 1
Atemp(:,:,2) =
1 1 1 0 1
0 1 1 0 1
You'll notice that each row of the original A matrix has been broken down into 2 rows, with each pair of consecutive elements forming one column. The second line simply compares the sum of each column with 2, then sums the results of the comparisons.
The squeeze command is only to remove the singleton dimensions not necessary anymore.
You can use imresize, for example:
imresize(A,[size(A,1),size(A,2)/2])>0.8
ans =
1 0 1 0 0
0 1 1 0 1
this places 1 where you have [1 1] pairs... then you can just use sum
For any pair type [x y], you can do:
x=0; y=1;
R=zeros(size(A,1),size(A,2)/2); % preallocating memory
for n=1:size(A,1)
    b=[A(n,1:2:end)' A(n,2:2:end)'];
    R(n,b(:,1)==x & b(:,2)==y)=1;
end
R =
0 0 0 0 1
0 0 0 0 0
With diff (to detect start and end of each run of ones) and accumarray (to group runs of the same row; each run contributes half its length rounded down):
B = diff([zeros(1,size(A,1)); A.'; zeros(1,size(A,1))]); %'// columnwise is easier
[is js] = find(B==1); %// rows and columns of starts of runs of ones
[ie je] = find(B==-1); %// rows and columns of ends of runs of ones
result = accumarray(js, floor((ie-is)/2)); %// sum values for each row of A
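The run-based counting can also be sketched in Python; count_pairs_in_runs is a hypothetical name. Unlike the reshape-based answers, this counts pairs inside each run of ones wherever the run starts, so a row such as 0 1 1 0 yields 1 here but 0 when pairs are tied to columns 1-2, 3-4, and so on.

```python
from itertools import groupby


def count_pairs_in_runs(a):
    """For each row, sum floor(run_length / 2) over all runs of ones."""
    return [
        sum(len(list(run)) // 2 for value, run in groupby(row) if value == 1)
        for row in a
    ]
```

On the example A both interpretations agree and give [2, 3].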

permutation matrix

Is it possible to decompose a matrix A with n rows and n columns into a sum of m [n x n] permutation matrices, where m is the number of 1's in each row and each column of A?
UPDATE:
Yes, this is possible. I came across such an example, which is shown below, but how can we generalize the answer?
What you want is called a 1-factorization. One algorithm is repeatedly to find a perfect matching and remove it; probably there are others.
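A rough sketch of "find a perfect matching and remove it, repeatedly" follows. The matching step uses the standard augmenting-path method for bipartite graphs, which is my choice and not something the answer specifies; both function names are made up, and a permutation is returned as a list perm where perm[r] is the column of the 1 in row r.

```python
def perfect_matching(a):
    """Return perm with a[r][perm[r]] > 0 for every row r, or None."""
    n = len(a)
    match_col = [-1] * n  # match_col[c] = row currently matched to column c

    def try_row(r, seen):
        # Look for a free column for row r, re-routing earlier rows if needed.
        for c in range(n):
            if a[r][c] and not seen[c]:
                seen[c] = True
                if match_col[c] == -1 or try_row(match_col[c], seen):
                    match_col[c] = r
                    return True
        return False

    for r in range(n):
        if not try_row(r, [False] * n):
            return None
    perm = [0] * n
    for c, r in enumerate(match_col):
        perm[r] = c
    return perm


def decompose_into_permutations(a):
    """Split a into permutation matrices by removing one matching at a time."""
    a = [row[:] for row in a]  # work on a copy
    perms = []
    while any(any(row) for row in a):
        perm = perfect_matching(a)
        if perm is None:
            raise ValueError("matrix is not a sum of permutation matrices")
        for r, c in enumerate(perm):
            a[r][c] -= 1
        perms.append(perm)
    return perms
```

For a 0/1 matrix with m ones in every row and column, each removal leaves m-1 ones in every row and column, so the loop runs exactly m times.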
For the first permutation matrix, take the first 1 in the first row. For the second row, take the first 1 that is in a column you don't already have. For the third row, take the first 1 that is in a column you don't already have. And so on. Do this for all rows.
You now have one permutation matrix.
Next subtract your first permutation matrix from the original. This new matrix now has m-1 ones in each row and column. So repeat the process m-1 more times, and you'll have your m permutation matrices.
You can skip the last step, because a matrix with one 1 in each row and column already is a permutation matrix. There's no need to do any calculations.
This is a greedy algorithm that doesn't always work. We can make it work by changing the selection rule slightly. See below:
For your example:
    1 0 1 1
A = 1 1 0 1
    1 1 1 0
    0 1 1 1
In the first step, we pick (1,1) for the first row, (2,2) for the second row, (3,3) for the third row and (4,4) for the fourth row. We then have:
    1 0 0 0     0 0 1 1
A = 0 1 0 0  +  1 0 0 1
    0 0 1 0     1 1 0 0
    0 0 0 1     0 1 1 0
The first matrix is a permutation matrix. The second matrix has exactly two 1's in each row and column. So we pick, in order: (1,3), (2,1), (3,2) and... we're in trouble: the rows that contain a 1 in column 4 have already been used.
So how do we fix this? Well, we can keep track of the number of 1's remaining in each column. Instead of picking the first column that is unused, we pick the column with the lowest number of 1's remaining. For the second matrix above:
    0 0 1 1      0 0 X 0      0 0 X 0      0 0 X 0
B = 1 0 0 1 -->  1 0 0 1 -->  0 0 0 X -->  0 0 0 X
    1 1 0 0      1 1 0 0      1 1 0 0      X 0 0 0
    0 1 1 0      0 1 1 0      0 1 1 0      0 1 1 0
    -------      -------      -------      -------
    2 2 2 2      2 2 X 1      1 2 X X      X 1 X X
So we would pick column 4 in the second step, column 1 in the 3rd step, and column 2 in the 4th step.
There can always be only one column with one remaining 1. The other 1's must have been taken away in m-1 previous rows. If you had two such columns, one of them would have had to have been picked as the minimum column before.
This can be done easily using a recursive (backtracking OR depth-first traversal) algorithm. Here is the pseudo-code for its solution:
void printPermutationMatrices(const int origMat[][], int permutMat[], int curRow, const int n){
    // permutMat is a 1-D array: permutMat[i] holds the column where the 1 is placed in row i
    if(curRow == n){ // base case: every row has a column assigned
        // do stuff with permutMat[]
        printPermutMat(permutMat);
        return;
    }
    for(int col=0; col<n; col++){ // try to place a 1 in curRow in each col, then recurse into the next row
        if(origMat[curRow][col] != 1) continue;
        bool conflict = false; // conflict: an earlier row already uses this column
        for(int r=0; r<curRow; r++){
            if(permutMat[r] == col){ conflict = true; break; }
        }
        if(!conflict){
            permutMat[curRow] = col; // choose this col for curRow
            printPermutationMatrices(origMat, permutMat, curRow+1, n);
        }
    }
}
Here is how to call from your main function:
int[] permutMat = new int[n];
printPermutationMatrices(originalMatrix, permutMat, 0, n);

Turning an array of integers into an array of nonnegative integers

Start with an array of integers so that the sum of the values is some positive integer S. The following routine always terminates in the same number of steps with the same results. Why is this?
Start with an array x = [x_0, x_1, ..., x_N-1] such that all x_i's are integers. While there is a negative entry, do the following:
Choose any index i such that x_i < 0.
Add x_i (a negative number) to x_((i-1) mod N).
Add x_i (a negative number) to x_((i+1) mod N).
Replace x_i with -x_i (a positive number).
This process maintains the property that x_0 + x_1 + ... + x_N-1 = S. For any given starting array x, no matter which index is chosen at any step, the number of times one goes through these steps is the same as is the resulting vector. It is not even obvious (to me, at least) that this process terminates in finite time, let alone has this nice invariant property.
EXAMPLE:
Take x = [4 , -1, -2] and flipping x_1 to start, the result is
[4, -1, -2]
[3, 1, -3]
[0, -2, 3]
[-2, 2, 1]
[2, 0, -1]
[1, -1, 1]
[0, 1, 0]
On the other hand, flipping x_2 to start gives
[4, -1, -2]
[2, -3, 2]
[-1, 3, -1]
[1, 2, -2]
[-1, 0, 2]
[1, -1, 1]
[0, 1, 0]
The only other possibility is to flip x_2 instead of x_0 at the third array, [-1, 3, -1]; this gives the same sequence as above with each array from that point on reversed. In all cases, 6 steps lead to [0, 1, 0].
I have an argument for why this is true, but it seems to me to be overly complicated (it has to do with Coxeter groups). Does anyone have a more direct way to think about why this happens? Even finding a reason why this should terminate would be great.
Bonus points to anyone who finds a way to determine the number of steps for a given array (without going through the process).
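The claim is easy to try out numerically. The sketch below picks the negative entry at random on every step; the function name, the rng argument and the max_steps safety cap are all made up for illustration.

```python
import random


def flip_until_nonnegative(x, rng=None, max_steps=10_000):
    """Run the process with random index choices; return (final array, steps)."""
    rng = rng or random.Random()
    x = list(x)
    n = len(x)
    steps = 0
    while True:
        negatives = [i for i, v in enumerate(x) if v < 0]
        if not negatives:
            return x, steps
        if steps >= max_steps:
            raise RuntimeError("did not terminate within max_steps")
        i = rng.choice(negatives)
        x[(i - 1) % n] += x[i]  # add the negative value to the left neighbour
        x[(i + 1) % n] += x[i]  # and to the right neighbour
        x[i] = -x[i]            # then flip its sign
        steps += 1
```

Running it on [4, -1, -2] with different seeds gives ([0, 1, 0], 6) every time, matching the hand-worked sequences above.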
I think the easiest way to see why the output vector and the number of steps are the same no matter what index you choose at each step is to look at the problem as a bunch of matrix and vector multiplications.
For the case where x has 3 components, think of x as a 3x1 vector: x = [x_0 x_1 x_2]' (where ' is the transpose operation). Each iteration of the loop will choose to flip one of x_0,x_1,x_2, and the operation it performs on x is identical to multiplication by one of the following matrices:
      -1  0  0           1  1  0           1  0  1
s_0 =  1  1  0    s_1 =  0 -1  0    s_2 =  0  1  1
       1  0  1           0  1  1           0  0 -1
where multiplication by s_0 is the operation performed if the index i=0, s_1 corresponds to i=1, and s_2 corresponds to i=2. With this view, you can interpret the algorithm as multiplying the corresponding s_i matrix by x at each iteration. So in the first example where x_1 is flipped at the start, the algorithm computes: s_1*s_2*s_0*s_1*s_2*s_1[4 -1 -2]' = [0 1 0]'
The fact that the index you choose doesn't affect the final output vector arises from two interesting properties of the s matrices. First, s_i*s_(i-1)*s_i = s_(i-1)*s_i*s_(i-1), where i-1 is computed modulo n, the number of matrices. This property is the only one needed to see why you get the same result in the examples with 3 elements:
s_1*s_2*s_0*s_1*s_2*s_1 = s_1*s_2*s_0*(s_1*s_2*s_1) = s_1*s_2*s_0*(s_2*s_1*s_2), which corresponds to choosing x_2 at the start, and lastly:
s_1*s_2*s_0*s_2*s_1*s_2 = s_1*(s_2*s_0*s_2)*s_1*s_2 = s_1*(s_0*s_2*s_0)*s_1*s_2, which corresponds to choosing to flip x_2 at the start, but then choosing to flip x_0 in the third iteration.
The second property only applies when x has 4 or more elements. It is s_i*s_k = s_k*s_i whenever i and k are not adjacent modulo n (that is, k is neither i-1 nor i+1 modulo n). This property is apparent when you consider the form of the matrices when x has 4 elements:
      -1  0  0  0           1  1  0  0           1  0  0  0           1  0  0  1
s_0 =  1  1  0  0    s_1 =  0 -1  0  0    s_2 =  0  1  1  0    s_3 =  0  1  0  0
       0  0  1  0           0  1  1  0           0  0 -1  0           0  0  1  1
       1  0  0  1           0  0  0  1           0  0  1  1           0  0  0 -1
The second property essentially says that you can exchange the order in which non-conflicting flips occur. For example, in a 4 element vector, if you first flipped x_1 and then flipped x_3, this has the same effect as first flipping x_3 and then flipping x_1.
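Both properties can be checked mechanically for small n. The following sketch builds the s_i matrices as described above and tests the two relations; flip_matrix, matmul and relations_hold are names invented here, and passing for a few sizes is of course not a proof.

```python
def flip_matrix(i, n):
    """Matrix s_i: negate x_i and add the old x_i to both cyclic neighbours."""
    s = [[int(r == c) for c in range(n)] for r in range(n)]
    s[i][i] = -1
    s[(i - 1) % n][i] = 1
    s[(i + 1) % n][i] = 1
    return s


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
        for r in range(len(a))
    ]


def relations_hold(n):
    """Check both properties for all indices (intended for n >= 3)."""
    s = [flip_matrix(i, n) for i in range(n)]
    for i in range(n):
        j = (i - 1) % n
        # First property: s_i s_(i-1) s_i = s_(i-1) s_i s_(i-1).
        if matmul(matmul(s[i], s[j]), s[i]) != matmul(matmul(s[j], s[i]), s[j]):
            return False
        # Second property: non-adjacent flips commute.
        for k in range(n):
            if k not in (i, (i - 1) % n, (i + 1) % n):
                if matmul(s[i], s[k]) != matmul(s[k], s[i]):
                    return False
    return True
```

For example, relations_hold(4) confirms that s_0 commutes with s_2 and s_1 with s_3, which is the "non-conflicting flips" case described above.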
I picture pushing the negative value(s) out in two directions until they dampen. Since addition is commutative, it doesn't matter what order you process the elements.
Here is an observation for when N is divisible by 3... Probably not useful, but I feel like writing it down.
Let w (complex) be a primitive cube root of 1; that is, w^3 = 1 and 1 + w + w^2 = 0. For example, w = cos(2pi/3) + i*sin(2pi/3).
Consider the sum x_0 + x_1*w + x_2*w^2 + x_3 + x_4*w + x_5*w^2 + .... That is, multiply each element of the sequence by consecutive powers of w and add them all up.
Something moderately interesting happens to this sum on each step.
Consider three consecutive numbers [a, -b, c] from the sequence, with b positive. Suppose these elements line up with the powers of w such that these three numbers contribute a - b*w + c*w^2 to the sum.
Now perform the step on the middle element.
After the step, these numbers contribute (a-b) + b*w + (c-b)*w^2 to the sum.
But since 1 + w + w^2 = 0, b + b*w + b*w^2 = 0 too. So we can add this to the previous expression to get a + 2*b*w + c*w^2, which is very similar to what we had before the step.
In other words, the step merely added 3*b*w to the sum.
If the three consecutive numbers had lined up with powers of w to contribute (say) a*w - b*w^2 + c, it turns out that the step will add 3*b*w^2.
In other words, no matter how the powers of w line up with the three numbers, the step increases the sum by 3*b, 3*b*w, or 3*b*w^2.
Unfortunately, since w^2 = -(w+1), this does not actually yield a steadily increasing function. So, as I said, probably not useful. But it still seems like a reasonable strategy is to seek a "signature" for each position that changes monotonically with each step...
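The observation about the weighted sum can be checked numerically. In this sketch signature and flip are hypothetical helper names, and the array length is assumed to be a multiple of 3 so that the powers of w line up across the wrap-around.

```python
import cmath

W = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of 1


def signature(x):
    """x_0 + x_1*w + x_2*w^2 + x_3 + x_4*w + ... for the array x."""
    return sum(v * W ** (k % 3) for k, v in enumerate(x))


def flip(x, i):
    """Perform one step of the process on index i and return the new array."""
    n = len(x)
    b = -x[i]
    y = list(x)
    y[(i - 1) % n] -= b
    y[(i + 1) % n] -= b
    y[i] = b
    return y
```

For an array whose length is a multiple of 3, signature(flip(x, i)) - signature(x) comes out as 3*b*w^(i mod 3) with b = -x_i, up to floating-point error.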

Finding maximum size sub-matrix of all 1's in a matrix having 1's and 0's

Suppose you are given an m x n bitmap, represented by an array M[1..m, 1..n] whose entries are all 0 or 1. An all-one block is a subarray of the form M[i..i', j..j'] in which every bit is equal to 1. Describe and analyze an efficient algorithm to find an all-one block in M with maximum area.
I am trying to make a dynamic programming solution. But my recursive algorithm runs in O(n^n) time, and even after memoization I cannot think of bringing it down below O(n^4). Can someone help me find a more efficient solution?
An O(N) (number of elements) solution:
A
1 1 0 0 1 0
0 1 1 1 1 1
1 1 1 1 1 0
0 0 1 1 0 0
Generate an array C where each element represents the number of 1s above and including it, up until the first 0.
C
1 1 0 0 1 0
0 2 1 1 2 1
1 3 2 2 3 0
0 0 3 3 0 0
We want to find the row R, and left, right indices l , r that maximizes (r-l+1)*min(C[R][l..r]). Here is an algorithm to inspect each row in O(cols) time:
Maintain a stack of pairs (h, i), where C[R][i-1] < h ≤ C[R][i]. At any position cur, we should have h=min(C[R][i..cur]) for all pairs (h, i) on the stack.
For each element (h_cur, i_cur):
    If h_cur > h_top:
        Push (h_cur, i_cur).
    Else:
        While h_cur < h_top:
            Pop the top of the stack as (h_pop, i_pop).
            Check whether it would make a new best, i.e. (i_cur - i_pop) * h_pop > best.
        If h_cur > h_top:
            Push (h_cur, i_lastpopped).
An example of this in execution for the third row in our example:
i    =    0      1      2      3      4      5
C[i] =    1      3      2      2      3      0
                                    (3, 4)
S    =         (3, 1) (2, 1) (2, 1) (2, 1)
        (1, 0) (1, 0) (1, 0) (1, 0) (1, 0)
        (0,-1) (0,-1) (0,-1) (0,-1) (0,-1) (0,-1)
i=0, C[i]=1) Push (1, 0).
i=1, C[i]=3) Push (3, 1).
i=2, C[i]=2) Pop (3, 1). Check whether (2-1)*3=3 is a new best.
        The last i popped was 1, so push (2, 1).
i=3, C[i]=2) h_cur=h_top so do nothing.
i=4, C[i]=3) Push (3, 4).
i=5, C[i]=0) Pop (3, 4). Check whether (5-4)*3=3 is a new best.
        Pop (2, 1). Check whether (5-1)*2=8 is a new best.
        Pop (1, 0). Check whether (5-0)*1=5 is a new best.
        End. (Okay, we should probably add an extra term C[cols]=0 on the end for good measure).
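A compact Python sketch of this stack-based approach is given below. The function names are mine, the C rows are built on the fly instead of storing the whole array, and the extra trailing 0 mentioned at the end of the walkthrough is appended so that the stack is always emptied.

```python
def largest_rectangle_in_histogram(heights):
    """Largest rectangle under a histogram, using a stack of (height, start)."""
    best = 0
    stack = []  # heights strictly increase from bottom to top
    for cur, h in enumerate(list(heights) + [0]):  # trailing 0 flushes the stack
        start = cur
        while stack and stack[-1][0] > h:
            top_h, top_i = stack.pop()
            best = max(best, (cur - top_i) * top_h)
            start = top_i  # the new bar can extend back to the last popped start
        if h > 0 and (not stack or stack[-1][0] < h):
            stack.append((h, start))
    return best


def max_all_ones_area(matrix):
    """Area of the largest all-ones block: one histogram pass per row."""
    if not matrix:
        return 0
    heights = [0] * len(matrix[0])
    best = 0
    for row in matrix:
        # heights[k] = number of consecutive 1s ending at this row in column k.
        heights = [hh + 1 if v == 1 else 0 for hh, v in zip(heights, row)]
        best = max(best, largest_rectangle_in_histogram(heights))
    return best
```

On the third row of the example, C = [1, 3, 2, 2, 3, 0], the histogram pass returns 8, the same best value found in the walkthrough.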
Here's an O(numCols*numLines^2) algorithm. Let S[i][j] = sum of the first i elements of column j.
I will work the algorithm on this example:
M
1 1 0 0 1 0
0 1 1 1 0 1
1 1 1 1 0 0
0 0 1 1 0 0
We have:
S
1 1 0 0 1 0
1 2 1 1 1 1
2 3 2 2 1 1
2 3 3 3 1 1
Now consider the problem of finding the maximum subarray of all ones in a one-dimensional array. This can be solved using this simple algorithm:
append 0 to the end of your array
max = 0, temp = 0
for i = 1 to array.size do
    if array[i] = 1 then
        ++temp
    else
        if temp > max then
            max = temp
        temp = 0
For example, if you have this 1d array:
1 2 3 4 5 6
1 1 0 1 1 1
you'd do this:
First append a 0:
1 2 3 4 5 6 7
1 1 0 1 1 1 0
Now, notice that whenever you hit a 0, you know where a sequence of contiguous ones ends. Therefore, if you keep a running total (temp variable) of the current number of ones, you can compare that total with the maximum so far (max variable) when you hit a zero, and then reset the running total. This will give you the maximum length of a contiguous sequence of ones in the variable max.
Now you can use this subalgorithm to find the solution for your problem. First of all append a 0 column to your matrix. Then compute S.
Then:
max = 0
for i = 1 to M.numLines do
    for j = i to M.numLines do
        temp = 0
        for k = 1 to M.numCols do
            if S[j][k] - S[i-1][k] = j - i + 1 then
                temp += j - i + 1
            else
                if temp > max then
                    max = temp
                temp = 0
Basically, for each possible height of a subarray (there are O(numLines^2) possible heights), you find the one with maximum area having that height by applying the algorithm for the one-dimensional array (in O(numCols)).
Consider the following "picture":
M
    1 1 0 0 1 0 0
i   0 1 1 1 0 1 0
j   1 1 1 1 0 0 0
    0 0 1 1 0 0 0
This means that we have the height j - i + 1 fixed. Now, take all the elements of the matrix that are between i and j inclusively:
0 1 1 1 0 1 0
1 1 1 1 0 0 0
Notice that this resembles the one-dimensional problem. Let's sum the columns and see what we get:
1 2 2 2 0 1 0
Now, the problem is reduced to the one-dimensional case, with the exception that we must find a subsequence of contiguous j - i + 1 (which is 2 in this case) values. This means that each column in our j - i + 1 "window" must be full of ones. We can check for this efficiently by using the S matrix.
To understand how S works, consider a one-dimensional case again: let s[i] = sum of the first i elements of the vector a. Then what is the sum of the subsequence a[i..j]? It's the sum of all the elements up to and including a[j], minus the sum of all those up to and including a[i-1], meaning s[j] - s[i-1]. The 2d case works the same, except we have an s for each column.
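Put together, the column prefix sums and the one-dimensional scan might be sketched in Python as follows. The function name is hypothetical, indices are 0-based, and the maximum is updated on every extension of a run instead of appending a zero column; the result is the same.

```python
def max_all_ones_area_prefix(m):
    """O(numCols * numLines^2) search using per-column prefix sums."""
    rows, cols = len(m), len(m[0])
    # s[i][k] = sum of the first i elements of column k (so s[0][k] = 0).
    s = [[0] * cols for _ in range(rows + 1)]
    for i in range(rows):
        for k in range(cols):
            s[i + 1][k] = s[i][k] + m[i][k]

    best = 0
    for top in range(rows):
        for bottom in range(top, rows):
            height = bottom - top + 1
            run = 0  # area of the current run of full columns
            for k in range(cols):
                if s[bottom + 1][k] - s[top][k] == height:
                    run += height  # column k is all ones between top and bottom
                    best = max(best, run)
                else:
                    run = 0
    return best
```

For the example M above the result is 6, reached for instance by rows 2 to 3 and columns 2 to 4 (1-based).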
I hope this is clear, if you have any more questions please ask.
I don't know if this fits your needs, but I think there's also an O(numLines*numCols) algorithm, based on dynamic programming. I can't figure it out yet, except for the case where the subarray you're after is square. Someone might have better insight however, so wait a bit more.
Define a new matrix A which stores two values in A[i,j]: the width and the height of the largest submatrix with its upper left corner at (i,j). Fill this matrix starting from the bottom right corner, row by row from bottom to top. You'll find four cases.
Apply these cases when the given matrix has M[i,j]=1, where M is the original matrix, M[i,j+1] is the right neighbour and M[i+1,j] is the bottom neighbour:
case 1: neither the right nor the bottom neighbour is equal to the current element, i.e. M[i,j] != M[i+1,j] and M[i,j] != M[i,j+1]; in this case, the value of A[i,j] is 1x1
case 2: the right neighbour is equal to the current element but the bottom one is different; A[i,j].width=A[i,j+1].width+1 and A[i,j].height=1
case 3: the bottom neighbour is equal but the right one is different; A[i,j].width=1 and A[i,j].height=A[i+1,j].height+1
case 4: both neighbours are equal:
Three rectangles are considered:
A[i,j].width=A[i,j+1].width+1; A[i,j].height=1;
A[i,j].height=A[i+1,j].height+1; A[i,j].width=1;
A[i,j].width=min(A[i,j+1].width+1,A[i+1,j].width); A[i,j].height=min(A[i+1,j].height+1,A[i,j+1].height);
The one with the max area among these three is taken to represent the rectangle at this position.
The size of the largest matrix that has its upper left corner at (i,j) is A[i,j].width*A[i,j].height, so you can update the max value found while calculating A[i,j].
The elements of the bottom row and of the rightmost column are treated as if their neighbours to the bottom and to the right, respectively, were different.
Here is an O(N) implementation in C# (N being the number of elements) for the case where the block must be a square.
The idea is to use dynamic programming to build an accumulated matrix that stores, for each cell, the side length of the biggest all-ones square whose bottom-right corner is that cell.
public static int LargestSquareMatrixOfOne(int[,] original_mat)
{
    int rows = original_mat.GetLength(0);
    int cols = original_mat.GetLength(1);
    int[,] AccumulatedMatrix = new int[rows, cols];
    int biggestSize = 0; // stays 0 when the matrix contains no 1 at all
    for (int i = 0; i < rows; i++)
    {
        for (int j = 0; j < cols; j++)
        {
            if (original_mat[i, j] != 1)
            {
                AccumulatedMatrix[i, j] = 0;
            }
            else if (i == 0 || j == 0)
            {
                // first row or first column: the square cannot extend up or left
                AccumulatedMatrix[i, j] = 1;
            }
            else
            {
                AccumulatedMatrix[i, j] = Math.Min(AccumulatedMatrix[i - 1, j - 1], Math.Min(AccumulatedMatrix[i - 1, j], AccumulatedMatrix[i, j - 1])) + 1;
            }
            if (AccumulatedMatrix[i, j] > biggestSize)
            {
                biggestSize = AccumulatedMatrix[i, j];
            }
        }
    }
    return biggestSize; // side length of the largest all-ones square
}
