Distinct values of bitwise and of subarrays - algorithm

How can I find the number of distinct values of the bitwise AND over all subarrays of an array? (Array size <= 1e5 and array elements <= 1e6.)
For example:
A[]={1,2,3}
There are 4 distinct values (1, 2, 3, 0).

Let's fix the right boundary r of the subarray and imagine the left boundary l moving to the left, starting from r. How many times can the value of the AND change? At most O(log(MAX_VALUE)) times. Why? When we add one more element to the left, there are two options:
The AND value of the subarray doesn't change.
It changes. In that case, the number of set bits in it strictly decreases (as it's a submask of the previous AND value).
Thus, we can consider only those values of l where something changes. Now we just need to find them quickly.
Let's iterate over the array from left to right and store the position of the last element that doesn't have the i-th bit for all valid i (we can update it by iterating over all bits of the current element). This way, we'll be able to find the next position where the value changes quickly (namely, it's the largest value in this array over all bits that are set). If we sort the positions, we can find the next largest one in O(1).
The total time complexity of this solution is O(N * log(MAX_VALUE) * log(log(MAX_VALUE))) (we iterate over all bits of each element of the array, sort the array of positions for each of them, and iterate over it). The space complexity is O(N + MAX_VALUE). It should be good enough for the given constraints.
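The approach described above could be sketched roughly as follows in Python. The function name and the `max_bits` parameter are assumptions made for illustration (20 bits is enough for values up to 1e6), and a Python set stands in for the boolean array of seen values.

```python
def count_distinct_subarray_ands(a, max_bits=20):
    """Count distinct AND values over all subarrays of a (sketch)."""
    # last_unset[b] = index of the latest element seen so far that lacks bit b
    last_unset = [-1] * max_bits
    seen = set()
    for r, x in enumerate(a):
        for b in range(max_bits):
            if not (x >> b) & 1:
                last_unset[b] = r
        seen.add(x)  # the subarray [r, r]
        # Positions to the left of r where a set bit of x gets cleared,
        # visited from the nearest to the farthest.
        breaks = sorted(
            ((last_unset[b], b) for b in range(max_bits)
             if (x >> b) & 1 and last_unset[b] >= 0),
            reverse=True,
        )
        cur = x
        i = 0
        while i < len(breaks):
            pos = breaks[i][0]
            # Clear every bit that disappears at this position.
            while i < len(breaks) and breaks[i][0] == pos:
                cur &= ~(1 << breaks[i][1])
                i += 1
            seen.add(cur)
    return len(seen)


print(count_distinct_subarray_ands([1, 2, 3]))  # 4
```

Each right boundary contributes at most `max_bits` new candidate values, which is where the logarithmic factor in the complexity comes from.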

Imagine the numbers as columns representing their bits. We will have sequences of 1's extending horizontally. For example:
Array index:  0 1 2 3 4 5 6 7
Bit columns:  0 1 1 1 1 1 0 0
              0 0 0 1 1 1 1 1
              0 0 1 1 1 1 1 0
              1 0 0 0 1 1 0 1
              0 1 1 1 1 1 1 0
Looking to the left, the bit-row for any subarray anded after a zero will continue being zero, which means no change after that in that row.
Let's take index 5 for example. Now sorting the horizontal sequences of 1's from index 5 to the left will provide us a simple way to detect a change in the bit configuration (the sorting would have to be done on each iteration):
Index 5 ->
Sorted bit rows:  1 0 0 0 1 1
                  0 0 0 1 1 1
                  0 0 1 1 1 1
                  0 1 1 1 1 1
                  0 1 1 1 1 1
Index 5 to 4, no change
Index 4 to 3, change
Index 3 to 2, change
Index 2 to 1, change
Index 1 to 0, change
To easily examine these changes, kraskevich proposes recording only the last unset bit for each row as we go along, which would indicate the length of the horizontal sequence of 1's, and a boolean array (of 1e6 numbers max) to store the unique bit configurations encountered.
Numbers: 1, 2, 3
Bits:    1  0  1
         0  1  1
As we move from left to right, keep a record of the index of the last unset bit in each row, and also keep a record of any new bit configuration (at most 1e6 of them):
Indexes of last unset bit for each row on each iteration
Numbers: 1, 2, 3
A[0]:  -1        arrayHash = [false,true,false,false], count = 1
        0
A[1]:  -1  1     Now sort the column descending, representing (current - index),
        0  0     the lengths of sequences of 1's extending to the left.
                 As we move from top to bottom on this column, each value change represents a bit
                 configuration and a possibly distinct count:
                   Record present bit configuration b10
                   => arrayHash = [false,true,true,false]
                   1 => 1 - 1 => sequence length 0, ignore sequence length 0
                   0 => 1 - 0 => sequence length 1,
                     unset second bit: b10 => b00
                     => new bit configuration b00
                     => arrayHash = [true,true,true,false]
Third iteration:
Numbers: 1, 2, 3
A[2]:  -1  1  1
        0  0  0
                 Record present bit configuration b11
                 => arrayHash = [true,true,true,true]
                 (We continue since we don't necessarily know the arrayHash has filled.)
                 1 => 2 - 1 => sequence length 1
                   unset first bit: b11 => b10
                   => seen bit configuration b10
                 0 => 2 - 0 => sequence length 2,
                   unset second bit: b10 => b00
                   => seen bit configuration b00

Related

How to find all sub rectangles using fastest algorithm?

As an example, suppose we have a 2D array such as:
A= [
[1,0,0],
[1,0,0],
[0,1,1]
]
The task is to find all sub rectangles containing only zeros. So the output of this algorithm should be:
[[0,1,0,2] , [0,1,1,1] , [0,2,1,2] , [0,1,1,2] ,[1,1,1,2], [2,0,2,0] ,
[0,1,0,1] , [0,2,0,2] , [1,1,1,1] , [1,2,1,2]]
Where i,j in [ i , j , a , b ] are coordinates of rectangle's starting point and a,b are coordinates of rectangle's ending point.
I found some algorithms, for example Link1 and Link2, but I think the first one is the simplest algorithm and we want the fastest. The second one only calculates rectangles, not all sub rectangles.
Question:
Does anyone know a better or faster algorithm for this problem? My idea is to use dynamic programming, but I'm not sure how to apply it.
Assume an initial array of size c columns x r rows.
Every 0 is a rectangle of size 1x1.
Now perform a "horizontal dilation", i.e. replace every element by the maximum of itself and the one to its right, and drop the last element in the row. E.g.
1 0 0      1 0
1 0 0  ->  1 0
0 1 1      1 1
Every zero now corresponds to a 1x2 rectangle in the original array. You can repeat this c-1 times, until there is a single column left.
1 0 0      1 0      1
1 0 0  ->  1 0  ->  1
0 1 1      1 1      1
The zeroes in the last array correspond to 1xc rectangles in the original array (initially c columns).
For every dilated array, perform a similar "vertical dilation".
1 0 0      1 0      1
1 0 0  ->  1 0  ->  1
0 1 1      1 1      1
  |         |       |
  V         V       V
1 0 0      1 0      1
1 1 1  ->  1 1  ->  1
  |         |       |
  V         V       V
1 1 1  ->  1 1  ->  1
In these rxc arrays, the zeroes correspond to the subrectangles of all possible sizes. (Here, 5 of size 1x1, 2 of size 2x1, 2 of size 1x2 and one of size 2x2.)
The total workload to detect the zeroes and compute the dilations is of order O(c²r²). I guess that this is worst-case optimal. (In case an array contains no zeroes, there is no need to continue any dilation.)
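The dilation scheme can be sketched in Python along the following lines. The function name `zero_rectangles` and the tuple layout (top row, left column, bottom row, right column) are assumptions chosen to match the [i, j, a, b] format used in the question.

```python
def zero_rectangles(a):
    """List all-zero sub rectangles as (top, left, bottom, right) (sketch)."""
    n_rows, n_cols = len(a), len(a[0])
    found = []
    h_dil = [row[:] for row in a]  # horizontal dilation for width 1
    for width in range(1, n_cols + 1):
        v_dil = [row[:] for row in h_dil]  # vertical dilation for height 1
        for height in range(1, n_rows + 1):
            # A zero at (i, j) means the width x height block starting there is all zero.
            for i, row in enumerate(v_dil):
                for j, value in enumerate(row):
                    if value == 0:
                        found.append((i, j, i + height - 1, j + width - 1))
            # Vertical dilation: max of each row with the row below, drop the last row.
            v_dil = [
                [max(x, y) for x, y in zip(v_dil[i], v_dil[i + 1])]
                for i in range(len(v_dil) - 1)
            ]
        # Horizontal dilation: max of each element with its right neighbour,
        # drop the last element of each row.
        h_dil = [
            [max(row[j], row[j + 1]) for j in range(len(row) - 1)]
            for row in h_dil
        ]
    return found


example = [[1, 0, 0],
           [1, 0, 0],
           [0, 1, 1]]
print(len(zero_rectangles(example)))  # 10
```

This sketch does not include the early exit for arrays that contain no zeroes; adding it would only require checking whether a dilated array still has a zero before continuing.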

How to optimize search of rows x columns combination in a matrix?

Given a matrix of 1's and 0's, I want to find a combination of rows and columns with the fewest (ideally no) 0's, maximizing the n_of_rows * n_of_columns picked.
For example, rows (0,1,2) and columns (0,1,3) have only one zero in col #0 row #1, and the remaining 8 values are 1's.
1 1 0 1 0
0 1 1 1 0
1 1 0 1 1
0 0 1 0 0
The practical task is to search over thousands to millions of rows and columns, finding the maximal biclique in a bipartite graph: rows and columns can be viewed as vertices, and values as connections.
The problem is NP-complete, as far as I have learned.
Please advise an approach or algorithm that would speed up the task and reduce CPU and memory requirements.
Not sure you could minimise this.
However, an easy way to work this out would be...
Multiply your matrix by a column of n 1's. This will give you the number of ones in each row. Next, multiply a row of n 1's by your matrix (the row goes in front of the matrix). This will give you the totals of 1's for each column. From there it's a pretty easy comparison...
i.e. original matrix...
1 0 1
0 1 1
0 0 0
do
1 0 1     1     2   (row totals)
0 1 1  x  1  =  2
0 0 0     1     0
do
           1 0 1
1 1 1  x   0 1 1  =  1 1 2   (column totals)
           0 0 0
NB: the max sum is 2 (which you would keep track of as you work it out).
Actually given the following assumptions:
1. You don't care how many 0's are in each row or column
2. You don't need to keep track of their order....
Then you only really need to store values to count the total in each row/column as you read the values in and don't actually store the matrix itself.
If you are given the number of rows and columns prior to reading in the matrix you can do the following heuristics to reduce computational time...
Keep track of the current max. If the current row cannot reach this potential max, stop counting for the row (but continue in the columns). The same applies, vice versa, to the columns.
But you still have a worst-case scenario in which all rows and columns have the same number of 1's and 0's... :)
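As a minimal sketch of the counting idea in this answer, the totals can be accumulated while the rows are read, without keeping the matrix. The function name `row_and_column_totals` is hypothetical, and the sketch only produces the counts; it does not by itself solve the maximal biclique problem from the question.

```python
def row_and_column_totals(rows):
    """Accumulate counts of 1's per row and per column from an iterable of rows."""
    row_totals = []
    col_totals = []
    for row in rows:
        if not col_totals:
            col_totals = [0] * len(row)  # sized from the first row
        row_totals.append(sum(row))
        for j, value in enumerate(row):
            col_totals[j] += value
    return row_totals, col_totals


matrix = [[1, 0, 1],
          [0, 1, 1],
          [0, 0, 0]]
print(row_and_column_totals(matrix))  # ([2, 2, 0], [1, 1, 2])
```

Because `rows` can be any iterable, the same function works on a generator that yields rows from a file one at a time.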

Fill matrix randomly without row-repetitions

Please help. I'm new to matlab scripting and need a bit of help. I have a series of numbers:
A=[1 1 1 2 2 2 3 3 3 4 4 4 5]
which I want to fill randomly into an 8x12 matrix without having the same number in the same row. At the end I want all the "empty" cells of the 8x12 matrix being filled with 0's or nan.
an example could be:
result=
3 1 5 2 4 5 0 0 0 0 0 0
4 1 3 2 0 0 0 0 0 0 0 0
1 3 4 2 0 0 0 0 0 0 0 0
Make sure A is sorted: A = sort(A).
Make an empty matrix.
For each number in A:
  Find out how many repetitions of the number there are: loop over A, where start is the first occurrence of the number and end is the last, so n = last-first+1.
  Find all rows that have space for an extra number: a double for loop that keeps track of elements that are zero is enough.
  Randomly pick n rows. To do this, make an array R of all available row indices, then take a random sample of n positions between 1 and size(R,2) and read off those values; you now have your row indices.
  Randomly pick one of the empty spots in each of the selected rows and assign the number.
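The steps above, sketched in Python rather than MATLAB to keep the example self-contained. The function name, the `rng` parameter and the use of 0 as the "empty" marker are assumptions (so 0 must not appear in A).

```python
import random
from collections import Counter


def fill_without_row_repeats(values, n_rows=8, n_cols=12, rng=random):
    """Place values randomly so that no row holds the same value twice (sketch)."""
    grid = [[0] * n_cols for _ in range(n_rows)]  # 0 marks an empty cell
    for value, count in sorted(Counter(values).items()):
        # Rows that still have at least one empty cell.
        open_rows = [r for r in range(n_rows) if 0 in grid[r]]
        if count > len(open_rows):
            raise ValueError("not enough rows with free space for value %r" % value)
        # Picking distinct rows guarantees the value appears once per row at most.
        for r in rng.sample(open_rows, count):
            empty_cols = [c for c in range(n_cols) if grid[r][c] == 0]
            grid[r][rng.choice(empty_cols)] = value
    return grid


A = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5]
for row in fill_without_row_repeats(A):
    print(row)
```

The values land in random columns here; if they should be packed to the left as in the example result, the nonzero entries of each row can be moved to the front afterwards.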

How I can get the 'n' possible matrices from two vectors?

I've been searching for an algorithm for the solution of all possible matrices of dimension 'n' that can be obtained with two arrays, one of the sum of the rows, and another, of the sum of the columns of a matrix. For example, if I have the following matrix of dimension 7:
matriz= [ 1 0 0 1 1 1 0
1 0 1 0 1 0 0
0 0 1 0 1 0 0
1 0 0 1 1 0 1
0 1 1 0 1 0 1
1 1 1 0 0 0 1
0 0 1 0 1 0 1 ]
The sum of the columns are:
col= [4 2 5 2 6 1 4]
The sum of the rows are:
row = [4 3 2 4 4 4 3]
Now, I want to obtain all possible matrices of "ones and zeros" where the sum of the columns and the rows fulfil the condition of "col" and "row" respectively.
I would appreciate ideas that can help solve this problem.
One obvious way is to brute-force a solution: for the first row, generate all the possibilities that have the right sum, then for each of these, generate all the possibilities for the 2nd row, and so on. Once you have generated all the rows, you check if the sum of the columns is right. But this will take a lot of time. My math might be rusty at this time of the day, but I believe the number of distinct possibilities for a row of length n of which k bits are 1 is given by the binomial coefficient or nchoosek(n,k) in Matlab. To determine the total number of possibilities, you have to multiply this number for every row:
>> n = 7;
>> row= [4 3 2 4 4 4 3];
>> prod(arrayfun(@(k) nchoosek(n, k), row))
ans =
3.8604e+10
This is a lot of possibilities to check! Doing the same for the columns gives
>> col= [4 2 5 2 6 1 4];
>> prod(arrayfun(@(k) nchoosek(n, k), col))
ans =
555891525
Still a large number, but 'only' a factor of about 70 smaller.
It might be possible to improve this brute-force method a little bit by seeing if the later rows are already constrained by the previous rows. If in your example, for a particular combination of the first two rows, both rows have a 1 in the second column, the rest of this column should all be 0, since the sum must be 2. This reduces the number of possibilities for the remaining rows a bit. Implementing such checks might complicate things a bit, but they might make the difference between a calculation that takes 2 days or one that takes just 1 hour.
An optimized version of this might alternate between generating rows and columns, starting with those for which the number of possibilities is the lowest. I don't know if there is a more elegant solution than this brute-force method; I would be interested to hear one.
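A minimal Python sketch of the row-by-row brute force with the column check described above. The generator name `binary_matrices` is hypothetical, and the pruning is limited to two simple rules: a column that is already full is never used again, and a branch is abandoned when some column needs more 1's than there are rows left.

```python
from itertools import combinations


def binary_matrices(row_sums, col_sums):
    """Yield every 0/1 matrix with the given row and column sums (sketch)."""
    n_rows, n_cols = len(row_sums), len(col_sums)
    remaining = list(col_sums)  # 1's still needed in each column
    rows = []

    def place(r):
        if r == n_rows:
            if all(x == 0 for x in remaining):
                yield [row[:] for row in rows]
            return
        # Prune: some column needs more 1's than there are rows left.
        if any(x > n_rows - r for x in remaining):
            return
        open_cols = [c for c in range(n_cols) if remaining[c] > 0]
        for cols in combinations(open_cols, row_sums[r]):
            row = [0] * n_cols
            for c in cols:
                row[c] = 1
                remaining[c] -= 1
            rows.append(row)
            yield from place(r + 1)
            rows.pop()
            for c in cols:
                remaining[c] += 1

    yield from place(0)


for m in binary_matrices([2, 1], [1, 1, 1]):
    print(m)
```

For the 7x7 example the number of solutions may still be very large, so iterating lazily over the generator is preferable to collecting everything in a list.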

permutation matrix

Is it possible to decompose a matrix A having n rows and n columns into a sum of m [n x n] permutation matrices, where m is the number of 1's in each row and each column of A?
UPDATE:
Yes, this is possible. I came across such an example, which is shown below, but how can we generalize the answer?
What you want is called a 1-factorization. One algorithm is to repeatedly find a perfect matching and remove it; probably there are others.
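A rough Python sketch of the "find a perfect matching and remove it" idea. It uses a standard augmenting-path bipartite matching (rows on one side, columns on the other); the function name is hypothetical, and the input is assumed to be a square 0/1 matrix with the same number of 1's in every row and column.

```python
def decompose_into_permutations(a):
    """Split a 0/1 matrix with equal row and column sums into permutation matrices."""
    n = len(a)
    work = [row[:] for row in a]  # copy that shrinks as matchings are removed
    result = []

    def try_row(r, visited, match_col):
        # Try to give row r a column, re-routing earlier rows if needed.
        for c in range(n):
            if work[r][c] and not visited[c]:
                visited[c] = True
                if match_col[c] == -1 or try_row(match_col[c], visited, match_col):
                    match_col[c] = r
                    return True
        return False

    while any(any(row) for row in work):
        match_col = [-1] * n  # match_col[c] = row currently matched to column c
        for r in range(n):
            if not try_row(r, [False] * n, match_col):
                raise ValueError("no perfect matching: row/column sums are not all equal")
        perm = [[0] * n for _ in range(n)]
        for c, r in enumerate(match_col):
            perm[r][c] = 1
            work[r][c] = 0  # remove the matched 1 from the working copy
        result.append(perm)
    return result


A = [[1, 0, 1, 1],
     [1, 1, 0, 1],
     [1, 1, 1, 0],
     [0, 1, 1, 1]]
for p in decompose_into_permutations(A):
    print(p)
```

Unlike a purely greedy choice, the augmenting step can undo an earlier assignment, so it does not get stuck when the first free column turns out to be a bad pick.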
For the first permutation matrix, take the first 1 in the first row. For the second row, take the first 1 that is in a column you don't already have. For the third row, take the first 1 that is in a column you don't already have. And so on. Do this for all rows.
You now have one permutation matrix.
Next subtract your first permutation matrix from the original. This new matrix now has m-1 ones in each row and column. So repeat the process m-1 more times, and you'll have your m permutation matrices.
You can skip the last step, because a matrix with one 1 in each row and column already is a permutation matrix. There's no need to do any calculations.
This is a greedy algorithm that doesn't always work. We can make it work by changing the selection rule slightly. See below:
For your example:
    1 0 1 1
A = 1 1 0 1
    1 1 1 0
    0 1 1 1
In the first step, we pick (1,1) for the first row, (2,2) for the second row, (3,3) for the third row and (4,4) for the 4th row. We then have:
    1 0 0 0     0 0 1 1
A = 0 1 0 0  +  1 0 0 1
    0 0 1 0     1 1 0 0
    0 0 0 1     0 1 1 0
The first matrix is a permutation matrix. The second matrix has exactly two 1's in each row and column. So we pick, in order: (1,3), (2,1), (3,2) and... we're in trouble: the rows that contain a 1 in column 4 have already been used.
So how do we fix this? Well, we can keep track of the number of 1's remaining in each column. Instead of picking the first column that is unused, we pick the column with the lowest number of 1's remaining. For the second matrix above:
    0 0 1 1       0 0 X 0       0 0 X 0       0 0 X 0
B = 1 0 0 1  -->  1 0 0 1  -->  0 0 0 X  -->  0 0 0 X
    1 1 0 0       1 1 0 0       1 1 0 0       X 0 0 0
    0 1 1 0       0 1 1 0       0 1 1 0       0 1 1 0
    -------       -------       -------       -------
    2 2 2 2       2 2 X 1       1 2 X X       X 1 X X
So we would pick column 4 in the second step, column 1 in the 3rd step, and column 2 in the 4th step.
There can always be only one column with one remaining 1. The other 1's must have been taken away in m-1 previous rows. If you had two such columns, one of them would have had to have been picked as the minimum column before.
This can be done easily using a recursive (backtracking or depth-first traversal) algorithm. Here is a sketch of such a solution:
static void printPermutationMatrices(int[][] origMat, int[] permutMat, int curRow, int n) {
    // permutMat is a 1-D array: permutMat[i] is the column where the 1 is placed in row i
    if (curRow == n) { // base case: every row has been assigned a column
        System.out.println(java.util.Arrays.toString(permutMat));
        return;
    }
    for (int col = 0; col < n; col++) { // try each column for curRow, then recurse into the next row
        if (origMat[curRow][col] != 1) continue;
        boolean conflict = false; // conflict: an earlier row already uses this column
        for (int row = 0; row < curRow; row++) {
            if (permutMat[row] == col) { conflict = true; break; }
        }
        if (!conflict) {
            permutMat[curRow] = col; // choose this column for curRow
            printPermutationMatrices(origMat, permutMat, curRow + 1, n);
        }
    }
}
Here is how to call from your main function:
int[] permutMat = new int[n];
printPermutationMatrices(originalMatrix, permutMat, 0, n);
