Understanding a five-dimensional DP with bitshifts and XORs? - algorithm

I was looking over the solution to this problem here, and I didn't quite understand how the dynamic programming (DP) worked.
A summary of the problem is as follows: You are given a 9x9 grid of either ones or zeroes, arranged in nine 3x3 subgrids as follows:
000 000 000
001 000 100
000 000 000
000 110 000
000 111 000
000 000 000
000 000 000
000 000 000
000 000 000
You need to find the minimum number of changes needed so that each of the nine rows, columns, and 3x3 subgrids contain an even number of 1's. Here, a change is defined as toggling a given element from 1 to 0 or vice-versa.
The solution involves dynamic programming, and each state stores the minimum number of moves such that all rows up to the current row being looked at have even parity (an even number of ones).
However, I do not understand the details of their implementation. First off, in their memoization array
int memo[9][9][1<<9][1<<3][2];
what do each of the indexes represent? I gathered that the first two are for current row and column, the third is for column parity, the fourth is for subgrid parity, and the fifth is for row parity. However, why does the column parity need 2^9 elements whereas row parity needs only 2?
Next, how are the transitions between the states handled? I would assume that you go across the row trying each element and moving to the next row when done, but after seeing their code I am quite confused
int& ref = memo[r][c][mc][mb][p];
/* Try setting the cell to 1. */
ref = !A[r][c] + solve(r, c + 1, mc ^ 1 << c, mb ^ 1 << c / 3, !p);
/* Try setting the cell to 0. */
ref = min(ref, A[r][c] + solve(r, c + 1, mc, mb, p));
How do they try setting the cell to one by flipping the current bit in the grid? And I understand how when you make it a one the row parity changes, as indicated by !p but I don't understand how column parity would be affected, or what mc ^ 1 << c does -- why do you need xor and bitshifts? Same goes for the subgrid parity -- mb ^ 1 << c / 3. What is it doing?
Could someone please explain how these work?

I think I've figured this out. The idea is to sweep from top-to-bottom, left-to-right. At each step, we try moving to the next position by setting the current box either to 0 or to 1.
At the end of each row, if the parity is even, we move on to the next row; otherwise we backtrack. At the end of every third row, if the parity of all three boxes is even, we move on to the next row; otherwise we backtrack. Finally, at the end of the board, if all columns have even parity, we're done; otherwise we backtrack.
The state of the recursion at any point can be described in terms of the following five pieces of information:
The current row and column.
The parities of all the columns.
The parities of the three boxes we're currently in (each row intersects three).
The current parity of the row.
This is what the memoization table looks like:
int memo[9][9][1<<9][1<<3][2];
         ^  ^   ^     ^    ^
         |  |   |     |    |
row -----+  |   |     |    |
col --------+   |     |    |
column parity --+     |    |
box parity -----------+    |
current row parity --------+
To see why there are bitshifts, let's look at the column parity. There are 9 columns, so we can write out their parities as a bitvector with 9 bits. Equivalently, we could use a nine-bit integer. 1 << 9 gives the number of possible nine-bit integers, so we can use a single integer to encode all column parities at the same time.
Why use XOR and bitshifts? Well, XORing a bitvector A with a second bitvector B inverts all the bits in A that are set in B and leaves all the other bits unchanged. If you're tracking parity, you can use XOR to toggle individual bits to represent a flip in parity; the shifting happens because we're packing multiple parity bits into a single machine word. The division you referred to is to map from a column index to the horizontal index of the box it passes through.
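To make the packing concrete, here is a small standalone illustration (my own toy example, not code from the solution) of how a single int can hold all nine column parities and how XOR with a shifted 1 toggles exactly one of them:
#include <cstdio>

int main() {
    int mc = 0;                        // nine column parities, all even (every bit 0)
    int c = 4;                         // suppose we place a 1 in column 4
    mc ^= 1 << c;                      // bit 4 flips: column 4 now has odd parity
    std::printf("%d\n", mc >> c & 1);  // prints 1 (odd)
    mc ^= 1 << c;                      // a second 1 in column 4 makes it even again
    std::printf("%d\n", mc >> c & 1);  // prints 0 (even)
}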
Hope this helps!

The algorithm in the solution is an exhaustive depth-first search with a couple optimizations. Unfortunately, the description doesn't exactly explain it.
Exhaustive search means that we try to enumerate every possible combination of bits. Depth-first means we first try to set all bits to one, then set the last one to zero, then the second-to-last, then both the last and the second-to-last, etc.
The first optimization is to backtrack as soon as we detect that parity isn't even. So, for example, as we start our search and reach the first row, we check if that row has zero parity. If it doesn't, we don't continue. We stop, backtrack, and try setting the last bit in the row to zero.
The second optimization is DP-like, in that we cache partial results and re-use them. This takes advantage of the fact that, in terms of the problem, different paths in the search can converge to the same logical state.
What is a logical search state? The description in the solution begins to explain it ("begins" being the key word). In essence, the trick is that, at any given point in the search, the minimum number of additional bit flips does not depend on the exact state of the whole sudoku board, but only on the state of the various parities that we need to track. (See further explanation below.)
There are 27 parities that we are tracking (accounting for 9 columns, 9 rows, and 9 3x3 boxes). Moreover, we can optimize some of them away. The parity of all higher rows, given how we perform the search, will always be even, while the parity of all lower rows, not yet touched by the search, doesn't change, so we only track the parity of 1 row. By the same logic, the parities of the boxes above and below are disregarded, and we only need to track the "active" 3 boxes.
Therefore, instead of 2^9 * 2^9 * 2^9 = 134,217,728 states, we only have 2^9 * 2^1 * 2^3 = 8,192 states. Unfortunately, we need a separate cache for each depth level in the search. So, we multiply by the 81 possible depths to the search, to discover that we need an array of size 663,552. To borrow from templatetypedef:
int memo[9][9][1<<9][1<<3][2];
         ^  ^   ^     ^    ^
         |  |   |     |    |
row -----+  |   |     |    |
col --------+   |     |    |
column parity --+     |    |
box parity -----------+    |
current row parity --------+
1<<9 simply means 2^9, given how integers and bit shifts work.
Further explanation: Due to how parity works, a bit flip will always flip its 3 corresponding parities. Therefore, all the permutations of sudoku boards that have the same parities can be solved with the same winning pattern of bit flips. The function 'solve' gives the answer to the problem: "Assuming you can only perform bit flips starting with the cell at position (x,y), what is the minimum number of bit flips to get a solved board." All sudoku boards with the same parities will yield the same answer. The search algorithm considers many permutations of boards. It starts modifying them from the top, counts how many bit flips it's already done, then asks the function 'solve' to see how many more it would need. If 'solve' has already been called with the same values of (x,y) and the same parities, we can just return the cached result.
The confusing part is the code that actually does the search and updates state:
/* Try setting the cell to 1. */
ref = !A[r][c] + solve(r, c + 1, mc ^ 1 << c, mb ^ 1 << c / 3, !p);
/* Try setting the cell to 0. */
ref = min(ref, A[r][c] + solve(r, c + 1, mc, mb, p));
It could be more clearly rendered as:
/* Try having this cell equal 0 */
bool areWeFlipping = A[r][c] == 1;
int nAdditionalFlipsIfCellIs0 = (areWeFlipping ? 1 : 0) + solve(r, c + 1, mc, mb, p); // Continue the search
/* Try having this cell equal 1 */
areWeFlipping = A[r][c] == 0;
// At the start, we assume the sudoku board is all zeroes, and therefore the column parity is all even. With each additional cell, we update the column parity with the value of that cell. In this case, we assume it to be 1.
int newMc = mc ^ (1 << c); // Update the parity of column c. ^ (1 << c) means "flip the bit denoting the parity of column c"
int newMb = mb ^ (1 << (c / 3)); // Update the parity of 'active' box (c/3) (ie, if we're in column 5, we're in box 1)
int newP = p ^ 1; // Update the current row parity
int nAdditionalFlipsIfCellIs1 = (areWeFlipping ? 1 : 0) + solve(r, c + 1, newMc, newMb, newP); // Continue the search
ref = min( nAdditionalFlipsIfCellIs0, nAdditionalFlipsIfCellIs1 );
Personally, I would've implemented the two sides of the search as "flip" and "don't flip". This makes the algorithm make more sense, conceptually. It would make the second paragraph read: "Depth-first means we first try to not flip any bits, then flip the last one, then the second-to-last, then both the last and the second-to-last, etc." In addition, before we start the search, we would need to pre-calculate the values of 'mc', 'mb', and 'p' for our board, instead of passing 0's.
/* Try not flipping the current cell */
int nAdditionalFlipsIfDontFlip = 0 + solve(r, c + 1, mc, mb, p);
/* Try flipping it */
int newMc = mc ^ (1 << c);
int newMb = mb ^ (1 << (c / 3));
int newP = p ^ 1;
int nAdditionalFlipsIfFlip = 1 + solve(r, c + 1, newMc, newMb, newP);
ref = min( nAdditionalFlipsIfDontFlip, nAdditionalFlipsIfFlip );
However, this change doesn't seem to affect performance.
UPDATE
Most surprisingly, the key to the algorithm's blazing speed seems to be that the memoization array ends up rather sparse. At each depth level, there are typically 512 (sometimes 256 or 128) states visited (out of 8192). Moreover, it is always one state per column parity. The box and row parities don't seem to matter! Omitting them from the memoization array improves performance another 30-fold. Yet, can we prove that it is always true?

Related

Given a number find the next sparse number

The problem statement is the following:
Given a number x, find the smallest Sparse number which is greater than or equal to x
A number is Sparse if there are no two adjacent 1s in its binary representation. For example 5 (binary representation: 101) is sparse, but 6 (binary representation: 110) is not sparse.
I'm taking the problem from this post where the most efficient solution is listed as having a running time of O(logn):
1) Find the binary of the given number and store it in a
   boolean array.
2) Initialize the last_finalized bit position as 0.
3) Start traversing the binary from the least significant bit.
   a) If we get two adjacent 1's such that the next (or third)
      bit is not 1, then
      (i)  Make all bits after this 1, down to the last finalized
           bit (including the last finalized bit), 0.
      (ii) Update the last finalized bit to the next bit.
What isn't clear in the post is what is meant by "finalized bit." It seems that the algorithm starts out by inserting the binary representation of the number into a std::vector using a while loop in which it ANDs the input (which is a number x) with 1 and then pushes that back into the vector, but, at least from the provided description, it's not clear why this is done. Is there a clearer explanation (or even approach) to an efficient solution for this problem?
EDIT:
// Start from second bit (next to LSB)
for (int i = 1; i < n-1; i++)
{
    // If current bit and its previous bit are 1, but next
    // bit is not 1.
    if (bin[i] == 1 && bin[i-1] == 1 && bin[i+1] != 1)
    {
        // Make the next bit 1
        bin[i+1] = 1;

        // Make all bits before current bit as 0 to make
        // sure that we get the smallest next number
        for (int j = i; j >= last_final; j--)
            bin[j] = 0;

        // Store position of the bit set so that this bit
        // and bits before it are not changed next time.
        last_final = i + 1;
    }
}
If you see any sequence "011" in the binary representation of your number, then change the '0' to a '1' and set every bit after it to '0' (since that gives the minimum).
The algorithm suggests starting from the right (the least significant bit), but if you start from the left, find the leftmost sequence "011" and do as above, you get the solution one half of the time. The other half is when the next bit to the left of this sequence is a '1'. When you change the '0' to a '1', you create a new "011" sequence that needs to be treated the same way.
The "last finalized bit" is the leftmost '0' bit that sees only '0' bits to its right. This is because all of those '0's won't change in the next steps.
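To make the "finalized bit" idea concrete, here is a self-contained sketch of that approach (my own reconstruction, not the code from the linked post; the names are mine):
#include <iostream>
#include <vector>

long long nextSparse(long long x) {
    // Binary representation, least significant bit first, plus one spare
    // high bit so that a carry out of the top (e.g. 3 -> 4) has room.
    std::vector<int> bin;
    for (long long t = x; t > 0; t >>= 1) bin.push_back(t & 1);
    bin.push_back(0);

    int last_final = 0;                  // bits below this index are fixed at 0
    for (int i = 1; i + 1 < (int)bin.size(); i++) {
        if (bin[i] == 1 && bin[i - 1] == 1 && bin[i + 1] != 1) {
            bin[i + 1] = 1;              // carry into the next position
            for (int j = i; j >= last_final; j--)
                bin[j] = 0;              // clear lower bits to keep the result minimal
            last_final = i + 1;          // nothing below this index changes again
        }
    }

    long long result = 0;
    for (int i = (int)bin.size() - 1; i >= 0; i--)
        result = (result << 1) | bin[i];
    return result;
}

int main() {
    std::cout << nextSparse(6) << " " << nextSparse(44) << "\n";  // prints 8 64
}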
So here are some observations to help solve this question:
Convert the number to its binary format. If the last digit is 0, then we can append either 1 or 0 at the end, but if the last digit is 1, then we can only append 0 at the end.
So the naive approach is to simply iterate and check every number, but we can optimize it. To see how, let's look closely at some examples:
let's say n=5   -> 101        next sparse is 5 (101)
let's say n=14  -> 1110       next sparse is 16 (10000)
let's say n=39  -> 100111     next sparse is 40 (101000)
let's say n=438 -> 110110110  next sparse is 512 (1000000000)
To optimize the naive approach, the idea is to use bit manipulation:
the key concept is that if we AND a bit sequence with a shifted version of itself, we effectively remove the trailing 1 from every run of consecutive 1s.
for n=5
0101 (5)
& 1010 (5<<1)
---------
0000
So if the value of n&(n<<1) is zero, the number does not contain any consecutive 1's (if it were not zero, there would have to be a pair of consecutive 1's somewhere in the number), and n itself is the answer.
for n=14
01110 (14)
& 11100 (14<<1)
----------------
01100
The value is not zero, so we just increment our number by 1; our new number is 15.
Now we perform the same check again:
01111 (15)
& 11110 (15<<1)
------------------------------
01110
Again the result is not zero, so we increment the number by 1 and do the same for n = 16:
010000 (16)
& 100000 (16<<1)
------------------------
000000
Now the result is zero, so we have encountered a number which does not contain any consecutive 1's, and our answer is 16.
In the same manner you can check other numbers too.
Hope you get the idea. Happy coding!
int nextSparse(int n) {
    // code here
    while (true)
    {
        if (n & (n << 1))
            n++;
        else
            return n;
    }
}
Note on complexity: each n&(n<<1) check is constant time, but the number of increments this loop performs can be large in the worst case (for example, n = 96 = 1100000 has to be incremented all the way to 128), so the loop itself is not O(log n) overall; the O(log n) bound applies to the bit-array algorithm described above.

What is the fastest algorithm to compute all permutations of a binary number with the same hamming weight?

I want an algorithm to compute all permutations of a fixed size binary number with a given hamming weight. For example, if the hamming weight is 2 and the binary size is 4 then there are these outputs:
0011
0110
0101
1100
1010
1001
The number of such combinations is C(n,r); in this example, C(4,2) = 6.
Note that you could solve this just by counting from 0 to 2^n and checking whether each number has the right number of set bits. However, that is not a fast solution.
I am considering solving the problem using the bitset class in C++, and I need to increase N.
I want to add that there is an obvious recursive algorithm for this problem, but because of stack overflow it is not a good answer. I have received a good answer here based on Gosper's hack. However, since I need to scale up the input and maybe use an MPI implementation for it later, I need a scalable library. Unsigned int is not big enough, and I would rather use a scalable and fast library like bitset. That solution is not applicable here, since there is no addition in the bitset library. Any other solution?
You can implement the "lexicographically next bit-permutation" using Gosper's Hack:
unsigned int v; // current permutation of bits
unsigned int w; // next permutation of bits
unsigned int t = v | (v - 1); // t gets v's least significant 0 bits set to 1
// Next set to 1 the most significant bit to change,
// set to 0 the least significant ones, and add the necessary 1 bits.
w = (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(v) + 1));
Or if you don't have ctz (_BitScanForward on MSVC),
unsigned int t = (v | (v - 1)) + 1;
w = t | ((((t & -t) / (v & -v)) >> 1) - 1);
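As a usage sketch (my own, not part of the quoted snippet), here is the ctz form driving a loop that prints every 4-bit value with two bits set:
#include <cstdio>

int main() {
    unsigned int v = 3;                     // 0011, the smallest 4-bit value with two bits set
    while (v < (1u << 4)) {                 // stay within 4 bits
        std::printf("%u\n", v);             // prints 3, 5, 6, 9, 10, 12
        unsigned int t = v | (v - 1);
        // GCC/Clang builtin; use _BitScanForward on MSVC as noted above.
        v = (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(v) + 1));
    }
}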
You can generate them in the following manner:
Initially, make a vector with n - r zeros in the beginning and r ones in the end (0011 for n = 4 and r = 2).
Then, repeat the following procedure:
Find the rightmost one such that a zero is located to the left from it.
If there is no such one, we are done.
Move it to the left (by one position, that is, swap it with a zero).
Move all the ones that are located to the right from it to the very end of the vector.
For example, if we have 0110, we first move the rightmost one that can be moved to the left and get 1010, then we shift all ones to the right from it to the end of the vector and get 1001.
This solution has O(C(n, r) * n) time complexity. One more feature of this solution: it generates elements in lexicographical order.
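A rough sketch of that procedure (my own code, not from the answer), using a vector of 0/1 so it scales beyond a machine word:
#include <algorithm>
#include <iostream>
#include <vector>

// Advances v to the next combination with the same number of ones;
// returns false once all the ones sit at the far left.
bool nextCombination(std::vector<int>& v) {
    int n = v.size();
    int i = -1;
    for (int k = n - 1; k > 0; k--)            // rightmost 1 with a 0 to its left
        if (v[k] == 1 && v[k - 1] == 0) { i = k; break; }
    if (i == -1) return false;
    std::swap(v[i], v[i - 1]);                 // move that 1 one position left
    auto ones = std::count(v.begin() + i + 1, v.end(), 1);
    std::fill(v.begin() + i + 1, v.end(), 0);  // push the remaining ones ...
    std::fill(v.end() - ones, v.end(), 1);     // ... to the very end
    return true;
}

int main() {
    std::vector<int> v = {0, 0, 1, 1};         // n = 4, r = 2
    do {
        for (int b : v) std::cout << b;
        std::cout << '\n';
    } while (nextCombination(v));              // prints the C(4,2) = 6 vectors
}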

Algorithm for finding kth binary number with certain properties

Let's assume we will consider binary numbers which have length 2n, where n might be about 1000. We are looking for the kth such number (k is limited by 10^9) which has the following properties:
The amount of 1's is equal to the amount of 0's, which can be written as: #(1) = #(0)
Every prefix of this number has to contain at least as many 0's as 1's. It might be easier to understand after negating the sentence: there is no prefix which contains more 1's than 0's.
And basically that's it.
So to make it clear let's do some example:
n=2, k=2
we have to take binary number of length 2n:
0000
0001
0010
0011
0100
0101
0110
0111
1000
and so on...
And now we have to find the 2nd number which fulfills those two requirements. So we see 0011 is the first one, and 0101 is the second one.
If we change k to 3, then the answer doesn't exist: 0110 has the same amount of both bits, but it contains the prefix 011, so it doesn't fulfill the second constraint, and the same goes for all numbers which have 1 as the most significant bit.
So what I did so far to find algorithm?
Well, my first idea was to generate all possible bit settings and check whether each has those two properties, but generating them all would take O(2^(2n)), which is not an option for n=1000.
Additionally, I realized there is no need to check any number smaller than 0011 for n=2, 000111 for n=3, and so on; frankly speaking, those in which half of the most significant bits remain "untouched", because such numbers have no possibility of fulfilling the #(1) = #(0) condition. Using that I can cut the search space in half, but it doesn't help much: instead of 2 * forever I have a forever-running algorithm. It's still O(2^n) complexity, which is way too big.
Any idea for algorithm?
Conclusion
This text has been created as a result of my thoughts after reading Andy Jones's post.
First of all, I won't post the code I used, since it is point 6 in the document from Andy's post, Kasa 2009. All you have to do is treat nr as what I described as k. The unranking Dyck words algorithm helps us find the answer much faster. However, it has one bottleneck:
while (k >= C(n-i,j))
Considering that n <= 1000, the Catalan numbers can be quite huge, even C(999,999). We could use big-number arithmetic, but instead I came up with a little trick to work around it and use a standard integer.
We don't need to know how big a Catalan number actually is, as long as we know whether it is bigger than k. So we will build the Catalan numbers, caching partial sums, in an n x n table.
  ...
5 |                      42 ...
4 |                  14  42 ...
3 |              5  14   28 ...
2 |          2   5   9   14 ...
1 |      1   2   3   4    5 ...
0 |  1   1   1   1   1    1 ...
  +---------------------------- ...
     0   1   2   3   4    5 ...
To generate it is quite trivial:
C(x,0) = 1
C(x,y) = C(x,y-1) + C(x-1,y) where y > 0 && y < x
C(x,y) = C(x,y-1) where x == y
So, as we can see, only this:
C(x,y) = C(x,y-1) + C(x-1,y) where y > 0 && y < x
can cause overflow.
Let's stop at this point and provide a definition.
k-flow - not a real integer overflow, but rather the information that the value of C(x,y) is bigger than k.
My idea is to check, after each application of the above formula, whether C(x,y) is greater than k or whether either summand is -1. If so, we store -1 instead, which acts as a marker that k-flow has happened. It is fairly obvious that if a k-flowed number is summed with any non-negative number it is still k-flowed; in particular, the sum of two k-flowed numbers is k-flowed.
The last thing we have to prove is that a real overflow is impossible. A real overflow could only happen if we sum a + b where neither of them is k-flowed but their sum overflows.
Of course that is impossible, since the maximum value satisfies a + b <= 2 * k <= 2*10^9 <= 2,147,483,647, where the last value in this inequality is the maximum value of a signed int. I also assume that int has 32 bits, as in my case.
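A small sketch of that capped table (my own code, not the OP's; the names are mine), where -1 is the k-flow marker:
#include <vector>

// C[x][y] follows the recurrences above; any entry larger than k, or any sum
// involving a marked entry, is stored as -1 ("k-flowed"). Plain int suffices
// because unmarked summands are each <= k <= 10^9, as argued above.
std::vector<std::vector<int>> cappedCatalanTable(int n, int k) {
    std::vector<std::vector<int>> C(n + 1, std::vector<int>(n + 1, 0));
    for (int x = 0; x <= n; x++) {
        C[x][0] = 1;                                   // C(x,0) = 1
        for (int y = 1; y <= x; y++) {
            if (y == x) {
                C[x][y] = C[x][y - 1];                 // C(x,x) = C(x,x-1)
            } else {
                int a = C[x][y - 1], b = C[x - 1][y];
                C[x][y] = (a == -1 || b == -1) ? -1 : a + b;
            }
            if (C[x][y] > k) C[x][y] = -1;             // mark k-flow
        }
    }
    return C;
}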
The numbers you are describing correspond to Dyck words. Pt 2 of Kasa 2009 gives a simple algorithm for enumerating them in lexicographic order. Its references should be helpful if you want to do any further reading.
As an aside (and be warned I'm half asleep as I write this, so it might be wrong), the wikipedia article notes that the number of Dyck words of length 2n is the nth Catalan number, C(n). You might want to find the smallest n such that C(n) is larger than the k you're looking for, and then enumerate Dyck words starting from X^n Y^n.
I'm sorry I misunderstood this problem last time, so I have edited my answer and now I can promise it is correct; you can test the code first. The complexity is O(n^2). The detailed answer follows.
First, we can turn the problem into the following equivalent one:
We are looking for the kth largest number (k is limited by 10^9) which has the following properties:
The amount of 1's is equal to the amount of 0's, which can be written as: #(1) = #(0)
Every prefix of this number has to contain at least as many [[1's as 0's]], which means: there is no prefix which contains more [[0's than 1's]].
Let's give an example to explain it: let n=3 and k=4. The number of satisfying numbers is 5, and the picture below shows what we should find in the previous problem and in the new problem:
|                       000111 ------> 111000                          ^
|                       001011 ------> 110100                          |
|                       001101 ------> 110010                          |
|  previous 4th number  010011 ------> 101100  new 4th largest number  |
v                       010101 ------> 101010                          |
So after we solve the new problem, we just need to take the bitwise NOT.
Now the main problem is how to solve the new problem. First, let A be the array, so A[m] (1<=m<=2n) can only be 1 or 0. Let DP[v][q] be the number of ways to fill {A[2n-v+1]~A[2n]} that satisfy condition 2 with #(1)=q; then DP[2n][n] is the number of satisfying numbers.
A[1] can only be 1 or 0. If A[1]=1, the number of candidates is DP[2n-1][n-1]; if A[1]=0, it is DP[2n-1][n]. Now we want to find the kth largest number: if k<=DP[2n-1][n-1], the kth largest number's A[1] must be 1, and we can then decide A[2] using DP[2n-2][n-2]; if k>DP[2n-1][n-1], the kth largest number's A[1] must be 0, we set k=k-DP[2n-1][n-1], and we decide A[2] using DP[2n-2][n-1]. By the same reasoning we can decide A[j] one by one until there is nothing left to compare. Now we can give an example (n=3, k=4).
(We use dynamic programming to build the DP matrix; the DP equation is DP[v][q]=DP[v-1][q-1]+DP[v-1][q].)
Note: we need a value in the leftmost column to compare against, so we add a helper column to the left of the DP matrix; it is not part of the DP matrix, and every number in it is 1.
The numbers in brackets are initialized by ourselves; the initialization simply follows the meaning of the DP matrix.
DP matrix = (1)  (0)  (0)  (0)    4<=DP[5][2]=5  --> A[1]=1
            (1)  (1)  (0)  (0)    4>DP[4][1]=3   --> A[2]=0, k=4-3=1
            (1)  (2)  (0)  (0)    1<=DP[3][1]=2  --> A[3]=1
            (1)  (3)   2   (0)    1<=1           --> A[4]=1
            (1)  (4)   5   (0)    no number to compare, A[5]~A[6]=0
            (1)  (5)   9    5     so the number is 101100
If you have not understood it clearly, you can use the code below to understand it.
Note: DP[2n][n] grows very fast, so this code only works when n<=19; in the problem n<1000, so you would need big-number arithmetic, and the code could also be optimized with bit operations. The code is just a reference.
/*--------------------------------------------------
Environment: X86 Ubuntu GCC
Author: Cong Yu
Blog: aimager.com
Mail: funcemail#gmail.com
Build_Date: Mon Dec 16 21:52:49 CST 2013
Function:
--------------------------------------------------*/
#include <stdio.h>
int DP[2000][1000];
// kth is the result
int kth[1000];
void Oper(int n, int k){
    int i,j,h;
    // temp is the compare number
    // jishu is the index of the bit currently being decided in kth
    int temp,jishu=0;
    // initialize
    for(i=1;i<=2*n;i++)
        DP[i-1][0]=i-1;
    for(j=2;j<=n;j++)
        for(i=1;i<=2*j-1;i++)
            DP[i-1][j-1]=0;
    for(i=1;i<=2*n;i++)
        kth[i-1]=0;
    // operate DP matrix with dynamic programming
    for(j=2;j<=n;j++)
        for(i=2*j;i<=2*n;i++)
            DP[i-1][j-1]=DP[i-2][j-2]+DP[i-2][j-1];
    // the main thought
    if(k>DP[2*n-1][n-1])
        printf("nothing\n");
    else{
        i=2*n;
        j=n;
        for(;j>=1;i--,jishu++){
            if(j==1)
                temp=1;
            else
                temp=DP[i-2][j-2];
            if(k<=temp){
                kth[jishu]=1;
                j--;
            }
            else{
                kth[jishu]=0;
                if(j==1)
                    k-=1;
                else
                    k-=DP[i-2][j-2];
            }
        }
        for(i=1;i<=2*n;i++){
            kth[i-1]=1-kth[i-1];
            printf("%d",kth[i-1]);
        }
        printf("\n");
    }
}
int main(){
    int n,k;
    scanf("%d",&n);
    scanf("%d",&k);
    Oper(n,k);
    return 0;
}

bit vector implementation of set in Programming Pearls, 2nd Edition

On Page 140 of Programming Pearls, 2nd Edition, Jon proposed an implementation of sets with bit vectors.
We'll turn now to two final structures that exploit the fact that our sets represent integers. Bit vectors are an old friend from Column 1. Here are their private data and functions:
enum { BITSPERWORD = 32, SHIFT = 5, MASK = 0x1F };
int n, hi, *x;
void set(int i) { x[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { x[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i) { return x[i>>SHIFT] & (1<<(i & MASK)); }
As I gathered, the central idea of a bit vector to represent an integer set, as described in Column 1, is that the i-th bit is turned on if and only if the integer i is in the set.
But I am really at a loss at the algorithms involved in the above three functions. And the book doesn't give an explanation.
I can only get that i & MASK is to get the lower 5 bits of i, while i>>SHIFT is to move i 5 bits toward the right.
Would anybody elaborate more on these algorithms? Bit operations have always seemed like a mystery to me. :(
Bit Fields and You
I'll use a simple example to explain the basics. Say you have an unsigned integer with four bits:
[0][0][0][0] = 0
You can represent any number here from 0 to 15 by converting it to base 2. Say we have the right end be the smallest:
[0][1][0][1] = 5
So the first bit adds 1 to the total, the second adds 2, the third adds 4, and the fourth adds 8. For example, here's 8:
[1][0][0][0] = 8
So What?
Say you want to represent a binary state in an application-- if some option is enabled, if you should draw some element, and so on. You probably don't want to use an entire integer for each one of these- it'd be using a 32 bit integer to store one bit of information. Or, to continue our example in four bits:
[0][0][0][1] = 1 = ON
[0][0][0][0] = 0 = OFF //what a huge waste of space!
(Of course, the problem is more pronounced in real life since 32-bit integers look like this:
[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] = 0
The answer to this is to use a bit field. We have a collection of properties (usually related ones) which we will flip on and off using bit operations. So, say, you might have 4 different lights on a piece of hardware that you want to be on or off.
3 2 1 0
[0][0][0][0] = 0
(Why do we start with light 0? I'll explain this in a second.)
Note that this is an integer, and is stored as an integer, but is used to represent multiple states for multiple objects. Crazy! Say we turn lights 2 and 1 on:
3 2 1 0
[0][1][1][0] = 6
The important thing you should note here: There's probably no obvious reason why lights 2 and 1 being on should equal six, and it may not be obvious how we would do anything with this scheme of information storage. It doesn't look more obvious if you add more bits:
3 2 1 0
[1][1][1][0] = 0xE \\what?
Why do we care about this? Do we have exactly one state for each number between 0 and 15? How are we going to manage this without some insane series of switch statements? Ugh...
The Light at the End
So if you've worked with binary arithmetic a bit before, you might realize that the relationship between the numbers on the left and the numbers on the right is, of course, base 2. That is:
1*(2^3) + 1*(2^2) + 1*(2^1) + 0*(2^0) = 0xE
So each light is present in the exponent of each term of the equation. If the light is on, there is a 1 next to its term- if the light is off, there is a zero. Take the time to convince yourself that there is exactly one integer between 0 and 15 that corresponds to each state in this numbering scheme.
Bit operators
Now that we have this done, let's take a second to see what bitshifting does to integers in this setup.
[0][0][0][1] = 1
When you shift bits to the left or the right in an integer, it literally moves the bits left and right. (Note: I 100% disavow this explanation for negative numbers! There be dragons!)
1<<2 = 4
[0][1][0][0] = 4
4>>1 = 2
[0][0][1][0] = 2
You will encounter similar behavior when shifting numbers represented with more than one bit. Also, it shouldn't be hard to convince yourself that x>>0 or x<<0 is just x. Doesn't shift anywhere.
This probably explains the naming scheme of the Shift operators to anyone who wasn't familiar with them.
Bitwise operations
This representation of numbers in binary can also be used to shed some light on the operations of bitwise operators on integers. Each bit in the first number is xor-ed, and-ed, or or-ed with its fellow number. Take a second to venture to wikipedia and familiarize yourself with the function of these Boolean operators - I'll explain how they function on numbers but I don't want to rehash the general idea in great detail.
...
Welcome back! Let's start by examining the effect of the OR (|) operator on two integers, stored in four bit.
OR OPERATOR ON:
[1][0][0][1] = 0x9
[1][1][0][0] = 0xC
________________
[1][1][0][1] = 0xD
Tough! This is a close analogue to the truth table for the boolean OR operator. Notice that each column ignores the adjacent columns and simply fills in the result column with the result of the first bit and the second bit OR'd together. Note also that the value of anything or'd with 1 is 1 in that particular column. Anything or'd with zero remains the same.
The table for AND (&) is interesting, though somewhat inverted:
AND OPERATOR ON:
[1][0][0][1] = 0x9
[1][1][0][0] = 0xC
________________
[1][0][0][0] = 0x8
In this case we do the same thing- we perform the AND operation with each bit in a column and put the result in that bit. No column cares about any other column.
Important lesson about this, which I invite you to verify by using the diagram above: anything AND-ed with zero is zero. Also, equally important- nothing happens to numbers that are AND-ed with one. They stay the same.
The final table, XOR, has behavior which I hope you all find predictable by now.
XOR OPERATOR ON:
[1][0][0][1] = 0x9
[1][1][0][0] = 0xC
________________
[0][1][0][1] = 0x5
Each bit is being XOR'd with its column, yadda yadda, and so on. But look closely at the first row and the second row. Which bits changed? (Half of them.) Which bits stayed the same? (No points for answering this one.)
The bit in the first row is being changed in the result if (and only if) the bit in the second row is 1!
The one lightbulb example!
So now we have an interesting set of tools we can use to flip individual bits. Let's go back to the lightbulb example and focus only on the first lightbulb.
0
[?] \\We don't know if it's one or zero while coding
We know that we have an operation that can always make this bit equal to one- the OR 1 operator.
0|1 = 1
1|1 = 1
So, ignoring the rest of the bulbs, we could do this
4_bit_lightbulb_integer |= 1;
and know for sure that we did nothing but set the first lightbulb to ON.
3 2 1 0
[0][0][0][?] = 0 or 1? \\4_bit_lightbulb_integer
[0][0][0][1] = 1
________________
[0][0][0][1] = 0x1
Similarly, we can AND the number with zero. Well- not quite zero- we don't want to affect the state of the other bits, so we will fill them in with ones.
I'll use the unary (one-argument) operator for bit negation. The ~ (NOT) bitwise operator flips all of the bits in its argument. ~(0X1):
[0][0][0][1] = 0x1
________________
[1][1][1][0] = 0xE
We will use this in conjunction with the AND bit below.
Let's do 4_bit_lightbulb_integer & 0xE
3 2 1 0
[0][1][0][?] = 4 or 5? \\4_bit_lightbulb_integer
[1][1][1][0] = 0xE
________________
[0][1][0][0] = 0x4
We're seeing a lot of integers on the right-hand-side which don't have any immediate relevance. You should get used to this if you deal with bit fields a lot. Look at the left-hand side. The bit on the right is always zero and the other bits are unchanged. We can turn off light 0 and ignore everything else!
Finally, you can use the XOR bit to flip the first bit selectively!
3 2 1 0
[0][1][0][?] = 4 or 5? \\4_bit_lightbulb_integer
[0][0][0][1] = 0x1
________________
[0][1][0][*] = 4 or 5?
We don't actually know what the value of * is now- just that flipped from whatever ? was.
Combining Bit Shifting and Bitwise operations
The interesting fact about these two operations is when taken together they allow you to manipulate selective bits.
[0][0][0][1] = 1 = 1<<0
[0][0][1][0] = 2 = 1<<1
[0][1][0][0] = 4 = 1<<2
[1][0][0][0] = 8 = 1<<3
Hmm. Interesting. I'll mention the negation operator here (~) as it's used in a similar way to produce the needed bit values for ANDing stuff in bit fields.
[1][1][1][0] = 0xE = ~(1<<0)
[1][1][0][1] = 0xD = ~(1<<1)
[1][0][1][1] = 0xB = ~(1<<2)
[0][1][1][1] = 0X7 = ~(1<<3)
Are you seeing an interesting relationship between the shift value and the corresponding lightbulb position of the shifted bit?
The canonical bitshift operators
As alluded to above, we have an interesting, generic method for turning on and off specific lights with the bit-shifters above.
To turn on a bulb, we generate the 1 in the right position using bit shifting, and then OR it with the current lightbulb positions. Say we want to turn on light 3, and ignore everything else. We need to get a bit shifting operation that ORs
3 2 1 0
[?][?][?][?] \\all we know about these values at compile time is where they are!
and 0x8
[1][0][0][0] = 0x8
Which is easy, thanks to bitshifting! We'll pick the number of the light and switch the value over:
1<<3 = 0x8
and then:
4_bit_lightbulb_integer |= 0x8;
3 2 1 0
[1][?][?][?] \\the ? marks have not changed!
And we can guarantee that the bit for the 3rd lightbulb is set to 1 and that nothing else has changed.
Clearing a bit works similarly- we'll use the negated bits table above to, say, clear light 2.
~(1<<2) = 0xB = [1][0][1][1]
4_bit_lightbulb_integer & 0xB:
3 2 1 0
[?][?][?][?]
[1][0][1][1]
____________
[?][0][?][?]
The XOR method of flipping bits is the same idea as the OR one.
So the canonical methods of bit switching are this:
Turn on the light i:
4_bit_lightbulb_integer|=(1<<i)
Turn off light i:
4_bit_lightbulb_integer&=~(1<<i)
Flip light i:
4_bit_lightbulb_integer^=(1<<i)
Wait, how do I read these?
In order to check a bit we can simply zero out all of the bits except for the one we care about. We'll then check to see if the resulting value is greater than zero- since this is the only value that could possibly be nonzero, it will make the entire integer nonzero if and only if it is nonzero. For example, to check bit 2:
1<<2:
[0][1][0][0]
4_bit_lightbulb_integer:
[?][?][?][?]
1<<2 & 4_bit_lightbulb_integer:
[0][?][0][0]
Remember from the previous examples that the value of ? didn't change. Remember also that anything AND 0 is 0. So, we can say for sure that if this value is greater than zero, the switch at position 2 is true and the lightbulb is on. Similarly, if that lightbulb is off, the value of the entire thing will be zero.
(You can alternately shift the entire value of 4_bit_lightbulb_integer over by i bits and AND it with 1. I don't remember off the top of my head if one is faster than the other but I doubt it.)
So the canonical checking function:
Check if bit i is on:
if (4_bit_lightbulb_integer & 1<<i) {
\\do whatever
}
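Here is a small, compilable recap of those canonical operations (my own demo, using an ordinary unsigned int as the lightbulb word):
#include <cstdio>

int main() {
    unsigned int bulbs = 0;              // all four lights off
    bulbs |= (1u << 3);                  // turn on light 3
    bulbs |= (1u << 1);                  // turn on light 1
    bulbs &= ~(1u << 1);                 // turn off light 1
    bulbs ^= (1u << 0);                  // flip light 0 (off -> on)
    for (int i = 3; i >= 0; i--)         // prints 1001
        std::printf("%d", (bulbs & (1u << i)) ? 1 : 0);
    // (bulbs >> i) & 1 is the equivalent check mentioned above.
    std::printf("\n");
}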
The specifics
Now that we have a complete set of tools for bitwise operations, we can look at the specific example here. This is basically the same idea- except a much more concise and powerful way of executing it. Let's look at this function:
void set(int i) { x[i>>SHIFT] |= (1<<(i & MASK)); }
From the canonical implementation I'm going to make a guess that this is trying to set some bits to 1! Let's take an integer and look at what's going on here if I feed the value 0x32 (50 in decimal) into i:
x[0x32>>5] |= (1<<(0x32 & 0x1f))
Well, that's a mess.. let's dissect this operation on the right. For convenience, pretend there are 24 more irrelevant zeros, since these are both 32 bit integers.
...[0][0][0][1][1][1][1][1] = 0x1F
...[0][0][1][1][0][0][1][0] = 0x32
________________________
...[0][0][0][1][0][0][1][0] = 0x12
It looks like everything is being cut off at the boundary on top where 1s turn into zeros. This technique is called Bit Masking. Interestingly, the boundary here restricts the resulting values to be between 0 and 31... Which is exactly the number of bit positions we have for a 32 bit integer!
x[0x32>>5] |= (1<<(0x12))
Let's look at the other half.
...[0][0][1][1][0][0][1][0] = 0x32
Shift five bits to the right:
...[0][0][0][0][0][0][0][1] = 0x01
Note that this transformation exactly destroyed all information from the first part of the function- we have 32-5 = 27 remaining bits which could be nonzero. This indicates which of 2^27 integers in the array of integers are selected. So the simplified equation is now:
x[1] |= (1<<0x12)
This just looks like the canonical bit-setting operation! We've just chosen the second integer in the array (x[1]) and are setting its bit number 0x12.
So the idea is to use the upper 27 bits to pick an integer in the array, and the lower five bits to indicate which of the 32 bits in that integer to set.
The key to understanding what's going on is to recognize that BITSPERWORD = 2^SHIFT. Thus, x[i>>SHIFT] finds which 32-bit element of the array x has the bit corresponding to i. (By shifting i 5 bits to the right, you're simply dividing by 32.) Once you have located the correct element of x, the lower 5 bits of i can then be used to find which particular bit of x[i>>SHIFT] corresponds to i. That's what i & MASK does; by shifting 1 by that number of bits, you move the bit corresponding to 1 to the exact position within x[i>>SHIFT] that corresponds to the ith bit in x.
Here's a bit more of an explanation:
Imagine that we want capacity for N bits in our bit vector. Since each int holds 32 bits, we will need (N + 31) / 32 int values for our storage (that is, N/32 rounded up). Within each int value, we will adopt the convention that bits are ordered from least significant to most significant. We will also adopt the convention that the first 32 bits of our vector are in x[0], the next 32 bits are in x[1], and so forth. Here's the memory layout we are using (showing the bit index in our bit vector corresponding to each bit of memory):
+----+----+-------+----+----+----+
x[0]: | 31 | 30 | . . . | 02 | 01 | 00 |
+----+----+-------+----+----+----+
x[1]: | 63 | 62 | . . . | 34 | 33 | 32 |
+----+----+-------+----+----+----+
etc.
Our first step is to allocate the necessary storage capacity:
x = new int[(N + BITSPERWORD - 1) >> SHIFT]
(We could make provision for dynamically expanding this storage, but that would just add complexity to the explanation.)
Now suppose we want to access bit i (either to set it, clear it, or just to know its current value). We need to first figure out which element of x to use. Since there are 32 bits per int value, this is easy:
subscript for x = i / 32
Making use of the enum constants, the x element we want is:
x[i >> SHIFT]
(Think of this as a 32-bit-wide window into our N-bit vector.) Now we have to find the specific bit corresponding to i. Looking at the memory layout, it's not hard to figure out that the first (rightmost) bit in the window corresponds to bit index 32 * (i >> SHIFT). (The window starts after i >> SHIFT slots in x, and each slot has 32 bits.) Since that's the first bit in the window (position 0), then the bit we're interested in is at position
i - (32 * (i >> SHIFT))
in the window. With a little experimenting, you can convince yourself that this expression is always equal to i % 32 (actually, that's one definition of the mod operator) which, in turn, is always equal to i & MASK. Since this last expression is the fastest way to calculate what we want, that's what we'll use.
From here, the rest is pretty simple. We start with a single bit in the least-significant position of the window (that is, the constant 1), and move it to the left by i & MASK bits to get it to the position in the window corresponding to bit i in the bit vector. This is where the expression
1 << (i & MASK)
comes from. With the bit now moved to where we want it, we can use this as a mask to set, clear, or query the value of the bit at that position in x[i>>SHIFT] and we know that we're actually setting, clearing, or querying the value of bit i in our bit vector.
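Putting that explanation together, here is a self-contained sketch (my own wrapper, reusing the book's set/clr/test bodies and the allocation expression from this answer):
#include <cstdio>

class BitVector {
    enum { BITSPERWORD = 32, SHIFT = 5, MASK = 0x1F };
    int *x;
public:
    explicit BitVector(int n) { x = new int[(n + BITSPERWORD - 1) >> SHIFT](); }  // zero-initialized
    ~BitVector() { delete[] x; }
    void set(int i)  { x[i >> SHIFT] |=  (1 << (i & MASK)); }
    void clr(int i)  { x[i >> SHIFT] &= ~(1 << (i & MASK)); }
    int  test(int i) { return x[i >> SHIFT] & (1 << (i & MASK)); }
};

int main() {
    BitVector v(100);
    v.set(50);
    std::printf("%d %d\n", v.test(50) != 0, v.test(51) != 0);   // prints 1 0
    v.clr(50);
    std::printf("%d\n", v.test(50) != 0);                       // prints 0
}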
If you store your bits in an array of n words you can imagine them to be laid out as a matrix with n rows and 32 columns (BITSPERWORD):

   bit 31                                  bit 0
 0 xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx
 1 xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx
 2 xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx
 ....
 n xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx
To get the k-th bit you divide k by 32. The (integer) result will give you the row (word) the bit is in, and the remainder will give you which bit it is within the word.
Dividing by 2^p can be done simply by shifting p positions to the right. The remainder can be obtained by taking the p rightmost bits (i.e. the bitwise AND with (2^p - 1)).
In C terms:
#define div32(k) ((k) >> 5)
#define mod32(k) ((k) & 31)
#define word_the_bit_is_in(k) div32(k)
#define bit_within_word(k) mod32(k)
Hope it helps.

How to do matrix conversions by row and columns toggles?

I have got a square matrix consisting of elements that are either 1 or 0. An ith row toggle toggles all the ith row elements (1 becomes 0 and vice versa), and a jth column toggle toggles all the jth column elements. I have got another square matrix of similar size. I want to change the initial matrix to the final matrix using the minimum number of toggles. For example,
|0 0 1|
|1 1 1|
|1 0 1|
to
|1 1 1|
|1 1 0|
|1 0 0|
would require a toggle of the first row and of the last column.
What will be the correct algorithm for this?
In general, the problem will not have a solution. To see this, note that transforming matrix A to matrix B is equivalent to transforming the matrix A - B (computed using binary arithmetic, so that 0 - 1 = 1) to the zero matrix. Look at the matrix A - B, and apply column toggles (if necessary) so that the first row becomes all 0's or all 1's. At this point, you're done with column toggles -- if you toggle one column, you have to toggle them all to get the first row correct. If even one row is a mixture of 0's and 1's at this point, the problem cannot be solved. If each row is now all 0's or all 1's, the problem is solvable by toggling the appropriate rows to reach the zero matrix.
To get the minimum, compare the number of toggles needed when the first row is turned to 0's vs. 1's. In the OP's example, the candidates would be toggling column 3 and row 1 or toggling columns 1 and 2 and rows 2 and 3. In fact, you can simplify this by looking at the first solution and seeing if the number of toggles is smaller or larger than N -- if larger than N, then toggle the opposite rows and columns.
It's not always possible. If you start with a 2x2 matrix with an even number of 1s you can never arrive at a final matrix with an odd number of 1s.
Algorithm
Simplify the problem from "Try to transform A into B" into "Try to transform M into 0", where M = A xor B. Now all the positions which must be toggled have a 1 in them.
Consider an arbitrary position in M. It is affected by exactly one column toggle and exactly one row toggle. If its initial value is V, the presence of the column toggle is C, and the presence of the row toggle is R, then the final value F is V xor C xor R. That's a very simple relationship, and it makes the problem trivial to solve.
Notice that, for each position, R = F xor V xor C = 0 xor V xor C = V xor C. If we set C then we force the value of R, and vice versa. That's awesome, because it means if I set the value of any row toggle then I will force all of the column toggles. Any one of those column toggles will force all of the row toggles. If the result is the 0 matrix, then we have a solution. We only need to try two cases!
Pseudo-code
function solve(Matrix M) as bool possible, bool[] rowToggles, bool[] colToggles:
    For var b in {true, false}
        colToggles = array from c in M.colRange select b xor M(0, c)
        rowToggles = array from r in M.rowRange select colToggles[0] xor M(r, 0)
        if none from c in M.colRange, r in M.rowRange
                where colToggles[c] xor rowToggles[r] xor M(r, c) != 0 then
            return true, rowToggles, colToggles
        end if
    next var
    return false, null, null
end function
Analysis
The analysis is trivial. We try two cases, within which we run along a row, then a column, then all cells. Therefore if there are r rows and c columns, meaning the matrix has size n = c * r, then the time complexity is O(2 * (c + r + c * r)) = O(c * r) = O(n). The only space we use is what is required for storing the outputs = O(c + r).
Therefore the algorithm takes time linear in the size of the matrix, and uses space linear in the size of the output. It is asymptotically optimal for obvious reasons.
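A C++ sketch of that approach (my own translation of the pseudo-code above; the names are mine):
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Works on M = A xor B cell by cell: fixes the first column toggle to b,
// derives every other toggle, and checks that the whole matrix is zeroed.
bool solveToggles(const Matrix& A, const Matrix& B,
                  std::vector<int>& rowT, std::vector<int>& colT) {
    int R = A.size(), C = A[0].size();
    for (int b = 0; b <= 1; b++) {
        colT.assign(C, 0);
        rowT.assign(R, 0);
        for (int c = 0; c < C; c++)            // column toggles forced by row 0
            colT[c] = b ^ A[0][c] ^ B[0][c];
        for (int r = 0; r < R; r++)            // row toggles forced by column 0
            rowT[r] = colT[0] ^ A[r][0] ^ B[r][0];
        bool ok = true;
        for (int r = 0; r < R && ok; r++)
            for (int c = 0; c < C && ok; c++)
                ok = ((A[r][c] ^ B[r][c] ^ rowT[r] ^ colT[c]) == 0);
        if (ok) return true;                   // rowT/colT hold one valid solution
    }
    return false;
}
Trying both values of b yields up to two consistent toggle sets; to recover the minimum number of toggles you would compare their toggle counts, as noted in the other answers.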
I came up with a brute force algorithm.
The algorithm is based on 2 conjectures:
(so it may not work for all matrices - I'll verify them later)
The minimum (number of toggles) solution will contain a specific row or column only once.
In whatever order we apply the steps to convert the matrix, we get the same result.
The algorithm:
Lets say we have the matrix m = [ [1,0], [0,1] ].
m: 1 0
0 1
We generate a list of all row and column numbers,
like this: ['r0', 'r1', 'c0', 'c1']
Now we brute force, i.e. examine, every possible combination of steps.
For example, we start with 1-step solutions,
ksubsets = [['r0'], ['r1'], ['c0'], ['c1']]
if no element is a solution, then we proceed with 2-step solutions,
ksubsets = [['r0', 'r1'], ['r0', 'c0'], ['r0', 'c1'], ['r1', 'c0'], ['r1', 'c1'], ['c0', 'c1']]
etc...
A ksubsets element (combo) is a list of toggle steps to apply in a matrix.
Python implementation (tested on version 2.5)
# Recursive definition (+ is the join of sets)
# S = {a1, a2, a3, ..., aN}
#
# ksubsets(S, k) = {
#   {{a1}+ksubsets({a2,...,aN}, k-1)} +
#   {{a2}+ksubsets({a3,...,aN}, k-1)} +
#   {{a3}+ksubsets({a4,...,aN}, k-1)} +
#   ... }
# example: ksubsets([1,2,3], 2) = [[1, 2], [1, 3], [2, 3]]
def ksubsets(s, k):
    if k == 1: return [[e] for e in s]
    ksubs = []
    ss = s[:]
    for e in s:
        if len(ss) < k: break
        ss.remove(e)
        for x in ksubsets(ss, k-1):
            l = [e]
            l.extend(x)
            ksubs.append(l)
    return ksubs

def toggle_row(m, r):
    for i in range(len(m[r])):
        m[r][i] = m[r][i] ^ 1

def toggle_col(m, i):
    for row in m:
        row[i] = row[i] ^ 1

def toggle_matrix(m, combos):
    # example of combos, ['r0', 'r1', 'c3', 'c4']
    # 'r0' toggle row 0, 'c3' toggle column 3, etc.
    import copy
    k = copy.deepcopy(m)
    for combo in combos:
        if combo[0] == 'r':
            toggle_row(k, int(combo[1:]))
        else:
            toggle_col(k, int(combo[1:]))
    return k

def conversion_steps(sM, tM):
    # Brute force algorithm.
    # Returns the minimum list of steps to convert sM into tM.
    rows = len(sM)
    cols = len(sM[0])
    combos = ['r'+str(i) for i in range(rows)] + \
             ['c'+str(i) for i in range(cols)]
    for n in range(0, rows + cols - 1):
        for combo in ksubsets(combos, n + 1):
            if toggle_matrix(sM, combo) == tM:
                return combo
    return []
Example:
m: 0 0 0
   0 0 0
   0 0 0

k: 1 1 0
   1 1 0
   0 0 1
>>> m = [[0,0,0],[0,0,0],[0,0,0]]
>>> k = [[1,1,0],[1,1,0],[0,0,1]]
>>> conversion_steps(m, k)
['r0', 'r1', 'c2']
>>>
If you can only toggle the rows, and not the columns, then there will only be a subset of matrices that you can convert into the final result. If this is the case, then it would be very simple:
for every row, i:
    if matrix1[i] == matrix2[i]
        continue;
    else
        toggle matrix1[i];
        if matrix1[i] == matrix2[i]
            continue
        else
            die("cannot make similar");
This is a state space search problem. You are searching for the optimum path from a starting state to a destination state. In this particular case, "optimum" is defined as "minimum number of operations".
The state space is the set of binary matrices generatable from the starting position by row and column toggle operations.
ASSUMING that the destination is in the state space (NOT a valid assumption in some cases: see Henrik's answer), I'd try throwing a classic heuristic search (probably A*, since it is about the best of the breed) algorithm at the problem and see what happened.
The first, most obvious heuristic is "number of correct elements".
Any decent Artificial Intelligence textbook will discuss search and the A* algorithm.
You can represent your matrix as a nonnegative integer, with each cell in the matrix corresponding to exactly one bit in the integer. On a system that supports 64-bit long long unsigned ints, this lets you play with anything up to 8x8. You can then use exclusive-OR operations on the number to implement the row and column toggle operations.
CAUTION: the raw total state space size is 2^(N^2), where N is the number of rows (or columns). For a 4x4 matrix, that's 2^16 = 65536 possible states.
Rather than look at this as a matrix problem, take the 9 bits from each array, load each of them into 2-byte size types (16 bits, which is probably the source of the arrays in the first place), then do a single XOR between the two.
(the bit order would be different depending on your type of CPU)
The first array would become: 0000000001111101
The second array would become: 0000000111110101
A single XOR would produce the output. No loops required. All you'd have to do is 'unpack' the result back into an array, if you still wanted to. You can read the bits without resorting to that, though.
I think brute force is not necessary.
The problem can be rephrased in terms of a group. The matrices over the field with 2 elements constitute a commutative group with respect to addition.
As pointed out before, the question whether A can be toggled into B is equivalent to see if A-B can be toggled into 0. Note that toggling of row i is done by adding a matrix with only ones in the row i and zeros otherwise, while the toggling of column j is done by adding a matrix with only ones in column j and zeros otherwise.
This means that A-B can be toggled to the zero matrix if and only if A-B is contained in the subgroup generated by the toggling matrices.
Since addition is commutative, the toggling of columns takes place first, and we can apply the approach of Marius first to the columns and then to the rows.
In particular, the toggling of the columns must make every row either all ones or all zeros. There are two possibilities:
Toggle columns such that every 1 in the first row becomes zero. If after this there is a row in which both ones and zeros occur, there is no solution. Otherwise apply the same approach for the rows (see below).
Toggle columns such that every 0 in the first row becomes 1. If after this there is a row in which both ones and zeros occur, there is no solution. Otherwise apply the same approach for the rows (see below).
Since the columns have been toggled successfully, in the sense that each row contains only ones or only zeros, there are two possibilities:
Toggle rows such that every 1 in the first column becomes zero.
Toggle rows such that every 0 in the first column becomes one.
Of course in the step for the rows, we take the possibility which results in fewer toggles, i.e. we count the ones in the first column and then decide how to toggle.
In total, only 2 cases have to be considered, namely how the columns are toggled; for the row step, the toggling can be decided by counting to minimize the number of toggles in the second step.
