What does the % operator do in Ruby in N % 2? - ruby

I am trying to decode a line in a Rails project: if counter % 2 == 1. I am trying to figure out what the % does in this if statement.

% is the modulo operator. The result of counter % 2 is the remainder of counter / 2.
n % 2 is often a good way of determining if a number n is even or odd. If n % 2 == 0, the number is even (because no remainder means that the number is evenly divisible by 2); if n % 2 == 1, the number is odd.

In answer to the question "What does the % symbol do or mean in Ruby?", it is:
1) The modulo binary operator (as has been mentioned)
17 % 10 #=> 7
2) The alternative string delimiter token
%Q{hello world} #=> "hello world"
%Q(hello world) #=> "hello world"
%Q[hello world] #=> "hello world"
%Q!hello world! #=> "hello world"
# i.e. choose your own bracket pair
%q(hello world) #=> 'hello world'
%x(pwd) #=> `pwd`
%r(.*) #=> /.*/
3) The string format operator (shorthand for Kernel::sprintf)
"05d" % 123 #=> "00123"

That's the modulo operator. It gives the remainder when counter is divided by 2.
For example:
3 % 2 == 1
2 % 2 == 0

Regardless of how it works, the modulo operator is probably not the best code for the purpose (even though we are not given much context). As Jörg mentioned in a comment, the expression if counter.odd? is probably the intent, and is more readable.
If this is view code and used to determine (for example) alternating row colors, then you may be able to do without the counter altogether by using the built-in Rails helper cycle(). For example, you can use cycle('odd','even') as a class name for a table row, eliminating the counter and surrounding if/then logic.
Another thought: if this is within an each block, you may be able to use each_with_index and eliminate the extraneous counter variable.
My refactoring $0.02.

Also keep in mind that Ruby's definition of the modulo (%) operator differs from that of C and Java. In Ruby, -7 % 3 is 2; in C and Java, the result is -1 instead. In Ruby, the sign of the result (for the % operator) is always the same as the sign of the second operand.
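Python happens to follow the same convention as Ruby here (the sign of % follows the divisor), while its math.fmod follows the C convention (the sign follows the dividend), so a quick Python session can show both behaviors side by side:

import math

# Floored modulo (Ruby and Python convention): sign follows the divisor.
print(-7 % 3)            # => 2
print(7 % -3)            # => -2

# C-style remainder (truncated division): sign follows the dividend.
print(math.fmod(-7, 3))  # => -1.0
print(math.fmod(7, -3))  # => 1.0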

It's the modulo operator.
http://en.wikipedia.org/wiki/Modulo_operation

It is the modulo operator, which is a fancy way of saying it's the remainder operator.
So if you divide a number by two and the integer remainder is one, then you know the number was odd. Your example checks for odd numbers.
Often this is done to highlight odd number rows with a different background color, making it easier to read large lists of data.

This is a very basic question. % is the modulo operator: if counter % 2 == 1 evaluates to true for every odd counter and to false for every even one.
If you're learning Ruby you should learn how to use irb; there you can try things out and perhaps answer such questions yourself.
Try entering
100.times{|i| puts "#{i} % 2 == 1 #=> #{i % 2 == 1}"}
into your irb console and see the output; then it should be clear what % does.
And you really should take a look at the Rails API documentation (1.9, 1.8.7); there you would have found the answer to your question under % (Fixnum), with a further link to a detailed description of divmod (Numeric):
Returns an array containing the quotient and modulus obtained by dividing num by aNumeric. If q, r = x.divmod(y), then
q = floor(float(x)/float(y))
x = q*y + r
The quotient is rounded toward -infinity, as shown in the following table:
    a |  b  |  a.divmod(b)  |   a/b   | a.modulo(b) | a.remainder(b)
------+-----+---------------+---------+-------------+---------------
 13   |  4  |   3,  1       |   3     |    1        |     1
 13   | -4  |  -4, -3       |  -4     |   -3        |     1
-13   |  4  |  -4,  3       |  -4     |    3        |    -1
-13   | -4  |   3, -1       |   3     |   -1        |    -1
 11.5 |  4  |   2,  3.5     |   2.875 |    3.5      |     3.5
 11.5 | -4  |  -3, -0.5     |  -2.875 |   -0.5      |     3.5
-11.5 |  4  |  -3,  0.5     |  -2.875 |    0.5      |    -3.5
-11.5 | -4  |   2, -3.5     |   2.875 |   -3.5      |    -3.5
Examples
11.divmod(3) #=> [3, 2]
11.divmod(-3) #=> [-4, -1]
11.divmod(3.5) #=> [3, 0.5]
(-11).divmod(3.5) #=> [-4, 3.0]
(11.5).divmod(3.5) #=> [3, 1.0]

To give a few ways to say it:
Modulo operator
Remainder operator
Modular residue
Strictly speaking, if a % b = c, then (for b > 0) c is the unique integer such that
a == c (mod b) and 0 <= c < b
where x == y (mod m) iff x - y = km for some integer k.
This is equivalent to the remainder: by the division algorithm, a = bk + c for some integer k, where c is the remainder, which gives a - c = bk, and that obviously implies a == c (mod b).
(Is there a way to use LaTeX on Stackoverflow?)

Related

Fast calculation of probability distribution in board game Da Vinci Code

I'm interested in efficiently calculating the probability distribution over possible secret numbers given what one can observe of the opponents' hand (and your own hand) in the board game Da Vinci Code. A link to the game here: https://boardgamegeek.com/boardgame/8946/da-vinci-code
I have abstracted the problem into the following:
You are given an array A of length N and a finite set of numbers Si for each index i of the array. Now,
we are to place a number from Si at each index i to fill the entire array A;
while ensuring that the number is unique across the entire array A;
and for 3 disjoint subarrays A1, A2, A3 of A such that concat(A1, A2, A3) = A, the numbers in each subarray must follow a strictly increasing order;
given all the possible numbers to form A that satisfy the above constraints, what is the probability distribution over each number at each index?
Here I provide an example below:
Assume we have the following array of length 5, where each column of the diagram lists the candidate set Si for the corresponding index:
| 6 6 | 6 6 | 6 |
|   5 |   5 |   |
| 4 4 |     | 4 |
|     | 3 3 |   |
| 2   | 2 2 |   |
| 1 1 |     |   |
| ___ | ___ | _ |
|  A1 |  A2 | A3|
The set of all possible arrays are:
14236
14256
14356
15234
15236
15264
15364
16234
16254
16354
24356
25364
26354
45236
Therefore the probability distribution over each number [1-6] at each index is:
6     0   4/14     0   3/14   6/14
5     0   6/14     0   6/14      0
4  1/14   4/14     0      0   8/14
3     0      0  6/14   5/14      0
2  3/14      0  8/14      0      0
1 10/14      0     0      0      0
  ____________  ___________  _____
       A1            A2        A3
Brute forcing this problem is obviously doable but I have a gut feeling that there must be some more efficient algorithms for this.
The reason why I think so is that one can derive the probability distribution from the set of all possibilities but not the other way around, so the distribution itself must contain less information than the set of all possibilities. Therefore, I believe that we do not need to generate all possibilities just to obtain the probability distribution.
Hence, I am wondering if there is any smart matrix operation we could use for this problem or even fixed-point iteration/density evolution to approximate the end probability distribution? Some other potentially more efficient approaches to this problem are also appreciated.
Edit: By brute force, I mean specifically enumerating all possibilities with constraint propagation, like in Sudoku. My hope is to obtain an exact solution, or an approximation that works well (better than plain Monte Carlo), with a better running time than CP.
Edit2: The better solution I desire should have the characteristic that it does not need to generate all possibilities to obtain or approximate the probability distribution.
Did you consider Constraint Propagation?
When you assign a number to a position, that number cannot appear in any other position, so exclude that number from the remaining positions
When you assign a number in the first column of a subarray, the second column must contain a larger value, so exclude all values that are lower or equal
With a BF approach in your example the code would generate and check 4 * 4 * 3 * 4 * 2 = 384 possibilities; with the CP approach we only generate 65 possibilities.
Here is a sample Python implementation:
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class DaVinci:
    grid : List[List[int]]
    top : int
    lastcol : int = 0
    solved : List = field(default_factory=list)
    count : int = 0
    distrib : List[Dict[int,int]] = field(init=False)

    def __post_init__(self):
        self.lastcol = len(self.grid) - 1
        self.distrib = [{x: 0 for x in range(1, self.top + 1)} for y in range(len(self.grid))]
        self.solve_next(current=0, even=True, blocked=[], minval=0, solving=[])
        self.count = len(self.solved)

    def solve_next(self, current, even, blocked, minval, solving):
        for n in self.grid[current]:
            # propagate constraints: n must be unused (not blocked) and,
            # within a subarray, larger than the previous number
            if n not in blocked and n > minval:
                if current != self.lastcol:
                    # n * even keeps the increasing constraint inside a pair
                    # and resets it at a subarray boundary (True is 1, False is 0)
                    self.solve_next(current + 1, not even, blocked + [n], n * even, solving + [n])
                else:
                    for col in range(self.lastcol):
                        self.distrib[col][solving[col]] += 1
                    self.distrib[self.lastcol][n] += 1
                    self.solved.append(solving + [n])

    def show_solved(self):
        for sol in self.solved:
            print(''.join(map(str, sol)))

    def show_distrib(self):
        for i in range(1, self.top + 1):
            print(i, end=' ')
            for col in range(len(self.grid)):
                print(f'{self.distrib[col][i]:2d}/{self.count}', end=' ')
            print()
dv = DaVinci([[1,2,4,6],[1,4,5,6],[2,3,6],[2,3,5,6],[4,6]], 6)
dv.show_solved()
14236
14256
14356
15234
15236
15264
15364
16234
16254
16354
24356
25364
26354
45236
dv.show_distrib()
1 10/14 0/14 0/14 0/14 0/14
2 3/14 0/14 8/14 0/14 0/14
3 0/14 0/14 6/14 5/14 0/14
4 1/14 4/14 0/14 0/14 8/14
5 0/14 6/14 0/14 6/14 0/14
6 0/14 4/14 0/14 3/14 6/14
A simple idea to get an approximation for the distribution is to use a Monte Carlo approach.
Set a variable total := 0 and a matrix M[N][Q] with all entries initially set to zero (Q is the count of allowed numbers).
Fix a positive integer K. Perform K iterations. At each iteration, for each i in [1..N], take a random element from Si and fill the array A. When the array A is all filled, verify in O(N) if it satisfies your conditions. If so, increment by one the variable total and iterate through the array, incrementing the matrix entries M[i][A[i]] by one, for i in [1..N].
In the end, iterate through all the elements of the matrix M in O(N Q) and divide its elements by total to get an approximation for the distribution.
Total time complexity is O(N (K + Q)).
You can also precalculate stuff to make the approximation more precise. For example, you can precalculate all increasing sequences in the groups A1, A2 and A3. Put them in arrays I1, I2, I3. Then, at each iteration, instead of taking random elements from each Si, you take random sequences from I1, I2 and I3 and verify if the concatenation has no repeated elements (in O(N)). If so, proceed as before. The total time complexity (apart from the expensive precalculation) remains O(N (K + Q)).
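As a concrete illustration, here is a minimal sketch of this Monte Carlo approach for the example above (the names S, subarrays, valid and the sample count K are mine, not from the question):

import random

S = [[1,2,4,6], [1,4,5,6], [2,3,6], [2,3,5,6], [4,6]]   # the sets Si
subarrays = [(0, 2), (2, 4), (4, 5)]                    # index ranges of A1, A2, A3
N, Q, K = len(S), 6, 200000

def valid(A):
    if len(set(A)) != N:                        # all numbers must be unique
        return False
    return all(all(A[i] < A[i+1] for i in range(lo, hi-1))
               for lo, hi in subarrays)         # strictly increasing per subarray

M = [[0] * (Q + 1) for _ in range(N)]
total = 0
for _ in range(K):
    A = [random.choice(s) for s in S]           # fill A with random picks
    if valid(A):
        total += 1
        for i, v in enumerate(A):
            M[i][v] += 1

for i in range(N):                              # approximate distribution per index
    print([round(M[i][v] / max(total, 1), 2) for v in range(1, Q + 1)])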
Start by converting all legal subarray selections into bitvectors.
E.g., for A2 we have [2,3], [2,5], [2,6], [3,5], [3,6]
[2,3] as a bitvector is 000110
[3,5] is 010100
Next, arrange your three subarrays by the number of bitvectors they have.
Next, put these in a hash for each subarray/member combination except the smallest subarray. Use the smallest set bit as the key.
E.g. For [2,3] in A2, we'd have {2 => 000110}
Note that the values of the map need to be in an array, since there will be multiple bitvectors for each index/element combo.
Finally:
For every bitvec of subarray_small:
    For every non-set bit of that bitvec:
        Find the list that has that bit as a key in subarray_medium
        For every bitvec in this list:
            Check if the inverse of (bitvec_small | bitvec_medium) is in the hash for subarray_large.
            If it is, we have a valid arrangement; update your frequency counts.
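A simplified sketch of this bitvector idea in Python for the worked example (it enumerates the increasing selections per subarray as masks and checks disjointness directly, rather than building the keyed hash described above; S and subarrays are names I introduce):

S = [[1,2,4,6], [1,4,5,6], [2,3,6], [2,3,5,6], [4,6]]
subarrays = [(0, 2), (2, 4), (4, 5)]

def masks(lo, hi):
    # bitmasks of all strictly increasing selections for one subarray
    result = []
    def rec(i, last, mask):
        if i == hi:
            result.append(mask)
            return
        for v in S[i]:
            if v > last:
                rec(i + 1, v, mask | 1 << (v - 1))
    rec(lo, 0, 0)
    return result

m1s, m2s, m3s = (masks(lo, hi) for lo, hi in subarrays)
count = sum(1 for m1 in m1s for m2 in m2s if not m1 & m2
              for m3 in m3s if not m3 & (m1 | m2))
print(count)  # => 14, matching the enumeration above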

Prolog Extended Euclidean Algorithm

I have been struggling with some Prolog code for several days and I couldn't find a way out of it. I am trying to write the extended Euclidean algorithm and find the values p and s in the equation:
a*p + b*s = gcd(a,b)
Here is what I have tried:
common(X,X,X,_,_,_,_,_,_).
common(0,Y,Y,_,_,_,_,_,_).
common(X,0,X,_,_,_,_,_,_).
common(X,Y,_,1,0,L1,L2,SF,TF):-
    append(L1,1,[H]),
    append(L2,0,[A]),
    SF is H,
    TF is A,
    common(X,Y,_,0,1,[H],[A],SF,TF).
common(X,Y,_,0,1,L1,L2,SF,TF):-
    append(L1,0,[_,S2]),
    append(L2,1,[_,T2]),
    Q is truncate(X/Y),
    S is 1-Q*0, T is 0-Q*1,
    common(X,Y,_,S,T,[S2,S],[T2,T],SF,TF).
common(X,Y,N,S,T,[S1,S2],[T1,T2],SF,TF):-
    Q is truncate(X/Y),
    K is X-(Y*Q),
    si_finder(S1,S2,Q,SF),
    ti_finder(T1,T2,Q,TF),
    common(Y,K,N,S,T,[S2,S],[T2,T],SF,TF).

si_finder(PP,P,Q,C) :- C is PP - Q*P.
ti_finder(P2,P1,QA,C2) :- C2 is P2 - QA*P1.
After a little search I found that the s and p coefficients start from 1 and 0, and their second values are 0 and 1, respectively. Then the sequence continues in a pattern, which is what I have done in the si_finder and ti_finder predicates. The common predicates are where I tried to control the pattern recursively. However, common keeps returning false on every call. Can anyone help me implement this algorithm in Prolog?
Thanks in advance.
First let's think about the arity of the predicate. Obviously you want to have the numbers A and B as well as the Bézout coefficients P and S as arguments. Since the algorithm is calculating the GCD anyway, it is opportune to have that as an argument as well. That leaves us with arity 5. As we're talking about the extended Euclidean algorithm, let's call the predicate eeuclid/5. Next, consider an example: let's use the algorithm to calculate P, S and GCD for A=242 and B=69:
quotient (Q) | remainder (B1)  |  P  |  S
-------------+-----------------+-----+-----
             | 242             |   1 |   0
             | 69              |   0 |   1
242/69 = 3   | 242 − 3*69 = 35 |   1 |  -3
69/35 = 1    | 69 − 1*35 = 34  |  -1 |   4
35/34 = 1    | 35 − 1*34 = 1   |   2 |  -7
34/1 = 34    | 34 − 34*1 = 0   | -69 | 242
We can observe the following:
The algorithm stops if the remainder becomes 0
The line before the last row contains the GCD in the remainder column (in this example 1) and the Bézout coefficients in the P and S columns respectively (in this example 2 and -7)
The quotient is calculated from the previous two remainders. So in the next iteration A becomes B and B becomes B1.
P and S are calculated from their respective predecessors. For example: P3 = P1 - 3*P2 = 1 - 3*0 = 1 and S3 = S1 - 3*S2 = 0 - 3*1 = -3. And since it's sufficient to have the previous two P's and S's, we might as well pass them on as pairs, e.g. P1-P2 and S1-S2.
The algorithm starts with the pairs P: 1-0 and S: 0-1
The algorithm starts with the bigger number
Putting all this together, the calling predicate has to ensure that A is the bigger number and, in addition to its five arguments, it has to pass along the starting pairs 1-0 and 0-1 to the predicate describing the actual relation, here a_b_p_s_/7:
:- use_module(library(clpfd)).

eeuclid(A,B,P,S,GCD) :-
    A #>= B,
    GCD #= A*P + B*S,             % <- new
    a_b_p_s_(A,B,P,S,1-0,0-1,GCD).
eeuclid(A,B,P,S,GCD) :-
    A #< B,
    GCD #= A*P + B*S,             % <- new
    a_b_p_s_(B,A,S,P,1-0,0-1,GCD).
The first rule of a_b_p_s_/7 describes the base case, where B=0 and the algorithm stops. Then A is the GCD and P1, S1 are the Bézout coefficients. Otherwise the quotient Q, the remainder B1 and the new values for P and S are calculated and a_b_p_s_/7 is called with those new values:
a_b_p_s_(A,0,P1,S1,P1-_P2,S1-_S2,A).
a_b_p_s_(A,B,P,S,P1-P2,S1-S2,GCD) :-
    B #> 0,
    A #> B,                       % <- new
    Q #= A/B,
    B1 #= A mod B,
    P3 #= P1-(Q*P2),
    S3 #= S1-(Q*S2),
    a_b_p_s_(B,B1,P,S,P2-P3,S2-S3,GCD).
Querying this with the above example yields the desired result:
?- eeuclid(242,69,P,S,GCD).
P = 2,
S = -7,
GCD = 1 ;
false.
And indeed: gcd(242,69) = 1 = 2*242 − 7*69
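For a quick cross-check of the table above outside Prolog, the same iteration can be written as a few lines of Python (a sketch, with my own variable names):

def ext_gcd(a, b):
    p1, p2 = 1, 0          # coefficients for a, as in the P column
    s1, s2 = 0, 1          # coefficients for b, as in the S column
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        p1, p2 = p2, p1 - q * p2
        s1, s2 = s2, s1 - q * s2
    return a, p1, s1       # GCD, P, S

print(ext_gcd(242, 69))    # => (1, 2, -7)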
EDIT: On second thought I would suggest adding two constraints: firstly Bézout's identity before calling a_b_p_s_/7, and secondly A #> B after the first goal of a_b_p_s_/7. I edited the predicates above and marked the new goals. These additions make eeuclid/5 more versatile. For example, you could ask which numbers A and B have the Bézout coefficients 2 and -7 and 1 as the gcd. There is no unique answer to this query, and Prolog will give you residual goals for every potential solution. However, you can ask for a limited range for A and B, say between 0 and 50, and then use label/1 to get actual numbers:
?- [A,B] ins 0..50, eeuclid(A,B,2,-7,1), label([A,B]).
A = 18,
B = 5 ;
A = 25,
B = 7 ;
A = 32,
B = 9 ;
A = 39,
B = 11 ;
A = 46,
B = 13 ;
false. % <- previously loop here
Without the newly added constraints, the query would not terminate after the fifth solution. However, with the new constraints Prolog is able to determine that there are no more solutions between 0 and 50.

Finding all possible combinations of rows in a matrix where sum of columns represents a specific row vector

I need to find all possible combinations of rows in a matrix where the column-wise sums equal a specific row vector.
Example:
Consider the following matrix
| 0 0 2 |
| 1 1 0 |
| 0 1 2 |
| 1 1 2 |
| 0 1 0 |
| 2 1 2 |
I need to find the combinations of rows whose column-wise sums equal the following row vector:
| 2 2 2 |
The possible combinations are:
1.
| 1 1 0 |
| 1 1 2 |
2.
| 0 1 0 |
| 2 1 2 |
What is the best way to find them?
ALGORITHM
One option is to turn this into the subset sum problem by choosing a base b and treating each row as a number in base b.
For example, with a base of 10 your initial problem turns into:
Consider the list of numbers
002
110
012
112
010
212
Find all subsets that sum to 222
This problem is well known and is solvable via dynamic programming (see the wikipedia page).
If all your entries are nonnegative, then you can use David Pisinger's linear time algorithm, which has complexity O(nC), where C is the target number and n is the length of your list.
CHOICE OF BASE
The complexity of the algorithm is determined by the choice of the base b.
For the algorithm to be correct you need to choose the base larger than the sum of the digits in each column. (This is needed to avoid a carry from one digit overflowing into the next, which would produce spurious solutions.)
However, note that if you choose a smaller base you will still get all the correct solutions, plus some incorrect solutions. It may be worth considering using a smaller base (which will make the subset sum algorithm work much faster), followed by a postprocessing stage that checks all the solutions found and discards any incorrect ones.
Too small a base will produce an exponential number of incorrect solutions to discard, so the best size of base will depend on the details of your problem.
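For instance, the smallest safe base can be read off the column sums (a one-liner sketch; in this example the largest column sum is 8, so the code below simply uses b = 10):

A = [[0,0,2], [1,1,0], [0,1,2], [1,1,2], [0,1,0], [2,1,2]]
base = max(sum(col) for col in zip(*A)) + 1   # one more than the largest column sum
print(base)  # => 9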
EXAMPLE CODE
Python code to implement this algorithm.
from collections import defaultdict

A = [ [0, 0, 2],
      [1, 1, 0],
      [0, 1, 2],
      [1, 1, 2],
      [0, 1, 0],
      [2, 1, 2] ]
target = [2, 2, 2]
b = 10

def convert2num(a):
    # treat the row a as the digits of a number in base b
    t = 0
    for d in a:
        t = b * t + d
    return t

B = [convert2num(a) for a in A]
target_num = convert2num(target)

M = defaultdict(list)
for v, a in zip(B, A):
    M[v].append(a)  # Store a reverse index to allow us to look up rows

# First build the DP array
# Map from each reachable sum to the set of numbers that can be its last addend
DP = defaultdict(set)
DP[0] = set()
for v in B:
    for old_value in list(DP.keys()):
        new_value = old_value + v
        if new_value <= target_num:
            DP[new_value].add(v)

# Then search for solutions
def go(goal, sol):
    if goal == 0:
        # Double check
        assert [sum(col) for col in zip(*sol)] == target
        print(sol)
        return
    for v in DP[goal]:
        for a in M[v]:
            sol.append(a)
            go(goal - v, sol)
            sol.pop()

go(target_num, [])
This code assumes that b has been chosen large enough to avoid overflow.

What is the best algorithm to find a determinant of a matrix?

Can anyone tell me which is the best algorithm to find the value of the determinant of an N x N matrix?
Here is an extensive discussion.
There are a lot of algorithms.
A simple one is to take the LU decomposition. Then, since
det M = det LU = det L * det U
and both L and U are triangular, the determinant is a product of the diagonal elements of L and U. That is O(n^3). There exist more efficient algorithms.
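For illustration, here is a minimal sketch assuming NumPy and SciPy are available (scipy.linalg.lu returns P, L, U with A = P @ L @ U and a unit diagonal on L, so only U's diagonal and the permutation's sign matter):

import numpy as np
from scipy.linalg import lu

def det_via_lu(A):
    P, L, U = lu(A)             # A = P @ L @ U
    sign = np.linalg.det(P)     # determinant of a permutation matrix: +1 or -1
    return sign * np.prod(np.diag(U))

A = np.array([[2., 3.], [4., 9.]])
print(det_via_lu(A), np.linalg.det(A))  # both print 6.0 (up to floating point)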
Row Reduction
The simplest way (and not a bad way, really) to find the determinant of an nxn matrix is by row reduction. By keeping in mind a few simple rules about determinants, we can compute it in the form:
det(A) = α * det(R), where R is the row echelon form of the original matrix A, and α is some coefficient.
Finding the determinant of a matrix in row echelon form is really easy; you just find the product of the diagonal. Solving the determinant of the original matrix A then just boils down to calculating α as you find the row echelon form R.
What You Need to Know
What is row echelon form?
See http://stattrek.com/matrix-algebra/echelon-form.aspx for a simple definition.
Note: not all definitions require 1s for the leading entries, and it is unnecessary for this algorithm.
You Can Find R Using Elementary Row Operations
Swapping rows, adding multiples of another row, etc.
You Derive α from Properties of Row Operations for Determinants
If B is a matrix obtained by multiplying a row of A by some non-zero constant ß, then
det(B) = ß * det(A)
In other words, you can essentially 'factor out' a constant from a row by just pulling it out front of the determinant.
If B is a matrix obtained by swapping two rows of A, then
det(B) = -det(A)
If you swap rows, flip the sign.
If B is a matrix obtained by adding a multiple of one row to another row in A, then
det(B) = det(A)
The determinant doesn't change.
Note that you can find the determinant, in most cases, with only Rule 3 (when the diagonal of A has no zeros, I believe), and in all cases with only Rules 2 and 3. Rule 1 is helpful for humans doing math on paper, trying to avoid fractions.
Example
(I do unnecessary steps to demonstrate each rule more clearly)
| 2 3 3 1 |
A=| 0 4 3 -3 |
| 2 -1 -1 -3 |
| 0 -4 -3 2 |
R2 <-> R3, -α -> α (Rule 2)
| 2 3 3 1 |
-| 2 -1 -1 -3 |
| 0 4 3 -3 |
| 0 -4 -3 2 |
R2 - R1 -> R2 (Rule 3)
| 2 3 3 1 |
-| 0 -4 -4 -4 |
| 0 4 3 -3 |
| 0 -4 -3 2 |
R2/(-4) -> R2, -4α -> α (Rule 1)
| 2 3 3 1 |
4| 0 1 1 1 |
| 0 4 3 -3 |
| 0 -4 -3 2 |
R3 - 4R2 -> R3, R4 + 4R2 -> R4 (Rule 3, applied twice)
| 2 3 3 1 |
4| 0 1 1 1 |
| 0 0 -1 -7 |
| 0 0 1 6 |
R4 + R3 -> R4 (Rule 3)
| 2 3 3 1 |
4| 0 1 1 1 | = 4 ( 2 * 1 * -1 * -1 ) = 8
| 0 0 -1 -7 |
| 0 0 0 -1 |
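A Python sketch of this procedure (using only Rules 2 and 3, tracking the sign flips from row swaps so the determinant can be read off the diagonal):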
def echelon_form(A, size):
    # Reduce A (in place) to row echelon form using only Rules 2 and 3,
    # accumulating the sign flips caused by row swaps.
    sign = 1
    for i in range(size - 1):
        for j in range(size - 1, i, -1):
            if A[j][i] == 0:
                continue
            if A[j - 1][i] == 0:
                # pivot above is zero: swap the two rows, flip the sign (Rule 2)
                A[j], A[j - 1] = A[j - 1], A[j]
                sign = -sign
                continue
            req_ratio = A[j][i] / A[j - 1][i]
            for k in range(size):
                A[j][k] = A[j][k] - req_ratio * A[j - 1][k]  # Rule 3: det unchanged
    return A, sign

def determinant(A):
    size = len(A)
    R, sign = echelon_form([row[:] for row in A], size)
    det = sign
    for i in range(size):
        det *= R[i][i]  # product of the diagonal of the echelon form
    return det
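With the example matrix from above, determinant([[2,3,3,1],[0,4,3,-3],[2,-1,-1,-3],[0,-4,-3,2]]) returns 8.0, matching the hand calculation.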
If you did some initial research, you have probably found that for N >= 4, calculating a matrix determinant becomes quite complex. Regarding algorithms, I would point you to the Wikipedia article on matrix determinants, specifically the "Algorithmic Implementation" section.
From my own experience, you can easily find an LU or QR decomposition algorithm in existing matrix libraries such as Alglib. The algorithm itself is not quite simple, though.
I am not too familiar with LU factorization, but I know that in order to get either L or U, you need to make the initial matrix triangular (upper triangular for U, lower triangular for L). However, once you get the matrix in triangular form for some nxn matrix A (assuming the only operation your code uses is Rb - k*Ra), you can compute det(A) as the product of the diagonal of the triangular matrix T: det(A) = T(1,1) x T(2,2) x ... x T(n,n). Check this link to see what I'm talking about: http://matrix.reshish.com/determinant.php

Why is (a | b ) equivalent to a - (a & b) + b?

I was looking for a way to do a BITOR() with an Oracle database and came across a suggestion to just use BITAND() instead, replacing BITOR(a,b) with a + b - BITAND(a,b).
I tested it by hand a few times and verified it seems to work for all binary numbers I could think of, but I can't think of a quick mathematical proof of why this is correct.
Could somebody enlighten me?
A & B is the set of bits that are on in both A and B. A - (A & B) leaves you with all those bits that are only on in A. Add B to that, and you get all the bits that are on in A or those that are on in B.
Simple addition of A and B won't work because of carrying where both have a 1 bit. By removing the bits common to A and B first, we know that (A-(A&B)) will have no bits in common with B, so adding them together is guaranteed not to produce a carry.
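One quick way to convince yourself is a brute-force check over all small values (a sketch in Python):

# Verify a | b == a - (a & b) + b for every pair of 8-bit values.
for a in range(256):
    for b in range(256):
        assert a | b == a - (a & b) + b
print("identity holds for all pairs in [0, 255]")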
Imagine you have two binary numbers: a and b. And let's say that these numbers never have a 1 in the same bit at the same time, i.e. if a has a 1 in some bit, then b always has a 0 in the corresponding bit, and vice versa. For example
a = 00100011
b = 11000100
This would be an example of a and b satisfying the above condition. In this case it is easy to see that a | b would be exactly the same as a + b.
a | b = 11100111
a + b = 11100111
Let's now take two numbers that violate our condition, i.e. two numbers that have at least one 1 in some common bit
a = 00100111
b = 11000100
Is a | b the same as a + b in this case? No
a | b = 11100111
a + b = 11101011
Why are they different? When we add two numbers that both have a 1 in the same bit, we produce a so-called carry: the resulting bit is 0, and a 1 is carried to the next bit to the left: 1 + 1 = 10. Operation | has no carry, so 1 | 1 is again just 1.
This means that the difference between a | b and a + b occurs when and only when the numbers have at least one 1 in a common bit. When we sum two numbers with 1s in common bits, those common bits get added "twice" and produce a carry, which ruins the similarity between a | b and a + b.
Now look at a & b. What does a & b calculate? a & b produces the number that has 1 in all bits where both a and b have 1. In our latest example
a = 00100111
b = 11000100
a & b = 00000100
As you saw above, these are exactly the bits that make a + b differ from a | b. The 1 bits in a & b indicate all positions where a carry will occur.
Now, when we do a - (a & b) we effectively remove (subtract) all "offending" bits from a, and only those bits
a - (a & b) = 00100011
Numbers a - (a & b) and b have no common 1 bits, which means that if we add a - (a & b) and b we won't run into a carry, and, if you think about it, we should end up with the same result as if we just did a | b
a - (a & b) + b = 11100111
A&B = C where any bits left set in C are those set in both A and in B.
Either A-C = D or B-C = E sets just these common bits to 0. There is no carrying effect because 1-1=0.
D+B or E+A is similar to A+B except that because we subtracted A&B previously there will be no carry due to having cleared all commonly set bits in D or E.
The net result is that A-A&B+B or B-A&B+A is equivalent to A|B.
Here's a truth table if it's still confusing:

 A | B | A|B      A | B | A&B      A | B | A-B      A | B | A+B
---+---+-----    ---+---+-----    ---+---+-----    ---+---+-----
 0 | 0 |  0       0 | 0 |  0       0 | 0 |  0       0 | 0 |  0
 0 | 1 |  1       0 | 1 |  0       0 | 1 | 0-1      0 | 1 |  1
 1 | 0 |  1       1 | 0 |  0       1 | 0 |  1       1 | 0 |  1
 1 | 1 |  1       1 | 1 |  1       1 | 1 |  0       1 | 1 | 1+1
Notice the borrow row in the - table and the carry row in the + table; we avoid both because A - (A & B) first sets to 0 every position where A and B both have a 1, and then adding B back restores a 1 wherever either input had one. So the OR truth table and the A - (A & B) + B truth table are identical.
Another way to eyeball it is to see that A + B is almost like A | B, except for the carry in the bottom row. A & B isolates that bottom row for us, A - (A & B) moves those isolated cases up two rows in the + table, and then (A - (A & B)) + B becomes equivalent to A | B.
While you could commute this to A + B - (A & B), I was afraid of a possible overflow, but it seems that fear was unjustified:
#include <stdio.h>

int main() {
    unsigned int a = 0xC0000000, b = 0xA0000000;
    printf("%x %x %x %x\n", a, b, a | b, a & b);
    printf("%x %x %x %x\n", a + b, a - (a & b), a - (a & b) + b, a + b - (a & b));
}
c0000000 a0000000 e0000000 80000000
60000000 40000000 e0000000 e0000000
Edit: So I wrote this before there were answers, then there was some 2 hours of down time on my home connection, and I finally managed to post it, noticing only afterwards that it'd been properly answered twice. Personally I prefer referring to a truth table to work out bitwise operations, so I'll leave it in case it helps someone.
