How does repeatedly clearing the least significant bit work in this algorithm?

The following function determines the number of bits you would need to flip to convert integer A to integer B. I have solved it by a different method, but when I read this solution I don't understand it. It clearly works, but my question is why?
def bit_swap_required(a: int, b: int) -> int:
    count, c = 0, a ^ b
    while c:
        count, c = count + 1, c & (c - 1)
    return count
I understand why we do a ^ b: it gives us a '1' at every position where a bit needs to flip. But how does doing c & (c-1) repeatedly give you the EXACT number of '1's in the number?

Subtracting 1 from c clears the least significant set bit in the binary representation of c and sets all the (previously zero) bits to the right of it.
So when you bitwise AND c - 1 with c, the least significant set bit of c and everything to its right become zeros.
You count this as one, and rightly so, because it was exactly one set bit from a ^ b.
You continue this operation until c becomes zero; the number of operations is the number of set bits in c, which is the number of bits that differ between a and b.
To give you an example of what c - 1 does to the binary representation of c:
c = 6, in binary 110
c-1 = 5, in binary 101
(c-1) & (c) = 110 & 101 = 100
The above is one iteration of the loop
Next iteration
c = 4, in binary 100
c-1 = 3, in binary 011
(c-1) & (c) = 100 & 011 = 0
The above successfully counts the number of set bits in 6.
This optimization improves on the approach of right-shifting the number at each iteration and checking whether the least significant bit is set. The shift-based approach iterates once per bit position, up to the position of the most significant set bit: for a power of two such as 2^7, it takes 8 iterations before the number becomes zero. The optimized method iterates once per set bit, so for 2^7 it takes just one iteration.
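To make the difference concrete, here is a small Python sketch (my own illustration, not from the answer) contrasting the two counting methods:

def popcount_shift(c: int) -> int:
    count = 0
    while c:
        count += c & 1  # check the lowest bit
        c >>= 1         # one iteration per bit position
    return count

def popcount_kernighan(c: int) -> int:
    count = 0
    while c:
        c &= c - 1      # clear the lowest set bit
        count += 1      # one iteration per set bit
    return count

print(popcount_shift(2**7), popcount_kernighan(2**7))
# both print 1, but the first loops 8 times and the second loops once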

From the structure of the code, you can guess that c & (c-1) has one set bit fewer than c, even without analyzing the expression.
Indeed, subtracting 1 flips all the bits from the right (because of the borrows) up to and including the rightmost 1. So if you bitwise AND the result with c, only that rightmost 1 disappears.
E.g. c = 1001010100 -> c-1 = 1001010011 -> c & (c-1) = 1001010000.
Next, c = 1001010000 -> c-1 = 1001001111 -> c & (c-1) = 1001000000.

Related

Computing the most accurate result for the sum of a list of fixed-point numbers

Let's say you have to add some 32-bit fixed-point numbers stored in a huge array L and you'd like to get the most accurate result possible. Furthermore, you're not allowed to use anything other than L and 32-bit fixed-point numbers (i.e., you're not allowed to convert them to 64-bit). What would be your approach to get the most accurate result for the sum of the numbers in L?
This would be my current approach, in pseudocode:
L = sort(L)
result = 0
lastMax = false -- indicates whether we extracted the maximum from L last time
while (not empty(L)) and (result != +INF) and (result != -INF) do:
    current = 0
    if lastMax:
        current = extractMin(L) -- gets and removes minimum from L
    else:
        current = extractMax(L) -- gets and removes maximum from L
    result = safeAdd(result, current)
    lastMax = not lastMax

safeAdd(a, b):
    if a = +INF: return +INF
    else if a = -INF: return -INF
    else: return a + b
So I'm alternating between adding the minimum and the maximum of the remaining list L in order to stay within the range spanned by L. The way safeAdd is implemented ensures that once we've exceeded the representable range (i.e., a + b has yielded +INF or -INF, just as in C) we no longer alter the result.
Do you have any suggestions on how to improve the approach?
Sidenote: to be very precise, we further assume that the + operation can yield +INF or -INF, and that these values can be represented as fixed-point numbers in the programming language. But we assume that +INF and -INF do not occur in L, and we ignore the fact that the fixed-point standard may also have a representation for NaN.
If you know that the result is in range, you don't need to care about overflow or underflow.
Here is an example with only 4 bits, to keep it simple:
 7   0111
+1   0001
=    1000 <-- -8, overflow
-1   0001
=    0111 <-- back to 7
If you are not sure whether the result is in range, you need to count overflows and underflows. For a = b + c:
Overflow if b and c are positive and a is negative.
Underflow if b and c are negative and a is positive.
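As a rough illustration of that counting idea, here is a Python sketch (my own; the 32-bit wraparound helpers are hypothetical) that tracks overflows and underflows while summing, so the true sum can be reconstructed at the end:

BITS = 32
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)

def to_signed(x: int) -> int:
    # Interpret the low 32 bits of x as a two's-complement value.
    x &= MASK
    return x - (1 << BITS) if x & SIGN else x

def wrapping_sum(values):
    acc, wraps = 0, 0  # wraps: +1 per overflow, -1 per underflow
    for v in values:
        s = to_signed(acc + v)  # wraparound addition
        if acc > 0 and v > 0 and s < 0:
            wraps += 1          # two positives gave a negative: overflow
        elif acc < 0 and v < 0 and s >= 0:
            wraps -= 1          # two negatives gave a non-negative: underflow
        acc = s
    return acc, wraps           # true sum = acc + wraps * 2**BITS

print(wrapping_sum([2**31 - 1, 1, -1]))  # (2147483647, 0): wrapped up, then back down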

How can I minimise the number of additions?

Multiply two numbers without using the * operator, and with a minimum number of additions.
For example, if the input is 5*8, one way is to add the bigger number the smaller number of times, and that will be the answer. But how can I minimise the number of additions?
One strategy to reduce the number of additions is to add things hierarchically. This is the same strategy used in the classic power algorithm, which applies the same technique to minimize the number of multiplications.
Let's say you need
M = a * 8 = a + a + a + a + a + a + a + a
Once you calculate m2 = a + a, you can substitute it into the above addition and get
M = m2 + m2 + m2 + m2
Then you can calculate m4 = m2 + m2 and arrive at
M = m4 + m4
So, the result is calculated in 3 additions instead of the original 8. Also, adding a value to itself can be replaced by a left shift by 1 bit (if this is allowed), greatly reducing the number of additions.
This technique can be elegantly implemented by analyzing the binary representation of one of the multiplicands (exactly as is typically done in the power algorithm). E.g., if you need to calculate a * b you can do it in this fashion:
int M = 0;
for (int m = a; b != 0; b >>= 1, m <<= 1)
    if ((b & 1) != 0)
        M += m;
The total number of additions this implementation uses is the number of 1 bits in b. It will multiply 5 by 8 in 1 addition (taking b = 8).
Note that in order to achieve the lowest number of additions with this strategy, multiplying the larger number by the smaller one is not necessarily the best idea. E.g., multiplying by 8 takes fewer additions than multiplying by 5.
A better example is 5 * 7. This is essentially long multiplication in binary, but with a clever choice of the multiplier.
If we can use left shift, and a shift doesn't count as an addition: choose the number with the fewer set bits as the multiplier. That is 5 in this case.
    111
  x 101
  -----
    111
   000x  <== not an addition, only a left shift
  111xx
 ------
 100011  <== 2 additions in total
 ------
If we cannot use left shift: note that a left shift is the same as a doubling, i.e., one addition. Then we have to use a slightly different tactic. The multiplicand is doubled (position of MSB - 1) times, and the partial products are combined with (number of set bits - 1) further additions, so a multiplier costs (position of MSB - 1) + (number of set bits - 1) additions in total. In the case of 5 * 8, the values are (3-1) + (2-1) = 3 and (4-1) + (1-1) = 3 respectively, a tie; 8 needs no combining additions at all, so use it as the multiplier.
    101
 x 1000
 ------
    000
   000x  <== doubling (left shift)
  000xx  <== doubling (left shift)
 101xxx  <== doubling (left shift)
 ------
 101000  <== no combining addition needed, so 3 additions in total
 ------
The above costs exactly the three doublings; no further additions are needed to combine the partial products.
I like Codor's suggestion of using shifts and having zero additions!
But if you can truly only use additions and no other operations like shifts, logs, subtractions, etc., I believe the minimal number of additions to compute a * b with this doubling strategy is:
min{ int[log2(a)] + numbits(a), int[log2(b)] + numbits(b) } - 1
where
numbits(n) is the number of ones in the binary representation of integer n.
For example, numbits(4) = 1, numbits(5) = 2, etc.
int[x] is the integer part of float x.
For example, int[3.9] = 3.
Now, how did we get there? First look at your original example. You can at least group additions together. E.g.
8+8=16
16+16=32
32+8=40
To generalize this, if you need to multiply a by b using only additions that use a or results already computed, you need:
int[log2(b)] additions to compute all the intermediate numbers 2^n * a that you need.
In your example, int[log2(5)] = 2: you need 2 additions to compute 16 and 32.
numbits(b) - 1 additions to add all the intermediate results together, where numbits(b) is the number of ones in the binary representation of b.
In your example, 5 = 2^2 + 2^0, so numbits(5) - 1 = 1: you need 1 addition to do 32 + 8.
Interestingly, this means that your statement
add the bigger number smaller number of times
is not always the recipe to minimize the number of additions.
For example, if you need to compute 2^9 * (2^9 - 1), you are better off computing additions based on (2^9-1) than on 2^9 even though 2^9 is larger. The fastest approach is:
x = (2^9-1) + (2^9-1)
And then
x = x+x
8 times for a total of 9 additions.
If instead you added 2^9 to itself, you would need 8 additions to get all the 2^k*2^9 first and then an additional 8 additions to add all these numbers together for a total of 16 additions.
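A quick sanity check of this formula in Python (min_additions is a hypothetical helper of mine, not from the answers):

import math

def min_additions(a: int, b: int) -> int:
    # int[log2(n)] doublings plus numbits(n) - 1 combining additions;
    # take whichever operand is cheaper as the multiplier.
    def cost(n: int) -> int:
        return int(math.log2(n)) + bin(n).count("1") - 1
    return min(cost(a), cost(b))

print(min_additions(8, 5))            # 3: e.g. 8+8=16, 16+16=32, 32+8=40
print(min_additions(2**9, 2**9 - 1))  # 9: double (2^9 - 1) nine times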
Suppose a is to be multiplied by b and we store the result in res. We add a to res only when b is odd; at each step we double a and halve b, looping until b becomes 0. The doubling and halving can be done with bitwise operators.
Let the two given numbers be 'a' and 'b'.
1) Initialize the result 'res' as 0.
2) Do the following while 'b' is greater than 0:
   a) If 'b' is odd, add 'a' to 'res'.
   b) Double 'a' and halve 'b'.
3) Return 'res'.
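Here is that procedure as runnable Python (a direct transcription of the steps above):

def multiply(a: int, b: int) -> int:
    res = 0
    while b > 0:
        if b & 1:     # b is odd: add a to the result
            res += a
        a <<= 1       # double a (equivalently a = a + a)
        b >>= 1       # halve b
    return res

print(multiply(5, 8))  # 40, with a single addition into res since 8 = 0b1000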

Generating all integers n for which n&m==n (m is a given integer) efficiently in C

& is the bitwise AND operator in C.
Example:
m = 21 (binary 10101)
The result:
0 16 4 1 20 5 17 21
My code:
for (int i = 0; i <= m; i++)
    if ((i & m) == i) printf("%d ", i);
This will be very slow when m is large.
How can I find the result quickly (when there are very few answers, such as for m = 2^30)?
i = 0
repeat
    print(i)
    i = (i + (NOT m) + 1) AND m
until i == 0
UPD: a bit simpler code:
i = 0
repeat
    print(i)
    i = (i - 1) AND m
until i == 0
The accepted answer is correct, but a bit sparse on detail, so here's why it works
Observe that the set { n | n & m == n } is the set of all n whose set bits are a subset of m's set bits. If all of m's set bits formed one group starting at the least significant bit, these subsets could be generated by simply counting from 0 to m.
But the bits are not necessarily grouped that way; there may be gaps (zero bits) between them. Those gaps should be skipped, so how do you make the carry from the increment skip across them?
You can make the carry propagate (not exactly skip) through those gap bits by setting them to 1 first. That leaves some garbage in those bits, though, which has to be cleared out afterwards.
So, first set the bits in the gaps: i | ~m (written equivalently as i ^ ~m or i + ~m because the bits in ~m are guaranteed to not overlap with any bits in i).
Then do the increment (+1) and then throw away any garbage left in the gaps: & m.
In total: ((i | ~m) + 1) & m.
For the new code, observe that making a borrow from a subtraction propagate through the gaps is easier, because in order for it to propagate the gap should be filled with zeroes, which it already is. The only issue, then, is that some garbage may be left behind in the gaps, which should be cleared out with & m, giving, in total, (i - 1) & m.
One-line version:
for (i = m; i; i = (i - 1) & m) printf("%d ", i);
printf("0");
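For what it's worth, a quick check of the (i - 1) & m stepping in Python (Python just for brevity; the logic is identical in C, and Python's & on a negative int behaves like two's complement):

m = 21  # binary 10101
i, subsets = 0, []
while True:
    subsets.append(i)
    i = (i - 1) & m  # the borrow propagates through the gaps; & m clears the garbage
    if i == 0:
        break
print(subsets)  # [0, 21, 20, 17, 16, 5, 4, 1]: all 2**3 subsets of m's three set bits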

How to implement Random(a,b) with only Random(0,1)? [duplicate]

In the book Introduction to Algorithms, there is an exercise:
Describe an implementation of the procedure Random(a, b) that only makes calls to Random(0,1). What is the expected running time of your procedure, as a function of a and b? The result of Random(a,b) should be uniformly distributed, just as Random(0,1) is.
For the Random function, the results are integers between a and b, inclusive. For example, Random(0,1) generates either 0 or 1; Random(a, b) generates a, a+1, a+2, ..., b.
My solution is like this:
for i = 1 to b-a
    r = a + Random(0,1)
return r
The running time is T = b - a.
Is this correct? Are the results of my solution uniformly distributed?
Thanks
What if my new solution is like this:
r = a
for i = 1 to b-a // including b-a
    r += Random(0,1)
return r
If it is not correct, why does r += Random(0,1) make r not uniformly distributed?
Others have explained why your solution doesn't work. Here's the correct solution:
1) Find the smallest number, p, such that 2^p > b-a.
2) Perform the following algorithm:
r = 0
for i = 1 to p
    r = 2*r + Random(0,1)
3) If r is greater than b-a, go to step 2.
4) Your result is r+a
So let's try Random(1,3).
So b-a is 2.
2^1 = 2, so p will have to be 2 so that 2^p is greater than 2.
So we'll loop two times. Let's try all possible outputs:
00 -> r=0, 0 is not > 2, so we output 0+1 or 1.
01 -> r=1, 1 is not > 2, so we output 1+1 or 2.
10 -> r=2, 2 is not > 2, so we output 2+1 or 3.
11 -> r=3, 3 is > 2, so we repeat.
So 1/4 of the time, we output 1. 1/4 of the time we output 2. 1/4 of the time we output 3. And 1/4 of the time we have to repeat the algorithm a second time. Looks good.
Note that if you have to do this a lot, two optimizations are handy:
1) If you use the same range a lot, have a class that computes p once so you don't have to compute it each time.
2) Many CPUs have fast ways to perform step 1 that aren't exposed in high-level languages. For example, x86 CPUs have the BSR instruction.
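Here is a sketch of this rejection-sampling procedure in Python (random01 is a hypothetical stand-in for the book's Random(0,1)):

import random

def random01() -> int:
    return random.randint(0, 1)  # stand-in for Random(0,1)

def random_ab(a: int, b: int) -> int:
    n = b - a
    p = n.bit_length()  # smallest p such that 2**p > n
    while True:
        r = 0
        for _ in range(p):
            r = 2 * r + random01()  # append one uniform random bit
        if r <= n:                  # otherwise reject and retry (step 3)
            return a + r

print([random_ab(1, 3) for _ in range(10)])  # ten uniform draws from {1, 2, 3}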
No, it's not correct; that method will concentrate around (a+b)/2. It's a binomial distribution.
Are you sure that Random(0,1) produces integers? It would make more sense if it produced floating-point values between 0 and 1. Then the solution would be an affine transformation, with running time independent of a and b.
An idea I just had, in case it's about integer values: use bisection. At each step you have a range low..high. If Random(0,1) returns 0, the next range is low..(low+high)/2, else (low+high)/2..high.
Details and complexity left to you, since it's homework.
That should create (approximately) a uniform distribution.
Edit: approximately is the important word there. Uniform if b-a+1 is a power of 2, not too far off if it's close, but not good enough generally. Ah, well it was a spontaneous idea, can't get them all right.
No, your solution isn't correct; this sum will have a binomial distribution.
However, you can generate a truly random sequence of 0s and 1s and treat it as a binary number.
repeat
    result = a
    steps = ceiling(log2(b - a))
    for i = 0 to steps
        result += (2 ^ i) * Random(0, 1)
until result <= b
I read the other answers. For fun, here is another way to find the random number:
Allocate an array with b-a+1 elements.
Set all the values to 1.
Iterate through the array. For each nonzero element, flip the coin, as it were. If it came up 0, set the element to 0.
Whenever, after a complete iteration, exactly one nonzero element remains, you have your random number: a+i, where i is the index of the nonzero element (assuming we start indexing at 0). All numbers are then equally likely. (You would have to deal with the case where the last remaining elements are all zeroed in the same pass, but I leave that as an exercise for you.)
This would have O(infinity) worst-case running time... :)
On average, though, half the remaining numbers are eliminated each pass, so the expected running time is about log_2(b-a) passes.
First of all, I assume you are actually accumulating the result, not adding 0 or 1 to a on each step.
Using some probabilities you can prove that your solution is not uniformly distributed: the chance of values near (a+b)/2 is greatest. For instance, if a is 0 and b is 7, the chance that you get the value 4 is C(7,4) divided by 2^7, because no matter which 4 of the 7 random bits are 1, the result will still be 4.
The running time you estimated is correct.
Your solution's pseudocode should look like:
r = a
for i = 1 to b-a
    r += Random(0,1)
return r
As for uniform distribution: assuming the underlying random number generator is perfectly uniform, the odds of each call returning 0 or 1 are 50%, and the final number is the result of b-a such choices made one after another.
So for a = 1, b = 5, there are 4 choices made.
Getting 1 requires all 4 decisions to be 0; the odds of that are 0.5^4 = 6.25%.
Getting 5 requires all 4 decisions to be 1; the odds of that are also 0.5^4 = 6.25%.
As you can see from this, the distribution is not uniform: the odds of each number should be 20%.
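A short simulation (my own illustration) makes the binomial shape visible:

import random
from collections import Counter

# Simulate the accumulating r += Random(0,1) approach for a = 1, b = 5.
trials = [1 + sum(random.randint(0, 1) for _ in range(4)) for _ in range(100_000)]
print(Counter(trials))
# Typical outcome: 3 appears ~37.5% of the time, while 1 and 5 appear only
# ~6.25% each, far from the 20% a uniform distribution would give.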
In the algorithm you created, the results are really not equally distributed.
The result 'r' will always be either 'a' or 'a+1'. It will never go beyond that.
It should look something like this:
r = a
for i = 1 to b-a
    r = r + Random(0,1)
return r
By carrying 'r' through the computation, you include the 'randomness' of all the previous 'for' loop runs.

Algorithm - Partition two numbers about a power-of-two

Given two floating point numbers, p and q where 0 < p < q I am interested in writing a function partition(p,q) that finds the 'simplest' number r that is between p and q. For example:
partition(3.0, 4.1) = 4.0 (2^2)
partition(4.2, 7.0) = 6.0 (2^2 + 2^1)
partition(2.0, 4.0) = 3.0 (2^1 + 2^0)
partition(0.3, 0.6) = 0.5 (2^-1)
partition(1.0, 10.0) = 8.0 (2^3)
In the last instance I am interested in the largest number (so 8 as opposed to 4 or 2).
Let us assume that p and q are both normalized and positive, and p < q.
If p and q have differing exponents, it appears that the number you are looking for is obtained by zeroing the mantissa of q after the leading (and often implicit) 1. The corner cases are left as an exercise, especially the case where q's mantissa is already made of zeroes after the leading, possibly implicit, 1.
If p and q have the same exponent, then we have to look at their mantissas. These mantissas have some bits in common (starting from the most significant end). Let us call c1 c2 .. ck pk+1 ... pn the bits of p's mantissa and c1 c2 .. ck qk+1 ... qn the bits of q's mantissa, where c1 .. ck are the common bits and pk+1, qk+1 differ. Then pk+1 is zero and qk+1 is one (because of the hypotheses). The number with the same exponent and mantissa c1 .. ck 1 0 .. 0 is in the interval p .. q and is the number you are looking for (again, corner cases left as an exercise).
Write the numbers in binary (terminating if possible, so 1 is written as 1.0000..., not 0.1111...),
Scan from left to right, "keeping" all digits at which the two numbers are equal
At the first digit where the two numbers differ, p must be 0 and q must be 1 since p < q:
If q has any more 1 digits after this point, then put a 1 at this point and you're done.
If q has no more 1 digits after this point, then doing that would result in r == q, which is forbidden, so instead append a 0 digit. Follow that by a 1 digit unless doing so would result in r == p, in which case append another 0 and then a 1.
Basically, we truncate q down to the first place at which p and q differ, then jigger it a bit if necessary to avoid r == p or r == q. The result is certainly less than q and greater than p. It is "simplest" (has the least possible number of 1 digits) since any number between p and q must share their common initial sequence. We have added only one 1 digit to that sequence, which is necessary since the initial sequence alone is <= p, so no value in range (p,q) has fewer 1 digits. We've chosen the "largest" solution because we always place our extra 1 at the first (biggest) possible place.
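Here is a Python sketch of this idea (my own illustration; it looks for the unique multiple of the coarsest power-of-two grid lying strictly between p and q, which agrees with all the examples in the question, assuming 0 < p < q):

import math

def partition(p: float, q: float) -> float:
    # Start with a grid so coarse that no multiple can fit, then refine it.
    e = math.ceil(math.log2(q)) + 1
    while True:
        step = 2.0 ** e
        k = math.floor(p / step) + 1  # smallest multiple of step above p
        if p < k * step < q:
            return k * step
        e -= 1                        # halve the grid spacing and retry

print(partition(3.0, 4.1), partition(4.2, 7.0), partition(0.3, 0.6))
# 4.0 6.0 0.5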
It sounds like you just want to convert the binary representation of the largest integer strictly less than your largest argument to the corresponding sum of powers of two.
