Count the number of set bits in an integer [duplicate] - algorithm

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Best algorithm to count the number of set bits in a 32-bit integer?
I came across this question in an interview. I want to find the number of set bits in a given number in an optimized way.
Example:
If the given number is 7 then output should be 3 (since binary of 7 is 111 we have three 1s).
If the given number is 8 then the output should be 1 (since the binary of 8 is 1000, we have one 1).
We need to find the number of ones in an optimized way. Any suggestions?

Warren (Hacker's Delight) has a whole chapter about counting bits, including a section on counting 1-bits.
The problem can be solved in a divide-and-conquer manner: counting the ones in 32 bits is reduced to adding the counts of two 16-bit halves, and so on. At each step we add the number of ones held in two adjacent n-bit fields together into one 2n-bit field.
Example:
10110010
01|10|00|01
0011|0001
00000100
The code for this looks something like this:
x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
x = (x & 0x0f0f0f0f) + ((x >> 4) & 0x0f0f0f0f);
x = (x & 0x00ff00ff) + ((x >> 8) & 0x00ff00ff);
x = (x & 0x0000ffff) + ((x >> 16) & 0x0000ffff);
We're using ((x >> 1) & 0x55555555) rather than (x & 0xAAAAAAAA) >> 1 only because we want to avoid generating two large constants in a register. If you look at it, you can see that the last AND is quite useless, and other ANDs can be omitted as well wherever there's no danger that a field sum will carry over into the adjacent field. So if we simplify the code, we end up with this:
int pop(unsigned x) {
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0f0f0f0f;
    x = x + (x >> 8);
    x = x + (x >> 16);
    return x & 0x0000003f;
}
That'd be 21 instructions, branch-free, on a usual RISC machine. Depending on how many bits are set on average, it may be faster or slower than the Kernighan loop (which clears one set bit per iteration), though that probably also depends on the CPU used.
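As a rough sanity check of the simplified routine, the sketch below places it next to a plain clear-the-lowest-set-bit loop so the two can be compared. The names pop_parallel and pop_loop are made up for this illustration, and the code assumes a 32-bit input.

```c
#include <stdint.h>

/* Divide-and-conquer count, same steps as pop() above (assumes 32-bit input). */
static int pop_parallel(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit field sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit field sums */
    x = (x + (x >> 4)) & 0x0f0f0f0fu;                 /* 8-bit field sums */
    x = x + (x >> 8);                                 /* 16-bit field sums */
    x = x + (x >> 16);                                /* total in the low bits */
    return (int)(x & 0x3fu);
}

/* Reference: clear the lowest set bit until none remain. */
static int pop_loop(uint32_t x) {
    int c = 0;
    while (x) {
        x &= x - 1;
        c++;
    }
    return c;
}
```

Both functions should agree on every input. The loop runs once per set bit, while the parallel version always does the same fixed amount of work.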

Conceptually this works:
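/* unsigned input, so a negative value cannot end the loop early */
int numones(unsigned int input) {
    int num = 0;
    do {
        num += input % 2;   /* add the lowest bit */
        input = input / 2;  /* drop the lowest bit */
    } while (input > 0);
    return num;
}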
A more optimized way (from the commenter's link above):
unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v
for (c = 0; v; c++)
{
v &= v - 1; // clear the least significant bit set
}

If you are using GCC, use the builtin function int __builtin_popcount (unsigned int x). On some machines, this will reduce to a single instruction.
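A minimal sketch of how the builtin might be wrapped with a portable fallback. The wrapper name count_bits is hypothetical; __builtin_popcount is the GCC builtin mentioned above (Clang provides it as well and also defines __GNUC__).

```c
/* count_bits is a hypothetical wrapper name used only for this example. */
static int count_bits(unsigned int x) {
#if defined(__GNUC__)
    return __builtin_popcount(x);   /* GCC/Clang builtin */
#else
    int c = 0;
    while (x) {                     /* fallback: clear the lowest set bit */
        x &= x - 1;
        c++;
    }
    return c;
#endif
}
```

On x86 with GCC, the builtin only becomes the single POPCNT instruction when the target allows it, for example with -mpopcnt or a suitable -march setting.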

Related

Finding the Nth number where a given binary digit is set to 0

I am looking for an algorithm which, given N and a target bit position, returns the Nth number that has the target bit set to 0.
For example, for the inputs of n={0,1,2,3} and target=1,
The output would be (in binary)
000,001,100,101
Just write the value N-1 (if enumeration starts from 1) in binary, and then insert a 0 in the required position (target).
For example:
for N=3 and target=1
N-1 = 10bin
inserting 0 in 1-th position gives
R = 100b = 4dec
with bit operations:
NN = N- 1
Mask = (1 << target) - 1 //like 00000111 for target=3
NotMask = ~ Mask //like 11111000 for target=3
R = (NN & Mask) | ((NN & NotMask) << 1)
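The expression (NN & Mask) selects the bits to the right of the target bit (zeroing the other bits).
The expression (NN & NotMask) << 1 selects the remaining bits (the target position and everything to its left), then shifts them left by one to free a place for the zero target bit.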
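The bit operations above can be sketched in C as follows. The function name nth_with_clear_bit is illustrative, and n here is 0-based (it plays the role of NN, so pass N-1 when counting from 1).

```c
/* Returns the n-th (0-based) non-negative integer whose `target` bit is 0.
   nth_with_clear_bit is an illustrative name, not from the original answer. */
static unsigned nth_with_clear_bit(unsigned n, unsigned target) {
    unsigned mask = (1u << target) - 1u;     /* bits below the target */
    return (n & mask) | ((n & ~mask) << 1);  /* open a zero gap at target */
}
```

For target = 1 and n = 0, 1, 2, 3 this yields 0, 1, 4, 5, which matches the binary sequence 000, 001, 100, 101 from the question.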
The target bit oscillates between being unset 2**target times and set 2**target times as we move up the sequence of natural numbers. So we can directly calculate the nth number where the target bit is unset.
JavaScript code:
function f(n, target){
let block = 1 << target
if (n < block)
return n
let index_in_block = n % block;
let num_set_blocks = (n - index_in_block) / block
return 2 * num_set_blocks * block + index_in_block
}
for (let i=0; i<10; i++)
console.log(i, f(i, 1))

mirror bits in char, limited operators +,<<,& no loops allowed, C language

Preparing for exam and got stuck at this question:
Allowed operators are <<,+,& no loops allowed and minimum temp variables.
Write a function in C, that gets 4-bit number (char) and returns mirrored (relative to center) bits.
Example: given b4,b3,b2,b1 return b1,b2,b3,b4
O_o thanks!
it might be not clear, but general language tools are allowed ('==',if,>,< etc..)
This is not possible given the constraints of only the operators <<, +, & and no other constructs besides return.
To move b3 from the 3rd position to the 2nd position, you will need a way to shift to the right which requires something like >> or /. Of the operators provided, none can be used with b3 to set the 2nd or 1st bit position.
If you can use if statements and the assignment operator =, it is possible. You can then write a messy solution such as
char flip(char c)
{
    char f;
    /* + binds tighter than <<, so each shift needs its own parentheses */
    f = ((c & 1) << 3) + ((c & 2) << 1);
    if (c & 4)
        f = f + 2;
    if (c & 8)
        f = f + 1;
    return f;
}
An uglier but shorter one-liner, if you may use the ?: operator, which works like if:
char flip(char c)
{
    return ((c & 1) << 3) + ((c & 2) << 1) + ((c & 4) ? 2 : 0) + ((c & 8) ? 1 : 0);
}

Reversing the bits in an integer x

Bit Reversal
I found this code for reversing the bits in an integer x (assume a 32bit value):
unsigned int
reverse(register unsigned int x)
{
x = (((x & 0xaaaaaaaa) >> 1) | ((x & 0x55555555) << 1));
x = (((x & 0xcccccccc) >> 2) | ((x & 0x33333333) << 2));
x = (((x & 0xf0f0f0f0) >> 4) | ((x & 0x0f0f0f0f) << 4));
x = (((x & 0xff00ff00) >> 8) | ((x & 0x00ff00ff) << 8));
return((x >> 16) | (x << 16));
}
I am unable to understand the logic/algorithm behind this code. What is the purpose of all the magic numbers?
Let's look at how it's done for an 8 bit value:
The first line in the function takes every second bit and moves it left or right:
12345678  -->  1-3-5-7-  -->  -1-3-5-7  -->  21436587
               -2-4-6-8       2-4-6-8-
The second line takes groups of two bits and moves them left or right:
21436587  -->  21--65--  -->  --21--65  -->  43218765
               --43--87       43--87--
The third line takes groups of four bits and moves them left or right:
43218765  -->  4321----  -->  ----4321  -->  87654321
               ----8765       8765----
Now the bits are reversed. For a 32-bit value you need two more steps that move bits in groups of 8 and 16.
Those are bit masks. Write them in binary form and remember how bitwise AND works. You can also take a piece of paper and write down every step's masks, input and result to sort it out.
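The three 8-bit steps walked through above can be written out as a small C function. This is only a sketch for a single byte; the name reverse8 is made up for the example, and the masks are the 8-bit counterparts of the 32-bit constants in the question.

```c
#include <stdint.h>

/* reverse8 is an illustrative name; it applies the three swap steps to one byte. */
static uint8_t reverse8(uint8_t x) {
    x = (uint8_t)(((x & 0xaa) >> 1) | ((x & 0x55) << 1)); /* swap adjacent bits */
    x = (uint8_t)(((x & 0xcc) >> 2) | ((x & 0x33) << 2)); /* swap bit pairs */
    x = (uint8_t)(((x & 0xf0) >> 4) | ((x & 0x0f) << 4)); /* swap nibbles */
    return x;
}
```

In binary, 0xaa is 10101010 and 0x55 is 01010101, so the first line picks out the odd and even bit positions separately before swapping them. The other masks do the same for groups of two and four bits.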

Number of 1s in the two's complement binary representations of integers in a range

This problem is from the 2011 Codesprint (http://csfall11.interviewstreet.com/):
One of the basics of Computer Science is knowing how numbers are represented in 2's complement. Imagine that you write down all numbers between A and B inclusive in 2's complement representation using 32 bits. How many 1's will you write down in all ?
Input:
The first line contains the number of test cases T (<1000). Each of the next T lines contains two integers A and B.
Output:
Output T lines, one corresponding to each test case.
Constraints:
-2^31 <= A <= B <= 2^31 - 1
Sample Input:
3
-2 0
-3 4
-1 4
Sample Output:
63
99
37
Explanation:
For the first case, -2 contains 31 1's followed by a 0, -1 contains 32 1's and 0 contains 0 1's. Thus the total is 63.
For the second case, the answer is 31 + 31 + 32 + 0 + 1 + 1 + 2 + 1 = 99
I realize that you can use the fact that the number of 1s in -X is equal to the number of 0s in the complement of (-X) = X-1 to speed up the search. The solution claims that there is a O(log X) recurrence relation for generating the answer but I do not understand it. The solution code can be viewed here: https://gist.github.com/1285119
I would appreciate it if someone could explain how this relation is derived!
Well, it's not that complicated...
The single-argument solve(int a) function is the key. It is short, so I will cut&paste it here:
long long solve(int a)
{
if(a == 0) return 0 ;
if(a % 2 == 0) return solve(a - 1) + __builtin_popcount(a) ;
return ((long long)a + 1) / 2 + 2 * solve(a / 2) ;
}
It only works for non-negative a, and it counts the number of 1 bits in all integers from 0 to a inclusive.
The function has three cases:
a == 0 -> returns 0. Obviously.
a even -> returns the number of 1 bits in a plus solve(a-1). Also pretty obvious.
The final case is the interesting one. So, how do we count the number of 1 bits from 0 to an odd number a?
Consider all of the integers between 0 and a, and split them into two groups: The evens, and the odds. For example, if a is 5, you have two groups (in binary):
000 (aka. 0)
010 (aka. 2)
100 (aka. 4)
and
001 (aka 1)
011 (aka 3)
101 (aka 5)
Observe that these two groups must have the same size (because a is odd and the range is inclusive). To count how many 1 bits there are in each group, first count all but the last bits, then count the last bits.
All but the last bits looks like this:
00
01
10
...and it looks like this for both groups. The number of 1 bits here is just solve(a/2). (In this example, it is the number of 1 bits from 0 to 2. Also, recall that integer division in C/C++ rounds down.)
The last bit is zero for every number in the first group and one for every number in the second group, so those last bits contribute (a+1)/2 one bits to the total.
So the third case of the recursion is (a+1)/2 + 2*solve(a/2), with appropriate casts to long long to handle the case where a is INT_MAX (and thus a+1 overflows).
This is an O(log N) solution. To generalize it to solve(a,b), you just compute solve(b) - solve(a - 1) (the range is inclusive, so a itself must stay in the count), plus the appropriate logic for worrying about negative numbers. That is what the two-argument solve(int a, int b) is doing.
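The linked two-argument solve is not reproduced here, so the following is only a sketch of how the range version might look, not the gist's actual code. It assumes 32-bit two's complement values passed as long long, and relies on the fact that a negative x has as many zero bits as -x - 1 has one bits. All function names are invented for the example.

```c
/* Number of set bits in one 32-bit value. */
static int popcount32(unsigned int x) {
    int c = 0;
    while (x) { x &= x - 1; c++; }
    return c;
}

/* One bits in all integers 0..a inclusive; returns 0 for a <= 0. */
static long long ones_upto(long long a) {
    if (a <= 0) return 0;
    if (a % 2 == 0) return ones_upto(a - 1) + popcount32((unsigned int)a);
    return (a + 1) / 2 + 2 * ones_upto(a / 2);
}

/* One bits in lo..hi inclusive, for 0 <= lo <= hi. */
static long long ones_nonneg(long long lo, long long hi) {
    return ones_upto(hi) - ones_upto(lo - 1);
}

/* One bits in the 32-bit two's complement forms of a..b inclusive. */
static long long ones_in_range(long long a, long long b) {
    if (a >= 0) return ones_nonneg(a, b);
    if (b < 0)  /* each x contributes 32 - ones(-x - 1) */
        return 32 * (b - a + 1) - ones_nonneg(-b - 1, -a - 1);
    return ones_in_range(a, -1) + ones_nonneg(0, b);
}
```

With the sample input from the question, ones_in_range(-2, 0), ones_in_range(-3, 4) and ones_in_range(-1, 4) give 63, 99 and 37.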
Cast the array into a series of integers. Then for each integer do:
/* unsigned argument: the multiply below would overflow a signed int */
int NumberOfSetBits(unsigned int i)
{
    i = i - ((i >> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}
Also this is portable, unlike __builtin_popcount
See here: How to count the number of set bits in a 32-bit integer?
When a is positive, a better explanation has already been posted.
If a is negative, then on a 32-bit system the numbers from a up to -1 contribute 32 one bits each, less the number of one bits in the range from 0 to -a - 1 (the zero bits of a negative x are exactly the one bits of -x - 1).
So, handling both cases:
long long solve(int a) {
    if (a >= 0) {
        if (a == 0) return 0;
        else if ((a % 2) == 0) return solve(a - 1) + noOfSetBits(a);
        else return (2 * solve(a / 2)) + ((long long)a + 1) / 2;
    } else {
        a++;
        return ((long long)(-a) + 1) * 32 - solve(-a);
    }
}
In the following code, the bitsum of x is defined as the count of 1 bits in the two's complement representation of the numbers between 0 and x (inclusive), where Integer.MIN_VALUE <= x <= Integer.MAX_VALUE.
For example:
bitsum(0) is 0
bitsum(1) is 1
bitsum(2) is 2
bitsum(3) is 4
..etc
10987654321098765432109876543210 i % 10 for 0 <= i <= 31
00000000000000000000000000000000 0
00000000000000000000000000000001 1
00000000000000000000000000000010 2
00000000000000000000000000000011 3
00000000000000000000000000000100 4
00000000000000000000000000000101 ...
00000000000000000000000000000110
00000000000000000000000000000111 (2^i)-1
00000000000000000000000000001000 2^i
00000000000000000000000000001001 (2^i)+1
00000000000000000000000000001010 ...
00000000000000000000000000001011 x, 011 = x & (2^i)-1 = 3
00000000000000000000000000001100
00000000000000000000000000001101
00000000000000000000000000001110
00000000000000000000000000001111
00000000000000000000000000010000
00000000000000000000000000010001
00000000000000000000000000010010 18
...
01111111111111111111111111111111 Integer.MAX_VALUE
The formula of the bitsum is:
bitsum(x) = bitsum((2^i)-1) + 1 + x - 2^i + bitsum(x & (2^i)-1 )
Note that x - 2^i = x & (2^i)-1
Negative numbers are handled slightly differently than positive numbers. In this case the number of zeros is subtracted from the total number of bits:
Integer.MIN_VALUE <= x < -1
Total number of bits: 32 * -x.
The number of zeros in a negative number x is equal to the number of ones in -x - 1.
public class TwosComplement {
//t[i] is the bitsum of (2^i)-1 for i in 0 to 31.
private static long[] t = new long[32];
static {
t[0] = 0;
t[1] = 1;
int p = 2;
for (int i = 2; i < 32; i++) {
t[i] = 2*t[i-1] + p;
p = p << 1;
}
}
//count the bits between x and y inclusive
public static long bitsum(int x, int y) {
if (y > x && x > 0) {
return bitsum(y) - bitsum(x-1);
}
else if (y >= 0 && x == 0) {
return bitsum(y);
}
else if (y == x) {
return Integer.bitCount(y);
}
else if (x < 0 && y == 0) {
return bitsum(x);
} else if (x < 0 && x < y && y < 0 ) {
return bitsum(x) - bitsum(y+1);
} else if (x < 0 && x < y && 0 < y) {
return bitsum(x) + bitsum(y);
}
throw new RuntimeException(x + " " + y);
}
//count the bits between 0 and x
public static long bitsum(int x) {
if (x == 0) return 0;
if (x < 0) {
if (x == -1) {
return 32;
} else {
long y = -(long)x;
return 32 * y - bitsum((int)(y - 1));
}
} else {
int n = x;
int sum = 0; //x & (2^i)-1
int j = 0;
int i = 1; //i = 2^j
int lsb = n & 1; //least significant bit
n = n >>> 1;
while (n != 0) {
sum += lsb * i;
lsb = n & 1;
n = n >>> 1;
i = i << 1;
j++;
}
long tot = t[j] + 1 + sum + bitsum(sum);
return tot;
}
}
}
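For non-negative x, the bitsum formula above can also be evaluated with a short loop instead of a table. The sketch below is an assumption-laden illustration rather than a port of the Java class: the name bitsum_upto is invented, and it uses the closed form bitsum(2^i - 1) = i * 2^(i-1), which holds because each of the i low bit positions is set in exactly half of the 2^i numbers.

```c
/* Sum of one bits over 0..x, following
   bitsum(x) = bitsum(2^i - 1) + 1 + (x - 2^i) + bitsum(x - 2^i),
   where 2^i is the highest set bit of x. bitsum_upto is an illustrative name. */
static unsigned long long bitsum_upto(unsigned int x) {
    unsigned long long total = 0;
    while (x != 0) {
        unsigned int i = 0;
        while ((x >> i) > 1u) i++;                 /* index of the highest set bit */
        unsigned long long p = 1ull << i;          /* 2^i */
        total += (unsigned long long)i * (p / 2);  /* bitsum(2^i - 1) = i * 2^(i-1) */
        total += 1 + (x - p);                      /* top bit of 2^i .. x */
        x -= (unsigned int)p;                      /* continue with the low bits */
    }
    return total;
}
```

For example, bitsum_upto(3) is 4 and bitsum_upto(7) is 12.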

Counting, reversed bit pattern

I am trying to find an algorithm to count from 0 to 2^n-1 but with the bit pattern reversed. I care about only the n LSBs of a word. As you may have guessed, I failed.
For n=3:
000 -> 0
100 -> 4
010 -> 2
110 -> 6
001 -> 1
101 -> 5
011 -> 3
111 -> 7
You get the idea.
Answers in pseudo-code are great. Code fragments in any language are welcome; answers without bit operations are preferred.
Please don't just post a fragment without even a short explanation or a pointer to a source.
Edit: I forgot to add, I already have a naive implementation which just bit-reverses a count variable. In a sense, this method is not really counting.
This is, I think, easiest with bit operations, even though you said this wasn't preferred.
Assuming 32-bit ints, here's a nifty chunk of code that can reverse all of the bits without doing it in 32 steps:
unsigned int i;
i = (i & 0x55555555) << 1 | (i & 0xaaaaaaaa) >> 1;
i = (i & 0x33333333) << 2 | (i & 0xcccccccc) >> 2;
i = (i & 0x0f0f0f0f) << 4 | (i & 0xf0f0f0f0) >> 4;
i = (i & 0x00ff00ff) << 8 | (i & 0xff00ff00) >> 8;
i = (i & 0x0000ffff) << 16 | (i & 0xffff0000) >> 16;
i >>= (32 - n);
Essentially this does an interleaved shuffle of all of the bits. Each time around half of the bits in the value are swapped with the other half.
The last line is necessary to realign the bits so that the reversed pattern sits in the n least significant bits.
Shorter versions of this are possible if "n" is <= 16, or <= 8
At each step, find the leftmost 0 digit of your value. Set it, and clear all digits to the left of it. If you don't find a 0 digit, then you've overflowed: return 0, or stop, or crash, or whatever you want.
This is what happens on a normal binary increment (by which I mean it's the effect, not how it's implemented in hardware), but we're doing it on the left instead of the right.
Whether you do this in bit ops, strings, or whatever, is up to you. If you do it in bitops, then a clz (or call to an equivalent hibit-style function) on ~value might be the most efficient way: __builtin_clz where available. But that's an implementation detail.
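A portable sketch of that step using a plain loop rather than clz. The function name rev_next is hypothetical, and the code assumes 1 <= n <= 31 and that v only uses its n low bits.

```c
/* Next value in bit-reversed counting order over n bits.
   rev_next is a hypothetical name; assumes 1 <= n <= 31. */
static unsigned rev_next(unsigned v, unsigned n) {
    for (unsigned bit = 1u << (n - 1); bit != 0; bit >>= 1) {
        if (!(v & bit))
            return v | bit;  /* leftmost 0 found: set it, left side already cleared */
        v &= ~bit;           /* this digit was 1: clear it and move right */
    }
    return 0;                /* all digits were 1: wrap around to 0 */
}
```

Starting from 0 with n = 3, repeated calls produce 4, 2, 6, 1, 5, 3, 7 and then 0 again.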
This solution was originally written with binary operations and converted to conventional math, as the requester specified.
It would make more sense in binary: at least the multiply by 2 and divide by 2 should be << 1 and >> 1 for speed; the additions and subtractions probably don't matter one way or the other.
If you pass in mask instead of nBits, use bit shifting instead of multiplying or dividing, and change the tail recursion to a loop, this will probably be the most performant solution you'll find. Every other call is nothing but a single add, and it would only be as slow as Alnitak's solution once every 4, maybe even 8, calls.
int incrementBizarre(int initial, int nBits)
    // in the 3 bit example, this should create 100 (2 to the power nBits-1)
    mask = 2^(nBits-1)
    // This should only return true if the first (least significant) bit is not set
    // if initial is 011 and mask is 100
    // 3 < 4, bit is not set
    if (initial < mask)
        // If it was not, just set it and bail.
        return initial + mask // 011 (3) + 100 (4) = 111 (7)
    else
        // it was set, are we at the most significant bit yet?
        // mask 100 (4) / 2 = 010 (2), 001/2 = 0 indicating overflow
        if (mask / 2 > 0)
            // No, we weren't, so unset it (initial-mask) and increment the next bit
            return incrementBizarre(initial - mask, nBits - 1)
        else
            // Whoops we were at the most significant bit. Error condition
            throw new OverflowedMyBitsException()
Wow, that turned out kinda cool. I didn't figure in the recursion until the last second there.
It feels wrong, like there are some operations that should not work, but they do because of the nature of what you are doing. It feels like you should get into trouble when you are operating on a bit and some bits to the left are non-zero, but it turns out you can't ever be operating on a bit unless all the bits to the left are zero, which is a very strange condition, but true.
Example of flow to get from 110 to 001 (backwards 3 to backwards 4):
mask 100 (4), initial 110 (6); initial < mask=false; initial-mask = 010 (2), now try on the next bit
mask 010 (2), initial 010 (2); initial < mask=false; initial-mask = 000 (0), now inc the next bit
mask 001 (1), initial 000 (0); initial < mask=true; initial + mask = 001--correct answer
Here's a solution from my answer to a different question that computes the next bit-reversed index without looping. It relies heavily on bit operations, though.
The key idea is that incrementing a number simply flips a sequence of least-significant bits, for example from nnnn0111 to nnnn1000. So in order to compute the next bit-reversed index, you have to flip a sequence of most-significant bits. If your target platform has a CTZ ("count trailing zeros") instruction, this can be done efficiently.
Example in C using GCC's __builtin_ctz:
#include <stdio.h>

void iter_reversed(unsigned bits) {
    unsigned n = 1 << bits;
    for (unsigned i = 0, j = 0; i < n; i++) {
        printf("%x\n", j);
        // Nothing follows the last index; stopping here also avoids a
        // negative shift count below.
        if (i + 1 == n) break;
        // Compute a mask of LSBs.
        unsigned mask = i ^ (i + 1);
        // Length of the mask.
        unsigned len = __builtin_ctz(~mask);
        // Align the mask to MSB of n.
        mask <<= bits - len;
        // XOR with mask.
        j ^= mask;
    }
}
Without a CTZ instruction, you can also use integer division:
void iter_reversed(unsigned bits) {
    unsigned n = 1 << bits;
    for (unsigned i = 0, j = 0; i < n; i++) {
        printf("%x\n", j);
        // Find least significant zero bit.
        unsigned bit = ~i & (i + 1);
        // Using division to bit-reverse a single bit.
        unsigned rev = (n / 2) / bit;
        // XOR with mask.
        j ^= (n - 1) & ~(rev - 1);
    }
}
The naive approach reverses the bits of each counter value in turn:
#include <stdio.h>

void reverse(int nMaxVal, int nBits)
{
    int thisVal, bit, out;
    // Calculate for each value from 0 to nMaxVal.
    for (thisVal=0; thisVal<=nMaxVal; ++thisVal)
    {
        out = 0;
        // Shift each bit from thisVal into out, in reverse order.
        for (bit=0; bit<nBits; ++bit)
            out = (out<<1) + ((thisVal>>bit) & 1);
        // Print inside the outer loop, once per value.
        printf("%d -> %d\n", thisVal, out);
    }
}
Maybe increment from 0 to N (the "usual" way) and do ReverseBitOrder() for each iteration. You can find several implementations here (I like the LUT one the best).
Should be really quick.
Here's an answer in Perl. You don't say what comes after the all ones pattern, so I just return zero. I took out the bitwise operations so that it should be easy to translate into another language.
sub reverse_increment {
    my($n, $bits) = @_;
    my $carry = 2**$bits;
    while($carry > 1) {
        $carry /= 2;
        if($carry > $n) {
            return $carry + $n;
        } else {
            $n -= $carry;
        }
    }
    return 0;
}
Here's a solution which doesn't actually try to do any addition, but exploits the on/off pattern of the sequence (the most significant bit alternates every time, the next most significant bit alternates every other time, etc.); adjust n as desired:
#include <iostream>

#define FLIP(x, i) do { (x) ^= (1 << (i)); } while(0)
int main() {
    int n = 3;
    int max = (1 << n);
    int x = 0;
    for(int i = 1; i <= max; ++i) {
        std::cout << x << std::endl;
        /* if n == 3, this next part is functionally equivalent to this:
         *
         * if((i % 1) == 0) FLIP(x, n - 1);
         * if((i % 2) == 0) FLIP(x, n - 2);
         * if((i % 4) == 0) FLIP(x, n - 3);
         */
        for(int j = 0; j < n; ++j) {
            if((i % (1 << j)) == 0) FLIP(x, n - (j + 1));
        }
    }
}
How about adding 1 to the most significant bit, then carrying to the next (less significant) bit if necessary? You could speed this up by operating on bytes:
1) Precompute a lookup table for counting in bit-reverse over all 256 byte values, 0 to 255 (00000000 -> 10000000, 10000000 -> 01000000, ..., 11111111 -> 00000000).
2) Set all bytes in your multi-byte number to zero.
3) Increment the most significant byte using the lookup table. If the byte is now 0, increment the next byte using the lookup table. If that byte is now 0, increment the next byte...
4) Go to step 3.
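The byte-wise steps above might be sketched like this. The table name next_rev, the helper names, and the convention that bytes[0] is the most significant byte are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* next_rev[b] holds the bit-reversed successor of byte b (hypothetical name). */
static uint8_t next_rev[256];

/* Precompute the table; call once before rev_increment. */
static void build_next_rev(void) {
    for (unsigned b = 0; b < 256; b++) {
        unsigned v = b, bit = 0x80;
        while (bit && (v & bit)) {   /* clear leading 1s, moving right */
            v &= ~bit;
            bit >>= 1;
        }
        next_rev[b] = (uint8_t)(v | bit);  /* bit is 0 when b was 0xFF */
    }
}

/* Advance a multi-byte counter; bytes[0] is the most significant byte. */
static void rev_increment(uint8_t *bytes, size_t len) {
    for (size_t k = 0; k < len; k++) {
        bytes[k] = next_rev[bytes[k]];
        if (bytes[k] != 0)
            break;                   /* no carry into the next byte */
    }
}
```

A byte that wraps from 11111111 to 00000000 is what signals the carry into the next, less significant byte.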
With n as your power of 2 and x the variable you want to step:
(defun inv-step (x n) ; the following is a function declaration
"returns a bit-inverse step of x, bounded by 2^n" ; documentation
(do ((i (expt 2 (- n 1)) ; loop, init of i
(/ i 2)) ; stepping of i
(s x)) ; init of s as x
((not (integerp i)) ; breaking condition
s) ; returned value if all bits are 1 (is 0 then)
(if (< s i) ; the loop's body: if s < i
(return-from inv-step (+ s i)) ; -> add i to s and return the result
(decf s i)))) ; else: reduce s by i
I commented it thoroughly as you may not be familiar with this syntax.
edit: here is the tail recursive version. It seems to be a little faster, provided that you have a compiler with tail call optimization.
(defun inv-step (x n)
(let ((i (expt 2 (- n 1))))
(cond ((= n 1)
(if (zerop x) 1 0)) ; this is really (logxor x 1)
((< x i)
(+ x i))
(t
(inv-step (- x i) (- n 1))))))
When you count from 0 to 2^n-1 with the bit pattern reversed, you pretty much cover the entire 0 to 2^n-1 sequence, so the sum is
Sum = 2^n * (2^n-1)/2
This is an O(1) operation. No need to do bit reversals.
Edit: Of course the original poster's question was about incrementing by a (reversed) one, which makes things simpler than adding two arbitrary values. So nwellnhof's answer already contains the algorithm.
Summing two bit-reversal values
Here is one solution in php:
function RevSum ($a,$b) {
// loop until our adder, $b, is zero
while ($b) {
// get carry (aka overflow) bit for every bit-location by AND-operation
// 0 + 0 --> 00 no overflow, carry is "0"
// 0 + 1 --> 01 no overflow, carry is "0"
// 1 + 0 --> 01 no overflow, carry is "0"
// 1 + 1 --> 10 overflow! carry is "1"
$c = $a & $b;
// do 1-bit addition for every bit location at once by XOR-operation
// 0 + 0 --> 00 result = 0
// 0 + 1 --> 01 result = 1
// 1 + 0 --> 01 result = 1
// 1 + 1 --> 10 result = 0 (ignored that "1", already taken care above)
$a ^= $b;
// now: shift carry bits to the next bit-locations to be added to $a in
// next iteration.
// PHP_INT_MAX here is used to ensure that the most-significant bit of the
// $b will be cleared after shifting. see link in the side note below.
$b = ($c >> 1) & PHP_INT_MAX;
}
return $a;
}
Side note: See this question about shifting negative values.
And as for test; start from zero and increment value by 8-bit reversed one (10000000):
$value = 0;
$add = 0x80; // 10000000 <-- "one" as bit reversed
for ($count = 20; $count--;) { // loop 20 times
printf("%08b\n", $value); // show value as 8-bit binary
$value = RevSum($value, $add); // do addition
}
... will output:
00000000
10000000
01000000
11000000
00100000
10100000
01100000
11100000
00010000
10010000
01010000
11010000
00110000
10110000
01110000
11110000
00001000
10001000
01001000
11001000
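The same reversed-carry addition can be sketched in C. The name rev_sum is illustrative; with unsigned operands the right shift always brings in zeros, so the PHP_INT_MAX mask from the PHP version is not needed.

```c
/* Adds two bit-reversed values: carries travel toward the least
   significant bit instead of the most significant one.
   rev_sum is an illustrative name. */
static unsigned rev_sum(unsigned a, unsigned b) {
    while (b) {
        unsigned carry = a & b;  /* positions where both bits are 1 */
        a ^= b;                  /* per-position sum without carry */
        b = carry >> 1;          /* hand each carry to the next lower bit */
    }
    return a;
}
```

Starting from 0 and repeatedly adding 0x80 (the 8-bit reversed "one") reproduces the start of the listing above: 0x80, 0x40, 0xC0, 0x20.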
Let's assume the number is 11101010 and our task is to find the next one.
1) Find the zero in the highest position and mark its position as index.
11101010 (4th position, so index = 4)
2) Set to zero all bits in positions higher than index.
00001010
3) Change the zero found in step 1) to '1'.
00011010
That's it. This is by far the fastest algorithm, since most CPUs have instructions to do this very efficiently. Here is a C++ implementation which increments a 64-bit number in the reversed pattern.
#include <intrin.h>
unsigned __int64 reversed_increment(unsigned __int64 number)
{
    unsigned long index;
    unsigned __int64 result; // must be 64 bits wide: unsigned long is only 32 bits with MSVC
    // finds the index of the highest '1' in the bit-inverted number (trick to find the highest '0')
    if (!_BitScanReverse64(&index, ~number))
        return 0; // no '0' bit left: the counter wraps around
    result = _bzhi_u64(number, index); // set to '0' all bits of number at position index and above
    result |= (unsigned __int64) 1 << index; // change the bit at position index to '1'
    return result;
}
This does not meet your requirement of avoiding bit operations; however, I fear there is no way to achieve something similar without them.
