A friend of mine was asked the following question at an interview: "Given a binary number, find the most significant bit." I immediately thought of the following solution but am not sure if it is correct.
Namely, divide the string into two parts and convert both parts into decimal. If the left subarray is 0 in decimal, then do a binary search in the right subarray, looking for the 1.
That leads to my other question: is the most significant bit the left-most 1 in a binary number? Can you show me a case where a 0 is the most significant bit, with an example and an explanation?
EDIT:
There seems to be a bit of confusion in the answers below, so I am updating the question to make it more precise. The interviewer said: "You have a website that you receive data from until the most significant bit indicates to stop transmitting data. How would you go about telling the program to stop the data transfer?"
You could also use bit shifting. Pseudo-code:
number = gets
bitpos = 0
while number != 0
bitpos++ # increment the bit position
number = number >> 1 # shift the whole thing to the right once
end
puts bitpos
If the number is zero, bitpos is zero.
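For the curious, a direct C rendering of that pseudo-code (a sketch assuming an unsigned 32-bit input; bitpos ends up as the number of bits, i.e. the MSB index plus one):

#include <stdio.h>

int main(void) {
    unsigned number = 42;       /* 101010 */
    int bitpos = 0;
    while (number != 0) {
        bitpos++;               /* increment the bit position */
        number >>= 1;           /* shift right once */
    }
    printf("%d\n", bitpos);     /* prints 6; the MSB is bit 5 */
    return 0;
}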
Finding the most significant bit in a word (i.e. calculating log2 with rounding down) using only C-like language instructions can be done with a rather well-known method based on De Bruijn sequences. For example, for a 32-bit value:
unsigned ulog2(uint32_t v)
{ /* Evaluates [log2 v] */
static const unsigned MUL_DE_BRUIJN_BIT[] =
{
0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
};
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
return MUL_DE_BRUIJN_BIT[(v * 0x07C4ACDDu) >> 27];
}
However, in practice simpler methods (like an unrolled binary search) usually work just as well or better.
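For reference, here is what such an unrolled binary search can look like for 32-bit values (a sketch; like the table version above, it returns 0 for v == 0):

#include <stdint.h>
#include <stdio.h>

/* unrolled binary search for floor(log2(v)) on 32-bit values;
   returns 0 for v == 0, matching the table version above */
static unsigned ulog2_bsearch(uint32_t v) {
    unsigned r = 0;
    if (v >= (uint32_t)1 << 16) { v >>= 16; r += 16; }
    if (v >= (uint32_t)1 << 8)  { v >>= 8;  r += 8;  }
    if (v >= (uint32_t)1 << 4)  { v >>= 4;  r += 4;  }
    if (v >= (uint32_t)1 << 2)  { v >>= 2;  r += 2;  }
    if (v >= (uint32_t)1 << 1)  { r += 1; }
    return r;
}

int main(void) {
    printf("%u\n", ulog2_bsearch(42)); /* prints 5 */
    return 0;
}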
The edited question is really quite different, though not very clear. Who is "you"? The website, or the programmer of the program that reads data from the website? If you're the website, you make the program stop by sending a value (a byte, probably?) with its most significant bit set; just OR or ADD that bit in. If you're the programmer, you test the most significant bit of the values you receive and stop reading when it becomes set. For unsigned bytes, you could do the test like
bool stop = received_byte >= 128;
or
bool stop = received_byte & 128;
For signed bytes, you could use
bool stop = received_byte < 0;
or
bool stop = received_byte & 128;
If you're not reading bytes but, say, 32-bit words, the 128 changes to (1 << 31).
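As a sketch, the receive loop might look like this; read_byte() is a hypothetical stand-in for whatever call actually delivers the data, replaying a canned buffer here so the example runs:

#include <stdio.h>

/* read_byte() is a hypothetical stand-in for the real transport;
   here it replays a canned buffer whose last byte has the most
   significant bit set */
static const unsigned char demo[] = { 0x12, 0x34, 0x80 | 0x56 };
static unsigned pos = 0;
static unsigned char read_byte(void) { return demo[pos++]; }

int main(void) {
    for (;;) {
        unsigned char b = read_byte();
        printf("payload: 0x%02x\n", (unsigned)(b & 0x7f)); /* low 7 bits */
        if (b & 0x80)   /* most significant bit set: stop */
            break;
    }
    return 0;
}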
This is one approach (not necessarily the most efficient, especially if your platform has a single-instruction find-first-one or count-leading-zeros or something similar), assuming two's complement signed integers and a 32-bit integer width.
int mask = (int)(1U << 31);  // signed integer with only bit 31 set
while (!(n & mask))          // n is the int we're testing against
    mask >>= 1;              // take advantage of sign fill on right shift of a negative number
mask = mask ^ (mask << 1);   // isolate the first bit that matched with n
If you want the bit position of that first one, simply add an integer counter that starts at 31 and gets decremented on each loop iteration.
One downside to this is that if n == 0, it's an infinite loop, so test for zero beforehand.
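Putting the pieces together, a runnable version might look like this (assuming a 32-bit int and an arithmetic, sign-filling right shift of negative values, which C leaves implementation-defined):

#include <stdio.h>

/* returns the MSB position, or -1 for zero; assumes 32-bit int
   and sign-filling right shift of negative values */
static int msb_pos(int n) {
    if (n == 0) return -1;        /* avoid the infinite loop */
    int mask = (int)(1U << 31);   /* only bit 31 set */
    int pos = 31;
    while (!(n & mask)) {
        mask >>= 1;               /* sign fill keeps the high bits set */
        pos--;
    }
    return pos;
}

int main(void) {
    printf("%d\n", msb_pos(42));  /* prints 5 */
    return 0;
}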
If you are interested in a C/C++ solution, you can have a look at the book "Matters Computational" by Jörg Arndt, where these functions are defined in the section "1.6.1 Isolating the highest one and finding its index":
static inline ulong highest_one_idx(ulong x)
// Return index of highest bit set.
// Return 0 if no bit is set.
{
#if defined BITS_USE_ASM
return asm_bsr(x);
#else // BITS_USE_ASM
#if BITS_PER_LONG == 64
#define MU0 0x5555555555555555UL // MU0 == ((-1UL)/3UL) == ...01010101_2
#define MU1 0x3333333333333333UL // MU1 == ((-1UL)/5UL) == ...00110011_2
#define MU2 0x0f0f0f0f0f0f0f0fUL // MU2 == ((-1UL)/17UL) == ...00001111_2
#define MU3 0x00ff00ff00ff00ffUL // MU3 == ((-1UL)/257UL) == (8 ones)
#define MU4 0x0000ffff0000ffffUL // MU4 == ((-1UL)/65537UL) == (16 ones)
#define MU5 0x00000000ffffffffUL // MU5 == ((-1UL)/4294967297UL) == (32 ones)
#else
#define MU0 0x55555555UL // MU0 == ((-1UL)/3UL) == ...01010101_2
#define MU1 0x33333333UL // MU1 == ((-1UL)/5UL) == ...00110011_2
#define MU2 0x0f0f0f0fUL // MU2 == ((-1UL)/17UL) == ...00001111_2
#define MU3 0x00ff00ffUL // MU3 == ((-1UL)/257UL) == (8 ones)
#define MU4 0x0000ffffUL // MU4 == ((-1UL)/65537UL) == (16 ones)
#endif
ulong r = (ulong)ld_neq(x, x & MU0)
+ ((ulong)ld_neq(x, x & MU1) << 1)
+ ((ulong)ld_neq(x, x & MU2) << 2)
+ ((ulong)ld_neq(x, x & MU3) << 3)
+ ((ulong)ld_neq(x, x & MU4) << 4);
#if BITS_PER_LONG > 32
r += ((ulong)ld_neq(x, x & MU5) << 5);
#endif
return r;
#undef MU0
#undef MU1
#undef MU2
#undef MU3
#undef MU4
#undef MU5
#endif
}
where asm_bsr is implemented depending on your processor architecture:
// i386
static inline ulong asm_bsr(ulong x)
// Bit Scan Reverse: return index of highest one.
{
asm ("bsrl %0, %0" : "=r" (x) : "0" (x));
return x;
}
or
// AMD64
static inline ulong asm_bsr(ulong x)
// Bit Scan Reverse
{
asm ("bsrq %0, %0" : "=r" (x) : "0" (x));
return x;
}
Go here for the code: http://jjj.de/bitwizardry/bitwizardrypage.html
EDIT:
This is the definition in the source for the function ld_neq:
static inline bool ld_neq(ulong x, ulong y)
// Return whether floor(log2(x)) != floor(log2(y))
{ return ( (x^y) > (x&y) ); }
The trick: if x and y share the same highest set bit k, then x & y keeps bit k while x ^ y clears it, so (x^y) > (x&y) holds exactly when the highest set bits differ.
I don't know if this is too hacky :)
I would convert the binary number to decimal and then take the base-2 logarithm of the number directly (truncating the float result to int).
The answer is then the (returned number + 1)-th bit, counting from the right.
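In C, that idea looks roughly like this (a sketch only: n must be non-zero, and double rounding can be off by one for huge values near powers of two):

#include <math.h>
#include <stdio.h>

/* float-based sketch: (int)log2(n) is the MSB index, i.e. the
   "(returned number + 1)"-th bit counting from the right;
   n must be non-zero, and rounding can misfire near large
   powers of two, so this is illustrative rather than robust */
static int msb_index(unsigned n) {
    return (int)log2((double)n);
}

int main(void) {
    printf("%d\n", msb_index(42)); /* prints 5: the 6th bit from the right */
    return 0;
}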
To answer your other question: as far as I know, it's the left-most 1.
I think this is kind of a trick question. The most significant bit is always going to be a 1 :-). If interviewers like lateral thinking, that answer should be a winner!
Related
I am building a C library for big integer numbers. Basically, I'm seeking a fast algorithm to convert any integer from its binary representation to a decimal one.
I saw JDK's BigInteger.toString() implementation, but it looks quite heavy to me, as it was made to convert the number to any radix (it uses a division for each digit, which should be pretty slow while dealing with thousands of digits).
So if you have any documentation / knowledge to share about it, I would be glad to read it.
EDIT: more precision about my question:
Let P be a memory address
Let N be the number of bytes allocated (and set) at P
How do I convert the integer represented by the N bytes at address P (let's say in little endian, to make things simpler) to a C string?
Example:
N = 1
P = some random memory address storing '00101010'
out string = "42"
Thanks for your answers.
The reason the BigInteger.toString method looks heavy is that it does the conversion in chunks.
A trivial algorithm would take the last digit and then divide the whole big integer by the radix until there is nothing left.
One problem with this is that a big integer division is quite expensive, so the number is subdivided into chunks that can be processed with regular integer division (as opposed to BigInt division):
static String toDecimal(BigInteger bigInt) {
    // chunk of nine decimal digits, small enough for a regular int
    BigInteger chunker = BigInteger.valueOf(1000000000);
    StringBuilder sb = new StringBuilder();
    do {
        int current = bigInt.mod(chunker).intValue();
        bigInt = bigInt.divide(chunker);
        for (int i = 0; i < 9; i++) {
            sb.append((char) ('0' + current % 10));
            current /= 10;
            if (current == 0 && bigInt.signum() == 0) {
                break; // stop once the final chunk runs out of digits
            }
        }
    } while (bigInt.signum() != 0);
    return sb.reverse().toString();
}
That said, for a fixed radix, you are probably even better off porting the "double dabble" algorithm to your needs, as suggested in the comments: https://en.wikipedia.org/wiki/Double_dabble
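For illustration, a minimal double-dabble sketch in C for a single 32-bit word (an assumption of this sketch: digit[9] is the most significant BCD digit, so ten digits cover 2^32 - 1; a real port would run the same add-3-and-shift pass across the whole big integer):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* double dabble: for each input bit, MSB first, adjust any BCD
   digit >= 5 by adding 3, then shift the whole register left one
   bit, pulling in the input bit */
static void to_bcd(uint32_t v, uint8_t digit[10]) {
    memset(digit, 0, 10);
    for (int i = 31; i >= 0; i--) {
        for (int d = 0; d < 10; d++)       /* add 3 where digit >= 5 */
            if (digit[d] >= 5)
                digit[d] += 3;
        for (int d = 9; d > 0; d--)        /* shift the BCD register */
            digit[d] = ((digit[d] << 1) & 0xF) | (digit[d - 1] >> 3);
        digit[0] = ((digit[0] << 1) & 0xF) | ((v >> i) & 1);
    }
}

int main(void) {
    uint8_t digit[10];
    to_bcd(4294967295u, digit);
    for (int d = 9; d >= 0; d--)
        putchar('0' + digit[d]);           /* prints 4294967295 */
    putchar('\n');
    return 0;
}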
I recently got the challenge to print a big Mersenne prime: 2**82589933-1. On my CPU that takes ~40 minutes with apcalc and ~120 minutes with Python 2.7. It's a number with a bit over 24 million digits.
Here is my own little C code for the conversion:
// print 2**82589933-1
#include <stdio.h>
#include <math.h>
#include <stdint.h>
#include <inttypes.h>
#include <string.h>
// enum so these are integer constant expressions: const variables
// don't qualify in C, and file-scope arrays can't be variably sized
enum {
    exponent = 82589933,
    //exponent = 100, // outputs 1267650600228229401496703205375
    blocks = (exponent + 31) / 32,
    // digit count ~ exponent * log10(2) plus slack; 30103/100000
    // approximates log10(2) so the expression stays constant
    digits = (int)(exponent * 30103ULL / 100000) + 10
};
uint32_t num[2][blocks];
char out[digits + 1];
// blocks : number of uint32_t in num1 and num2
// num1 : number to convert
// num2 : free space
// out : end of output buffer
void conv(uint32_t blocks, uint32_t *num1, uint32_t *num2, char *out) {
if (blocks == 0) return;
const uint32_t div = 1000000000;
uint64_t t = 0;
for (uint32_t i = 0; i < blocks; ++i) {
t = (t << 32) + num1[i];
num2[i] = t / div;
t = t % div;
}
for (int i = 0; i < 9; ++i) {
*out-- = '0' + (t % 10);
t /= 10;
}
if (num2[0] == 0) {
--blocks;
num2++;
}
conv(blocks, num2, num1, out);
}
int main() {
// prepare number
uint32_t t = exponent % 32;
num[0][0] = (1LLU << t) - 1;
memset(&num[0][1], 0xFF, (blocks - 1) * 4);
// prepare output
memset(out, '0', digits);
out[digits] = 0;
// convert to decimal
conv(blocks, num[0], num[1], &out[digits - 1]);
// output number
char *res = out;
while(*res == '0') ++res;
printf("%s\n", res);
return 0;
}
The conversion is destructive and tail-recursive. In each step it divides num1 by 1_000_000_000 and stores the quotient in num2. The remainder, nine decimal digits, is written to out. Then it calls itself with num1 and num2 swapped, often shortened by one word (blocks is decremented). out is filled from back to front; you have to allocate it large enough and then strip leading zeroes.
Python seems to be using a similar mechanism for converting big integers to decimal.
Want to do better?
For large numbers like in my case, each division by 1_000_000_000 takes rather long. At a certain size, a divide&conquer algorithm does better. In my case the first division would be by 10 ^ 16777216, splitting the number into quotient and remainder. Then convert each part separately. Each part is still big, so split again at 10 ^ 8388608. Keep splitting recursively until the numbers are small enough, say maybe 1024 digits each, and convert those with the simple algorithm above. The right definition of "small enough" would have to be tested; 1024 is just a guess.
While the long division of two big integer numbers is expensive, much more so than a division by 1_000_000_000, the time spent there is then saved because each separate chunk requires far fewer divisions by 1_000_000_000 to convert to decimal.
And if you have split the problem into separate and independent chunks it's only a tiny step away from spreading the chunks out among multiple cores. That would really speed up the conversion another step. It looks like apcalc uses divide&conquer but not multi-threading.
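To make the split concrete, here's a toy sketch using unsigned __int128 (a GCC/Clang extension, so treat it as illustrative): one division by 10^19 splits a 128-bit number into halves that ordinary 64-bit printing can finish independently, which is the same idea the recursion above applies at a much larger scale.

#include <stdio.h>
#include <inttypes.h>

int main(void) {
    unsigned __int128 n =
        ((unsigned __int128)0xFEDCBA9876543210ULL << 64)
        | 0x0123456789ABCDEFULL;
    const unsigned __int128 split = 10000000000000000000ULL; /* 10^19 */
    uint64_t lo  = (uint64_t)(n % split);        /* low 19 digits      */
    unsigned __int128 hi = n / split;
    uint64_t mid = (uint64_t)(hi % split);       /* next 19 digits     */
    uint64_t top = (uint64_t)(hi / split);       /* remaining digits   */
    /* each chunk now prints independently; pad the lower chunks */
    printf("%" PRIu64 "%019" PRIu64 "%019" PRIu64 "\n", top, mid, lo);
    return 0;
}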
It might help to start out with a real-world example. Say I'm writing a web app that's backed by MongoDB, so my records have a long hex primary key, making my URL to view a record look like /widget/55c460d8e2d6e59da89d08d0. That seems excessively long; URLs can use many more characters than that. While there are just under 8 x 10^28 (16^24) possible values in a 24-digit hex number, if you limit yourself just to the characters matched by the regex class [a-zA-Z0-9] (a YouTube video id uses slightly more), 62 characters, you can get past 8 x 10^28 in only 17 characters.
I want an algorithm that will convert any string that is limited to a specific alphabet of characters to any other string with another alphabet of characters, where the value of each character c could be thought of as alphabet.indexOf(c).
Something of the form:
convert(value, sourceAlphabet, destinationAlphabet)
Assumptions
all parameters are strings
every character in value exists in sourceAlphabet
every character in sourceAlphabet and destinationAlphabet is unique
Simplest example
var hex = "0123456789abcdef";
var base10 = "0123456789";
var result = convert("12245589", base10, hex); // result is "bada55";
But I also want it to work to convert War & Peace from the Russian alphabet plus some punctuation to the entire Unicode charset and back again, losslessly.
Is this possible?
The only way I was ever taught to do base conversions in Comp Sci 101 was to first convert to a base ten integer by summing digit * base^position and then doing the reverse to convert to the target base. Such a method is insufficient for the conversion of very long strings, because the integers get too big.
It certainly feels intuitively that a base conversion could be done in place, as you step through the string (probably backwards to maintain standard significant digit order), keeping track of a remainder somehow, but I'm not smart enough to work out how.
That's where you come in, StackOverflow. Are you smart enough?
Perhaps this is a solved problem, done on paper by some 18th century mathematician, implemented in LISP on punch cards in 1970 and the first homework assignment in Cryptography 101, but my searches have borne no fruit.
I'd prefer a solution in javascript with a functional style, but any language or style will do, as long as you're not cheating with some big integer library. Bonus points for efficiency, of course.
Please refrain from criticizing the original example. The general nerd cred of solving the problem is more important than any application of the solution.
Here is a solution in C that is very fast, using bit shift operations. It assumes that you know what the length of the decoded string should be. The strings are vectors of integers in the range 0..maximum for each alphabet. It is up to the user to convert to and from strings with restricted ranges of characters. As for the "in-place" in the question title, the source and destination vectors can overlap, but only if the source alphabet is not larger than the destination alphabet.
/*
recode version 1.0, 22 August 2015
Copyright (C) 2015 Mark Adler
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
Mark Adler
madler@alumni.caltech.edu
*/
/* Recode a vector from one alphabet to another using intermediate
variable-length bit codes. */
/* The approach is to use a Huffman code over equiprobable alphabets in two
directions. First to encode the source alphabet to a string of bits, and
second to encode the string of bits to the destination alphabet. This will
be reasonably close to the efficiency of base-encoding with arbitrary
precision arithmetic. */
#include <stddef.h> // size_t
#include <limits.h> // UINT_MAX, ULLONG_MAX
#if UINT_MAX == ULLONG_MAX
# error recode() assumes that long long has more bits than int
#endif
/* Take a list of integers source[0..slen-1], all in the range 0..smax, and
code them into dest[0..*dlen-1], where each value is in the range 0..dmax.
*dlen returns the length of the result, which will not exceed the value of
*dlen when called. If the original *dlen is not large enough to hold the
full result, then recode() will return non-zero to indicate failure.
Otherwise recode() will return 0. recode() will also return non-zero if
either of the smax or dmax parameters are less than one. The non-zero
return codes are 1 if *dlen is not long enough, 2 for invalid parameters,
and 3 if any of the elements of source are greater than smax.
Using this same operation on the result with smax and dmax reversed reverses
the operation, restoring the original vector. However there may be more
symbols returned than the original, so the number of symbols expected needs
to be known for decoding. (An end symbol could be appended to the source
alphabet to include the length in the coding, but then encoding and decoding
would no longer be symmetric, and the coding efficiency would be reduced.
This is left as an exercise for the reader if that is desired.) */
int recode(unsigned *dest, size_t *dlen, unsigned dmax,
const unsigned *source, size_t slen, unsigned smax)
{
// compute sbits and scut, with which we will recode the source with
// sbits-1 bits for symbols < scut, otherwise with sbits bits (adding scut)
if (smax < 1)
return 2;
unsigned sbits = 0;
unsigned scut = 1; // 2**sbits
while (scut && scut <= smax) {
scut <<= 1;
sbits++;
}
scut -= smax + 1;
// same thing for dbits and dcut
if (dmax < 1)
return 2;
unsigned dbits = 0;
unsigned dcut = 1; // 2**dbits
while (dcut && dcut <= dmax) {
dcut <<= 1;
dbits++;
}
dcut -= dmax + 1;
// recode a base smax+1 vector to a base dmax+1 vector using an
// intermediate bit vector (a sliding window of that bit vector is kept in
// a bit buffer)
unsigned long long buf = 0; // bit buffer
unsigned have = 0; // number of bits in bit buffer
size_t i = 0, n = 0; // source and dest indices
unsigned sym; // symbol being encoded
for (;;) {
// encode enough of source into bits to encode that to dest
while (have < dbits && i < slen) {
sym = source[i++];
if (sym > smax) {
*dlen = n;
return 3;
}
if (sym < scut) {
buf = (buf << (sbits - 1)) + sym;
have += sbits - 1;
}
else {
buf = (buf << sbits) + sym + scut;
have += sbits;
}
}
// if not enough bits to assure one symbol, then break out to a special
// case for coding the final symbol
if (have < dbits)
break;
// encode one symbol to dest
if (n == *dlen)
return 1;
sym = buf >> (have - dbits + 1);
if (sym < dcut) {
dest[n++] = sym;
have -= dbits - 1;
}
else {
sym = buf >> (have - dbits);
dest[n++] = sym - dcut;
have -= dbits;
}
buf &= ((unsigned long long)1 << have) - 1;
}
// if any bits are left in the bit buffer, encode one last symbol to dest
if (have) {
if (n == *dlen)
return 1;
sym = buf;
sym <<= dbits - 1 - have;
if (sym >= dcut)
sym = (sym << 1) - dcut;
dest[n++] = sym;
}
// return recoded vector
*dlen = n;
return 0;
}
/* Test recode(). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <assert.h>
// Return a random vector of len unsigned values in the range 0..max.
static void ranvec(unsigned *vec, size_t len, unsigned max) {
unsigned bits = 0;
unsigned long long mask = 1;
while (mask <= max) {
mask <<= 1;
bits++;
}
mask--;
unsigned long long ran = 0;
unsigned have = 0;
size_t n = 0;
while (n < len) {
while (have < bits) {
ran = (ran << 31) + random();
have += 31;
}
if ((ran & mask) <= max)
vec[n++] = ran & mask;
ran >>= bits;
have -= bits;
}
}
// Get a valid number from str and assign it to var
#define NUM(var, str) \
do { \
char *end; \
unsigned long val = strtoul(str, &end, 0); \
var = val; \
if (*end || var != val) { \
fprintf(stderr, \
"invalid or out of range numeric argument: %s\n", str); \
return 1; \
} \
} while (0)
/* "bet n m len count" generates count test vectors of length len, where each
entry is in the range 0..n. Each vector is recoded to another vector using
only symbols in the range 0..m. That vector is recoded back to a vector
using only symbols in 0..n, and that result is compared with the original
random vector. Report on the average ratio of input and output symbols, as
compared to the optimal ratio for arbitrary precision base encoding. */
int main(int argc, char **argv)
{
// get sizes of alphabets and length of test vector, compute maximum sizes
// of recoded vectors
unsigned smax, dmax, runs;
size_t slen, dsize, bsize;
if (argc != 5) { fputs("need four arguments\n", stderr); return 1; }
NUM(smax, argv[1]);
NUM(dmax, argv[2]);
NUM(slen, argv[3]);
NUM(runs, argv[4]);
dsize = ceil(slen * ceil(log2(smax + 1.)) / floor(log2(dmax + 1.)));
bsize = ceil(dsize * ceil(log2(dmax + 1.)) / floor(log2(smax + 1.)));
// generate random test vectors, encode, decode, and compare
srandomdev();
unsigned source[slen], dest[dsize], back[bsize];
unsigned mis = 0, i;
unsigned long long dtot = 0;
int ret;
for (i = 0; i < runs; i++) {
ranvec(source, slen, smax);
size_t dlen = dsize;
ret = recode(dest, &dlen, dmax, source, slen, smax);
if (ret) {
fprintf(stderr, "encode error %d\n", ret);
break;
}
dtot += dlen;
size_t blen = bsize;
ret = recode(back, &blen, smax, dest, dlen, dmax);
if (ret) {
fprintf(stderr, "decode error %d\n", ret);
break;
}
if (blen < slen || memcmp(source, back, slen * sizeof(*source))) // blen > slen is ok
mis++;
}
if (mis)
fprintf(stderr, "%u/%u mismatches!\n", mis, i);
if (ret == 0)
printf("mean dest/source symbols = %.4f (optimal = %.4f)\n",
dtot / (i * (double)slen), log(smax + 1.) / log(dmax + 1.));
return 0;
}
As has been pointed out in other StackOverflow answers, try not to think of summing digit * base^position as converting it to base ten; rather, think of it as directing the computer to generate a representation of the quantity represented by the number in its own terms (for most computers probably closer to our concept of base 2). Once the computer has its own representation of the quantity, we can direct it to output the number in any way we like.
By rejecting "big integer" implementations and asking for letter-by-letter conversion you are at the same time arguing that the numerical/alphabetical representation of quantity is not actually what it is, namely that each position represents a quantity of digit * base^position. If the nine-millionth character of War and Peace does represent what you are asking to convert it from, then the computer at some point will need to generate a representation for Д * 33^9000000.
I don't think any solution can work generally, because if n^e != m for every integer e, then for any fixed MAX_INT there's no way to calculate the value of the target base at a certain place p once n^p > MAX_INT.
You can get away with it in the case where n^e == m for some e, because the problem is recursively doable: the first e source digits can be summed and converted into the first digit of the base-m result, then chopped off, and the process repeated.
If you don't have this useful property, then eventually you're going to have to take some part of the original number and perform a modulus by n^p, and n^p is going to be greater than MAX_INT, which means it's impossible.
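To see why the n^e == m case is easy, consider the classic instance, binary to hex (16 = 2^4): each output digit depends only on a fixed group of four input digits, so no big integers are needed. A sketch:

#include <stdio.h>
#include <string.h>

/* convert a binary string to hex four bits at a time; possible
   because 16 is a power of 2, so each output digit depends on a
   fixed-size group of input digits only */
int main(void) {
    const char *bin = "101110101101101001010101"; /* multiple of 4 bits */
    size_t n = strlen(bin);
    for (size_t i = 0; i < n; i += 4) {
        int v = 0;
        for (size_t j = 0; j < 4; j++)
            v = v * 2 + (bin[i + j] - '0');
        printf("%x", v);
    }
    putchar('\n'); /* prints bada55 */
    return 0;
}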
I found this problem:
Consider sequences of 36 bits. Each such sequence has 32 5-bit
sequences consisting of adjacent bits. For example, the sequence
1101011... contains the 5-bit sequences 11010, 10101, 01011, .... Write a
program that prints all 36-bit sequences with the two properties:
1. The first 5 bits of the sequence are 00000.
2. No two 5-bit subsequences are the same.
So I generalized it to finding all n-bit sequences whose unique k-bit subsequences satisfy the above requirements. However, the only approach I can think of is a brute-force search: generate all permutations of n-bit sequences with the first k bits zero, then for each sequence check whether all k-bit subsequences are unique. That is apparently not a very efficient approach. I am wondering, is there a better way to solve the problem?
Thanks.
The simplest approach seems to be backtracking. You can keep track of which 5-bit sequences you've seen with a flat array. At each bit, try appending a 0 (counter = (counter & 0x0f) << 1) and check whether you've seen that window before; then set counter = counter | 1 and try that path as well.
There are probably more efficient algorithms that can prune the search space faster. This seems related to https://en.wikipedia.org/wiki/De_Bruijn_sequence. I am not certain, but I believe that it is actually equivalent; that is, the last five digits of the sequence will have to be 10000, making it cyclic.
EDIT: here's some C code. Less efficient than it could be in terms of space, because of the recursion, but simple. The worst part is the mask management. It appears I was correct about De Bruijn sequences; this finds all 2048 of them, which matches the known count of 2^(2^4) / 2^5 = 2^11 = 2048 binary De Bruijn sequences of order 5.
#include <stdio.h>

char *binprint(unsigned val) {
    static char res[33];
    int i;
    res[32] = 0;
    for (i = 0; i < 32; i++) {
        res[31 - i] = (val & 1) + '0';
        val = val >> 1;
    }
    return res;
}

void checkPoint(unsigned mask, unsigned counter) {
    // Get the appropriate bit in the mask for the current 5-bit suffix
    unsigned idxmask = 1u << (counter & 0x1f);
    // Abort if we've seen this suffix before
    if (mask & idxmask) {
        return;
    }
    // Update the mask
    mask = mask | idxmask;
    // We're done if we've hit all 32
    if (mask == 0xffffffffu) {
        printf("%10u 0000%s\n", counter, binprint(counter));
        return;
    }
    checkPoint(mask, counter << 1);
    checkPoint(mask, (counter << 1) | 1);
}

int main(void) {
    checkPoint(0, 0);
    return 0;
}
I remember seeing this exact problem in a programming interview questions book of mine. Here is their solution:
hope it helps. cheers.
The following is the implementation of BitSet in the solution of question 10-4 in the Cracking the Coding Interview book. Why is it allocating an array of size/32 and not (size/32 + 1)? Am I missing something here, or is this a bug?
If I pass 33 to the constructor of BitSet, then it will allocate only one int, and if I try to set or get bit 32, I will get an access violation!
package Question10_4;
class BitSet {
int[] bitset;
public BitSet(int size) {
bitset = new int[size >> 5]; // divide by 32
}
boolean get(int pos) {
int wordNumber = (pos >> 5); // divide by 32
int bitNumber = (pos & 0x1F); // mod 32
return (bitset[wordNumber] & (1 << bitNumber)) != 0;
}
void set(int pos) {
int wordNumber = (pos >> 5); // divide by 32
int bitNumber = (pos & 0x1F); // mod 32
bitset[wordNumber] |= 1 << bitNumber;
}
}
From what I can gather from reading the solution you mention (on page 205), and the little I understand about computer programming, it seems to me that this is a special implementation of a bit set, meant to be constructed with an argument of 32,000 (see the checkDuplicates function; the question is about examining an array with numbers from 1 to N, where N is at most 32,000, with only 4KB of memory).
This way, an array of 1000 elements is created, each one used for 32 bits of the bit set. You can see in the BitSet class that to get a bit's position, we (floor) divide by 32 to get the array index, and then mod 32 to get the specific bit position.
Yes, the answer in the book is incorrect.
The correct answer rounds up:
bitset = new int[(size + 31) >> 5]; // divide by 32, rounding up
For size = 33 this allocates (33 + 31) >> 5 = 2 ints, so bit 32 is addressable.
I forgot a bit hack to generate all integers with a given number of 1s. Does anybody remember it (and perhaps explain it as well)?
From Bit Twiddling Hacks
Update: test program live on Coliru.
#include <cstdint>
#include <iostream>
#include <bitset>

using I = std::uint8_t;

auto dump(I v) { return std::bitset<sizeof(I) * __CHAR_BIT__>(v); }

I bit_twiddle_permute(I v) {
    I t = v | (v - 1); // t gets v's least significant 0 bits set to 1
    // Next set to 1 the most significant bit to change,
    // set to 0 the least significant ones, and add the necessary 1 bits.
    I w = (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(v) + 1));
    return w;
}

int main() {
    I p = 0b001001;
    std::cout << dump(p) << "\n";
    for (I n = bit_twiddle_permute(p); n > p; p = n, n = bit_twiddle_permute(p)) {
        std::cout << dump(n) << "\n";
    }
}
Prints
00001001
00001010
00001100
00010001
00010010
00010100
00011000
00100001
00100010
00100100
00101000
00110000
01000001
01000010
01000100
01001000
01010000
01100000
10000001
10000010
10000100
10001000
10010000
10100000
11000000
Compute the lexicographically next bit permutation
Suppose we have a pattern of N bits set to 1 in an integer and we want the next permutation of N 1 bits in a lexicographical sense. For example, if N is 3 and the bit pattern is 00010011, the next patterns would be 00010101, 00010110, 00011001, 00011010, 00011100, 00100011, and so forth. The following is a fast way to compute the next permutation.
unsigned int v; // current permutation of bits
unsigned int w; // next permutation of bits
unsigned int t = v | (v - 1); // t gets v's least significant 0 bits set to 1
// Next set to 1 the most significant bit to change,
// set to 0 the least significant ones, and add the necessary 1 bits.
w = (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(v) + 1));
The __builtin_ctz(v) GNU C compiler intrinsic for x86 CPUs returns the number of trailing zeros. If you are using Microsoft compilers for x86, the intrinsic is _BitScanForward. These both emit a bsf instruction, but equivalents may be available for other architectures. If not, then consider using one of the methods for counting the consecutive zero bits mentioned earlier.
Here is another version that tends to be slower because of its division operator, but it does not require counting the trailing zeros.
unsigned int t = (v | (v - 1)) + 1;
w = t | ((((t & -t) / (v & -v)) >> 1) - 1);
Thanks to Dario Sneidermanis of Argentina, who provided this on November 28, 2009.
For bit hacks I like to refer to this page: Bit Twiddling Hacks.
Regarding your specific question, read the part entitled Compute the lexicographically next bit permutation, quoted in full in the answer above.
To add onto @sehe's answer included below (originally from Dario Sneidermanis, also at http://graphics.stanford.edu/~seander/bithacks.html#NextBitPermutation).
#include <cstdint>
#include <iostream>
#include <bitset>
using I = std::uint8_t;
auto dump(I v) { return std::bitset<sizeof(I) * __CHAR_BIT__>(v); }
I bit_twiddle_permute(I v) {
I t = v | (v - 1); // t gets v's least significant 0 bits set to 1
// Next set to 1 the most significant bit to change,
// set to 0 the least significant ones, and add the necessary 1 bits.
I w = (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(v) + 1));
return w;
}
int main() {
I p = 0b001001;
std::cout << dump(p) << "\n";
for (I n = bit_twiddle_permute(p); n>p; p = n, n = bit_twiddle_permute(p))
{
std::cout << dump(n) << "\n";
}
}
There are boundary issues with bit_twiddle_permute(I v). Whenever v is the last permutation, t is all 1's (e.g. 2^8 - 1), (~t & -~t) is 0, and w is the first permutation with one fewer 1s than v, except when v = 00000000, in which case w = 01111111. In particular, if you set p to 0, the loop in main will produce all permutations with seven 1s, and the following slight modification of the for loop will cycle through all permutations with 0, 7, 6, ..., 1 bits set -
for (I n = bit_twiddle_permute(p); n>p; n = bit_twiddle_permute(n))
If this is the intention, it is perhaps worth a comment. If not it is trivial to fix, e.g.
if (t == (I)(-1)) { return v >> __builtin_ctz(v); }
So with an additional small simplification
I bit_twiddle_permute2(I v) {
I t = (v | (v - 1)) + 1;
if (t == 0) { return v >> __builtin_ctz(v); }
I w = t | ((~t & v) >> (__builtin_ctz(v) + 1));
return w;
}
int main() {
    I p = 0b1;
    std::cout << dump(p) << "\n";
    for (I n = bit_twiddle_permute2(p); n > p; n = bit_twiddle_permute2(n)) {
        std::cout << dump(n) << "\n";
    }
}
The following adaptation of Dario Sneidermanis's idea may be slightly easier to follow
I bit_twiddle_permute3(I v) {
int n = __builtin_ctz(v);
I s = v >> n;
I t = s + 1;
I w = (t << n) | ((~t & s) >> 1);
return w;
}
or with a similar solution to the issue I mentioned at the beginning of this post
I bit_twiddle_permute3(I v) {
    if (v == 0) { return 0; }           // __builtin_ctz(0) is undefined
    int n = __builtin_ctz(v);
    I s = v >> n;
    I t = s + 1;
    if ((I)(t << n) == 0) { return s; } // wrapped past the last permutation
    I w = (t << n) | ((~t & s) >> 1);
    return w;
}