Find the closest integer with same weight O(1) - algorithm

I am solving this problem:
The count of ones in binary representation of integer number is called the weight of that number. The following algorithm finds the closest integer with the same weight. For example, for 123 (0111 1011)₂, the closest integer number is 125 (0111 1101)₂.
The O(n) solution, where n is the width of the input number, swaps the positions of the first pair of consecutive bits that differ.
Could someone give me some hints for solving it in O(1) runtime and space?
Thanks

As already commented by ajayv this cannot really be done in O(1) as the answer always depends on the number of bits the input has. However, if we interpret the O(1) to mean that we have as an input some primitive integer data and all the logic and arithmetic operations we perform on that integer are O(1) (no loops over the bits), the problem can be solved in constant time. Of course, if we changed from 32bit integer to 64bit integer the running time would increase as the arithmetic operations would take longer on hardware.
One possible solution is to use following functions. The first gives you a number where only the lowest set bit of x is set
int lowestBitSet(int x){
return x & ~(x-1);
}
and the second the lowest bit not set
int lowestBitNotSet(int x){
return ~x & (x+1);
}
If you work a few examples of these on paper, you will see how they work.
Now you can find the bits you need to change using these two functions and then use the algorithm you already described.
A C++ implementation (not checking for cases where there is no answer)
unsigned int closestInt(unsigned int x){
unsigned int ns=lowestBitNotSet(x);
unsigned int s=lowestBitSet(x);
if (ns>s){
x|=ns;
x^=ns>>1;
}
else{
x^=s;
x|=s>>1;
}
return x;
}
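As a sanity check on the approach above, here is a minimal Python sketch that ports the two helpers and compares the result with a brute-force search over 8-bit values. The function names and the `brute_force` reference are my own, and the sketch assumes non-negative inputs that are neither all 0s nor all 1s within the chosen width.

```python
def lowest_bit_set(x):
    # Isolate the lowest 1 bit of x, e.g. 0b10100 -> 0b00100.
    return x & ~(x - 1)

def lowest_bit_not_set(x):
    # Isolate the lowest 0 bit of x, e.g. 0b10011 -> 0b00100.
    return ~x & (x + 1)

def closest_int(x):
    ns = lowest_bit_not_set(x)
    s = lowest_bit_set(x)
    if ns > s:
        # x ends in a run of 1s: set the first 0 and clear the 1 below it.
        x |= ns
        x ^= ns >> 1
    else:
        # x ends in a run of 0s: clear the first 1 and set the 0 below it.
        x ^= s
        x |= s >> 1
    return x

def brute_force(x, width=8):
    # Hypothetical reference: scan every value of the given width.
    weight = bin(x).count("1")
    candidates = [y for y in range(1 << width)
                  if y != x and bin(y).count("1") == weight]
    return min(candidates, key=lambda y: abs(y - x))

print(closest_int(123))  # 125, matching the example in the question
```

The brute-force version is only there for checking; it is exponential in the width and not something you would use in practice.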

To solve this problem in O(1) time complexity it can be considered that there are two main cases:
1) When LSB is '0':
In this case, the first '1' must be shifted with one position to the right.
Input : "10001000"
Out ::: "10000100"
2) When LSB is '1':
In this case the first '0' must be set to '1', and the '1' immediately to its right must be set to '0'.
Input : "10000111"
Out ::: "10001011"
The next method in Java represents one solution.
private static void findClosestInteger(String word) { // ex: word = "10001000"
System.out.println(word); // Print initial binary format of the number
int x = Integer.parseInt(word, 2); // Convert String to int
if((x & 1) == 0) { // Evaluates LSB value
// Case when LSB = '0':
// Input: x = 10001000
int firstOne = x & ~(x -1); // get first '1' position (from right to left)
// firstOne = 00001000
x = x & (x - 1); // set first '1' to '0'
// x = 10000000
x = x | (firstOne >> 1); // "shift" first '1' with one position to right
// x = 10000100
} else {
// Case when LSB = '1':
// Input: x = 10000111
int firstZero = ~x & ~(~x - 1); // get first '0' position (from right to left)
// firstZero = 00001000
x = x | firstZero; // set first '0' to '1'
// x = 10001111
x = x & ~(firstZero >> 1); // set the '1' just to the right of it to '0'
// x = 10001011
}
for(int i = word.length() - 1; i > -1 ; i--) { // print the closest integer with same weight
System.out.print("" + ( ( (x & 1 << i) != 0) ? 1 : 0) );
}
}

The problem can be viewed as "which differing bits to swap in a bit representation of a number, so that the resultant number is closest to the original?"
So, if we were to swap the bits at indices k1 and k2, with k2 > k1, the difference between the numbers would be 2^k2 - 2^k1. Our goal is to minimize this difference. Assuming that the bit representation is not all 0s or all 1s, the difference is smallest when |k2 - k1| is as small as possible, and the smallest possible value is 1. For adjacent bits the difference is 2^k1, so we also want k1 to be as low as possible. So, if we're able to find two consecutive different bits, starting from the least significant bit (index = 0), our job is done.
The case where the number ends in a run of 1s (all bits from the least significant bit up to the rightmost unset bit are 1s)
                               k2
                               |
                     7 6 5 4 3 2 1 0
                     ---------------
n:                   1 1 1 0 1 0 1 1
rightmostSetBit:     0 0 0 0 0 0 0 1
rightmostNotSetBit:  0 0 0 0 0 1 0 0   rightmostNotSetBit > rightmostSetBit so,
difference:          0 0 0 0 0 0 1 0   i.e. rightmostNotSetBit - (rightmostNotSetBit >> 1)
                     ---------------
n + difference:      1 1 1 0 1 1 0 1
The case where the number ends in a run of 0s (all bits from the least significant bit up to the rightmost set bit are 0s)
                               k2
                               |
                     7 6 5 4 3 2 1 0
                     ---------------
n:                   1 1 1 0 1 1 0 0
rightmostSetBit:     0 0 0 0 0 1 0 0
rightmostNotSetBit:  0 0 0 0 0 0 0 1   rightmostSetBit > rightmostNotSetBit so,
difference:          0 0 0 0 0 0 1 0   i.e. rightmostSetBit - (rightmostSetBit >> 1)
                     ---------------
n - difference:      1 1 1 0 1 0 1 0
The edge case is, of course, the situation where we have all 0s or all 1s.
public static long closestToWeight(long n){
if(n <= 0 /* all 0s, or negative */ || n == Long.MAX_VALUE /* all 63 value bits are 1s */)
return -1;
long neg = ~n;
long rightmostSetBit = n&~(n-1);
long rightmostNotSetBit = neg&~(neg-1);
if(rightmostNotSetBit > rightmostSetBit){
return (n + (rightmostNotSetBit - (rightmostNotSetBit >> 1)));
}
return (n - (rightmostSetBit - (rightmostSetBit >> 1)));
}
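To illustrate the argument that the best swap is the lowest adjacent pair of differing bits, the following Python sketch enumerates every pair of differing bits in an 8-bit value, computes the change 2^k2 - 2^k1 that each swap causes, and keeps the smallest. The helper `best_swap` and the fixed width are assumptions made for the illustration, not part of the answer.

```python
def best_swap(n, width=8):
    # Try every pair (k1, k2) with k2 > k1 whose bits differ and
    # return (difference, k1, k2) for the swap that changes n the least.
    best = None
    for k1 in range(width):
        for k2 in range(k1 + 1, width):
            if (n >> k1) & 1 != (n >> k2) & 1:
                diff = (1 << k2) - (1 << k1)
                if best is None or diff < best[0]:
                    best = (diff, k1, k2)
    return best

diff, k1, k2 = best_swap(0b11101011)
print(diff, k1, k2)                                       # 2 1 2
print(format(0b11101011 ^ (1 << k1 | 1 << k2), "08b"))    # 11101101
```

Both worked cases above come out as a swap of bits 1 and 2, which is why the constant-time version only needs the lowest set and lowest unset bits.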

Attempted the problem in Python. Can be viewed as a translation of Ari's solution with the edge case handled:
import sys
def closest_int_same_bit_count(x):
    # if all bits of x are 0 or 1, there can't be an answer
    if x & sys.maxsize in {sys.maxsize, 0}:
        raise ValueError("All bits are 0 or 1")
    rightmost_set_bit = x & ~(x - 1)
    next_un_set_bit = ~x & (x + 1)
    if next_un_set_bit > rightmost_set_bit:
        # 0 shifted to the right e.g 0111 -> 1011
        x ^= next_un_set_bit | next_un_set_bit >> 1
    else:
        # 1 shifted to the right 1000 -> 0100
        x ^= rightmost_set_bit | rightmost_set_bit >> 1
    return x
Similarly jigsawmnc's solution is provided below:
import sys
def closest_int_same_bit_count(x):
    # if all bits of x are 0 or 1, there can't be an answer
    if x & sys.maxsize in {sys.maxsize, 0}:
        raise ValueError("All bits are 0 or 1")
    rightmost_set_bit = x & ~(x - 1)
    next_un_set_bit = ~x & (x + 1)
    if next_un_set_bit > rightmost_set_bit:
        # 0 shifted to the right e.g 0111 -> 1011
        x += next_un_set_bit - (next_un_set_bit >> 1)
    else:
        # 1 shifted to the right 1000 -> 0100
        x -= rightmost_set_bit - (rightmost_set_bit >> 1)
    return x

Java Solution:
// Swap the two rightmost consecutive bits that are different (x is assumed to be a long)
for (int i = 0; i < 63; i++) {
if ((((x >> i) & 1) ^ ((x >> (i+1)) & 1)) == 1) {
// then swap them by flipping both bits
long mask = (1L << i) | (1L << (i + 1));
x = x ^ mask;
System.out.println("x = " + x);
return;
}
}

static void findClosestIntWithSameWeight(uint x)
{
// For even x, move the lowest set bit one place to the right. For odd x, do the
// same on ~x, which moves the lowest unset bit of x one place to the right.
uint y = (x & 1) == 0 ? x : ~x;
uint yWithOnlyfirstbitSet = y & ~(y - 1);
uint swapMask = yWithOnlyfirstbitSet | (yWithOnlyfirstbitSet >> 1);
uint closestWeightNum = x ^ swapMask;
Console.WriteLine("Closest Weight for {0} is {1}", x, closestWeightNum);
}

Code in Python:
def closest_int_same_bit_count(x):
    if (x & 1) != ((x >> 1) & 1):
        return x ^ 0x3
    diff = x ^ (x >> 1)
    rbs = diff & ~(diff - 1)
    i = rbs.bit_length() - 1  # exact index of the lowest differing pair (avoids a float log)
    return x ^ (1 << i | 1 << i + 1)

A great explanation of this problem can be found on question 4.4 in EPI.
(Elements of Programming Interviews)
Another place would be this link on geeksforgeeks.org if you don't own the book.
(Time complexity may be wrong on this link)
Two things you should keep in mind here (a hint if you're trying to solve this for yourself):
You can use x & (x - 1) to clear the lowest set bit (not to be confused with the LSB, the least significant bit)
You can use x & ~(x - 1) to get/extract the lowest set bit
If you know the O(n) solution, you know that we need to find the index of the first bit that differs from the LSB.
If you don't know what the LSB is:
0000 0000
        ^ // it's the bit all the way to the right of a binary string.
Take the base two number 1011 1000 (184 in decimal)
The first bit that differs from the LSB:
1011 1000
     ^ // this one
We'll record this as K1 = 0000 1000
Then we need to swap it with the very next bit to the right:
0000 1000
      ^ // this one
We'll record this as K2 = 0000 0100
Bitwise OR K1 and K2 together and you'll get a mask
mask = K1 | K2 // 0000 1000 | 0000 0100 -> 0000 1100
Bitwise XOR the mask with the original number and you'll have the correct output/swap
number ^ mask // 1011 1000 ^ 0000 1100 -> 1011 0100
Now, before we pull everything together, we have to consider the fact that the LSB could be 1, and so could a bunch of bits after it, as in 1000 1111. So we have to deal with the two cases of the first bit that differs from the LSB; it may be a 1 or a 0.
First we have a conditional that tests whether the LSB is 1 or 0: x & 1
If it is 1, return x XORed with the return value of a helper function.
This helper function has a second argument whose value depends on whether the condition is true. func(x, 0xFFFFFFFFFFFFFFFF) // if true // 0xFFFFFFFFFFFFFFFF is a 64-bit word with all bits set to 1
Otherwise we skip the if statement and return a similar expression, but with a different value for the second argument.
return x XORed with func(x, 0x0000000000000000) // 64-bit word with all bits set to 0. You could alternatively just pass 0, but I did this for consistency
Our helper function returns a mask that we are going to XOR with the original number to get our output.
It takes two arguments, our original number and a mask, used in this expression:
(x ^ mask) & ~((x ^ mask) - 1)
which gives us a new number with only the bit at index K1 set to 1.
It then shifts that bit 1 to the right (i.e. index K2), then ORs it with the unshifted value to create our final mask
0000 1000 >> 1 -> 0000 0100 | 0000 1000 -> 0000 1100
This all implemented in C++ looks like:
unsigned long long int closestIntSameBitCount(unsigned long long int n)
{
if (n & 1)
return n ^= getSwapMask(n, 0xFFFFFFFFFFFFFFFF);
return n ^= getSwapMask(n, 0x0000000000000000);
}
// Helper function
unsigned long long int getSwapMask(unsigned long long int n, unsigned long long int mask)
{
unsigned long long int swapBitMask = (n ^ mask) & ~((n ^ mask) - 1);
return swapBitMask | (swapBitMask >> 1);
}
Keep note of the expression (x ^ mask) & ~((x ^ mask) - 1)
I'll now run through this code with my example 1011 1000:
// start of closestIntSameBitCount
if (0) // 1011 1000 & 1 -> 0000 0000
// start of getSwapMask
getSwapMask(1011 1000, 0x00000000)
swapBitMask = (x ^ mask) & ~1011 0111 // ((x ^ mask) - 1) = 1011 1000 ^ .... 0000 0000 -> 1011 1000 - 1 -> 1011 0111
swapBitMask = (x ^ mask) & 0100 1000 // ~1011 0111 -> 0100 1000
swapBitMask = 1011 1000 & 0100 1000 // (x ^ mask) = 1011 1000 ^ .... 0000 0000 -> 1011 1000
swapBitMask = 0000 1000 // 1011 1000 & 0100 1000 -> 0000 1000
return swapBitMask | 0000 0100 // (swapBitMask >> 1) = 0000 1000 >> 1 -> 0000 0100
return 0000 1100 // 0000 1000 | 0000 0100 -> 0000 1100
// end of getSwapMask
return 1011 0100 // 1011 1000 ^ 0000 1100 -> 1011 0100
// end of closestIntSameBitCount
Here is a full running example if you would like to compile and run it yourself:
#include <iostream>
#include <stdio.h>
#include <bitset>
unsigned long long int closestIntSameBitCount(unsigned long long int n);
unsigned long long int getSwapMask(unsigned long long int n, unsigned long long int mask);
int main()
{
unsigned long long int number;
printf("Pick a number: ");
std::cin >> number;
std::bitset<64> a(number);
std::bitset<64> b(closestIntSameBitCount(number));
std::cout << a
<< "\n"
<< b
<< std::endl;
}
unsigned long long int closestIntSameBitCount(unsigned long long int n)
{
if (n & 1)
return n ^= getSwapMask(n, 0xFFFFFFFFFFFFFFFF);
return n ^= getSwapMask(n, 0x0000000000000000);
}
// Helper function
unsigned long long int getSwapMask(unsigned long long int n, unsigned long long int mask)
{
unsigned long long int swapBitMask = (n ^ mask) & ~((n ^ mask) - 1);
return swapBitMask | (swapBitMask >> 1);
}
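For readers who want to experiment without a compiler, here is a rough Python sketch of the same swap-mask idea. It assumes a 64-bit word, spelled out as `MASK64`, and the snake_case names are mine rather than the answer's. The last example has more than 32 trailing 1s, which only works because the all-ones constant covers the full 64 bits.

```python
MASK64 = (1 << 64) - 1  # assumed word size: 64 bits, all set to 1

def get_swap_mask(n, flip):
    # flip is all 1s when n is odd (so we look for the lowest 0 bit of n)
    # and 0 when n is even (so we look for the lowest 1 bit of n).
    y = (n ^ flip) & MASK64
    k1 = y & ~(y - 1)        # isolate the lowest set bit of y (K1)
    return k1 | (k1 >> 1)    # K1 | K2

def closest_int_same_bit_count(n):
    flip = MASK64 if n & 1 else 0
    return n ^ get_swap_mask(n, flip)

print(format(closest_int_same_bit_count(0b10111000), "08b"))  # 10110100
print(format(closest_int_same_bit_count(0b10001111), "08b"))  # 10010111

wide = (1 << 40) - 1                      # forty trailing 1s
print(closest_int_same_bit_count(wide) - wide == 1 << 39)     # True
```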

This was my solution to the problem. I guess @jigsawmnc explains pretty well why we need to keep |k2 - k1| to a minimum. So, in order to find the closest integer with the same weight, we want to find the lowest location where consecutive bits differ and then swap those two bits to get the answer. To do that, we can shift the number right by 1 and take the XOR with the original number. This sets a bit at every location where neighbouring bits differ. Find the lowest set bit of the XOR. This gives you the lowest location to flip. Create a mask for that location and the next bit. Take the XOR with the mask, and that should be the answer. This won't work if the bits are all 0 or all 1.
Here is the code for it.
def variant_closest_int(x: int) -> int:
    if x == 0 or ~x == 0:
        raise ValueError('All bits are 0 or 1')
    x_ = x >> 1
    lsb = x ^ x_
    mask_ = lsb & ~(lsb - 1)
    mask = mask_ | (mask_ << 1)
    return x ^ mask

My solution takes advantage of the parity of the integer. I think the way I get the LSB masks can be simplified.
def next_weighted_int(x):
    if x % 2 == 0:
        lsb_mask = ( ((x - 1) ^ x) >> 1 ) + 1 # Gets a mask for the first 1
        x ^= lsb_mask
        x |= (lsb_mask >> 1)
        return x
    lsb_mask = ((x ^ (x + 1)) >> 1 ) + 1 # Gets a mask for the first 0
    x |= lsb_mask
    x ^= (lsb_mask >> 1)
    return x

Just sharing my python solution for this problem:
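def closest_int_same_bit_count(a):
    x = a + (a & 1) # if a is odd, the carry turns the trailing 1s into 0s and sets the first 0
    bit = (x & ~(x-1)) # lowest set bit of x: the first 1 of an even a, or the first 0 of an odd a
    return a ^ (bit | bit >> 1) # swap that bit with the bit to its right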

func findClosestIntegerWithTheSameWeight2(x int) int {
rightMost0 := ^x & (x + 1)
rightMost1 := x & (-x)
if rightMost0 > 1 {
return (x ^ rightMost0) ^ (rightMost0 >> 1)
} else {
return (x ^ rightMost1) ^ (rightMost1 >> 1)
}
}

Related

easy algorithm for using flags

The following flag definitions exist:
flag1=1
flag2=2
flag3=4
flag4=8
...
flagN=2^(N-1)
flag=flag1+flag2+...+flagN
If flagI is not set, it contributes 0.
I have flag. Which method can easily check whether, for example, flag2 is defined?
Answer to your question
What's the range of flag? If it's under 2^64-1, almost every method is okay.
As @taskinoor posted, you should notice that:
flag1 = 000 ... ... 0001
flag2 = 000 ... ... 0010
flag3 = 000 ... ... 0100
In other words,
flag[n] = 1 << (n-1)
So, if you want to check all bits, a for loop and a bitwise operation are fast enough to solve your problem. Like this (assuming you understand C/C++ and flag is less than 2^32, which can be held by an unsigned int in C/C++):
void check(unsigned int flag)
{
for (int i = 0; i < 32; ++i)
if ((flag & (1u << i)) != 0)
printf("flag%d defined!\n", i+1);
}
It's O(k), where k is the length of the type of flag in bits. For unsigned int, it's O(32) = O(1), i.e. constant time.
If you just want to count how many flags are defined:
I don't know what your purpose is. If you just want to count how many flags are defined and flag is less than 2^32, the following method works well (assuming unsigned int as before):
unsigned int count_bit(unsigned int x)
{
x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);
x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF);
x = (x & 0x0000FFFF) + ((x >> 16)& 0x0000FFFF);
return x;
}
If you call count_bit(1234567890), it'll return 12.
Let me explain this algorithm.
This algorithm is based on divide and conquer. Suppose there is an 8-bit integer 213 (11010101 in binary); the algorithm works like this (each step merges pairs of neighbouring blocks):
+-------------------------------+
| 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | <- x
| 1 0 | 0 1 | 0 1 | 0 1 | <- first time merge
| 0 0 1 1 | 0 0 1 0 | <- second time merge
| 0 0 0 0 0 1 0 1 | <- third time ( answer = 00000101 = 5)
+-------------------------------+
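The merge steps in the diagram can be reproduced with a small Python sketch. It is an 8-bit variant of the function above (so only three masks are needed), and `count_bits_8` is a name chosen here for illustration. It returns the intermediate values so they can be compared with the rows of the diagram.

```python
def count_bits_8(x):
    # Each step adds neighbouring blocks: 1-bit blocks into 2-bit sums,
    # 2-bit sums into 4-bit sums, and 4-bit sums into the final count.
    steps = [x]
    x = (x & 0x55) + ((x >> 1) & 0x55)
    steps.append(x)
    x = (x & 0x33) + ((x >> 2) & 0x33)
    steps.append(x)
    x = (x & 0x0F) + ((x >> 4) & 0x0F)
    steps.append(x)
    return steps

for value in count_bits_8(213):
    print(format(value, "08b"))
# 11010101
# 10010101
# 00110010
# 00000101
```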
Note that in each flag only one bit is set to 1, others are 0.
flag1 = 000 ... ... 0001
flag2 = 000 ... ... 0010
flag3 = 000 ... ... 0100
// and like this
So if you do bitwise AND flag & flag2 then the result will be non-zero only if flag2 is defined.
r = flag & flag2;
if r != 0 then flag2 is defined
You can do this with all flags.
Boolean isSet (flags, flagN){
Return (flags & flagN) != 0;
}
Flags being the flag vector, flagN the flag you want to check
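A minimal Python sketch of the same check, with setting and clearing added for completeness. The constant names are hypothetical and simply mirror flag1..flag4 from the question.

```python
FLAG1, FLAG2, FLAG3, FLAG4 = 1, 2, 4, 8  # flagN = 1 << (N - 1)

def is_set(flags, flag_n):
    # Non-zero only if the single bit of flag_n is present in flags.
    return (flags & flag_n) != 0

flags = FLAG1 | FLAG3          # same as FLAG1 + FLAG3 for distinct flags
print(is_set(flags, FLAG2))    # False
print(is_set(flags, FLAG3))    # True
flags |= FLAG2                 # set flag2
flags &= ~FLAG1                # clear flag1
print(flags)                   # 6
```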
It is worth grokking the concept of bitmasks and flags more deeply. You can then use your imagination to represent state efficiently. (Just an example, explained below.)
First, define the bitmask: 0x0000001c
What are the binary strings for which an 'and' operation with the mask gives a non-zero value?
Those are your valid flag values.
Valid flag values for this bitmask: 0x0000001c, 0x00000014, 0x00000018, 0x00000004, 0x00000008, etc.
So, in your application you can do the following:
flagvariable |= flagvalue1 -> Enable a particular flag.
if (flagvariable & maskvalue) : Check if any flag in the mask is enabled.
Then the different cases you would need to check:
if ((flagvariable & maskvalue) == flagvalue1) { do something }
else
if ((flagvariable & maskvalue) == flagvalue2) { do something else }
flagvariable &= ~flagvalue1 : Clear the flag
To be more clear about flags and bitmasks, just step into gdb, use p /t, and evaluate the operations described above.

Number of 1s in the two's complement binary representations of integers in a range

This problem is from the 2011 Codesprint (http://csfall11.interviewstreet.com/):
One of the basics of Computer Science is knowing how numbers are represented in 2's complement. Imagine that you write down all numbers between A and B inclusive in 2's complement representation using 32 bits. How many 1's will you write down in all ?
Input:
The first line contains the number of test cases T (<1000). Each of the next T lines contains two integers A and B.
Output:
Output T lines, one corresponding to each test case.
Constraints:
-2^31 <= A <= B <= 2^31 - 1
Sample Input:
3
-2 0
-3 4
-1 4
Sample Output:
63
99
37
Explanation:
For the first case, -2 contains 31 1's followed by a 0, -1 contains 32 1's and 0 contains 0 1's. Thus the total is 63.
For the second case, the answer is 31 + 31 + 32 + 0 + 1 + 1 + 2 + 1 = 99
I realize that you can use the fact that the number of 1s in -X is equal to the number of 0s in the complement of (-X) = X-1 to speed up the search. The solution claims that there is a O(log X) recurrence relation for generating the answer but I do not understand it. The solution code can be viewed here: https://gist.github.com/1285119
I would appreciate it if someone could explain how this relation is derived!
Well, it's not that complicated...
The single-argument solve(int a) function is the key. It is short, so I will cut&paste it here:
long long solve(int a)
{
if(a == 0) return 0 ;
if(a % 2 == 0) return solve(a - 1) + __builtin_popcount(a) ;
return ((long long)a + 1) / 2 + 2 * solve(a / 2) ;
}
It only works for non-negative a, and it counts the number of 1 bits in all integers from 0 to a inclusive.
The function has three cases:
a == 0 -> returns 0. Obviously.
a even -> returns the number of 1 bits in a plus solve(a-1). Also pretty obvious.
The final case is the interesting one. So, how do we count the number of 1 bits from 0 to an odd number a?
Consider all of the integers between 0 and a, and split them into two groups: The evens, and the odds. For example, if a is 5, you have two groups (in binary):
000 (aka. 0)
010 (aka. 2)
100 (aka. 4)
and
001 (aka 1)
011 (aka 3)
101 (aka 5)
Observe that these two groups must have the same size (because a is odd and the range is inclusive). To count how many 1 bits there are in each group, first count all but the last bits, then count the last bits.
All but the last bits looks like this:
00
01
10
...and it looks like this for both groups. The number of 1 bits here is just solve(a/2). (In this example, it is the number of 1 bits from 0 to 2. Also, recall that integer division in C/C++ rounds down.)
The last bit is zero for every number in the first group and one for every number in the second group, so those last bits contribute (a+1)/2 one bits to the total.
So the third case of the recursion is (a+1)/2 + 2*solve(a/2), with appropriate casts to long long to handle the case where a is INT_MAX (and thus a+1 overflows).
This is an O(log N) solution. To generalize it to solve(a,b), you just compute solve(b) - solve(a - 1), so that a itself is included, plus the appropriate logic for worrying about negative numbers. That is what the two-argument solve(int a, int b) is doing.
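The recurrence is easy to try out in Python. The sketch below is a direct port for non-negative arguments; `solve_range` is a hypothetical helper added here for the inclusive range and is not the two-argument function from the linked gist.

```python
def solve(a):
    # Number of 1 bits in all integers from 0 to a inclusive (a >= 0).
    if a == 0:
        return 0
    if a % 2 == 0:
        return solve(a - 1) + bin(a).count("1")
    # a is odd: evens and odds each contribute solve(a // 2) for the upper
    # bits, and the (a + 1) // 2 odd numbers contribute one low bit each.
    return (a + 1) // 2 + 2 * solve(a // 2)

def solve_range(a, b):
    # Inclusive range [a, b] with 0 <= a <= b.
    return solve(b) - (solve(a - 1) if a > 0 else 0)

print(solve(5))            # 7
print(solve_range(3, 5))   # 5
```

Every even step is followed by an odd one that halves the argument, so the recursion depth stays proportional to the number of bits.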
Cast the array into a series of integers. Then for each integer do:
int NumberOfSetBits(int i)
{
i = i - ((i >> 1) & 0x55555555);
i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}
Also this is portable, unlike __builtin_popcount
See here: How to count the number of set bits in a 32-bit integer?
When a is positive, a better explanation has already been posted.
If a is negative, then with 32-bit integers the negative numbers from a to -1 contribute 32 bits each, less the number of 1 bits in the range from 0 to -a - 1 (the zeros of a negative x are the ones of -x - 1).
So, more completely:
long long solve(int a) {
if (a >= 0){
if (a == 0) return 0;
else if ((a %2) == 0) return solve(a - 1) + noOfSetBits(a);
else return (2 * solve( a / 2)) + ((long long)a + 1) / 2;
}else {
a++;
return ((long long)(-a) + 1) * 32 - solve(-a);
}
}
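Here is a hedged Python sketch of the same idea, checked against the sample cases from the question. It assumes a 32-bit two's complement representation, and `solve_range` is a wrapper I made up for the two-argument case.

```python
def solve(a):
    # a >= 0: 1 bits in 0..a.  a < 0: 1 bits in a..-1 (32-bit two's complement).
    if a >= 0:
        if a == 0:
            return 0
        if a % 2 == 0:
            return solve(a - 1) + bin(a).count("1")
        return 2 * solve(a // 2) + (a + 1) // 2
    # Each of the -a negative numbers x has 32 - popcount(-x - 1) one bits,
    # and -x - 1 runs over 0..-a-1.
    return -a * 32 - solve(-a - 1)

def solve_range(a, b):
    # Hypothetical wrapper for the inclusive range [a, b], a <= b.
    if a >= 0:
        return solve(b) - (solve(a - 1) if a > 0 else 0)
    if b < 0:
        return solve(a) - solve(b + 1)
    return solve(a) + solve(b)

for a, b in [(-2, 0), (-3, 4), (-1, 4)]:
    print(solve_range(a, b))   # 63, 99, 37 as in the sample output
```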
In the following code, the bitsum of x is defined as the count of 1 bits in the two's complement representation of the numbers between 0 and x (inclusive), where Integer.MIN_VALUE <= x <= Integer.MAX_VALUE.
For example:
bitsum(0) is 0
bitsum(1) is 1
bitsum(2) is 2
bitsum(3) is 4
..etc
10987654321098765432109876543210 i % 10 for 0 <= i <= 31
00000000000000000000000000000000 0
00000000000000000000000000000001 1
00000000000000000000000000000010 2
00000000000000000000000000000011 3
00000000000000000000000000000100 4
00000000000000000000000000000101 ...
00000000000000000000000000000110
00000000000000000000000000000111 (2^i)-1
00000000000000000000000000001000 2^i
00000000000000000000000000001001 (2^i)+1
00000000000000000000000000001010 ...
00000000000000000000000000001011 x, 011 = x & (2^i)-1 = 3
00000000000000000000000000001100
00000000000000000000000000001101
00000000000000000000000000001110
00000000000000000000000000001111
00000000000000000000000000010000
00000000000000000000000000010001
00000000000000000000000000010010 18
...
01111111111111111111111111111111 Integer.MAX_VALUE
The formula of the bitsum is:
bitsum(x) = bitsum((2^i)-1) + 1 + x - 2^i + bitsum(x & (2^i)-1 )
Note that x - 2^i = x & (2^i)-1
Negative numbers are handled slightly differently than positive numbers. In this case the number of zeros is subtracted from the total number of bits:
Integer.MIN_VALUE <= x < -1
Total number of bits: 32 * -x.
The number of zeros in a negative number x is equal to the number of ones in -x - 1.
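The formula can be sketched in a few lines of Python. This is an illustration under the same 32-bit assumption, not a port of the Java class: it uses the closed form bitsum(2^i - 1) = i * 2^(i-1) in place of a lookup table, and `int.bit_length()` to find i.

```python
def bitsum(x):
    # 1 bits in 0..x for x >= 0, or in x..-1 for x < 0 (32-bit two's complement).
    if x < 0:
        # 32 bits per number, minus the zeros, which match the ones of -n - 1.
        return 32 * -x - bitsum(-x - 1)
    if x == 0:
        return 0
    i = x.bit_length() - 1          # 2^i is the highest power of two <= x
    rest = x & ((1 << i) - 1)       # equals x - 2^i
    full_block = i * (1 << i) // 2  # bitsum(2^i - 1) = i * 2^(i-1)
    return full_block + 1 + rest + bitsum(rest)

print([bitsum(v) for v in range(5)])  # [0, 1, 2, 4, 5]
print(bitsum(-2))                     # 63
```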
public class TwosComplement {
//t[i] is the bitsum of (2^i)-1 for i in 0 to 31.
private static long[] t = new long[32];
static {
t[0] = 0;
t[1] = 1;
int p = 2;
for (int i = 2; i < 32; i++) {
t[i] = 2*t[i-1] + p;
p = p << 1;
}
}
//count the bits between x and y inclusive
public static long bitsum(int x, int y) {
if (y > x && x > 0) {
return bitsum(y) - bitsum(x-1);
}
else if (y >= 0 && x == 0) {
return bitsum(y);
}
else if (y == x) {
return Integer.bitCount(y);
}
else if (x < 0 && y == 0) {
return bitsum(x);
} else if (x < 0 && x < y && y < 0 ) {
return bitsum(x) - bitsum(y+1);
} else if (x < 0 && x < y && 0 < y) {
return bitsum(x) + bitsum(y);
}
throw new RuntimeException(x + " " + y);
}
//count the bits between 0 and x
public static long bitsum(int x) {
if (x == 0) return 0;
if (x < 0) {
if (x == -1) {
return 32;
} else {
long y = -(long)x;
return 32 * y - bitsum((int)(y - 1));
}
} else {
int n = x;
int sum = 0; //x & (2^i)-1
int j = 0;
int i = 1; //i = 2^j
int lsb = n & 1; //least significant bit
n = n >>> 1;
while (n != 0) {
sum += lsb * i;
lsb = n & 1;
n = n >>> 1;
i = i << 1;
j++;
}
long tot = t[j] + 1 + sum + bitsum(sum);
return tot;
}
}
}

How to get lg2 of a number that is 2^k

What is the best solution for getting the base 2 logarithm of a number that I know is a power of two (2^k). (Of course I know only the value 2^k not k itself.)
One way I thought of doing is by subtracting 1 and then doing a bitcount:
lg2(n) = bitcount( n - 1 ) = k, iff k is an integer
0b10000 - 1 = 0b01111, bitcount(0b01111) = 4
But is there a faster way of doing it (without caching)? Also something that doesn't involve bitcount about as fast would be nice to know?
One of the applications of this is:
suppose you have bitmask
0b0110111000
and value
0b0101010101
and you are interested of
(value & bitmask) >> number of zeros in front of bitmask
(0b0101010101 & 0b0110111000) >> 3 = 0b100010
this can be done with
using bitcount
(value & bitmask) >> (bitcount((bitmask - 1) xor bitmask) - 1)
or using lg2
(value & bitmask) >> (lg2(((bitmask - 1) xor bitmask) + 1) - 1)
For it to be faster than bitcount without caching it should be faster than O(lg(k)) where k is the count of storage bits.
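To make the application concrete, here is a small Python sketch of the field extraction described above. `extract_field` is a hypothetical name, and the mask is assumed to be non-zero. It isolates the lowest set bit of the mask (a power of two) and then uses the bitcount route, bitcount(2^k - 1) = k, to get the shift amount.

```python
def extract_field(value, bitmask):
    # Lowest set bit of the mask is a power of two, 2^k.
    lowest = bitmask & -bitmask
    # k via the bitcount route from the question: bitcount(2^k - 1) == k.
    shift = bin(lowest - 1).count("1")
    return (value & bitmask) >> shift

print(bin(extract_field(0b0101010101, 0b0110111000)))  # 0b100010
```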
Yes. Here's a way to do it without the bitcount in lg(n), if you know the integer in question is a power of 2.
unsigned int x = ...;
static const unsigned int arr[] = {
// Each element in this array alternates a number of 1s equal to
// consecutive powers of two with an equal number of 0s.
0xAAAAAAAA, // 0b10101010.. // one 1, then one 0, ...
0xCCCCCCCC, // 0b11001100.. // two 1s, then two 0s, ...
0xF0F0F0F0, // 0b11110000.. // four 1s, then four 0s, ...
0xFF00FF00, // 0b1111111100000000.. // [The sequence continues.]
0xFFFF0000
};
register unsigned int reg = (x & arr[0]) != 0;
reg |= ((x & arr[4]) != 0) << 4;
reg |= ((x & arr[3]) != 0) << 3;
reg |= ((x & arr[2]) != 0) << 2;
reg |= ((x & arr[1]) != 0) << 1;
// reg now has the value of lg(x).
In each of the reg |= steps, we successively test to see if any of the bits of x are shared with alternating bitmasks in arr. If they are, that means that lg(x) has bits which are in that bitmask, and we effectively add 2^k to reg, where k is the log of the length of the alternating bitmask. For example, 0xFF00FF00 is an alternating sequence of 8 ones and zeroes, so k is 3 (or lg(8)) for this bitmask.
Essentially, each reg |= ((x & arr[k]) ... step (and the initial assignment) tests whether lg(x) has bit k set. If so, we add it to reg; the sum of all those bits will be lg(x).
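The same test can be written as a loop, which makes the role of each mask more visible. This Python sketch assumes x is a power of two below 2^32; `MASKS` and `log2_pow2` are names chosen for the illustration.

```python
MASKS = [0xAAAAAAAA, 0xCCCCCCCC, 0xF0F0F0F0, 0xFF00FF00, 0xFFFF0000]

def log2_pow2(x):
    # x is assumed to be a power of two below 2^32.
    result = 0
    for k, mask in enumerate(MASKS):
        # MASKS[k] covers exactly the bit positions whose index has bit k set.
        if x & mask:
            result |= 1 << k
    return result

print(log2_pow2(2048))  # 11
```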
That looks like a lot of magic, so let's try an example. Suppose we want to know what power of 2 the value 2,048 is:
// x = 2048
// = 1000 0000 0000
register unsigned int reg = (x & arr[0]) != 0;
// reg = 1000 0000 0000
& ... 1010 1010 1010
= 1000 0000 0000 != 0
// reg = 0x1 (1) // <-- Matched! Add 2^0 to reg.
reg |= ((x & arr[4]) != 0) << 4;
// reg = 0x .. 0800
& 0x .. 0000
= 0 != 0
// reg = reg | (0 << 4) // <--- No match.
// reg = 0x1 | 0
// reg remains 0x1.
reg |= ((x & arr[3]) != 0) << 3;
// reg = 0x .. 0800
& 0x .. FF00
= 800 != 0
// reg = reg | (1 << 3) // <--- Matched! Add 2^3 to reg.
// reg = 0x1 | 0x8
// reg is now 0x9.
reg |= ((x & arr[2]) != 0) << 2;
// reg = 0x .. 0800
& 0x .. F0F0
= 0 != 0
// reg = reg | (0 << 2) // <--- No match.
// reg = 0x9 | 0
// reg remains 0x9.
reg |= ((x & arr[1]) != 0) << 1;
// reg = 0x .. 0800
& 0x .. CCCC
= 800 != 0
// reg = reg | (1 << 1) // <--- Matched! Add 2^1 to reg.
// reg = 0x9 | 0x2
// reg is now 0xb (11).
We see that the final value of reg is 2^0 + 2^1 + 2^3, which is indeed 11.
If you know the number is a power of 2, you could just shift it right (>>) until it equals 0. The number of times you shifted right (minus 1) is your k.
Edit: faster than this is the lookup table method (though you sacrifice some space, but not a ton). See http://doctorinterview.com/index.html/algorithmscoding/find-the-integer-log-base-2-of-an-integer/.
Many architectures have a "find first one" instruction (bsr, clz, bfffo, cntlzw, etc.) which will be much faster than bit-counting approaches.
If you don't mind dealing with floats you can use log(x) / log(2).
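For comparison, here are the shifting approach and an integer-only alternative in Python. Python's built-in `int.bit_length()` plays a role similar to a "find first one" instruction, though whether it maps to one is an implementation detail. The floating-point variant is included only to show that its result is a float and may carry rounding error.

```python
import math

def log2_exact(x):
    # Integer-only: bit_length() is the index of the highest set bit plus one.
    return x.bit_length() - 1

def log2_by_shifting(x):
    # Count right shifts until x reaches 0, then subtract one.
    shifts = 0
    while x:
        x >>= 1
        shifts += 1
    return shifts - 1

print(log2_exact(2048), log2_by_shifting(2048))  # 11 11
print(math.log(2 ** 29) / math.log(2))           # a float, may not be exactly 29.0
```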

How to compute a 3D Morton number (interleave the bits of 3 ints)

I'm looking for a fast way to compute a 3D Morton number. This site has a magic-number based trick for doing it for 2D Morton numbers, but it doesn't seem obvious how to extend it to 3D.
So basically I have 3 10-bit numbers that I want to interleave into a single 30 bit number with a minimal number of operations.
You can use the same technique. I'm assuming that variables contain 32-bit integers with the highest 22 bits set to 0 (which is a bit more restrictive than necessary). For each variable x containing one of the three 10-bit integers we do the following:
x = (x | (x << 16)) & 0x030000FF;
x = (x | (x << 8)) & 0x0300F00F;
x = (x | (x << 4)) & 0x030C30C3;
x = (x | (x << 2)) & 0x09249249;
Then, with x,y and z the three manipulated 10-bit integers we get the result by taking:
x | (y << 1) | (z << 2)
The way this technique works is as follows. Each of the x = ... lines above "splits" groups of bits in half such that there is enough space in between for the bits of the other integers. For example, if we consider three 4-bit integers, we split one with bits 1234 into 000012000034 where the zeros are reserved for the other integers. In the next step we split 12 and 34 in the same way to get 001002003004. Even though 10 bits doesn't make for a nice repeated division in two groups, you can just consider it 16 bits where you lose the highest ones in the end.
As you can see from the first line, you actually only need that for each input integer x it holds that x & 0x03000000 == 0.
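The steps above can be sketched in Python and checked against a naive bit-by-bit interleave. The function names are my own, and the initial `x &= 0x3FF` is an extra safety mask rather than part of the answer.

```python
def part1by2(x):
    # Spread the low 10 bits of x so that two zero bits separate each bit.
    x &= 0x3FF
    x = (x | (x << 16)) & 0x030000FF
    x = (x | (x << 8)) & 0x0300F00F
    x = (x | (x << 4)) & 0x030C30C3
    x = (x | (x << 2)) & 0x09249249
    return x

def morton3(x, y, z):
    return part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)

def morton3_naive(x, y, z):
    # Reference: place bit i of x, y, z at positions 3i, 3i+1, 3i+2.
    code = 0
    for i in range(10):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code

print(bin(morton3(0b11, 0b01, 0b10)))  # 0b101011
```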
Here is my solution with a Python script:
I took the hint from Fabian “ryg” Giesen's comment.
Read the long comment below! We need to keep track which bits need to go how far!
Then in each step we select these bits and move them and apply a bitmask (see comment last lines) to mask them!
Bit Distances: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Bit Distances (binary): ['0', '10', '100', '110', '1000', '1010', '1100', '1110', '10000', '10010']
Shifting bits by 1 for bits idx: []
Shifting bits by 2 for bits idx: [1, 3, 5, 7, 9]
Shifting bits by 4 for bits idx: [2, 3, 6, 7]
Shifting bits by 8 for bits idx: [4, 5, 6, 7]
Shifting bits by 16 for bits idx: [8, 9]
BitPositions: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Shifted bef.: 0000 0000 0000 0000 0000 0011 0000 0000 hex: 0x300
Shifted: 0000 0011 0000 0000 0000 0000 0000 0000 hex: 0x3000000
NonShifted: 0000 0000 0000 0000 0000 0000 1111 1111 hex: 0xff
Bitmask is now: 0000 0011 0000 0000 0000 0000 1111 1111 hex: 0x30000ff
Shifted bef.: 0000 0000 0000 0000 0000 0000 1111 0000 hex: 0xf0
Shifted: 0000 0000 0000 0000 1111 0000 0000 0000 hex: 0xf000
NonShifted: 0000 0011 0000 0000 0000 0000 0000 1111 hex: 0x300000f
Bitmask is now: 0000 0011 0000 0000 1111 0000 0000 1111 hex: 0x300f00f
Shifted bef.: 0000 0000 0000 0000 1100 0000 0000 1100 hex: 0xc00c
Shifted: 0000 0000 0000 1100 0000 0000 1100 0000 hex: 0xc00c0
NonShifted: 0000 0011 0000 0000 0011 0000 0000 0011 hex: 0x3003003
Bitmask is now: 0000 0011 0000 1100 0011 0000 1100 0011 hex: 0x30c30c3
Shifted bef.: 0000 0010 0000 1000 0010 0000 1000 0010 hex: 0x2082082
Shifted: 0000 1000 0010 0000 1000 0010 0000 1000 hex: 0x8208208
NonShifted: 0000 0001 0000 0100 0001 0000 0100 0001 hex: 0x1041041
Bitmask is now: 0000 1001 0010 0100 1001 0010 0100 1001 hex: 0x9249249
x &= 0x3ff
x = (x | x << 16) & 0x30000ff <<< THIS IS THE MASK for shifting 16 (for bit 8 and 9)
x = (x | x << 8) & 0x300f00f
x = (x | x << 4) & 0x30c30c3
x = (x | x << 2) & 0x9249249
So for a 10bit number and 2 interleaving bits (for 32 bit), you need to do the following!:
x &= 0x3ff
x = (x | x << 16) & 0x30000ff #<<< THIS IS THE MASK for shifting 16 (for bit 8 and 9)
x = (x | x << 8) & 0x300f00f
x = (x | x << 4) & 0x30c30c3
x = (x | x << 2) & 0x9249249
And for a 21bit number and 2 interleaving bits (for 64bit), you need to do the following!:
x &= 0x1fffff
x = (x | x << 32) & 0x1f00000000ffff
x = (x | x << 16) & 0x1f0000ff0000ff
x = (x | x << 8) & 0x100f00f00f00f00f
x = (x | x << 4) & 0x10c30c30c30c30c3
x = (x | x << 2) & 0x1249249249249249
And for a 42bit number and 2 interleaving bits (for 128bit), you need to do the following ( in case you need it ;-)) :
x &= 0x3ffffffffff
x = (x | x << 64) & 0x3ff0000000000000000ffffffffL
x = (x | x << 32) & 0x3ff00000000ffff00000000ffffL
x = (x | x << 16) & 0x30000ff0000ff0000ff0000ff0000ffL
x = (x | x << 8) & 0x300f00f00f00f00f00f00f00f00f00fL
x = (x | x << 4) & 0x30c30c30c30c30c30c30c30c30c30c3L
x = (x | x << 2) & 0x9249249249249249249249249249249L
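As a rough check of the 21-bit sequence listed above, here is a minimal Python sketch that wraps it in a function and compares it with a slow bit-by-bit reference. The function names are made up for illustration; only the masks and shifts come from the listing.

```python
def spread_by_3(x):
    """Move bit i of the low 21 bits of x to bit 3*i (two empty bits in between)."""
    x &= 0x1FFFFF
    x = (x | x << 32) & 0x1F00000000FFFF
    x = (x | x << 16) & 0x1F0000FF0000FF
    x = (x | x << 8) & 0x100F00F00F00F00F
    x = (x | x << 4) & 0x10C30C30C30C30C3
    x = (x | x << 2) & 0x1249249249249249
    return x


def spread_by_3_naive(x):
    """Reference version: place one bit at a time."""
    out = 0
    for i in range(21):
        out |= ((x >> i) & 1) << (3 * i)
    return out


if __name__ == "__main__":
    for value in (0, 1, 2, 0x12345, 0x1FFFFF):
        assert spread_by_3(value) == spread_by_3_naive(value)
    print(hex(spread_by_3(0x1FFFFF)))
```

Python integers are unbounded, so the final mask in each step is what keeps the value within 64 bits here.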
Python Script to produce and check the Interleaving Patterns!!!
def prettyBinString(x,d=32,steps=4,sep=".",emptyChar="0"):
    b = bin(x)[2:]
    zeros = d - len(b)
    if zeros <= 0:
        zeros = 0
        k = steps - (len(b) % steps)
    else:
        k = steps - (d % steps)
    s = ""
    #print("zeros" , zeros)
    #print("k" , k)
    for i in range(zeros):
        #print("k:",k)
        if(k%steps==0 and i!= 0):
            s+=sep
        s += emptyChar
        k+=1
    for i in range(len(b)):
        if( (k%steps==0 and i!=0 and zeros == 0) or (k%steps==0 and zeros != 0) ):
            s+=sep
        s += b[i]
        k+=1
    return s
def binStr(x): return prettyBinString(x,32,4," ","0")
def computeBitMaskPatternAndCode(numberOfBits, numberOfEmptyBits):
    bitDistances=[ i*numberOfEmptyBits for i in range(numberOfBits) ]
    print("Bit Distances: " + str(bitDistances))
    bitDistancesB = [bin(dist)[2:] for dist in bitDistances]
    print("Bit Distances (binary): " + str(bitDistancesB))
    moveBits=[] # List of all bits that have to be shifted by 2, 4, 8, 16, 32, 64, 128 positions, in ascending order
    maxLength = len(max(bitDistancesB, key=len))
    abort = False
    for i in range(maxLength):
        moveBits.append([])
        for idx,bits in enumerate(bitDistancesB):
            if not len(bits) - 1 < i:
                if(bits[len(bits)-i-1] == "1"):
                    moveBits[i].append(idx)
    for i in range(len(moveBits)):
        print("Shifting bits by " + str(2**i) + "\t for bits idx: " + str(moveBits[i]))
    bitPositions = list(range(numberOfBits)) # a real list, so positions can be updated in place
    print("BitPositions: " + str(bitPositions))
    maskOld = (1 << numberOfBits) -1
    codeString = "x &= " + hex(maskOld) + "\n"
    for idx in range(len(moveBits)-1, -1, -1):
        if len(moveBits[idx]):
            shifted = 0
            for bitIdxToMove in moveBits[idx]:
                shifted |= 1<<bitPositions[bitIdxToMove]
                bitPositions[bitIdxToMove] += 2**idx # keep track where the actual bit stands! might get moved several times
            # Get the non shifted part!
            nonshifted = ~shifted & maskOld
            print("Shifted bef.:\t" + binStr(shifted) + " hex: " + hex(shifted))
            shifted = shifted << 2**idx
            print("Shifted:\t" + binStr(shifted)+ " hex: " + hex(shifted))
            print("NonShifted:\t" + binStr(nonshifted) + " hex: " + hex(nonshifted))
            maskNew = shifted | nonshifted
            print("Bitmask is now:\t" + binStr(maskNew) + " hex: " + hex(maskNew) +"\n")
            #print("Code: " + "x = x | x << " +str(2**idx)+ " & " +hex(maskNew))
            codeString += "x = (x | x << " +str(2**idx)+ ") & " +hex(maskNew) + "\n"
            maskOld = maskNew
    return codeString
import random
numberOfBits = 10
numberOfEmptyBits = 2
codeString = computeBitMaskPatternAndCode(numberOfBits,numberOfEmptyBits)
print(codeString)
def partitionBy2(x):
    # Run the generated code in its own namespace and read x back from it;
    # exec cannot rebind a function's local variable in Python 3.
    namespace = {"x": x}
    exec(codeString, namespace)
    return namespace["x"]
def checkPartition(x):
    print("Check partition for: \t" + binStr(x))
    part = partitionBy2(x)
    print("Partition is : \t\t" + binStr(part))
    # make the pattern manually
    partC = 0
    for bitIdx in range(numberOfBits):
        partC = partC | (x & (1<<bitIdx)) << numberOfEmptyBits*bitIdx
    print("Partition check is :\t" + binStr(partC))
    if(partC == part):
        return True
    else:
        return False
checkError = False
for i in range(20):
    x = random.getrandbits(numberOfBits)
    if(checkPartition(x) == False):
        checkError = True
        break
if not checkError:
    print("CHECK PARTITION SUCCESSFUL!!!!!!!!!!!!!!!!...")
else:
    print("checkPartition has ERROR!!!!")
The simplest is probably a lookup table, if you have 4K of free space:
static uint32_t t [ 1024 ] = { 0, 0x1, 0x8, ... };
uint32_t m ( int a, int b, int c )
{
return t[a] | ( t[b] << 1 ) | ( t[c] << 2 );
}
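The table contents are elided in the snippet above. One way to generate them is sketched below in Python, assuming the table maps each 10-bit value to the same value with two empty bits after every bit (which matches the visible entries 0, 0x1, 0x8). The helper names are hypothetical.

```python
def build_spread_table(bits=10, stride=3):
    """table[v] has bit i of v moved to bit stride*i."""
    table = []
    for value in range(1 << bits):
        spread = 0
        for i in range(bits):
            spread |= ((value >> i) & 1) << (stride * i)
        table.append(spread)
    return table


TABLE = build_spread_table()


def morton_lookup(a, b, c):
    """Interleave three 10-bit values with three table lookups."""
    return TABLE[a] | (TABLE[b] << 1) | (TABLE[c] << 2)


if __name__ == "__main__":
    print([hex(v) for v in TABLE[:4]])
    print(hex(morton_lookup(1023, 0, 0)))
```

The table is built once with a slow loop, so the per-call cost is just three lookups, two shifts and two ORs.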
The bit hack uses shifts and masks to spread the bits out, so each time it shifts the value and ors it, copying some of the bits into empty spaces, then masking out combinations so only the original bits remain.
for example:
x = 0xabcd;
= 0000_0000_0000_0000_1010_1011_1100_1101
x = (x | (x << S[3])) & B[3];
= ( 0x00abcd00 | 0x0000abcd ) & 0xff00ff
= 0x00ab__cd & 0xff00ff
= 0x00ab00cd
= 0000_0000_1010_1011_0000_0000_1100_1101
x = (x | (x << S[2])) & B[2];
= ( 0x0ab00cd0 | 0x00ab00cd) & 0x0f0f0f0f
= 0x0a_b_c_d & 0x0f0f0f0f
= 0x0a0b0c0d
= 0000_1010_0000_1011_0000_1100_0000_1101
x = (x | (x << S[1])) & B[1];
= ( 0000_1010_0000_1011_0000_1100_0000_1101 |
0010_1000_0010_1100_0011_0000_0011_0100 ) &
0011_0011_0011_0011_0011_0011_0011_0011
= 0010_0010_0010_0011_0011_0000_0011_0001
x = (x | (x << S[0])) & B[0];
= ( 0010_0010_0010_0011_0011_0000_0011_0001 |
0100_0100_0100_0110_0110_0000_0110_0010 ) &
0101_0101_0101_0101_0101_0101_0101_0101
= 0100_0100_0100_0101_0101_0000_0101_0001
In each iteration, each block is split in two, the rightmost bit of the leftmost half of the block moved to its final position, and a mask applied so only the required bits remain.
Once you have spaced the inputs out, shifting them so the values of one fall into the zeros of the other is easy.
To extend that technique for more than two bits between values in the final result, you have to increase the shifts between where the bits end up. It gets a bit trickier, as the starting block size isn't a power of 2, so you could either split it down the middle, or on a power of 2 boundary.
So an evolution like this might work:
0000_0000_0000_0000_0000_0011_1111_1111
0000_0011_0000_0000_0000_0000_1111_1111
0000_0011_0000_0000_1111_0000_0000_1111
0000_0011_0000_1100_0011_0000_1100_0011
0000_1001_0010_0100_1001_0010_0100_1001
// 0000_0000_0000_0000_0000_0011_1111_1111
x = ( x | ( x << 16 ) ) & 0x030000ff;
// 0000_0011_0000_0000_0000_0000_1111_1111
x = ( x | ( x << 8 ) ) & 0x0300f00f;
// 0000_0011_0000_0000_1111_0000_0000_1111
x = ( x | ( x << 4 ) ) & 0x030c30c3;
// 0000_0011_0000_1100_0011_0000_1100_0011
x = ( x | ( x << 2 ) ) & 0x09249249;
// 0000_1001_0010_0100_1001_0010_0100_1001
Perform the same transformation on the inputs, shift one by one and another by two, or them together and you're done.
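The combining step can be sketched like this in Python. The masks and shifts are the ones from the evolution above; the names part1by2 and morton3 are only illustrative.

```python
def part1by2(x):
    """Spread the low 10 bits of x so that bit i lands at bit 3*i."""
    x &= 0x3FF
    x = (x | (x << 16)) & 0x030000FF
    x = (x | (x << 8)) & 0x0300F00F
    x = (x | (x << 4)) & 0x030C30C3
    x = (x | (x << 2)) & 0x09249249
    return x


def morton3(x, y, z):
    """x takes bits 0, 3, 6, ...; y takes bits 1, 4, 7, ...; z takes bits 2, 5, 8, ..."""
    return part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)


if __name__ == "__main__":
    print(bin(morton3(5, 3, 1)))
```

Which input ends up in the lowest bit of each triple is a convention; swap the shifts if your layout expects a different order.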
Good timing, I just did this last month!
The key was to make two functions. One spreads bits out to every-third bit.
Then we can combine three of them together (with a shift for the last two) to get the final Morton interleaved value.
This code interleaves starting at the HIGH bits (which is more logical for fixed point values.) If your application is only 10 bits per component, just shift each value left by 22 in order to make it start at the high bits.
/* Takes a value and "spreads" the HIGH bits to lower slots to separate them.
ie, bit 31 stays at bit 31, bit 30 goes to bit 28, bit 29 goes to bit 25, etc.
Anything below bit 21 just disappears. Useful for interleaving values
for Morton codes. */
inline unsigned long spread3(unsigned long x)
{
x=(0xF0000000&x) | ((0x0F000000&x)>>8) | (x>>16); // spread top 3 nibbles
x=(0xC00C00C0&x) | ((0x30030030&x)>>4);
x=(0x82082082&x) | ((0x41041041&x)>>2);
return x;
}
inline unsigned long morton(unsigned long x, unsigned long y, unsigned long z)
{
return spread3(x) | (spread3(y)>>1) | (spread3(z)>>2);
}
I took the above and modified it to combine 3 16-bit numbers into a 48- (really 64-) bit number. Perhaps it will save someone the small bit of thinking to get there.
#include <inttypes.h>
#include <assert.h>
uint64_t zorder3d(uint64_t x, uint64_t y, uint64_t z){
static const uint64_t B[] = {0x00000000FF0000FF, 0x000000F00F00F00F,
0x00000C30C30C30C3, 0X0000249249249249};
static const int S[] = {16, 8, 4, 2};
static const uint64_t MAXINPUT = 65536;
assert( ( (x < MAXINPUT) ) &&
( (y < MAXINPUT) ) &&
( (z < MAXINPUT) )
);
x = (x | (x << S[0])) & B[0];
x = (x | (x << S[1])) & B[1];
x = (x | (x << S[2])) & B[2];
x = (x | (x << S[3])) & B[3];
y = (y | (y << S[0])) & B[0];
y = (y | (y << S[1])) & B[1];
y = (y | (y << S[2])) & B[2];
y = (y | (y << S[3])) & B[3];
z = (z | (z << S[0])) & B[0];
z = (z | (z << S[1])) & B[1];
z = (z | (z << S[2])) & B[2];
z = (z | (z << S[3])) & B[3];
return ( x | (y << 1) | (z << 2) );
}
The following code finds the Morton number of the three 10-bit input numbers. It uses the idea from your link and performs the bit spreading in the steps 5-5, 3-2-3-2, 2-1-1-1-2-1-1-1, and 1-1-1-1-1-1-1-1-1-1 because 10 is not a power of two.
......................9876543210
............98765..........43210
........987....65......432....10
......98..7..6..5....43..2..1..0
....9..8..7..6..5..4..3..2..1..0
Above you can see the location of every bit before the first step and after each of the four steps.
public static Int32 GetMortonNumber(Int32 x, Int32 y, Int32 z)
{
return SpreadBits(x, 0) | SpreadBits(y, 1) | SpreadBits(z, 2);
}
public static Int32 SpreadBits(Int32 x, Int32 offset)
{
if ((x < 0) || (x > 1023))
{
throw new ArgumentOutOfRangeException();
}
if ((offset < 0) || (offset > 2))
{
throw new ArgumentOutOfRangeException();
}
x = (x | (x << 10)) & 0x000F801F;
x = (x | (x << 4)) & 0x00E181C3;
x = (x | (x << 2)) & 0x03248649;
x = (x | (x << 2)) & 0x09249249;
return x << offset;
}
Following is the code snippet to generate Morton key of size 64 bits for 3-D point.
#include <cstdio>
using namespace std;
unsigned long long spreadBits(unsigned long long x)
{
x=(x|(x<<20))&0x000001FFC00003FF;
x=(x|(x<<10))&0x0007E007C00F801F;
x=(x|(x<<4))&0x00786070C0E181C3;
x=(x|(x<<2))&0x0199219243248649;
x=(x|(x<<2))&0x0649249249249249;
x=(x|(x<<2))&0x1249249249249249;
return x;
}
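int main()
{
// Example coordinates (arbitrary values); each must fit in 21 bits.
unsigned long long x=1,y=2,z=3,con=1;
// Set bit 63, which the 3 x 21 = 63 interleaved bits leave free.
con=con<<63;
printf("%#llx\n",(spreadBits(x)|(spreadBits(y)<<1)|(spreadBits(z)<<2))|con);
return 0;
}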
I had a similar problem today, but instead of 3 numbers, I have to combine an arbitrary number of numbers of any bit length. I employed my own sort of bit spreading and masking algorithm and applied it to C# BigIntegers. Here is the code I wrote. As a compilation step, it figures out the magic numbers and mask for the given number of dimensions and bit depth. Then you can reuse the object for multiple conversions.
/// <summary>
/// Convert an array of integers into a Morton code by interleaving the bits.
/// Create one Morton object for a given pair of Dimension and BitDepth and reuse it when encoding multiple
/// Morton numbers.
/// </summary>
public class Morton
{
/// <summary>
/// Number of bits to use to represent each number being interleaved.
/// </summary>
public int BitDepth { get; private set; }
/// <summary>
/// Count of separate numbers to interleave into a Morton number.
/// </summary>
public int Dimensions { get; private set; }
/// <summary>
/// The MagicNumbers spread the bits out to the right position.
/// Each must be applied and masked, because the bits would overlap if we only used one magic number.
/// </summary>
public BigInteger LargeMagicNumber { get; private set; }
public BigInteger SmallMagicNumber { get; private set; }
/// <summary>
/// The mask removes extraneous bits that were spread into positions needed by the other dimensions.
/// </summary>
public BigInteger Mask { get; private set; }
public Morton(int dimensions, int bitDepth)
{
BitDepth = bitDepth;
Dimensions = dimensions;
BigInteger magicNumberUnit = new BigInteger(1UL << (int)(Dimensions - 1));
LargeMagicNumber = magicNumberUnit;
BigInteger maskUnit = new BigInteger(1UL << (int)(Dimensions - 1));
Mask = maskUnit;
for (var i = 0; i < bitDepth - 1; i++)
{
LargeMagicNumber = (LargeMagicNumber << (Dimensions - 1)) | (i % 2 == 1 ? magicNumberUnit : BigInteger.Zero);
Mask = (Mask << Dimensions) | maskUnit;
}
SmallMagicNumber = (LargeMagicNumber >> BitDepth) << 1; // Need to trim off pesky ones place bit.
}
/// <summary>
/// Interleave the bits from several integers into a single BigInteger.
/// The high-order bit from the first number becomes the high-order bit of the Morton number.
/// The high-order bit of the second number becomes the second highest-ordered bit in the Morton number.
///
/// How it works.
///
/// When you multiply by the magic numbers you make multiple copies of the number they are multiplying,
/// each shifted by a different amount.
/// As it turns out, the high order bit of the highest order copy of a number is N bits to the left of the
/// second bit of the second copy, and so forth.
/// This is because each copy is shifted one bit less than N times the copy number.
/// After that, you apply the AND-mask to unset all bits that are not in position.
///
/// Two magic numbers are needed because since each copy is shifted one less than the bitDepth, consecutive
/// copies would overlap and ruin the algorithm. Thus one magic number (LargeMagicNumber) handles copies 1, 3, 5, etc, while the
/// second (SmallMagicNumber) handles copies 2, 4, 6, etc.
/// </summary>
/// <param name="vector">Integers to combine.</param>
/// <returns>A Morton number composed of Dimensions * BitDepth bits.</returns>
public BigInteger Interleave(int[] vector)
{
if (vector == null || vector.Length != Dimensions)
throw new ArgumentException("Interleave expects an array of length " + Dimensions, "vector");
var morton = BigInteger.Zero;
for (var i = 0; i < Dimensions; i++)
{
morton |= (((LargeMagicNumber * vector[i]) & Mask) | ((SmallMagicNumber * vector[i]) & Mask)) >> i;
}
return morton;
}
public override string ToString()
{
return "Morton(Dimension: " + Dimensions + ", BitDepth: " + BitDepth
+ ", MagicNumbers: " + Convert.ToString((long)LargeMagicNumber, 2) + ", " + Convert.ToString((long)SmallMagicNumber, 2)
+ ", Mask: " + Convert.ToString((long)Mask, 2) + ")";
}
}
Here's a generator I've made in Ruby for producing encoding methods of arbitrary length:
def morton_code_for(bits)
method = ''
limit_mask = (1 << (bits * 3)) - 1
split = (2 ** ((Math.log(bits) / Math.log(2)).to_i + 1)).to_i
level = 1
puts "// Coding for 3 #{bits}-bit values"
loop do
shift = split
split /= 2
level *= 2
mask = ([ '1' * split ] * level).join('0' * split * 2).to_i(2) & limit_mask
expression = "v = (v | (v << %2d)) & 0x%016x;" % [ shift, mask ]
method << expression
puts "%s // 0b%064b" % [ expression, mask ]
break if (split <= 1)
end
puts
print "// Test of method results: "
v = (1 << bits) - 1
puts eval(method).to_s(2)
end
morton_code_for(21)
The output is suitably generic and can be adapted as required. Sample output:
// Coding for 3 21-bit values
v = (v | (v << 32)) & 0x7fff00000000ffff; // 0b0111111111111111000000000000000000000000000000001111111111111111
v = (v | (v << 16)) & 0x00ff0000ff0000ff; // 0b0000000011111111000000000000000011111111000000000000000011111111
v = (v | (v << 8)) & 0x700f00f00f00f00f; // 0b0111000000001111000000001111000000001111000000001111000000001111
v = (v | (v << 4)) & 0x30c30c30c30c30c3; // 0b0011000011000011000011000011000011000011000011000011000011000011
v = (v | (v << 2)) & 0x1249249249249249; // 0b0001001001001001001001001001001001001001001001001001001001001001
// Test of method results: 1001001001001001001001001001001001001001001001001001001001001

Counting, reversed bit pattern

I am trying to find an algorithm to count from 0 to 2^n-1 but with the bit pattern reversed. I care only about the n LSBs of a word. As you may have guessed, I failed.
For n=3:
000 -> 0
100 -> 4
010 -> 2
110 -> 6
001 -> 1
101 -> 5
011 -> 3
111 -> 7
You get the idea.
Answers in pseudo-code are great. Code fragments in any language are welcome; answers without bit operations are preferred.
Please don't just post a fragment without even a short explanation or a pointer to a source.
Edit: I forgot to add, I already have a naive implementation which just bit-reverses a count variable. In a sense, this method is not really counting.
This is, I think, easiest with bit operations, even though you said this wasn't preferred.
Assuming 32 bit ints, here's a nifty chunk of code that can reverse all of the bits without doing it in 32 steps:
unsigned int i;
i = (i & 0x55555555) << 1 | (i & 0xaaaaaaaa) >> 1;
i = (i & 0x33333333) << 2 | (i & 0xcccccccc) >> 2;
i = (i & 0x0f0f0f0f) << 4 | (i & 0xf0f0f0f0) >> 4;
i = (i & 0x00ff00ff) << 8 | (i & 0xff00ff00) >> 8;
i = (i & 0x0000ffff) << 16 | (i & 0xffff0000) >> 16;
i >>= (32 - n);
Essentially this does an interleaved shuffle of all of the bits. In each step, half of the bits in the value are swapped with the other half.
The last line is necessary to realign the result: after the full 32-bit reversal the n bits you care about sit at the top of the word, so they are shifted back down to the n least significant positions.
Shorter versions of this are possible if "n" is <= 16, or <= 8.
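For experimenting, the same swap sequence can be written in Python. This is only a sketch; the function name is made up, and the input is masked to 32 bits first because Python integers do not wrap around.

```python
def reverse_low_bits(i, n):
    """Reverse the n least significant bits of i, for 1 <= n <= 32."""
    i &= 0xFFFFFFFF
    i = ((i & 0x55555555) << 1) | ((i & 0xAAAAAAAA) >> 1)    # swap neighbouring bits
    i = ((i & 0x33333333) << 2) | ((i & 0xCCCCCCCC) >> 2)    # swap bit pairs
    i = ((i & 0x0F0F0F0F) << 4) | ((i & 0xF0F0F0F0) >> 4)    # swap nibbles
    i = ((i & 0x00FF00FF) << 8) | ((i & 0xFF00FF00) >> 8)    # swap bytes
    i = ((i & 0x0000FFFF) << 16) | ((i & 0xFFFF0000) >> 16)  # swap 16-bit halves
    return i >> (32 - n)                                     # realign to the low n bits


if __name__ == "__main__":
    print([reverse_low_bits(k, 3) for k in range(8)])
```

For n = 3 this produces the order 0, 4, 2, 6, 1, 5, 3, 7 from the question.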
At each step, find the leftmost 0 digit of your value. Set it, and clear all digits to the left of it. If you don't find a 0 digit, then you've overflowed: return 0, or stop, or crash, or whatever you want.
This is what happens on a normal binary increment (by which I mean it's the effect, not how it's implemented in hardware), but we're doing it on the left instead of the right.
Whether you do this in bit ops, strings, or whatever, is up to you. If you do it in bitops, then a clz (or call to an equivalent hibit-style function) on ~value might be the most efficient way: __builtin_clz where available. But that's an implementation detail.
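A minimal Python sketch of that step, assuming an n-bit value and wrapping to 0 on overflow. Python's int.bit_length() stands in for a clz instruction here, and the function name is illustrative.

```python
def reversed_increment(value, n):
    """Next value in bit-reversed counting order over n bits; wraps to 0."""
    width_mask = (1 << n) - 1
    zeros = ~value & width_mask        # 1 wherever value has a 0 within n bits
    if zeros == 0:
        return 0                       # all ones: overflow
    top = zeros.bit_length() - 1       # index of the leftmost 0 in value
    below = value & ((1 << top) - 1)   # keep the bits to the right of it
    return below | (1 << top)          # set it; everything to the left is dropped


if __name__ == "__main__":
    v = 0
    for _ in range(8):
        print(format(v, "03b"))
        v = reversed_increment(v, 3)
```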
This solution was originally in binary and converted to conventional math as the requester specified.
It would make more sense as binary, at least the multiply by 2 and divide by 2 should be << 1 and >> 1 for speed, the additions and subtractions probably don't matter one way or the other.
If you pass in mask instead of nBits, and use bitshifting instead of multiplying or dividing, and change the tail recursion to a loop, this will probably be the most performant solution you'll find since every other call it will be nothing but a single add, it would only be as slow as Alnitak's solution once every 4, maybe even 8 calls.
int incrementBizarre(int initial, int nBits)
    // in the 3 bit example, this should create 100
    // (^ means "to the power of" here, not XOR)
    mask = 2^(nBits-1)
    // This should only return true if the first bit (least significant in the reversed order) is not set
    // if initial is 011 and mask is 100
    // 3 < 4, bit is not set
    if(initial < mask)
        // If it was not, just set it and bail.
        return initial + mask // 011 (3) + 100 (4) = 111 (7)
    else
        // it was set, are we at the last bit (most significant in the reversed order) yet?
        // mask 100 (4) / 2 = 010 (2), 001/2 = 0 indicating overflow
        if(mask / 2) > 0
            // No, we weren't, so unset it (initial-mask) and increment the next bit
            return incrementBizarre(initial - mask, nBits - 1)
        else
            // Whoops, we were at the last bit. Error condition
            throw new OverflowedMyBitsException()
Wow, that turned out kinda cool. I didn't figure in the recursion until the last second there.
It feels wrong, as if some of the operations should not work, but they do because of the nature of what you are doing. For example, it feels like you should get into trouble when you are operating on a bit and some bits to the left are non-zero, but it turns out you can never be operating on a bit unless all the bits to the left are zero, which is a very strange condition, but true.
Example of flow to get from 110 to 001 (backwards 3 to backwards 4):
mask 100 (4), initial 110 (6); initial < mask=false; initial-mask = 010 (2), now try on the next bit
mask 010 (2), initial 010 (2); initial < mask=false; initial-mask = 000 (0), now inc the next bit
mask 001 (1), initial 000 (0); initial < mask=true; initial + mask = 001--correct answer
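The variant suggested earlier in this answer (pass the mask instead of nBits, shift instead of dividing, loop instead of recursing) might look like the following Python sketch. The function name and the choice of OverflowError are assumptions made for illustration.

```python
def increment_reversed(initial, top_mask):
    """top_mask is the value of the leftmost bit, e.g. 0b100 for 3 bits."""
    mask = top_mask
    while mask:
        if initial < mask:
            # The bit is clear (all bits to its left were already cleared): set it, done.
            return initial + mask
        initial -= mask   # the bit was set: clear it and carry to the right
        mask >>= 1
    raise OverflowError("all bits were already set")


if __name__ == "__main__":
    print(increment_reversed(0b110, 0b100))
```

Every second call returns after a single comparison and one addition, which is the cheap case the answer is counting on.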
Here's a solution from my answer to a different question that computes the next bit-reversed index without looping. It relies heavily on bit operations, though.
The key idea is that incrementing a number simply flips a sequence of least-significant bits, for example from nnnn0111 to nnnn1000. So in order to compute the next bit-reversed index, you have to flip a sequence of most-significant bits. If your target platform has a CTZ ("count trailing zeros") instruction, this can be done efficiently.
Example in C using GCC's __builtin_ctz:
void iter_reversed(unsigned bits) {
unsigned n = 1 << bits;
for (unsigned i = 0, j = 0; i < n; i++) {
printf("%x\n", j);
// Compute a mask of LSBs.
unsigned mask = i ^ (i + 1);
// Length of the mask.
unsigned len = __builtin_ctz(~mask);
// Align the mask to MSB of n.
mask <<= bits - len;
// XOR with mask.
j ^= mask;
}
}
Without a CTZ instruction, you can also use integer division:
void iter_reversed(unsigned bits) {
unsigned n = 1 << bits;
for (unsigned i = 0, j = 0; i < n; i++) {
printf("%x\n", j);
// Find least significant zero bit.
unsigned bit = ~i & (i + 1);
// Using division to bit-reverse a single bit.
unsigned rev = (n / 2) / bit;
// XOR with mask.
j ^= (n - 1) & ~(rev - 1);
}
}
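The mask-flipping idea can also be sketched as a Python generator. Here int.bit_length() replaces the CTZ call, since the mask is a run of ones starting at bit 0. The early exit on the last iteration is an addition of this sketch, because Python rejects negative shift counts.

```python
def iter_reversed(bits):
    """Yield 0 .. 2**bits - 1 in bit-reversed order without reversing any value."""
    j = 0
    for i in range(1 << bits):
        yield j
        mask = i ^ (i + 1)            # ones over the low bits that i -> i + 1 flips
        length = mask.bit_length()    # how many bits were flipped
        if length > bits:             # i was the last index, nothing left to compute
            break
        j ^= mask << (bits - length)  # flip the same number of most significant bits


if __name__ == "__main__":
    print(list(iter_reversed(3)))
```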
#include <stdio.h>
void reverse(int nMaxVal, int nBits)
{
int thisVal, bit, out;
// Calculate for each value from 0 to nMaxVal.
for (thisVal=0; thisVal<=nMaxVal; ++thisVal)
{
out = 0;
// Shift each bit from thisVal into out, in reverse order.
for (bit=0; bit<nBits; ++bit)
out = (out<<1) + ((thisVal>>bit) & 1);
// Print inside the loop, once per value.
printf("%d -> %d\n", thisVal, out);
}
}
Maybe increment from 0 to N (the "usual" way) and do ReverseBitOrder() for each iteration. You can find several implementations here (I like the LUT one the best).
Should be really quick.
Here's an answer in Perl. You don't say what comes after the all ones pattern, so I just return zero. I took out the bitwise operations so that it should be easy to translate into another language.
sub reverse_increment {
my($n, $bits) = @_;
my $carry = 2**$bits;
while($carry > 1) {
$carry /= 2;
if($carry > $n) {
return $carry + $n;
} else {
$n -= $carry;
}
}
return 0;
}
Here's a solution which doesn't actually try to do any addition, but exploits the on/off pattern of the sequence (the most significant bit alternates every time, the next most significant bit alternates every other time, etc.). Adjust n as desired:
#define FLIP(x, i) do { (x) ^= (1 << (i)); } while(0)
int main() {
int n = 3;
int max = (1 << n);
int x = 0;
for(int i = 1; i <= max; ++i) {
std::cout << x << std::endl;
/* if n == 3, this next part is functionally equivalent to this:
*
* if((i % 1) == 0) FLIP(x, n - 1);
* if((i % 2) == 0) FLIP(x, n - 2);
* if((i % 4) == 0) FLIP(x, n - 3);
*/
for(int j = 0; j < n; ++j) {
if((i % (1 << j)) == 0) FLIP(x, n - (j + 1));
}
}
}
How about adding 1 to the most significant bit, then carrying to the next (less significant) bit, if necessary. You could speed this up by operating on bytes:
1. Precompute a lookup table for counting in bit-reverse from 0 to 255 (00000000 -> 10000000, 10000000 -> 01000000, ..., 11111111 -> 00000000).
2. Set all bytes in your multi-byte number to zero.
3. Increment the most significant byte using the lookup table. If the byte is now 0, increment the next byte using the lookup table. If that byte is 0 as well, increment the next byte, and so on.
4. Go to step 3.
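A small Python sketch of the byte-wise scheme, assuming the number is stored as a list of byte values with the most significant byte first. The names are hypothetical.

```python
def reversed_increment_byte(b):
    """Bit-reversed increment of one byte: carry runs from bit 7 down to bit 0."""
    mask = 0x80
    while mask and b & mask:
        b ^= mask      # the bit was set: clear it and carry to the right
        mask >>= 1
    return b | mask    # set the first clear bit (mask is 0 if the byte was 0xFF)


# 256-entry table: TABLE[b] is the byte that follows b in bit-reversed order.
TABLE = [reversed_increment_byte(b) for b in range(256)]


def increment_bytes(number):
    """Advance a multi-byte counter in place; number[0] is the most significant byte."""
    for k in range(len(number)):
        number[k] = TABLE[number[k]]
        if number[k] != 0:   # no wrap in this byte, so no carry into the next one
            break
    return number


if __name__ == "__main__":
    counter = [0, 0]
    for _ in range(4):
        print(["{:08b}".format(b) for b in counter])
        increment_bytes(counter)
```

Most calls touch only the first byte; the next byte is visited once every 256 increments.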
With n as your power of 2 and x the variable you want to step:
(defun inv-step (x n) ; the following is a function declaration
"returns a bit-inverse step of x, bounded by 2^n" ; documentation
(do ((i (expt 2 (- n 1)) ; loop, init of i
(/ i 2)) ; stepping of i
(s x)) ; init of s as x
((not (integerp i)) ; breaking condition
s) ; returned value if all bits are 1 (is 0 then)
(if (< s i) ; the loop's body: if s < i
(return-from inv-step (+ s i)) ; -> add i to s and return the result
(decf s i)))) ; else: reduce s by i
I commented it thoroughly as you may not be familiar with this syntax.
edit: here is the tail recursive version. It seems to be a little faster, provided that you have a compiler with tail call optimization.
(defun inv-step (x n)
(let ((i (expt 2 (- n 1))))
(cond ((= n 1)
(if (zerop x) 1 0)) ; this is really (logxor x 1)
((< x i)
(+ x i))
(t
(inv-step (- x i) (- n 1))))))
When you count from 0 to 2^n-1 with the bit pattern reversed, you still cover every value in the range 0 to 2^n-1 exactly once, so the sum is the same as for normal counting:
Sum = 2^n * (2^n-1)/2
That is an O(1) operation. No need to do bit reversals.
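A brute-force check of that reasoning in Python: the bit-reversed values are a permutation of 0 .. 2^n-1, so their sum should equal 2^n * (2^n-1)/2. The helpers are illustrative and use string reversal for clarity, not speed.

```python
def bit_reverse(value, n):
    """Reverse the n-bit binary representation of value."""
    return int(format(value, "0{}b".format(n))[::-1], 2)


def reversed_sum(n):
    """Sum of all n-bit values visited in bit-reversed order."""
    return sum(bit_reverse(v, n) for v in range(1 << n))


if __name__ == "__main__":
    for n in range(1, 9):
        assert reversed_sum(n) == (1 << n) * ((1 << n) - 1) // 2
    print(reversed_sum(3))
```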
Edit: Of course the original poster's question was about incrementing by a (reversed) one, which is simpler than adding two arbitrary values. So nwellnhof's answer already contains the algorithm.
Summing two bit-reversal values
Here is one solution in php:
function RevSum ($a,$b) {
// loop until our adder, $b, is zero
while ($b) {
// get carry (aka overflow) bit for every bit-location by AND-operation
// 0 + 0 --> 00 no overflow, carry is "0"
// 0 + 1 --> 01 no overflow, carry is "0"
// 1 + 0 --> 01 no overflow, carry is "0"
// 1 + 1 --> 10 overflow! carry is "1"
$c = $a & $b;
// do 1-bit addition for every bit location at once by XOR-operation
// 0 + 0 --> 00 result = 0
// 0 + 1 --> 01 result = 1
// 1 + 0 --> 01 result = 1
// 1 + 1 --> 10 result = 0 (ignored that "1", already taken care above)
$a ^= $b;
// now: shift carry bits to the next bit-locations to be added to $a in
// next iteration.
// PHP_INT_MAX here is used to ensure that the most-significant bit of the
// $b will be cleared after shifting. see link in the side note below.
$b = ($c >> 1) & PHP_INT_MAX;
}
return $a;
}
Side note: See this question about shifting negative values.
And as for test; start from zero and increment value by 8-bit reversed one (10000000):
$value = 0;
$add = 0x80; // 10000000 <-- "one" as bit reversed
for ($count = 20; $count--;) { // loop 20 times
printf("%08b\n", $value); // show value as 8-bit binary
$value = RevSum($value, $add); // do addition
}
... will output:
00000000
10000000
01000000
11000000
00100000
10100000
01100000
11100000
00010000
10010000
01010000
11010000
00110000
10110000
01110000
11110000
00001000
10001000
01001000
11001000
Let's assume the number is 11101010 and our task is to find the next one.
1) Find the zero at the highest position and mark its position as index.
11101010 (4th position, so index = 4)
2) Set all bits at positions higher than index to zero.
00001010
3) Change the zero found in step 1) to '1'.
00011010
That's it. This should be very fast, since most CPUs have instructions that perform these steps efficiently. Here is a C++ implementation which increments a 64-bit number in reversed pattern.
#include <intrin.h>
unsigned __int64 reversed_increment(unsigned __int64 number)
{
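unsigned long index; // _BitScanReverse64 writes the bit index here
unsigned __int64 result; // must be 64 bits wide; unsigned long is only 32 bits with MSVC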
_BitScanReverse64(&index, ~number); // returns index of the highest '1' on bit-reverse number (trick to find the highest '0')
result = _bzhi_u64(number, index); // set to '0' all bits at number higher than index position
result |= (unsigned __int64) 1 << index; // changes to '1' bit on index position
return result;
}
This does not meet your requirement of using no bit operations, but I fear there is no way to achieve something similar without them.

Resources