Bitwise comparison of binary representation of two numbers - ruby

I was looking for a way to flip all the bits in an arbitrary-sized number (i.e., one with an arbitrary number of bits), and thought of just negating it. When I printed out
p ~0b1010 == 0b0101
it said false. I'm probably using the wrong comparison operator, though. What's the proper way to check that two binary numbers are equal in representation?

One's complement (Ruby's ~) does not flip all the bits the way you expect, because Ruby integers have no fixed width.
To flip bits, you need to XOR with a mask that has a 1 in every bit position you want flipped.
Also, you can't negate a binary number of arbitrary width; you need to define the number of bits you are flipping. This example shows why:
> 0b000001 ^ 0b1
=> 0
> 0b000001 ^ 0b11
=> 2
> 0b000001 ^ 0b111
=> 6
> 0b000001 ^ 0b1111
=> 14
What you can do is define the "arbitrary number of bits" as the minimum number of bits needed to represent your number. This is most likely not what you want; however, the following code can do it for you:
def negate_arbitrary_number(x)
  # size is the number of significant bits in x
  size = 0
  while (x >> size) != 0
    size += 1
  end
  # mask is a run of `size` 1 bits
  mask = ("1" * size).to_i(2)
  # XOR flips every significant bit
  x ^ mask
end
or this code:
def negate_arbitrary_number(x)
  # flip each character of the binary string, then parse it back to an integer
  x.to_s(2).unpack("U*").map { |c| c == 49 ? 48 : 49 }.pack("U*").to_i(2)
end
You might want to run a simple benchmark to see which is faster.
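For comparison, the same mask trick is a one-liner in Python, where int.bit_length() gives the number of significant bits (a sketch of the same idea, not part of the original answer):

def negate_arbitrary_number(x: int) -> int:
    # XOR with a run of bit_length() ones flips every significant bit
    return x ^ ((1 << x.bit_length()) - 1)

assert negate_arbitrary_number(0b1010) == 0b0101  # 10 -> 5, as in the question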

Related

Binary problem: how to find all integers A that conform to an integer B

I want to write an algorithm (in Python) that gets all the integers A that conform to another integer B, written in binary.
A conforms to B when, in all positions where B has a bit set to 1, A has the corresponding bit set to 1 as well.
For example:
If we have 1001, the conforming numbers are: 1111, 1011, 1101.
We can assume that the solution should work with very large numbers (so it has to be quite efficient).
I have thought about many solutions involving binary operations but I cannot get a complete solution.
Do you have any ideas?
As shown in your example:
An integer with z zero bits has 2**z conforming integers. We can subtract one, because one of these is the integer itself.
Accordingly, your algorithm has to count from 1 to 2**z - 1 and replace the z zero bits in the original integer with the z bits of your counter.
In python, you can use bitwise operators to test or change bit positions within an integer.
Examples for bitwise operations:
x & 1 returns 1 if the least-significant bit is set, otherwise 0
x = x | 4 will set the 3rd bit (the one corresponding to 4)
Sketch of your algorithm (a Python version follows below):
1. Loop through the integer to find and count the z zero bits
2. Loop a counter from 1 to 2**z - 1; for each counter value:
   scan the z bits of the counter,
   transfer them into the zero-bit positions of a copy of the original integer,
   and record/output the resulting conforming integer
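Here is a minimal Python sketch of that algorithm (the function name and the explicit width parameter are my own choices):

def conforming_integers(b, width):
    # positions of B's zero bits within the given width
    zero_positions = [i for i in range(width) if not (b >> i) & 1]
    z = len(zero_positions)
    for counter in range(1, 2**z):        # 1 .. 2**z - 1
        a = b
        for j, pos in enumerate(zero_positions):
            if (counter >> j) & 1:        # transfer counter bit j
                a |= 1 << pos             # into zero-bit position pos
        yield a

print([bin(a) for a in conforming_integers(0b1001, 4)])
# ['0b1011', '0b1101', '0b1111']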

Creating hash function to map 6 numbers to a short string

I have 6 variables 0 ≤ n₁,...,n₆ ≤ 12 and I'd like to build a hash function to do the direct mapping D(n₁,n₂,n₃,n₄,n₅,n₆) = S and another function to do the inverse mapping I(S) = (n₁,n₂,n₃,n₄,n₅,n₆), where S is a string (a-z, A-Z, 0-9).
My goal is to minimize the length of S to 3 characters or fewer.
I thought that, as the variables have 13 possible values, a single letter (a-z) should be able to represent 2 of them, but I realized that 1 + 12 and 2 + 11 both give 13 (the letter 'm'), so I still don't know how to write such a function.
Is there any approach to build a function that does this mapping and returns a small string?
Using the whole ASCII range to represent S is an option if necessary.
You can convert a set of numbers in any given range to numbers in any other range using base conversion.
Binary is base 2 (0-1), decimal is base 10 (0-9). Your 6 numbers are base 13 (0-12).
Checking whether a conversion would be possible involves counting the number of possible combinations of values for each set. With each number in the range [0,n] (thus base n+1), we can go from all 0's to all n's, so each number can take on n+1 values and the total number of possibilities is (n+1)^numberCount. For 6 decimal digits, for example, it would be 10^6 = 1000000, which checks out, since there are 1000000 possible numbers with (at most) 6 digits, i.e. numbers < 1000000.
Lower- and uppercase letters and numbers (26+26+10) would be base 62 (0-61), but, following from the above, 3 such values would be insufficient to represent your 6 numbers (13^6 > 62^3). To do conversion from/to these, you can do the conversion to a set of base-62 numbers, then have appropriate if-statements to convert 0-9 <=> 0-9, a-z <=> 10-35, A-Z <=> 36-61.
You can represent your data in 3 bytes (since 256^3 >= 13^6), although these wouldn't necessarily be printable characters - 32-126 is considered the standard printable range (which is still too small of a range), 128-255 is the extended range and may not be displayed properly in any given environment (to give the best chance of properly displaying it, you should at least avoid 0-31 and 127, which are control characters - you can convert 0-... to the above ranges by adding 32 and then adding another 1 if the value is >= 127).
Many / most languages should allow you to give a numeric value to represent a character, so it should be fairly simple to output it once you do the base conversion. Although some may use Unicode to represent characters, which could make it a bit less trivial to work with ASCII.
If the numbers had specific constraints, that would reduce the number of possible combinations, thus possibly making it fit into a smaller set or range of numbers.
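To make the counting concrete, the relevant inequalities check out (a quick verification, not part of the original answer):

assert 13**6 == 4826809         # combinations of six values in [0, 12]
assert 62**3 < 13**6 < 62**4    # three base-62 characters are too few, four are enough
assert 256**3 > 13**6           # three full bytes suffice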
To do the actual base conversion:
It might be simplest to first convert it to a regular integral type (typically binary or decimal), where we don't have to worry about the base, and then convert it to the target base (although first make sure your value will fit in whichever data type you're using).
Consider how binary works:
1101 is 13 = 2^3 + 2^2 + 2^0
We can extract the digits with repeated division:
13 % 2 = 1    13 / 2 = 6
 6 % 2 = 0     6 / 2 = 3
 3 % 2 = 1     3 / 2 = 1
 1 % 2 = 1
Reading the remainders from bottom to top gives 1101, our number.
Using the same idea, we can convert to/from any base as follows: (pseudo-code)
int convertFromBase(array, base):
    output = 0
    for each i in array
        output = base*output + i
    return output

int[] convertToBase(num, base):
    output = []
    while num > 0
        output.append(num % base)
        num /= base
    output.reverse()
    return output
You can also extend this logic to situations where each number is in a different range by changing what you divide or multiple by at each step (a detailed explanation of that is perhaps a bit beyond the scope of the question).
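Putting the pseudo-code to work on the actual question, here is a Python sketch (the helper names encode6/decode6 and ALPHABET are mine; note that four base-62 characters are needed, since three are insufficient as shown above):

import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # base 62

def encode6(digits):
    # convertFromBase with base 13, then convertToBase with base 62
    n = 0
    for d in digits:
        n = 13 * n + d
    out = []
    while n > 0:
        out.append(ALPHABET[n % 62])
        n //= 62
    return "".join(reversed(out)) or ALPHABET[0]

def decode6(s):
    n = 0
    for c in s:
        n = 62 * n + ALPHABET.index(c)
    digits = []
    for _ in range(6):
        digits.append(n % 13)
        n //= 13
    return list(reversed(digits))

assert decode6(encode6([1, 2, 3, 4, 5, 12])) == [1, 2, 3, 4, 5, 12]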
I thought as the variables have 13 possible values, a single letter
(a-z) should be able to represent 2 of them
This reasoning is wrong. In fact, to represent two variables (i.e., any combination these variables might take) you will need 13×13 = 169 symbols.
In your example, the 6 variables can take 13^6 (=4826809) different combinations. In order to represent all possible combinations with letters (a-z), you will need 5 of them, since 26^5 (=11881376) is the smallest power of 26 that yields more than 13^6 combinations.
With full 8-bit characters, 3 symbols suffice, since 256^3 > 13^6.
If you are still interested in code that does the conversion, I will be happy to help.

Check the Number of Powers of 2

I have a number N; I want to count how many factors of 2 make up the largest power of 2 that fits in it.
For example:
N=7: the answer is 2, i.e. 2*2
N=20: the answer is 4, i.e. 2*2*2*2
Similarly, I want to find the next power of 2.
For example:
N=14: the answer is 16
Is there any bit hack for this, without using for loops?
Something like the one-line check for whether a number is a power of 2, X & (X-1) == 0?
GCC has a built-in function called __builtin_clz() that returns the number of leading zeros in an integer. So, for example, assuming a 32-bit int, the expression p = 32 - __builtin_clz(n) tells you how many bits are needed to store the integer n, and 1 << p gives you the next highest power of 2 (provided p < 32, of course).
There are also equivalent functions that work with long and long long integers.
Alternatively, math.h defines a function called frexp() that returns the base-2 exponent of a double-precision number. This is likely to be less efficient because your integer will have to be converted to a double-precision value before it is passed to this function.
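In Python, int.bit_length() plays the same role as 32 - __builtin_clz(n), so both parts of the question become one-liners (a sketch of the same idea, not from the answer; the function names are mine):

def prev_power_of_2(n):
    # largest power of 2 <= n, e.g. 20 -> 16 (the 2*2*2*2 from the question)
    return 1 << (n.bit_length() - 1)

def next_power_of_2(n):
    # smallest power of 2 > n, e.g. 14 -> 16
    return 1 << n.bit_length()

assert prev_power_of_2(20) == 16 and next_power_of_2(14) == 16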
A number is a power of two if it has only a single '1' in its binary value. For example, 2 = 00000010, 4 = 00000100, 8 = 00001000, and so on. So you can check it by counting the number of 1's in its bit value: if the count is 1, the number is a power of 2, and vice versa.
You can take help from here and here to avoid for loops when counting set bits.
If the count is not 1 (meaning the value is not a power of 2), then take the position of its first set bit from the MSB; the next power of 2 is the value having only a set bit at position + 1. For example, number 3 = 00000011. Its first set bit from the MSB is the 2nd bit, therefore the next power of 2 is the value with only the 3rd bit set, i.e. 00000100 = 4.
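For illustration, here is what the bit-counting test looks like in Python (my sketch; bin(n).count("1") stands in for the set-bit counting tricks linked above):

def is_power_of_2(n):
    # a power of 2 has exactly one set bit; equivalent to n & (n - 1) == 0
    return n > 0 and bin(n).count("1") == 1

assert is_power_of_2(8) and not is_power_of_2(14)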

Google Protocol Buffers: ZigZag Encoding

From "Signed Types" on Encoding - Protocol Buffers - Google Code:
ZigZag encoding maps signed integers to unsigned integers so that numbers with a small absolute value (for instance, -1) have a small varint encoded value too. It does this in a way that "zig-zags" back and forth through the positive and negative integers, so that -1 is encoded as 1, 1 is encoded as 2, -2 is encoded as 3, and so on, as you can see in the following table:
Signed Original     Encoded As
 0                   0
-1                   1
 1                   2
-2                   3
 2147483647          4294967294
-2147483648          4294967295
In other words, each value n is encoded using
(n << 1) ^ (n >> 31)
for sint32s, or
(n << 1) ^ (n >> 63)
for the 64-bit version.
How does (n << 1) ^ (n >> 31) equal what's in the table? I understand that would work for positives, but how does that work for, say, -1? Wouldn't -1 be 1111 1111, and (n << 1) be 1111 1110? (Is bit-shifting on negatives well formed in any language?)
Nonetheless, using the formula and doing (-1 << 1) ^ (-1 >> 31), assuming a 32-bit int, I get 1111 1111, which is 4 billion, whereas the table thinks I should have 1.
Shifting a negative signed integer to the right copies the sign bit, so that
(-1 >> 31) == -1
Then,
(-1 << 1) ^ (-1 >> 31) = -2 ^ -1
= 1
This might be easier to visualise in binary (8 bit here):
(-1 << 1) ^ (-1 >> 7) = 11111110 ^ 11111111
= 00000001
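You can check this arithmetic directly in Python, whose integers behave like infinitely wide two's complement values (a quick verification I'm adding, not part of the original answer):

n = -1
assert (n << 1) ^ (n >> 31) == 1              # -2 ^ -1 == 1
n = 2147483647
assert (n << 1) ^ (n >> 31) == 4294967294
n = -2147483648
assert (n << 1) ^ (n >> 31) == 4294967295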
Another way to think about zig zag mapping is that it is a slight twist on a sign and magnitude representation.
In zig zag mapping, the least significant bit (lsb) of the mapping indicates the sign of the value: if it's 0, then the original value is non-negative, if it's 1, then the original value is negative.
Non-negative values are simply left shifted one bit to make room for the sign bit in the lsb.
For negative values, you could do the same one bit left shift for the absolute value (magnitude) of the number and simply have the lsb indicate the sign. For example, -1 could map to 0x03 or 0b00000011, where the lsb indicates that it is negative and the magnitude of 1 is left shifted by 1 bit.
The ugly thing about this sign and magnitude representation is "negative zero," mapped as 0x01 or 0b00000001. This variant of zero "uses up" one of our values and shifts the range of integers we can represent by one. We probably want to special case map negative zero to -2^63, so that we can represent the full 64b 2's complement range of [-2^63, 2^63). That means we've used one of our valuable single byte encodings to represent a value that will very, very, very rarely be used in an encoding optimized for small magnitude numbers and we've introduced a special case, which is bad.
This is where zig zag's twist on this sign and magnitude representation happens. The sign bit is still in the lsb, but for negative numbers, we subtract one from the magnitude rather than special casing negative zero. Now, -1 maps to 0x01 and -2^63 has a non-special case representation too (i.e. - magnitude 2^63 - 1, left shifted one bit, with lsb / sign bit set, which is all bits set to 1s).
So, another way to think about zig zag encoding is that it is a smarter sign and magnitude representation: the magnitude is left shifted one bit, the sign bit is stored in the lsb, and 1 is subtracted from the magnitude of negative numbers.
It is faster to implement these transformations using the unconditional bit-wise operators that you posted rather than explicitly testing the sign, special case manipulating negative values (e.g. - negate and subtract 1, or bitwise not), shifting the magnitude, and then explicitly setting the lsb sign bit. However, they are equivalent in effect and this more explicit sign and magnitude series of steps might be easier to understand what and why we are doing these things.
I will warn you though that bit shifting signed values in C / C++ is not portable and should be avoided. Left shifting a negative value has undefined behavior and right shifting a negative value has implementation defined behavior. Even left shifting a positive integer can have undefined behavior (e.g. - if you shift into the sign bit it might cause a trap or something worse). So, in general, don't bit shift signed types in C / C++. "Just say no."
Cast first to the unsigned version of the type to have safe, well-defined results according to the standards. This does mean that you then won't have arithmetic shift of negative values (i.e. - dragging the sign bit to the right) -- only logical shift, so you need to adjust the logic to account for that.
Here are the safe and portable versions of the zig zag mappings for 2's complement 64b integers in C:
#include <stdint.h>

uint64_t zz_map( int64_t x )
{
    return ( ( uint64_t ) x << 1 ) ^ -( ( uint64_t ) x >> 63 );
}

int64_t zz_unmap( uint64_t y )
{
    return ( int64_t ) ( ( y >> 1 ) ^ -( y & 0x1 ) );
}
Note the arithmetic negation of the sign bit in the right hand term of the XORs. That yields either 0 for non-negatives or all 1's for negatives -- just like arithmetic shift of the sign bit from msb to lsb would do. The XOR then effectively "undoes" / "redoes" the 2's complementation minus 1 (i.e. - 1's complementation or logical negation) for negative values without any conditional logic or further math.
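If you want the same unsigned behaviour in a language without fixed-width integers, you can mask to 64 bits. Here is a Python translation of zz_map along those lines (my own sketch, not from the answer; zz_map_py is a hypothetical name):

M = (1 << 64) - 1                    # 64-bit mask

def zz_map_py(x):
    ux = x & M                       # reinterpret as unsigned 64-bit
    return ((ux << 1) & M) ^ (-(ux >> 63) & M)

assert zz_map_py(-1) == 1 and zz_map_py(1) == 2 and zz_map_py(-2) == 3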
Let me add my two cents to the discussion. As other answers noted, the zig-zag encoding can be thought of as a sign-magnitude twist. This fact can be used to implement conversion functions that work for arbitrary-sized integers.
For example, I use the following code in one of my Python projects:
def zigzag(x: int) -> int:
    return x << 1 if x >= 0 else (-x - 1) << 1 | 1

def zagzig(x: int) -> int:
    assert x >= 0
    sign = x & 1
    return -(x >> 1) - 1 if sign else x >> 1
These functions work even though Python's int has no fixed bit width; instead, it extends dynamically. However, this approach may be inefficient in compiled languages, since it requires conditional branching.
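As a quick sanity check (my addition, using the functions above), the mappings round-trip and reproduce the table:

for v in (0, -1, 1, -2, 2, 2**63 - 1, -2**63):
    assert zagzig(zigzag(v)) == v
assert [zigzag(v) for v in (0, -1, 1, -2)] == [0, 1, 2, 3]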

How to compute one's complement using Ruby's bitwise operators?

What I want:
assert_equal 6, ones_complement(9) # 1001 => 0110
assert_equal 0, ones_complement(15) # 1111 => 0000
assert_equal 2, ones_complement(1) # 01 => 10
The size of the input isn't fixed at 4 or 8 bits; rather, it's a binary stream.
What I see:
v = "1001".to_i(2) => 9
There's a bit flipping operator ~
(~v).to_s(2) => "-1010"
sprintf("%b", ~v) => "..10110"
~v => -10
I think it's got something to do with one bit being used to store the sign. Can someone explain this output? And how do I get a one's complement without resorting to string manipulation, like cutting the last n chars from the sprintf output to get "0110", or replacing 0 with 1 and vice versa?
Ruby just stores a (signed) number. The internal representation of this number is not relevant: it might be a Fixnum, Bignum or something else. Therefore, the number of bits in a number is also undefined: it is just a number, after all. This is contrary to, for example, C, where an int will probably be 32 bits (fixed).
So what does the ~ operator do then? Well, just something like:
class Numeric
  def ~
    return -self - 1
  end
end
...since that's what '~' represents when looking at 2's complement numbers.
So what is missing from your problem statement is the number of bits you want to flip: a 32-bit ~ is different from the generic ~ Ruby gives you.
Now if you just want to bit-flip n-bits you can do something like:
class Numeric
  def ones_complement(bits)
    self ^ ((1 << bits) - 1)
  end
end
...but you do have to specify the number of bits to flip. And this won't affect the sign flag, since that one is outside your reach with XOR :)
It sounds like you only want to flip four bits (the length of your input) - so you probably want to XOR with 1111.
See this question for why.
One problem with your method is that your expected answer is only true if you only flip the four significant bits: 1001 -> 0110.
But the number is stored with leading zeros, and the ~ operator flips all the leading bits too: 00001001 -> 11110110. Then the leading 1 is interpreted as the negative sign.
You really need to specify what the function is supposed to do with numbers like 0b101 and 0b11011 before you can decide how to implement it. If you only ever want to flip 4 bits you can do v^0b1111, as suggested in another answer. But if you want to flip all significant bits, it gets more complicated.
edit
Here's one way to flip all the significant bits:
def maskbits n
  # smear the highest set bit of n rightward, doubling the shift
  # each pass, until the mask covers all of n's significant bits
  b = 1
  prev = n
  mask = prev | (prev >> 1)
  while mask != prev
    prev = mask
    mask |= (mask >> (b *= 2))
  end
  mask
end

def ones_complement n
  n ^ maskbits(n)
end
This gives
p ones_complement(9).to_s(2) #>>"110"
p ones_complement(15).to_s(2) #>>"0"
p ones_complement(1).to_s(2) #>>"0"
This does not give your desired output for ones_complement(1), because it treats 1 as "1", not "01". I don't know how the function could infer how many leading zeros you want without taking the width as an argument.
If you're working with strings you could do:
s = "0110"
s.gsub(/[01]/) { |bit| bit == "1" ? "0" : "1" }  # or simply s.tr("01", "10")
If you're working with numbers, you'll have to define the number of significant bits, because:
with 4 bits, 0110 = 6 complements to 1001 = 9;
with 3 bits, 110 = 6 complements to 001 = 1.
Even ignoring the sign, you'll have to handle this.
What you are doing (using the ~ operator) is indeed a one's complement. You are getting those values that you are not expecting because of the way the number is interpreted by Ruby.
What you actually need to do will depend on what you are using this for. That is to say, why do you need a 1's complement?
Remember that you are getting the one's complement right now with ~ if you pass in a Fixnum: the number of bits which represent the number is a fixed quantity in the interpreter and thus there are leading 0's in front of the binary representation of the number 9 (binary 1001). You can find this number of bits by examining the size of any Fixnum. (the answer is returned in bytes)
1.size #=> 4
2147483647.size #=> 4
~ is also defined over Bignum. In this case it behaves as if all of the bits which are specified in the Bignum were inverted, and then if there were an infinite string of 1's in front of that Bignum. You can, conceivably shove your bitstream into a Bignum and invert the whole thing. You will however need to know the size of the bitstream prior to inversion to get a useful result out after it is inverted.
To answer the question as you pose it right off the bat, you can find the largest power of 2 less than your input, double it, subtract 1, then XOR the result of that with your input and always get a ones complement of just the significant bits in your input number.
def sig_ones_complement(num)
  significant_bits = num.to_s(2).length
  next_smallest_pow_2 = 2**(significant_bits - 1)
  xor_mask = (2 * next_smallest_pow_2) - 1
  return num ^ xor_mask
end
