I have two integers that I need to pass through one integer and then get the values of two integers back.
I am thinking of using Logic Operators (AND, OR, XOR, etc) .
Using the C programming language, it could be done as follows, assuming that each of the two integers fits in 16 bits (that is, is at most 65535).
void take2IntegersAsOne(int x)
{
// int1 is stored in the bottom half of x, so take just that part.
int int1 = x & 0xFFFF;
// int2 is stored in the top half of x, so slide that part of the number
// into the bottom half, and take just that part.
int int2 = (x >> 16) & 0xFFFF;
// use int1 and int2 here. They must both be at most 0xFFFF, or 65535 in decimal
}
void pass2()
{
int int1 = 345;
int int2 = 2342;
take2IntegersAsOne( int1 | (int2 << 16) );
}
This relies on an int being stored in 4 bytes, which is true on most current platforms but is not guaranteed by the C standard. So, the example uses the lower two bytes to store one of the integers, and the upper two bytes for the second. This does impose the limit, though, that each of the integers must have a small enough value to fit into just 2 bytes.
The shift operators << and >> are used to slide the bits of an integer up and down. Shifting by 16 moves the bits by two bytes (as there are 8 bits per byte).
0xFFFF represents the bit pattern where all of the bits in the lower two bytes of the number are 1s. So, ANDing (with the & operator) causes all the bits that are not in these bottom two bytes to be switched off (back to zero). This can be used to remove any parts of the 'other integer' from the one you're currently extracting.
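As a rough sketch of the same idea with fixed-width unsigned types, which avoids depending on the size of int or on shifting into a sign bit. The names pack16, unpack_lo and unpack_hi are made up for this illustration:

```c
#include <stdint.h>

/* Pack two 16-bit values into one 32-bit word:
   lo goes into bits 0-15, hi goes into bits 16-31. */
uint32_t pack16(uint16_t lo, uint16_t hi)
{
    return (uint32_t)lo | ((uint32_t)hi << 16);
}

/* Keep only the bottom 16 bits. */
uint16_t unpack_lo(uint32_t packed)
{
    return (uint16_t)(packed & 0xFFFFu);
}

/* Slide the top 16 bits down into the bottom half. */
uint16_t unpack_hi(uint32_t packed)
{
    return (uint16_t)(packed >> 16);
}
```

Because the parameters are uint16_t, values that do not fit in 16 bits are truncated at the call rather than spilling into the other half.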
There are two parts to this question. First, how do you bitmask two 32-bit Integers into a 64-bit Long Integer?
As others have stated, let's say I have a function that takes an X and Y coordinate, and returns a longint representing that Point's linear value. I tend to call this linearization of 2d data:
public long asLong(int x, int y) {
// Mask y to 32 bits so a negative y does not sign-extend over x's half.
return ( ((long)x) << 32 ) | (y & 0xFFFFFFFFL);
}
public int getX(long location) {
return (int)((location >> 32) & 0xFFFFFFFFL);
}
public int getY(long location) {
return (int)(location & 0xFFFFFFFFL);
}
Forgive me if I'm paranoid about order of operations, sometimes other operations are greedier than <<, causing things to shift further than they should.
Why does this work? When might it fail?
It's convenient that integers tend to be exactly half the size of longints. What we're doing is casting x to a long, shifting it left until it sits entirely to the left of y, and then doing a union operation (OR) to combine the bits of both.
Let's pretend they're 4-bit numbers being combined into an 8-bit number:
x = 14 : 1110
y = 5 : 0101
x = x << 4 : 1110 0000
p = x | y : 1110 0000
OR 0101
---------
1110 0101
Meanwhile, the reverse:
p = 229 : 1110 0101
x = p >> 4 : 1111 1110 //depending on your language and data type, sign extension
//can cause the bits to smear on the left side as they're
//shifted, as shown here. Doesn't happen in unsigned types
x = x & 0xF:
1111 1110
AND 0000 1111
-------------
0000 1110 //AND selects only the bits we have in common
y = p & 0xF:
1110 0101
AND 0000 1111
-------------
0000 0101 //AND strikes again
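For comparison, here is a small C sketch of the same pack and unpack steps using unsigned types, so that no sign extension smears into the other half. The names pack_point, point_x and point_y are invented for the example, and converting an out-of-range uint32_t back to int32_t is implementation-defined in older C standards (it wraps on the usual two's complement compilers):

```c
#include <stdint.h>

/* x goes into the upper 32 bits, y into the lower 32 bits. */
uint64_t pack_point(int32_t x, int32_t y)
{
    /* Go through uint32_t first so a negative y cannot
       sign-extend into x's half of the result. */
    return ((uint64_t)(uint32_t)x << 32) | (uint32_t)y;
}

int32_t point_x(uint64_t p)
{
    return (int32_t)(uint32_t)(p >> 32);
}

int32_t point_y(uint64_t p)
{
    return (int32_t)(uint32_t)(p & 0xFFFFFFFFu);
}
```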
This sort of approach came into being a long time ago, in environments that needed to squeeze every bit out of their storage or transmission space. If you're not on an embedded system or immediately packing this data for transmission over a network, the practicality of this whole procedure starts to break down really rapidly:
It's way too much work just for boxing a return value that almost always immediately needs to be unboxed and read by the caller. That's kind of like digging a hole and then filling it in.
It greatly reduces your code readability. "What type is returned?" Uh... an int.. and another int... in a long.
It can introduce hard-to-trace bugs down the line. For instance, if you use unsigned types and ignore the sign extension, then later on migrate to a platform that causes those types to go two's complement. If you save off the longint, and try to read it later in another part of your code, you might hit an off-by-one error on the bitshift and spend an hour debugging your function only to find out it's the parameter that's wrong.
If it's so bad, what are the alternatives?
This is why people were asking you about your language. Ideally, if you're in something like C or C++, it'd be best to say
struct Point { int x; int y; };
struct Point getPosition(void) {
struct Point result = { 14,5 };
return result;
}
Otherwise, in HLLs like Java, you might wind up with an inner class to achieve the same functionality:
public class Example {
public class Point {
public int x;
public int y;
public Point(int x, int y) { this.x=x; this.y=y; }
}
public Point getPosition() {
return new Point(14,5);
}
}
In this case, getPosition returns an Example.Point - if you keep using Point often, promote it to a full class of its own. In fact, the Java standard library has several point classes already, including java.awt.Point and java.awt.geom.Point2D.Float.
Finally, many modern languages now have syntactic sugar for either boxing multiple values into tuples or directly returning multiple values from a function. This is kind of a last resort. In my experience, any time you pretend that data isn't what it is, you wind up with problems down the line. But if your method absolutely must return two numbers that really aren't part of the same data at all, tuples or arrays are the way to go.
The reference for the c++ stdlib tuple can be found at
http://www.cplusplus.com/reference/std/tuple/
Well.. @Felice is right, but if they both fit in 16 bits there's a way:
output_int = (first_int << 16) | second_int
(| means 'or')
to pack them, and
first_int = (output_int >> 16) & 0xffff
second_int = output_int & 0xffff
(& means 'and')
to extract them.
Two integers can't fit into one integer, or at least you can't get the two original values back.
But anyway, if the two original integers are each bounded to a known number of bits, you can (in pseudocode):
First integer
OR with
(Second integer SHIFTLEFT(nOfBits))
To get the two integers back,
mask the merged integer with a number whose binary representation is nOfBits ones, and you obtain the first integer; then
ShiftRight the merged integer by nOfBits, and you have the second back.
You could store two 16-bit integers within a 32-bit integer: the first one in the first 16 bits and the second one in the last 16 bits. To compose and retrieve the values you use the shift operators.
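The general version, where both values are bounded to some number of bits, might look like the following in C. This is only a sketch: merge_bits, split_first and split_second are hypothetical names, and it assumes 0 < nbits < 32 and that both inputs are already smaller than 2^nbits:

```c
#include <stdint.h>

/* first occupies the low nbits bits, second sits directly above it. */
uint32_t merge_bits(uint32_t first, uint32_t second, unsigned nbits)
{
    return first | (second << nbits);
}

/* Mask with a number made of nbits ones to recover the first value. */
uint32_t split_first(uint32_t merged, unsigned nbits)
{
    uint32_t mask = (UINT32_C(1) << nbits) - 1;
    return merged & mask;
}

/* Shift right by nbits to recover the second value. */
uint32_t split_second(uint32_t merged, unsigned nbits)
{
    return merged >> nbits;
}
```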
Related
I need a method by which to efficiently translate any float or double value to an array of bytes so that it preserves the comparison relationship to any other value.
Example: V1 and V2 are turned into arrays A1 and A2. If A1[0]<A2[0], then V1 must be smaller than V2. Same for larger. If A1[0]==A2[0] and A1[1]>A2[1] then V1 must be larger than V2. And so on. If all the bytes are the same, then the values V1 and V2 must be equal.
For a four-byte integer V, an array that would satisfy the above condition would be [U>>24, (U>>16)&255, (U>>8)&255, U&255], where U is the non-negative uint value V-int.MinValue.
Since doubles are stored as 8 bytes, I expect something close to 8 bytes.
Do you think such a thing can be achieved? Thanks!
C# solution is preferred.
The standard representation for doubles and floats that is used by most languages, IEEE 754, is already very close to supporting this requirement.
In C#, you can use BitConverter.DoubleToInt64Bits or SingleToInt32Bits to get the underlying bits of a double or float directly as an integer.
In order to make comparisons work out right, you only have to fix up the way negative numbers are handled:
long bits = BitConverter.DoubleToInt64Bits( theDouble );
if (bits < 0L) {
bits ^= Int64.MaxValue;
}
The resulting longs will then have the same numeric order as the corresponding doubles. This works for all values except NaN, which isn't really comparable to anything else. The infinities, +0.0 and -0.0 work fine.
If you want +0.0 and -0.0 to have the same value, you can do this:
long bits = BitConverter.DoubleToInt64Bits( theDouble );
if (bits < 0L) {
bits = (bits^Int64.MaxValue)+1L;
}
Note that if you want to make your byte array, you'll probably want to convert to an unsigned integer. You need to flip the sign bit if you want to preserve the ordering, or just do it like this:
long bits = BitConverter.DoubleToInt64Bits( theDouble );
ulong arraybits;
if (bits >= 0L) {
arraybits = (1UL<<63) + (ulong)bits;
} else {
arraybits = (ulong)~bits;
}
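To show the last step all the way to a byte array, here is a hedged C sketch of the same transformation (the question prefers C#, so treat this as an illustration of the bit manipulation only). It assumes double is a 64-bit IEEE 754 value, and the function name double_to_sortable_bytes is made up:

```c
#include <stdint.h>
#include <string.h>

/* Write 8 bytes, most significant first, so that comparing the arrays
   byte by byte gives the same order as comparing the doubles
   (NaN excluded; -0.0 sorts just below +0.0). */
void double_to_sortable_bytes(double value, unsigned char out[8])
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);   /* reinterpret the raw bits */

    if (bits >> 63)
        bits = ~bits;                     /* negative: flip every bit */
    else
        bits |= UINT64_C(1) << 63;        /* non-negative: set the sign bit */

    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(bits >> (56 - 8 * i));
}
```

Two such arrays can then be compared with memcmp, or one element at a time as described in the question.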
I'm coding an LED display (7x48) and the language I'm working in is BASIC (no former experience in that language, but in C/C++) and I have a small issue.
I have an array (red[20] of byte) and an example of a current state is:
to make it easier here lets say its red[3]
10011010 01011100 01011101
and now i need to shift the array by 1 so in next cycle its supposed to be
00110100 10111000 10111011
so what happened is that the whole array shifted for 1 bit to left
the BASIC I'm working with doesn't have any .NET APIs so I need the total low level code (doesn't have to be BASIC, I can translate it, I just need an idea how to do it as I'm limited to 8KB code memory so I have to fully optimize it)
If most significant bit is 1:
subtract value of most significant bit
multiply by 2
add 1
otherwise:
multiply by 2
You should be able to use bit shift operations:
http://msdn.microsoft.com/en-us/library/2d9yb87a.aspx
Let x be the element you want to shift:
x = (x<<1) | (x>>23)
or in general, if you want to shift left by y bits and there are a total of n bits:
x = (x<<y) | (x>>(n-y))
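One thing the formula leaves implicit is that the result has to be masked back to n bits when x lives in a wider variable. A minimal C sketch, assuming 0 < y < n <= 32 (rotate_left is a name chosen for the example):

```c
#include <stdint.h>

/* Rotate the low n bits of x left by y positions. */
uint32_t rotate_left(uint32_t x, unsigned y, unsigned n)
{
    /* mask has the low n bits set */
    uint32_t mask = (n == 32) ? UINT32_MAX : ((UINT32_C(1) << n) - 1);
    x &= mask;
    return ((x << y) | (x >> (n - y))) & mask;
}
```

With the 24 bits from the question, rotate_left(0x9A5C5D, 1, 24) gives 0x34B8BB, which is 00110100 10111000 10111011.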
I don't know basic well, but here's what I would do in a C++/Java/C# language:
Assuming you have red[] of length n:
int b = 8; //Number of bits per array element (red[] holds bytes; use 32 for 32-bit elements)
int y = 1; //Number of bits to shift to the left (must be less than b)
int carry = 0; //The bits to carry over (I'm assuming that they move up the array from red[0] to red[1], etc.)
for (int i=0;i<n;i++)
{
int newCarry = (red[i]>>(b-y)); //The top y bits that fall off this element
red[i] = (red[i]<<y) | carry;
carry = newCarry;
}
//Complete the loop
red[0]|=carry;
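The example in the question carries the other way: the top bit of red[1] ends up as the low bit of red[0], and the top bit of red[0] wraps around to the end of the array. A C sketch for that layout, shifting by one bit only (rotate_array_left1 is a hypothetical name):

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate the whole array left by one bit, treating red[0] as the
   most significant byte. */
void rotate_array_left1(uint8_t *red, size_t n)
{
    if (n == 0)
        return;

    uint8_t wrap = red[0] >> 7;   /* bit that falls off the front */

    for (size_t i = 0; i + 1 < n; i++)
        red[i] = (uint8_t)((red[i] << 1) | (red[i + 1] >> 7));

    red[n - 1] = (uint8_t)((red[n - 1] << 1) | wrap);
}
```

It needs no extra buffer and only one temporary byte, which should translate fairly directly to a BASIC loop.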
On page 140 of Programming Pearls, 2nd Edition, Jon Bentley proposes an implementation of sets with bit vectors.
We'll turn now to two final structures that exploit the fact that our sets represent integers. Bit vectors are an old friend from Column 1. Here are their private data and functions:
enum { BITSPERWORD = 32, SHIFT = 5, MASK = 0x1F };
int n, hi, *x;
void set(int i) { x[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { x[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i) { return x[i>>SHIFT] & (1<<(i & MASK)); }
As I gathered, the central idea of a bit vector to represent an integer set, as described in Column 1, is that the i-th bit is turned on if and only if the integer i is in the set.
But I am really at a loss at the algorithms involved in the above three functions. And the book doesn't give an explanation.
I can only get that i & MASK is to get the lower 5 bits of i, while i>>SHIFT is to move i 5 bits toward the right.
Could anybody elaborate on these algorithms? Bit operations always seem a mystery to me, :(
Bit Fields and You
I'll use a simple example to explain the basics. Say you have an unsigned integer with four bits:
[0][0][0][0] = 0
You can represent any number here from 0 to 15 by converting it to base 2. Say we have the right end be the smallest:
[0][1][0][1] = 5
So the first bit adds 1 to the total, the second adds 2, the third adds 4, and the fourth adds 8. For example, here's 8:
[1][0][0][0] = 8
So What?
Say you want to represent a binary state in an application-- if some option is enabled, if you should draw some element, and so on. You probably don't want to use an entire integer for each one of these- it'd be using a 32 bit integer to store one bit of information. Or, to continue our example in four bits:
[0][0][0][1] = 1 = ON
[0][0][0][0] = 0 = OFF //what a huge waste of space!
(Of course, the problem is more pronounced in real life since 32-bit integers look like this:
[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] = 0)
The answer to this is to use a bit field. We have a collection of properties (usually related ones) which we will flip on and off using bit operations. So, say, you might have 4 different lights on a piece of hardware that you want to be on or off.
3 2 1 0
[0][0][0][0] = 0
(Why do we start with light 0? I'll explain this in a second.)
Note that this is an integer, and is stored as an integer, but is used to represent multiple states for multiple objects. Crazy! Say we turn lights 2 and 1 on:
3 2 1 0
[0][1][1][0] = 6
The important thing you should note here: There's probably no obvious reason why lights 2 and 1 being on should equal six, and it may not be obvious how we would do anything with this scheme of information storage. It doesn't look more obvious if you add more bits:
3 2 1 0
[1][1][1][0] = 0xE \\what?
Why do we care about this? Do we have exactly one state for each number between 0 and 15? How are we going to manage this without some insane series of switch statements? Ugh...
The Light at the End
So if you've worked with binary arithmetic a bit before, you might realize that the relationship between the numbers on the left and the numbers on the right is, of course, base 2. That is:
1*(2^3) + 1*(2^2) + 1*(2^1) + 0*(2^0) = 0xE
So each light is present in the exponent of each term of the equation. If the light is on, there is a 1 next to its term- if the light is off, there is a zero. Take the time to convince yourself that there is exactly one integer between 0 and 15 that corresponds to each state in this numbering scheme.
Bit operators
Now that we have this done, let's take a second to see what bitshifting does to integers in this setup.
[0][0][0][1] = 1
When you shift bits to the left or the right in an integer, it literally moves the bits left and right. (Note: I 100% disavow this explanation for negative numbers! There be dragons!)
1<<2 = 4
[0][1][0][0] = 4
4>>1 = 2
[0][0][1][0] = 2
You will encounter similar behavior when shifting numbers represented with more than one bit. Also, it shouldn't be hard to convince yourself that x>>0 or x<<0 is just x. Doesn't shift anywhere.
This probably explains the naming scheme of the Shift operators to anyone who wasn't familiar with them.
Bitwise operations
This representation of numbers in binary can also be used to shed some light on the operations of bitwise operators on integers. Each bit in the first number is xor-ed, and-ed, or or-ed with its fellow number. Take a second to venture to wikipedia and familiarize yourself with the function of these Boolean operators - I'll explain how they function on numbers but I don't want to rehash the general idea in great detail.
...
Welcome back! Let's start by examining the effect of the OR (|) operator on two integers, stored in four bit.
OR OPERATOR ON:
[1][0][0][1] = 0x9
[1][1][0][0] = 0xC
________________
[1][1][0][1] = 0xD
Tough! This is a close analogue to the truth table for the boolean OR operator. Notice that each column ignores the adjacent columns and simply fills in the result column with the result of the first bit and the second bit OR'd together. Note also that the value of anything or'd with 1 is 1 in that particular column. Anything or'd with zero remains the same.
The table for AND (&) is interesting, though somewhat inverted:
AND OPERATOR ON:
[1][0][0][1] = 0x9
[1][1][0][0] = 0xC
________________
[1][0][0][0] = 0x8
In this case we do the same thing- we perform the AND operation with each bit in a column and put the result in that bit. No column cares about any other column.
Important lesson about this, which I invite you to verify by using the diagram above: anything AND-ed with zero is zero. Also, equally important- nothing happens to numbers that are AND-ed with one. They stay the same.
The final table, XOR, has behavior which I hope you all find predictable by now.
XOR OPERATOR ON:
[1][0][0][1] = 0x9
[1][1][0][0] = 0xC
________________
[0][1][0][1] = 0x5
Each bit is being XOR'd with its column, yadda yadda, and so on. But look closely at the first row and the second row. Which bits changed? (Half of them.) Which bits stayed the same? (No points for answering this one.)
The bit in the first row is being changed in the result if (and only if) the bit in the second row is 1!
The one lightbulb example!
So now we have an interesting set of tools we can use to flip individual bits. Let's go back to the lightbulb example and focus only on the first lightbulb.
0
[?] \\We don't know if it's one or zero while coding
We know that we have an operation that can always make this bit equal to one- the OR 1 operator.
0|1 = 1
1|1 = 1
So, ignoring the rest of the bulbs, we could do this
4_bit_lightbulb_integer |= 1;
and know for sure that we did nothing but set the first lightbulb to ON.
3 2 1 0
[0][0][0][?] = 0 or 1? \\4_bit_lightbulb_integer
[0][0][0][1] = 1
________________
[0][0][0][1] = 0x1
Similarly, we can AND the number with zero. Well- not quite zero- we don't want to affect the state of the other bits, so we will fill them in with ones.
I'll use the unary (one-argument) operator for bit negation. The ~ (NOT) bitwise operator flips all of the bits in its argument. ~(0X1):
[0][0][0][1] = 0x1
________________
[1][1][1][0] = 0xE
We will use this in conjunction with the AND bit below.
Let's do 4_bit_lightbulb_integer & 0xE
3 2 1 0
[0][1][0][?] = 4 or 5? \\4_bit_lightbulb_integer
[1][1][1][0] = 0xE
________________
[0][1][0][0] = 0x4
We're seeing a lot of integers on the right-hand-side which don't have any immediate relevance. You should get used to this if you deal with bit fields a lot. Look at the left-hand side. The bit on the right is always zero and the other bits are unchanged. We can turn off light 0 and ignore everything else!
Finally, you can use the XOR bit to flip the first bit selectively!
3 2 1 0
[0][1][0][?] = 4 or 5? \\4_bit_lightbulb_integer
[0][0][0][1] = 0x1
________________
[0][1][0][*] = 4 or 5?
We don't actually know what the value of * is now- just that flipped from whatever ? was.
Combining Bit Shifting and Bitwise operations
The interesting fact about these two operations is when taken together they allow you to manipulate selective bits.
[0][0][0][1] = 1 = 1<<0
[0][0][1][0] = 2 = 1<<1
[0][1][0][0] = 4 = 1<<2
[1][0][0][0] = 8 = 1<<3
Hmm. Interesting. I'll mention the negation operator here (~) as it's used in a similar way to produce the needed bit values for ANDing stuff in bit fields.
[1][1][1][0] = 0xE = ~(1<<0)
[1][1][0][1] = 0xD = ~(1<<1)
[1][0][1][1] = 0xB = ~(1<<2)
[0][1][1][1] = 0X7 = ~(1<<3)
Are you seeing an interesting relationship between the shift value and the corresponding lightbulb position of the shifted bit?
The canonical bitshift operators
As alluded to above, we have an interesting, generic method for turning on and off specific lights with the bit-shifters above.
To turn on a bulb, we generate the 1 in the right position using bit shifting, and then OR it with the current lightbulb positions. Say we want to turn on light 3, and ignore everything else. We need to get a bit shifting operation that ORs
3 2 1 0
[?][?][?][?] \\all we know about these values at compile time is where they are!
and 0x8
[1][0][0][0] = 0x8
Which is easy, thanks to bitshifting! We'll pick the number of the light and switch the value over:
1<<3 = 0x8
and then:
4_bit_lightbulb_integer |= 0x8;
3 2 1 0
[1][?][?][?] \\the ? marks have not changed!
And we can guarantee that the bit for the 3rd lightbulb is set to 1 and that nothing else has changed.
Clearing a bit works similarly- we'll use the negated bits table above to, say, clear light 2.
~(1<<2) = 0xB = [1][0][1][1]
4_bit_lightbulb_integer & 0xB:
3 2 1 0
[?][?][?][?]
[1][0][1][1]
____________
[?][0][?][?]
The XOR method of flipping bits is the same idea as the OR one.
So the canonical methods of bit switching are this:
Turn on the light i:
4_bit_lightbulb_integer|=(1<<i)
Turn off light i:
4_bit_lightbulb_integer&=~(1<<i)
Flip light i:
4_bit_lightbulb_integer^=(1<<i)
Wait, how do I read these?
In order to check a bit we can simply zero out all of the bits except for the one we care about. We'll then check to see if the resulting value is greater than zero- since this is the only value that could possibly be nonzero, it will make the entire integer nonzero if and only if it is nonzero. For example, to check bit 2:
1<<2:
[0][1][0][0]
4_bit_lightbulb_integer:
[?][?][?][?]
1<<2 & 4_bit_lightbulb_integer:
[0][?][0][0]
Remember from the previous examples that the value of ? didn't change. Remember also that anything AND 0 is 0. So, we can say for sure that if this value is nonzero, the switch at position 2 is true and the lightbulb is on. Similarly, if the light is off, the value of the entire thing will be zero.
(You can alternately shift the entire value of 4_bit_lightbulb_integer over by i bits and AND it with 1. I don't remember off the top of my head if one is faster than the other but I doubt it.)
So the canonical checking function:
Check if bit i is on:
if (4_bit_lightbulb_integer & (1<<i)) {
// do whatever
}
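Put together as compilable C, the four canonical operations might look like this. The function names are invented here, and the variable is called lights because a C identifier cannot start with a digit the way 4_bit_lightbulb_integer does:

```c
/* Each function takes the current light states and a light number i,
   and works on bit i only. i must be less than the width of unsigned. */

unsigned light_on(unsigned lights, unsigned i)
{
    return lights | (1u << i);      /* OR with 1 forces the bit on */
}

unsigned light_off(unsigned lights, unsigned i)
{
    return lights & ~(1u << i);     /* AND with 0 forces the bit off */
}

unsigned light_flip(unsigned lights, unsigned i)
{
    return lights ^ (1u << i);      /* XOR with 1 toggles the bit */
}

int light_is_on(unsigned lights, unsigned i)
{
    return (lights & (1u << i)) != 0;   /* nonzero only if bit i is set */
}
```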
The specifics
Now that we have a complete set of tools for bitwise operations, we can look at the specific example here. This is basically the same idea- except a much more concise and powerful way of executing it. Let's look at this function:
void set(int i) { x[i>>SHIFT] |= (1<<(i & MASK)); }
From the canonical implementation I'm going to make a guess that this is trying to set some bits to 1! Let's take an integer and look at what's going on here if i feed the value 0x32 (50 in decimal) into i:
x[0x32>>5] |= (1<<(0x32 & 0x1f))
Well, that's a mess.. let's dissect this operation on the right. For convenience, pretend there are 24 more irrelevant zeros, since these are both 32 bit integers.
...[0][0][0][1][1][1][1][1] = 0x1F
...[0][0][1][1][0][0][1][0] = 0x32
________________________
...[0][0][0][1][0][0][1][0] = 0x12
It looks like everything is being cut off at the boundary on top where 1s turn into zeros. This technique is called Bit Masking. Interestingly, the boundary here restricts the resulting values to be between 0 and 31... Which is exactly the number of bit positions we have for a 32 bit integer!
x[0x32>>5] |= (1<<(0x12))
Let's look at the other half.
...[0][0][1][1][0][0][1][0] = 0x32
Shift five bits to the right:
...[0][0][0][0][0][0][0][1] = 0x01
Note that this transformation destroyed exactly the information used by the first part of the function (the low five bits)- we have 32-5 = 27 remaining bits which could be nonzero. This indicates which of 2^27 integers in the array of integers is selected. So the simplified equation is now:
x[1] |= (1<<0x12)
This just looks like the canonical bit-setting operation! We've just chosen
So the idea is to use the first 27 bits to pick an integer in the array and the last five bits to indicate which bit of the 32 in that integer to set.
The key to understanding what's going on is to recognize that BITSPERWORD = 2^SHIFT. Thus, x[i>>SHIFT] finds which 32-bit element of the array x has the bit corresponding to i. (By shifting i 5 bits to the right, you're simply dividing by 32.) Once you have located the correct element of x, the lower 5 bits of i can then be used to find which particular bit of x[i>>SHIFT] corresponds to i. That's what i & MASK does; by shifting 1 by that number of bits, you move the bit corresponding to 1 to the exact position within x[i>>SHIFT] that corresponds to the ith bit in x.
Here's a bit more of an explanation:
Imagine that we want capacity for N bits in our bit vector. Since each int holds 32 bits, we will need (N + 31) / 32 int values for our storage (that is, N/32 rounded up). Within each int value, we will adopt the convention that bits are ordered from least significant to most significant. We will also adopt the convention that the first 32 bits of our vector are in x[0], the next 32 bits are in x[1], and so forth. Here's the memory layout we are using (showing the bit index in our bit vector corresponding to each bit of memory):
+----+----+-------+----+----+----+
x[0]: | 31 | 30 | . . . | 02 | 01 | 00 |
+----+----+-------+----+----+----+
x[1]: | 63 | 62 | . . . | 34 | 33 | 32 |
+----+----+-------+----+----+----+
etc.
Our first step is to allocate the necessary storage capacity:
x = new int[(N + BITSPERWORD - 1) >> SHIFT]
(We could make provision for dynamically expanding this storage, but that would just add complexity to the explanation.)
Now suppose we want to access bit i (either to set it, clear it, or just to know its current value). We need to first figure out which element of x to use. Since there are 32 bits per int value, this is easy:
subscript for x = i / 32
Making use of the enum constants, the x element we want is:
x[i >> SHIFT]
(Think of this as a 32-bit-wide window into our N-bit vector.) Now we have to find the specific bit corresponding to i. Looking at the memory layout, it's not hard to figure out that the first (rightmost) bit in the window corresponds to bit index 32 * (i >> SHIFT). (The window starts after i >> SHIFT slots in x, and each slot has 32 bits.) Since that's the first bit in the window (position 0), then the bit we're interested in is at position
i - (32 * (i >> SHIFT))
in the window. With a little experimenting, you can convince yourself that this expression is always equal to i % 32 (actually, that's one definition of the mod operator) which, for non-negative i, is always equal to i & MASK. Since this last expression is the fastest way to calculate what we want, that's what we'll use.
From here, the rest is pretty simple. We start with a single bit in the least-significant position of the window (that is, the constant 1), and move it to the left by i & MASK bits to get it to the position in the window corresponding to bit i in the bit vector. This is where the expression
1 << (i & MASK)
comes from. With the bit now moved to where we want it, we can use this as a mask to set, clear, or query the value of the bit at that position in x[i>>SHIFT] and we know that we're actually setting, clearing, or querying the value of bit i in our bit vector.
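Here is how the pieces could fit together in plain C. This is a sketch rather than the book's code: it keeps the book's enum constants but uses unsigned words, calloc instead of new, and made-up bitvec_ names, and it assumes unsigned int is 32 bits wide:

```c
#include <stdlib.h>

enum { BITSPERWORD = 32, SHIFT = 5, MASK = 0x1F };

/* Allocate a zeroed bit vector with room for nbits bits (NULL on failure). */
unsigned *bitvec_new(size_t nbits)
{
    return calloc((nbits + BITSPERWORD - 1) >> SHIFT, sizeof(unsigned));
}

/* i >> SHIFT picks the word, i & MASK picks the bit inside that word. */
void bitvec_set(unsigned *x, unsigned i)
{
    x[i >> SHIFT] |= 1u << (i & MASK);
}

void bitvec_clr(unsigned *x, unsigned i)
{
    x[i >> SHIFT] &= ~(1u << (i & MASK));
}

/* Returns 1 if bit i is set, 0 otherwise. */
int bitvec_test(const unsigned *x, unsigned i)
{
    return (x[i >> SHIFT] >> (i & MASK)) & 1u;
}
```

For i = 50 this touches bit 18 of x[1], matching the 0x32 walk-through earlier in the thread.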
If you store your bits in an array of n words you can imagine them to be laid out as a matrix with n rows and 32 columns (BITSPERWORD):
3 0
1 0
0 xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx
1 xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx
2 xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx
....
n xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx
To get the k-th bit you divide k by 32. The (integer) result will give you the row (word) the bit is in, the remainder will give you which bit it is within the word.
Dividing by 2^p can be done simply by shifting p positions to the right. The remainder can be obtained by getting the p rightmost bits (i.e. the bitwise AND with (2^p - 1)).
In C terms:
#define div32(k) ((k) >> 5)
#define mod32(k) ((k) & 31)
#define word_the_bit_is_in(k) div32(k)
#define bit_within_word(k) mod32(k)
Hope it helps.
We have two N-bit numbers (0< N< 100000). We have to perform q queries (0< q<500000) over these numbers. The query can be of following three types:
set_a idx x: Set A[idx] to x, where 0 <= idx < N, where A[idx] is idx'th least significant bit of A.
set_b idx x: Set B[idx] to x, where 0 <= idx < N.
get_c idx: Print C[idx], where C=A+B, and 0<=idx
Now, I have optimized the code to the best extent I can.
First, I tried with an int array for a, b and c. For every update, I calculate c and return the ith bit when queried. It was damn slow. Cleared 4/11 test cases only.
I moved over to using boolean array. It was around 2 times faster than int array approach. Cleared 7/11 testcases.
Next, I figured out that I need not calculate c for calculating idx th bit of A+B. I will just scan A and B towards right from idx until I find either a[i]=b[i]=0 or a[i]=b[i]=1. If a[i]=b[i]=0, then I just add up towards left to idx th bit starting with initial carry=0. And if a[i]=b[i]=1, then I just add up towards left to idx th bit starting with initial carry=1.
This was faster but cleared only 8/11 testcases.
Then, I figured out once, I get to the position i, a[i]=b[i]=0 or a[i]=b[i]=1, then I need not add up towards idx th position. If a[i]=b[i]=0, then answer is (a[idx]+b[idx])%2 and if a[i]=b[i]=1, then the answer is (a[idx]+b[idx]+1)%2. It was around 40% faster but still cleared only 8/11 testcases.
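The approach described in the last step can be sketched in C as follows. This is not the linked code, just an illustration of the scan: a[i] and b[i] are assumed to hold bit i (0 or 1) with the least significant bit first, n is the number of bits, and idx may be n to ask for the final carry. It also uses XOR in place of the % 2:

```c
#include <stddef.h>

/* Return bit idx of A + B without computing the whole sum. */
int get_c(const unsigned char *a, const unsigned char *b,
          size_t n, size_t idx)
{
    int carry = 0;

    /* Scan to the right of idx for the first position where the two
       bits agree: 1 and 1 generates a carry, 0 and 0 absorbs one.
       Positions where they differ just pass the carry along. */
    for (size_t i = idx; i-- > 0; ) {
        if (a[i] == b[i]) {
            carry = a[i];
            break;
        }
    }

    if (idx == n)
        return carry;               /* the extra top bit is the carry itself */

    return a[idx] ^ b[idx] ^ carry; /* addition modulo 2 */
}
```

The worst case is still a full scan when A and B differ in every lower position, which is presumably what the hard test cases exercise.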
Now my question is: how do I get past those 3 'hard' testcases? I don't know what they are, but the program is taking >3 sec to solve the problem.
Here is the code: http://ideone.com/LopZf
One possible optimization is to replace
(a[pos]+b[pos]+carry)%2
with
a[pos]^b[pos]^carry
The XOR operator (^) performs addition modulo 2, making the potentially expensive mod operation (%) unnecessary. Depending on the language and compiler, the compiler may make optimizations for you when doing a mod with a power of 2. But since you are micro-optimizing it is a simple change to make that removes dependence on that optimization being made for you behind the scenes.
http://en.wikipedia.org/wiki/Exclusive_or
This is just one suggestion that is simple to make. As others have suggested, using packed ints to represent your bit array will likely also improve what is probably the worst case test for your code. That would be the get_c function of the most significant bit, with either A or B (but not both) being 1 for all the other positions, requiring a scan of every bit position to the least significant bit to determine carry. If you were using packed ints for your bits, there would only be approximately 1/32 as many operations necessary (assuming 32 bit ints). Using packed ints, however, would be somewhat more complicated than your use of a simple boolean array (which really is likely just an array of bytes).
C/C++ Bit Array or Bit Vector
Convert bit array to uint or similar packed value
http://en.wikipedia.org/wiki/Bit_array
There are lots of other examples on Stackoverflow and the net for using ints as if they were bit arrays.
Here is a solution that looks a bit like your algorithm. I demonstrate it with bytes, but of course you can easily optimize the algorithm using 32 bit words (I suppose your machine has 64 bits arithmetic nowadays).
void setbit( unsigned char*x,unsigned int idx,unsigned int bit)
{
unsigned int digitIndex = idx>>3;
unsigned int bitIndex = idx & 7;
if( ((x[digitIndex]>>bitIndex)&1) ^ bit) x[digitIndex]^=(1u<<bitIndex);
}
unsigned int getbit(unsigned char *a,unsigned char *b,unsigned int idx)
{
unsigned int digitIndex = idx>>3;
unsigned int bitIndex = idx & 7;
unsigned int c = a[digitIndex]+b[digitIndex];
unsigned int bit = (c>>bitIndex) & 1;
/* a zero bit on the right will absorb a carry, let's check if any */
if( (c^(c+1))>>bitIndex )
{
/* none, we must check if there's a carry propagating from the right digits */
for(;digitIndex-- > 0;)
{
c=a[digitIndex]+b[digitIndex];
if( c > 255 ) return bit^1; /* yes, a carry */
if( c < 255 ) return bit; /* no carry possible, a zero bit will absorb it */
}
}
return bit;
}
If you find anything cryptic, just ask.
Edit: oops, I inverted the zero bit condition...
I want to be able to take, as input, a character pointer to a number in base 2 through 16 and as a second parameter, what base the number is in and then convert that to its representation in base 2. The integer can be of arbitrary length. My solution now does what the atoi() function does, but I was curious purely out of academic interest if a lookup table solution is possible.
I have found that this is simple for binary, octal, and hexadecimal. I can simply use a lookup table for each digit to get a series of bits. For instance:
0xF1E ---> (F = 1111) (1 = 0001) (E = 1110) ---> 111100011110
0766 ---> (7 = 111) (6 = 110) (6 = 110) ---> 111110110
1000 ---> ??? ---> 1111101000
However, my problem is that I want to do this look up table method for odd bases, like base 10. I know that I could write the algorithm like atoi does and do a bunch of multiplies and adds, but for this specific problem I'm trying to see if I can do it with a look up table. It's definitely not so obvious with base 10, though. I was curious if anyone had any clever way to figure out how to generate a generic look up table for Base X -> Base 2. I know that for base 10, you can't just give it one digit at a time, so the solution would likely have to lookup a group of digits at a time.
I am aware of the multiply and add solution but since these are arbitrary length numbers, the multiply and add operations are not free so I'd like to avoid them, if at all possible.
You will have to use a look up table with an input width of m base b symbols returning n bits so that
n = log2(b) * m
for positive integers b, n and m. So if b is not a power of two, there will be no (simple) look up table solution.
I do not think that there is a solution. The following example with base 10 illustrates why.
65536 = 1 0000 0000 0000 0000
Changing the last digit from 6 to 5 will flip all bits.
65535 = 0 1111 1111 1111 1111
And almost the same will hold if you process the input starting from the end. Changing the first digit from 6 to 5 flips a significant number of bits.
55536 = 0 1101 1000 1111 0000
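To make the point above concrete, here is a minimal C sketch that counts how many bits differ between two values. The function name `differing_bits` is made up for this illustration. It uses 65536, 65535 and 55536 (65536 with its leading decimal digit changed from 6 to 5) as test values.

```c
#include <stdint.h>

/* Hypothetical helper: count the bit positions in which a and b differ. */
unsigned differing_bits(uint32_t a, uint32_t b)
{
    uint32_t x = a ^ b;   /* a 1 bit wherever a and b disagree */
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;       /* clear the lowest set bit */
        ++count;
    }
    return count;
}
```

With this, `differing_bits(65536, 65535)` is 17 and `differing_bits(65536, 55536)` is 9, even though only one decimal digit changed in each case. Changing a single hex digit can never touch more than 4 bits, which is why a per-digit table works there.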
Converting to base 2 with a lookup table is not possible for bases that aren't powers of two. The reason it is possible for base 8 (and 16) is that the conversion works as follows:
octal ABC = 8^2*A + 8^1*B + 8^0*C (decimal)
= 0b1000000*A + 0b1000*B + C (binary)
so if you have the lookup table of A = (0b000 to 0b111), then the multiplication is always by 1 and some trailing zeros, so the multiplication is simple (just shifting left).
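As a rough sketch of that per-digit lookup for power-of-two bases, the C code below maps each digit to a fixed group of bits and concatenates the groups. The names `digit_value` and `pow2_base_to_binary` and the `bits_per_digit` parameter are assumptions made for this example, not part of the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of one digit character, or -1 if invalid. */
static int digit_value(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return 10 + (ch - 'a');
    if (ch >= 'A' && ch <= 'F') return 10 + (ch - 'A');
    return -1;
}

/*
 * Convert a digit string in base 2, 4, 8 or 16 (bits_per_digit = 1..4)
 * to a string of '0'/'1' characters by table lookup and concatenation.
 * `out` must have room for strlen(in) * bits_per_digit + 1 bytes.
 * Returns the number of bits written, or -1 on invalid input.
 */
int pow2_base_to_binary(const char *in, int bits_per_digit, char *out)
{
    /* The lookup table: the 4-bit pattern for each value 0..15. */
    static const char *const nibble[16] = {
        "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
        "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
    };
    size_t n = 0;

    if (bits_per_digit < 1 || bits_per_digit > 4) return -1;
    for (; *in != '\0'; ++in) {
        int v = digit_value(*in);
        if (v < 0 || v >= (1 << bits_per_digit)) return -1;
        /* Copy only the low bits_per_digit characters of the pattern. */
        memcpy(out + n, nibble[v] + (4 - bits_per_digit),
               (size_t)bits_per_digit);
        n += (size_t)bits_per_digit;
    }
    out[n] = '\0';
    return (int)n;
}
```

For example, `pow2_base_to_binary("F1E", 4, buf)` fills `buf` with `111100011110`, and `pow2_base_to_binary("766", 3, buf)` gives `111110110`. No digit ever affects the bits produced by another digit, which is exactly the property that base 10 lacks.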
However, consider the 'odd' base of 10. When you look at the powers of 10:
10^1 = 0b1010
10^2 = 0b1100100
10^3 = 0b1111101000
10^4 = 0b10011100010000
..etc
You'll notice that the multiplication never gets simple, so you can't build the result from lookup tables, bit shifts and ORs alone, no matter how large the groups of digits are. The bit patterns will always overlap. The best you can do is a lookup table indexed by (a, b), where a is the digit position and b is the digit (0..9). Then you only have to add n numbers, rather than multiply and add n numbers (at the cost of the memory for the lookup table).
How big are the strings? You can potentially convert the multiply-and-add to a lookup-and-add by doing something like this:
Store the numbers 0-9, 10, 20, 30, 40, ... 90, 100, 200, ... 900, 1000, 2000, ... , 9000, 10000, ... in the target base in a table.
For each character starting with the rightmost, index appropriately into the table and add it to a running result.
Of course I'm not sure how well this will actually perform, but it's a thought.
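A minimal sketch of the lookup-and-add idea in C might look like the following. It assumes the number fits in a `uint64_t` (at most 19 decimal digits), which stands in for the arbitrary-length integer type the question really needs. The names `pow10_table`, `MAX_DIGITS` and `decimal_lookup_add` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DIGITS 19 /* every 19-digit decimal number fits in uint64_t */

/* pow10_table[p][d] holds d * 10^p, i.e. 0-9, 10-90, 100-900, ... */
static uint64_t pow10_table[MAX_DIGITS][10];
static int table_ready = 0;

/* Fill the table once; in practice it could be precomputed offline. */
static void build_table(void)
{
    uint64_t power = 1;
    for (int p = 0; p < MAX_DIGITS; ++p) {
        for (int d = 0; d < 10; ++d)
            pow10_table[p][d] = (uint64_t)d * power;
        if (p + 1 < MAX_DIGITS) power *= 10;
    }
    table_ready = 1;
}

/*
 * Convert a decimal string using only table lookups and additions.
 * Returns 0 on success and stores the value in *result,
 * or returns -1 on an empty, too long, or non-decimal string.
 */
int decimal_lookup_add(const char *s, uint64_t *result)
{
    size_t len = strlen(s);
    uint64_t total = 0;

    if (len == 0 || len > MAX_DIGITS) return -1;
    if (!table_ready) build_table();

    /* Walk from the rightmost character; i is the digit position. */
    for (size_t i = 0; i < len; ++i) {
        char ch = s[len - 1 - i];
        if (ch < '0' || ch > '9') return -1;
        total += pow10_table[i][ch - '0'];
    }
    *result = total;
    return 0;
}
```

For arbitrary-length input, each table entry and the running total would have to be big integers, so the additions are still not free. The table only removes the multiplications.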
The algorithm is quite simple. Language agnostic would be:
total <- 0
base <- input_base
for each character in input:
    total <- total*base + number(char)
In C++:
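#include <cctype>  // std::toupper
#include <string>

// Helper to convert a digit to a number.
// Returns 16 (not a valid digit in any base up to 16) for any other character.
unsigned int number( char ch )
{
    if ( ch >= '0' && ch <= '9' ) return ch-'0';
    ch = static_cast<char>( std::toupper( static_cast<unsigned char>(ch) ) );
    if ( ch >= 'A' && ch <= 'F' ) return 10 + (ch-'A');
    return 16; // invalid character; parse() below does not check for this
}
unsigned int parse( std::string const & input, unsigned int base )
{
    unsigned int total = 0;
    for ( std::string::size_type i = 0; i < input.size(); ++i )
    {
        total = total*base + number(input[i]);
    }
    return total;
}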
Of course, you should take care of possible errors (incoherent input: base 2 and input string 'af12') or any other exceptional condition.
Start with a running count of 0.
For each character in the string (reading left to right)
Multiply count by base.
Convert character to int value (0 through base - 1)
Add character value to running count.
How accurate do you need to be?
If you're looking for perfection, then multiply-and-add is really your only recourse. And I'd be very surprised if it's the slowest part of your application.
If order-of-magnitude is good enough, use a lookup table to find the closest power of 2.
Example 1: 1234, closest power of 2 is 1024.
Example 2: 98765, closest is 65536
You could also drive this by counting the number of digits, and multiplying the appropriate power of 2 by the leftmost digit. This can be implemented as a left-shift:
Example 3: 98765 has 5 digits, closest power of 2 to 10000 is 8192 (2^13), so result is 9 << 13
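The digit-count estimate can be sketched in C as follows. The table `nearest_pow2_exp` and the function `approx_from_decimal` are names invented for this example, and the sketch only covers inputs of up to 10 decimal digits.

```c
#include <stddef.h>
#include <string.h>

/* nearest_pow2_exp[k] = e such that 2^e is the power of two closest to 10^k. */
static const unsigned nearest_pow2_exp[10] = {
    0, 3, 7, 10, 13, 17, 20, 23, 26, 30
};

/*
 * Order-of-magnitude estimate of a decimal string: the leftmost digit
 * shifted left by the exponent that matches the number of digits.
 * Returns 0 for an empty string, more than 10 digits, or a bad first digit.
 */
unsigned long long approx_from_decimal(const char *s)
{
    size_t len = strlen(s);
    if (len == 0 || len > 10 || s[0] < '0' || s[0] > '9') return 0;
    return (unsigned long long)(s[0] - '0') << nearest_pow2_exp[len - 1];
}
```

For "98765" this returns 9 << 13 = 73728, matching Example 3. The result is only a rough estimate, so it does not answer the exact-conversion question.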
I wrote this before your clarifying comment, so it probably isn't quite applicable. I'm not sure if a lookup table approach is possible or not. If you really don't need arbitrary precision, then take advantage of the runtime.
If a C/C++ solution is acceptable, I believe that something like the following is what you are looking for. It probably contains bugs in edge cases, but it does compile and work as expected, at least for positive numbers. Making it really work is an exercise for the reader.
/*
* NAME
* convert_num - convert a numerical string (str) of base (b) to
* a printable binary representation
* SYNOPSIS
* int convert_num(char const* s, int b, char** o)
* DESCRIPTION
* Generates a printable binary representation of an input number
* from an arbitrary base. The input number is passed as the ASCII
* character string `s'. The input string consists of characters
* from the ASCII character set {'0'..'9','A'..('A'+b-10)} where
* letter characters may be in either upper or lower case.
* RETURNS
* The number of characters from the input string `s' which were
* consumed by this operation. The output string is placed into
* newly allocated storage which is pointed to by `*o' upon successful
* completion. An error is signalled by returning `-1'.
*/
#include <math.h>   /* frexp */
#include <stdlib.h> /* strtoul, malloc */

int
convert_num(char const *str, int b, char **out)
{
    int rc = -1;
    char *endp = NULL;
    char *outp = NULL;
    unsigned long num = strtoul(str, &endp, b);
    if (endp != str) { /* then we have some numbers */
        int numdig = -1;
        rc = (int)(endp - str); /* we have this many base `b' digits! */
        frexp((double)num, &numdig); /* we need this many base 2 digits */
        if (numdig < 1) numdig = 1; /* zero still needs one digit: "0" */
        if ((outp = malloc(numdig + 1)) == NULL) {
            return -1;
        }
        *out = outp; /* return the buffer */
        outp[numdig] = '\0'; /* make sure it is NUL terminated */
        while (numdig-- != 0) { /* fill it in from LSb to MSb */
            outp[numdig] = ((num & 1) ? '1' : '0');
            num >>= 1;
        }
    }
    return rc;
}