Bitmasking--when to use hex vs binary - algorithm

I'm working on a problem out of Cracking The Coding Interview which requires that I swap odd and even bits in an integer with as few instructions as possible (e.g bit 0 and 1 are swapped, bits 2 and 3 are swapped, etc.)
The author's solution revolves around using a mask to grab, in one number, the odd bits, and in another num the even bits, and then shifting them off by 1.
I get her solution, but I don't understand how she grabbed the even/odd bits. She creates two bit masks --both in hex -- for a 32 bit integer. The two are: 0xaaaaaaaa and 0x55555555. I understand she's essentially creating the equivalent of 1010101010... for a 32 bit integer in hexadecimal and then ANDing it with the original num to grab the even/odd bits respectively.
What I don't understand is why she used hex? Why not just code in 10101010101010101010101010101010? Did she use hex to reduce verbosity? And when should you use one over the other?

It's to reduce verbosity. Binary 10101010101010101010101010101010, hexadecimal 0xaaaaaaaa, and decimal 2863311530 all represent exactly the same value; they just use different bases to do so. The only reason to use one or another is for perceived readability.
Most people would clearly not want to use decimal here; it looks like an arbitrary value.
The binary is clear: alternating 1s and 0s, but with so many, it's not obvious that this is a 32-bit value, or that there isn't an adjacent pair of 1s or 0s hiding in the middle somewhere.
The hexadecimal version takes advantage of chunking. Assuming you recognize that 0x0a == 0b1010, you can mentally picture the 8 groups of 1010 in the assumed value.
Another possibility would be octal 25252525252, since... well, maybe not. You can see that something is alternating, but unless you use octal a lot, it's not clear what that alternating pattern in binary is.

Related

Fastest algorithm to convert hexadecimal numbers into decimal form without using a fixed length variable to store the result

I want to write a program to convert hexadecimal numbers into their decimal forms without using a variable of fixed length to store the result because that would restrict the range of inputs that my program can work with.
Let's say I were to use a variable of type long long int to calculate, store and print the result. Doing so would limit the range of hexadecimal numbers that my program can handle to between 8000000000000001 and 7FFFFFFFFFFFFFFF. Anything outside this range would cause the variable to overflow.
I did write a program that calculates and stores the decimal result in a dynamically allocated string by performing carry and borrow operations but it runs much slower, even for numbers that are as big as 7FFFFFFFF!
Then I stumbled onto this site which could take numbers that are way outside the range of a 64 bit variable. I tried their converter with numbers much larger than 16^65 - 1 and still couldn't get it to overflow. It just kept on going and printing the result.
I figured that they must be using a much better algorithm for hex to decimal conversion, one that isn't limited to 64 bit values.
So far, Google's search results have only led me to algorithms that use some fixed-length variable for storing the result.
That's why I am here. I wanna know if such an algorithm exists and if it does, what is it?
Well, it sounds like you already did it when you wrote "a program that calculates and stores the decimal result in a dynamically allocated string by performing carry and borrow operations".
Converting from base 16 (hexadecimal) to base 10 means implementing multiplication and addition of numbers in a base 10x representation. Then for each hex digit d, you calculate result = result*16 + d. When you're done you have the same number in a 10-based representation that is easy to write out as a decimal string.
There could be any number of reasons why your string-based method was slow. If you provide it, I'm sure someone could comment.
The most important trick for making it reasonably fast, though, is to pick the right base to convert to and from. I would probably do the multiplication and addition in base 109, so that each digit will be as large as possible while still fitting into a 32-bit integer, and process 7 hex digits at a time, which is as many as I can while only multiplying by single digits.
For every 7 hex digts, I'd convert them to a number d, and then do result = result * ‭(16^7) + d.
Then I can get the 9 decimal digits for each resulting digit in base 109.
This process is pretty easy, since you only have to multiply by single digits. I'm sure there are faster, more complicated ways that recursively break the number into equal-sized pieces.

Fixed Point Multiplication for FFT

I’m writing a Radix-2 DIT FFT algorithm in VHDL, which requires some fractional multiplication of input data by Twiddle Factor (TF). I use Fixed Point arithmetic’s to achieve that, with every word being 16 bit long, where 1 bit is a sign bit and the rest is distributed between integer and fraction. Therefore my dilemma:
I have no idea, in what range my input data will be, so if I just decide that 4 bits go to integer and the rest 11 bits to fraction, in case I get integer numbers higher than 4 bits = 15 decimal, I’m screwed. The same applies if I do 50/50, like 7 bits to integer and the rest to fraction. If I get numbers, which are very small, I’m screwed because of truncation or rounding, i.e:
Let’s assume I have an integer "3"(0000 0011) on input and TF of "0.7071" ( 0.10110101 - 8 bit), and let’s assume, for simplicity, my data is 8 bit long, therefore:
3x0.7071 = 2.1213
3x0.7071 = 0000 0010 . 0001 1111 = 2.12109375 (for 16 bits).
Here comes the trick - I need to up/down round or truncate 16 bits to 8 bits, therefore, I get 0000 0010, i.e 2 - the error is way too high.
My questions are:
How would you solve this problem of range vs precision if you don’t know the range of your input data AND you would have numbers represented in fixed point?
Should I make a process, which decides after every multiplication where to put the comma? Wouldn’t it make the multiplication slower?
Xilinx IP Core has 3 different ways for Fixed Number Arithmetic’s – Unscaled (similar to what I want to do, just truncate in case overflow happens), Scaled fixed point (I would assume, that in that case it decides after each multiplication, where the comma should be and what should be rounded) and Block Floating Point(No idea what it is or how it works - would appreciate an explanation). So how does this IP Core decide where to put the comma? If the decision is made depending on the highest value in my dataset, then in case I have just 1 high peak and the rest of the data is low, the error will be very high.
I will appreciate any ideas or information on any known methods.
You don't need to know the fixed-point format of your input. You can safely treat it as normalized -1 to 1 range or full integer-range.
The reason is that your output will have the same format as the input. Or, more likely for FFT, a known relationship like 3 bits increase, which would the output has 3 more integer bits than the input.
It is the core user's burden to know where the decimal point will end up, you have to document the change to dynamic range of course.

Encode an array of integers to a short string

Problem:
I want to compress an array of non-negative integers of non-fixed length (but it should be 300 to 400), containing mostly 0's, some 1's, a few 2's. Although unlikely, it is also possible to have bigger numbers.
For example, here is an array of 360 elements:
0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,
0,0,4,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,5,2,0,0,0,
0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,1,2,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.
Goal:
The goal is to compress an array like this, into a shortest possible encoding using letters and numbers. Ideally, something like: sd58x7y
What I've tried:
I tried to use "delta encoding", and use zeroes to denote any value higher than 1. For example: {0,0,1,0,0,0,2,0,1} would be denoted as: 2,3,0,1. To decode it, one would read from left to right, and write down "2 zeroes, one, 3 zeroes, one, 0 zeroes, one (this would add to the previous one, and thus have a two), 1 zero, one".
To eliminate the need of delimiters (commas) and thus saves more space, I tried to use only one alphanumerical character to denote delta values of 0 to 35 (using 0 to y), while leaving letter z as "35 PLUS the next character". I think this is called "variable bit" or something like that. For example, if there are 40 zeroes in a row, I'd encode it as "z5".
That's as far as I got... the resultant string is still very long (it would be about 20 characters long in the above example). I would ideally want something like, 8 characters or even shorter. Thanks for your time; any help or inspiration would be greatly appreciated!
Since your example contains long runs of zeroes, your first step (which it appears you have already taken) could be to use run-lenth encoding (RLE) to compress them. The output from this step would be a list of integers, starting with a run-length count of zeroes, then alternating between that and the non-zero values. (a zero-run-length of 0 will indicate successive non-zero values...)
Second, you can encode your integers in a small number of bits, using a class of methods called universal codes. These methods generally compress small integers using a smaller number of bits than larger integers, and also provide the ability to encode integers of any size (which is pretty spiffy...). You can tune the encoding to improve compression based on the exact distribution you expect.
You may also want to look into how JPEG-style encoding works. After DCT and quantization, the JPEG entropy encoding problem seems similar to yours.
Finally, if you want to go for maximum compression, you might want to look up arithmetic encoding, which can compress your data arbitrarily close to the statistical minimum entropy.
The above links explain how to compress to a stream of raw bits. In order to convert them to a string of letters and numbers, you will need to add another encoding step, which converts the raw bits to such a string. As one commenter points out, you may want to look into base64 representation; or (for maximum efficiency with whatever alphabet is available) you could try using arithmetic compression "in reverse".
Additional notes on compression in general: the "shortest possible encoding" depends greatly on the exact properties of your data source. Effectively, any given compression technique describes a statistical model of the kind of data it compresses best.
Also, once you set up an encoding based on the kind of data you expect, if you try to use it on data unlike the kind you expect, the result may be an expansion, rather than a compression. You can limit this expansion by providing an alternative, uncompressed format, to be used in such cases...
In your data you have:
14 1s (3.89% of data)
4 2s (1.11%)
1 3s, 4s and 5s (0.28%)
339 0s (94.17%)
Assuming that your numbers are not independent of each other and you do not have any other information, the total entropy of your data is 0.407 bits per number, that is 146.4212 bits overall (18.3 bytes). So it is impossible to encode in 8 bytes.

How is data stored in a bit vector?

I'm a bit confused how a fixed size bit vector stores its data.
Let's assume that we have a bit vector bv that I want to store hello in as ASCII.
So we do bv[0]=104, bv[1]=101, bv[2]=108, bv[3]=108, bv[4]=111.
How is the ASCII of hello represented in the bit vector?
Is it as binary like this: [01101000][01100101][01101100][01101100][01101111]
or as ASCII like this: [104][101][108][108][111]
The following paper HAMPI at section 3.5 step 2, the author is assigning ascii code to a bit vector, but Im confused how the char is represented in the bit vector.
Firstly, you should probably read up on what a bit vector is, just to make sure we're on the same page.
Bit vectors don't represent ASCII characters, they represent bits. Trying to do bv[0]=104 on a bit vector will probably not compile / run, or, if it does, it's very unlikely to do what you expect.
The operations that you would expect to be supported is along the lines of set the 5th bit to 1, set the 10th bit to 0, set all these bit to this, OR the bits of these two vectors and probably some others.
How these are actually stored in memory is completely up to the programming language, and, on top of that, it may even be completely up to a given implementation of that language.
The general consensus (not a rule) is that each bit should take up roughly 1 bit in memory (maybe, on average, slightly more, since there could be overhead related to storing these).
As one example (how Java does it), you could have an array of 64-bit numbers and store 64 bits in each position. The translation to ASCII won't make sense in this case.
Another thing you should know - even ASCII gets stored as bits in memory, so those 2 arrays are essentially the same, unless you meant something else.

Fastest/easiest way to average ARGB color ints?

I have five colors stored in the format #AARRGGBB as unsigned ints, and I need to take the average of all five. Obviously I can't simply divide each int by five and just add them, and the only way I thought of so far is to bitmask them, do each channel separately, and then OR them together again. Is there a clever or concise way of averaging all five of them?
Half way between your (OP) proposed solution and Patrick's solution looks quite neat:
Color colors[5]={ 0xAARRGGBB,...};
unsigned long sum1=0,sum2=0;
for (int i=0;i<5;i++)
{
sum1+= colors[i] &0x00FF00FF; // 0x00RR00BB
sum2+=(colors[i]>>8)&0x00FF00FF; // 0x00AA00GG
}
unsigned long output=0;
output|=(((sum1&0xFFFF)/5)&0xFF);
output|=(((sum2&0xFFFF)/5)&0xFF)<<8;
sum1>>=16;sum2>>=16; // and now the top halves
output|=(((sum1&0xFFFF)/5)&0xFF)<<16;
output|=(((sum2&0xFFFF)/5)&0xFF)<<24;
I don't think you could really divide sum1/sum2 by 5, because the bits from the top half would spill down...
If an approximation would be valid, you could try a multiplication by something like, 0.1875 (0.125+0.0625), (this means: multiply by 3 and shift down by 4 places. This you could do with bitmasking and care.)
The problem is, 0.2 has a crappy binary representation, so multiplying by it is an ass.
As ever, accuracy or speed. Your choice.
When using x86 machines with at least SSE, and if you need to approximate only, you could use the assembly instruction PAVGB (Packed Average Byte), which averages bytes. See http://www.tommesani.com/SSEPrimer.html for explanation.
Since you've got 5 values, you would need to be creative in calling PAVGB, since PAVGB will only do two values at a time.
I found smart solution of your problem, sadly it is only applicable if number of colors is power of 2. I'll show it in case of two colors:
mask = 01010101
pom = ~(a^b & mask) # ^ means xor here, ~ negation
a = a & pom
b = b & pom
avg = (a+b) >> 1
The trick of this method is — when you count average, LSB of sum (in case of two numbers) has no meaning, as it will be dropped in division (we're talking integers here, of course). In your problem, LSB of partial sums is at the same moment carry bit of sum of adjacent color. Provided, that LSB of every color sum will be 0 you can safely add those two integers — additions won't interfere with each other. Bit shift divides every color by two.
This method can be used with 4 colors as well, but you have to implement finding out the carry flag of sum of numbers made of two last bits of every color. It is also possible to omit this part and just zero last two bits of every color — biggest mistake made with this omission is 1 for every component.
EDIT I'll leave this attempt for posterity, but please note that it is incorrect and will not work.
One "clever" way you could do it would be to insert zeros between the components, parse into an unsigned long, average the numbers, convert back to a hex string, remove the zeros and finally parse into an unsigned int.
i.e. convert #AARRGGBB to #AA00RR00GG00BB
This method involves parsing and string manipulations, so will undoubtedly be slower than the method you proposed.
If you were to factor your own solution carefully, it might actually look quite clever itself.

Resources