Why was this exponent calculated this way in this example? - algorithm

Number: 0.110111₂ × 2^–3 (in this example the leading bit is included in the mantissa)
where 8 bits are used for the characteristic, and the exponent bias is
2^7 – 1
Their solution:
The sign bit is 0. The characteristic is –3 + 2^7 – 1, represented as an 8-bit binary number.
The simplest way to calculate the characteristic here is to find the 7-bit 2's complement
of the binary representation of 4 (= 3 + 1), and adjoin a leading zero:
Binary representation of 4: 0000100
2’s complement: 1111100
Characteristic: 0111 1100
Why: my solution was to take the 8-bit complement instead of the 7-bit one,
1111 1100, then add it to the 8-bit representation of 128, 1000 0000,
which gave me 1 0111 1100; ignoring the ninth column I got the same answer,
but I did not follow the author's approach.
Your explanation is highly appreciated
Thanks

The idea behind the original approach is to rewrite the expression
–3 + 2^7 – 1
as
2^7 - 4
The lower seven bits of this expression are the 7-bit two's complement of 4 (i.e. the representation of -4 in 7 bits). Since the number is obviously in the range 0-127, the eighth bit must be zero.
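For what it's worth, here is a small Python sketch (my own check, not from the book; variable names are just illustrative) that verifies both routes give the same 8-bit characteristic for this example:

# exponent = -3, 8-bit characteristic, bias = 2**7 - 1 (values from the example)
exponent = -3
bias = 2**7 - 1                            # 127
characteristic = exponent + bias           # route 1: just add the bias -> 124
trick = (2**7 - 4) & 0x7F                  # route 2: lower 7 bits of 2**7 - 4, i.e. the 7-bit two's complement of 4
assert characteristic == trick == 0b01111100
print(format(characteristic, '08b'))       # 01111100 (leading zero adjoined)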

Related

Checking if an address is cache-line aligned

This is a quiz question which I failed in the past, and despite having access to the solution, I don't understand the steps that lead to the correct answer.
Here is the problem:
Which of these addresses is cache-line aligned?
a. 0x7ffc32a21164
b. 0x560c40e05350
c. 0x560c40e052c0
d. 0x560c3f2d71ff
And the solution to the problem:
Each hex char is represented by 4 bits
It takes 6 bits to represent 64 addresses, since ln(64)/ln(2) = 6
0x0 -> 0000
0x4 -> 0100
0x8 -> 1000
0xc -> 1100
       ____
bit weights: 2^3=8, 2^2=4, 2^1=2, 2^0=1
Conclusion: if the address ends in either 00, 40, 80 or c0, then it is aligned on 64 bytes.
The answer is c.
I really don't see how we get from the 6-bit representation to this answer. Can anyone add something to the given solution to make it clearer?
The question boils down to: Which number is a multiple of 64? All that remains is understanding the number system they're using.
In binary, 64 is written as 1000000. In hexadecimal, it's written as 0x40. So multiples of 64 will end in 0x00 (0 * 64), 0x40 (1 * 64), 0x80 (2 * 64), or 0xC0 (3 * 64). (The cycle then repeats.) Answer c is the one with the right ending.
An analogy in decimal would be: Which number is a multiple of 5? 0 * 5 is 0 and 1 * 5 is 5, after which the cycle repeats. So we just need to look at the last digit. If it's a 0 or a 5, we know the number is a multiple of 5.
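As a quick sanity check, here is a small Python sketch (mine, assuming the 64-byte cache line from the question) that tests the four candidate addresses:

CACHE_LINE = 64                       # bytes; 64 = 2**6, so the low 6 bits must be 0
addresses = {
    "a": 0x7ffc32a21164,
    "b": 0x560c40e05350,
    "c": 0x560c40e052c0,
    "d": 0x560c3f2d71ff,
}
for label, addr in addresses.items():
    aligned = (addr & (CACHE_LINE - 1)) == 0      # same test as addr % 64 == 0
    print(label, hex(addr), "aligned" if aligned else "not aligned")
# only c ends in 0xc0 (= 3 * 64), so only c is cache-line aligned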

How can I subtract 253 from 175 (175-253) through the 2's complement method?

For 2's complement, this is the subtraction process as done by the computer:
176-253=176+(-253)
176=10110000
253=11111101
253(inverse)=00000010
253(complement)=00000010+1=00000011
-253=253(complement)=00000011
176+(-253)=10110000+00000011=10110011=179?
but in fact 176-253=-77
Can anybody tell me what's wrong here?
With 8 bits you can only represent numbers from -128 to 127 inclusive in 2's complement. Both your numbers lie outside that range. You would need at least nine bits to do the calculation you want to do.
In 2's complement the most significant bit (MSB, the first bit from the left), indicates the sign, 1 for negative numbers and 0 for non-negative numbers. The value:
00000011
is not -253, but is 3.
Doing your calculation in 9 bits yields:
176 = 010110000
253 = 011111101
253(inverse) = 100000010
253(complement) = 100000010+1=100000011
-253 = 253(complement) = 100000011
176+(-253) = 010110000 + 100000011 = 110110011 = -77
Note that all the negative numbers have MSB=1 and all the non-negative numbers have MSB=0.
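To make the 9-bit version concrete, here is a small Python sketch (mine, using the same 9-bit width as the answer above; helper names are just illustrative):

BITS = 9
MASK = (1 << BITS) - 1                      # 0b111111111

def to_twos(value):                         # bit pattern of value in 9-bit two's complement
    return value & MASK

def from_twos(pattern):                     # interpret a 9-bit pattern as a signed number
    return pattern - (1 << BITS) if pattern & (1 << (BITS - 1)) else pattern

neg_253 = to_twos(-253)                     # invert the 9 bits of 253 and add 1
result = (176 + neg_253) & MASK             # 9-bit addition, carry out discarded
print(format(neg_253, '09b'))               # 100000011
print(format(result, '09b'))                # 110110011
print(from_twos(result))                    # -77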

An efficient algorithm to compute the number of '1' bits in a long decimal integer that is represented as a string?

I came across this interesting question today. (Note that this is not for my homework or interview, etc.)
Given a decimal number represented as a string, we want to compute the number of '1' bits of that large number in binary format. Here the string can have thousands of characters, and the number cannot be represented with one int or long long variable.
For example, countBits("10") = 2 as '10' in decimal can be represented as '1010' in binary format. Similarly, we have countBits("12") = 2, countBits("7") = 3
What is an efficient algorithm for this? One possible solution is to convert the decimal string to another string in binary format, and count the '1's. Can we do better?
When converting from a decimal representation to an integer, the nth digit from the end of the string represents the number of 10^n (ten to the power of n) that is added to the total integer value. If you then want to represent that integer in binary, you have to raise 10₁₀, which is 1010₂, to that power and multiply that value by the digit's value.
Because one of the factors of the base you are translating from, 5, is relatively prime to 2, the powers of 10₁₀ have increasingly long representations in base 2: 1₂, 1010₂, 1100100₂, 1111101000₂.
Note that these powers have trailing zeros (10 = 2 × 5, and 2 is not relatively prime to the base we are translating into), so they will only affect 1, 3, 5, and 7 bits of the answer instead of all 1, 4, 7, 10 bits. But the number of bits they affect still grows as O(N), where N is the length of the input, so calculating the affected bits will take O(N^2) operations.
If the base you were translating from had no factors relatively prime to the base you are translating to (say, translating base 16 to base 2, or base 9 to base 3 and counting non-zero digits), then there would be an O(N) algorithm, as the sum of non-zero digits in the target base would equal the sum for each digit of the input translated individually. But since that is not the case, you are stuck with an O(N^2) algorithm where you translate the decimal representation into binary and then count the bits in that binary representation.
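For reference, a minimal Python sketch of that convert-then-count approach (leaning on Python's built-in big integers, so the expensive conversion is hidden inside int()):

def count_bits(decimal_string):
    value = int(decimal_string)          # parse the decimal string into a big integer
    return bin(value).count('1')         # count the '1' digits of its binary form

assert count_bits("10") == 2             # 10 -> 1010
assert count_bits("12") == 2             # 12 -> 1100
assert count_bits("7") == 3              # 7  -> 111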
You convert it to binary and use the Hamming weight algorithm.
How does it work? Suppose you have the number 8, which is 00001000.
The algorithm takes chunks of 2 bits, so it'll have 00 00 10 00.
Now it sums each pair of bits (by masking with 10101010, shifting and adding), which results in: 00 00 01 00.
Now it does the same for each group of 4 bits (with a mask 00110011...), so it has 0000 0100; after adding the two halves of each nibble, you'll have 0000 0001.
The last stage is adding the two numbers, 0 + 1, which is 1 and that's the final result.
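A small Python sketch of those 8-bit steps (my own illustration of the idea; real implementations work on wider words and often fold with multiplies):

def hamming_weight_8(x):
    x = (x & 0b01010101) + ((x >> 1) & 0b01010101)   # sum each pair of bits
    x = (x & 0b00110011) + ((x >> 2) & 0b00110011)   # sum the halves of each nibble
    x = (x & 0b00001111) + ((x >> 4) & 0b00001111)   # sum the two nibbles
    return x

assert hamming_weight_8(0b00001000) == 1             # the example: 8 has one '1' bit
assert hamming_weight_8(0b11111111) == 8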

Huffman code tables

I don't understand what the Huffman tables of JPEG contain; could someone explain this to me?
Thanks
Huffman encoding is a variable-length data compression method. It works by assigning the most frequent values in an input stream to the encodings with the smallest bit lengths.
For example, the input Seems every eel eeks elegantly. may encode the letter e as binary 1 and all other letters as various other longer codes, all starting with 0. That way, the resultant bit stream would be smaller than if every letter was a fixed size. By way of example, let's examine the quantities of each character and construct a tree that puts the common ones at the top.
Letter                          Count
------                          -----
e                               10
<SPC>                           4
l                               3
s, y                            2 (each)
S, m, v, r, k, g, a, n, t, .    1 (each)
<EOF>                           1
The end of file marker EOF is there since you generally have to have a multiple of eight bits in your file. It's to stop any padding at the end from being treated as a real character.
One possible tree (each node is labelled with the code prefix reached so far; a left branch appends a 0, a right branch a 1):
#
├── 0
│   ├── 00
│   │   ├── 000
│   │   │   ├── 0000
│   │   │   │   ├── 00000  y
│   │   │   │   └── 00001  S
│   │   │   └── 0001
│   │   │       ├── 00010  m
│   │   │       └── 00011  v
│   │   └── 001
│   │       ├── 0010
│   │       │   ├── 00100
│   │       │   │   ├── 001000  r
│   │       │   │   └── 001001  .
│   │       │   └── 00101  k
│   │       └── 0011
│   │           ├── 00110  g
│   │           └── 00111
│   │               ├── 001110  a
│   │               └── 001111  <EOF>
│   └── 01
│       ├── 010
│       │   ├── 0100
│       │   │   ├── 01000  n
│       │   │   └── 01001  t
│       │   └── 0101  <SPC>
│       └── 011
│           ├── 0110  l
│           └── 0111  s
└── 1  e
Now this isn't necessarily the most efficient tree but it's enough to establish how the encodings are done. Let's first look at the uncompressed data. Assuming an eight-bit encoding, those thirty-one characters (we don't need the EOF for the uncompressed data) are going to take up 248 bits.
But, if you use the tree above to locate the characters, outputting a zero bit if you take the left sub-tree and a one bit if you take the right, you get the following:
Section Encoding
---------- --------
Seems<SPC> 00001 1 1 00010 0111 0101 (20 bits)
every<SPC> 1 00011 1 001000 00000 0101 (22 bits)
eel<SPC> 1 1 0110 0101 (10 bits)
eeks<SPC> 1 1 00101 0111 0101 (15 bits)
elegantly 1 0110 1 00110 001110 01000 01001 0110 00000 (36 bits)
.<EOF> 001001 001111 (12 bits)
That gives a grand total of 115 bits, rounded up to 120 since it needs to be a multiple of a byte, but that's still about half the size of the uncompressed data.
Now that's usually not worth it for a small file like this, since you have to add the space taken up by the actual tree itself(a), otherwise you cannot decode it at the other end. But certainly, for larger files where the distribution of characters isn't even, it can lead to impressive savings in space.
So, after all that, the Huffman tables in a JPEG are simply the tables that allow you to uncompress the stream into usable information.
The encoding process for JPEG consists of a few different steps (color conversion, chroma resolution reduction, block-based discrete cosine transforms, and so on) but the final step is a lossless Huffman encoding on each block which is what those tables are used to reverse when reading the image.
(a) Probably the best case for minimal storage of this table would be something like:
Size of length section (8-bits) = 3 (longest bit length of 6 takes 3 bits)
Repeated for each byte:
Actual length (3 bits, holding value between 1..6 inclusive)
Encoding (n bits, where n is the actual length)
Byte (8 bits)
End of table marker (3 bits) = 0 to distinguish from actual length above
For the text above, that would be:
00000011 8 bits
n bits byte
--- ------ -----
001 1 'e' 12 bits
100 0101 <SPC> 15 bits
101 00001 'S' 16 bits
101 00010 'm' 16 bits
100 0111 's' 15 bits
101 00011 'v' 16 bits
110 001000 'r' 17 bits
101 00000 'y' 16 bits
101 00101 'k' 16 bits
100 0110 'l' 15 bits
101 00110 'g' 16 bits
110 001110 'a' 17 bits
101 01000 'n' 16 bits
101 01001 't' 16 bits
110 001001 '.' 17 bits
110 001111 <EOF> 17 bits
000 3 bits
That makes the table 264 bits which totally wipes out the savings from compression. However, as stated, the impact of the table becomes far less as the input file becomes larger and there's a way to avoid the table altogether.
That way involves the use of another variant of Huffman, called Adaptive Huffman. This is where the table isn't actually stored in the compressed data.
Instead, during compression, the table starts with just EOF and a special bit sequence meant to introduce a new real byte into the table.
When introducing a new byte into the table, you would output the introducer bit sequence followed by the full eight bits of that byte.
Then, after each byte is output and the counts updated, the table/tree is rebalanced based on the new counts so that it stays as space-efficient as possible. (The rebalancing may be deferred to improve speed; you just have to ensure the same deferral happens during decompression. One example: rebalance every time you add a byte for the first 1K of input, then every 10K of input after that, assuming you've added new bytes since the last rebalance.)
This means that the table itself can be built in exactly the same way at the other end (decompression), starting with the same minimal table with just the EOF and introducer sequence.
During decompression, when you see the introducer sequence, you can add the byte following it (the next eight bits) to the table with a count of zero, output the byte, then adjust the count and re-balance (or defer as previously mentioned).
That way, you do not have to have the table shipped with the compressed file. This, of course, costs a little more time during compression and decompression in that you're periodically rebalancing the table but, as with most things in life, it's a trade-off.
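If it helps, here is a small Python sketch (my own, not the exact tree drawn above; ties may break differently) of how a code table like this can be built from the character counts with a priority queue:

import heapq
from collections import Counter

def huffman_codes(text):
    counts = Counter(text)
    # each heap entry: (count, tie-breaker, {symbol: code-so-far})
    heap = [(n, i, {ch: ""}) for i, (ch, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)                     # two least frequent subtrees
        n2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}        # left branch gets a 0
        merged.update({s: "1" + c for s, c in right.items()}) # right branch gets a 1
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

codes = huffman_codes("Seems every eel eeks elegantly.")
print(codes["e"])    # the most frequent letter gets the shortest code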
The DHT marker doesn't directly specify which symbol is associated with which code. It contains a vector with counts of how many codes there are of each given length. After that it contains a vector with the symbol values.
So when you want to decode, you have to generate the Huffman codes from the first vector and then associate every code with a symbol from the second vector.
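As an illustration (my own sketch with made-up data, simplified and ignoring details such as JPEG's reserved all-ones codes), this is roughly how those two vectors turn into a code-to-symbol map:

def build_dht_table(counts_per_length, symbols):
    # counts_per_length[i] = number of codes of length i+1 (JPEG uses lengths 1..16)
    table = {}
    code = 0
    index = 0
    for length, count in enumerate(counts_per_length, start=1):
        for _ in range(count):
            table[format(code, '0{}b'.format(length))] = symbols[index]
            index += 1
            code += 1
        code <<= 1                          # codes of the next length continue from here
    return table

# made-up example: two codes of length 2, one code of length 3
print(build_dht_table([0, 2, 1], [0x03, 0x02, 0x11]))
# {'00': 3, '01': 2, '100': 17}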

Query about working out whether a number is a power of 2

Using the classic code snippet:
if (x & (x-1)) == 0
If the answer is 1, then it is false and not a power of 2. However, working on 5 (not a power of 2) and 4 results in:
0001 1111
0001 1111
0000 1111
That's 4 1s.
Working on 8 and 7:
1111 1111
0111 1111
0111 1111
The 0 is first, but we have 4.
In this link (http://www.exploringbinary.com/ten-ways-to-check-if-an-integer-is-a-power-of-two-in-c/) for both cases, the answer starts with 0 and there is a variable number of 0s/1s. How does this answer whether the number is a power of 2?
You need to refresh yourself on how binary works. 5 is not represented as 0001 1111 (5 bits on); it's represented as 0000 0101 (2^2 + 2^0), and 4 is likewise not 0000 1111 (4 bits on) but rather 0000 0100 (2^2). The numbers you wrote are actually in unary.
Wikipedia, as usual, has a pretty thorough overview.
Any power-of-two number can be represented in binary with a single 1 and multiple 0s.
eg.
10000(16)
1000(8)
100(4)
If you subtract 1 from any power-of-two number, you will get all 1s to the right of where the original 1 was.
10000(16) - 1 = 01111(15)
ANDing these two numbers will give you 0 every time.
In the case of a number that is not a power of two, subtracting one will leave at least one "1" unchanged somewhere in the number, like:
10010(18) - 1 = 10001(17)
ANDing these two will result in
10000(16) != 0
Keep in mind that if x is a power of 2, there is exactly 1 bit set. Subtract 1, and you know two things: the resulting value is not a power of two, and the bit that was set is no longer set. So, when you do a bitwise AND (&), the bit that was set in x is not set in (x-1), and all the bits that are set in (x-1) are matched against bits that are not set in x. So the AND of each bit pair is always 0.
In other words, whenever x is a power of two, you are guaranteed that (x & (x-1)) is zero.
((n & (n-1)) == 0)
It checks whether the value of “n” is a power of 2.
Example:
if n = 8, the bit representation is 1000
n & (n-1) = (1000) & ( 0111) = (0000)
So it returns zero only if the value is a power of 2.
The only exception to this is '0':
0 & (0-1) = 0, but '0' is not a power of two.
Why does this make sense?
Imagine what happens when you subtract 1 from a string of bits. You read from right to left,
turning each 0 into a 1 until you hit a 1, at which point that bit is flipped to 0:
1000100100 -> (subtract 1) -> 1000100011
Thus, every bit, up through the first 1, is flipped. If there’s exactly one 1 in the number, then every bit (other than the leading zeros) will be flipped. Thus, n & (n-1) == 0 if there’s exactly one 1. If there’s exactly one 1, then it must be a power of two.
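Putting it together, a tiny Python sketch of the test (with the zero corner case handled explicitly):

def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0

for n in (0, 1, 4, 5, 7, 8, 16, 18):
    print(n, is_power_of_two(n))
# e.g. 5: 0101 & 0100 = 0100 != 0  -> not a power of two
#      8: 1000 & 0111 = 0000       -> a power of two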

Resources