Least significant bit first - Ruby

While working in Ruby I came across:
> "cc".unpack('b8B8')
=> ["11000110", "01100011"]
Then I tried Googling to find a good answer on "least significant bit", but could not find any.
Anyone care to explain, or point me in the right direction where I can understand the difference between "LSB first" and "MSB first"?

It has to do with the direction of the bits. Notice that in this example it's unpacking two ASCII "c" characters, and yet the bits are mirror images of each other. LSB means the rightmost (least significant) bit is the first bit. MSB means the leftmost (most significant) bit is the first bit.
As a simple example, consider the number 5, which in "normal" (readable) binary looks like this:
00000101
The least significant bit is the rightmost 1, because that is the 2^0 position (or just plain 1). It doesn't impact the value much. The bit next to it is the 2^1 position (holding a plain 0 in this case), which is a bit more significant. The bit to its left (the 2^2 position, or just plain 4) is more significant still. So we say this is MSB notation because the most significant bit (2^7) comes first. If we change it to LSB, it simply becomes:
10100000
Easy right?
(And yes, for all you hardware gurus out there I'm aware that this changes from one architecture to another depending on endianness, but this is a simple answer for a simple question)
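If it helps to see this in code, here is a small Go sketch (Go comes up again further down this page) that prints the same byte both ways; math/bits.Reverse8 mirrors the bit order of a byte, which is the relationship between Ruby's "B8" and "b8" directives above.

package main

import (
	"fmt"
	"math/bits"
)

func main() {
	c := byte('c') // ASCII 'c' is 0x63

	// MSB first: the usual written form, like Ruby's "B8" directive.
	fmt.Printf("MSB first: %08b\n", c) // 01100011

	// LSB first: the same bits read from the least significant end,
	// like Ruby's "b8" directive. bits.Reverse8 mirrors the bit order.
	fmt.Printf("LSB first: %08b\n", bits.Reverse8(c)) // 11000110
}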

The term "significance" of a bit or byte only makes sense in the context of interpreting a sequence of bits or bytes as an integer. The bigger the impact of the bit or byte on the value of the resulting integer - the higher its significance. The more "significant" it is to the value.
So, for example, when we talk about a sequence of four bytes having the least significant byte first (aka little-endian), what we mean is that when we interpret those four bytes as a 32-bit integer, the first byte denotes the lowest eight binary digits of the integer, the second byte denotes the 9th through 16th binary digits, the third denotes the 17th through 24th, and the last byte denotes the highest eight bits of the integer.
Likewise, if we say a sequence of 8 bits is in most significant bit first order, what we mean is that if we interpret the 8 bits as an 8-bit integer, the first bit in the sequence denotes the highest binary digit of the integer, the second bit the second highest, and so on, until the last bit denotes the lowest binary digit of the integer.
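As a concrete sketch of the byte case (in Go, with arbitrary example bytes), here is the same four-byte sequence interpreted under both conventions described above:

package main

import "fmt"

func main() {
	// Four bytes as they sit in memory, first to last.
	b := []byte{0x78, 0x56, 0x34, 0x12}

	// Little-endian: the first byte supplies the lowest eight bits.
	le := uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24

	// Big-endian: the first byte supplies the highest eight bits.
	be := uint32(b[0])<<24 | uint32(b[1])<<16 | uint32(b[2])<<8 | uint32(b[3])

	fmt.Printf("little-endian reading: 0x%08X\n", le) // 0x12345678
	fmt.Printf("big-endian reading:    0x%08X\n", be) // 0x78563412
}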
Another thing to think about is that the usual decimal notation follows a most-significant-digit-first convention. For example, a decimal number like:
1250
is read to mean:
1 x 1000 +
2 x 100 +
5 x 10 +
0 x 1
Right? Now imagine a different convention that is least significant digit first. The same number would be written:
0521
and would be read as:
0 x 1 +
5 x 10 +
2 x 100 +
1 x 1000
Another thing you should observe in passing is that in the C family of languages (and most modern programming languages), the shift-left operator (<<) and shift-right operator (>>) are named with the most significant bit in mind. That is, shifting a bit left increases its significance and shifting it right decreases it, so "left" is the most significant side (and the left side is usually what we mean by "first", at least in the West).
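A short Go sketch of the same observation (Go's shift operators behave like C's for unsigned values):

package main

import "fmt"

func main() {
	x := uint8(5) // 00000101

	// Shifting left moves each bit toward the most significant end,
	// doubling the value per position.
	fmt.Printf("%08b = %d\n", x<<1, x<<1) // 00001010 = 10

	// Shifting right moves each bit toward the least significant end,
	// halving the value (and discarding the lowest bit).
	fmt.Printf("%08b = %d\n", x>>1, x>>1) // 00000010 = 2
}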

Related

What is the output of the '%b' verb when given a floating-point number

According to the Go docs, %b used with a floating-point number means:
decimalless scientific notation with exponent a power of two,
in the manner of strconv.FormatFloat with the 'b' format,
e.g. -123456p-78
As the code below shows, the program's output is
8444249301319680p-51
I'm a little confused about %b with floating-point numbers. Can anybody tell me how this result is calculated? Also, what does the p- mean?
f := 3.75
fmt.Printf("%b\n", f)
fmt.Println(strconv.FormatFloat(f, 'b', -1, 64))
Decimalless scientific notation with an exponent that is a power of two means the following:
8444249301319680*(2^-51) = 3.75 or 8444249301319680/(2^51) = 3.75
p-51 means 2^-51 which can also be calculated as 1/(2^51)
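To check that arithmetic yourself, here is a runnable variant of the snippet above; math.Ldexp simply multiplies its first argument by two raised to its second, so it reassembles the value from the two pieces that %b prints.

package main

import (
	"fmt"
	"math"
	"strconv"
)

func main() {
	f := 3.75

	// Both of these print the 53-bit significand and a power-of-two exponent.
	fmt.Printf("%b\n", f)                            // 8444249301319680p-51
	fmt.Println(strconv.FormatFloat(f, 'b', -1, 64)) // 8444249301319680p-51

	// Reassemble the value: 8444249301319680 * 2^-51.
	fmt.Println(math.Ldexp(8444249301319680, -51)) // 3.75
}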
Nice article on Floating-Point Arithmetic.
https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
The five rules of scientific notation are given below:
The base is always 10
The exponent must be a non-zero integer, which means it can be either positive or negative
The absolute value of the coefficient is greater than or equal to 1 but it should be less than 10
The coefficient carries the sign (+) or (-)
The mantissa carries the rest of the significant digits
The p marks the power-of-two exponent:
%b scientific notation with an exponent that is a power of two (hence the p)
%e scientific notation
It is worth pointing out that the %b output is particularly easy for the runtime system to generate as well, due to the internal storage format for floating point numbers.
If we ignore "denormalized" floating point numbers (we can add them back later), a floating point number is stored, internally, as 1.bbbbbb...bbb x 2^exp for some set of bits ("b" here), e.g., the value four is stored as 1.000...000 <exp> 2. The value six is stored as 1.100...000 <exp> 2, the value seven is stored as 1.110...000 <exp> 2, and eight is stored as 1.000...000 <exp> 3. The value seven-and-a-half is 1.111 <exp> 2, seven and three quarters is 1.1111 <exp> 2, and so on. Each bit here, in the 1.bbbb, represents the next power of two lower than the exponent.
To print out 1.111 <exp> 2 with the %b format, we simply note that we need four 1 bits in a row, i.e., the value 15 decimal or 0xf or 1111 binary, which causes the exponent to need to be decreased by 3, so that instead of multiplying by 2^2 or 4, we want to multiply by 2^-1 or ½. So we can take the actual exponent (2), subtract 3 (because we moved the "point" three times to print 1111 binary or 15), and hence print out the string 15p-1.
That's not what Go's %b prints though: it prints 8444249301319680p-50. This is the same value (so either one would be correct output)—but why?
Well, 8444249301319680 is, in hexadecimal, 1E000000000000. Expanded into full binary, this is 1 1110 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000. That's 53 binary digits. Why 53 binary digits, when four would suffice?
The answer to that is found in the link in Nick's answer: IEEE 754 floating point format uses a 53-digit "mantissa" or "significand" (the latter is the better term and the one I usually try to use, but you'll see the former pop up very often). That is, the 1.bbb...bbb has 52 bs, plus that forced-in leading 1. So there are always exactly 53 binary digits (for IEEE "double precision").
If we just treat this 53-binary-digit significand as an integer and print it in decimal, we never need a binary point at all. That means we just adjust the power-of-two exponent instead.
In IEEE754 format, the exponent itself is already stored in "excess form", with 1023 added (for double precision again). That means that 1.111000...000 <exp> 2 is actually stored with an exponent value of 2+1023 = 1025. What this means is that to get the actual power of two, the machine code formatting the number is already going to have to subtract 1023. We can just have it subtract 52 more at the same time.
Last, because the implied 1 is always there, the internal IEEE754 number doesn't actually store the 1 bit. So to read out the value and convert it, the code internally does:
decimalPart := machineDependentReinterpretation1(&doubleprec_value)
expPart := machineDependentReinterpretation2(&doubleprec_value)
where the machine-dependent-reinterpretation simply extracts the correct bits, puts in the implied 1 bit as needed in the decimal part, subtracts the offset (1023+52) for the exponent part, and then does:
fmt.Sprintf("%dp%d", decimalPart, expPart)
When printing a floating-point number in decimal, the base conversion (from base 2 to base 10) is problematic, requiring a lot of code to get the rounding right. Printing it in binary like this is much easier.
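Here is a rough Go sketch of that reinterpretation, using math.Float64bits rather than machine-dependent code; formatB is a made-up name, and the sketch only handles positive, non-zero, normalized values.

package main

import (
	"fmt"
	"math"
)

// formatB rebuilds the %b form of a positive, normalized float64 from its
// raw IEEE 754 bits, following the steps described above.
func formatB(f float64) string {
	raw := math.Float64bits(f)

	frac := raw & ((1 << 52) - 1)   // the 52 stored significand bits
	exp := int((raw >> 52) & 0x7FF) // the 11-bit exponent, stored with 1023 added

	frac |= 1 << 52  // put back the implied leading 1 (the 2^52 bit)
	exp -= 1023 + 52 // undo the excess-1023 bias, then 52 more for the moved point

	return fmt.Sprintf("%dp%d", frac, exp)
}

func main() {
	fmt.Println(formatB(3.75))       // 8444249301319680p-51
	fmt.Println(formatB(7.5))        // 8444249301319680p-50
	fmt.Printf("%b %b\n", 3.75, 7.5) // the standard verb prints the same strings
}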
Exercises for the reader, to help with understanding this:
Compute 1.10 in binary times 2^2. Note: 1.1 in binary is 1½ decimal.
Compute 11.0 in binary times 2^1. (11.0 in binary is 3.)
Based on the above, what happens as you "slide the binary point" left and right?
(more difficult) Why can we assume a leading 1? If necessary, read on.
Why can we assume a leading 1?
Let's first note that when we use scientific notation in decimal, we can't assume a leading 1. A number might be 1.7 x 10^3, or 5.1 x 10^5, or whatever. But when we use scientific notation "correctly", the first digit is never zero. That is, we do not write 0.3 x 10^0 but rather 3.0 x 10^-1. In this kind of notation, the number of digits tells us about the precision, and the first digit never has to be zero and generally isn't supposed to be zero. If the first digit were zero, we just move the decimal point and adjust the exponent (see exercises 1 and 2 above).
The same rules apply with floating-point numbers. Instead of storing 0.01, for instance, we just slide the binary point over two positions and get 1.00, and decrease the exponent by 2. If we want to store 11.1, we slide the binary point one position the other way and increase the exponent. Whenever we do this, the first digit always winds up being a one.
There is one big exception here, which is: when we do this, we can't store zero! So we don't do this for the number 0.0. In IEEE754, we store 0.0 as all-zero-bits (except for the sign, which we can set to store -0.0). This has an all-zero exponent, which the computer hardware handles as a special case.
Denormalized numbers: when we can't assume a leading 1
This system has one notable flaw (which isn't entirely fixed by denorms, but nonetheless, IEEE has denorms). That is: the smallest number we can store "abruptly underflows" to zero. Kahan has a 15 page "brief tutorial" on gradual underflow, which I am not going to attempt to summarize, but when we hit the minimum allowed normalized exponent (2^-1022) and want to "get smaller", IEEE lets us stop using these "normalized" numbers with the leading 1 bit.
This doesn't affect the way that Go itself formats floating point numbers, because Go just takes the entire significand "as is". All we have to do is stop inserting the implied-1 bit (the 2^52 bit) when the input value is a denormalized number, and everything else Just Works. We can hide this magic inside the machine-dependent float64 reinterpretation code, or do it explicitly in Go, whichever is more convenient.

How can one byte be more significant than another?

The difference between little endian and big endian was explained to me like this: "In little endian the least significant byte goes into the low address, and in big endian the most significant byte goes into the low address". What exactly makes one byte more significant than the other?
In the number 222, you could regard the first 2 as most significant because it represents the value 200; the second 2 is less significant because it represents the value 20; and the third 2 is the least significant digit because it represents the value 2.
So, although the digits are the same, the magnitude of the number they represent is used to determine the significance of a digit.
It is the same as when a value is rounded to a number of significant figures ("S.F." or "SF"): 123.321 to 3SF is 123, to 2SF it is 120, to 4SF it is 123.3. That terminology has been used since before electronics were invented.
In any positional numeric system, each digit has a different weight in creating the overall value of the number.
Consider the number 51354 (in base 10): the first 5 is more significant than the second 5, as it stands for 5 multiplied by 10000, while the second 5 is just 5 multiplied by 10.
In computers, numbers are generally fixed-width: for example, a 16-bit unsigned integer can be thought of as a sequence of exactly 16 binary digits, with the leftmost one being unambiguously the most significant (it is worth exactly 32768, more than any other bit in the number), and the rightmost the least significant (it is worth just one).
As long as integers are in the CPU registers we don't really need to care about their representation - the CPU will happily perform operations on them as required. But when they are saved to memory (which generally is some random-access byte store), they need to be represented as bytes in some way.
If we consider "normal" computers, representing a number (bigger than one byte) as a sequence of bytes means essentially representing it in base 256, each byte being a digit in base 256, and each base-256 digit being more or less significant.
Let's see an example: take the value 54321 as a 16 bit integer. If you write it in base 256, it'll be two base-256 digits: the digit 0xD4 (which is worth 0xD4 multiplied by 256) and the digit 0x31 (which is worth 0x31 multiplied by 1). It's clear that the first one is more significant than the second one, as indeed the leftmost "digit" is worth 256 times more than the one at its right.
Now, little endian machines will write in memory the least significant digit first, big endian ones will do the opposite.
Incidentally, there's a nice relationship between binary, hexadecimal and base-256: 4 bits are mapped straight to a hexadecimal digit, and 2 hexadecimal digits are mapped straight to a byte. For this reason you can also see that 54321 is in binary
1101010000110001 = 0xD431
can be split straight into two groups of 8 bits
11010100 00110001
which are the 0xD4 and 0x31 above. So you can see as well that the most significant byte is the one that contains the most significant bits.
Here I'm using the corresponding hexadecimal values to represent each base-256 digit, as there's no good way to represent them symbolically. I could use their ASCII character values, but 0xD4 is outside ASCII, and 0x31 is the character '1', which would only add confusion.
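A short Go sketch of the same example, letting encoding/binary write the two base-256 digits of 54321 in each order:

package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	const n uint16 = 54321 // 0xD431

	le := make([]byte, 2)
	be := make([]byte, 2)
	binary.LittleEndian.PutUint16(le, n) // least significant base-256 digit first
	binary.BigEndian.PutUint16(be, n)    // most significant base-256 digit first

	fmt.Printf("little-endian bytes: % X\n", le) // 31 D4
	fmt.Printf("big-endian bytes:    % X\n", be) // D4 31
}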

Two ways to represent 0 with bits

Let's say we want to represent a signed number with 5 bits where the first bit is used for the sign (+ or -) of the number. Then the zero can be represented by two bit representations (10000 and 00000).
How is this problem solved?
Okay. A binary digit is always one of two values, 1 or 0, and a value can be any number of bits wide, for example 1 bit up to 64 bits.
If the question is about a 5-bit string, then it has the form XXXXX, where each X can be any bit (1 or 0).
With the first bit as the sign bit we can have either +0 or -0 (thanks #machinery). So if the number is positive, we put 0 in the first position, and if it is negative, we put 1 in the first position.
Four bits
Now that we have our first bit, we are left with another 4 bits: 0XXXX or 1XXXX. As the question asked for 0, the rest of the bits will be zero; therefore the answer is 00000 or 10000.
Look up how to convert decimal to binary and binary to decimal.
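For illustration only, here is a small Go sketch that decodes a 5-bit sign-and-magnitude pattern as described above (signMagnitude5 is a made-up helper), showing that 00000 and 10000 both come out as zero:

package main

import "fmt"

// signMagnitude5 decodes a 5-bit sign-and-magnitude pattern: the top bit is
// the sign, the low four bits are the magnitude.
func signMagnitude5(b uint8) int {
	magnitude := int(b & 0x0F) // the low four bits
	if b&0x10 != 0 {           // the sign bit
		return -magnitude
	}
	return magnitude
}

func main() {
	fmt.Println(signMagnitude5(0b00000)) // 0  ("+0")
	fmt.Println(signMagnitude5(0b10000)) // 0  ("-0"): a different pattern, same value
	fmt.Println(signMagnitude5(0b10011)) // -3
}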

Fastest method for adding/summing the individual digit components of a number

I saw a question on a math forum a while back where a person was discussing adding up the digits in a number over and over again until a single digit is achieved. (i.e. "362" would become "3+6+2", which would become "11"; then "11" would become "1+1", which would become "2"; therefore "362" would return 2.) I wrote some nice code to get an answer to this and posted it, only to be outdone by a user who suggested that any number modulo 9 is equal to this "infinite digit sum". I checked it and he was right... well, almost right: if zero was returned you had to switch it out with a "9", but that was a very quick fix...
362 = 3+6+2 = 11 = 1+1 = 2
or...
362%9 = 2
Anyway, the mod-9 method works fantastically for infinitely adding the sum of the digits until you are left with just a single digit... but what about only doing it once (i.e. 362 would just return "11")? Can anyone think of fast algorithms?
There's a cool trick for summing the 1 digits in binary, given a fixed-width integer. At each iteration, you separate the digits into two interleaved halves, bit-shift one half down, then add. In the first iteration you separate every other digit, in the second pairs of digits, and so on.
Given that 27 is 00011011 as 8-bit binary, the process is...
00010001 + 00000101 = 00010110 <- every other digit step
00010010 + 00000001 = 00010011 <- pairs of digits
00000011 + 00000001 = 00000100 <- quads, giving final result 4
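In Go that trick looks roughly like this for a single byte (popcount8 is a made-up name; the standard library's math/bits.OnesCount8 does the same job):

package main

import "fmt"

// popcount8 sums the 1 bits of an 8-bit value using the halving trick above:
// every other digit, then pairs of digits, then quads.
func popcount8(x uint8) uint8 {
	x = (x & 0x55) + ((x >> 1) & 0x55) // every other digit
	x = (x & 0x33) + ((x >> 2) & 0x33) // pairs of digits
	x = (x & 0x0F) + ((x >> 4) & 0x0F) // quads
	return x
}

func main() {
	fmt.Println(popcount8(27)) // 00011011 has four 1 bits, so this prints 4
}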
You could do a similar trick with decimal, but it would be less efficient than a simple loop unless you had a direct representation of decimal numbers with fast operations to zero out selected digits and to do digit-shifting. So for 12345678 you get...
02040608 + 01030507 = 03071115 <- every other digit
00070015 + 00030011 = 00100026 <- pairs
00000026 + 00000010 = 00000036 <- quads, final result
So 1+2+3+4+5+6+7+8 = 36, which is correct, but you can only do this efficiently if your number representation is fixed-width decimal. It always takes lg(n) iterations, where n is the number of digits, lg means the base-two logarithm, and you round upwards.
To expand on this a little (based on in-comments discussions), let's pretend this was sane, for a bit...
If you count single-digit additions, there's actually more work than a simple loop here. The idea, as with the bitwise trick for counting bits, is to re-order those additions (using associativity) and then to compute as many as possible in parallel, using a single full-width addition to implement two half-width additions, four quarter-width additions etc. There's significant overhead for the digit-clearing and digit-shifting operations, and even more if you implement this as a loop (calculating or looking up the digit-masking and shift-distance values for each step). The "loop" should probably be fully unrolled and those masks and shift-distances be included as constants in the code to avoid that.
A processor with support for Binary Coded Decimal (BCD) could handle this. Digit masking and digit shifting would be implemented using bit masking and bit shifting, as each decimal digit would be encoded in 4 (or more) bits, independent of the encoding of other digits.
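To make that concrete, here is a Go sketch that pretends the eight digits are packed BCD-style, four bits per decimal digit, in a uint32 (so 12345678 is stored as 0x12345678); digitSumBCD is a made-up name and no real BCD hardware is involved.

package main

import "fmt"

// digitSumBCD sums the eight decimal digits of a value packed as BCD,
// four bits per digit. After the first step each byte holds the sum of a
// pair of digits (at most 18), so later additions never overflow a field.
func digitSumBCD(x uint32) uint32 {
	x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F) // every other digit
	x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF) // pairs of digits
	x = (x & 0x0000FFFF) + (x >> 16)               // quads: the final total
	return x
}

func main() {
	fmt.Println(digitSumBCD(0x12345678)) // 1+2+3+4+5+6+7+8 = 36
}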
One issue is that BCD support is quite rare these days. It used to be fairly common in the 8 bit and 16 bit days, but as far as I'm aware, processors that still support it now do so mainly for backward compatibility. Reasons include...
Very early processors didn't include hardware multiplication and division, so converting between binary and decimal was expensive; hardware support for those operations means it's now easy and efficient to convert binary to decimal. Binary is used for almost everything now, and BCD is mostly forgotten.
There are decimal number representations around in libraries, but few if any high level languages ever provided portable support to hardware BCD, so since assembler stopped being a real-world option for most developers BCD support simply stopped being used.
As numbers get larger, even packed BCD is quite inefficiently packed. Number representations base 10^x have the most important properties of base 10, and are easily decoded as decimal. Base 1000 only needs 10 bits per three digits, not 12, because 2^10 is 1024. That's enough to show you get an extra decimal digit for 32 bits - 9 digits instead of 8 - and you've still got 2 bits left over, e.g. for a sign bit.
The thing is, for this digit-totalling algorithm to be worthwhile at all, you need to be working with fixed-width decimal of probably at least 32 bits (8 digits). That gives 12 operations (6 masks, 3 shifts, 3 additions) rather than 15 additions for the (fully unrolled) simple loop. That's a borderline gain, though - and other issues in the code could easily mean it's actually slower.
The efficiency gain is clearer at 64 bits (16 decimal digits) as there's still only 16 operations (8 masks, 4 shifts, 4 additions) rather than 31, but the odds of finding a processor that supports 64-bit BCD operations seems slim. And even if you did, how often do you need this anyway? It seems unlikely that it could be worth the effort and loss of portability.
Here's something in Haskell:
sumDigits n =
  if n == 0
    then 0
    else let a = mod n 10
         in a + sumDigits (div n 10)
Oh, but I just read you're doing that already...
(then there's also the obvious:
sumDigits n = sum $ map (read . (:[])) . show $ n
)
For short code, try this:
int digit_sum(int n) {
    if (n < 10) return n;
    return n % 10 + digit_sum(n / 10);
}
Or, in words:
- If the number is less than ten, then the digit sum is the number itself.
- Otherwise, the digit sum is the current last digit (a.k.a. n mod 10, or n % 10), plus the digit sum of everything to the left of that digit (n divided by 10, using integer division).
- This algorithm can also be generalized for any base by substituting that base in for 10.
/* Iterative variant: keep replacing n with the sum of its digits
   (using digit_sum above) until only a single digit remains. */
int repeated_digit_sum(int n)
{
    do {
        if (n < 10) return n;
        n = n % 10 + digit_sum(n / 10);
    } while (1);
}

Error detecting code and Hamming distance

The Hamming distance of v and w equals 2, but without parity
bit it would be just 1. Why is this the case?
This would be more appropriately asked in the theoretical computer science section of StackExchange, but since you've been honest and flagged it as homework ...
ASCII uses 7 bits to specify a character. (In ASCII, 'X' is represented by the 7 bits '1011000'.) If you start with any ASCII sequence, the number of bits you need to flip to get to another legitimate ASCII sequence is only 1. Therefore the Hamming distance between plain ASCII sequences is 1.
However, if a parity bit is added (for a total of 8 bits -- the 7 ASCII bits plus one parity bit, conventionally shown in the leftmost position) then any single-bit flip in the sequence will cause the result to have incorrect parity. Following the example, with even parity 'X' is represented by 11011000, because the parity bit is chosen to give an even number of 1s in the sequence. If you now flip any single bit in that sequence then the result will be unacceptable because it will have incorrect parity. In order to arrive at an acceptable new sequence with even parity you must change a minimum of two bits. Therefore when parity is in effect the Hamming distance between acceptable sequences is 2.
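A small Go sketch of that argument (withEvenParity and hammingDistance are made-up helpers): flipping one data bit also flips the parity, so valid codewords are always at least two bit-flips apart.

package main

import (
	"fmt"
	"math/bits"
)

// withEvenParity prepends an even-parity bit to a 7-bit ASCII code,
// producing the 8-bit sequence described above.
func withEvenParity(c byte) byte {
	parity := byte(bits.OnesCount8(c) % 2) // 1 if the seven data bits hold an odd number of 1s
	return parity<<7 | c
}

// hammingDistance counts the bit positions in which two bytes differ.
func hammingDistance(a, b byte) int {
	return bits.OnesCount8(a ^ b)
}

func main() {
	x := withEvenParity('X') // 'X' = 1011000 has three 1s, so the parity bit is 1
	fmt.Printf("%08b\n", x)  // 11011000

	y := withEvenParity('X' ^ 1) // flip the low data bit: 1011001 ('Y')
	fmt.Printf("%08b\n", y)      // 01011001

	fmt.Println(hammingDistance(x, y)) // 2: the data bit and the parity bit both changed
}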

Resources