Why does the CRC of "1" yield the generator polynomial itself?

While testing a CRC implementation, I noticed that the CRC of 0x01 usually (?) seems to be the polynomial itself. When trying to manually do the binary long division however, I keep ending up losing the leading "1" of the polynomial, e.g. with a message of "0x01" and the polynomial "0x1021", I would get
      1 0000 0000 0000  (zero padded value)
(XOR) 1 0000 0010 0001
----------------------
      0 0000 0010 0001 = 0x0021
But any sample implementation (I'm dealing with XMODEM-CRC here) results in 0x1021 for the given input.
Looking at https://en.wikipedia.org/wiki/Computation_of_cyclic_redundancy_checks, I can see how the XOR step of the upper bit leaving the shift register with the generator polynomial will cause this result. What I don't get is why this step is performed in that manner at all, seeing as it clearly alters the result of a true polynomial division?

I just read http://www.ross.net/crc/download/crc_v3.txt and noticed that in section 9, there is mention of an implicitly prepended 1 to enforce the desired polynomial width.
In my example case, this means that the actual polynomial used as divisor would not be 0x1021, but 0x11021. This results in the leading "1" being dropped, and the remainder being the "intended" 16-bit polynomial:
      1 0000 0000 0000 0000  (zero padded value)
(XOR) 1 0001 0000 0010 0001
---------------------------
      0 0001 0000 0010 0001 = 0x1021
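For completeness, here is a minimal bit-by-bit sketch of this in Go (the helper name crc16XMODEM is mine; polynomial 0x1021, initial value 0x0000, no reflection, as used by XMODEM). Feeding it the single byte 0x01 yields 0x1021: once the leading 1 is shifted out of the register, the register is XORed with the low 16 bits of the implicit 17-bit divisor.

package main

import "fmt"

// crc16XMODEM computes CRC-16/XMODEM bit by bit: for each message bit the
// 16-bit register is shifted left, and whenever a 1 falls out of the top,
// the register is XORed with 0x1021 (the low 16 bits of the 17-bit divisor).
func crc16XMODEM(data []byte) uint16 {
    var crc uint16 // initial value 0x0000
    for _, b := range data {
        crc ^= uint16(b) << 8
        for i := 0; i < 8; i++ {
            if crc&0x8000 != 0 {
                crc = (crc << 1) ^ 0x1021
            } else {
                crc <<= 1
            }
        }
    }
    return crc
}

func main() {
    fmt.Printf("0x%04X\n", crc16XMODEM([]byte{0x01})) // prints 0x1021
}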

Related

Go Float32bits() result not as expected

e.g. converting 16777219.0 from decimal to bits:
16777219 -> 1 0000 0000 0000 0000 0000 0011
Mantissa: 23 bits
0000 0000 0000 0000 0000 001
Exponent:
24+127 = 151
151 -> 10010111
The result should be:
0_10010111_000 0000 0000 0000 0000 0001
1001011100000000000000000000001
but:
fmt.Printf("%b\n", math.Float32bits(float32(16777219.0)))
// 1001011100000000000000000000010
Why is the Go Float32bits() result not what I expected?
reference:
base-conversion.ro
update:
fmt.Printf("16777216.0:%b\n", math.Float32bits(float32(16777216.0)))
fmt.Printf("16777217.0:%b\n", math.Float32bits(float32(16777217.0)))
//16777216.0:1001011100000000000000000000000
//16777217.0:1001011100000000000000000000000
fmt.Printf("16777218.0:%b\n", math.Float32bits(float32(16777218.0)))
//16777218.0:1001011100000000000000000000001
fmt.Printf("16777219.0:%b\n", math.Float32bits(float32(16777219.0)))
fmt.Printf("16777220.0:%b\n", math.Float32bits(float32(16777220.0)))
fmt.Printf("16777221.0:%b\n", math.Float32bits(float32(16777221.0)))
//16777219.0:1001011100000000000000000000010
//16777220.0:1001011100000000000000000000010
//16777221.0:1001011100000000000000000000010
fmt.Printf("000:%f\n", math.Float32frombits(0b_10010111_00000000000000000000000))
// 000:16777216.000000
fmt.Printf("001:%f\n", math.Float32frombits(0b_10010111_00000000000000000000001))
// 001:16777218.000000
fmt.Printf("010:%f\n", math.Float32frombits(0b_10010111_00000000000000000000010))
// 010:16777220.000000
fmt.Printf("011:%f\n", math.Float32frombits(0b_10010111_00000000000000000000011))
// 011:16777222.000000
What are the rules?
Go gives the correct IEEE-754 binary floating point result - round to nearest, ties to even.
The Go Programming Language Specification
float32 the set of all IEEE-754 32-bit floating-point numbers
Decimal
16777219
is binary
1000000000000000000000011
For 32-bit IEEE-754 binary floating point, the 24-bit integer mantissa is
100000000000000000000001.1
Round to nearest, ties to even gives
100000000000000000000010
Removing the implicit one bit for the 23-bit mantissa gives
00000000000000000000010
package main

import (
    "fmt"
    "math"
)

func main() {
    const n = 16777219
    fmt.Printf("%d\n", n)
    fmt.Printf("%b\n", n)
    f := float32(n)
    fmt.Printf("%g\n", f)
    fmt.Printf("%b\n", math.Float32bits(f))
}
https://go.dev/play/p/yMaVkuiSJ5A
16777219
1000000000000000000000011
1.677722e+07
1001011100000000000000000000010
Why the result is not expected: You're expecting the wrong result.
The IEEE 754 standard specifies that, per Wikipedia:
if the number falls midway, it is rounded to the nearest value with an even least significant digit.
So when rounding 16777219, which is midway between the representable values 16777218 and 16777220, the "round up" option 16777220 is the one with an even least significant digit (its mantissa ends in 0), and that is the correct result you're observing.
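As a small illustration of that spacing (the variable names are mine), math.Nextafter32 shows that one ULP of a float32 around 2^24 is 2, so 16777219 has to be rounded to one of its neighbours:

package main

import (
    "fmt"
    "math"
)

func main() {
    // Around 2^24 one ULP of a float32 is 2, so only every second integer
    // is exactly representable.
    lower := float32(16777218)                        // representable, mantissa ends in 1
    upper := math.Nextafter32(lower, math.MaxFloat32) // next representable float32
    fmt.Printf("neighbours: %.0f and %.0f\n", lower, upper) // 16777218 and 16777220
    // 16777219 lies exactly midway; ties-to-even picks 16777220,
    // whose mantissa ends in 0.
    fmt.Printf("rounded:    %.0f\n", float32(16777219)) // 16777220
}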

Why does the arithmetic expression result get truncated?

If I run $((0x100 - 0 & 0xff)), I get 0.
However, $((0x100 - 0)) gives me 256.
Why does the result of the first expression get truncated?
Because & is a bitwise operator, and there are no matching bits in 0x100 and 0xff.
What that means is that it looks at the bits that make up your numbers, and you get a 1 back in each position where both inputs have a 1.
So if you do $((0x06 & 0x03))
In binary you end up with
6 = 0110
3 = 0011
So when you bitwise AND those together, you'll get
0010 (binary) or 0x02
For the numbers you have, there are no bits in common:
0x100 in binary is
0000 0001 0000 0000
0xff in binary is
0000 0000 1111 1111
If you bitwise and them together, there are no matching bits, so you'll end up with
0000 0000 0000 0000
Interestingly, it does the subtraction before it does the bitwise AND (I expected it to be the other way around); shell arithmetic follows C operator precedence, where binary - binds more tightly than &:
$((0x100 - 1 & 0xff)) gives 255, or 0xff, because 0x100 - 1 = 0xff and 0xff & 0xff = 0xff
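As a rough illustration of that precedence, adding parentheses makes the grouping explicit:

echo $(( (0x100 - 0) & 0xff ))   # 0, the default grouping: (0x100 - 0) & 0xff
echo $(( 0x100 - (0 & 0xff) ))   # 256, when the AND is grouped first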

What is the byte/bit order in this Microsoft document?

This is the documentation for the Windows .lnk shortcut format:
https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-shllink/16cb4ca1-9339-4d0c-a68d-bf1d6cc0f943
The ShellLinkHeader structure is described like this:
This is a file:
Looking at HeaderSize, the bytes are 4c 00 00 00 and it's supposed to mean 76 decimal. This is a little-endian integer, no surprise here.
Next is the LinkCLSID with the bytes 01 14 02 00 00 00 00 00 c0 00 00 00, representing the value "00021401-0000-0000-C000-000000000046". This answer seems to explain why the byte order changes because the last 8 bytes are a byte array while the others are little-endian numbers.
My question is about the LinkFlags part.
The LinkFlags part is described like this:
And the bytes in my file are 9b 00 08 00, or in binary:
   9    b    0    0    0    8    0    0
1001 1011 0000 0000 0000 1000 0000 0000
 ^
By comparing different files I found out that the bit marked with ^ is bit 6/G in the documentation (marked in red).
How do I interpret this? Are the bytes in the same order as in the documentation, but with each byte's bits reversed?
The issue here springs from the fact that the list of bits shown in these specs is not meant to have a number fitted underneath it at all. It is meant to have a list of bits fitted underneath it, and that list runs from the lowest bit to the highest bit, which is the complete inverse of how we read numbers from left to right.
The list clearly shows bits numbered from 0 to 31, though, meaning this is indeed one 32-bit value, and not four separate bytes. Specifically, this means the bytes read from the file need to be interpreted as a single 32-bit integer before doing anything else. Like all other values in the format, it needs to be read as a little-endian number, with its bytes reversed.
So your 9b 00 08 00 becomes 0008009b, or, in binary, 0000 0000 0000 1000 0000 0000 1001 1011.
But, as I said, that list in the specs shows the bits from lowest to highest. So to fit them under that, reverse the binary version:
0           1            2           3
0123 4567 8901 2345 6789 0123 4567 8901
ABCD EFGH IJKL MNOP QRST UVWX YZ#_ ____
---------------------------------------
1101 1001 0000 0000 0001 0000 0000 0000
       ^
So bit 6, indicated in the specs as 'G', is 0.
This whole thing makes a lot more sense if you invert the specs, though, and list the bits logically, from highest to lowest:
 3           2            1           0
1098 7654 3210 9876 5432 1098 7654 3210
____ _#ZY XWVU TSRQ PONM LKJI HGFE DCBA
---------------------------------------
0000 0000 0000 1000 0000 0000 1001 1011
                               ^
   0    0    0    8    0    0    9    b
This makes the alphabetic references look a lot less intuitive, but it does perfectly fit the numeric version underneath. The bit matches your findings (the third bit, counting from the right, of the nibble you have as value '9'), and you can also clearly see that the highest 5 bits are unused.
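A short Go sketch of that interpretation (the constant name bitG is mine): read the four bytes as one little-endian 32-bit value, and then bit n of that value is simply bit n of the spec's diagram (A = bit 0, B = bit 1, and so on).

package main

import (
    "encoding/binary"
    "fmt"
)

func main() {
    // The four LinkFlags bytes exactly as they appear in the file.
    raw := []byte{0x9b, 0x00, 0x08, 0x00}
    // Interpret them as a single little-endian 32-bit value, as the spec intends.
    flags := binary.LittleEndian.Uint32(raw)
    fmt.Printf("LinkFlags = %#010x\n", flags) // 0x0008009b
    const bitG = 6 // the flag labelled G sits at bit 6
    fmt.Println("bit G set:", flags&(1<<bitG) != 0) // false
}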

How to bitmask a number (in hex) using the AND operator?

I know that you can clear bits by ANDing a value with 0. However, how can I mask out certain nibbles while keeping others? In other words, if I have 0x000f0b7c and I want to mask everything but the b (so that my result would be 0x00000b00), how would I use AND to do this? Would it require multiple steps?
You can better understand boolean operations if you represent values in binary form.
The AND operation between two binary digits returns 1 if both the binary digits have a value of 1, otherwise it returns 0.
Suppose you have two binary digits a and b; you can build the following "truth table":
a | b | a AND b
---+---+---------
0 | 0 | 0
1 | 0 | 0
0 | 1 | 0
1 | 1 | 1
The masking operation consists of ANDing a given value with a "mask" where every bit that needs to be preserved is set to 1, while every bit to discard is set to 0.
This is done by ANDing each bit of the given value with the corresponding bit of the mask.
The given value, 0xf0b7c, can be converted as follows:
   f    0    b    7    c  (hex)
1111 0000 1011 0111 1100  (bin)
If you want to preserve only the bits corresponding to the "b" value (bits 8..11) you can mask it this way:
   f    0    b    7    c
1111 0000 1011 0111 1100
0000 0000 1111 0000 0000
The value 0000 0000 1111 0000 0000 can be converted to hex and has a value of 0xf00.
So if you calculate "0xf0b7c AND 0xf00" you obtain 0xb00.
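A minimal Go sketch of the same masking, using the full 32-bit value from the question (the constant names are mine):

package main

import "fmt"

func main() {
    const value = 0x000f0b7c
    const mask = 0x00000f00 // 1s only in the nibble to keep (bits 8..11)
    fmt.Printf("%#x\n", value&mask) // 0xb00
}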

Are these endian transformations correct?

I am struggling to figure this out: I am trying to represent a 32-bit variable in both big and little endian. For the sake of argument, let's say we try the number 666.
Big Endian: 0010 1001 1010 0000 0000 0000 0000
Little Endian: 0000 0000 0000 0000 0010 1001 1010
Is this correct, or is my thinking wrong here?
666 (decimal) as 32-bit binary is represented as:
[0000 0000] [0000 0000] [0000 0010] [1001 1010] (big endian, most significant byte first)
[1001 1010] [0000 0010] [0000 0000] [0000 0000] (little endian, least significant byte first)
Ref.
(I have used square brackets to group 4-bit nibbles into bytes)
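A small Go sketch of the same byte orders, using the standard encoding/binary package (the variable names are mine):

package main

import (
    "encoding/binary"
    "fmt"
)

func main() {
    big := make([]byte, 4)
    little := make([]byte, 4)
    binary.BigEndian.PutUint32(big, 666)       // most significant byte first
    binary.LittleEndian.PutUint32(little, 666) // least significant byte first
    fmt.Printf("big-endian bytes:    % x\n", big)    // 00 00 02 9a
    fmt.Printf("little-endian bytes: % x\n", little) // 9a 02 00 00
}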
