What sort of binary representation is %xA and such? - websocket

I'm setting up a WebSocket server to connect with a web browser. While reading the specification (RFC 6455) for WebSocket frame exchange, I came across values that are supposed to represent a 4-bit opcode. They look like this:
%x0 %x1 %x2 ..... %xA %xB .... %xF
I know that %x0 = 0000 and %x1 = 0001
I'd like to know what these values are called and how to convert them to bits.
Thank you.

The %xN notation is ABNF (RFC 5234) for a hexadecimal literal: each hexadecimal digit describes a four-bit binary number.
The possible values of a four-bit binary number range from 0 (0000) to 15 (1111), so each value can be expressed as a hexadecimal digit:
0000 %x0
0001 %x1
0010 %x2
0011 %x3
0100 %x4
0101 %x5
0110 %x6
0111 %x7
1000 %x8
1001 %x9
1010 %xA
1011 %xB
1100 %xC
1101 %xD
1110 %xE
1111 %xF
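In code this is plain base conversion; here is a quick sketch in Python (just an illustration, not part of the RFC):

```python
def opcode_bits(digit):
    """Map a single hex digit such as 'A' to its 4-bit binary string."""
    return format(int(digit, 16), "04b")

# The opcode table above, generated:
for n in range(16):
    print(f"%x{n:X} = {n:04b}")

# And going back from bits to the hex digit:
assert int("1010", 2) == 0xA
```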

Related

What is the byte/bit order in this Microsoft document?

This is the documentation for the Windows .lnk shortcut format:
https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-shllink/16cb4ca1-9339-4d0c-a68d-bf1d6cc0f943
The ShellLinkHeader structure is described by a table in the documentation, and here is a hex dump of a file I am examining (screenshots omitted):
Looking at HeaderSize, the bytes are 4c 00 00 00 and it's supposed to mean 76 decimal. This is a little-endian integer, no surprise here.
Next is the LinkCLSID with the bytes 01 14 02 00 00 00 00 00 c0 00 00 00 00 00 00 46, representing the value "00021401-0000-0000-C000-000000000046". This answer seems to explain why the byte order changes: the last 8 bytes are a byte array, while the other groups are little-endian numbers.
My question is about the LinkFlags part.
The LinkFlags part is described by a diagram that lists one flag per bit, labelled A, B, C, ... and numbered 0 to 31 (screenshot omitted).
The bytes in my file are 9b 00 08 00, or in binary:
9    b    0    0    0    8    0    0
1001 1011 0000 0000 0000 1000 0000 0000
 ^
By comparing different files I found out that the bit marked with ^ is bit 6/G in the documentation.
How should this be interpreted? The bytes are in the same order as in the documentation, but each byte has its bits reversed?
The issue springs from the fact that the list of bits shown in these specs is not meant to have a number fitted underneath it at all. It is meant to have a list of bits underneath it, and that list runs from the lowest bit to the highest bit, the complete inverse of how we read numbers from left to right.
The list clearly numbers the bits 0 to 31, though, meaning this is indeed one 32-bit value and not four separate bytes. Specifically, the bytes read from the file need to be interpreted as a single 32-bit integer before doing anything else. Like all other values in the format, that means reading them as a little-endian number, with the byte order reversed.
So your 9b 00 08 00 becomes 0008009b, or, in binary, 0000 0000 0000 1000 0000 0000 1001 1011.
But, as I said, that list in the specs shows the bits from lowest to highest. So to fit them under that, reverse the binary version:
0           1            2           3
0123 4567 8901 2345 6789 0123 4567 8901
ABCD EFGH IJKL MNOP QRST UVWX YZ#_ ____
---------------------------------------
1101 1001 0000 0000 0001 0000 0000 0000
       ^
So bit 6, indicated in the specs as 'G', is 0.
This whole thing makes a lot more sense if you invert the specs, though, and list the bits logically, from highest to lowest:
 3           2            1           0
1098 7654 3210 9876 5432 1098 7654 3210
____ _#ZY XWVU TSRQ PONM LKJI HGFE DCBA
---------------------------------------
0000 0000 0000 1000 0000 0000 1001 1011
                               ^
0    0    0    8    0    0    9    b
This makes the alphabetic references look a lot less intuitive, but it does perfectly fit the numeric versions underneath. The bit matches your findings (third bit on what you have as value '9'), and you can also clearly see that the highest 5 bits are unused.
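To check this programmatically, here is a sketch in Python (the `struct` module does the little-endian interpretation; the byte values are the ones from the question):

```python
import struct

raw = bytes.fromhex("9b000800")      # the four LinkFlags bytes from the file

# "<I" = little-endian unsigned 32-bit integer
(flags,) = struct.unpack("<I", raw)
assert flags == 0x0008009B

# The spec numbers the bits LSB-first, so "bit 6 / G" is literally bit 6:
assert (flags >> 6) & 1 == 0         # G is clear
assert (flags >> 0) & 1 == 1         # A, the lowest bit, is set
```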

How to bitmask a number (in hex) using the AND operator?

I know that you can mask bits out by ANDing them with 0. However, how can I mask certain nibbles while keeping others? In other words, if I have 0x000f0b7c and I want to mask everything but the b (so that my result would be 0x00000b00), how would I use AND to do this? Would it require multiple steps?
You can better understand boolean operations if you represent values in binary form.
The AND operation between two binary digits returns 1 if both the binary digits have a value of 1, otherwise it returns 0.
Suppose you have two binary digits a and b, you can build the following "truth table":
a | b | a AND b
---+---+---------
0 | 0 | 0
1 | 0 | 0
0 | 1 | 0
1 | 1 | 1
The masking operation consists of ANDing a given value with a "mask" where every bit that needs to be preserved is set to 1, while every bit to discard is set to 0.
This is done by ANDing each bit of the given value with the corresponding bit of the mask.
The given value, 0xf0b7c, can be converted as follows:
f 0 b 7 c (hex)
1111 0000 1011 0111 1100 (bin)
If you want to preserve only the bits corresponding to the "b" value (bits 8..11) you can mask it this way:
f    0    b    7    c
1111 0000 1011 0111 1100   (value)
0000 0000 1111 0000 0000   (mask)
------------------------   AND
0000 0000 1011 0000 0000   (result)
The mask 0000 0000 1111 0000 0000 is 0xf00 in hex, and the result 0000 0000 1011 0000 0000 is 0xb00.
So if you calculate "0xf0b7c AND 0xf00" you obtain 0xb00.
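In code it is a single AND, no multiple steps needed; a minimal sketch in Python (the same expression works in C):

```python
value = 0x000F0B7C
mask  = 0x00000F00   # ones over the nibble to keep, zeros elsewhere

result = value & mask
print(hex(result))   # 0xb00
assert result == 0x00000B00
```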

CRC polynomial calculation

I am trying to understand this document, but can't seem to get it right. http://www.ross.net/crc/download/crc_v3.txt
What's the algorithm used to calculate it?
I thought it uses XOR, but I don't quite get how he gets 0110 from 1100 XOR 1001. It should be 101 (or 0101, or 1010 if a bit is brought down). If I can get this, I think the rest will come easily, but for some reason I just don't get it.
             0000010101101 = 00AD = 173 = QUOTIENT
          ----------------
9= 1001 ) 0000011000010111 = 0617 = 1559 = DIVIDEND
DIVISOR   0000.,,....,.,,,
          ----.,,....,.,,,
           0000,,....,.,,,
           0000,,....,.,,,
           ----,,....,.,,,
            0001,....,.,,,
            0000,....,.,,,
            ----,....,.,,,
             0011....,.,,,
             0000....,.,,,
             ----....,.,,,
              0110...,.,,,
              0000...,.,,,
              ----...,.,,,
               1100..,.,,,
               1001..,.,,,
               ====..,.,,,
                0110.,.,,,
                0000.,.,,,
                ----.,.,,,
                 1100,.,,,
                 1001,.,,,
                 ====,.,,,
                  0111.,,,
                  0000.,,,
                  ----.,,,
                   1110,,,
                   1001,,,
                   ====,,,
                    1011,,
                    1001,,
                    ====,,
                     0101,
                     0000,
                     ----,
                      1011
                      1001
                      ====
                      0010 = 02 = 2 = REMAINDER
The part you quoted is just standard long division like you learned in elementary school, except that it is done on binary numbers. At each step you subtract to get a remainder and then bring down the next digit of the dividend, and that is what happens in the example you gave: 1100 - 1001 = 0011, and bringing down the next dividend digit (a 0) gives 0110.
Note that the article just uses this as a preliminary example, and it is not actually what is done in calculating CRC. Instead of normal numbers, CRC uses division of polynomials over the field GF(2). This can be modeled by using normal binary numbers and doing long division normally, except for using XOR instead of subtraction.
The link you provided says:
we'll do the division using good-'ol long division which you
learnt in school (remember?)
You just repeatedly subtract, but since it is binary there are only two options at each step: the divisor fits into the current window either once or zero times. I annotated the steps:
0000011000010111
0000
1001 x 0
---- -
 0000
 1001 x 0
 ---- -
  0001
  1001 x 0
  ---- -
   0011
   1001 x 0
   ---- -
    0110
    1001 x 0
    ---- -
     1100
     1001 x 1
     ---- -
      0110
      1001 x 0
      ---- -
       1100
       1001 x 1
       ---- -
        0111
        and so on
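The annotated steps map directly onto a shift-and-compare loop. Here is a sketch in Python: first ordinary division (subtract when the divisor fits, exactly as above), then the GF(2) variant the rest of the article builds on, where the magnitude test becomes a top-bit test and the subtraction becomes XOR:

```python
def bin_remainder(dividend_bits, divisor):
    """Ordinary binary long division; returns the remainder."""
    r = 0
    for bit in dividend_bits:
        r = (r << 1) | int(bit)   # bring down the next dividend bit
        if r >= divisor:          # the divisor fits once...
            r -= divisor          # ...so subtract (quotient bit 1)
    return r

# The worked example: 1559 / 9 leaves remainder 2.
assert bin_remainder("0000011000010111", 0b1001) == 0b0010

def crc_remainder(dividend_bits, poly, width):
    """The same loop over GF(2): no borrows, subtraction is XOR."""
    r = 0
    for bit in dividend_bits:
        r = (r << 1) | int(bit)
        if (r >> (width - 1)) & 1:   # leading bit set: the poly "fits"
            r ^= poly                # XOR instead of subtract
    return r

# The CRC example later in the article: message 1101011011 with 4 zero
# bits appended, poly 10011, leaves remainder 1110.
assert crc_remainder("11010110110000", 0b10011, 5) == 0b1110
```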

Length values from deflate algorithm

I compressed the text "TestingTesting" and the hex result was: 0B 49 2D 2E C9 CC 4B 0F 81 50 00. I can't figure out the length and distance codes. The binary below is reversed because the RFC says to read the bits from right to left (thanks Matthew Slattery for the help). Here is what was parsed so far:
1 BFINAL (last block)
01 BTYPE (static)
1000 0100 132-48= 84 T
1001 0101 149-48= 101 e
1010 0011 163-48= 115 s
1010 0100 164-48= 116 t
1001 1001 153-48= 105 i
1001 1110 158-48= 110 n
1001 0111 151-48= 103 g
These are the remaining bits that I don't know how to parse:
1000 0100 0000 1000 0101 0000 0000 0
The final 10 bits (the end-of-block value is x100) are the only part I can parse. I think the length and distance values should both be 7 (binary 0111), since the length of "Testing" is 7 letters and it gets copied from 7 characters before the current position, but I can't figure out how this is represented in the remaining bits. What am I doing wrong?
The distance code is 5, but a distance code of 5 is followed by one "extra bit" to indicate an actual distance of either 7 or 8. (See the second table in paragraph 3.2.5 of the RFC.)
The complete decoding of the data is:
1 BFINAL
01 BTYPE=static
10000100 'T'
10010101 'e'
10100011 's'
10100100 't'
10011001 'i'
10011110 'n'
10010111 'g'
10000100 another 'T'
0000100 literal/length code 260 = length 6
00101 distance code 5
0 extra bit => the distance is 7
0000000 literal/length code 256 = end of block
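If you want to cross-check the stream, zlib can produce raw deflate output for the same input. A sketch in Python (`wbits=-15` means raw deflate, no zlib header; depending on the zlib version and level, the bytes may or may not exactly match the ones in the question):

```python
import zlib

co = zlib.compressobj(9, zlib.DEFLATED, -15)   # -15 => raw deflate stream
data = co.compress(b"TestingTesting") + co.flush()
print(data.hex())

# Whatever the exact bytes, the stream round-trips:
assert zlib.decompress(data, -15) == b"TestingTesting"
```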

How do you convert little Endian to big Endian with bitwise operations?

I get that you'd want to do something like take the first four bits and put them on a stack (reading from left to right), then put them in a register and shift them x times to move them to the right part of the number?
Something like
1000 0000 | 0000 0000 | 0000 0000 | 0000 1011
Stack: bottom - 1101 - top
shift it 28 times to the left
Then do something similar with the last four bits, but shift to the right and store in a register.
Then you OR each piece into a result value that starts at 0.
Is there an easier way?
Yes there is. Check out the _byteswap functions/intrinsics, and/or the bswap instruction.
You could also do it with shifts and an OR.
For example, input 0010 1000 and the output I want is 1000 0010 (the two nibbles swapped). Store the input in a variable x:

unsigned char x = 0x28;   /* 0010 1000 */
unsigned char i = x >> 4; /* high nibble moved down: 0000 0010 */
unsigned char j = x << 4; /* low nibble moved up: 1000 0000 (the shifted-out bits are truncated) */
unsigned char k = i | j;  /* 1000 0010, i.e. 0x82 */
printf("%02X\n", k);      /* prints 82 */
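For a full 32-bit byte swap you don't need a stack at all: isolate each byte with a mask and shift it to its mirrored position. A sketch in Python; the same expression works on a `uint32_t` in C, and compilers typically reduce it to a single `bswap` instruction:

```python
def bswap32(x):
    """Reverse the byte order of a 32-bit value using shifts and masks."""
    return (((x & 0x000000FF) << 24) |
            ((x & 0x0000FF00) << 8) |
            ((x & 0x00FF0000) >> 8) |
            ((x & 0xFF000000) >> 24))

# The value from the question: 1000 0000 | 0000 0000 | 0000 0000 | 0000 1011
assert bswap32(0x8000000B) == 0x0B000080
assert bswap32(bswap32(0x12345678)) == 0x12345678   # swapping twice is a no-op
```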
