Converting bits in hexadecimal to bytes

I am trying to understand this statement:
256 bits in hexadecimal is 32 bytes, or 64 characters in the range 0-9 or A-F.
How can a 32-byte string be 64 characters in the range 0-9 or A-F?
What does 32 bytes mean?
I would assume that a bit means a digit that is either 0 or 1, so 256 bits would be 256 digits of either 0 or 1.
I know that 1 byte equals 8 bits, so is 32 bytes then 32 digits, each one of 0, 1, 2, 3, 4, 5, 6, or 7 (i.e. 8 different values)?
I do know a little about different bases (e.g. that binary has 0 and 1, decimal has 0-9, hexadecimal has 0-9 and A-F, etc.), but I still fail to understand why 256 bits in hexadecimal can be 32 bytes or 64 characters.
I know it's quite basic in computer science, so I have to read up on this, but can you give a brief explanation?

A single hexadecimal character represents 4 bits:
0 = 0000
1 = 0001
2 = 0010
3 = 0011
4 = 0100
5 = 0101
6 = 0110
7 = 0111
8 = 1000
9 = 1001
A = 1010
B = 1011
C = 1100
D = 1101
E = 1110
F = 1111
Two hexadecimal characters can represent a byte (8 bits).
How can a 32-byte string be 64 characters in the range 0-9 or A-F?
Keep in mind that the hexadecimal representation is an EXTERNAL depiction of the bit settings. If a byte contains 01001010, we can say that it is 4A in hex. The characters 4A are not stored in the byte. It's like in mathematics, where we use the depictions "e" and "π" to represent numbers.
What does 32 bytes mean?
1 Byte = 8 bits. 32 bytes = 256 bits.
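To make the arithmetic concrete, here is a minimal Python sketch (Python chosen purely for illustration; the variable names are mine) showing that 32 bytes of data take exactly 64 hexadecimal characters to write out:

import os

data = os.urandom(32)      # 32 bytes = 32 * 8 = 256 bits of data
hex_string = data.hex()    # each byte becomes two hex characters

print(len(data))           # 32  (bytes)
print(len(hex_string))     # 64  (characters, each one of 0-9 / a-f)

The hex string is only an external depiction of those same 256 bits; the bytes themselves never contain the characters.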

Related

When we only have 6 bits of data in a byte, what do we fill the remaining bits with?

When we only have 6 bits of data in a byte, what do we fill the remaining bits with? In the picture below the important data is only 10 03, but what is the science behind how the unimportant bits are chosen? What do [55] or [AA] mean? I should mention that 10 03 is a request for diagnosis and 50 03 is the response.
The communication is over CAN, and this is a trace of CAN data.
I don't understand exactly what you are asking, but that looks like a hex representation.
1 byte -> 2 hex characters -> 8 bits. AA -> 10, 10 (each hex digit A is 10 in decimal) -> 1010 1010 (binary)
The explicitly written bits are always the ones on the right side, the LSBs (least significant bits).
For example, in JavaScript a regular integer is 32 bits long.
const number = 0b1010    // binary
const hexNumber = 0xA    // hex
Both are 10 in decimal. As you can see, we have only written the least significant 4 bits; every other bit is an implicit 0.
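A quick way to check the same mapping yourself (a small Python sketch, using the byte values from the question only as examples):

print(format(0xAA, '08b'))   # 10101010 -- the filler byte AA as 8 bits
print(format(0x55, '08b'))   # 01010101 -- the filler byte 55 as 8 bits
print(format(0x10, '08b'))   # 00010000 -- the first data byte of the request
print(format(0x03, '08b'))   # 00000011 -- the second data byte of the request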

Why is huffman encoded text bigger than actual text?

I am trying to understand how Huffman coding works. It is supposed to compress data so that it takes less memory than the actual text, but when I encode, for example,
"Text to be encoded"
which has 18 characters, the result I get is
"100100110100101110101011111000001110011011110010101100011"
Am I supposed to divide the number of result bits by 8, since a character is 8 bits?
You should compare in the same units (bits, as after the compression, or characters, as in the text before), e.g.
before: "Text to be encoded" == 18 * 8 bits = 144 bits
== 18 * 7 bits = 126 bits (in case of 7-bit characters)
after: 100100110100101110101011111000001110011011110010101100011 = 57 bits
so you have 144 (or 126) bits before and 57 bits after the compression. Or
before: "Text to be encoded" == 18 characters
after: 10010011
01001011
10101011
11100000
11100110
11110010
10110001
00000001 /* the last chunk is padded */ == 8 characters
so you have 18 ASCII characters before and only 8 one-byte characters after the compression. If the characters are supposed to be 7-bit (the 0..127 range of the ASCII table), we have 9 characters after the compression:
after: 1001001 'I'
1010010 'R'
1110101 'u'
0111110 '>'
0000111 '\0x07'
0011011 '\0x1B'
1100101 'e'
0110001 '1'
0000001 '\0x01'
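To reproduce the bookkeeping above, here is a rough Python sketch (the bit string is the one from the question; padding the last chunk with zeros mirrors the listing above):

bits = "100100110100101110101011111000001110011011110010101100011"

print(18 * 8)      # 144 bits before compression (18 characters * 8 bits each)
print(len(bits))   # 57 bits after compression

# split into 8-bit chunks; pad the last chunk with zeros, as in the listing above
chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
chunks[-1] = chunks[-1].zfill(8)
print(len(chunks))   # 8 chunks, i.e. 8 bytes
print(chunks[-1])    # 00000001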

How to use urandom to generate all possible base64 characters?

In one base64 digit you can store up to 6 bits (2**6 == 64).
This means that you can fit 3 bytes in 4 base64 digits.
64**4 == 2**24
That's why:
0x000000 == 'AAAA'
0xFFFFFF == '////'
This means that a random string of 3 bytes is equivalent to a base64 string of 4 characters.
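That claim is easy to verify with Python's base64 module (a quick illustrative check):

import base64

print(base64.b64encode(b'\x00\x00\x00'))   # b'AAAA'
print(base64.b64encode(b'\xff\xff\xff'))   # b'////'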
However, if I am converting a number of bytes which is not a multiple of 3 into a base64 string, I will not be able to generate all the combinations of the base64 string.
Let's take an example:
If I want a random 7 characters base64 string, I would need to generate 42 random bits (64**7 == 2**42).
If I use urandom to get 5 random bytes I will get only 40 bits (5*8), and if I ask for 6 I will get 48 bits (6*8).
Can I ask for 6 bytes and use a mask to cut it down to 5, or will that break my random distribution?
One solution:
hex(0x123456789012 & 0xFFFFFFFFFF)
'0x3456789012'
Another one:
hex(0x123456789012 >> 8)
'0x1234567890'
What do you think?
A base64 string of 7 characters
is an encoding of 5 bytes of data (40 bits: no less, no more).
40 % 6 = 4,
so base64 needs to add 2 more bits, and then, with 42 bits, 42 % 6 = 0,
the encoding works out evenly; but beware:
" If I want a random 7 characters base64 string, I would need to
generate 42 random bits (64**7 == 2**42). "
the 2 additional bits are not random; they are constants (zero, in fact).
The cardinality of your key space doesn't change: it is 2**40 = 1099511627776, not 64**7 == 2**42.
64**7 == 2**42 is the cardinality of the key space containing all possible combinations of 64 characters with a length of 7; but with the last two bits fixed (to zero in this case, though the value doesn't matter) you don't get all possible combinations.
6 random bytes (48 bits), or 42 random bits, would enlarge your original key space; you should use 5 random bytes (40 bits) and feed them to base64.
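A rough Python sketch of that advice (standard os and base64 modules; purely illustrative): take 5 random bytes, base64-encode them, and drop the '=' padding to get 7 characters. Note that the last of those 7 characters can only take 16 of the 64 possible values, because its low 2 bits are the fixed padding bits.

import os, base64

raw = os.urandom(5)                      # 5 random bytes = 40 random bits
token = base64.b64encode(raw).decode()   # 8 characters, the 8th is '='
print(token.rstrip('='))                 # the 7-character base64 string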

Modulus without math operators

I want to create a method which, given a number n and the number 16, applies the modulus operator to them (n % 16). The thing that makes it hard for me is that I must not use any of the math operators (+, -, /, *, %).
Since 16 is 2^4 you can obtain the same result by truncating the value to the 4 least significant bits.
So:
x & 0xF is equivalent to x % 16
This is valid just because you are working with a power of two.
The key here is that 16 is a power of two, and so you can exploit the fact that computers utilise binary representations to achieve what you want with a bitwise operator.
Consider how multiples of 16 look when represented in binary:
0001 0000 // n = 16, n%16 = 0
0010 0000 // n = 32, n%16 = 0
0011 0000 // n = 48, n%16 = 0
0100 0000 // n = 64, n%16 = 0
Now take a look at some numbers for which n % 16 would be non-zero:
0000 0111 // n = 7, n%16 = 7
0001 0111 // n = 23, n%16 = 7
0010 0001 // n = 33, n%16 = 1
0100 0001 // n = 65, n%16 = 1
Notice that the remainder is simply the least significant 4 bits (the low nibble). Therefore we need to construct a bitwise expression that keeps these bits intact whilst masking all other bits to zero. This can be achieved by performing a bitwise AND with the value 15 (binary 1111, hex 0xF):
x = n & 0xF
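A quick sanity check of the equivalence in Python (the sample values are the ones used above; % appears only to verify the claim, not as part of the solution):

for n in (7, 16, 23, 32, 33, 48, 64, 65):
    assert n & 0xF == n % 16
    print(n, n & 0xF)   # 7 7, 16 0, 23 7, 32 0, 33 1, 48 0, 64 0, 65 1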

Ruby - How to represent message length as 2 binary bytes

I'm using Ruby and I'm communicating with a network endpoint that requires the formatting of a 'header' prior to sending the message itself.
The first field in the header must be the message length, which is defined as a 2-byte binary message length in network byte order.
For example, my message is 1024 in length. How do I represent 1024 as two binary bytes?
The standard tools for byte wrangling in Ruby (and Perl and Python and ...) are pack and unpack. Ruby's pack is in Array. You have a length that should be two bytes long and in network byte order; that sounds like a job for the n format specifier:
n | Integer | 16-bit unsigned, network (big-endian) byte order
So if the length is in length, you'd get your two bytes thusly:
two_bytes = [ length ].pack('n')
If you need to do the opposite, have a look at String#unpack:
length = two_bytes.unpack('n').first
See Array#pack.
[1024].pack("n")
This packs the number as the network-order byte sequence \x04\x00.
The way this works is that each byte is 8 binary bits. 1024 in binary is 10000000000. If we break this up into octets (8 bits per byte), we get: 00000100 00000000.
A byte can represent (2 states) ^ (8 positions) = 256 unique values. However, since we don't have 256 ASCII-printable characters, we visually represent bytes as hexadecimal pairs, because a hexadecimal digit can represent 16 different values and 16 * 16 = 256. Thus, we can take the first byte, 00000100, and break it into two 4-bit nibbles, 0000 0100. Translating binary to hex gives us 0x04. The second byte is trivial, as 0000 0000 is 0x00. This gives us the hexadecimal representation of the two-byte string.
It's worth noting that because you are constrained to a 2-byte (16-bit) header, you are limited to a maximum value of 11111111 11111111, or 2^16 - 1 = 65535 bytes. Any message larger than that cannot accurately represent its length in two bytes.
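If it helps to cross-check the byte values in another language, Python's struct module produces the same two bytes for a big-endian 16-bit length (shown only as an illustration; the Ruby pack('n') call above is the answer's actual approach):

import struct

header = struct.pack('>H', 1024)          # '>H' = big-endian unsigned 16-bit, like Ruby's 'n'
print(header)                             # b'\x04\x00'
print(struct.unpack('>H', header)[0])     # 1024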
