What's the difference between a "text frame" and a "binary frame" in WebSocket?

Text frames (opcode = 0x01) and binary frames (opcode = 0x02) are defined in RFC 6455. What is the difference between them, and which one is faster?

For some background, and perhaps more familiarity from other realms: HTTP/1 relied on an unstructured plaintext protocol, while HTTP/2 allows faster processing of messages through binary framing. Similarly, SMTP relies on text while TCP relies on a binary protocol. To sum up, a text frame carries readable, UTF-8 encoded text ('Hello'), while binary, although a confusing term, has no prescribed readable representation. In JavaScript, socket.send(new ArrayBuffer(8)) sends a basic binary object and is one example of the format that can be sent; the ArrayBuffer allocates a contiguous memory area of 8 bytes and pre-fills it with zeroes.
Hopefully this provides some good context.
0x01 is a hexadecimal number (the 0x prefix denotes hex, and it represents the integer 1); it is the value of the 4-bit opcode that tells the receiver (client or server) what type of "frame" it will receive. UTF-8 text is denoted by 0x01 and raw binary by 0x02.
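To make the frame header concrete, here is a minimal C sketch (not tied to any WebSocket library; the function name is made up) that pulls the opcode and neighbouring header fields out of the first two bytes of a frame, following the layout in RFC 6455:
#include <stdint.h>
#include <stdio.h>

/* Inspect the first two bytes of a WebSocket frame header (RFC 6455).
   Assumes 'frame' already holds at least 2 bytes of the frame. */
void inspect_frame_header(const uint8_t *frame)
{
    uint8_t fin    = (frame[0] >> 7) & 0x01;  /* 1 = final fragment            */
    uint8_t opcode =  frame[0]       & 0x0F;  /* low 4 bits: frame type        */
    uint8_t masked = (frame[1] >> 7) & 0x01;  /* client-to-server must be 1    */
    uint8_t len7   =  frame[1]       & 0x7F;  /* 7-bit payload length field    */

    if (opcode == 0x01)
        printf("text frame (payload must be valid UTF-8)\n");
    else if (opcode == 0x02)
        printf("binary frame (payload is raw bytes)\n");

    printf("fin=%u masked=%u len=%u\n", fin, masked, len7);
}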
Raw binary will generally be faster. Imagine an architecture where the client talks to a WebSocket proxy, which in turn talks to a plain TCP server:
If we send textual data (opcode 0x01) via the WebSocket, the proxy must translate the received data before handing it to the TCP server, and before the TCP server's response reaches the client it may have to be translated back to textual data. Encoding binary data as ASCII text via base64 increases the size of the message by roughly 33% (4 bytes of output for every 3 bytes of input).
With an opcode of 0x02, we can skip those two translation steps in the request/response cycle and avoid inflating the message we are passing; in short, we skip interpretation. Moreover, the WebSocket spec requires text payloads to be valid UTF-8, so text frames also pay for UTF-8 validation. The difference between text and binary might be 2x in favor of binary.

Related

How can I read the received packets with an NDIS filter driver?

I am currently experimenting with the NDIS driver samples.
I am trying to print the packets' contents (including the MAC addresses, EtherType and the data).
My first guess was to implement this in the function FilterReceiveNetBufferLists. Unfortunately I am not sure how to extract the packets' contents out of the NetBufferLists.
That's the right place to start. Consider this code:
void FilterReceiveNetBufferLists(..., NET_BUFFER_LIST *nblChain, ...)
{
    UCHAR buffer[14];   /* Ethernet header: dst MAC (6) + src MAC (6) + EtherType (2) */
    UCHAR *header;

    for (NET_BUFFER_LIST *nbl = nblChain; nbl; nbl = nbl->Next) {
        /* On the receive path, each NBL holds exactly one NET_BUFFER. */
        header = NdisGetDataBuffer(nbl->FirstNetBuffer, sizeof(buffer), buffer, 1, 0);
        if (!header)
            continue;

        /* The first 6 bytes of the frame are the destination MAC address. */
        DbgPrint("MAC address: %02x-%02x-%02x-%02x-%02x-%02x\n",
                 header[0], header[1], header[2],
                 header[3], header[4], header[5]);
    }

    NdisFIndicateReceiveNetBufferLists(..., nblChain, ...);
}
There are a few points to consider about this code.
The NDIS datapath uses the NET_BUFFER_LIST (nbl) as its primary data structure. An nbl represents a set of packets that all have the same metadata. For the receive path, nobody really knows much about the metadata, so that set always has exactly 1 packet in it. In other words, the nbl is a list... of length 1. For the receive path, you can count on it.
The nbl is a list of one or more NET_BUFFER (nb) structures. An nb represents a single network frame (subject to LSO or RSC). So the nb corresponds most closely to what you think of as a packet. Its metadata is stored on the nbl that contains it.
Within an nb, the actual packet payload is stored as one or more buffers, each represented as an MDL. Mentally, you should pretend the MDLs are just concatenated together. For example, the network headers might be in one MDL, while the rest of the payload might be in another MDL.
Finally, for performance, NDIS gives as many NBLs to your LWF as possible. This means there's a list of one or more NBLs.
Put it all together, and you have:
Your function receives a list of NBLs.
Each NBL contains exactly 1 NB (on the receive path).
Each NB contains a list of MDLs.
Each MDL points to a buffer of payload.
So in our example code above, the for-loop iterates along that first bullet point: the chain of NBLs. Within the loop, we only need to look at nbl->FirstNetBuffer, since we can safely assume there is no other nb besides the first.
It's inconvenient to have to fiddle with all those MDLs directly, so we use the helper routine NdisGetDataBuffer. You tell this guy how many bytes of payload you want to see, and he'll give you a pointer to a contiguous range of payload.
In the good case, your buffer is contained in a single MDL, so NdisGetDataBuffer just gives you a pointer back into that MDL's buffer.
In the slow case, your buffer straddles more than one MDL, so NdisGetDataBuffer carefully copies the relevant bit of payload into a scratch buffer that you provided.
The latter case can be fiddly, if you're trying to inspect more than a few bytes. If you're reading all 1500 bytes of the packet, you can't just allocate 1500 bytes on the stack (kernel stack space is scarce, unlike usermode), so you have to allocate it from the pool. Once you figure that out, note it will slow things down to copy all 1500 bytes of data into a scratch buffer for every packet. Is the slowdown too much? It depends on your needs. If you're only inspecting occasional packets, or if you're deploying the LWF on a low-throughput NIC, it won't matter. If you're trying to get beyond 1Gbps, you shouldn't be memcpying so much data around.
Also note that if you ultimately want to modify the packet, you'll need to be wary of NdisGetDataBuffer. It can give you a copy of the data (stored in your local scratch buffer), so if you modify the payload, those changes won't actually stick to the packet.
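As a rough illustration of the pool-allocated scratch buffer approach for read-only inspection inside the per-NBL loop above (the size constant and pool tag are made up for this sketch, and allocating per packet like this is exactly the overhead the previous paragraphs warn about):
#define INSPECT_BYTES 1500              /* hypothetical inspection window */
#define INSPECT_TAG   'wfIp'            /* arbitrary pool tag for this sketch */

NET_BUFFER *nb   = nbl->FirstNetBuffer;
ULONG       need = NET_BUFFER_DATA_LENGTH(nb);
if (need > INSPECT_BYTES)
    need = INSPECT_BYTES;

UCHAR *scratch = (UCHAR *)ExAllocatePoolWithTag(NonPagedPoolNx, need, INSPECT_TAG);
if (scratch != NULL) {
    /* If the first 'need' bytes live in one MDL, this returns a pointer
       straight into it; otherwise it copies them into 'scratch'. */
    UCHAR *data = NdisGetDataBuffer(nb, need, scratch, 1, 0);
    if (data != NULL) {
        /* ... inspect data[0 .. need-1] here (read-only) ... */
    }
    ExFreePoolWithTag(scratch, INSPECT_TAG);
}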
What if you do need to scale to high throughputs, or modify the payload? Then you need to work out how to manipulate the MDL chain. That's a bit confusing at first, but spend a little time with the documentation and draw yourself some whiteboard diagrams.
I suggest first starting out by understanding an MDL. From networking's point of view, an MDL is just a fancy way of holding a { char * buffer, size_t length }, along with a link to the next MDL.
Next, consider the NB's DataOffset and DataLength. These conceptually move the buffer boundaries in from the beginning and the end of the buffer. They don't really care about MDL boundaries -- for example, you can reduce the length of the packet payload by decrementing DataLength, and if that means that one or more MDLs are no longer contributing any buffer space to the packet payload, it's no big deal, they're just ignored.
Finally, add on top CurrentMdl and CurrentMdlOffset. These are redundant with everything above, but they exist for (microbenchmark) performance. You aren't required to even think about them if you're reading the NB, but if you are editing the size of the NB, you do need to update them.
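To make that a little more concrete, here is a rough sketch of walking the MDL chain of a single NET_BUFFER by hand, honoring CurrentMdl, CurrentMdlOffset and DataLength (illustrative only; the helper name is mine and error handling is minimal):
/* Walk the payload of one NET_BUFFER, MDL by MDL. */
void WalkNetBufferPayload(NET_BUFFER *nb)
{
    MDL  *mdl       = NET_BUFFER_CURRENT_MDL(nb);
    ULONG mdlOffset = NET_BUFFER_CURRENT_MDL_OFFSET(nb);
    ULONG remaining = NET_BUFFER_DATA_LENGTH(nb);

    while (mdl != NULL && remaining > 0) {
        UCHAR *base = (UCHAR *)MmGetSystemAddressForMdlSafe(mdl, NormalPagePriority);
        ULONG  len  = MmGetMdlByteCount(mdl);

        if (base == NULL)
            return;                      /* mapping failed; bail out */

        /* The first MDL may start partway in; later MDLs start at offset 0. */
        UCHAR *chunk    = base + mdlOffset;
        ULONG  chunkLen = len - mdlOffset;
        if (chunkLen > remaining)
            chunkLen = remaining;        /* DataLength may end mid-MDL */

        /* ... inspect chunk[0 .. chunkLen-1] here ... */

        remaining -= chunkLen;
        mdlOffset  = 0;
        mdl        = mdl->Next;
    }
}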

Why does FTP require the port number to be split?

I recently had to implement an FTP client (in active mode). Something I found remarkable in RFC 959 is the fact that the port number has to be split into two 8-bit values for the PORT command.
An example: when using port 20000 on the client, the number has to be split at the binary level. 20000 in base 10 is 0100111000100000 in base 2. This is split into 01001110 and 00100000, which are respectively 78 and 32. These two numbers are then sent as plaintext decimal digits.
Is there any reason why the standard chose this approach? It seems weird both from an efficiency and an easy to debug standpoint.
Is there any reason why the standard chose this approach?
This is likely lost to history. But the IP:port notation commonly used today was probably not yet established at the time (this was way before HTTP and the syntax of URLs), so encoding a sockaddr_in, with its 4-byte IP and 2-byte port, as a sequence of 6 comma-delimited numbers probably made some sense.
It seems weird both from an efficiency and an easy to debug standpoint.
FTP is a text-based protocol. Efficiency was obviously not a design criterion - otherwise it would have been done all in binary. Having a sequence of 6 numbers instead of IP:port is fine for debugging if the layer where the debugging is done is C code and you are effectively dealing with 6 bytes of addressing (4-byte IP, 2-byte port) in the form of a sockaddr_in struct.
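For completeness, a small C sketch of the client-side arithmetic behind a PORT command (192.168.0.10 and port 20000 are just example values):
#include <stdio.h>

int main(void)
{
    unsigned int ip[4] = { 192, 168, 0, 10 };    /* example client address */
    unsigned int port  = 20000;                  /* example data port      */

    unsigned int p1 = (port >> 8) & 0xFF;        /* high byte: 78 */
    unsigned int p2 =  port       & 0xFF;        /* low byte:  32 */

    /* PORT h1,h2,h3,h4,p1,p2 -- six decimal numbers, comma separated */
    printf("PORT %u,%u,%u,%u,%u,%u\r\n",
           ip[0], ip[1], ip[2], ip[3], p1, p2);
    return 0;
}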

Significance of Bytes as 8 bits

I was just wondering why a byte is 8 bits. Specifically, if we talk about the ASCII character set, all of its symbols can be represented in just 7 bits, leaving one spare bit (in a world where 8 bits is 1 byte). So suppose there is a big company in which everyone has agreed to use only the ASCII character set and nothing else (and this company doesn't have to deal with the outside world). Couldn't the developers in this company write software that treats 7 bits as 1 byte and hence save one precious bit per character? They would save, for instance, 10 bits of space for every 10 bytes (where 1 byte is 7 bits again), and so, ultimately, lots and lots of precious space. The hardware (hard disk, processor, memory) used in this company would specifically know that it needs to store and bunch together 7 bits as 1 byte. If this were done globally, couldn't it revolutionise the future of computers? Can this system be developed in reality?
Won't this be efficient ?
A byte is not necessarily 8 bits. A byte is a unit of digital information whose size is processor-dependent. Historically, the size of a byte is equal to the size of a character as specified by the character encoding supported by the processor. For example, a processor that supports Binary-Coded Decimal (BCD) characters defines a byte to be 4 bits. A processor that supports ASCII defines a byte to be 7 bits. The reason for using the character size to define the size of a byte is to make programming easier, considering that a byte has always (as far as I know) been used as the smallest addressable unit of data storage. If you think about it, you'll find that this is indeed very convenient.
A byte was defined to be 8 bits in the extremely successful IBM S/360 computer family, which used an 8-bit character encoding called EBCDIC. IBM, through its S/360 computers, introduced several crucially important computing techniques that became the foundation of all future processors, including the ones we are using today. In fact, the term byte was coined by Werner Buchholz, a computer scientist at IBM.
When Intel introduced its first 8-bit processor (the 8008), a byte was defined to be 8 bits even though the instruction set didn't directly support any character encoding, thereby breaking the pattern. The processor, however, provided numerous instructions that operate on packed (4-bit) and unpacked (8-bit) BCD-encoded digits. In fact, the whole x86 instruction set was conveniently designed around 8-bit bytes. The fact that 7-bit ASCII characters fit in 8-bit bytes was a free, additional advantage. As usual, a byte is the smallest addressable unit of storage. I would like to mention here that in digital circuit design, it's convenient for the number of wires or pins to be a power of 2 so that every possible value that appears at an input or output has a use.
Later processors continued to use 8-bit bytes because that makes it much easier to develop newer designs based on older ones. It also helps make newer processors compatible with older ones. Therefore, instead of changing the size of a byte, the register, data bus, and address bus sizes were doubled every time (now we have reached 64-bit). This doubling enabled us to reuse existing digital circuit designs easily, significantly reducing processor design costs.
The main reason why it's 8 bits and not 7 is that it needs to be a power of 2.
Also: imagine what nibbles would look like in 7-bit bytes...
8-bit bytes are also ideal (and fast) for conversion to and from hexadecimal.
Update:
What advantage do we get if we have power of 2... Please explain
First, let's distinguish between a byte and an ASCII character. Those are 2 different things.
A byte is used to store and process digital information (numbers) in an optimized way, whereas a character is (or should be) only meant for interacting with us humans, because we find it hard to read binary (although in these modern days of big data, big internet speeds and big clouds, even servers have started talking to each other in text (XML, JSON), but that's a whole different story...).
As for a byte being a power of 2, the short answer:
The advantage of having powers of 2 is that data can easily be aligned efficiently on byte or integer boundaries - for a single byte that would be 1, 2, 4 and 8 bits, and it gets better with higher powers of 2.
Compare that to 7-bit ASCII (or a 7-bit byte): 7 is a prime number, which means only 1-bit and 7-bit values could be stored in an aligned form.
Of course there are a lot more reasons one could think of (for example the layout and structure of the logic gates and multiplexers inside CPUs/MCUs).
Say you want to control the input or output pins on a multiplexer: with 2 control lines (bits) you can address 4 pins, with 3 control lines 8 pins can be addressed, with 4 -> 16, and so on - the same goes for address lines. So the more you look at it, the more sense it makes to use powers of 2. It seems to be the most efficient model.
As for optimized 7-bit ASCII:
Even on a system with 8-bit bytes, 7-bit ASCII can easily be compacted with some bit-shifting. A class with an operator[] could be created, without the need for 7-bit bytes (and of course, simple compression would do even better).
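The answer above mentions bit-shifting and an operator[]; as a minimal sketch of the same packing idea in plain C (the function name is mine, and the input length is assumed to be a multiple of 8 for brevity), 8 seven-bit ASCII characters can be squeezed into 7 bytes like this:
#include <stdint.h>
#include <stdio.h>

/* Pack ASCII characters (7 significant bits each) into bytes by
   streaming the bits through an accumulator. */
static void pack7(const char *in, size_t len, uint8_t *out)
{
    uint32_t acc = 0;    /* bit accumulator                 */
    int nbits    = 0;    /* bits currently held in acc      */
    size_t o     = 0;

    for (size_t i = 0; i < len; i++) {
        acc = (acc << 7) | (in[i] & 0x7F);   /* push 7 bits        */
        nbits += 7;
        while (nbits >= 8) {                 /* pop full bytes     */
            nbits -= 8;
            out[o++] = (uint8_t)(acc >> nbits);
        }
    }
}

int main(void)
{
    uint8_t packed[7];
    pack7("ABCDEFGH", 8, packed);            /* 8 chars -> 7 bytes */
    for (int i = 0; i < 7; i++)
        printf("%02x ", packed[i]);
    printf("\n");
    return 0;
}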

What's the reason behind ZigZag encoding in Protocol Buffers and Avro?

ZigZag requires a lot of overhead to write/read numbers. Actually I was stunned to see that it doesn't just write int/long values as they are, but does a lot of additional scrambling. There's even a loop involved:
https://github.com/mardambey/mypipe/blob/master/avro/lang/java/avro/src/main/java/org/apache/avro/io/DirectBinaryEncoder.java#L90
I can't seem to find, in the Protocol Buffers docs or in the Avro docs, or work out myself, what the advantage of scrambling numbers like that is. Why is it better to have positive and negative numbers alternate after encoding?
Why aren't they just written in little-endian or big-endian (network) order, which would only require reading them into memory and possibly reversing the byte order? What do we buy by paying with performance?
It is a variable-length encoding that stores 7 bits of the value per byte. Every byte except the last has its high bit set to 1; a high bit of 0 marks the final byte, which is how the decoder can tell how many bytes were used to encode the value. The 7-bit groups are written least-significant first, i.e. effectively little-endian, regardless of the machine architecture.
It is an encoding trick that permits writing as few bytes as needed to encode the value. ZigZag itself handles the sign: it maps signed values to unsigned ones so that numbers of small magnitude stay small (0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...); without it, a small negative number in two's complement has all of its high bits set and would always need the maximum number of bytes. So an 8-byte long with a value between -64 and 63 takes only one byte. Which is common; the full range provided by long is very rarely used in practice.
Packing the data tightly without the overhead of a gzip-style compression method was the design goal. Also used in the .NET Framework. The processor overhead needed to en/decode the value is inconsequential. Already much lower than a compression scheme, it is a very small fraction of the I/O cost.
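To make the mechanics concrete, here is a small C sketch of both halves as the Protocol Buffers and Avro documentation describe them: ZigZag to fold the sign, then the base-128 varint loop (the function names are mine):
#include <stdint.h>
#include <stdio.h>

/* ZigZag: map signed to unsigned so small negatives stay small.
   -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4, ... */
static uint64_t zigzag_encode(int64_t n)
{
    return ((uint64_t)n << 1) ^ (uint64_t)(n >> 63);  /* arithmetic shift assumed */
}

static int64_t zigzag_decode(uint64_t z)
{
    return (int64_t)(z >> 1) ^ -(int64_t)(z & 1);
}

/* Base-128 varint: 7 payload bits per byte, least significant group
   first; the high bit is set on every byte except the last. */
static size_t varint_write(uint64_t v, uint8_t *out)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)(v | 0x80);   /* low 7 bits + continuation flag */
        v >>= 7;
    }
    out[n++] = (uint8_t)v;                /* final byte, high bit clear     */
    return n;
}

int main(void)
{
    uint8_t buf[10];
    size_t  len = varint_write(zigzag_encode(-3), buf);  /* -3 -> zigzag 5 */
    printf("%zu byte(s): %02x (decodes to %lld)\n",
           len, buf[0], (long long)zigzag_decode(5));
    return 0;
}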

Can the AES algorithm work the same over plain text and over byte sequences?

It's clear how the algorithm handles plain text: the characters' byte values fill the state matrix.
But what about AES encryption of binary files?
How does the algorithm manage files larger than 16 bytes, given that the state is standardized to be 4x4 bytes?
The AES primitive is the basis of constructions that allow encryption/decryption of arbitrary binary streams.
AES-128 takes a 128-bit key and a 128-bit data block and "encrypts" or "decrypts" this block. 128 bit is 16 bytes. Those 16 bytes can be text (e.g. ASCII, one character per byte) or binary data.
A naive implementation would just break a file longer than 16 bytes into groups of 16 bytes and encrypt each of these with the same key (this is essentially ECB mode). You might also need to "pad" the file to make its length a multiple of 16 bytes. The problem with that is that it exposes information about the file, because every time you encrypt the same block with the same key you'll get the same ciphertext.
There are different ways to build on the AES function to encrypt/decrypt more than 16 bytes securely. For example you can use CBC or use counter mode.
Counter mode is a little easier to explain, so let's look at that. If AES_e(k, b) encrypts block b with key k, we do not want to reuse the same key to encrypt the same block more than once. So the construction we'll use is something like this:
Calculate AES_e(k, 0), AES_e(k, 1), ..., AES_e(k, n)
Now we can take arbitrary input, break it into 16-byte blocks, and XOR it with this sequence. Since the attacker does not know the key, they cannot regenerate this sequence and decode our (longer) message. The XOR is applied bit by bit between the blocks generated above and the cleartext. The receiving side can now generate the same sequence, XOR it with the ciphertext and retrieve the cleartext.
In practice you also want to combine this with some sort of authentication mechanism, so you would use something like AES-GCM or AES-CCM.
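As a rough sketch of that counter-mode construction (aes_encrypt_block is a stand-in for whatever single-block AES primitive your crypto library provides, and the counter handling is deliberately simplified - a real implementation would also mix a per-message nonce into the counter block):
#include <stdint.h>
#include <stddef.h>

/* Stand-in for a real AES-128 single-block encryption from your
   library of choice.  Not implemented here. */
void aes_encrypt_block(const uint8_t key[16],
                       const uint8_t in[16],
                       uint8_t out[16]);

/* Encrypt (or decrypt -- the operation is symmetric) 'len' bytes in
   counter mode: generate AES_e(k, 0), AES_e(k, 1), ... and XOR the
   keystream into the data. Never reuse a (key, counter) pair. */
void ctr_xcrypt(const uint8_t key[16], uint8_t *data, size_t len)
{
    uint8_t counter[16] = {0};
    uint8_t keystream[16];

    for (size_t off = 0; off < len; off += 16) {
        aes_encrypt_block(key, counter, keystream);

        size_t n = (len - off < 16) ? (len - off) : 16;
        for (size_t i = 0; i < n; i++)
            data[off + i] ^= keystream[i];     /* XOR bit by bit */

        /* Increment the 128-bit counter (big-endian). */
        for (int i = 15; i >= 0; i--)
            if (++counter[i] != 0)
                break;
    }
}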
Imagine you have a 17-byte plaintext.
The state matrix will be filled with the first 16 bytes and one block will be encrypted.
The next block will be the 1 byte that is left, and the state matrix will be padded with data in order to fill the 16 bytes AES needs.
It works well with bytes/binary files because AES always operates on bytes as its units. It does not matter whether that is an ASCII chunk or anything else. Just remember that everything in a computer is binary/bytes/bits. As long as the data is a stream of bytes (chunks of information in bytes), it will work fine.
