What type of framing to use in serial communication

In a serial communication link, what is the preferred message framing/sync method?
Framing with SOF and escape sequences, as in HDLC?
Relying on a header with length info and a CRC?
It's an embedded system using DMA transfers of data from UART to memory.
I think the framing method with SOF is most attractive, but maybe the other one is good enough?
Does anyone have pros and cons for these two methods?

The following is based on UART serial experience, not research.
I have found fewer communication issues when the following are included - in other words, do both SOF/EOF and (length, maybe)/check code. Frame:
SOFrame
(Length maybe)
Data (address, to, from, type, sequence #, opcode, bytes, etc.)
CheckCode
EOFrame
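For concreteness, here is a minimal sketch (in C) of a sender building such a frame. The 1-byte SOF/EOF markers, 2-byte length, and 4-byte check code are illustrative choices, not part of the answer above, and escaping/byte stuffing is omitted:

```c
#include <stdint.h>
#include <string.h>

#define SOF_BYTE 0x7E                 /* example marker values */
#define EOF_BYTE 0x7D

uint32_t crc32(const uint8_t *data, size_t len);  /* e.g. the CRC-32 sketched further down */

/* Build one frame: SOF | LEN (2 bytes) | DATA | CRC (4 bytes, over LEN+DATA) | EOF.
 * Returns bytes written, or 0 if the output buffer is too small.
 * No escaping is done, so SOF_BYTE/EOF_BYTE must not occur in the
 * payload, or HDLC-style byte stuffing has to be added on top. */
size_t frame_encode(uint8_t *out, size_t out_cap,
                    const uint8_t *payload, size_t payload_len)
{
    size_t need = 1 + 2 + payload_len + 4 + 1;
    size_t i = 0;
    uint32_t crc;

    if (out_cap < need || payload_len > 0xFFFF)
        return 0;

    out[i++] = SOF_BYTE;
    out[i++] = (uint8_t)(payload_len >> 8);
    out[i++] = (uint8_t)(payload_len & 0xFF);
    memcpy(&out[i], payload, payload_len);
    i += payload_len;

    crc = crc32(&out[1], 2 + payload_len);          /* cover LEN + DATA */
    out[i++] = (uint8_t)(crc >> 24);
    out[i++] = (uint8_t)(crc >> 16);
    out[i++] = (uint8_t)(crc >> 8);
    out[i++] = (uint8_t)crc;
    out[i++] = EOF_BYTE;
    return i;
}
```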
Invariably, the received "frames" include:
1. Good ones - no issues.
2. Corrupt due to the sender not sending a complete message (it hung, powered down, or emitted a partial power-on transmission). The receiver should time out stale incomplete messages.
3. Corrupt due to noise or transmission interference (byte framing errors, parity errors, incorrect data).
4. Corrupt due to the receiver starting up in the middle of a sent message, or missing a few bytes due to input buffer over-run.
5. Shared bus collisions.
6. Break - is this legit in your system?
Whatever framing you use, ensure it is robust against these message types: promptly validate #1, and rapidly identify #2-#5 and become ready for the next frame.
SOF has the huge advantage of making it easy to get started again if the receiver is lost due to a previous bad frame, etc.
Length is good, but IMHO the least useful. It can limit throughput if the length needs to be at the beginning of a message. Some low-latency operations just do not know the length before they are ready to begin transmitting.
CRC: recommend more than 2 bytes. A short check code does not improve things enough for me; I'd rather have no check code than a 1-byte one. If errors that occur from time to time are only caught by the check code, I want something better than a 2-byte code's 99.999%; I like a 4-byte code's 99.99999997%.
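For reference, a 4-byte check code does not have to be expensive. A bitwise CRC-32 (the common reflected 0xEDB88320 polynomial, zlib-style parameters) is only a few lines; this is a generic sketch, not tied to any particular protocol:

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), the common "zlib"
 * parameterization: init 0xFFFFFFFF, final XOR 0xFFFFFFFF. A table-
 * driven variant is faster at the cost of ~1 KB of lookup table. */
uint32_t crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return ~crc;
}
```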
EOF: so useful!
BTW: If your protocol is ASCII (instead of binary), I recommend not using CR or LF as the EOFrame. Maybe only use them out-of-frame, where they are not part of a message.
BTW2: If your receiver can auto-detect the baud rate, it saves a lot of configuration issues.
BTW3: A sender could consider sending a "nothing" byte (before the SOF) to ensure proper SOF syncing.
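Putting the pieces together, a receive-side sketch (reusing SOF_BYTE, EOF_BYTE and crc32() from the sketches above, still without escaping) that resynchronizes on SOF after any bad or partial frame could look like this:

```c
#include <stddef.h>
#include <stdint.h>

uint32_t crc32(const uint8_t *data, size_t len);   /* from the sketch above */

/* Byte-at-a-time receiver for the frame layout above. It hunts for
 * SOF, buffers until EOF, then validates length and CRC, so it
 * recovers automatically after a corrupt or truncated frame. */
typedef struct {
    uint8_t  buf[512];
    uint16_t len;
    uint8_t  in_frame;
} rx_t;

/* Feed every byte drained from the UART/DMA buffer. Returns the
 * payload length (>0) and sets *payload when a valid frame completes,
 * otherwise returns 0. */
int rx_feed(rx_t *rx, uint8_t b, const uint8_t **payload)
{
    uint32_t rx_crc;
    uint16_t plen;

    if (!rx->in_frame) {
        if (b == SOF_BYTE) { rx->in_frame = 1; rx->len = 0; }
        return 0;
    }
    if (b != EOF_BYTE) {
        if (rx->len < sizeof rx->buf)
            rx->buf[rx->len++] = b;
        else
            rx->in_frame = 0;                  /* overrun: resync on next SOF */
        return 0;
    }

    rx->in_frame = 0;                          /* EOF seen: validate frame    */
    if (rx->len < 2 + 4)
        return 0;                              /* too short for LEN + CRC     */
    plen = (uint16_t)((rx->buf[0] << 8) | rx->buf[1]);
    if (plen != rx->len - 2 - 4)
        return 0;                              /* length field mismatch       */
    rx_crc = ((uint32_t)rx->buf[rx->len - 4] << 24) |
             ((uint32_t)rx->buf[rx->len - 3] << 16) |
             ((uint32_t)rx->buf[rx->len - 2] << 8)  |
              (uint32_t)rx->buf[rx->len - 1];
    if (crc32(rx->buf, (size_t)(rx->len - 4)) != rx_crc)
        return 0;                              /* corrupt frame               */

    *payload = &rx->buf[2];
    return plen;
}
```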

Related

WebSocket frame fragmentation in an API

Would exposing a WebSocket fragmentation have any value in a client-side API?
Reading the RFC 6455 I became convinced a non-continuation frame doesn't guarantee you anything in terms of its semantics. One shouldn't rely on frame boundaries. It's just too risky. The spec addresses this explicitly:
Unless specified otherwise by an extension, frames have no semantic
meaning. An intermediary might coalesce and/or split frames, if no
extensions were negotiated by the client and the server or if some
extensions were negotiated, but the intermediary understood all the
extensions negotiated and knows how to coalesce and/or split frames
in the presence of these extensions. One implication of this is that
in absence of extensions, senders and receivers must not depend on
the presence of specific frame boundaries.
Thus receiving a non-continuation frame of type Binary or Text doesn't mean it's something atomic and meaningful that has been sent from the other side of the channel. Similarly a sequence of continuation frames doesn't mean that coalescing them will yield a meaningful message. And what's even more upsetting,
a single non-continuation type frame may be a result of coalescing many other frames.
To sum up, groups of bytes sent over the WebSocket may be received regrouped pretty much any way, given the byte order is the same (that's of course in absence of extensions).
If so, then is it useful to introduce this concept at all? Maybe it's better to hide it as a detail of implementation? I wonder if WebSocket users have found it useful in such products like Netty, Jetty, Grizzly, etc. Thanks.
Fragmentation is not a boundary for anything.
It's merely a way for the implementation to handle itself based on memory, websocket extensions, performance, etc.
A typical scenario would be a client endpoint sending text, which is passed through the permessage-deflate extension. That extension compresses and generates fragments based on its deflate algorithm's memory configuration, writing those fragments to the remote endpoint as it accumulates a buffer of compressed data to write (some implementations will only write if the buffer is full or the message has received its final byte).
While exposing access to the fragments in an API has happened (Jetty has 2 core WebSocket APIs, both of which support fragment access), it's really only useful for those wanting lower-level control in streaming applications (think video/VoIP, where you want to stream with quality adjustments, dropping data if need be, not writing too fast, etc.).
There seems to be some ambiguity in the RFC concerning unfragmented messages, that they can be split or combined arbitrarily. But, in the situation where a message is deliberately sent as multiple fragments (totalling X bytes), is it allowable for an intermediary to split some of these frames in a way that returns a different number (than X) of bytes in the sequence? I don't think that is allowed and fragmentation has some value in that respect. This is just from reading the RFC, as opposed to looking at real implementations.
The fragments of one message MUST NOT be interleaved between the
fragments of another message unless an extension has been
negotiated that can interpret the interleaving.
To my reading this implies that unless some extension has been negotiated which allows it, fragments from different messages cannot be interleaved and this means that while the number of fragments can be altered, the exact number of bytes (and the bytes themselves) cannot be.
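In other words, application code should only trust message boundaries, not frame boundaries. As a library-neutral illustration (in C, to match the other sketches in this collection; ws_frame_t is a made-up stand-in for whatever parsed-frame structure your library exposes, not a real API), a receiver simply coalesces fragment payloads until the FIN bit:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a parsed WebSocket data frame. */
typedef struct {
    int            fin;      /* FIN bit: last fragment of the message */
    const uint8_t *payload;
    size_t         len;
} ws_frame_t;

typedef struct {
    uint8_t *data;
    size_t   len;
} ws_message_t;

/* Append one data frame to the message under construction.
 * Returns 1 when the message is complete, 0 if more fragments are
 * expected, -1 on allocation failure. However an intermediary
 * re-fragments the message, the coalesced bytes are the same. */
int ws_collect(ws_message_t *msg, const ws_frame_t *frame)
{
    uint8_t *p = realloc(msg->data, msg->len + frame->len);
    if (p == NULL)
        return -1;
    memcpy(p + msg->len, frame->payload, frame->len);
    msg->data = p;
    msg->len += frame->len;
    return frame->fin ? 1 : 0;
}
```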
There should be support for controlling fragmentation; We have a C# program that intentionally splits a large WebSocket message into small fragments so a small embedded processor receiving the data can process small chunks at a time. Instead it is arriving completely coalesced into a single large block consuming most of the available memory.
We are not sure where the coalescing is taking place. Maybe the C# library.

How to test algorithm performance on devices with low resources?

I am interested in using Atmel AVR controllers to read data from a LIN bus. Unfortunately, messages on such a bus have no beginning or end indicator, and the only reasonable solution seems to be brute-force parsing. Available data from the bus is loaded into a circular buffer, and the brute-force method finds valid messages in the buffer.
Working with a 64-byte buffer and a 20 MHz ATtiny, how can I test the performance of my code in order to see if buffer overflow is likely to occur? Added: My concern is that the algorithm will run slowly, thus buffering even more data.
A bit about the brute-force algorithm: the second element in the buffer is assumed to be the message size. For example, if the assumed length is 22, the first 21 bytes are XORed and tested against the 22nd byte in the buffer. If the checksum passes, the code checks whether the first (SRC) and third (DST) bytes are what they are supposed to be.
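For concreteness, a sketch of that brute-force check over the circular buffer; the field positions follow the description above, while the helper names and the minimum-length sanity check are invented:

```c
#include <stdint.h>

#define BUF_SIZE 64u                      /* circular RX buffer, power of two */

static uint8_t rx_buf[BUF_SIZE];

static uint8_t at(uint16_t start, uint16_t off)
{
    return rx_buf[(start + off) & (BUF_SIZE - 1u)];
}

/* Try 'start' as the first byte of a message: take buf[start+1] as the
 * length, XOR the first length-1 bytes, compare with the last byte,
 * then sanity-check the SRC and DST fields. Returns 1 on a match. */
int try_parse(uint16_t start, uint8_t expected_src, uint8_t expected_dst)
{
    uint8_t len = at(start, 1);           /* assumed message size            */
    if (len < 4u || len > BUF_SIZE)       /* reject absurd candidates early  */
        return 0;

    uint8_t x = 0;
    for (uint8_t i = 0; i < len - 1; i++)
        x ^= at(start, i);
    if (x != at(start, len - 1))          /* checksum byte is the last byte  */
        return 0;

    return at(start, 0) == expected_src && at(start, 2) == expected_dst;
}
```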
AVR is one of the easiest microcontrollers for performance analysis, because it is a RISC machine with a simple instruction set and well-known execution time for each instruction.
So, the basic procedure is that you take the assembly code and start calculating different scenarios. Basic register operations take one clock cycle, branches usually two cycles, and memory accesses three cycles. A XORing loop would take maybe 5-10 cycles per byte, so it is relatively cheap. How you get your hands on the assembly code depends on the compiler, but all compilers tend to give you the end result in a reasonably legible form.
Usually, without seeing the algorithm and knowing anything about the timing requirements, it is quite impossible to give a definite answer to this kind of question. However, as the LIN bus speed is limited to 20 kbit/s, you will have around 10 000 clock cycles for each byte. That is enough for almost anything.
A more difficult question is what to do with the LIN framing, which is dependent on timing. That is not a very nice habit, as it requires some extra effort from the microcontroller. (What on earth is wrong with using the 9th bit?)
The LIN frame consists of a
break (at least 13 bit times)
synch delimiter (0x55)
message id (8 bits)
message (0..8 x 8 bits)
checksum (8 bits)
There are at least four possible approaches with their ups and downs:
1. (Your approach.) Start at all possible starting positions and try to figure out where the checksummed message is. Once you are in sync, this is not needed. (Easy, but returns ghost messages with a probability of 1/256. Remember to discard the synch field.)
2. Use the internal UART and look for the synch field; try to figure out whether the data after the delimiter makes any sense. (This has a lower probability of errors than the above, but requires the synch delimiter to come through without glitches and may thus miss messages.)
3. Look for the break. The easiest way to do this is to timestamp all arriving bytes. It is quite probably not required to buffer the incoming data in any way, as the data rate is very low (max. 2000 bytes/s). Nominally, the distance between the end of the last character of a frame and the start of the first character of the next frame is at least 13 bits. As receiving a character takes 10 bits, the delay between receiving the end of the last character in the previous message and the end of the first character of the next message is nominally at least 23 bits. To allow some tolerance in the bit timing, the limit could be set to, e.g., 17 bits. If the distance in time between "character received" interrupts exceeds this limit, the characters belong to different frames. Once you have detected the break, you may start collecting a new message. (This works almost according to the official spec.)
4. Do-it-yourself, bit by bit. If you do not have good synchronization between the slave and the master, you will have to determine the master clock using this method. The implementation is not very straightforward, but one example is: http://www.atmel.com/images/doc1637.pdf (I do not claim that one to be foolproof; it is rather simplistic.)
I would go with #3. Create an interrupt for incoming data and whenever data comes you compare the current timestamp (for which you need a counter) to the timestamp of the previous interrupt. If the inter-character time is too long, you start a new message, otherwise append to the old message. Then you may need double buffering for the messages (one you are collecting, another you are analyzing) to avoid very long interrupt routines.
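A sketch of that interrupt in C follows. The register and vector names assume an ATtiny2313-style USART and Timer1 and will need adjusting for the exact part; ticks_per_17_bits is assumed to be precomputed from the baud rate and timer prescaler, and a single buffer with a ready flag stands in for the double buffering suggested above:

```c
#include <avr/io.h>
#include <avr/interrupt.h>
#include <stdint.h>

#define MSG_MAX 11                       /* sync + id + 8 data + checksum */

volatile uint8_t  msg[MSG_MAX];
volatile uint8_t  msg_len;
volatile uint8_t  msg_ready;             /* consumed and cleared in main() */
static   uint16_t ticks_per_17_bits;     /* set during init from baud rate */

ISR(USART_RX_vect)
{
    static uint16_t last_stamp;
    uint16_t now  = TCNT1;               /* free-running 16-bit timer      */
    uint8_t  byte = UDR;                 /* UDR0 on some parts             */

    /* Inter-character gap longer than ~17 bit times: frame boundary.
     * (The break itself may also show up as a 0x00 character with the
     * framing-error flag set; filtering on FE is a further refinement.) */
    if ((uint16_t)(now - last_stamp) > ticks_per_17_bits) {
        if (msg_len)
            msg_ready = 1;               /* hand the previous frame over   */
        msg_len = 0;                     /* start collecting a new one     */
    }
    last_stamp = now;

    if (msg_len < MSG_MAX)
        msg[msg_len++] = byte;
}
```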
The actual implementation depends on the other structure of your code. This shouldn't take much time.
And if you cannot make sure your clock is well enough synchronized (+-4%) to the master clock, then you'll have to look at #4, which is probably much more instructive but quite tedious.
Your fundamental question is this (as I see it):
how can I test the performance of my code in order to see if buffer overflow is likely to occur?
Set a pin high at the start of the algorithm, set it low at the end. Look at it on an oscilloscope (I assume you have one of these - embedded development is very difficult without it.) You'll be able to measure the max time the algorithm takes, and also get some idea of the variability.
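A minimal sketch of that instrumentation; PB0 and parse_buffer() are placeholders for your own spare pin and the routine under test, and the pin must be configured as an output (DDRB) during init:

```c
#include <avr/io.h>

void parse_buffer(void);                 /* hypothetical: the scan under test */

void parse_buffer_timed(void)
{
    PORTB |=  (1 << PB0);                /* start of the algorithm */
    parse_buffer();
    PORTB &= ~(1 << PB0);                /* end of the algorithm   */
}
```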

Magic number with MmMapIoSpace

So upon mapping a memory space with MmMapIoSpace, I noticed that past a certain point, the data was just being discarded when written to. No errors, breakpoints, or even bugchecks were thrown. Everything worked as normal, just without any adverse effects.
I decided to do a write/read test (the driver would write 1's to every byte for the length of the intended size), and the user-mode reader would read and report where the 1's ended.
The number it came up with was 3208, which is a seemingly nice, round number (/8=401, /256=12, etc.)
What's up with this? How come I can't map the full buffer space?
EDIT And in 64-bit it drops to 2492.
I'm no expert, but I don't see how MmMapIoSpace can be relied upon to do what you're asking it to, because there's no guarantee that the user-space buffer is contiguous in physical memory.
Instead, I think you should be using IoAllocateMdl and MmProbeAndLockPages to lock down the user buffer and then MmGetSystemAddressForMdlSafe to map it into the system address space. This process is described here.
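A sketch of that sequence follows; the function name and the minimal error handling are mine, not from the linked description, and the buffer/length would come from your IOCTL handler:

```c
#include <ntddk.h>

/* Lock a user buffer and map it into system address space. */
NTSTATUS MapUserBuffer(PVOID UserBuffer, ULONG Length,
                       PMDL *MdlOut, PVOID *SystemVaOut)
{
    PMDL  mdl;
    PVOID va;

    mdl = IoAllocateMdl(UserBuffer, Length, FALSE, FALSE, NULL);
    if (mdl == NULL)
        return STATUS_INSUFFICIENT_RESOURCES;

    __try {
        /* Probe for write access, since the driver writes the buffer. */
        MmProbeAndLockPages(mdl, UserMode, IoWriteAccess);
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        IoFreeMdl(mdl);
        return GetExceptionCode();
    }

    va = MmGetSystemAddressForMdlSafe(mdl, NormalPagePriority);
    if (va == NULL) {
        MmUnlockPages(mdl);
        IoFreeMdl(mdl);
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    *MdlOut      = mdl;
    *SystemVaOut = va;
    return STATUS_SUCCESS;
    /* Cleanup later: MmUnlockPages(mdl); then IoFreeMdl(mdl). */
}
```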
As previously stated, I think that the point at which the mapping is failing (3208/2492 bytes into the buffer) is probably just the end of the page, but that's easy enough for you to verify: get the user-space application to report the (virtual) address of the first byte that didn't get written, rather than the offset, and check whether it is a multiple of 4096 or not.

Sending (serial) break using windows (XP+) api

Is there a better way to send a serial break than the SetCommBreak - delay - ClearCommBreak sequence?
I have to communicate with a microcontroller that uses a serial break as the start of a packet at 115k2, and the SetCommBreak approach has two problems:
At 115k2, the break is well below 1 ms, and it gets timing critical.
Since the break must be embedded in the packet stream at the correct position, I expect trouble with the FIFO.
Is there a better way of doing this, without moving the serial communication to a thread without FIFO? The UART is typically a 16550+.
I have a choice, in the sense that the microcontroller setup can be switched (other firmware) to a more conventional packet format, but the manual warns that the "break" variant features hardware integrity checking of the serial link.
Compiler is Delphi (2009/XE), but any code or even just a reference is welcome.
The short answer is that serial programming with Windows is fairly limited :-(
You're right that the normal way of sending a break is with SetCommBreak(), and yes, you have to handle the delay yourself - which tends to mean the break ends up substantially longer than it needs to be. The good news is that this doesn't usually matter - most devices expecting a break will treat a much longer break in exactly the same way as a short one.
In the event that your microcontroller is fussy about the precise duration of the break, one way of achieving a shorter, precisely-defined break is to change the baud rate on the port to a slower rate, send a zero byte, then change it back again.
The reason that this works is that a byte sent to the serial port is sent as (usually) one start bit (a zero), followed by the bits in the byte, followed by one or more stop bits (high bits). A 'break' is a sequence of zero bits that is too long to be a byte - i.e. the stop bits don't come in time. By choosing a slower baud rate and sending a zero, you end up holding the line at zero for longer than the receiver expects a byte to be, so it interprets it as a break. (It's up to you whether to determine the baud rate to use by precise calculation or trial-and-error of what the microcontroller seems to like :-)
Of course, either method (SetCommBreak() or baud changing) requires you to know when all data has been sent out of the serial port (i.e. there's nothing left in the transmit FIFO). This nice article about Windows Serial programming describes how to use SetCommMask(), WaitCommEvent() etc. to determine this.
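For illustration, a Win32 C sketch of the baud-change trick (Delphi translates almost one-to-one). The 57600 "slow" rate is an example value to tune, a non-overlapped port handle is assumed, and the short Sleep() is a crude stand-in for waiting on EV_TXEMPTY as described in the article above:

```c
#include <windows.h>

/* Send a short break by temporarily lowering the baud rate and writing
 * a 0x00 byte, which the receiver sees as a break. The transmit FIFO
 * must already be empty before calling this. */
BOOL SendShortBreak(HANDLE hCom)
{
    DCB   dcb;
    DWORD written = 0;
    DWORD oldBaud;
    const BYTE zero = 0x00;

    ZeroMemory(&dcb, sizeof dcb);
    dcb.DCBlength = sizeof dcb;
    if (!GetCommState(hCom, &dcb)) return FALSE;
    oldBaud = dcb.BaudRate;

    dcb.BaudRate = 57600;                  /* slower than 115200        */
    if (!SetCommState(hCom, &dcb)) return FALSE;

    if (!WriteFile(hCom, &zero, 1, &written, NULL)) return FALSE;

    /* Let the zero byte leave the UART before restoring the rate;
     * the EV_TXEMPTY technique is the precise way to do this. */
    Sleep(2);

    dcb.BaudRate = oldBaud;                /* restore the original rate */
    return SetCommState(hCom, &dcb);
}
```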

Ensuring packet order in UDP

I'm using 2 computers with an application to send and receive UDP datagrams. There is no flow control and ICMP is disabled. Frequently when I send a file as UDP datagrams via the application, two packets swap their order, which the application treats as packet loss.
I've disabled any kind of firewall, and there is no hardware switch connected between the computers (they are directly wired).
Is there a way to make sure Winsock and send() will send the packets the same way they got there?
Or is the OS doing that?
Or network device configuration needed?
UDP is a lightweight protocol that by design doesn't handle things like packet sequencing. TCP is a better choice if you want robust packet delivery and sequencing.
UDP is generally designed for applications where packet loss is acceptable or preferable to the delay which TCP incurs when it has to re-request packets. UDP is therefore commonly used for media streaming.
If you're limited to using UDP you would have to develop a method of identifying the out of sequence packets and resequencing them.
UDP does not guarantee that your packets will arrive in order. (It does not even guarantee that your packets will arrive at all.) If you need that level of robustness you are better off with TCP. Alternatively you could add sequence markers to your datagrams and rearrange them at the other end, but why reinvent the wheel?
is there a way to make sure winsock and send() will send the packets the same way they got there?
It's called TCP.
Alternatively, try a reliable UDP protocol such as UDT. I'm guessing you might be on a small embedded platform, so you may want a more compact protocol like Bell Labs' RUDP.
there is no flow control (ICMP disabled)
You can implement your own flow control using UDP:
Send one or more UDP packets
Wait for acknowledgement (sent as another UDP packet from receiver to sender)
Repeat as above
See Sliding window protocol for further details.
[This would be in addition to having a sequence number in the packets which you send.]
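As a sketch of the sequence-number part (an illustrative 4-byte header, not any standard format; Winsock headers shown, POSIX would use <arpa/inet.h> instead):

```c
#include <winsock2.h>      /* htonl/ntohl; link ws2_32 */
#include <stdint.h>
#include <string.h>

/* Prepend a sequence number so the receiver can detect reordering.
 * 'out' must have room for 4 + len bytes. */
size_t pkt_build(uint8_t *out, uint32_t seq,
                 const uint8_t *payload, size_t len)
{
    uint32_t seq_be = htonl(seq);
    memcpy(out, &seq_be, sizeof seq_be);
    memcpy(out + sizeof seq_be, payload, len);
    return sizeof seq_be + len;
}

/* Returns 1 if this datagram is the next expected one, 0 if it is an
 * old/duplicate packet, and -1 if packets in between are still missing
 * (buffer it, or request retransmission in a sliding-window scheme). */
int pkt_check_order(uint32_t *expected_seq, const uint8_t *in)
{
    uint32_t seq;
    memcpy(&seq, in, sizeof seq);
    seq = ntohl(seq);
    if (seq == *expected_seq) { (*expected_seq)++; return 1; }
    return (int32_t)(seq - *expected_seq) < 0 ? 0 : -1;
}
```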
There is no point in trying to create your own TCP-like wrapper. We love the speed of UDP, and that is just going to slow things down. Your problem can be overcome if you design your protocol so that every UDP datagram is independent of the others. Our packets can arrive in any order, so long as the header packet arrives first. The header says how many packets are supposed to arrive. Also, UDP has become a lot more reliable since this post was created over a decade ago. Don't try to reinvent TCP on top of it.
This question is 12 years old, and it seems almost a waste to answer it now, especially as the suggestions I would make have already been posed. I dealt with this issue back in 2002, in a program that was using UDP broadcasts to communicate with other running instances on the network. If a packet got lost, it wasn't a big deal. But if I had to send a large packet, greater than 1020 bytes, I broke it up into multiple packets. Each packet contained a header that described what packet number it was, along with a header that told me it was part of a larger overall packet. So, the structure was created, and the payload was simply dropped into the (correct) place in the buffer, and the bytes were subtracted from the overall total that was needed. I knew all the packets had arrived once the needed byte total reached zero. Once all of the packets arrived, that packet got processed. If another advertisement packet came in, then everything that had been building up was thrown away. That told me that one of the fragments didn't make it. But again, this wasn't critical data; the code could live without it. But I did implement an AdvReplyType in every packet, so that if it was a critical packet, I could reply to the sender with an ADVERTISE_INCOMPLETE_REQUEST_RETRY packet type, and the whole process could start over again.
This whole system was designed for LAN operation, and in all of my debugging/beta testing, I rarely ever lost a packet, but on larger networks I would often get them out of order...but I did get them. Being that it's now 12 years later, and UDP broadcasting seems to be frowned upon by a lot of IT Admins, UDP doesn't seem like a good, solid system any longer. ChrisW mentioned a Sliding Window Protocol; this is sort of what I built...without the sliding part! More of a "Fixed Window Protocol". I just wasted a few more bytes in the header of each of the payload packets to tell how many total bytes are in this Overlapped Packet, which packet this was, and the unique MsgID it belonged to so that I didn't have to get the initial packet telling me how many packets to expect. Naturally, I didn't go as far as implementing RFC 1982, as that seemed like overkill for this. As long as I got one packet, I'd know the total length, unique Message Id, and which packet number this one was, making it pretty easy to malloc() a buffer large enough to hold the entire Message. Then, a little math could tell me where exactly in the Message this packet fits into. Once the Message buffer was filled in...I knew I got the whole message. If a packet arrived that didn't belong to this unique Message ID, then we knew this was a bust, and we likely weren't going to ever get the remainder of the old message.
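For illustration, a per-fragment header along the lines described above might look like this in C; the field names are mine, and the 1020-byte payload size is taken from the description rather than any spec:

```c
#include <stdint.h>

#define FRAG_PAYLOAD_MAX 1020          /* payload bytes per datagram */

/* Enough information to place any fragment into the right spot of the
 * reassembled message without an initial "how many packets" packet. */
typedef struct {
    uint32_t msg_id;                   /* unique id of the whole message   */
    uint32_t total_len;                /* total bytes in the whole message */
    uint16_t frag_index;               /* which fragment this is           */
    uint16_t frag_len;                 /* payload bytes in this fragment   */
} frag_header_t;

/* Offset of this fragment's payload inside the reassembly buffer. */
static inline uint32_t frag_offset(const frag_header_t *h)
{
    return (uint32_t)h->frag_index * FRAG_PAYLOAD_MAX;
}
```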
The only real reason I mention this today is that I believe there is still a time and a place to use a protocol like this, where TCP involves too much overhead on slow or spotty networks - but those are also the networks where packet loss is most likely and most feared. So, again, I'd say that "reliability" cannot be a requirement, or you're just right back to TCP. If I had to write this code today, I probably would have just implemented a multicast system, and the whole process probably would have been a lot easier on me. Maybe. It has been 12 years, and I've probably forgotten a huge portion of the implementation details.
Sorry, if I woke a sleeping giant here, that wasn't my intention. The original question intrigued me, and reminded me of this turn-of-the-century Windows C++ code I had written. So, please try to keep the negative comments to a minimum--if at all possible! (Positive comments, of course...always welcome!) J/K, of course, folks.

Resources