What is a good way to deal with byte alignment and endianness when packing a struct?

My current design involves communication between an embedded system and a PC, and I am always puzzled by the struct design.
The two systems have different endianness that I need to deal with. However, I find that I cannot just do a simple byte-order swap for every 4 bytes to solve the problem. It turns out to depend on the struct.
For example, a struct like this:
struct {
    uint16_t a;
    uint32_t b;
};
would result in padding between a and b. Eventually, the endian swap has to be specific to a and b because of the padding bytes. But it looks ugly, because I need to change the endian-swap logic every time I change the struct content.
What is a good strategy for arranging elements in a struct when padding comes in? Should we try to rearrange the elements so that there are only padding bytes at the end of the struct?
Thanks.

I'm afraid you'll need to do some platform-neutral serialization, since different architectures have different alignment requirements. I don't think there is a safe and generic way to grab a chunk of memory, send it to another architecture, place it at some address there, and read the correct data back from it. Just convert and send the elements one by one: you can push the values into a buffer that will not have any padding, and you'll know exactly what is where. Plus, you decide which side does the conversions (typically the PC has more resources for that). As a bonus, you can checksum/sign the communication to catch errors/tampering.
BTW, afaik while the compiler keeps the order of the variables intact, it theoretically can put some additional padding between them (e.g. for performance reasons), so it's not just an architecture related thing.
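To make the "convert and send the elements one by one" idea concrete, here is a minimal sketch in C. The names `struct msg`, `msg_pack`, `msg_unpack` and `MSG_WIRE_SIZE` are made up for illustration, and big-endian wire order is an arbitrary assumption; any fixed order works as long as both sides agree on it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message mirroring the struct from the question. */
struct msg {
    uint16_t a;
    uint32_t b;
};

/* Wire size: 2 bytes for a + 4 bytes for b, no padding. */
enum { MSG_WIRE_SIZE = 6 };

/* Write each field byte by byte in big-endian order (an assumption of this
 * sketch). Shifts work on values, not on memory layout, so the output is
 * the same whatever the host endianness or struct padding is. */
static size_t msg_pack(const struct msg *m, uint8_t out[MSG_WIRE_SIZE])
{
    out[0] = (uint8_t)(m->a >> 8);
    out[1] = (uint8_t)(m->a);
    out[2] = (uint8_t)(m->b >> 24);
    out[3] = (uint8_t)(m->b >> 16);
    out[4] = (uint8_t)(m->b >> 8);
    out[5] = (uint8_t)(m->b);
    return MSG_WIRE_SIZE;
}

/* Rebuild the fields from the wire bytes, again using only shifts. */
static void msg_unpack(const uint8_t in[MSG_WIRE_SIZE], struct msg *m)
{
    m->a = (uint16_t)(((uint16_t)in[0] << 8) | in[1]);
    m->b = ((uint32_t)in[2] << 24) | ((uint32_t)in[3] << 16) |
           ((uint32_t)in[4] << 8)  |  (uint32_t)in[5];
}
```

With this approach, adding or reordering struct members only means touching the pack/unpack pair, and the same two functions can be compiled unchanged on both the embedded side and the PC.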


Use big.Rat with Go to get Abs() value

I am a beginner with Go and a Java developer.
I am currently working with big.Rat.
I need to get the Abs of a Rat n for which I have to write something like
n.Abs(n) or something like big.Rat{}.Abs(n)
Why doesn't Go provide something like just n.Abs()?
Or am I going wrong somewhere?
Go's big package is concerned with memory allocation when it comes to its function signatures. A big.Rat consists of two big.Ints which each contain an array of uints. Unlike an int (native 32 or 64 bit integer), a big.Int must thus be allocated dynamically, depending on its value. For large values this means more elements in the array.
Your proposed function signature n.Abs() would mean that a new array of the same size as n's would have to be allocated for this operation. In reality, we often have the case that the original n is no longer needed, so we can reuse its existing memory. To allow this, Abs stores its result in its receiver, a pointer to an existing big.Rat, which may be n itself (as in n.Abs(n)). The implementation can then reuse that memory, and the caller is in full control of what memory is used for these operations.
This might not make the nicest API for all use cases. If you just want to do a quick calculation on a few large numbers, on a computer with gigabytes of RAM, you might have preferred the n.Abs() version. But if you do numerically expensive computations with a lot of large numbers, you must be able to control your memory. Imagine doing some image manipulation on a Raspberry Pi, for example, where you are more constrained by the available memory. In this case the existing API allows you to be more efficient.

Protocol Buffers - Best practice for repeated boolean values

I need to transfer some data over a relatively slow (down to only 1Kb/s) connection. I have read that the encoding of Google's protocol buffers is efficient.
That's true for most of my data, but not for boolean values, especially if it is a repeated field.
The problem is that I have to transfer, besides other data, a fixed number (15) of boolean values every 50 milliseconds. Protobuf encodes each boolean value as one byte for the field ID and one byte for the boolean value (0x00 or 0x01), which results in 30 bytes of data for 15 boolean values.
So I am searching for a better way of encoding this now. Has anybody had this problem already? What would be the best practice to reach an efficient encoding for this situation?
My idea was to use an integer data type (uint32) and manually encode the data, with one bit of the integer for every bool. Any feedback about this idea?
In Protobuf, your best bet is to use an integer bitfield. If you have more than 64 bits, use a bytes field (and pack the bits manually).
Note that Cap'n Proto will pack boolean values (in both structs and lists) as individual bits, and so may be worth looking at.
However, if you are extremely bandwidth-constrained, it may be best to develop your own custom protocol. Most of these serialization frameworks trade off a little bit of space for ease of use (especially when it comes to dealing with version skew), but in your case it may be more important to focus solely on size. A custom message format that just contains some bits should be easy enough to maintain and can be packed as tightly as you want.
(Disclosure: I am the author of Cap'n Proto, as well as most of Google's open source Protobuf code.)
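The integer-bitfield idea from the question and the answer could look roughly like this in C. This is only a sketch: `flags_pack`, `flags_unpack` and `FLAG_COUNT` are hypothetical names, and the mapping of flag i to bit i is an assumption that both ends would have to share. The resulting integer would then go into a single uint32 field of the message.

```c
#include <stdbool.h>
#include <stdint.h>

enum { FLAG_COUNT = 15 };  /* number of booleans from the question */

/* Pack flags[0..14] into bits 0..14 of a 16-bit integer. */
static uint16_t flags_pack(const bool flags[FLAG_COUNT])
{
    uint16_t bits = 0;
    for (int i = 0; i < FLAG_COUNT; i++) {
        if (flags[i]) {
            bits |= (uint16_t)(1u << i);
        }
    }
    return bits;
}

/* Expand bits 0..14 back into one bool per flag. */
static void flags_unpack(uint16_t bits, bool flags[FLAG_COUNT])
{
    for (int i = 0; i < FLAG_COUNT; i++) {
        flags[i] = ((bits >> i) & 1u) != 0;
    }
}
```

Since Protobuf varints carry 7 payload bits per byte, a 15-bit value needs at most three bytes plus the field tag, compared with the 30 bytes described in the question.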

Is it faster to access a byte than a bit? Why?

The question is very straightforward: is it faster to access a byte than a bit? If I store 8 booleans in a byte, will it be slower when I have to compare them than if I used 8 bytes? Why?
Chances are no. The smallest addressable unit of memory in most machines today is a byte. In most cases, you can't address or access by bit.
In fact, accessing a specific bit might be even more expensive because you have to build a mask and use some logic.
EDIT:
Your question mentions "compare"; I'm not sure exactly what you mean by that. But in some cases, you can perform logic very efficiently on multiple booleans using bitwise operators if your booleans are densely packed into larger integer types.
As for which to use, an array of bytes (with one boolean per byte) or a densely packed structure with one boolean per bit: it is a space-efficiency trade-off. For some applications that need to store a massive number of bools, dense packing is better since it saves memory.
The underlying hardware that your code runs on is built to access bytes (or longer words) from memory. To read a bit, you have to read the entire byte, and then mask off the bits you don't care about, and possibly also shift to get the bit into the ones position. So the instructions to access a bit are a superset of the instructions to access a byte.
It may be faster to store the data as bits for a different reason - if you need to traverse and access many 8-bit sets of flags in a row. You will perform more ops per boolean flag, but you will traverse less memory by having it packed in fewer bytes. You will also be able to test multiple flags in a single operation, although you may be able to do this with bools to some extent as well, as long as they lie within a single machine word.
The memory latency penalty is far higher than register bit twiddling. In the end, only profiling the code on the hardware on which it will actually run will tell you which way is best.
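The difference between the two access patterns can be sketched in C as follows. The helper names (`get_bool`, `get_bit`, `all_set`) are made up for illustration, and the bit numbering (bit i lives in byte i / 8 at position i % 8) is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One boolean per byte: a single load. */
static bool get_bool(const uint8_t *bools, size_t i)
{
    return bools[i] != 0;
}

/* One boolean per bit: load the containing byte, shift, then mask. */
static bool get_bit(const uint8_t *bits, size_t i)
{
    return ((bits[i / 8] >> (i % 8)) & 1u) != 0;
}

/* Test several flags of one packed byte in a single comparison. */
static bool all_set(uint8_t flags, uint8_t mask)
{
    return (flags & mask) == mask;
}
```

`get_bit` does strictly more work per lookup than `get_bool`, but an array read through `get_bit` touches one eighth of the memory, which is where the packed layout may win once memory traffic dominates. As the answer says, only profiling on the target hardware settles it.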
From a hardware point of view, I would say that in general all the bit masking and other operations might, in the best case, occur within a single clock cycle (resulting in no difference), but that depends entirely on a hardware layer whose specifics you will likely never know, so you cannot bank on it.
It's worth pointing out that things like the .NET System.Collections.BitArray use a 32-bit integer array underneath to store their bit data. There is likely a performance reason behind this implementation (even if only in the general case that 32-bit words perform above average); reading up on its inner workings might be revealing.
From a coding point of view, it really depends what you're going to do with the bits afterwards. That is to say if you're going to store your data in booleans such as:
bool a0, a1, a2, a3, a4, a5, a6, a7;
And then in your code you compare them one by one (and most of them together):
if ( a0 && a1 && !a2 && a3 && !a4 && (!a5 || a6) || a7) {
...
}
Then you will find that it will be faster (and likely neater in code) to use a bit mask. But really the only time this would matter is if you're going to be running this code millions of times in a high performance or time critical environment.
I guess what I'm getting at here is that you should do whatever your coding standards say (and if you don't have any or they don't consider such details then just do what looks neatest for your application and need).
But I highly suggest looking around and reading a blog post or two explaining the inner workings of the .NET System.Collections.BitArray.
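For the condition shown above, a bit-mask version might look like the sketch below. It assumes a0 is stored in bit 0, a1 in bit 1, and so on up to a7 in bit 7; the function names are hypothetical. `check_bools` spells out the original expression so the two can be compared.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: a0 is bit 0, a1 is bit 1, ... a7 is bit 7. */

/* The original condition, evaluated on eight separate booleans. */
static bool check_bools(uint8_t flags)
{
    bool a0 = flags & 0x01, a1 = flags & 0x02, a2 = flags & 0x04, a3 = flags & 0x08;
    bool a4 = flags & 0x10, a5 = flags & 0x20, a6 = flags & 0x40, a7 = flags & 0x80;
    return (a0 && a1 && !a2 && a3 && !a4 && (!a5 || a6)) || a7;
}

/* The same condition expressed with masks on the packed byte. */
static bool check_mask(uint8_t flags)
{
    /* a0 && a1 && !a2 && a3 && !a4: bits 0..4 must equal 0b01011. */
    bool low = (flags & 0x1F) == 0x0B;
    /* !a5 || a6 fails only when a5 is set and a6 is clear. */
    bool mid = (flags & 0x60) != 0x20;
    /* a7 on its own makes the whole expression true. */
    bool a7 = (flags & 0x80) != 0;
    return (low && mid) || a7;
}
```

The first five tests collapse into a single mask-and-compare, which is the kind of saving the answer is referring to. Whether it is measurably faster would still need profiling.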
This depends on the processor and the width of the data bus: for example, a machine with a 32-bit data bus will compare your data faster if you collect it into words rather than individual bools or bytes.
That only really matters when you are writing assembly language and can count how many cycles each instruction takes; since you are using a compiler, the result is almost the same either way.
However, collecting booleans into words or integers will be useful in saving memory required for variables.
Computers tend to access things in words. Accessing a bit is slower because it requires more effort:
Imagine I said something to you, then said "oh change my second word to instead".
Now imagine my edit instead was "oh, change the third letter in the second word to 's'".
Which requires more thinking on your part?

Mapping Untyped Lisp data into a typed binary format for use in compiled functions

Background: I'm writing a toy Lisp (Scheme) interpreter in Haskell. I'm at the point where I would like to be able to compile code using LLVM. I've spent a couple days dreaming up various ways of feeding untyped Lisp values into compiled functions that expect to know the format of the data coming at them. It occurs to me that I am not the first person to need to solve this problem.
Question: What are some historically successful ways of mapping untyped data into an efficient binary format?
Addendum: In point of fact, I do know which of about a dozen different types the data is, I just don't know which one might be sent to the function at compile time. The function itself needs a way to determine what it got.
Do you mean, "I just don't know which [type] might be sent to the function at runtime"? It's not that the data isn't typed; certainly 1 and '() have different types. Rather, the data is not statically typed, i.e., it's not known at compile time what the type of a given variable will be. This is called dynamic typing.
You're right that you're not the first person to need to solve this problem. The canonical solution is to tag each runtime value with its type. For example, if you have a dozen types, number them like so:
0 = integer
1 = cons pair
2 = vector
etc.
Once you've done this, reserve the first four bits of each word for the tag. Then, every time two objects get passed in to +, first you perform a simple bit mask to verify that both objects' first four bits are 0b0000, i.e., that they are both integers. If they are not, you jump to an error message; otherwise, you proceed with the addition, and make sure that the result is also tagged accordingly.
This technique essentially makes each runtime value a manually-tagged union, which should be familiar to you if you've used C. In fact, it's also just like a Haskell data type, except that in Haskell the taggedness is much more abstract.
I'm guessing that you're familiar with pointers if you're trying to write a Scheme compiler. To avoid limiting your usable memory space, it may be more sensible to use the bottom (least significant) four bits rather than the top ones. Better yet, because pointers aligned to 8 bytes already have three bits at the bottom that are always zero, you can simply co-opt those bits for your tag, as long as you dereference the actual address rather than the tagged one.
Does that help?
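As a rough illustration of the low-bit tagging described above, here is a sketch in C. All names (`value`, `tag_of`, `make_int`, `add_values`, and so on) are invented for this example. It assumes a 3-bit tag and heap objects aligned to 8 bytes, and it reports a type error by returning false instead of jumping to an error handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uintptr_t value;   /* one tagged machine word */

enum { TAG_BITS = 3, TAG_MASK = 7 };
enum tag { TAG_INT = 0, TAG_PAIR = 1, TAG_VECTOR = 2 };

struct pair { value car, cdr; };

static enum tag tag_of(value v) { return (enum tag)(v & TAG_MASK); }

/* Integers keep their payload in the bits above the tag. */
static value make_int(intptr_t n)
{
    return ((uintptr_t)n << TAG_BITS) | TAG_INT;
}

/* The tag of an integer is 0, so the word is a multiple of 8 and the
 * division is exact (it also avoids right-shifting a negative number). */
static intptr_t int_of(value v)
{
    return (intptr_t)v / (1 << TAG_BITS);
}

/* Pointers must be 8-byte aligned so the low three bits are free. */
static value make_pair_value(struct pair *p)
{
    assert(((uintptr_t)p & TAG_MASK) == 0);
    return (uintptr_t)p | TAG_PAIR;
}

/* Strip the tag before dereferencing. */
static struct pair *pair_of(value v)
{
    return (struct pair *)(v & ~(uintptr_t)TAG_MASK);
}

/* The + primitive: check both tags first, then add and re-tag. */
static bool add_values(value a, value b, value *out)
{
    if (tag_of(a) != TAG_INT || tag_of(b) != TAG_INT) {
        return false;   /* a real implementation would signal a type error */
    }
    *out = make_int(int_of(a) + int_of(b));
    return true;
}
```

The cost of this scheme is that integers lose three bits of range; types that do not fit in a word (pairs, vectors, strings) live on the heap and are reached through the tagged pointer.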
Your default solution should be a simple tagged union. If you want to narrow your typing down to more specific types, you can do it - but it won't be that "toy" any more. A thing to look at is called abstract interpretation.
There are few successful implementations of such an optimisation, with V8 being probably the most widespread. In the Scheme world, the most aggressively optimising implementation is Stalin.

Size reduction for enum storage in Fujitsu Softune

The Fujitsu microcontroller I am using is 32-bit.
Hence enum storage is also 32-bit. But the enums in my project never need more than 256 values.
Are there any compiler options to reduce the storage size of enums?
You could use a bit field to be able to store 256 unique values in 8 words (256 bits / 32 bit words = 8), but then the compiler will no longer be able to enforce that only a single bit is set at a time. But, you could easily write a wrapper function to clear out all the previous bits before setting one. It would probably end up kind of messy, but that's what tends to happen when you start using these kinds of tricks at this level to save memory.
You could use preprocessor macros (#define) to map symbolic names to values. Without knowing what your application is, it's hard to predict if this is sensible :)
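A minimal sketch of the #define approach, assuming the values fit in one byte. The state names, `state_t` and `struct device` are placeholders, since the question does not show the actual enum.

```c
#include <stdint.h>

/* Hypothetical states; these stand in for the enum's members. */
#define STATE_IDLE    0u
#define STATE_RUNNING 1u
#define STATE_ERROR   2u

/* One byte of storage instead of a 32-bit enum. */
typedef uint8_t state_t;

struct device {
    state_t state;
    state_t previous;
};

/* Remember the old state, then switch to the new one. */
static void device_set_state(struct device *d, state_t next)
{
    d->previous = d->state;
    d->state = next;
}
```

The trade-off is that the compiler no longer knows which values belong together, so it cannot warn about an unhandled case in a switch the way it can with a real enum.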
