Determining whether a C struct is packed or not - gcc

I'm extracting C struct layouts from an executable using gdb-python.
I managed to get all the fields, offsets, types and sizes.
Still, when trying to regenerate the struct's code, I have no indication of whether it was marked with GCC's __attribute__((__packed__)).
Is there any way to get this information from the executable? (preferably using gdb-python, but any other way will do too)

Is there any way to get this information from the executable?
No, but you should be able to deduce this with a simple heuristic:
If sizeof(struct foo) is greater than the sum of its member field sizes, the struct contains padding and is therefore not packed.
If sizeof(struct foo) is equal to the sum of its member field sizes, the struct is either packed, or its members are naturally aligned with no holes, and packing doesn't matter for it.
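The heuristic above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the struct names, the enum and guess_layout are hypothetical, __attribute__((packed)) is GCC/Clang-specific, and bit-fields and unions are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example structs with identical members. */
struct plain_example {
    uint8_t  tag;
    uint32_t value;
};

struct packed_example {
    uint8_t  tag;
    uint32_t value;
} __attribute__((packed));      /* GCC/Clang extension */

/* Hypothetical result type for the heuristic. */
enum layout_guess {
    LAYOUT_NOT_PACKED,          /* size > sum of members: padding is present */
    LAYOUT_PACKED_OR_NO_HOLES,  /* size == sum: packed, or naturally hole-free */
    LAYOUT_INCONSISTENT         /* size < sum: the input data does not add up */
};

/* Compare the struct size with the sum of its member sizes. */
enum layout_guess guess_layout(size_t struct_size,
                               const size_t *member_sizes,
                               size_t member_count)
{
    assert(member_sizes != NULL || member_count == 0);

    size_t sum = 0;
    for (size_t i = 0; i < member_count; i++)
        sum += member_sizes[i];

    if (struct_size > sum)
        return LAYOUT_NOT_PACKED;
    if (struct_size == sum)
        return LAYOUT_PACKED_OR_NO_HOLES;
    return LAYOUT_INCONSISTENT;
}
```

Calling guess_layout with sizeof(struct packed_example) and the member sizes { 1, 4 } yields LAYOUT_PACKED_OR_NO_HOLES. With a size of 8, which is what struct plain_example typically has on common platforms, it yields LAYOUT_NOT_PACKED. In the gdb-python setting, the sizes would come from the extracted type information instead of sizeof.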

Related

Convention for modifying maps in go

In Go, is the convention to modify maps by reassigning values, or by using pointer values?
type Foo struct {
	Bar int
}
Reassignment:
foos := map[string]Foo{"a": Foo{1}}
v := foos["a"]
v.Bar = 2
foos["a"] = v
vs Pointers
foos := map[string]*Foo{"a": &Foo{1}}
foos["a"].Bar = 2
You may be (inadvertently) conflating the matters here.
The reason to store pointers in a map is not to make "dot-field" modifications work—it is rather to preserve the exact placements of the values "kept" by a map.
One of the crucial properties of Go maps is that the values bound to their keys are not addressable. In other words, you cannot legally do something like
m := map[string]int{"foo": 42}
p := &m["foo"] // this won't compile
The reason is that particular implementations of the Go language¹ are free to implement maps in a way which allows them to move around the values they hold. This is needed because maps are typically implemented as hash tables, and a hash table may have to relocate its entries when it grows after new entries are added.
Hence if the language specification were to allow taking an address of a value kept in a map, that would forbid the map to move its values around.
This is precisely the reason why you cannot do "in place" modification of map values if they have struct types, and you have to replace them "wholesale".
By extension, when you add an element to a map, the value is copied into the map, and it is also copied (moved) when the map shuffles its entries around.
Hence, the chief reason to store pointers into a map is to preserve "identities" of the values to be "indexed" by a map—having them exist in only a single place in memory—and/or to prevent excessive memory operations.
Some types cannot even be sensibly copied without introducing a bug—sync.Mutex or a struct type containing one is a good example.
Getting back to your question, using pointers with the map for the purpose you propose might be a nice hack, but be aware that this is a code smell: when deciding on values vs pointers regarding a map, you should be rather concerned with the considerations outlined above.
¹ There are at least two of them which are actively maintained: the "stock" one, dubbed "gc", and a part of GCC.

How to check two structs for equality

I have two instances of this struct with references inside (as properties):
type ST struct {
	some      *float64
	createdAt *time.Time
}
How can I perform an equality check on two different instances of this struct? Is it only possible by using reflect?
While you could use reflection, as Corey Ogburn suggested, I would not do so for a simple struct like that. Per the official Go Blog, reflection is
a powerful tool that should be used with care and avoided unless strictly necessary
-- The Laws of Reflection
It should be a simple exercise for you to write a function that takes two pointers to values of your struct type and returns a boolean true/false as to whether they are equal, first by testing for nil pointers and then by testing for equality of each of the fields of the struct.
time.Time values already have an equality test method with signature
func (t Time) Equal(u Time) bool
Depending on your use cases, the bigger problem may be comparing two floating point values for equality. While == comparisons work on float64 values, for many applications you want two float values to be considered equal when they are close, as well as when they are exactly the same. If that is the case for your application, I recommend defining an equal function that accepts a precision and verifies that the difference between the two values is not greater than the precision. To learn more, research floating point representations of decimal values.
Note that time package documentation has this to say about using pointers:
Programs using times should typically store and pass them as values, not pointers. That is, time variables and struct fields should be of type time.Time, not *time.Time.
So you should probably change the type of createdAt in your struct.
You can use reflect.DeepEqual.
DeepEqual reports whether x and y are “deeply equal,” defined as follows. Two values of identical type are deeply equal if one of the following cases applies. Values of distinct types are never deeply equal.
The documentation then goes on to describe how arrays, structs, functions, pointers and other types are considered to be deeply equal.

Enums in computer memory

A quote from Wikipedia's article on enumerated types would be the best opening for this question:
In other words, an enumerated type has values that are different from each other, and that can be compared and assigned, but which are not specified by the programmer as having any particular concrete representation in the computer's memory; compilers and interpreters can represent them arbitrarily.
While I understand the definition and uses of enums, I can't yet grasp the interaction between enums and memory — when an enum type is declared without creating an instance of enum type variable, is the type definition stored in memory as a union or a structure? And what is the meaning behind the aforementioned Wiki excerpt?
The Wikipedia excerpt isn't talking specifically about C's enum types. The C standard has some specific requirements for how enums work.
An enumerated type is compatible with either char or some signed or unsigned integer type. The choice of representation is up to the compiler, which must document its choice (it's implementation-defined), but the type must be able to represent all the values of the enumeration.
The values of the enumeration constants start at 0 by default, and increment by 1 for each successive constant:
enum foo {
    zero, // equal to 0
    one,  // equal to 1
    two   // equal to 2
};
The constants are always of type int, regardless of what the enum type itself is compatible with. (It would have made more sense for the constants to be of the enumerated type; they're of type int for historical reasons.)
You can specify values for some or all of the constants -- which means that the values are not necessarily distinct:
enum bar {
    two = 2,
    deux = 2,
    zwei = 2,
    one = 1,
    dos // implicitly equal to 2
};
Defining an enumerated type doesn't result in anything being stored in memory at run time. If you define an object of the enumerated type, that object's value will be stored in memory (unless it's optimized away), and will occupy sizeof (enum whatever) bytes. It's the same as for objects of any other type.
An enumeration constant is treated as a constant expression. The expression two is treated almost identically to a constant 2.
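As a small sketch of these points, the following reuses the enum bar from above. The helper functions bar_object_size and bar_name are hypothetical names introduced only for illustration, and _Static_assert assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

enum bar {
    two = 2,
    deux = 2,
    zwei = 2,
    one = 1,
    dos /* implicitly equal to 2 */
};

/* Enumeration constants are integer constant expressions of type int,
   so they can be checked at compile time. */
_Static_assert(dos == 2, "dos follows one, so it is 2");
_Static_assert(sizeof(two) == sizeof(int), "constants have type int");

/* Only an object of the enumerated type occupies storage at run time. */
size_t bar_object_size(void)
{
    enum bar b = dos;
    return sizeof b;    /* same as sizeof (enum bar) */
}

/* Constants can be used as case labels. Duplicate values such as
   two, deux, zwei and dos must share a single label. */
const char *bar_name(enum bar b)
{
    assert(b == one || b == two);
    switch (b) {
    case one: return "one";
    case two: return "two";
    }
    return "unknown";
}
```

Here bar_name(deux) returns "two", because deux and two are the same value. The value of sizeof (enum bar) itself depends on the compatible integer type the compiler chose.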
Note that C++ has some different rules for enum types. Your question is tagged C, so I won't go into details.
It means that the enum constants are not required to be located in memory. You cannot take the addresses of them.
This allows the compiler to replace all references to enum constants with their actual values. For example, the code:
enum { x = 123 };
int y = x;
may compile as if it were:
int y = 123;
When an enum type is declared without creating an instance of enum type variable, is the type definition stored in memory as a union or a structure?
In C, types are mostly compile-time constructs; once the program has been compiled to machine code, all the type information disappears*. Accessing a struct member is instead "access the memory n bytes past this pointer".
So if the compiler inlines all the enums as shown above, then enums do not exist at all in compiled code.
* Except optionally in the debugging info section, but that's usually only read by debuggers.

What is a good way to deal with byte alignment and endianness when packing a struct?

My current design involves communication between an embedded system and a PC, and the struct design keeps bothering me.
The two systems have different endianness that I need to deal with. However, I find that I cannot just do a simple byte-order switch for every 4 bytes to solve the problem. It turns out to depend on the struct.
For example, a struct like this:
struct {
    uint16_t a;
    uint32_t b;
};
would result in padding between a and b. Eventually, the endian switch has to be specific to a and b because of the existence of the padding bytes. But it looks ugly because I need to change the endian switch logic every time I change the struct content.
What is a good strategy to arrange elements in a struct when padding comes in? Should we try to rearrange the elements so that there are only padding bytes at the end of the struct?
Thanks.
I'm afraid you'll need to do some more platform-neutral serialization, since different architectures have different alignment requirements. I don't think there is a safe and generic way to grab a chunk of memory, send it to another architecture, place it at some address there and read the correct data from it. Just convert and send the elements one by one: you can push the values into a buffer that will not have any padding, so you'll know exactly what is where. You also get to decide which side does the conversions (typically the PC has more resources for that). As a bonus, you can checksum or sign the communication to catch errors or tampering.
BTW, afaik while the compiler keeps the order of the variables intact, it theoretically can put some additional padding between them (e.g. for performance reasons), so it's not just an architecture related thing.
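A minimal sketch of such element-by-element serialization for the struct from the question might look like this. The names struct sample, SAMPLE_WIRE_SIZE and the helper functions are hypothetical, and big-endian wire order is an arbitrary assumption; any fixed order works as long as both sides agree on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the struct from the question. */
struct sample {
    uint16_t a;
    uint32_t b;
};

/* Wire size: 2 + 4 bytes, with no padding. */
enum { SAMPLE_WIRE_SIZE = 6 };

/* Write values byte by byte, most significant byte first. Shifts work
   on values, so the result is independent of host endianness. */
static void put_u16_be(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
}

static void put_u32_be(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)((v >> 16) & 0xFF);
    p[2] = (uint8_t)((v >> 8) & 0xFF);
    p[3] = (uint8_t)(v & 0xFF);
}

static uint16_t get_u16_be(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

static uint32_t get_u32_be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Serialize field by field; struct padding never reaches the wire. */
void sample_serialize(const struct sample *s, uint8_t out[SAMPLE_WIRE_SIZE])
{
    assert(s != NULL && out != NULL);
    put_u16_be(out, s->a);
    put_u32_be(out + 2, s->b);
}

/* Rebuild the struct from the wire format on the receiving side. */
void sample_deserialize(struct sample *s, const uint8_t in[SAMPLE_WIRE_SIZE])
{
    assert(s != NULL && in != NULL);
    s->a = get_u16_be(in);
    s->b = get_u32_be(in + 2);
}
```

With this approach, adding or reordering struct members only requires touching the two serialize/deserialize functions, and the same code can be compiled for both the embedded target and the PC.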

MPI_Scatter redundant parameters?

My question is rather simple, the MPI_Scatter function definition is:
#include <mpi.h>
void MPI::Comm::Scatter(const void* sendbuf, int sendcount,
                        const MPI::Datatype& sendtype, void* recvbuf,
                        int recvcount, const MPI::Datatype& recvtype,
                        int root) const
Are 'sendcount' and 'sendtype' redundant?
In which case it can happen: sendcount!=recvcount?
Edit:
Maybe some clarification is needed about the question. I understand that the reason may be that for the root the data is some 'struct X' while for the receivers it is some 'struct Y', which somehow also makes sense (it all fits 'Ok').
If that's the case... I don't get why it is necessary to state again that the total size of the data expected to be received is the same as the size of the data sent. If it's just a matter of casting the view of the data, I'd only do the cast. In fact, the buffer is a (void *).
MPI allows the datatypes on the sending and on the receiving end to be different as long as they are constructed from the same basic datatypes. There are many cases where this comes in handy, e.g. scattering rows of a matrix from the root process into columns in the other processes. Sending and receiving rows is straightforward in C and C++ as the memory layout of the matrices is row-major. Sending and receiving columns requires that a special strided vector type is constructed first. Usually this type is constructed for a specified number of rows and columns and then one has to supply a count of 1 when receiving the data.
There are also many other cases when sendcount and recvcount might differ. Mind also that recvcount does not specify the size of the message to be received but rather the capacity of the receive buffer and that capacity may be way larger than the size of the message.
MPI_Scatter() breaks the message into equal pieces and hands one piece to each process, including the root itself. Knowing this:
Are 'sendcount' and 'sendtype' redundant?
-How could they be? sendcount is the number of elements sent to each process, and sendtype is the type of those elements. They carry different information.
And for the last question:
In which case it can happen: sendcount!=recvcount?.
-When you want to sort a sequence of numbers, you send blocks of size N and type int to your nodes. You want the same back, but sorted.

Resources