Optimising data structure/word alignment padding in Go

Similar to what I've learned in C++, I believe it's padding that causes the difference in size between instances of these two structs:
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

type Foo struct {
	w byte   // 1 byte
	x byte   // 1 byte
	y uint64 // 8 bytes
}

type Bar struct {
	x byte   // 1 byte
	y uint64 // 8 bytes
	w byte   // 1 byte
}

func main() {
	fmt.Println(runtime.GOARCH)
	newFoo := new(Foo)
	fmt.Println(unsafe.Sizeof(*newFoo))
	newBar := new(Bar)
	fmt.Println(unsafe.Sizeof(*newBar))
}
Output:
amd64
16
24
Is there a rule of thumb to follow when defining struct members? (like ascending/descending order of size of types)
Is there a compile time optimisation which we can pass, that can automatically take care of this?
Or shouldn't I be worried about this at all?

Currently there's no compile-time optimisation that reorders fields for you; on amd64 each field is aligned to its natural size, so a uint64 must start on an 8-byte boundary and padding is inserted before it.
You can manually arrange struct fields to use space optimally, typically by ordering them from largest to smallest. Eight consecutive byte fields, for example, occupy only 8 bytes, whereas a single byte field placed between larger fields is padded out to the next alignment boundary. Consider this: https://play.golang.org/p/0qsgpuAHHp
package main

import (
	"fmt"
	"unsafe"
)

type Compact struct {
	a, b                   uint64
	c, d, e, f, g, h, i, j byte
}

// Larger memory footprint than "Compact" - but fewer fields!
type Inefficient struct {
	a uint64
	b byte
	c uint64
	d byte
}

func main() {
	newCompact := new(Compact)
	fmt.Println(unsafe.Sizeof(*newCompact))
	newInefficient := new(Inefficient)
	fmt.Println(unsafe.Sizeof(*newInefficient))
}
If you take this into consideration, you can optimise the memory footprint of your structs.
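Applied to the structs in the question: ordering Bar's fields from largest to smallest removes the interior padding (a hedged sketch; BarPacked is a made-up name, and the size assumes amd64):

package main

import (
	"fmt"
	"unsafe"
)

// BarPacked (hypothetical name) is Bar with its fields ordered largest to
// smallest, which removes the padding that previously surrounded y.
type BarPacked struct {
	y uint64 // 8 bytes
	x byte   // 1 byte
	w byte   // 1 byte
	// 6 bytes of trailing padding keep the size a multiple of 8
}

func main() {
	fmt.Println(unsafe.Sizeof(BarPacked{})) // 16 on amd64, vs 24 for Bar
}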

Or shouldn't I be worried about this at all?
Yes you should.
This is also called mechanical sympathy (see this Go Time podcast episode), so it also depends on the hardware architecture you are compiling for.
See as illustration:
"The day byte alignment came back to bite me" (January 2014)
"On the memory alignment of Go slice values" (July 2016)
The values in Go slices are 16-byte aligned. They are not 32 byte aligned.
Go pointers are byte-aligned.

It depends on the type of application you are developing and on how those structures are used. If the application needs to meet memory or performance criteria, you definitely should care about memory alignment and padding, but not only that: there is a nice article https://www.usenix.org/legacy/publications/library/proceedings/als00/2000papers/papers/full_papers/sears/sears_html/index.html on optimal CPU cache usage and the correlation between struct layout and performance. It highlights cache-line alignment, false sharing, etc.
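For illustration, here is a minimal sketch of one of those techniques: cache-line padding to avoid false sharing between two counters updated by different goroutines. It assumes 64-byte cache lines, which is typical on amd64, and the names are purely illustrative:

package main

import (
	"fmt"
	"sync"
)

// paddedCounter keeps each hot counter on its own cache line, so two
// goroutines writing to different counters don't repeatedly invalidate
// each other's line (false sharing).
type paddedCounter struct {
	n uint64
	_ [56]byte // 8 bytes of data + 56 bytes of padding = 64 bytes
}

type counters struct {
	a paddedCounter // written only by the first goroutine
	b paddedCounter // written only by the second goroutine
}

func main() {
	var c counters
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < 1_000_000; i++ {
			c.a.n++
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < 1_000_000; i++ {
			c.b.n++
		}
	}()
	wg.Wait()
	fmt.Println(c.a.n, c.b.n)
}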
Also there is a nice golang tool https://github.com/1pkg/gopium that helps to automate those optimizations, check it out!

Some guidelines
To minimize the number of padding bytes, we must lay out the fields from the largest allocation to the smallest.
One exception is a zero-size field, such as an empty structure.
As we know, the size of empty below is zero:
type empty struct {
	a struct{}
}
Following the common rule above, we might arrange the fields of a structure as below:
type E struct {
	a int64
	b int64
	c struct{}
}
However, the size of E is 24: a zero-size field at the end of a struct gets padding, so that a pointer to it cannot point past the struct's allocation.
When the fields are arranged as
type D struct {
	b struct{}
	a int64
	c int64
}
the size of D is 16; see https://go.dev/play/p/ID_hN1zwIwJ.
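A minimal runnable check of the two layouts (the same comparison as the playground link):

package main

import (
	"fmt"
	"unsafe"
)

type E struct {
	a int64
	b int64
	c struct{}
}

type D struct {
	b struct{}
	a int64
	c int64
}

func main() {
	// The trailing zero-size field in E is padded so that a pointer to it
	// cannot point past the struct's allocation; D avoids that padding.
	fmt.Println(unsafe.Sizeof(E{})) // 24
	fmt.Println(unsafe.Sizeof(D{})) // 16
}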
IMO, it is better to use tools that help us automate structure alignment optimizations:
aligncheck — https://gitlab.com/opennota/check
maligned — https://github.com/mdempsky/maligned; the original maligned is deprecated, use https://pkg.go.dev/golang.org/x/tools/go/analysis/passes/fieldalignment instead
golangci-lint
you just need to enable ‘maligned’ in the ‘golangci-lint’ settings.
Example, from the configuration file .golangci.example.yml:
linters-settings:
  maligned:
    # print struct with more effective memory layout or not, false by default
    suggest-new: true

Related

Is it safe to directly convert a struct pointer to another struct using unsafe.Pointer?

Is it safe?
(*TeamData)(unsafe.Pointer(&team.Id))
Example code:
func testTrans() []*TeamData {
	teams := createTeams()
	teamDatas := make([]*TeamData, 0, len(teams))
	for _, team := range teams {
		// is this safe?
		teamDatas = append(teamDatas, (*TeamData)(unsafe.Pointer(&team.Id)))
	}
	return teamDatas
}
// ??
teams := testTrans()
Will the members of the teams := testTrans() array be garbage collected?
There are many structs with many fields returned through gRPC, and their definitions are the same as the local definitions, so I want to use this more efficient approach ((*TeamData)(unsafe.Pointer(&team.Id))), but I don't know whether there are any risks.
Full Example:
https://go.dev/play/p/q3gwp2mERvj
The documentation for unsafe.Pointer describes supported uses. In particular:
(1) Conversion of a *T1 to Pointer to *T2.
Provided that T2 is no larger than T1 and that the two share an
equivalent memory layout, this conversion allows reinterpreting data
of one type as data of another type.
Go's garbage collector recognises interior pointers and will not collect the original allocation until there are no remaining references to that block.
Hence the larger allocation (GrpcRetTeam in your example) will be pinned while references to the *TeamData exist.
Another critical consideration is the alignment of the struct fields. Eg:
type Parent struct {
	A uint8
	B uint8
	// 6 bytes of padding to align C.
	C uint64
}

type Bad struct {
	B uint8
	// 7 bytes of padding to align C.
	C uint64
}
In this case it would be invalid to use unsafe to extract Bad from Parent since the memory layout is different.
In most cases it's typically better to avoid unsafe.Pointer tricks unless required to meet functionality or performance requirements. It's often possible to refactor code to minimise allocations instead.
If you must use unsafe to meet performance requirements, I would recommend implementing a test using the reflect package to ensure the memory alignment/layout is valid for the child struct.
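A hedged sketch of such a test, comparing struct size and per-field offsets/sizes via reflect (the helper and test names are mine, and it reuses the Parent/Bad example above; the real check for the question's prefix conversion would compare only the overlapping fields, starting at Id):

package layoutcheck

import (
	"reflect"
	"testing"
)

type Parent struct {
	A uint8
	B uint8
	C uint64
}

type Bad struct {
	B uint8
	C uint64
}

// sameLayout reports whether two struct types have the same size and the
// same field offsets and field sizes, field by field.
func sameLayout(t1, t2 reflect.Type) bool {
	if t1.Kind() != reflect.Struct || t2.Kind() != reflect.Struct {
		return false
	}
	if t1.Size() != t2.Size() || t1.NumField() != t2.NumField() {
		return false
	}
	for i := 0; i < t1.NumField(); i++ {
		f1, f2 := t1.Field(i), t2.Field(i)
		if f1.Offset != f2.Offset || f1.Type.Size() != f2.Type.Size() {
			return false
		}
	}
	return true
}

func TestLayouts(t *testing.T) {
	if sameLayout(reflect.TypeOf(Parent{}), reflect.TypeOf(Bad{})) {
		t.Fatal("Parent and Bad must not be treated as layout-compatible")
	}
}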

Casting []uint32 to []byte without copying in golang

I'm working on a processor simulator in Go (for educational purposes). I need a type for a memory unit used for addressing. It may contain either a slice of memory (the memory type is []byte) or one or several registers (they have type []uint32), and it must be readable and writable. So, is there a way to convert []uint32 to []byte? I know there's an unsafe package, but I'm not sure how exactly to do this conversion. In other words, I need something like reinterpret_cast in C++.
I know the memory unit could be an interface with different implementations for memory, a single register, and several registers, but that's not as efficient. Making a register a byte slice also decreases performance.
The unsafe.Slice function makes it more convenient to convert an arbitrary pointer to a slice of any type. You could even make a generic version to cast a slice of any type to bytes if you were so inclined:
func castToBytes[T any](s []T) []byte {
	if len(s) == 0 {
		return nil
	}
	size := unsafe.Sizeof(s[0])
	return unsafe.Slice((*byte)(unsafe.Pointer(&s[0])), int(size)*len(s))
}
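For the asker's register case, a hedged usage sketch (it assumes Go 1.18+ for generics and Go 1.17+ for unsafe.Slice, and reuses the castToBytes helper above; the variable names are illustrative):

package main

import (
	"fmt"
	"unsafe"
)

// castToBytes reinterprets the backing array of s as a byte slice,
// without copying (same helper as above).
func castToBytes[T any](s []T) []byte {
	if len(s) == 0 {
		return nil
	}
	size := unsafe.Sizeof(s[0])
	return unsafe.Slice((*byte)(unsafe.Pointer(&s[0])), int(size)*len(s))
}

func main() {
	regs := []uint32{0x01020304, 0x0A0B0C0D}
	mem := castToBytes(regs)

	fmt.Println(len(mem)) // 8: both slices share the same backing memory
	mem[0] = 0xFF         // writes through to regs[0] (byte order depends on endianness)
	fmt.Printf("%08X\n", regs[0])
}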

Can I include formulas in a struct somehow?

I am trying to create a struct which uses a formula to automatically create data in one of the struct fields when the other two values are entered.
For example, I want to create a 2D rectangular room with Length and Width which are values that are entered. I would then like to include the formula Area = Length * Width in the struct.
I have tried and just get a syntax error:
syntax error: unexpected =, expecting semicolon or newline or }
// CURRENT CODE
type room struct {
	L int
	W int
	A int
}

// WOULD LIKE IT TO BE
type room struct {
	L int
	W int
	A int = room.L * room.W
}
Since A is invariant, this would be a good fit for a function, not a field.
type room struct {
	L int
	W int
}

func (r *room) area() int {
	return r.L * r.W
}
If you would like to keep A as a field, you can optionally perform the computation in a constructor.
type room struct {
	L int
	W int
	A int
}

func newRoom(length, width int) room {
	return room{
		L: length,
		W: width,
		A: length * width,
	}
}
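A brief usage sketch of the constructor approach (assumes the room type and newRoom above, plus a fmt import):

func main() {
	r := newRoom(3, 4)
	fmt.Println(r.A) // 12
}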
If you think about what you're after, you'll see that basically your desire to "not add unnecessary code" is really about not writing any code by hand, rather than not executing any code: sure, if the type definition
type room struct {
	L int
	W int
	A int = room.L * room.W
}
could be possible in Go, that would mean the Go compiler would have to make arrangements so that any code like this
var r room
r.L = 42
is compiled in a way to implicitly mutate r.A.
In other words, the compiler must make sure that any modification of either the L or W field of any variable of type room in a program would also perform a calculation and update the field A of each such variable.
This poses several problems:
What if your formula is trickier—like, say, A int = room.L/room.W?
First, given the usual Go rules for zero values of type int,
an innocent declaration var r room would immediately crash the program because of the integer division by zero performed by the code inserted by the compiler to enforce the invariant being discussed.
Second, even if we would invent a questionable rule of not calculating a formula on mere declarations (which, in Go, are also initializations), the problem would remain: what would happen in the following scenario?
var r room
r.L = 42
As you can see, even if the compiler would not make the program crash on the first line, it would have to arrange for that on the second.
Sure, we could add another questionable rule to sidestep the problem: either somehow "mark" each field as "explicitly set" or require the user to provide an explicit "constructor" for such types "armed" with a "formula".
Either solution stinks in its own way: tracking field writes incurs performance costs (some fields now have a hidden flag which takes up space, and each access of such fields spends extra CPU cycles), and having constructors goes against one of the cornerstone principles of the Go design: to have as little magic as possible.
The formula creates a hidden write.
This may not be obvious until you start writing "harder-core" Go programs for tasks it shines at (highly concurrent code with lots of simultaneously working goroutines), but when you do, you're forced to think about shared state, the ways it's mutated and, consequently, the ways such mutations are synchronized to keep the program correct.
So, let's suppose we protect access to either W or L with a mutex; how would the compiler make sure the mutation of A is also protected, given that mutex operations are explicit (that is, a programmer explicitly codes locking/unlocking operations)?
(A problem somewhat related to the previous one.)
What if "the formula" does "interesting things"—such as accessing/mutating external state?
This could be anything from accessing global variables to querying databases to working with a filesystems to exchanges over IPC or via networking protocols.
And this all could look very innocent, like A int = room.L * room.W * getCoefficient(), where all the nifty details are hidden in that getCoefficient() call.
Sure, we could again work around this by imposing an arbitrary limit on the compiler to only allow explicit access to the fields of the same enclosing type and only allow them to participate in simple expressions with no function calls, or with some "whitelisted" subset of them such as math.Abs or whatever.
This clearly reduces the usefulness of the feature while greatly complicating the language.
What if "the formula" has non-linear complexity?
Suppose the formula is O(N³) with respect to the value of W.
Then setting W on a value to 0 would be processed almost instantly, but setting it to 10000 would slow the program down quite noticeably, and both of these outcomes would result from seemingly not-too-different statements: r.W = 0 vs r.W = 10000.
This, again, goes against the principle of having as little magic as possible.
Why would we only allow such things on struct types and not on arbitrary variables, provided they are all in the same lexical scope?
This looks like another arbitrary restriction.
And another, probably the most obvious, problem is what should happen when the programmer writes
var r room
r.L = 2 // r.A is now 2×0=0
r.W = 5 // r.A is now 2×5=10
r.A = 42 // The invariant r.A = r.L×r.W is now broken
?
Now you can see that all the problems above may be solved by merely coding what you need, say, with the following approach:
// use "unexported" fields
type room struct {
	l int
	w int
	a int
}

func (r *room) SetL(v int) {
	r.l = v
	r.updateArea()
}

func (r *room) SetW(v int) {
	r.w = v
	r.updateArea()
}

func (r *room) GetA() int {
	return r.a
}

func (r *room) updateArea() {
	r.a = r.l * r.w
}
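A brief usage sketch of this setter-based approach (assumes the room type and methods above, plus a fmt import):

func main() {
	var r room
	r.SetL(2)
	r.SetW(5)
	fmt.Println(r.GetA()) // 10
}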
With this approach, you may be crystal-clear about all the issues above.
Remember that programs are written for humans to read and only then for machines to execute; it's paramount for proper software engineering to keep the code as free as possible of magic and of intricate hidden dependencies between its various parts. Please remember that
Software engineering is what happens to programming
when you add time and other programmers.
© Russ Cox
See more.

Is `String::with_capacity()` equal to `malloc`?

I read this article a few days ago and I thought what is the best way to implement such a thing in Rust. The article suggests to use a buffer instead of printing the string after each iteration.
Is it correct to say that String::with_capacity() (or Vec) is equivalent to malloc in C?
Example from the code:
String::with_capacity(size * 4096)
equal to:
char *buf = malloc(size * 4096);
It is not "equal": Rust's String is a composite object. String::with_capacity creates a String which is not only a buffer; it is a wrapper around a Vec<u8>:
pub struct String {
    vec: Vec<u8>,
}
And a Vec is not just a section in memory - it also contains a RawVec and its length:
pub struct Vec<T> {
    buf: RawVec<T>,
    len: usize,
}
And a RawVec is not a primitive either:
pub struct RawVec<T> {
    ptr: Unique<T>,
    cap: usize,
}
So when you call String::with_capacity:
pub fn with_capacity(capacity: usize) -> String {
    String { vec: Vec::with_capacity(capacity) }
}
You are doing much more than just reserving a section of memory.
That isn't quite accurate. It'd make more sense to say String::with_capacity is similar to std::string::reserve. From the documentation:
Creates a new empty String with a particular capacity.
Strings have an internal buffer to hold their data. The capacity is
the length of that buffer, and can be queried with the capacity
method. This method creates an empty String, but one with an initial
buffer that can hold capacity bytes. This is useful when you may be
appending a bunch of data to the String, reducing the number of
reallocations it needs to do.
If the given capacity is 0, no allocation will occur, and this method
is identical to the new method.
Whether or not it uses something similar to malloc for managing the internal buffer is an implementation detail.
In response to your edit:
You are explicitly allocating memory, whereas in C++ a memory allocation for std::string::reserve only occurs if the argument passed to reserve is greater than the existing capacity. Note that Rust's String does have a reserve method, but C++'s string does not have a with_capacity equivalent.
Two things:
If you link to an allocator, well, just call malloc.
The hook into the default global allocator is still unstable, but if you're on nightly, you can call it directly.
On stable Rust today, the closest thing you can get is Vec if you want to use the global allocator, but it's not equivalent for reasons spelled out in other answers.

Why does binary.Size() return -1?

The code snippet looks like this:
package main

import (
	"encoding/binary"
	"fmt"
	"reflect"
)

const (
	commandLen     = 1
	bufLen     int = 4
)

func main() {
	fmt.Printf("%v %v\n", reflect.TypeOf(commandLen), reflect.TypeOf(bufLen))
	fmt.Printf("%d %d", binary.Size(commandLen), binary.Size(bufLen))
}
And the output is:
int int
-1 -1
I think that since the types of commandLen and bufLen are int, and according to "Programming in Go" an int is either int32 or int64 depending on the implementation, binary.Size() should return a value, not -1.
Why does binary.Size() return -1?
tl;dr
int is not a fixed-length type, so it won't work. Use something that has a fixed length, for example int32.
Explanation
This might look like a bug, but it is not. The documentation of Size() says:
Size returns how many bytes Write would generate to encode the value v, which must be a
fixed-size value or a slice of fixed-size values, or a pointer to such data.
A fixed-size value is a value that is not dependent on the architecture and the size is known
beforehand. This is the case for int32 or int64 but not for int as it depends on the
environment's architecture. See the documentation of int.
If you're asking yourself why Size() enforces this, consider encoding an int on your 64-bit machine and decoding the data on a remote 32-bit machine. This is only possible with fixed-length types, which int is not. So you either have to store the size along with the data, or enforce fixed-length types, which is what the developers did.
This is reflected in the sizeof() function of encoding/binary that computes the size:
case reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64,
reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
reflect.Float32, reflect.Float64, reflect.Complex64, reflect.Complex128:
return int(t.Size()), nil
As you can see, all the fixed-size numeric kinds are listed, but not reflect.Int.
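A minimal sketch of the fix, switching to a fixed-size type:

package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	const commandLen int32 = 1 // fixed-size type

	fmt.Println(binary.Size(commandLen)) // 4
	fmt.Println(binary.Size(int(1)))     // -1: int's size depends on the architecture
}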
