I read this article a few days ago and wondered what the best way to implement such a thing in Rust would be. The article suggests using a buffer instead of printing the string after each iteration.
Is it correct to say that String::with_capacity() (or Vec) is the equivalent of malloc in C?
Example from the code:
String::with_capacity(size * 4096)
equal to:
char *buf = malloc(size * 4096);
It is not "equal": Rust's String is a composite object. String::with_capacity creates a String, which is not only a buffer; it is a wrapper around a Vec<u8>:
pub struct String {
vec: Vec<u8>,
}
And a Vec is not just a section in memory - it also contains a RawVec and its length:
pub struct Vec<T> {
buf: RawVec<T>,
len: usize,
}
And a RawVec is not a primitive either:
pub struct RawVec<T> {
ptr: Unique<T>,
cap: usize,
}
So when you call String::with_capacity:
pub fn with_capacity(capacity: usize) -> String {
String { vec: Vec::with_capacity(capacity) }
}
You are doing much more than just reserving a section of memory.
That isn't quite accurate. It'd make more sense to say String::with_capacity is similar to std::string::reserve. From the documentation:
Creates a new empty String with a particular capacity.
Strings have an internal buffer to hold their data. The capacity is
the length of that buffer, and can be queried with the capacity
method. This method creates an empty String, but one with an initial
buffer that can hold capacity bytes. This is useful when you may be
appending a bunch of data to the String, reducing the number of
reallocations it needs to do.
If the given capacity is 0, no allocation will occur, and this method
is identical to the new method.
Whether or not it uses something similar to malloc for managing the internal buffer is an implementation detail.
In response to your edit:
You are explicitly allocating memory, whereas in C++ a memory allocation for std::string::reserve only occurs if the argument passed to reserve is greater than the existing capacity. Note that Rust's String does have a reserve method, but C++'s string does not have a with_capacity equivalent.
Two things:
If you link to an allocator, you can just call malloc.
The hook into the default global allocator is still unstable, but if you're on nightly, you can call it directly.
On stable Rust today, the closest thing you can get is Vec if you want to use the global allocator, but it's not equivalent for reasons spelled out in other answers.
Is it safe?
(*TeamData)(unsafe.Pointer(&team.Id))
Example code:
func testTrans() []*TeamData {
teams := createTeams()
teamDatas := make([]*TeamData, 0, len(teams))
for _, team := range teams {
// is this safe?
teamDatas = append(teamDatas, (*TeamData)(unsafe.Pointer(&team.Id)))
}
return teamDatas
}
// ??
teams := testTrans()
Will the members of the teams := testTrans() array be garbage collected?
Many structs with many fields are returned through grpc, and their definitions are the same as the local definitions, so I want to use this more efficient conversion ((*TeamData)(unsafe.Pointer(&team.Id))), but I don't know whether there are any risks.
Full Example:
https://go.dev/play/p/q3gwp2mERvj
The documentation for unsafe.Pointer describes supported uses. In particular:
(1) Conversion of a *T1 to Pointer to *T2.
Provided that T2 is no larger than T1 and that the two share an
equivalent memory layout, this conversion allows reinterpreting data
of one type as data of another type.
Go's garbage collector recognises interior pointers and will not collect the original allocation until there are no remaining references to that block.
Hence the larger allocation (GrpcRetTeam in your example) will be kept alive while any reference to a *TeamData exists.
Another critical consideration is the alignment of the struct fields. For example:
type Parent struct {
A uint8
B uint8
// 6 bytes of padding to align C.
C uint64
}
type Bad struct {
B uint8
// 7 bytes of padding to align C.
C uint64
}
In this case it would be invalid to use unsafe to extract Bad from Parent, since the memory layouts differ.
It's typically better to avoid unsafe.Pointer tricks unless they are required to meet functionality or performance requirements; it's often possible to refactor code to minimise allocations instead.
If you must use unsafe to meet performance requirements, I would recommend implementing a test using the reflect package to ensure the memory alignment/layout is valid for the child struct.
I am writing an interpreter in Rust for a domain specific language that should allow for a high performance implementation. The relevant properties for the heap are:
Programs are short (but a lot of them are executed)
I know, before execution starts, the maximum amount of memory needed
The needed memory is small
No de-allocation is needed until the program terminates
All the data is immutable once created
The language is statically type checked so no risk of accessing uninitialized memory
The language is single threaded so no shared memory
My elements in the heap have a fixed structure:
pub enum Data {
Adt(u16, Vec<u64>),
Prim(Vec<u64>),
}
The Vec<u64> contains either pointers to other elements or primitive data.
I first considered representing it directly with an enum, but this requires a lot of allocations at runtime and needs an unnecessary indirection for the Vec. It also wastes space on the capacity and length fields (the data is immutable and the size is statically known).
I'm relatively new to Rust and have kept myself away from unsafe code, but I think it could help here. I first tried to express what I want the functionality to be:
#[derive(Copy, Clone)]
struct Data(u64);
struct DataStore(Vec<u64>);
impl DataStore {
fn alloc(max_size: u64) -> Self {
DataStore(Vec::with_capacity(max_size as usize))
}
fn new_adt(&mut self, tag: u16) -> Data {
let start = self.0.len() as u64;
let header = tag as u64;
self.0.push(header);
Data(start)
}
fn new_primitive(&mut self, data: &[u64]) -> Data {
let start = self.0.len() as u64;
for e in data {
self.0.push(*e)
}
Data(start)
}
fn init_adt_field(&mut self, Data(ptr): Data) {
self.0.push(ptr);
}
fn get_field(&self, Data(start): Data, index: u32) -> Data {
Data(self.0[(start + 1 + (index as u64)) as usize])
}
fn get_tag(&self, Data(start): Data) -> u16 {
self.0[start as usize] as u16
}
}
Can this somehow be implemented more efficiently than with a Vec? I looked at TypedArena, but was not sure whether and how I could use it in this scenario. Would it be faster than the Vec? The Vec will never grow, as it is initialized with a capacity statically proven to be enough.
Is it possible to allocate just a memory blob and then use unsafe pointers? What are my options here?
Similar to what I've learned in C++, I believe it's the padding that causes the difference in size between instances of these two structs.
type Foo struct {
w byte //1 byte
x byte //1 byte
y uint64 //8 bytes
}
type Bar struct {
x byte //1 byte
y uint64 //8 bytes
w byte // 1 byte
}
func main() {
fmt.Println(runtime.GOARCH)
newFoo := new(Foo)
fmt.Println(unsafe.Sizeof(*newFoo))
newBar := new(Bar)
fmt.Println(unsafe.Sizeof(*newBar))
}
Output:
amd64
16
24
Is there a rule of thumb to follow when defining struct members? (like ascending/descending order of size of types)
Is there a compile time optimisation which we can pass, that can automatically take care of this?
Or shouldn't I be worried about this at all?
Currently there's no compile-time optimisation; the values are padded to 8 bytes on x64.
You can manually arrange structs to use space optimally, typically by ordering fields from larger types to smaller. Eight consecutive byte fields, for example, will use only 8 bytes, but a single byte field will be padded out to an 8-byte alignment. Consider this: https://play.golang.org/p/0qsgpuAHHp
package main
import (
"fmt"
"unsafe"
)
type Compact struct {
a, b uint64
c, d, e, f, g, h, i, j byte
}
// Larger memory footprint than "Compact" - but fewer fields!
type Inefficient struct {
a uint64
b byte
c uint64
d byte
}
func main() {
newCompact := new(Compact)
fmt.Println(unsafe.Sizeof(*newCompact))
newInefficient := new(Inefficient)
fmt.Println(unsafe.Sizeof(*newInefficient))
}
If you take this into consideration; you can optimise the memory footprint of your structs.
Or shouldn't I be worried about this at all?
Yes, you should.
This is also called mechanical sympathy (see this Go Time podcast episode), so it also depends on the hardware architecture you are compiling for.
See as illustration:
"The day byte alignment came back to bite me" (January 2014)
"On the memory alignment of Go slice values" (July 2016)
The values in Go slices are 16-byte aligned. They are not 32-byte aligned.
Go pointers are byte-aligned.
It depends on the type of application you are developing and on how those structures are used. If the application needs to meet memory or performance criteria, you definitely should care about memory alignment and padding, but not only that: there is a nice article https://www.usenix.org/legacy/publications/library/proceedings/als00/2000papers/papers/full_papers/sears/sears_html/index.html that covers optimal CPU cache usage and the correlation between struct layout and performance, highlighting cache line alignment, false sharing, etc.
Also, there is a nice golang tool https://github.com/1pkg/gopium that helps to automate those optimizations; check it out!
Some guidelines
To minimize the number of padding bytes, lay out the fields from the largest allocation to the smallest.
One exception is the empty struct.
As we know, the size of an empty struct is zero:
type empty struct {
a struct{}
}
Following the common rule above, we might arrange the fields of the structure as below:
type E struct {
a int64
b int64
c struct{}
}
However, the size of E is 24: a zero-size field at the end of a struct gets padding, because a pointer to it would otherwise point past the end of the struct's allocation.
When the fields are arranged as:
type D struct {
b struct{}
a int64
c int64
}
the size of D is 16; see https://go.dev/play/p/ID_hN1zwIwJ
IMO, it is better to use tools that help us automate structure alignment optimizations:
aligncheck — https://gitlab.com/opennota/check
maligned — https://github.com/mdempsky/maligned; the original maligned is deprecated, use https://pkg.go.dev/golang.org/x/tools/go/analysis/passes/fieldalignment instead
golangci-lint — you just need to enable 'maligned' in the golangci-lint settings.
Example from the configuration file .golangci.example.yml:
linters-settings:
maligned:
# print struct with more effective memory layout or not, false by default
suggest-new: true
Suppose I'm interacting with a third party C library from Go that maintains some complicated data structure. It allows me to iterate over the data structure using an API that looks something like this:
typedef void (*callback)(double x, void *params);
iterate(int64_t start_idx, int64_t end_idx, callback f, void *params);
With the idea being that iterate loops over every index between start_idx and end_idx and calls f on every element in that range. If you wanted to read the data out of the complicated data structure and into an array, you would write something like this:
typedef struct buffer {
double *data;
int64_t i;
} buffer;
void read_callback(double x, void *params) {
    buffer *buf = (buffer*) params;
    buf->data[buf->i] = x;
    buf->i++;
}
Now let's suppose that I wanted to wrap this API call in a Go function where the user passes the function a pre-allocated buffer. In Go 1.5, I might have done something like this:
func Read(startIdx, endIdx int64, data []float64) {
buf := &C.buffer{}
buf.data = (*C.double)(unsafe.Pointer(&data[0]))
C.iterate(C.int64_t(startIdx), C.int64_t(endIdx),
(C.callback)(C.read_callback), unsafe.Pointer(buf))
}
However, in Go 1.6 this is invalid: data is a Go pointer and so is buf, meaning the runtime panics to prevent a GC error. Allocating buf as a C pointer is also not allowed. I think the intended way to handle something like this would be to allocate data as a C pointer, but I don't want to allocate a temporary array inside Read, because these arrays are large enough that I can't hold two of them in memory at once (and even if I could, I couldn't deal with the heap fragmentation).
My current (very hacky) solutions are either to pass some of the data around as global variables or to pack i and the array length into the first element of the array (and then swap some data around at the end of the iteration). Is there a less terrible way to do this?
What's the difference? Is map[T]bool optimized to map[T]struct{}? Which is the best practice in Go?
Perhaps the best reason to use map[T]struct{} is that you don't have to answer the question "what does it mean if the value is false"?
From "The Go Programming Language":
The struct type with no fields is called the empty struct, written
struct{}. It has size zero and carries no information but may be
useful nonetheless. Some Go programmers use it instead of bool as the
value type of a map that represents a set, to emphasize that only the
keys are significant, but the space saving is marginal and the syntax
more cumbersome, so we generally avoid it.
If you use bool, testing for presence in the "set" is slightly nicer, since you can just say:
if mySet["something"] {
/* .. */
}
The difference is in memory requirements. Under the bonnet, an empty struct value occupies no memory at all, so a map[T]struct{} spends no space on its values.
An empty struct is a struct type like any other. All the properties you are used to with normal structs apply equally to the empty struct. You can declare an array of struct{}s, but it of course consumes no storage.
var x [100]struct{}
fmt.Println(unsafe.Sizeof(x)) // prints 0
Since empty structs hold no data, it is not possible to determine whether two struct{} values are different. Even so, we can still use them as method receivers:
type S struct{}
func (s *S) addr() { fmt.Printf("%p\n", s) }
func main() {
var a, b S
a.addr() // 0x1beeb0
b.addr() // 0x1beeb0
}