Is allocating C memory to hold a Go struct a supported use case for cgo?

I've been exploring strategies to avoid passing nested Go pointers into C. Here's an example of how I'm trying out allocating a block of C memory with the intent of holding a Go struct:
(*MyGoStruct)(C.calloc(1, C.size_t(unsafe.Sizeof(MyGoStruct{}))))
Does anyone know if this is a supported use case? And if not, could someone explain how wrong this approach is?

Related

CGO known implementation bug in pointer checks

According to the documentation of CGO (https://pkg.go.dev/cmd/cgo), there is a known bug in the implementation:
Note: the current implementation has a bug. While Go code is permitted to write nil or a C pointer (but not a Go pointer) to C memory, the current implementation may sometimes cause a runtime error if the contents of the C memory appear to be a Go pointer. Therefore, avoid passing uninitialized C memory to Go code if the Go code is going to store pointer values in it. Zero out the memory in C before passing it to Go.
I looked for this in the issue tracker at GitHub but can't find it there. Could someone please elaborate on why this might happen? How does the runtime find Go pointers in uninitialized C memory?
Eg. let's say I am passing an uninitialized char array to a Go function from C, how can the runtime interpret a Go pointer in this memory?
Also, the "if the Go code is going to store pointer values in it" part confuses me. Why does later use of this memory matter?
I looked for this in the issue tracker at GitHub but can't find it there.
The bug to which this comment refers is https://golang.org/issue/19928, which is admittedly not easy to find. 😅
Could someone please elaborate on why this might happen? How does the runtime find Go pointers in uninitialized C memory?
During certain parts of a garbage collection cycle, the collector turns on a “write barrier” for writes to pointers in the Go heap, recording the previously-stored pointer value to ensure that it is not missed during the GC scan.
The bug here is that the write barrier sometimes also records the previously-stored pointer value for pointers outside the Go heap. If that value looks like a Go pointer, the garbage collector may try to scan it recursively, and could crash if it isn't actually a valid pointer.
Eg. let's say I am passing an uninitialized char array to a Go function from C, how can the runtime interpret a Go pointer in this memory?
This bug should not occur if the uninitialized data passed to Go is of a type that does not contain any pointers. So for a char array in particular you should be fine either way.
Also, the "if the Go code is going to store pointer values in it" part confuses me. Why does later use of this memory matter?
The compiler inserts write barriers at store instructions for pointer types. If the Go program does not store pointers, then the compiler will not emit any write barriers, and the bug in the write barrier will not be triggered.

Synthesize to smart pointers with Boost Spirit X3

I need to parse a complex AST, and it would be impossible to allocate this AST on heap memory, and the AST nodes must support polymorphism. One solution would be to allocate the AST nodes using smart pointers.
To simplify the question, how would I synthesize the following struct (std::unique_ptr<GiantIntegerStruct> giantIntegerStruct), with Boost Spirit X3 for example?
#include <memory>
#include <vector>

struct GiantIntegerStruct {
    std::vector<std::unique_ptr<int>> manyInts;
};
My tentative solution is to use semantic actions. Is there an alternative?
You can use semantic actions, or you can define traits for your custom types. However, see "Semantic actions runs multiple times in boost::spirit parsing" (especially the two links there). Basically, consider not doing that.
I need to parse a complex AST, and it would be impossible to allocate this AST on heap memory
This somewhat confusing statement leads me to the logical conclusion that you merely need to allocate from a shared memory segment instead.
In the good old spirit of the Rule of Zero, you could make a value wrapper that does the allocation using whatever method you prefer and still enjoy automatic attribute propagation with "value semantics" (the values will serve as mere "handles" for the actual object in shared memory).
If you need any help getting this set up, feel free to post a new question.

Rust manual memory management

When I began learning C, I implemented common data structures such as lists, maps and trees. I used malloc, calloc, realloc and free to manage the memory manually when requested. I did the same thing with C++, using new and delete.
Now comes Rust. It seems like Rust doesn't offer any functions or operators which correspond to the ones of C or C++, at least in the stable release.
Are the Heap structure and the ptr module (marked with experimental) the ones to look at for this kind of thing?
I know that these data structures are already in the language. It's for the sake of learning.
Although it's really not recommended to do this ever, you can use malloc and free like you are used to from C. It's not very useful, but here's how it looks:
extern crate libc; // 0.2.65

use std::mem;

fn main() {
    unsafe {
        let my_num: *mut i32 = libc::malloc(mem::size_of::<i32>() as libc::size_t) as *mut i32;
        if my_num.is_null() {
            panic!("failed to allocate memory");
        }
        libc::free(my_num as *mut libc::c_void);
    }
}
A better approach is to use Rust's standard library:
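use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

fn main() {
    unsafe {
        let layout = Layout::new::<u16>();
        let ptr = alloc(layout);
        // alloc returns a null pointer on failure
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        *(ptr as *mut u16) = 42;
        assert_eq!(*(ptr as *mut u16), 42);
        dealloc(ptr, layout);
    }
}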
It's very unusual to directly access the memory allocator in Rust. You generally want to use the smart pointer constructors (Box::new, Rc::new, Arc::new) for single objects and just use Vec or Box<[T]> if you want a heap-based array.
If you really want to allocate memory and get a raw pointer to it, you can look at the implementation of Rc. (Not Box. Box is magical.) To get its backing memory, it actually creates a Box and then uses its into_raw_non_null function to get the raw pointer out. For destroying, it uses the allocator API, but could alternatively use Box::from_raw and then drop that.
Are the Heap structure and the ptr module (marked with experimental) the ones to look at for this kind of thing?
No, as a beginner you absolutely shouldn't start there. When you started learning C, malloc was all there was, and it's still a hugely error-prone part of the language - but you can't write any non-trivial program without it. It's very important for C programmers to learn about malloc and how to avoid all the pitfalls (memory leaks, use-after-free, and so on).
In modern C++, people are taught to use smart pointers to manage memory, instead of using delete by hand, but you still need to call new to allocate the memory for your smart pointer to manage. It's a lot better, but there's still some risk there. And still, as a C++ programmer, you need to learn how new and delete work, in order to use the smart pointers correctly.
Rust aims to be much safer than C or C++. Its smart pointers encapsulate all the details of how memory is handled at low-level. You only need to know how to allocate and deallocate raw memory if you're implementing a smart pointer yourself. Because of the way ownership is managed, you actually need to know a lot more details of the language to be able to write correct code. It can't be lesson one or two like it is in C or C++: it's a very advanced topic, and one many Rust programmers never need to learn about.
If you want to learn how to allocate memory on the heap, the Box type is the place to start. In the Rust book, the chapter about smart pointers is the chapter about memory allocation.

Any documentation/article about the `&MyType{}` pattern in golang?

In most golang codebases I look at, people are using types by reference:
type Foo struct {}
myFoo := &Foo{}
I usually take the opposite approach: I pass everything by copy and only pass by reference when I want to perform something destructive on the value. This lets me easily spot destructive functions (which are fairly rare).
But seeing how references are commonplace, I guess it's not just a matter of taste. I get there's a cost in duplicating values, is it that much of a game changer? Or are there other reasons why references are preferred?
It would be great if someone could point me to an article or documentation about why references are preferred.
Thanks!
Go is pass by value. I try to use references like in your example as much as possible so that I don't have to think about whether I'm duplicating objects. Go is mostly meant for networking & scaling, which makes performance a priority. The obvious downside, as you say, is that the receiving functions and methods can modify the object that the pointer points to.
Otherwise there is no rule as to which you should use. Both are quite ok.
Also, somewhat related to the question, from the Go docs: Pointers vs. Values
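To make the copy-versus-pointer trade-off discussed above concrete, here is a minimal sketch. The Foo type with its Count field and the two helper functions are hypothetical names made up for illustration; they don't come from the question.

```go
package main

import "fmt"

// Foo is a hypothetical struct used only for illustration.
type Foo struct {
	Count int
}

// incrementCopy receives a copy of f, so the caller's value is unchanged.
func incrementCopy(f Foo) Foo {
	f.Count++
	return f
}

// incrementInPlace receives a pointer, so it mutates the caller's value.
func incrementInPlace(f *Foo) {
	f.Count++
}

func main() {
	a := Foo{Count: 1}
	b := incrementCopy(a)
	fmt.Println(a.Count, b.Count) // a is untouched, b holds the new value

	p := &Foo{Count: 1}
	incrementInPlace(p)
	fmt.Println(p.Count) // the value behind p was modified
}
```

With the value version the "destructive" result has to be returned and reassigned explicitly; with the pointer version the change happens behind the caller's back, which is the downside mentioned above.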

(Idiomatic?) Difference between new(T) and &T{...}?

I started kidding around with Go and am a little irritated by the new function. It seems to be quite limited, especially when considering structures with anonymous fields or inline initialisations. So I read through the spec and stumbled over the following paragraph:
Calling the built-in function new or taking the address of a composite literal allocates storage for a variable at run time.
So I have the suspicion that new(T) and &T{} will behave in the exact same way, is that correct? And if that is correct, in what situation should new be used?
Yes, you are correct. new is not that useful with structs. But it is with other basic types. new(int) will get you a pointer to a zero-valued int, and you can't do &int{} or similar.
In any case, in my experience, you rarely want that, so new is rarely used. You can just declare a plain int and pass around a pointer to it. In fact, doing this is probably better because it liberates you from thinking about allocating on the stack vs. on the heap, as the compiler will decide for you.
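A small sketch of the points above, assuming a hypothetical Point struct and a hypothetical newCounter helper (neither is from the question): new(T) and &T{} both give a pointer to a zero-valued T, only the literal form can set fields, and taking the address of a plain variable works too.

```go
package main

import "fmt"

// Point is a hypothetical struct used only for illustration.
type Point struct {
	X, Y int
}

// newCounter returns a pointer to a plain local variable. The compiler's
// escape analysis decides where the variable actually lives.
func newCounter() *int {
	var n int
	return &n
}

func main() {
	p1 := new(Point) // pointer to a zero-valued Point
	p2 := &Point{}   // same result via a composite literal
	fmt.Println(*p1 == *p2)

	p3 := &Point{X: 1, Y: 2} // a literal can set fields; new cannot
	fmt.Println(p3.X, p3.Y)

	i := new(int) // pointer to a zero-valued int; &int{} does not compile
	*i = 7
	c := newCounter()
	*c += 1
	fmt.Println(*i, *c)
}
```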
