how does GC behave when a pointer to struct is replaced - go

I'm learning Go and I'm reading examples from libraries. I found that some examples are using:
type MyType struct {
Code string
//...
}
func main() {
myType := &MyType{...}
//...
myType = &MyType{...}
}
Basically they are reusing variables. I understand that &MyType{..} returns a pointer, and that I can later replace that pointer. What happens to the memory it previously pointed to? Will the GC reclaim that memory, or will it be wasted? Maybe this is a silly question and I'm worried about nothing, but I'm trying to learn Go to build high-performance APIs :)

The memory will be reclaimed by the garbage collector.
If you want to replace the struct you can do it this way:
func main() {
myType := &MyType{...}
//...
*myType = MyType{...}
}
The difference will probably be negligible though.
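To make the difference between the two forms concrete, here is a minimal sketch. The helper names reassign and overwrite are made up for illustration and are not from the question. Reassigning the variable makes it point to a newly allocated value and leaves the old one unreachable (so the GC may reclaim it), while assigning through the pointer reuses the existing value.

```go
package main

import "fmt"

type MyType struct {
	Code string
}

// reassign points the variable at a newly allocated value.
// The old value becomes garbage once nothing else refers to it.
func reassign() (before, after *MyType) {
	p := &MyType{Code: "a"}
	before = p
	p = &MyType{Code: "b"} // new allocation, new address
	return before, p
}

// overwrite keeps the same pointer and replaces the value it points to.
func overwrite() (before, after *MyType) {
	p := &MyType{Code: "a"}
	before = p
	*p = MyType{Code: "b"} // same address, contents replaced
	return before, p
}

func main() {
	b, a := reassign()
	fmt.Println(b == a, b.Code, a.Code) // false a b
	b, a = overwrite()
	fmt.Println(b == a, b.Code, a.Code) // true b b
}
```

Note that the in-place form changes the value for every holder of that pointer, which may or may not be what you want.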

Related

golang struct creation return different memory addresses

I'm having trouble understanding why golang returns a different memory address on what appears to be the same struct (maybe it's not, perhaps it copies with the same values to another memory address?).
Here's the code
package main
import (
"fmt"
)
type Creature struct {
Name string
isAlive bool
}
func foo() Creature {
myCreature := Creature{Name: "dino", isAlive: true}
fmt.Printf("%p\n", &myCreature)
fmt.Println(myCreature)
return myCreature
}
func main() {
myCreat := foo()
fmt.Printf("%p\n", &myCreat)
fmt.Println(myCreat)
}
The output of the code is the following
0xc000004090
{dino true}
0xc000004078
{dino true}
As you can see, the memory addresses are different. Why?
Should I instead return a memory address?
I'm having trouble understanding why golang returns a different memory address on what appears to be the same struct (maybe it's not, perhaps it copies with the same values to another memory address?).
You didn't return a memory address, you returned a struct.
As you can see, the memory addresses are different.
Because you returned a struct and it was copied to a new one.
Why? Should I instead return a memory address?
Yes, if you want a pointer then return that.
package main
import (
"fmt"
)
type Creature struct {
Name string
isAlive bool
}
func foo() *Creature {
myCreature := Creature{Name: "dino", isAlive: true}
fmt.Printf("%p\n", &myCreature)
fmt.Println(myCreature)
return &myCreature
}
func main() {
myCreat := foo()
fmt.Printf("%p\n", myCreat)
fmt.Println(*myCreat)
}
The rule in Go is that you only use pointers when you actually need them, when you have to modify a struct's values or something. You should not use pointers because you think it might be more efficient. The memory optimiser can do its work more efficiently if you don't force it to do things one way or another.
See https://medium.com/@vCabbage/go-are-pointers-a-performance-optimization-a95840d3ef85, https://betterprogramming.pub/why-you-should-avoid-pointers-in-go-36724365a2a7, and many more articles.

How to cast from a Go pointer to uintptr and back without a memory corruption?

Conversion between pointers should be made using unsafe.Pointer() and uintptr.
I am writing an interpreter in Go. This is a very simple fragment that uses an EID struct to carry (type, value) pairs between different sections of native code. The code is surprising because the same print statement prints two different values (see the Foo() method). The object is "encapsulated" into an EID and then transformed back into an object.
The code compiles but the result is deeply broken.
If you run this you get:
~/go%  go run testBug.go
create object 0xc000068e28 with class 0xc00000c060
here is y:0xc000068e28, y.Isa: 0xc00000c060
here is y:0xc000068e28, y.Isa: 0x2c
package main
import (
"fmt"
"unsafe"
)
type EID struct {
SORT *Class
VAL uintptr
}
// access to VAL
func OBJ(x EID) *Anything { return (*Anything)(unsafe.Pointer(x.VAL)) }
func INT(x EID) int { return (int)((uintptr)(unsafe.Pointer(x.VAL))) }
// useful utility get the pointer as a uintptr
func (x *Anything) Uip() uintptr { return uintptr(unsafe.Pointer(x)) }
type Anything struct {
Isa *Class
}
func (x *Anything) Id() *Anything { return x }
type Object struct {
Anything
name string
}
type Class struct {
Object
Super *Class
}
type Integer struct {
Anything
Value int
}
func MakeObject(c *Class) *Anything {
o := new(Object)
o.Isa = c
return o.Id()
}
// this is the surprising example - EID is passed but the content is damaged
func (c *Class) Foo() EID {
x := c.Bar()
y := OBJ(x)
z := y.Isa
fmt.Printf("here is y:%p, y.Isa: %p\n", y, z)
fmt.Printf("here is y:%p, y.Isa: %p\n", y, y.Isa) // this produces a different value !
return x
}
func (c *Class) Bar() EID {
UU := EID{c, MakeObject(c).Uip()}
fmt.Printf("create object %p with class %p\n", OBJ(UU), OBJ(UU).Isa)
return UU
}
var aClass *Class
var aInteger *Class
func main() {
aClass := new(Class)
aClass.Isa = aClass
aClass.Foo()
}
Clearly the uintptr-to-pointer conversion has to be local and cannot be split across two different places (Foo() and Bar() here). I have found a workaround, but I am curious about this strange behavior.
When you store a pointer (of any concrete type or even of type unsafe.Pointer) into a uintptr, this hides the pointer-ness from Go's garbage collector. Go is therefore free to GC the underlying object if there is no other pointer to it.
When you convert a uintptr to unsafe.Pointer, the object, a pointer to which the value stored in the uintptr converts, needs to exist. If it's been GC'ed, it no longer exists. Hence the "safe" way to take some pointer value p of any type *T and store it in a uintptr is to store it instead in unsafe.Pointer. The unsafe.Pointer object is visible to Go's garbage collector, as a pointer, so this keeps the actual object alive.
You'll see this pattern in some of the Go internal software:
// need to keep the pointer alive while we make a syscall
p := unsafe.Pointer(foo)
ret := syscall.SyscallN(..., uintptr(foo), ...)
The apparently pointless creation of local variable p serves to protect the underlying object from being GC'ed while the OS system call reads its bytes. (Note that this is being overly chummy with the compiler since the assignment to p appears to be dead code here. Perhaps the internal software is fancier than this, and/or they're using //go:... comments as well.)
This same pattern works in the Go playground if I take your not-quite-minimal reproducible example and make the obvious minimal changes to it. Whether that's sufficient (and precisely how you'd like to use this same concept in your interpreter) is another question entirely, but see playground copy. Note: I had to add one closing brace to your program but after that it exhibited the same behavior you saw; here's that version. It draws two warnings from go vet about misuse of unsafe.Pointer, which my updated version doesn't.
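As a rough sketch of the idea above (this is not the playground version mentioned in the answer, and the types are trimmed down for illustration), the EID can carry its value as an unsafe.Pointer instead of a uintptr. The helper names MakeEID and roundTrip are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

type Class struct{ Name string }

type Anything struct{ Isa *Class }

// EID stores VAL as unsafe.Pointer rather than uintptr, so the garbage
// collector still sees it as a pointer and keeps the object alive.
type EID struct {
	SORT *Class
	VAL  unsafe.Pointer
}

// OBJ converts the stored pointer back to its original type.
func OBJ(x EID) *Anything { return (*Anything)(x.VAL) }

// MakeEID wraps a freshly created object in an EID.
func MakeEID(c *Class) EID {
	o := &Anything{Isa: c}
	return EID{SORT: c, VAL: unsafe.Pointer(o)}
}

// roundTrip builds an EID, forces a collection, and checks that the
// object is still intact when read back.
func roundTrip(c *Class) bool {
	x := MakeEID(c)
	runtime.GC()
	return OBJ(x).Isa == c
}

func main() {
	fmt.Println(roundTrip(&Class{Name: "aClass"})) // true
}
```

An interpreter that also packs plain integers into VAL would need a separate field or representation for them, since an arbitrary integer must not be stored in an unsafe.Pointer.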

Need help to understand garbage collection in GoLang

I'm a little bit confused with GoLang's garbage collector.
Consider this following code, where I implement reader interface for my type T.
type T struct {
header Header
data []*MyDataType
}
func (t *T) Read(p []byte) (int, error) {
t.Header = *(*Header) (t.readFileHeader(p))
t.Data = *(*[]*MyDataType) (t.readFileData(p))
}
In the reader functions I cast the data to MyDataType through an unsafe.Pointer that points to a slice header built with the reflect package (the real code is more complicated, but this should be enough for the sake of the example):
func (t *T) readFileData(data []byte, idx int, ...) unsafe.Pointer {
...
return unsafe.Pointer(&reflect.SliceHeader{Data : uintptr(unsafe.Pointer(&data[idx])), ...})
}
and if I read the data in a different function:
func (d *Dummy) foo() {
data, _ := ioutil.ReadFile(filename)
d.t.Read(data) // <--- will the GC free data?
}
Now I'm unsure whether the GC may free the data loaded from the file once foo returns, or whether the data will only be freed after d.t is freed.
To understand what the GC might do to your variables, you first need to know how and where Go allocates them. It is worth reading about escape analysis, which is how the Go compiler decides whether to allocate memory on the stack or on the heap.
Long story short, GC will free memory only if it is not referenced by your Go program.
In your example, the reference to loaded data by data, _ := ioutil.ReadFile(filename) is passed to t.Data = *(*[]*MyDataType) (t.readFileData(p)) ultimately. Therefore, they will be referenced as long as (t *T) struct is referenced as well. As far as I can see from your code, the loaded data will be garbage-collected along with (t *T).
According to the reflect docs, I have to keep a separate pointer to the data (a *[]byte) to prevent it from being garbage-collected. So the solution is to add a referencePtr to
type T struct {
header Header
data []*MyDataType
referencePtr *[]byte
}
which will point to my data inside the Read function
func (t *T) Read(p []byte) (int, error) {
t.referencePtr = &p
t.Header = *(*Header) (t.readFileHeader(p))
t.Data = *(*[]*MyDataType) (t.readFileData(p))
}
or is this unnecessary?
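One way to sidestep the question, sketched here under assumptions, is to avoid reflect.SliceHeader (whose Data field is a plain uintptr) and build the view with unsafe.Slice, available since Go 1.17. The resulting slice contains a real pointer into the byte buffer, so the buffer stays alive for as long as the view is reachable. Rec and viewRecords are hypothetical names, and the record type is made only of bytes to keep alignment and byte order out of the picture.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Rec is a hypothetical fixed-size record made only of bytes.
type Rec struct {
	Kind  uint8
	Value uint8
}

// viewRecords reinterprets data[idx:] as n records without copying.
// The returned slice points into data's backing array, so the garbage
// collector keeps that array alive while the slice is reachable.
func viewRecords(data []byte, idx, n int) []Rec {
	if idx < 0 || n <= 0 || idx+n*int(unsafe.Sizeof(Rec{})) > len(data) {
		return nil
	}
	return unsafe.Slice((*Rec)(unsafe.Pointer(&data[idx])), n)
}

func main() {
	data := []byte{0xFF, 1, 10, 2, 20}
	recs := viewRecords(data, 1, 2)
	fmt.Println(len(recs), recs[0].Kind, recs[0].Value, recs[1].Kind, recs[1].Value) // 2 1 10 2 20

	// The view shares memory with data: a write through one shows in the other.
	recs[1].Value = 99
	fmt.Println(data[4]) // 99
}
```

With this approach no extra referencePtr field is needed, because the slice itself is the reference.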

How to "pass a Go pointer to Cgo"?

I am confused regarding the passing of Go pointers (which, to my understanding, include all pointer types as well as unsafe.Pointer) to cgo. When calling C functions with cgo, I can only provide variables of types known on the C-side, or unsafe.Pointer if it matches a void*-typed parameter in the C-function's signature. So when "Go pointers passed to C are pinned for lifetime of call", how does Go know that what I am passing is, in fact, a Go pointer, if I am forced to cast it to C.some_wide_enough_uint_type or C.some_c_pointer_type beforehand? The moment it is cast, isn't the information that it is a Go pointer lost, and don't I run the risk of the GC changing the pointer? (I can see how freeing is prevented at least, when a pointer-type reference is retained on the Go-side)
We have a project with a fair amount of working cgo code, but zero confidence in its reliability. I would like to see an example of "here is how to do it correctly" which doesn't resort to circumventing Go's memory model by using C.malloc() or such, which most examples unfortunately do.
So regardless of what "pinning the pointer for lifetime of call" actually means, I see a problem either way:
If it means that Go will pin all pointers in the entire program, I see a race condition in the time interval between casting a Go pointer to a C-type and the cgo-call actually being invoked.
If it means that Go will pin only those Go pointers which are being passed, how does it know that they are Go pointers when, at the time of calling, they can only have a C-type?
I've been reading through Go issues for half the day and am starting to feel like I'm just missing something simple. Any pointers are appreciated.
EDIT: I will try to clarify the question by providing examples.
Consider this:
/*
#include <stdio.h>
void myCFunc(void* ptr) {
printf((char*)ptr);
}
*/
import "C"
import "unsafe"
func callMyCFunc() {
goPointer := []byte("abc123\n\x00")
C.myCFunc(unsafe.Pointer(&goPointer[0]))
}
Here, Go's unsafe.Pointer-type effortlessly translates into C's void*-type, so we are happy on the C-side of things, and we should be on the Go-side also: the pointer clearly points into Go-allocated memory, so it should be trivial for Go to figure out that it should pin this pointer during the call, despite it being an unsafe one. Is this the case? If it is, without further research, I would consider this to be the preferred way to pass Go pointers to cgo. Is it?
Then, consider this:
/*
#include <stdio.h>
void myCFunc(unsigned long long int stupidlyTypedPointerVariable) {
char* pointerToHopefullyStillTheSameMemory = (char*)stupidlyTypedPointerVariable;
printf(pointerToHopefullyStillTheSameMemory);
}
*/
import "C"
import "unsafe"
func callMyCFunc() {
goPointer := []byte("abc123\n\x00")
C.myCFunc(C.ulonglong(uintptr(unsafe.Pointer(&goPointer[0]))))
}
Here, I would expect that Go won't make any guesses on whether some C.ulonglong-typed variable actually means to contain the address of a Go pointer. But am I correct?
My confusion largely arises from the fact that it's not really possible to write some code to reliably test this with.
Finally, what about this:
/*
#include <stdio.h>
void cFuncOverWhichIHaveNoControl(char* ptr) {
printf(ptr);
}
*/
import "C"
import "unsafe"
func callMyCFunc() {
goPointer := []byte("abc123\n\x00")
C.cFuncOverWhichIHaveNoControl((*C.char)(unsafe.Pointer(&goPointer[0])))
}
If I am, for whatever reason, unable to change the signature of the C-function, I must cast to *C.char. Will Go still check if the value is a Go pointer, when it already is a C pointer-type?
Looking at the section on passing pointers in the current cgo documentation, (thanks to peterSO) we find that
the term Go pointer means a pointer to memory allocated by Go
as well as that
A pointer type may hold a Go pointer or a C pointer
Thus, using uintptr and other integer (read: non-pointer) types will lose us Go's guarantee of pinning the pointer.
A uintptr is an integer, not a reference. Converting a Pointer to a uintptr creates an integer value with no pointer semantics. Even if a uintptr holds the address of some object, the garbage collector will not update that uintptr's value if the object moves, nor will that uintptr keep the object from being reclaimed.
Source: https://golang.org/pkg/unsafe/#Pointer
Regarding C pointer types such as *char/*C.char, these are only safe when the pointed data does not itself contain pointers to other memory allocated by Go. This can actually be shown by trying to trigger Go's Cgo Debug mechanism, which disallows passing a Go pointer to (or into) a value which itself contains another Go pointer:
package main
import (
"fmt"
"unsafe"
/*
#include <stdio.h>
void cFuncChar(char* ptr) {
printf("%s\n", ptr);
}
void cFuncVoid(void* ptr) {
printf("%s\n", (char*)ptr);
}
*/
"C"
)
type MyStruct struct {
Distraction [2]byte
Dangerous *MyStruct
}
func main() {
bypassDetection()
triggerDetection()
}
func bypassDetection() {
fmt.Println("=== Bypass Detection ===")
ms := &MyStruct{[2]byte{'A', 0}, &MyStruct{[2]byte{0, 0}, nil}}
C.cFuncChar((*C.char)(unsafe.Pointer(ms)))
}
func triggerDetection() {
fmt.Println("=== Trigger Detection ===")
ms := &MyStruct{[2]byte{'B', 0}, &MyStruct{[2]byte{0, 0}, nil}}
C.cFuncVoid(unsafe.Pointer(ms))
}
This will print the following:
=== Bypass Detection ===
A
=== Trigger Detection ===
panic: runtime error: cgo argument has Go pointer to Go pointer
Using *C.char bypassed the detection. Only using unsafe.Pointer will detect Go pointer to Go pointer scenarios. Unfortunately, this means we will have to have an occasional nebulous void*-parameter in the C-function's signature.
Adding for clarity: Go may very well pin the value pointed by a *C.char or such, which is safe to pass; it just (reasonably) won't make an effort to find out whether it might be something else which could contain additional pointers into memory allocated by Go. Casting to unsafe.Pointer is actually safe; casting from it is what may be dangerous.

Does assigning value to interface copy anything?

I've been trying to wrap my head around the concept of interfaces in Go. Reading this and this helped a lot.
The only thing that makes me uncomfortable is the syntax. Have a look at the example below:
package main
import "fmt"
type Interface interface {
String() string
}
type Implementation int
func (v Implementation) String() string {
return fmt.Sprintf("Hello %d", v)
}
func main() {
var i Interface
impl := Implementation(42)
i = impl
fmt.Println(i.String())
}
My issue is with i = impl. Since an interface instance actually holds a pointer reference to the actual data, it would feel more natural to me to write i = &impl. Usually, assigning a non-pointer without & makes a full memory copy of the data, but assigning to an interface seems to side-step this and instead simply (behind the scenes) assign the pointer to the interface value. Am I right? That is, the data for the int(42) will not be copied in memory?
The data for int(42) will be copied. Try this code:
func main() {
var i Interface
impl := Implementation(42)
i = impl
fmt.Println(i.String())
impl = Implementation(91)
fmt.Println(i.String())
}
You'll find that the second i.String() still shows 42. Perhaps one of the trickier aspects of Go is that method receivers can be pointers as well.
func (v *Implementation) String() string {
return fmt.Sprintf("Hello %d", *v)
}
// ...
i = &impl
is what you want if the interface should hold a pointer to the original value of impl. "Under the hood" an interface is a struct that holds either a pointer to some data or the data itself (plus some type metadata that we can ignore for our purposes). In older Go releases (before Go 1.4) the data itself was stored if its size was less than or equal to one machine word, whether it was a pointer, struct, or other value; since Go 1.4 only pointer values are stored directly, and everything else is stored behind a pointer.
Otherwise it will be a pointer to some data, but here's the tricky part: if the type implementing the interface is a struct the pointer will be to a copy of the struct, not the struct assigned to the interface variable itself. Or at least semantically the user can think of it as such, optimizations may allow the value to not be copied until the two diverge (e.g. until you call String or reassign impl).
In short: assigning to an interface can semantically be thought of as a copy of the data that implements the interface. If this is a pointer to a type, it copies the pointer, if it's a big struct, it copies the big struct. The particulars of interfaces using pointers under the hood are for reasons of garbage collection and making sure the stack expands by predictable amounts. As far as the developer is concerned, they should be thought of as semantic copies of the specific instance of the implementing type assigned.
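The same copy semantics can be seen with a struct. The following is a small sketch; Point and snapshot are names invented for this illustration. An interface holding the value keeps its own copy, while an interface holding a pointer observes later changes to the original.

```go
package main

import "fmt"

type Point struct{ X, Y int }

// String has a value receiver, so both Point and *Point satisfy fmt.Stringer.
func (p Point) String() string { return fmt.Sprintf("(%d,%d)", p.X, p.Y) }

// snapshot stores p in two interface values, one by value and one by
// pointer, then mutates p and reports what each interface sees.
func snapshot() (byValue, byPointer string) {
	p := Point{1, 2}
	var v fmt.Stringer = p    // copies p into the interface
	var ptr fmt.Stringer = &p // stores a pointer to p
	p.X = 10
	return v.String(), ptr.String()
}

func main() {
	v, ptr := snapshot()
	fmt.Println(v)   // (1,2)
	fmt.Println(ptr) // (10,2)
}
```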

Resources