Need help to understand garbage collection in GoLang - go

I'm a little confused about Go's garbage collector.
Consider the following code, where I implement the reader interface for my type T.
type T struct {
Header Header
Data []*MyDataType
}
func (t *T) Read(p []byte) (int, error) {
t.Header = *(*Header)(t.readFileHeader(p))
t.Data = *(*[]*MyDataType)(t.readFileData(p))
return len(p), nil
}
In the reader functions I cast the data to MyDataType via an unsafe.Pointer that points to a slice header built with the reflect package (the real code is more complicated, but this should be enough for the sake of the example):
func (t *T) readFileData(data []byte, idx int, ...) unsafe.Pointer {
...
return unsafe.Pointer(&reflect.SliceHeader{Data : uintptr(unsafe.Pointer(&data[idx])), ...})
}
and if I read the data in a different function
func (d *Dummy) foo() {
data, _ := ioutil.ReadFile(filename)
d.t.Read(data) // <-- will the GC free data?
}
Now I'm confused about whether the GC can free the data loaded from the file after foo returns, or whether the data will only be freed once d.t is freed.

To understand what the GC might do with your variables, you first need to know how and where Go allocates them. Escape analysis is how the Go compiler decides whether to allocate memory on the stack or on the heap.
Long story short, the GC frees memory only once your Go program no longer references it.
In your example, the data loaded by data, _ := ioutil.ReadFile(filename) ultimately ends up referenced through t.Data = *(*[]*MyDataType) (t.readFileData(p)). It therefore stays referenced for as long as the T struct itself is referenced. As far as I can see from your code, the loaded data will be garbage-collected along with (t *T).

According to the reflect docs, I have to keep a separate pointer to the data (*[]byte) to avoid garbage collection. So the solution would be to add a referencePtr field to
type T struct {
Header Header
Data []*MyDataType
referencePtr *[]byte
}
which will point to my data inside the Read function
func (t *T) Read(p []byte) (int, error) {
t.referencePtr = &p
t.Header = *(*Header)(t.readFileHeader(p))
t.Data = *(*[]*MyDataType)(t.readFileData(p))
return len(p), nil
}
or is this unnecessary?
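One way to sidestep the question is to avoid reflect.SliceHeader and its uintptr Data field altogether. The sketch below assumes Go 1.17 or later, where unsafe.Slice is available, and uses a hypothetical helper viewAsUint32 with uint32 elements standing in for the real record type. The resulting slice carries a real pointer into the byte buffer, so the garbage collector keeps the buffer alive for as long as the slice is reachable. It also assumes the bytes at idx are suitably aligned for the element type.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// viewAsUint32 is a hypothetical helper: it reinterprets part of buf as a
// []uint32 without copying. It assumes buf[idx:] holds at least n*4 bytes
// and that &buf[idx] is 4-byte aligned.
func viewAsUint32(buf []byte, idx, n int) []uint32 {
	// unsafe.Slice builds a normal slice whose data pointer is visible to
	// the garbage collector, unlike the uintptr in reflect.SliceHeader.
	return unsafe.Slice((*uint32)(unsafe.Pointer(&buf[idx])), n)
}

// load stands in for reading a file: buf goes out of scope when load
// returns, but the returned view still references its backing array.
func load() []uint32 {
	buf := make([]byte, 16)
	return viewAsUint32(buf, 0, 4)
}

func main() {
	view := load()
	runtime.GC() // the backing array survives because view points into it
	view[0] = 0xFFFFFFFF
	fmt.Println(len(view), view[0] == 0xFFFFFFFF) // prints "4 true"
}
```

With this approach no extra referencePtr field is needed, because the slice itself is the correctly typed pointer to the data.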

Related

How to cast from a Go pointer to uintptr and back without a memory corruption?

Conversion between pointers should be made using unsafe.Pointer() and uintptr.
I am writing an interpreter in Go. This is a very simple fragment that uses an EID struct to carry (type, value) pairs between different sections of native code. The code is surprising because the same print statement prints two different values (see the Foo() method). The object is "encapsulated" into an EID and then transformed back into an object.
The code compiles but the result is deeply broken.
If you run this you get:
~/go%  go run testBug.go
create object 0xc000068e28 with class 0xc00000c060
here is y:0xc000068e28, y.Isa: 0xc00000c060
here is y:0xc000068e28, y.Isa: 0x2c
package main
import (
"fmt"
"unsafe"
)
type EID struct {
SORT *Class
VAL uintptr
}
// access to VAL
func OBJ(x EID) *Anything { return (*Anything)(unsafe.Pointer(x.VAL)) }
func INT(x EID) int { return (int)((uintptr)(unsafe.Pointer(x.VAL))) }
// useful utility get the pointer as a uintptr
func (x *Anything) Uip() uintptr { return uintptr(unsafe.Pointer(x)) }
type Anything struct {
Isa *Class
}
func (x *Anything) Id() *Anything { return x }
type Object struct {
Anything
name string
}
type Class struct {
Object
Super *Class
}
type Integer struct {
Anything
Value int
}
func MakeObject(c *Class) *Anything {
o := new(Object)
o.Isa = c
return o.Id()
}
// this is the surprising example - EID is passed but the content is damaged
func (c *Class) Foo() EID {
x := c.Bar()
y := OBJ(x)
z := y.Isa
fmt.Printf("here is y:%p, y.Isa: %p\n", y, z)
fmt.Printf("here is y:%p, y.Isa: %p\n", y, y.Isa) // this produces a different value !
return x
}
func (c *Class) Bar() EID {
UU := EID{c, MakeObject(c).Uip()}
fmt.Printf("create object %p with class %p\n", OBJ(UU), OBJ(UU).Isa)
return UU
}
var aClass *Class
var aInteger *Class
func main() {
aClass := new(Class)
aClass.Isa = aClass
aClass.Foo()
}
Clearly the uintptr-to-pointer conversion has to be local and cannot be split across two different places (Foo() and Bar() here). I have found a workaround, but I am curious about this strange behavior.
When you store a pointer (of any concrete type or even of type unsafe.Pointer) into a uintptr, this hides the pointer-ness from Go's garbage collector. Go is therefore free to GC the underlying object if there is no other pointer to it.
When you convert a uintptr back to unsafe.Pointer, the object whose address is stored in the uintptr must still exist. If it's been GC'ed, it no longer exists. Hence the "safe" way to hold on to some pointer value p of any type *T is to store it in an unsafe.Pointer rather than in a uintptr. The unsafe.Pointer value is visible to Go's garbage collector as a pointer, so it keeps the actual object alive.
You'll see this pattern in some of the Go internal software:
// need to keep the pointer alive while we make a syscall
p := unsafe.Pointer(foo)
ret := syscall.SyscallN(..., uintptr(foo), ...)
The apparently pointless creation of local variable p serves to protect the underlying object from being GC'ed while the OS system call reads its bytes. (Note that this is being overly chummy with the compiler since the assignment to p appears to be dead code here. Perhaps the internal software is fancier than this, and/or they're using //go:... comments as well.)
This same pattern works in the Go playground if I take your not-quite-minimal reproducible example and make the obvious minimal changes to it. Whether that's sufficient (and precisely how you'd like to use this same concept in your interpreter) is another question entirely, but see playground copy. Note: I had to add one closing brace to your program but after that it exhibited the same behavior you saw; here's that version. It draws two warnings from go vet about misuse of unsafe.Pointer, which my updated version doesn't.

I have a question about Go pointer usage in package bytes

I have a question about a usage of pointer in Go. The link is here: https://golang.org/pkg/bytes/#example_Buffer.
In the type Buffer section, the first example:
type Buffer struct {
// contains filtered or unexported fields
}
func main() {
var b bytes.Buffer // A Buffer needs no initialization.
b.Write([]byte("Hello "))
fmt.Fprintf(&b, "world!")
b.WriteTo(os.Stdout)
}
and then in the
func (b *Buffer) Write(p []byte) (n int, err error)
I know that the receiver of func Write is (b *Buffer), so why, in the main() function, after declaring b, can we simply call b.Write() rather than (&b).Write()?
Thank you!
The receiver is a pointer, and in b.Write(), b is addressable. So Write is invoked on a pointer to b, not a copy of b. If b were not addressable, you would get a compile error. For instance, this would fail:
bytes.Buffer{}.Write([]byte{1})
In general: you can call methods with pointer receivers only if you can take the address of the receiver object. For such methods the compiler passes the address of the receiver, not a copy: b.Write(...) is shorthand for (&b).Write(...).
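A small sketch showing that the two spellings behave identically on an addressable variable. The build function is a hypothetical name used only for this illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// build writes to the same buffer using both call forms.
func build() string {
	var b bytes.Buffer // b is a variable, so it is addressable

	b.Write([]byte("Hello "))    // the compiler takes &b automatically
	(&b).Write([]byte("world!")) // the explicit form of the same call

	// bytes.Buffer{}.Write([]byte{1}) // would not compile: a composite
	// literal used as a plain value is not addressable

	return b.String()
}

func main() {
	fmt.Println(build()) // prints "Hello world!"
}
```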

Function returns lock by value

I have the following structure
type Groups struct {
sync.Mutex
Names []string
}
and the following function
func NewGroups(names ...string) (Groups, error) {
// ...
return groups, nil
}
When I check for semantic errors with go vet, I am getting this warning:
NewGroups returns Lock by value: Groups
Since go vet complains, this is presumably not good. What problems can this code cause, and how can I fix it?
You need to embed the sync.Mutex as a pointer:
type Groups struct {
*sync.Mutex
Names []string
}
Addressing your comment on your question: in the article http://blog.golang.org/go-maps-in-action, notice that Gerrand is not returning the struct from a function but using it right away, which is why he isn't using a pointer. In your case you are returning it, so you need a pointer so as not to make a copy of the Mutex.
Update: As @JimB points out, it may not be prudent to embed a pointer to sync.Mutex. It might be better to return a pointer to the outer struct and continue to embed the sync.Mutex as a value. Consider what you are trying to accomplish in your specific case.
Return a pointer *Groups instead.
Embedding the mutex pointer also works but has two disadvantages that require extra care from your side:
the zero value of the struct would have a nil mutex, so you must explicitly initialize it every time
func main() {
a, _ := NewGroups()
a.Lock() // panic: nil pointer dereference
}
func NewGroups(names ...string) (Groups, error) {
return Groups{/* whoops, mutex zero val is nil */ Names: names}, nil
}
assigning a struct value, or passing it as function arg, makes a copy so you also copy the mutex pointer, which then locks all copies. (This may be a legit use case in some particular circumstances, but most of the time it might not be what you want.)
func main() {
a, _ := NewGroups()
a.Lock()
lockShared(a)
fmt.Println("done")
}
func NewGroups(names ...string) (Groups, error) {
return Groups{Mutex: &sync.Mutex{}, Names: names}, nil
}
func lockShared(g Groups) {
g.Lock() // whoops, deadlock! the mutex pointer is the same
}
Keep your original struct and return pointers. You don't have to explicitly init the embedded mutex, and it's intuitive that the mutex is not shared with copies of your struct.
func NewGroups(names ...string) (*Groups, error) {
// ...
return &Groups{}, nil
}
Playground (with the failing examples): https://play.golang.org/p/CcdZYcrN4lm
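Putting the recommended version together, here is a minimal runnable sketch. The Add method is a hypothetical addition, included only to show the embedded mutex being used from several goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

type Groups struct {
	sync.Mutex // embedded by value; its zero value is ready to use
	Names      []string
}

// NewGroups returns a pointer, so the mutex is never copied.
func NewGroups(names ...string) (*Groups, error) {
	return &Groups{Names: names}, nil
}

// Add is a hypothetical method that appends a name under the lock.
func (g *Groups) Add(name string) {
	g.Lock()
	defer g.Unlock()
	g.Names = append(g.Names, name)
}

func main() {
	g, _ := NewGroups("first")

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			g.Add(fmt.Sprintf("group-%d", i))
		}(i)
	}
	wg.Wait()

	fmt.Println(len(g.Names)) // prints "101"
}
```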

how does GC behave when a pointer to struct is replaced

I'm learning Go and I'm reading examples from libraries. I found that some examples are using:
type MyType struct {
Code string
//...
}
func main() {
myType := &MyType{...}
//...
myType = &MyType{...}
}
Basically they are reusing variables. I understand that &MyType{..} returns a pointer and that I can later replace that pointer. What happens to the memory it previously pointed to? Will the GC reclaim that memory, or will it be wasted? Maybe this is a silly question and I'm worrying about nothing, but I'm trying to learn Go to build performant APIs :)
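The memory will be reclaimed by the garbage collector once nothing else references it.
If you want to overwrite the existing struct in place instead of pointing the variable at a newly allocated one, you can do it this way: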
func main() {
myType := &MyType{...}
//...
*myType = MyType{...}
}
The difference will probably be negligible though.
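The two forms can be told apart by keeping a second pointer to the original struct. In this sketch, rebind and overwrite are hypothetical names, and MyType is reduced to a single field.

```go
package main

import "fmt"

type MyType struct {
	Code string
}

// rebind points the variable at a newly allocated struct. The old struct is
// untouched and becomes garbage once nothing references it.
func rebind() (viaAlias, viaVar string) {
	myType := &MyType{Code: "first"}
	alias := myType // second reference to the original struct
	myType = &MyType{Code: "second"}
	return alias.Code, myType.Code
}

// overwrite reuses the existing allocation and replaces its contents, so
// every pointer to it observes the new value.
func overwrite() (viaAlias, viaVar string) {
	myType := &MyType{Code: "first"}
	alias := myType
	*myType = MyType{Code: "second"}
	return alias.Code, myType.Code
}

func main() {
	a, b := rebind()
	fmt.Println(a, b) // prints "first second"
	c, d := overwrite()
	fmt.Println(c, d) // prints "second second"
}
```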

Thread Safe In Value Receiver In Go

type MyMap struct {
data map[int]int
}
func (m MyMap) foo() {
//insert or read from m.data
}
...
go func(m *MyMap) {
for {
//insert into m.data
}
}(&m)
...
var m MyMap
m.foo()
When I call m.foo(), the compiler makes a value copy of m, as we know. My question is: is there a race in this procedure? The copy is a kind of read from the variable m, so you might need a read lock in case someone is inserting values into m.data while you are copying from it.
If it is thread-safe, is that guaranteed by the compiler?
This is not safe, and the language gives you no implicit safe concurrent access. Concurrent access to shared data is unsafe whenever at least one of the accesses is a write, and it needs to be protected with channels or locks.
Because maps internally contain references to the data they hold, the map still points to the same data even when the outer structure is copied. A concurrent map is a common requirement, and all you need to do is add a mutex to protect the reads and writes. Though a Mutex pointer would work with your value receiver, it's more idiomatic to use a pointer receiver for mutating methods.
type MyMap struct {
sync.Mutex
data map[int]int
}
func (m *MyMap) foo() {
m.Lock()
defer m.Unlock()
//insert or read from m.data
}
The Go memory model is very explicit, and races are generally easy to reason about. When in doubt, always run your program or tests with the -race flag.
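A runnable sketch of the mutex-protected map, exercised from several goroutines. NewMyMap, Set, Get and Len are hypothetical names chosen for this example. Running it with go run -race should report no data race.

```go
package main

import (
	"fmt"
	"sync"
)

type MyMap struct {
	sync.Mutex
	data map[int]int
}

// NewMyMap is a hypothetical constructor that initializes the inner map.
func NewMyMap() *MyMap {
	return &MyMap{data: make(map[int]int)}
}

// Set inserts a value while holding the lock.
func (m *MyMap) Set(k, v int) {
	m.Lock()
	defer m.Unlock()
	m.data[k] = v
}

// Get reads a value while holding the lock.
func (m *MyMap) Get(k int) (int, bool) {
	m.Lock()
	defer m.Unlock()
	v, ok := m.data[k]
	return v, ok
}

// Len reports the number of entries while holding the lock.
func (m *MyMap) Len() int {
	m.Lock()
	defer m.Unlock()
	return len(m.data)
}

func main() {
	m := NewMyMap()

	var wg sync.WaitGroup
	for g := 0; g < 10; g++ {
		wg.Add(1)
		go func(g int) {
			defer wg.Done()
			for i := 0; i < 100; i++ {
				m.Set(g*100+i, i) // each goroutine writes 100 distinct keys
			}
		}(g)
	}
	wg.Wait()

	fmt.Println(m.Len()) // prints "1000"
}
```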
