Free C pointer when collected by GC - go

I have a package that interfaces with a C library. Now I need to store a pointer to a C struct in a Go struct:
type A struct {
	s *C.struct_b
}
Obviously this pointer needs to be freed before the struct is collected by the GC. How can I accomplish that?

When possible, the best approach is to copy the C struct into Go-controlled memory (here a is a value of type A; once the copy is made, the original C allocation can be released immediately):
	var ns C.struct_b
	ns = *a.s
	a.s = &ns
Obviously, that won't work in all cases. C.struct_b may be too complicated or shared with something still in C code. In this case, you need to create a .Free() or .Close() method (whichever makes more sense) and document that the user of your struct must call it. In Go, a Free method should always be safe to call. For example, after Free has run, be sure to set a.s = nil so that if the user calls Free twice, the program does not crash.
There is also a way to create finalizers. See another answer I wrote here. However, they may not always run, and if garbage is created fast enough, it is very possible that the creation of garbage will outpace collection. Finalizers should be considered a supplement to a Free/Close method, not a replacement.

Related

Golang: are global variables protected from garbage collection?

I'm fairly new to Golang. I'm working on an application that builds an in-memory object-oriented data model (basically an ORM) to support the application functionality. I realize this isn't really idiomatic Go but it makes sense in this situation.
All my core objects are allocated on the heap then stored in global (though not necessarily exported) map structures that allow the code to look them up based on database IDs. Objects that reference instances of other objects have pointer fields in their structure definitions.
I was under the impression that any data that can be reached from a global variable is protected from being garbage collected. However, I am seeing intermittent cases of pointer references apparently becoming nil over time. If I restart the application, and rebuild the object model, then try the same operation, the problem disappears.
Is GC freeing my memory out from under me? Or should I look elsewhere to understand this problem? And if the answer to my first question is yes... how can I stop this from happening?
The garbage collector does not free memory as long as it is reachable. Global or package level variables are accessible during the whole lifetime of your app, so they can't be freed by the GC.
If you see the opposite, that is definitely a bug / mistake on your part (unless the Go runtime itself has a bug). For example you may have a data race initializing / accessing your global variables, or you (or some library you use) may use package unsafe or the uintptr type incorrectly. For example, quoting from unsafe.Pointer:
A uintptr is an integer, not a reference. Converting a Pointer to a uintptr creates an integer value with no pointer semantics. Even if a uintptr holds the address of some object, the garbage collector will not update that uintptr's value if the object moves, nor will that uintptr keep the object from being reclaimed.
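As a rough illustration of the reachability rule, the sketch below stores objects in a package-level map, forces a collection, and reads them back. The object type, the registry map, and the helper names are all hypothetical and only stand in for the data model described in the question.

```go
package main

import (
	"fmt"
	"runtime"
)

// object is a hypothetical model type used only for this illustration.
type object struct {
	ID     int
	Parent *object
}

// registry is a package-level variable, so everything reachable from it
// stays alive for the whole lifetime of the program.
var registry = map[int]*object{}

// load builds a tiny object graph and stores it in the global map.
func load() {
	root := &object{ID: 1}
	registry[1] = root
	registry[2] = &object{ID: 2, Parent: root}
}

// lookupAfterGC forces a garbage collection and then looks the object up.
func lookupAfterGC(id int) *object {
	runtime.GC()
	return registry[id]
}

func main() {
	load()
	child := lookupAfterGC(2)
	fmt.Println(child.ID, child.Parent.ID) // prints "2 1"
}
```

If pointers in such a model do turn nil over time, the cause lies elsewhere, for example in unsynchronized concurrent writes to the map or its objects, which running the program with the race detector (go run -race) can help find.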

What happens when the raw pointer from shared_ptr get() is deleted?

I wrote some code like this:
shared_ptr<int> r = make_shared<int>();
int *ar = r.get();
delete ar; // report double free or corruption
// still some code
When execution reached delete ar;, the program crashed and reported "double free or corruption". I'm confused: why a double free? "r" is still in scope and has not been popped off the stack. Does the delete operator do something magic? Does it know that the raw pointer is currently managed by a smart pointer, so that the counter in "r" is automatically decremented to zero?
I know this operation is not recommended, but I want to know why it fails.
You are deleting a pointer that didn't come from new, so you have undefined behavior (anything can happen).
From cppreference on delete:
For the first (non-array) form, expression must be a pointer to an object type or a class type contextually implicitly convertible to such pointer, and its value must be either null or pointer to a non-array object created by a new-expression, or a pointer to a base subobject of a non-array object created by a new-expression. If expression is anything else, including if it is a pointer obtained by the array form of new-expression, the behavior is undefined.
If the allocation is done by new, we can be sure that the pointer we have is something we can use delete on. But in the case of shared_ptr.get(), we cannot be sure if we can use delete because it might not be the actual pointer returned by new.
shared_ptr<int> r = make_shared<int>();
There is no guarantee that this will call new int (which isn't strictly observable by the user anyway) or more generally new T (which is observable with a user-defined, class-specific operator new); in practice it won't, although there is no guarantee of that either.
The discussion that follows isn't just about shared_ptr, but about "smart pointers" with ownership semantics. For any owning smart pointer smart_owning:
The primary motivation for make_owning instead of smart_owning<T>(new T) is to avoid having a memory allocation without an owner at any time. That was essential when the order of evaluation of expressions did not guarantee that each argument in an argument list was fully evaluated before evaluation of the next one began. Historically in C++:
f (smart_owning<T>(new T), smart_owning<U>(new U));
could be evaluated as:
T *temp1 = new T;
U *temp2 = new U;
auto &&temp3 = smart_owning<T>(temp1);
auto &&temp4 = smart_owning<U>(temp2);
This way temp1 and temp2 are not managed by any owning object for a non trivial time:
obviously new U can throw an exception
constructing an owning smart pointer usually requires the allocation of (small) resources and can throw
So either temp1 or temp2 could be leaked (but not both) if an exception was thrown, which was the exact problem we were trying to avoid in the first place. This means composite expressions involving the construction of owning smart pointers were a bad idea; this is fine:
auto &&temp_t = smart_owning<T>(new T);
auto &&temp_u = smart_owning<U>(new U);
f (temp_t, temp_u);
Usually, expressions involving as many sub-expressions with function calls as f (smart_owning<T>(new T), smart_owning<U>(new U)) are considered reasonable (it's a pretty simple expression in terms of the number of sub-expressions). Disallowing such expressions is quite annoying and very difficult to justify.
[This is one reason, and in my opinion the most compelling reason, why the non-determinism of the order of evaluation was removed by the C++ standardisation committee, so that such code is now safe. (This was an issue not just for allocated memory, but for any managed resource, like file descriptors, database handles...)]
Because code frequently needed to do things such as smart_owning<T>(allocate_T()) in sub-expressions, and because telling programmers to decompose moderately complex expressions involving allocation into many simple lines wasn't appealing (more lines of code doesn't mean easier to read), the library writers provided a simple fix: a function that creates an object with dynamic lifetime and its owning object together. That solved the order of evaluation problem (but was complicated at first because it needed perfect forwarding of the constructor arguments).
Giving two tasks to a function (allocating an instance of T and an instance of smart_owning) gives the freedom to do an interesting optimization: you can avoid one dynamic allocation by putting the managed object and its owner next to each other.
But once again, that was not the primary purpose of functions like make_shared.
Because exclusive-ownership smart pointers by definition don't need to keep a reference count, and by definition don't need to share the data needed for the deleter between instances, they can keep that data in the "smart pointer"(*), so no additional allocation is needed for the construction of unique_ptr; yet a make_unique function template was added, to avoid the ownerless-allocation issue described above, not to optimize a non-thing (an allocation that isn't done in the first place).
(*) which BTW means unique owner "smart pointers" do not have pointer semantics, as pointer semantics imply that you can make copies of the "pointer", and you can't have two copies of a unique owner pointing to the same instance; "smart pointers" were never pointers anyway, the term is misleading.
Summary:
make_shared<T> does an optional optimization where there is no separate dynamic memory allocation for T: there is no operator new(sizeof (T)). There is obviously still the creation of an instance with dynamic lifetime with another operator new: placement new.
If you replace the explicit memory deallocation with an explicit destruction and add a pause immediately after that point:
class C {
public:
~C();
};
shared_ptr<C> r = make_shared<C>();
C *ar = r.get();
ar->~C();
pause(); // stops the program forever
The program will probably run fine; it is still illogical, indefensible, incorrect to explicitly destroy an object managed by a smart pointer. It isn't "your" resource. If pause() could exit with an exception, the owning smart pointer would try to destroy the managed object which doesn't even exist anymore.
It of course depends on how the library implements make_shared; however, the most probable implementation is this:
std::make_shared allocates one block for two things:
the shared pointer control block
the contained object
std::make_shared() invokes the memory allocator once and then uses placement new twice to initialize (call the constructors of) those two things.
|           block requested from allocator           |
| shared_ptr control block |        X object         |
#1                         #2                        #3
That means the memory allocator has provided one big block, whose address is #1.
The shared pointer then uses it for the control block (#1) and the actual contained object (#2).
When you invoke delete on the actual object kept by shared_ptr (.get()), you call delete(#2).
Because #2 is not known to the allocator, you get a corruption error.
See here. I quote:
std::shared_ptr is a smart pointer that retains shared ownership of an object through a pointer. Several shared_ptr objects may own the same object. The object is destroyed and its memory deallocated when either of the following happens:
the last remaining shared_ptr owning the object is destroyed;
the last remaining shared_ptr owning the object is assigned another pointer via operator= or reset().
The object is destroyed using delete-expression or a custom deleter that is supplied to shared_ptr during construction.
So the pointer is deleted by shared_ptr. You're not supposed to delete the stored pointer yourself.
UPDATE:
I didn't realize that there were more statements and the pointer was not out of scope, I'm sorry.
I was reading more and the standard doesn't say much about the behavior of get() but here is a note, I quote:
A shared_ptr may share ownership of an object while storing a pointer to another object. get() returns the stored pointer, not the managed pointer.
So it looks like the pointer returned by get() is not necessarily the same pointer allocated by the shared_ptr (presumably using new). So deleting that pointer is undefined behavior. I will be looking a little more into the details.
UPDATE 2:
The standard says at § 20.7.2.2.6 (about make_shared):
6 Remarks: Implementations are encouraged, but not required, to perform no more than one memory allocation. [ Note: This provides efficiency equivalent to an intrusive smart pointer. — end note ]
7 [ Note: These functions will typically allocate more memory than sizeof(T) to allow for internal bookkeeping structures such as the reference counts. — end note ]
So a specific implementation of make_shared could allocate a single chunk of memory (or more) and use part of that memory to initialize the stored pointer (but maybe not all the memory allocated). get() must return a pointer to the stored object, but, as previously said, the standard does not require that the pointer returned by get() be the one allocated by new. So deleting that pointer is undefined behavior; you got a signal raised, but anything can happen.

Why finalizer is never called?

var p = &sync.Pool{
	New: func() interface{} {
		return &serveconn{}
	},
}
func newServeConn() *serveconn {
	sc := p.Get().(*serveconn)
	runtime.SetFinalizer(sc, (*serveconn).finalize)
	fmt.Println(sc, "SetFinalizer")
	return sc
}
func (sc *serveconn) finalize() {
	fmt.Println(sc, "finalize")
	*sc = serveconn{}
	runtime.SetFinalizer(sc, nil)
	p.Put(sc)
}
The above code tries to reuse objects through SetFinalizer, but after debugging I found that the finalizer is never called. Why?
UPDATE
This may be related: https://github.com/golang/go/issues/2368
The finalizer is only called on an object when the GC marks it as unused and then tries to sweep (free) it at the end of the GC cycle.
As a corollary, if a GC cycle is never performed during the runtime of your program, the finalizers you set may never be called.
Just in case you hold a wrong assumption about Go's GC, it may be worth noting that Go does not employ reference counting on values; instead, it uses a GC which works in parallel with the program, and the sessions during which it works happen periodically and are triggered by certain parameters, like the pressure on the heap produced by allocations.
A couple of assorted notes regarding finalizers:
When the program terminates, no GC is forcibly run. A corollary of this is that a finalizer is not guaranteed to run at all.
If the GC finds a finalizer on an object about to be freed, it calls the finalizer but does not free the object. The object itself will be freed only at the next GC cycle, wasting the memory in the meantime.
All in all, you appear to be trying to implement destructors.
Please don't: make your objects implement the sort-of standard method called Close and state in the contract of your type that the programmer is required to call it when they're done with the object.
When a programmer wants to call such a method no matter what, they use defer.
Note that this approach works perfectly for all types in the Go stdlib which wrap resources provided by the OS, such as file and socket descriptors. So there is no need to pretend your types are somehow different.
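A minimal sketch of that approach applied to the pooled type from the question, under the assumption that the goal is simply to recycle serveconn values: the id field, the pool variable name, and the handle function are hypothetical and exist only to make the example runnable.

```go
package main

import (
	"fmt"
	"sync"
)

// serveconn mirrors the type name from the question; its field is hypothetical.
type serveconn struct {
	id int
}

var pool = sync.Pool{
	New: func() interface{} { return &serveconn{} },
}

// newServeConn takes a connection object from the pool and initializes it.
func newServeConn(id int) *serveconn {
	sc := pool.Get().(*serveconn)
	sc.id = id
	return sc
}

// Close resets the connection and returns it to the pool.
// The contract: call Close exactly once and do not use sc afterwards.
func (sc *serveconn) Close() error {
	*sc = serveconn{}
	pool.Put(sc)
	return nil
}

// handle shows the usual calling pattern: acquire, defer Close, use.
func handle(id int) int {
	sc := newServeConn(id)
	defer sc.Close() // runs when handle returns, even if it panics
	return sc.id
}

func main() {
	fmt.Println(handle(7)) // prints "7"
}
```

Here the object goes back to the pool at a point the programmer controls, rather than whenever (or if) the GC happens to run a finalizer.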
Another useful thing to keep in mind is that Go was explicitly engineered to be a no-nonsense, no-frills, no-magic, in-your-face language, and you're just trying to add magic to it.
Please don't; those who like deciphering layers of magic program in different languages.

Do I have to free structs created with Cgo?

I create C structs in my Go code, like this:
var data C.MyStruct_t
Do I have to free them manually at some point, like I do when I use CString? With CString I often do something like:
ctitle := C.CString(title)
defer C.free(unsafe.Pointer(ctitle))
C.my_func(&ctitle)
No. You only call free on something that was allocated via the C *alloc functions. The C.CString and C.CBytes functions are documented as doing so internally, and requiring the use of C.free.
In this case, even though data is of type C.MyStruct_t, it is allocated in Go and will therefore be handled by the Go garbage collector.

D Dynamic Arrays - RAII

I admit I have no deep understanding of D at this point, my knowledge relies purely on what documentation I have read and the few examples I have tried.
In C++ you could rely on the RAII idiom to call the destructor of objects on exiting their local scope.
Can you in D?
I understand D is a garbage collected language, and that it also supports RAII.
Why, then, does the following code not clean up the memory when it leaves the scope?
import std.stdio;
void main() {
    {
        const int len = 1000 * 1000 * 256; // ~1GiB
        int[] arr;
        arr.length = len;
        arr[] = 99;
    }
    while (true) {}
}
The infinite loop is there to keep the program open so that residual memory allocations are easily visible.
A comparison with an equivalent program in C++ is shown below.
It can be seen that C++ immediately cleaned up the memory after allocation (the refresh rate makes it appear as if less memory was allocated), whereas D kept it even though it had left scope.
Therefore, when does the GC clean up?
scope declarations are going away in D2, so I'm not terribly certain about the semantics, but what I'd imagine is happening is that scope T[] a; only allocates the array struct on the stack (which, needless to say, already happens regardless of scope). Since they are going away, don't use scope (scope(exit) and friends are different; keep using them).
Dynamic arrays always use the GC to allocate their memory -- there's no getting around that. If you want something more deterministic, using std.container.Array would be the simplest manner, as I think you could pretty much drop it in where your scope vector3b array is:
Array!vector3b array;
Just don't bother setting the length to zero; the memory will be freed once it goes out of scope (Array uses malloc/free from libc under the hood).
No, you cannot assume that the garbage collector will collect your object at any point in time.
There is, however, a delete keyword (as well as a scope keyword) that can delete an object deterministically.
scope is used like:
{
    scope auto obj = new int[5];
    //....
} //obj cleaned up here
and delete is used like in C++ (there's no [] notation for delete).
There are some gotchas, though:
It doesn't always work properly (I hear it doesn't work well with arrays)
The developers of D (e.g. Andrei) are intending to remove them in later versions, because it can obviously mess up things if used incorrectly. (I personally hate this, given that it's so easy to screw things up anyway, but they're sticking with removing it, and I don't think people can convince them otherwise although I'd love it if that was the case.)
In its place, there is already a clear method that you can use, like arr.clear(); however, I'm not quite sure what it exactly does yet myself, but you could look at the source code in object.d in the D runtime if you're interested.
As to your amazement: I'm glad you're amazed, but it shouldn't be really surprising considering that they're both native code. :-)

Resources