How to "pass a Go pointer to Cgo"? - go

I am confused about passing Go pointers (which, to my understanding, include all pointer types as well as unsafe.Pointer) to cgo. When calling C functions with cgo, I can only provide variables of types known on the C side, or unsafe.Pointer if it matches a void*-typed parameter in the C function's signature. So when "Go pointers passed to C are pinned for lifetime of call", how does Go know that what I am passing is, in fact, a Go pointer, if I am forced to cast it to C.some_wide_enough_uint_type or C.some_c_pointer_type beforehand? The moment it is cast, isn't the information that it is a Go pointer lost, so that I run the risk of the GC changing the pointer? (I can see how freeing is prevented, at least, when a pointer-typed reference is retained on the Go side.)
We have a project with a fair amount of working cgo code, but zero confidence in its reliability. I would like to see an example of "here is how to do it correctly" which doesn't resort to circumventing Go's memory model by using C.malloc() or such, which most examples unfortunately do.
So regardless of what "pinning the pointer for lifetime of call" actually means, I see a problem either way:
If it means that Go will pin all pointers in the entire program, I see a race condition in the time interval between casting a Go pointer to a C-type and the cgo-call actually being invoked.
If it means that Go will pin only those Go pointers which are being passed, how does it know that they are Go pointers when, at the time of calling, they can only have a C-type?
I've been reading through Go issues for half the day and am starting to feel like I'm just missing something simple. Any pointers are appreciated.
EDIT: I will try to clarify the question by providing examples.
Consider this:
/*
#include <stdio.h>
void myCFunc(void* ptr) {
printf((char*)ptr);
}
*/
import "C"
import "unsafe"
func callMyCFunc() {
goPointer := []byte("abc123\n\x00")
C.myCFunc(unsafe.Pointer(&goPointer[0]))
}
Here, Go's unsafe.Pointer-type effortlessly translates into C's void*-type, so we are happy on the C-side of things, and we should be on the Go-side also: the pointer clearly points into Go-allocated memory, so it should be trivial for Go to figure out that it should pin this pointer during the call, despite it being an unsafe one. Is this the case? If it is, without further research, I would consider this to be the preferred way to pass Go pointers to cgo. Is it?
Then, consider this:
/*
#include <stdio.h>
void myCFunc(unsigned long long int stupidlyTypedPointerVariable) {
char* pointerToHopefullyStillTheSameMemory = (char*)stupidlyTypedPointerVariable;
printf(pointerToHopefullyStillTheSameMemory);
}
*/
import "C"
import "unsafe"
func callMyCFunc() {
goPointer := []byte("abc123\n\x00")
C.myCFunc(C.ulonglong(uintptr(unsafe.Pointer(&goPointer[0]))))
}
Here, I would expect that Go won't make any guesses on whether some C.ulonglong-typed variable actually means to contain the address of a Go pointer. But am I correct?
My confusion largely arises from the fact that it's not really possible to write some code to reliably test this with.
Finally, what about this:
/*
#include <stdio.h>
void cFuncOverWhichIHaveNoControl(char* ptr) {
printf(ptr);
}
*/
import "C"
import "unsafe"
func callMyCFunc() {
goPointer := []byte("abc123\n\x00")
C.cFuncOverWhichIHaveNoControl((*C.char)(unsafe.Pointer(&goPointer[0])))
}
If I am, for whatever reason, unable to change the signature of the C-function, I must cast to *C.char. Will Go still check if the value is a Go pointer, when it already is a C pointer-type?

Looking at the section on passing pointers in the current cgo documentation (thanks to peterSO), we find that
the term Go pointer means a pointer to memory allocated by Go
as well as that
A pointer type may hold a Go pointer or a C pointer
Thus, using uintptr and other integer (read: non-pointer) types will lose us Go's guarantee of pinning the pointer.
A uintptr is an integer, not a reference. Converting a Pointer to a uintptr creates an integer value with no pointer semantics. Even if a uintptr holds the address of some object, the garbage collector will not update that uintptr's value if the object moves, nor will that uintptr keep the object from being reclaimed.
Source: https://golang.org/pkg/unsafe/#Pointer
Regarding C pointer types such as *char/*C.char, these are only safe when the pointed data does not itself contain pointers to other memory allocated by Go. This can actually be shown by trying to trigger Go's Cgo Debug mechanism, which disallows passing a Go pointer to (or into) a value which itself contains another Go pointer:
package main
import (
"fmt"
"unsafe"
/*
#include <stdio.h>
void cFuncChar(char* ptr) {
printf("%s\n", ptr);
}
void cFuncVoid(void* ptr) {
printf("%s\n", (char*)ptr);
}
*/
"C"
)
type MyStruct struct {
Distraction [2]byte
Dangerous *MyStruct
}
func main() {
bypassDetection()
triggerDetection()
}
func bypassDetection() {
fmt.Println("=== Bypass Detection ===")
ms := &MyStruct{[2]byte{'A', 0}, &MyStruct{[2]byte{0, 0}, nil}}
C.cFuncChar((*C.char)(unsafe.Pointer(ms)))
}
func triggerDetection() {
fmt.Println("=== Trigger Detection ===")
ms := &MyStruct{[2]byte{'B', 0}, &MyStruct{[2]byte{0, 0}, nil}}
C.cFuncVoid(unsafe.Pointer(ms))
}
This will print the following:
=== Bypass Detection ===
A
=== Trigger Detection ===
panic: runtime error: cgo argument has Go pointer to Go pointer
Using *C.char bypassed the detection. Only using unsafe.Pointer will detect Go pointer to Go pointer scenarios. Unfortunately, this means we will have to have an occasional nebulous void*-parameter in the C-function's signature.
Adding for clarity: Go may very well pin the value pointed by a *C.char or such, which is safe to pass; it just (reasonably) won't make an effort to find out whether it might be something else which could contain additional pointers into memory allocated by Go. Casting to unsafe.Pointer is actually safe; casting from it is what may be dangerous.

Related

runtime.SetFinalizer: could not determine kind of name for C.Char

Please consider this sample go code below:
package main
/*
#include <stdio.h>
#include <stdlib.h>
*/
import "C"
import (
"fmt"
"runtime"
"unsafe"
)
func main() {
// Convert Go string to C string using C.CString
cString := C.CString("Wold!")
fmt.Printf("C.CString type: %T\n", cString)
//C.free(unsafe.Pointer(cString)) // <-- this works, but I don't want to free it manually..
runtime.SetFinalizer(&cString, func(t *C.Char) {
C.free(unsafe.Pointer(t))
})
}
I am experimenting with cgo and trying to free cString. When I try to free my variable cString using runtime.SetFinalizer, I encounter:
$ go build a.go
# command-line-arguments
./a.go:22:41: could not determine kind of name for C.Char
Please point me in the right direction. Thanks!
When the cgo system is turning your wrapper into something the Go compiler understands, it has to translate each of the C types to a Go type for various purposes. It turns out this doesn't work for your case (this is the error you saw).
That's actually OK, because your code would never have worked the way you wanted in the first place. A runtime finalizer runs when Go's garbage collector is ready to release a Go object that occupies Go memory, but C.CString returns a pointer that is not Go memory. In particular, note the following quote from the cgo documentation:
// Go string to C string
// The C string is allocated in the C heap using malloc.
// It is the caller's responsibility to arrange for it to be
// freed, such as by calling C.free (be sure to include stdlib.h
// if C.free is needed).
func C.CString(string) *C.char
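Since the returned string is on the "C heap" it will never be finalized by the Go garbage collector. Had your code compiled, it still would not have freed the C string for you: &cString is the address of a Go variable that merely holds the C pointer, not the C allocation itself.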
If you have a Go object whose lifetime parallels that of the C object, you could perhaps use that. Here's a made-up (but working) example:
package main
/*
#include <stdio.h>
#include <stdlib.h>
*/
import "C"
import (
"fmt"
"runtime"
"time"
"unsafe"
)
type S struct {
Foo int
ToFree unsafe.Pointer
}
func main() {
doit()
runtime.GC()
time.Sleep(10 * time.Millisecond) // ugly hack
}
func doit() {
cString := C.CString("Wold!")
fmt.Printf("C.CString type: %T\n", cString)
x := &S{Foo: 1, ToFree: unsafe.Pointer(cString)}
runtime.SetFinalizer(x, func(t *S) {
fmt.Println("freeing C string")
C.free(t.ToFree)
})
}
When the allocated object for x goes out of scope it becomes eligible for GC. The actual GC may never happen, so I forced one with runtime.GC() in main. This triggers the finalizer:
$ ./cfree_example
C.CString type: *main._Ctype_char
freeing C string
The "ugly hack" is there because if main returns before the finalizer has finished printing the "freeing C string" message, the message is lost. In a real program you would not need this.

How to get a pointer to the underlying value of an interface{} in Go

I'm interfacing with C code in Go using cgo, and I need to call a C function with a pointer to the underlying value in an interface{} value. The value will be any of the atomic primitive types (not including complex64/complex128), or string.
I was hoping I'd be able to do something like this to get the address of ptr as an unsafe.Pointer:
unsafe.Pointer(reflect.ValueOf(ptr).UnsafeAddr())
But this results in a panic due to the value being unaddressable.
A similar question to this is Take address of value inside an interface, but this question is different, as in this case it is known that the value will always be one of the types specified above (which will be at most 64 bits), and I only need to give this value to a C function. Note that there are multiple C functions, and the one that will be called varies based off of a different unrelated parameter.
I also tried to solve this using a type switch statement, however I found myself unable to get the address of the values even after the type assertion was done. I was able to assign the values to temporary copies, then get the address of those copies, but I'd rather avoid making these copies if possible.
An interface{} value has its own struct layout in the runtime:
type eface struct {
typ *rtype
val unsafe.Pointer
}
You cannot access rtype directly or by linking, and even if you copy the whole rtype definition, it may change in a future release.
The point is that you can replace the pointer types with unsafe.Pointer (anything of the same size would do, but a pointer is more idiomatic, since every type has a corresponding pointer type):
type eface struct {
typ, val unsafe.Pointer
}
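So now we can get at the value contained in the eface. Note that this relies on runtime internals and writes through a pointer the runtime owns, so treat it as an illustration rather than a safe technique: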
func some_func(arg interface{}) {
passed_value := (*eface)(unsafe.Pointer(&arg)).val
*(*byte)(passed_value) = 'b'
}
some_var := byte('a')
fmt.Println(string(some_var)) // 'a'
some_func(some_var)
fmt.Println(string(some_var)) // 'a': unchanged, because some_func received a copy
some_func(&some_var)
fmt.Println(string(some_var)) // 'b'
You also might see some more usages at my repo:
https://github.com/LaevusDexter/fast-cast
Sorry for my poor English.
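For comparison, the type-switch approach mentioned in the question can be sketched without depending on the runtime's interface layout. This is only an illustration under the assumption that copying a value of at most 64 bits is acceptable; valuePointer is a hypothetical helper, it covers only a few of the primitive types, and it leaves out string.

```go
package main

import (
	"fmt"
	"unsafe"
)

// valuePointer returns a pointer to a copy of the primitive held in v,
// together with the size of that value in bytes. The bound variable x in
// each case is a fresh, addressable variable of the concrete type.
func valuePointer(v interface{}) (unsafe.Pointer, uintptr, bool) {
	switch x := v.(type) {
	case uint8:
		return unsafe.Pointer(&x), unsafe.Sizeof(x), true
	case int32:
		return unsafe.Pointer(&x), unsafe.Sizeof(x), true
	case int64:
		return unsafe.Pointer(&x), unsafe.Sizeof(x), true
	case float64:
		return unsafe.Pointer(&x), unsafe.Sizeof(x), true
	default:
		return nil, 0, false // unsupported type
	}
}

func main() {
	p, n, ok := valuePointer(int32(7))
	fmt.Println(*(*int32)(p), n, ok) // prints 7 4 true
}
```

The returned pointer refers to ordinary Go memory with no nested pointers, so it could be passed to a void* parameter for the duration of a cgo call.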

Why is unsafe.Sizeof considered unsafe?

Consider the following:
import (
"log"
"unsafe"
)
type Foo struct {
Bar int32
}
func main() {
log.Println(int(unsafe.Sizeof(Foo{})))
}
Why is determining the size of a variable considered unsafe, and a part of the unsafe package? I don't understand why obtaining the size of any type is an unsafe operation, or what mechanism Go uses to determine its size that necessitates this.
I would also love to know if there are any alternatives to the unsafe package for determining size of a known struct.
Because in Go if you need to call sizeof, it generally means you're manipulating memory directly, and you should never need to do that.
If you come from the C world, you'll probably most often have used sizeof together with malloc to create a variable-length array - but this should not be needed in Go, where you can simply make([]Foo, 10). In Go, the amount of memory to be allocated is taken care of by the runtime.
You should not be afraid of calling unsafe.Sizeof where it really makes sense - but you should ask yourself whether you actually need it.
Even if you're using it for, say, writing a binary format, it's generally a good idea to calculate the number of bytes you need yourself, or, if anything, to derive it dynamically using reflect:
Calling unsafe.Sizeof on a struct will also include the number of bytes added in for padding.
Calling it on dynamically sized structures (i.e. slices, strings) will yield the size of their headers; you should call len() instead.
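A small sketch of the padding point, using a hypothetical Record type: unsafe.Sizeof reports the in-memory size including padding, while encoding/binary's Size reports how many bytes the fields occupy when written out.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// Record is a hypothetical struct: one byte followed by a 4-byte field,
// so the compiler inserts 3 bytes of padding after Flag.
type Record struct {
	Flag  uint8
	Value uint32
}

// recordSizes returns the in-memory size and the encoded size of Record.
func recordSizes() (inMemory, encoded int) {
	var r Record
	return int(unsafe.Sizeof(r)), binary.Size(r)
}

func main() {
	inMemory, encoded := recordSizes()
	fmt.Println(inMemory, encoded) // prints 8 5

	// For a slice, unsafe.Sizeof measures the slice header, not the data.
	payload := make([]byte, 100)
	fmt.Println(len(payload), unsafe.Sizeof(payload) < uintptr(len(payload))) // prints 100 true
}
```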
Using unsafe on a uintptr, int or uint to determine whether you're running on 32-bit or 64-bit? You can generally avoid that by specifying int64 where you actually need to support numbers bigger than 2^31. Or, if you really need to detect that, you have many other options, such as build tags or something like this:
package main
import (
"fmt"
)
const is32bit = ^uint(0) == (1 << 32) - 1
func main() {
fmt.Println(is32bit)
}
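Another option, sketched here, is to read the word size from constants the standard library already exposes: math/bits.UintSize and strconv.IntSize. The wrapper functions wordBits and intBits are hypothetical and only make the values easy to check.

```go
package main

import (
	"fmt"
	"math/bits"
	"strconv"
)

// wordBits reports the size of a uint in bits (32 or 64).
func wordBits() int { return bits.UintSize }

// intBits reports the size of an int in bits; it matches wordBits.
func intBits() int { return strconv.IntSize }

func main() {
	fmt.Println(wordBits() == 32, wordBits() == intBits())
}
```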
From the looks of the unsafe package, its functions don't rely on Go's type safety for their operations.
https://godoc.org/unsafe
Package unsafe contains operations that step around the type safety of
Go programs.
Packages that import unsafe may be non-portable and are not protected
by the Go 1 compatibility guidelines.
So from the sounds of it, the unsafe-ness is in the kind of code being provided, not necessarily in calling this function in particular.
Go is a type safe programming language. It won't let you do stuff like this:
package main
type Foo = struct{ A string }
type Bar = struct{ B int }
func main() {
var foo = &Foo{A: "Foo"}
var bar = foo.(*Bar) // invalid operation!
var bar2, ok = foo.(*Bar) // invalid operation!
}
Even if you use the type assertion with the special form that yields an additional boolean value; the compiler goes: haha, nope.
In a programming language like C though, the default is to assume that you are in charge. The program below will compile just fine.
typedef struct foo {
const char* a_;
} foo;
typedef struct bar {
int b_;
} bar;
int main() {
foo f;
f.a_ = "Foo";
bar* b = &f; // warning: incompatible pointer types
bar* b2 = (bar*)&f;
return 0;
}
You get warnings for things that are probably wrong, because people have learned over time that this is a common mistake, but the compiler is not stopping you. It's just emitting a warning.
Type safety just means that you can't make the same mistake C programmers have made a thousand times over already, but it is neither unsafe nor wrong to use the unsafe package or the C programming language. The unsafe package has just been named in opposition to type safety, and it is precisely the right tool when you need to fiddle with the bits (that is, directly manipulate the representation of things in memory).
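To illustrate what "fiddling with the bits" can look like, this sketch reinterprets a float32 as a uint32 through unsafe.Pointer and compares the result with the type-safe math.Float32bits. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"unsafe"
)

// bitsViaUnsafe reinterprets the memory of f as a uint32. Both types are
// 4 bytes wide, which is what makes this conversion valid.
func bitsViaUnsafe(f float32) uint32 {
	return *(*uint32)(unsafe.Pointer(&f))
}

// bitsViaMath obtains the same bit pattern without package unsafe.
func bitsViaMath(f float32) uint32 {
	return math.Float32bits(f)
}

func main() {
	fmt.Printf("%#x %#x\n", bitsViaUnsafe(1), bitsViaMath(1)) // prints 0x3f800000 0x3f800000
}
```

The compiler accepts the unsafe version without checking that the two types are compatible; that missing check is the "unsafe" part.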

How can I fill out void* C pointer in Go?

I am trying to interface with some C code from Go. Using cgo, this has been relatively straight-forward until I hit this (fairly common) case: needing to pass a pointer to a structure that itself contains a pointer to some data. I cannot seem to figure out how to do this from Go without resorting to putting the creation of the structure into the C code itself, which I'd prefer not to do. Here is a snippet that illustrates the problem:
package main
// typedef struct {
// int size;
// void *data;
// } info;
//
// void test(info *infoPtr) {
// // Do something here...
// }
import "C"
import "unsafe"
func main() {
var data uint8 = 5
info := &C.info{size: C.int(unsafe.Sizeof(data)), data: unsafe.Pointer(&data)}
C.test(info)
}
While this compiles fine, trying to run it results in:
panic: runtime error: cgo argument has Go pointer to Go pointer
In my case, the data being passed to the C call doesn't persist past the call (i.e. the C code in question digs into the structure, copies what it needs, then returns).
See "Passing pointers" section in cgo docs:
Go code may pass a Go pointer to C provided the Go memory to which it points does not contain any Go pointers.
And also:
These rules are checked dynamically at runtime. The checking is controlled by the cgocheck setting of the GODEBUG environment variable. The default setting is GODEBUG=cgocheck=1, which implements reasonably cheap dynamic checks. These checks may be disabled entirely using GODEBUG=cgocheck=0. Complete checking of pointer handling, at some cost in run time, is available via GODEBUG=cgocheck=2.
If you run the snippet you've provided with:
GODEBUG=cgocheck=0 go run snippet.go
Then there is no panic. However, the correct way to go is to use C.malloc (or obtain a "C pointer" from somewhere else):
package main
// #include <stdlib.h>
// typedef struct {
// int size;
// void *data;
// } info;
//
// void test(info *infoPtr) {
// // Do something here...
// }
import "C"
import "unsafe"
func main() {
var data uint8 = 5
cdata := C.malloc(C.size_t(unsafe.Sizeof(data)))
*(*C.char)(cdata) = C.char(data)
defer C.free(cdata)
info := &C.info{size: C.int(unsafe.Sizeof(data)), data: cdata}
C.test(info)
}
It works because while regular Go pointers are not allowed, C.malloc returns a "C pointer":
Go pointer means a pointer to memory allocated by Go (such as by using the & operator or calling the predefined new function) and the term C pointer means a pointer to memory allocated by C (such as by a call to C.malloc). Whether a pointer is a Go pointer or a C pointer is a dynamic property determined by how the memory was allocated.
Note that you need to include stdlib.h to use C.free.

Does assigning value to interface copy anything?

I've been trying to wrap my head around the concept of interfaces in Go. Reading this and this helped a lot.
The only thing that makes me uncomfortable is the syntax. Have a look at the example below:
package main
import "fmt"
type Interface interface {
String() string
}
type Implementation int
func (v Implementation) String() string {
return fmt.Sprintf("Hello %d", v)
}
func main() {
var i Interface
impl := Implementation(42)
i = impl
fmt.Println(i.String())
}
My issue is with i = impl. Based on the fact that an interface instance actually holds a pointer reference to the actual data, it would feel more natural for me to do i = &impl. Usually assignment of a non-pointer without & will make a full memory copy of the data, but when assigning to interfaces this seems to be side-stepped, and instead the pointer is simply (behind the scenes) assigned to the interface value. Am I right? That is, the data for the int(42) will not be copied in memory?
The data for int(42) will be copied. Try this code:
func main() {
var i Interface
impl := Implementation(42)
i = impl
fmt.Println(i.String())
impl = Implementation(91)
fmt.Println(i.String())
}
You'll find that the second i.String() still shows 42. Perhaps one of the trickier aspects of Go is that method receivers can be pointers as well.
func (v *Implementation) String() string {
return fmt.Sprintf("Hello %d", *v)
}
// ...
i = &impl
This is what you want if the interface should hold a pointer to the original value of impl. "Under the hood" an interface is a struct that holds a pointer to some data (and some type metadata that we can ignore for our purposes). Older Go releases stored the data itself in the interface when it fit in one machine word; since Go 1.4 the data word always holds a pointer, so only pointer-shaped values are stored directly and everything else is referenced through a pointer to a copy.
Here's the tricky part: if the type implementing the interface is a struct (or any other non-pointer value), the pointer will be to a copy of the value, not to the variable that was assigned to the interface. Or at least semantically the user can think of it as such; optimizations may allow the value to not be copied until the two diverge (e.g. until you call String or reassign impl).
In short: assigning to an interface can semantically be thought of as a copy of the data that implements the interface. If this is a pointer to a type, it copies the pointer, if it's a big struct, it copies the big struct. The particulars of interfaces using pointers under the hood are for reasons of garbage collection and making sure the stack expands by predictable amounts. As far as the developer is concerned, they should be thought of as semantic copies of the specific instance of the implementing type assigned.
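The copy semantics can be seen with a struct as well. In this sketch (Point and describe are hypothetical names), one interface value holds a copy of the struct and another holds a pointer to it, and the original is modified afterwards.

```go
package main

import "fmt"

// Point is a hypothetical struct with a value-receiver method, so both
// Point and *Point satisfy fmt.Stringer.
type Point struct{ X, Y int }

func (p Point) String() string { return fmt.Sprintf("(%d,%d)", p.X, p.Y) }

// describe assigns p to one interface by value and to another by pointer,
// then changes p and reports what each interface sees.
func describe() (byValue, byPointer string) {
	p := Point{1, 2}
	var v fmt.Stringer = p  // holds a copy of p
	var r fmt.Stringer = &p // holds a pointer to p
	p.X = 10
	return v.String(), r.String()
}

func main() {
	fmt.Println(describe()) // prints (1,2) (10,2)
}
```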
