Why is unsafe.Sizeof considered unsafe? - go

Consider the following:
package main

import (
	"log"
	"unsafe"
)

type Foo struct {
	Bar int32
}

func main() {
	log.Println(int(unsafe.Sizeof(Foo{})))
}
Why is determining the size of a variable considered unsafe, and part of the unsafe package? I don't understand why obtaining the size of any type is an unsafe operation, or what mechanism Go uses to determine its size that necessitates this.
I would also love to know if there are any alternatives to the unsafe package for determining size of a known struct.

Because in Go if you need to call sizeof, it generally means you're manipulating memory directly, and you should never need to do that.
If you come from the C world, you'll probably most often have used sizeof together with malloc to create a variable-length array - but this should not be needed in Go, where you can simply make([]Foo, 10). In Go, the amount of memory to be allocated is taken care of by the runtime.
You should not be afraid of calling unsafe.Sizeof where it really makes sense - but you should ask yourself whether you actually need it.
Even if you're using it for, say, writing a binary format, it's generally a good idea to calculate the number of bytes you need yourself, or, if anything, to generate it dynamically using reflect:
calling unsafe.Sizeof on a struct will also include the number of bytes added in for padding.
calling it on dynamically-sized values (i.e. slices, strings) will yield the size of their headers - to get the element count, call len() instead.
Using unsafe.Sizeof on a uintptr, int or uint to determine whether you're running on 32-bit or 64-bit? You can generally avoid that by specifying int64 where you actually need to support numbers bigger than 2^31. Or, if you really need to detect the word size, you have many other options, such as build tags, or something like this:
package main

import (
	"fmt"
)

// is32bit is true when the platform's uint is 32 bits wide.
const is32bit = ^uint(0) == (1<<32)-1

func main() {
	fmt.Println(is32bit)
}

From the looks of the unsafe package, the operations there don't go through Go's type safety.
https://godoc.org/unsafe
Package unsafe contains operations that step around the type safety of
Go programs.
Packages that import unsafe may be non-portable and are not protected
by the Go 1 compatibility guidelines.
So, from the sounds of it, the unsafe-ness is in the kind of code these operations enable, not necessarily in any particular call itself.

Go is a type safe programming language. It won't let you do stuff like this:
package main

type Foo = struct{ A string }
type Bar = struct{ B int }

func main() {
	var foo = &Foo{A: "Foo"}
	var bar = foo.(*Bar)      // invalid operation!
	var bar2, ok = foo.(*Bar) // invalid operation!
}
Even if you use the type assertion in the special form that yields an additional boolean value, the compiler goes: haha, nope.
In a programming language like C though, the default is to assume that you are in charge. The program below will compile just fine.
typedef struct foo {
	const char* a_;
} foo;

typedef struct bar {
	int b_;
} bar;

int main() {
	foo f;
	f.a_ = "Foo";
	bar* b = &f; // warning: incompatible pointer types
	bar* b2 = (bar*)&f;
	return 0;
}
You get warnings for things that are probably wrong because people have learned over time that this is a common mistake but it's not stopping you. It's just emitting a warning.
Type safety just means that you can't make the same mistake C programmers have made a thousand times over already, but it is neither unsafe nor wrong to use the unsafe package or the C programming language. The unsafe package has just been named in opposition to type safety, and it is precisely the right tool when you need to fiddle with the bits (manipulate the representation of things in memory, directly).

Related

How to convert a struct to a different struct with fewer fields

I am trying to copy a struct of type Big to type Small without explicitly creating a new struct of type Small with the same fields.
I have tried searching for other similar problems such as this and this yet all the conversions between different struct types happen only if the structs have the same fields.
Here is an example of what I tried to do:
// Big has all the fields that Small has, plus some new ones.
type Big struct {
	A int
	B string
	C float64
	D byte
}

type Small struct {
	A int
	B string
}

// This is the current solution, which I hope to avoid.
func ConvertFromBigToSmall(big Big) Small {
	return Small{
		A: big.A,
		B: big.B,
	}
}
I expected to be able to do something like this, yet it does not work:
big := Big{}
small := Small(big)
Is there a way of converting between Big to Small (and maybe even vice-versa) without using a Convert function?
There is no built-in support for this. If you really need this, you could write a general function which uses reflection to copy the fields.
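Such a general reflection-based copier might look like this (a sketch: it copies identically named, identically typed fields and silently skips the rest; float64 stands in for the question's float, which is not a valid Go type):

```go
package main

import (
	"fmt"
	"reflect"
)

type Big struct {
	A int
	B string
	C float64
	D byte
}

type Small struct {
	A int
	B string
}

// copyFields copies every field of src that dst also has, matched by
// name and type. dst must be a pointer to a struct.
func copyFields(dst, src interface{}) {
	d := reflect.ValueOf(dst).Elem()
	s := reflect.ValueOf(src)
	for i := 0; i < s.NumField(); i++ {
		name := s.Type().Field(i).Name
		if f := d.FieldByName(name); f.IsValid() && f.Type() == s.Field(i).Type() {
			f.Set(s.Field(i))
		}
	}
}

func main() {
	big := Big{A: 1, B: "x", C: 2.5, D: 7}
	var small Small
	copyFields(&small, big)
	fmt.Printf("%+v\n", small) // {A:1 B:x}
}
```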
Or you could redesign. If Big is a Small plus some other, additional fields, why not reuse Small in Big?
type Small struct {
	A int
	B string
}

type Big struct {
	S Small
	C float64
	D byte
}
Then if you have a Big struct, you also have a Small: Big.S. If you have a Small and you need a Big: Big{S: small}.
If you worry about losing the convenience of shorter field names, or different marshalled results, then use embedding instead of a named field:
type Big struct {
	Small // embedding
	C     float64
	D     byte
}
Then these are also valid: Big.A, Big.B. But if you need a Small value, you can refer to the embedded field using the unqualified type name as the field name, e.g. Big.Small (see Golang embedded struct type). Similarly, to create a Big from a Small: Big{Small: small}.
Is there a way of converting between Big to Small (and maybe even vice-versa) without using a Convert function?
The only option is to do it manually, as you have done. Whether you wrap that in a function or not, is a matter of taste/circumstance.
You can do something like this:
package main

import (
	"fmt"
)

type Big struct {
	Small
	C float32
	D byte
}

type Small struct {
	A int
	B string
}

func main() {
	big := new(Big)
	big.A = 1
	big.B = "test"
	big.C = 2.3
	fmt.Printf("big struct: %+v", big)
	fmt.Println()
	small := big.Small
	fmt.Printf("small struct: %+v", small)
	fmt.Println()
}
Output:
big struct: &{Small:{A:1 B:test} C:2.3 D:0}
small struct: {A:1 B:test}
Playground link: https://play.golang.org/p/-jP8Wb--att
I'm afraid there is no direct way to do that. What you did is the right way.
You can try to write the first object to JSON and then try to parse it back to the second object. Though, I wouldn't go this way.
One more way, specific to this case, is to have the Big object embed the Small object; then you can pull the Small part back out. Again, I wouldn't do that, but if you must...

How to "pass a Go pointer to Cgo"?

I am confused regarding the passing of Go pointers (which, to my understanding, include all pointer types as well as unsafe.Pointer) to cgo. When calling C functions with cgo, I can only provide variables of types known on the C-side, or unsafe.Pointer if it matches with a void*-typed parameter in the C-function's signature. So when "Go pointers passed to C are pinned for lifetime of call", how does Go know that what I am passing is, in fact, a Go pointer, if I am ever forced to cast it to C.some_wide_enough_uint_type or C.some_c_pointer_type beforehand? The moment it is cast, isn't the information that it is a Go pointer lost, and I run risk of the GC changing the pointer? (I can see how freeing is prevented at least, when a pointer-type reference is retained on the Go-side)
We have a project with a fair amount of working cgo code, but zero confidence in its reliability. I would like to see an example of "here is how to do it correctly" which doesn't resort to circumventing Go's memory model by using C.malloc() or such, which most examples unfortunately do.
So regardless of what "pinning the pointer for lifetime of call" actually means, I see a problem either way:
If it means that Go will pin all pointers in the entire program, I see a race condition in the time interval between casting a Go pointer to a C-type and the cgo-call actually being invoked.
If it means that Go will pin only those Go pointers which are being passed, how does it know that they are Go pointers when, at the time of calling, they can only have a C-type?
I've been reading through Go issues for half the day and am starting to feel like I'm just missing something simple. Any pointers are appreciated.
EDIT: I will try to clarify the question by providing examples.
Consider this:
/*
#include <stdio.h>

void myCFunc(void* ptr) {
	printf((char*)ptr);
}
*/
import "C"

import "unsafe"

func callMyCFunc() {
	goPointer := []byte("abc123\n\x00")
	C.myCFunc(unsafe.Pointer(&goPointer[0]))
}
Here, Go's unsafe.Pointer-type effortlessly translates into C's void*-type, so we are happy on the C-side of things, and we should be on the Go-side also: the pointer clearly points into Go-allocated memory, so it should be trivial for Go to figure out that it should pin this pointer during the call, despite it being an unsafe one. Is this the case? If it is, without further research, I would consider this to be the preferred way to pass Go pointers to cgo. Is it?
Then, consider this:
/*
#include <stdio.h>

void myCFunc(unsigned long long int stupidlyTypedPointerVariable) {
	char* pointerToHopefullyStillTheSameMemory = (char*)stupidlyTypedPointerVariable;
	printf(pointerToHopefullyStillTheSameMemory);
}
*/
import "C"

import "unsafe"

func callMyCFunc() {
	goPointer := []byte("abc123\n\x00")
	C.myCFunc(C.ulonglong(uintptr(unsafe.Pointer(&goPointer[0]))))
}
Here, I would expect that Go won't make any guesses on whether some C.ulonglong-typed variable actually means to contain the address of a Go pointer. But am I correct?
My confusion largely arises from the fact that it's not really possible to write some code to reliably test this with.
Finally, what about this:
/*
#include <stdio.h>

void cFuncOverWhichIHaveNoControl(char* ptr) {
	printf(ptr);
}
*/
import "C"

import "unsafe"

func callMyCFunc() {
	goPointer := []byte("abc123\n\x00")
	C.cFuncOverWhichIHaveNoControl((*C.char)(unsafe.Pointer(&goPointer[0])))
}
If I am, for whatever reason, unable to change the signature of the C-function, I must cast to *C.char. Will Go still check if the value is a Go pointer, when it already is a C pointer-type?
Looking at the section on passing pointers in the current cgo documentation, (thanks to peterSO) we find that
the term Go pointer means a pointer to memory allocated by Go
as well as that
A pointer type may hold a Go pointer or a C pointer
Thus, using uintptr and other integer (read: non-pointer) types will lose us Go's guarantee of pinning the pointer.
A uintptr is an integer, not a reference. Converting a Pointer to a uintptr creates an integer value with no pointer semantics. Even if a uintptr holds the address of some object, the garbage collector will not update that uintptr's value if the object moves, nor will that uintptr keep the object from being reclaimed.
Source: https://golang.org/pkg/unsafe/#Pointer
Regarding C pointer types such as *char/*C.char, these are only safe when the pointed-to data does not itself contain pointers to other memory allocated by Go. This can actually be shown by trying to trigger Go's cgo debug mechanism, which disallows passing a Go pointer to (or into) a value which itself contains another Go pointer:
package main

import (
	"fmt"
	"unsafe"

	/*
	#include <stdio.h>

	void cFuncChar(char* ptr) {
		printf("%s\n", ptr);
	}

	void cFuncVoid(void* ptr) {
		printf("%s\n", (char*)ptr);
	}
	*/
	"C"
)

type MyStruct struct {
	Distraction [2]byte
	Dangerous   *MyStruct
}

func main() {
	bypassDetection()
	triggerDetection()
}

func bypassDetection() {
	fmt.Println("=== Bypass Detection ===")
	ms := &MyStruct{[2]byte{'A', 0}, &MyStruct{[2]byte{0, 0}, nil}}
	C.cFuncChar((*C.char)(unsafe.Pointer(ms)))
}

func triggerDetection() {
	fmt.Println("=== Trigger Detection ===")
	ms := &MyStruct{[2]byte{'B', 0}, &MyStruct{[2]byte{0, 0}, nil}}
	C.cFuncVoid(unsafe.Pointer(ms))
}
This will print the following:
=== Bypass Detection ===
A
=== Trigger Detection ===
panic: runtime error: cgo argument has Go pointer to Go pointer
Using *C.char bypassed the detection. Only using unsafe.Pointer will detect Go pointer to Go pointer scenarios. Unfortunately, this means we will have to have an occasional nebulous void*-parameter in the C-function's signature.
Adding for clarity: Go may very well pin the value pointed by a *C.char or such, which is safe to pass; it just (reasonably) won't make an effort to find out whether it might be something else which could contain additional pointers into memory allocated by Go. Casting to unsafe.Pointer is actually safe; casting from it is what may be dangerous.

Confusion in understanding type conversions in Go

package main

import (
	"fmt"
)

type val []byte

func main() {
	var a []byte = []byte{0x01, 0x02}
	var b val = a
	fmt.Println(a)
	fmt.Println(b)
}
Output:
[1 2]
[1 2]
Here, my understanding is that the identifiers a and b share the same underlying type ([]byte), so we can assign values between the two variables.
package main

import (
	"fmt"
)

type abc string

func main() {
	fm := fmt.Println
	var second = "whowww"
	var third abc = second // compile error at this line (line 12)
	fm(second)
	fm(third)
}
In line 12 I'm not able to assign the variable.
This error can be eliminated by using an explicit conversion T(x). I want to understand why we cannot do an implicit conversion, given that both variables share the same underlying type.
Can someone explain the reason behind this?
If possible, point me to good documentation for type conversions between variables, struct types, and function parameters.
This is by design. The Go programming language requires assignment between different types to have an explicit conversion.
It might look like you're simply aliasing the string type under a different name, but you're technically creating a new type with an underlying type of string; there's a subtle difference.
The way you would define an alias in Go (as of 1.9) is subtly different: there's an equals sign.
type abc = string
If there's any confusion as to why Go doesn't have implicit conversions: it might seem silly when you're only dealing with an underlying string type, but with more complex types it ensures that the programmer knows, just by looking at the code, that a conversion is happening.
It's especially helpful when debugging an application, particularly when converting between numeric types, to know where a conversion takes place, so that if there is a truncation of bits (e.g. uint64 to uint32) it is obvious where that happens.
https://tour.golang.org/basics/13
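For completeness, the explicit conversion that fixes the failing assignment looks like this:

```go
package main

import "fmt"

type abc string

func main() {
	second := "whowww"
	// An explicit conversion is required even though abc's
	// underlying type is string.
	var third abc = abc(second)
	fmt.Println(second, third)
}
```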

map[T]struct{} and map[T]bool in golang

What's the difference? Is map[T]bool optimized to map[T]struct{}? Which is the best practice in Go?
Perhaps the best reason to use map[T]struct{} is that you don't have to answer the question "what does it mean if the value is false"?
From "The Go Programming Language":
The struct type with no fields is called the empty struct, written
struct{}. It has size zero and carries no information but may be
useful nonetheless. Some Go programmers use it instead of bool as the
value type of a map that represents a set, to emphasize that only the
keys are significant, but the space saving is marginal and the syntax
more cumbersome, so we generally avoid it.
If you use bool, testing for presence in the "set" is slightly nicer, since you can just say:
if mySet["something"] {
	/* .. */
}
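Side by side, the two set styles differ mainly in the membership test; a small sketch:

```go
package main

import "fmt"

func main() {
	boolSet := map[string]bool{"a": true}
	structSet := map[string]struct{}{"a": {}}

	// map[T]bool: the zero value false doubles as "absent".
	fmt.Println(boolSet["a"], boolSet["b"]) // true false

	// map[T]struct{}: use the comma-ok form instead.
	_, ok := structSet["a"]
	fmt.Println(ok) // true
}
```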
The difference is in memory requirements. Under the bonnet, the empty struct takes up no space at all, so the map effectively stores only its keys.
An empty struct is a struct type like any other. All the properties you are used to with normal structs apply equally to the empty struct. You can declare an array of structs{}s, but they of course consume no storage.
var x [100]struct{}
fmt.Println(unsafe.Sizeof(x)) // prints 0
Since empty structs hold no data, it is not possible to determine whether two struct{} values are different.
Even so, we may use them as method receivers:
type S struct{}

func (s *S) addr() { fmt.Printf("%p\n", s) }

func main() {
	var a, b S
	a.addr() // 0x1beeb0
	b.addr() // 0x1beeb0
}

How to statically limit function arguments to a subset of values

How does one statically constrain a function argument to a subset of values for the required type?
The set of values would be a small set defined in a package. It would be nice to have it be a compile-time check instead of runtime.
The only way that I've been able to figure out is like this:
package foo

// subset of values
const A = foo_val(0)
const B = foo_val(1)
const C = foo_val(2)

// local interface used for constraint
type foo_iface interface {
	get_foo() foo_val
}

// type that implements the foo_iface interface
type foo_val int

func (self foo_val) get_foo() foo_val {
	return self
}

// function that requires A, B or C
func Bar(val foo_iface) {
	// do something with `val` knowing it must be A, B or C
}
So now the user of a package is unable to substitute any other value in place of A, B or C.
package main

import "foo"

func main() {
	foo.Bar(foo.A) // OK
	foo.Bar(4)     // compile-time error
}
But this seems like quite a lot of code to accomplish this seemingly simple task. I have a feeling that I've overcomplicated things and missed some feature in the language.
Does the language have some feature that would accomplish the same thing in a terse syntax?
Go can't do this (I don't think; though I don't think a few months make me experienced).
Ada can, and C++ can sometimes, but not cleanly (constexpr and static_assert).
BUT the real question/point here is: why does it matter? I play with Go with GCC as the compiler and GCC is REALLY smart, especially with LTO; constant propagation is one of the easiest optimisations to apply, and it won't bother with the check (you are, in C terms, statically initialising A, B and C, and GCC optimises this away if it has a definition of the functions; with LTO it does).
Now that's a bit off topic, so I'll stop with that mashed-up blob, but tests for sanity of a value are good; unless your program is CPU bound, don't worry about it.
ALWAYS write what is easier to read; you'll be thankful you did later.
So do your runtime checks; if the compiler has enough info to hand, it won't bother doing them when it can deduce (prove) they won't fire, and with constant values like that it'll spot it easily.
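A runtime check of the kind described can be tiny; a sketch with hypothetical names (fooVal and Bar stand in for the asker's types):

```go
package main

import "fmt"

type fooVal int

// subset of valid values
const (
	A fooVal = iota
	B
	C
)

// Bar validates its argument at runtime instead of at compile time.
func Bar(val fooVal) error {
	if val < A || val > C {
		return fmt.Errorf("invalid value: %d", val)
	}
	fmt.Println("got", val)
	return nil
}

func main() {
	fmt.Println(Bar(B) == nil)         // valid value is accepted
	fmt.Println(Bar(fooVal(4)) != nil) // out-of-range value is rejected
}
```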
Addendum
It's difficult to do compile-time checks; constexpr in C++, for example, is very limiting (everything it touches must also be constexpr, and so on) - it doesn't play nicely with normal code.
Suppose a value comes from user input? That check has to be at runtime; it'd be silly (and violate DRY) to write two sets of constraints (however that would work), one for compile time and one for run time.
The best we can do is make the compiler really smart, and GCC is. I'm sure others are good too (except Microsoft's - I've never heard a compliment about it, though the authors are smart, because they wrote a C++ parser for a start!).
A slightly different approach that may suit your needs is to make the function a method of the type and export the set of valid values but not a way to construct new values.
For example:
package foo

import (
	"fmt"
)

// subset of values
const A = fooVal(0)
const B = fooVal(1)
const C = fooVal(2)

// unexported type: callers can use A, B and C but cannot construct new values
type fooVal int

// method that requires A, B or C
func (val fooVal) Bar() {
	fmt.Println(val)
}
Used by:
package main

import "test/foo"

func main() {
	foo.A.Bar()        // OK, prints 0
	foo.B.Bar()        // OK, prints 1
	foo.C.Bar()        // OK, prints 2
	foo.4.Bar()        // syntax error: unexpected literal .4
	E := foo.fooVal(5) // cannot refer to unexported name foo.fooVal
}
