Golang const unsafe.Sizeof - go

Don't understand why I can do
const OK = uint64(0)
const OK = int(unsafe.Sizeof(uint64(0)))
but not this?
const NOK = binary.Size(uint64(0))

It's explained in the specification.
Package unsafe is implemented in the compiler. The expression unsafe.Sizeof(uint64(0)) can be evaluated at compile time. It is a constant expression.
The function plain function call binary.Size(uint64(0)) cannot be evaluated at compile time. It is not constant expression.
Constant declarations require a constant expression.

One thing to mention here is that if the type in question is from a type parameter, then the specification considers the type variably sized, so some code, possibly reflection? is run to figure the size. So using unsafe.Sizeof() as such has to do something at runtime to figure out the size, even if the actual type is able to be known at compile time.
func [T constraints.Integer] WhatIsTheFrequencyKenneth(n T) uintptr {
return unsafe.Sizeof(n)
}
fmt.Println("I had to generate code to tell you this: ", WhatIsTheFrequencyKenneth(int64(42)))

Related

GoLang: why doesn't address-of operator work without a variable declaration

In Go, suppose I have a []byte of UTF-8 that I want to return as a string.
func example() *string {
byteArray := someFunction()
text := string(byteArray)
return &text
}
I would like to eliminate the text variable, but Go doesn't support the following:
func example() *string {
byteArray := someFunction()
return &string(byteArray)
}
Is this second example syntax correct? And if so, why doesn't Go support it?
Because the spec defines is that way:
For an operand x of type T, the address operation &x generates a pointer of type *T to x. The operand must be addressable, that is, either a variable, pointer indirection, or slice indexing operation; or a field selector of an addressable struct operand; or an array indexing operation of an addressable array. As an exception to the addressability requirement, x may also be a (possibly parenthesized) composite literal.
Notice that type conversions (what you are trying to do with string(byteArray)) are not included in this list.
See Marc's answer for an official citation, but here's an intuitive reason for why Go doesn't support this.
Suppose the following code
var myString string
stringPointer := &myString
*stringPointer = "some new value"
Hopefully you know, this code will write some new value into myString. This is a basic use of pointers. Now consider the modified example (pretending that it is valid code):
var myBytes []byte
// modify myBytes...
stringPointer := &string(myString)
*stringPointer = "some new value"
The question is, where in the world (or computer) are we writing to?? Where is some new value going?
In order for the language to handle this correctly, the compiler would need some internal process to "promote" the temporary value to an invisible variable, and then take the address of that. This would be adding needless complexity to make some code slightly shorter, but create this confusing situation where we have pointers with no well defined location in the program. Instead of creating these confusing ghost-variables, the language delegates to the programmer to use their own variable as usual.

Why is unsafe.Sizeof considered unsafe?

Consider the following:
import (
"log"
"unsafe"
)
type Foo struct {
Bar int32
}
func main() {
log.Println(int(unsafe.Sizeof(Foo{})))
}
Why is determining the size of a variable considered unsafe, and a part of the unsafe package? I don't understand why obtaining the size of any type is an unsafe operation, or what mechanism go uses to determine its size that necessitates this.
I would also love to know if there are any alternatives to the unsafe package for determining size of a known struct.
Because in Go if you need to call sizeof, it generally means you're manipulating memory directly, and you should never need to do that.
If you come from the C world, you'll probably most often have used sizeof together with malloc to create a variable-length array - but this should not be needed in Go, where you can simply make([]Foo, 10). In Go, the amount of memory to be allocated is taken care of by the runtime.
You should not be afraid of calling unsafe.Sizeof where it really makes sense - but you should ask yourself whether you actually need it.
Even if you're using it for, say, writing a binary format, it's generally a good idea to calculate by yourself the number of bytes you need, or if anything generate it dynamically using reflect:
calling unsafe.Sizeof on a struct will also include the number of bytes added in for padding.
calling it on dynamically-sized structures (ie. slices, strings) will yield the length of their headers - you should call len() instead.
Using unsafe on a uintptr, int or uint to determine whether you're running on 32-bit or 64-bit? You can generally avoid that by specifying int64 where you actually need to support numbers bigger than 2^31. Or, if you really need to detect that, you have many other options, such as build tags or something like this:
package main
import (
"fmt"
)
const is32bit = ^uint(0) == (1 << 32) - 1
func main() {
fmt.Println(is32bit)
}
From the looks of the unsafe package the methods don't use go's type safety for their operations.
https://godoc.org/unsafe
Package unsafe contains operations that step around the type safety of
Go programs.
Packages that import unsafe may be non-portable and are not protected
by the Go 1 compatibility guidelines.
So from the sounds of it the unsafe-ness is in the kind of code being provided, not necessarily from calling it in particular
Go is a type safe programming language. It won't let you do stuff like this:
package main
type Foo = struct{ A string }
type Bar = struct{ B int }
func main() {
var foo = &Foo{A: "Foo"}
var bar = foo.(*Bar) // invalid operation!
var bar2, ok = foo.(*Bar) // invalid operation!
}
Even if you use the type assertion with the special form that yields an additional boolean value; the compiler goes: haha, nope.
In a programming language like C though, the default is to assume that you are in charge. The program below will compile just fine.
typedef struct foo {
const char* a_;
} foo;
typedef struct bar {
int b_;
} bar;
int main() {
foo f;
f.a_ = "Foo";
bar* b = &f; // warning: incompatible pointer types
bar* b2 = (bar*)&f;
return 0;
}
You get warnings for things that are probably wrong because people have learned over time that this is a common mistake but it's not stopping you. It's just emitting a warning.
Type safety just means that you can't make the same mistake C programmers have made a thousand times over already but it is neither unsafe nor wrong to use the the unsafe package or the C programming language. The unsafe package has just been named in opposition to type safety and it is precisely the right tool when you need to fiddle with the bits (manipulate the representation of things in memory; directly).

Understanding memory allocation for named returned type

In the following code sample, can I assume that I don't need to allocate the returned value? Will the compiler always allocate any function's named returned value?
package main
import "fmt"
type Point struct {
X, Y int
}
func MakePoint(x, y int) (pt Point) {
pt.X = x
pt.Y = y
return
}
func main() {
fmt.Printf("%v\n", MakePoint(1, 2))
}
Also, why do I need to add the return statement at the end of the function? Is this a bug with the compiler?
If I decide to return a pointer:
func MakePoint(x, y int) (pt *Point) {
The code will compile but I am getting a runtime error! Why is the compiler letting me believe an allocation with a statement such as pt = new(Point) is not needed? Is this another bug with the compiler? Clearly, I am missing something crucial about memory allocation in Go.
Will the compiler always allocate any function's named returned value?
Yes, absolutely.
Also, why do I need to add the return statement at the end of the function? Is this a bug with the compiler?
Because function which return a value must end in a return statement. That's what the language spec requires and the compiler enforces.
Why is the compiler letting me believe an allocation with a statement such as pt = new(Point) is not needed?
The compiler allocates space for the pointer (that's what you return) but he cannot know where the memory for the actual Point shall be. That's your job.
Is this another bug with the compiler?
No. Such stuff is fundamental. The chances you discover such a bug in such obvious code is basically zero.
In the following code sample, can I assume that I don't need to allocate the returned value? Will the compiler always allocate any function's named returned value?
I assume by allocate you don't mean assigning like this:
a := func(1,2)
In this case, compiler will assign whatever comes out of func to a.
But I think you mean allocating function result variables.
1. Yes, in this case you don't need to allocate as this is just a value.
2. Nope. It depends whether value is a pointer or a concrete type. In concrete type, you don't need to allocate in any case.
Also, why do I need to add the return statement at the end of the function? Is this a bug with the compiler?
Now to return part, according to standard:
If the function's signature declares result parameters, the function body's statement list must end in a terminating statement.
And this result can be named or not.
The code will compile but I am getting a runtime error! Why is the
compiler letting me believe an allocation with a statement such as pt
= new(Point) is not needed? Is this another bug with the compiler?
It isn't a bug. It's happening because compiler doesn't care about which concrete type a pointer's point. Its create a pointer with zero value of nil. You need to take care of it. Otherwise, you get nil pointer dereference as you still haven't pointed it to any concrete type.

Pointer to an interface in Go

I'm currently reading the source code of the https://github.com/codegangsta/inject go package to understand how does this package works.
I have some questions concerning the file https://github.com/codegangsta/inject/blob/master/inject.go file thats use some element of the Go language I don't understand and don't find precise explanations in the documentation.
// InterfaceOf dereferences a pointer to an Interface type.
// It panics if value is not an pointer to an interface.
func InterfaceOf(value interface{}) reflect.Type {
t := reflect.TypeOf(value)
for t.Kind() == reflect.Ptr {
t = t.Elem()
}
if t.Kind() != reflect.Interface {
panic("Called inject.InterfaceOf with a value that is not a pointer to an interface. (*MyInterface)(nil)")
}
return t
}
My first question is concerning the for loop. Why does it uses a for loop with a test expression ?
The second relates to the message in the panic function. "A pointer to an interface" is mentioned with the (*MyInterface)(nil). I only encounter a similar construction in the go documentation concerning 'compile time checking structure' when you check that a type implements a structure :
var _ SomeType = (*SomeInterface)(nil)
I did not find any informations about a statement with (*Interface)(nil) and pointer to interface.
How should we interpret this statement ? What is the relation with a pointer to interface and where could I find informations about pointer to interface ?
To summarize both answers:
The for loop
for t.Kind() == reflect.Ptr {
t = t.Elem()
}
t.Elem() is the reflection equivalent to *t, so what this loop does it dereferencing t as long as it holds another pointer value. At the end of the loop, t will hold the value that the last pointer pointed to, not a pointer anymore.
The message
Called [...] with a value that is not a pointer to an interface. (*MyInterface)(nil)
The expression (*MyInterface)(nil) is just an (poorly phrased) example of what is expected as parameter.
The syntax is that of a conversion. A conversion will attempt to convert a value (in this case nil) to a given type (*MyInterface) in this case. So,
(*MyInterface)(nil)
will give you a zero value of a *MyInterface whose interface type would be MyInterface (play):
x := (*MyInterface)(nil)
InterfaceOf(x) // MyInterface
Of course, this value does not point somewhere meaningful.
Compile time checking of interface implementation
To avoid confusion, the construct you showed
var _ SomeType = (*SomeInterface)(nil)
is probably not what you wanted. I guess you wanted this:
var _ SomeInterface = (*SomeType)(nil)
This construct enables compile time checking of interface implementation for certain types.
So in case you're writing a library of some sort and you want to satisfy an interface without
using it, you can use this to make sure that your struct implements the interface.
Why this works
First of all, var _ someType is a variable that is going to be checked by the compiler but
will not be in the compiled program and is not accessible due to the Blank Identifier _:
The blank identifier may be used like any other identifier in a declaration, but it does not introduce a binding and thus is not declared.
This enables you do declare an arbitrary number of these constructs without interfering with the
rest of the program.
You can declare a zero value of a pointer of any type by writing:
(*T)(nil)
Check this example on play.
Next, assignability says that x is assignable to T if T is an interface and x implements T.
So to summarize:
T _ = (*x)(nil)
enforces that x implements T as everything else would be an error.
The for loop is used to continually dereference the type until it is no longer a pointer. This will handle case where the type acquired an extra indirection(s).
e.g. play.golang.org/p/vR2gKNJChE
As for the (*MyInterface)(nil), pointers to interfaces always1 an error Go code. I assume the author is just describing what he means by pointer to interface with a code snippet since they are so rare.
If you're still intrigued by the forbidden type Russ Cox has some info how exactly all this works under the hood: research.swtch.com/interfaces. You'll have a hard time finding info on the use of pointers to an interface because [1].
(1) OK not really always, but honestly don't do it unless you're a Go pro. In which case don't tell anyone about it.
That for loop is identical to while loop in other languages
The second thing is just a syntax for conversions:
(*Point)(p) // p is converted to *Point
Because how this library works you just have to pass the pointer to interface, for loop then dereferences it (if we pass something like (***MyInterface)(nil)) and then if statement checks if the ty[e pointed to is an interface.

How to statically limit function arguments to a subset of values

How does one statically constrain a function argument to a subset of values for the required type?
The set of values would be a small set defined in a package. It would be nice to have it be a compile-time check instead of runtime.
The only way that I've been able to figure out is like this:
package foo
// subset of values
const A = foo_val(0)
const B = foo_val(1)
const C = foo_val(2)
// local interface used for constraint
type foo_iface interface {
get_foo() foo_val
}
// type that implements the foo_iface interface
type foo_val int
func (self foo_val) get_foo() foo_val {
return self
}
// function that requires A, B or C
func Bar(val foo_iface) {
// do something with `val` knowing it must be A, B or C
}
So now the user of a package is unable to substitute any other value in place of A, B or C.
package main
import "foo"
func main() {
foo.Bar(foo.A) // OK
foo.Bar(4) // compile-time error
}
But this seems like quite a lot of code to accomplish this seemingly simple task. I have a feeling that I've overcomplicated things and missed some feature in the language.
Does the language have some feature that would accomplish the same thing in a terse syntax?
Go can't do this (I don't think, I don't think a few months makes me experienced)
ADA can, and C++ can sometimes-but-not-cleanly (constexpr and static_assert).
BUT the real question/point is here, why does it matter? I play with Go with GCC as the compiler and GCC is REALLY smart, especially with LTO, constant propigation is one of the easiest optimisations to apply and it wont bother with the check (you are (what we'd call in C anyway) statically initialising A B and C, GCC optimises this (if it has a definition of the functions, with LTO it does))
Now that's a bit off topic so I'll stop with that mashed up blob but tests for sane-ness of a value are good unless your program is CPU bound don't worry about it.
ALWAYS write what it easier to read, you'll be thankful you did later
So do your runtime checks, if the compiler has enough info to hand it wont bother doing them if it can deduce (prove) they wont throw, with constant values like that it'll spot it eaisly.
Addendum
It's difficult to do compile time checks, constexpr in c++ for example is very limiting (everything it touches must also be constexpr and such) - it doesn't play nicely with normal code.
Suppose a value comes from user input? That check has to be at runtime, it'd be silly (and violate DRY) if you wrote two sets of constraints (however that'd work), one for compile one for run.
The best we can do is make the compiler REALLY really smart, and GCC is. I'm sure others are good too ('cept MSs one, I've never heard a compliment about it, but the authors are smart because they wrote a c++ parser for a start!)
A slightly different approach that may suit your needs is to make the function a method of the type and export the set of valid values but not a way to construct new values.
For example:
package foo
import (
"fmt"
)
// subset of values
const A = fooVal(0)
const B = fooVal(1)
const C = fooVal(2)
// type that implements the foo_iface interface
type fooVal int
// function that requires A, B or C
func (val fooVal) Bar() {
fmt.Println(val)
}
Used by:
package main
import "test/foo"
func main() {
foo.A.Bar() // OK, prints 0
foo.B.Bar() // OK, prints 1
foo.C.Bar() // OK, prints 2
foo.4.Bar() // syntax error: unexpected literal .4
E := foo.fooVal(5) // cannot refer to unexported name foo.fooVal
}

Resources