Is it better to use pointer for non-primitive types in struct fields in Go - go

I am going on a project which process some data, I am wondering that if it is better to use pointer in non-primitive typed fields of struct.
What I've found is that the reason of using pointer is that nil can be used as a zero-value, is this the only reason to use pointer?
For example, I am going to store time.Time in my struct and it cannot be nil, then is it better to use non-pointer field?
So is it okay to use
type A struct {
CreatedAt time.Time
}
rather than
type A struct {
CreatedAt *time.Time
}
when Now is not going to be nil?

Not sure I understand the question. In the case of "Now" I would make it a function of the struct i.e.:
type A struct{}
func (a A) Now() time.Time { return time.Now(); }
otherwise what does Now mean? Now is constantly changing.
There are great blogs on when to use pointers
The short would be it doesn't really depend on if the value can be nil, but more on memory and concurrency. Pointers will be passed as references, so less memory, and faster, but also means that changing in one go routine can be very dangerous because the value could be referenced in another go routine and cause race conditions and unexpected behaviors.

I'm not really a professional or know the ins and outs of Go, so take everything I say with some grain of salt.
But as I am understanding it, you should most likely use pointers.
This is because every time you use a non-pointer type, the whole struct will be part of your struct in memory. As a consequence, you can't share a single instance of a struct between multiple structs - every single one gets a copy of your original struct.
Heres a small example:
// This struct has 2x64 bits in size
type MyStruct struct {
A uint64
B uint64
}
// This struct has 32 + 2x64 bits in size
type MyOtherStruct struct {
C uint32
Parent MyStruct
}
// This struct has 32 + the length of an address bits size
type MyPointerStruct struct {
D uint32
Parent *MyStruct
}
But apart from memory concerns, there is also a performance hit if your inner struct is very big. Because every time you set the inner struct the whole memory has to be copied to your instance.
However you have to be careful if your are dealing with interfaces or structs. At runtime, an interface is represented as a type with two fields: A reference to the actual (runtime) type and one with a reference to the actual instance.
So I - with my unprofessional opinion - would recommend to not use pointers if you have interface types because otherwise the CPU has to deference twice (once to get the interface reference, and then again to get the instance of the interface).

Related

Is type casting structs in Go a no-op?

Consider the following code in Go
type A struct {
f int
}
type B struct {
f int `somepkg:"somevalue"`
}
func f() {
var b *B = (*B)(&A{1}) // <-- THIS
fmt.Printf("%#v\n", b)
}
Will the marked line result in a memory copy (which I would like to avoid as A has many fields attached to it) or will it be just a reinterpretation, similar to casting an int to an uint?
EDIT: I was concerned, whether the whole struct would have to be copied, similarly to converting a byte slice to a string. A pointer copy is therefore a no-op for me
It is called a conversion. The expression (&A{}) creates a pointer to an instance of type A, and (*B) converts that pointer to a *B. What's copied there is the pointer, not the struct. You can validate this using the following code:
a:=A{}
var b *B = (*B)(&a)
b.f=2
fmt.Printf("%#v\n", a)
Prints 2.
The crucial points to understand is that
First, unlike C, C++ and some other languages of their ilk, Go does not have type casting, it has type conversions.
In most, but not all, cases, type conversion changes the type but not the internal representation of a value.
Second, as to whether a type conversion "is a no-op", depends on how you define the fact of being a no-op.
If you are concerned with a memory copy being made, there are two cases:
Some type conversions are defined to drastically change the value's representation or to copy memory; for example:
Type-converting a value of type string to []rune would interpret the value as a UTF-8-encoded byte stream, decode each encoded Unicode code point and produce a freshly-allocated slice of decoded Unicode runes.
Type-converting a value of type string to []byte, and vice-versa, will clone the backing array underlying the value.
Other type-conversions are no-op in this sense but in order for them to be useful you'd need to either assign a type-converted value to some variable or to pass it as an argument to a function call or send to a channel etc — in other words, you have to store the result or otherwise make use of it.
All of such operations do copy the value, even though it does not "look" like this; consider:
package main
import (
"fmt"
)
type A struct {
X int
}
type B struct {
X int
}
func (b B) Whatever() {
fmt.Println(b.X)
}
func main() {
a := A{X: 42}
B(a).Whatever()
b := B(a)
b.Whatever()
}
Here, the first type conversion in main does not look like a memory copy, but the resulting value will serve as a receiver in the call to B.Whatever and will be physically copied there.
The second type conversion stores the result in a variable (and then copies it again when a method is called).
Reasonong about such things is easy in Go as there everything, always, is passed by value (and pointers are values, too).
It may worth adding that variables in Go does not store the type of the value they hold, so a type conversion cannot mutate the type of a variable "in place". Values do not have type information stored in them, either. This basically means that type conversions is what compiler is concerned with: it knows the types of all the participating values and variables and performs type checking.

How to get a pointer to the underlying value of an Interface{} in Go

I'm interfacing with C code in Go using cgo, and I need to call a C function with a pointer to the underlying value in an Interface{} object. The value will be any of the atomic primitive types (not including complex64/complex128), or string.
I was hoping I'd be able to do something like this to get the address of ptr as an unsafe.Pointer:
unsafe.Pointer(reflect.ValueOf(ptr).UnsafeAddr())
But this results in a panic due to the value being unaddressable.
A similar question to this is Take address of value inside an interface, but this question is different, as in this case it is known that the value will always be one of the types specified above (which will be at most 64 bits), and I only need to give this value to a C function. Note that there are multiple C functions, and the one that will be called varies based off of a different unrelated parameter.
I also tried to solve this using a type switch statement, however I found myself unable to get the address of the values even after the type assertion was done. I was able to assign the values to temporary copies, then get the address of those copies, but I'd rather avoid making these copies if possible.
interface{} has own struct:
type eface struct {
typ *rtype
val unsafe.Pointer
}
You have no access to rtype directly or by linking, on the other hand, even though you'll copy whole rtype, it may be changed (deprecated) at future.
But thing is that you can replace pointer types with unsafe.Pointer (it may be anything else with same size, but pointer is much idiomatic, because each type has own pointer):
type eface struct {
typ, val unsafe.Pointer
}
So, now we can get value contained in eface:
func some_func(arg interface{}) {
passed_value := (*eface)(unsafe.Pointer(&arg)).val
*(*byte)(passed_value) = 'b'
}
some_var := byte('a')
fmt.Println(string(some_var)) // 'a'
some_func(some_var)
fmt.Println(string(some_var)) // 'a', it didn't changed, just because it was copied
some_func(&some_var)
fmt.Println(string(some_var)) // 'b'
You also might see some more usages at my repo:
https://github.com/LaevusDexter/fast-cast
Sorry for my poor English.

Stringer method requires value

The Go FAQ answers a question regarding the choice of by-value vs. by-pointer receiver definition in methods. One of the statements in that answer is:
If some of the methods of the type must have pointer receivers, the rest should too, so the method set is consistent regardless of how the type is used.
This implies that if I have for a data type a few methods that mutate the data, thus require by-pointer receiver, I should use by-pointer receiver for all the methods defined for that data type.
On the other hand, the "fmt" package invokes the String() method as defined in the Stringer interface by value. If one defines the String() method with a receiver by-pointer it would not be invoked when the associated data type is given as a parameter to fmt.Println (or other fmt formatting methods). This leaves one no choice but to implement the String() method with a receiver by value.
How can one be consistent with the choice of by-value vs. by-pointer, as the FAQ suggests, while fulfilling fmt requirements for the Stringer interface?
EDIT:
In order to emphasize the essence of the problem I mention, consider a case where one has a data type with a set of methods defined with receiver by-value (including String()). Then one wishes to add an additional method that mutates that data type - so he defines it with receiver by-pointer, and (in order to be consistent, per FAQ answer) he also updates all the other methods of that data type to use by-pointer receiver. This change has zero impact on any code that uses the methods of this data type - BUT for invocations of fmt formatting functions (that now require passing a pointer to a variable instead of its value, as before the change). So consistency requirements are only problematic in the context of fmt. The need to adjust the manner one provides a variable to fmt.Println (or similar function) based on the receiver type breaks the capability to easily refactor one's package.
If you define your methods with pointer receiver, then you should use and pass pointer values and not non-pointer values. Doing so the passed value does indeed implement Stringer, and the fmt package will have no problem "detecting" and calling your String() method.
Example:
type Person struct {
Name string
}
func (p *Person) String() string {
return fmt.Sprintf("Person[%s]", p.Name)
}
func main() {
p := &Person{Name: "Bob"}
fmt.Println(p)
}
Output (try it on the Go Playground):
Person[Bob]
If you would pass a value of type Person to fmt.Println() instead of a pointer of type *Person, yes, indeed the Person.String() would not be called. But if all methods of Person has pointer receiver, that's a strong indication that you should use the type and its values as pointers (unless you don't intend its methods to be used).
Yes, you have to know whether you have to use Person or *Person. Deal with it. If you want to write correct and efficient programs, you have to know a lot more than just whether to use pointer or non-pointer values, I don't know why this is a big deal for you. Look it up if you don't know, and if you're lazy, use a pointer as the method set of (the type of) a pointer value contains methods with both pointer and non-pointer receiver.
Also the author of Person may provide you a NewPerson() factory function which you can rely on to return the value of the correct type (e.g. Person if methods have value receivers, and *Person if the methods have pointer receivers), and so you won't have to know which to use.
Answer to later adding a method with pointer receiver to a type which previously only had methods with value receiver:
Yes, as you described in the question, that might not break existing code, yet continuing to use a non-pointer value may not profit from the later added method with pointer receiver.
We might ask: is this a problem? When the type was used, the new method you just added didn't exist. So the original code made no assumption about its existence. So it shouldn't be a problem.
Second consideration: the type only had methods with value receiver, so one could easily assume that by their use, the value was immutable as methods with value receiver cannot alter the value. Code that used the type may have built on this, assuming it was not changed by its methods, so using it from multiple goroutines may have omitted certain synchronization rightfully.
So I do think that adding a new method with pointer receiver to a type that previously only had methods with value receiver should not be "opaque", the person who adds this new method has the responsibility to either modify uses of this type to "switch" to pointers and make sure the code remains safe and correct, or deal with the fact that non-pointer values will not have this new method.
Tips:
If there's a chance that a type may have mutator methods in the future, you should start creating it with methods with pointer receivers. Doing so you avoid later having to go through the process described above.
Another tip could be to hide the type entirely, and only publish interfaces. Doing so, the users of this type don't have to know whether the interface wraps a pointer or not, it just doesn't matter. They receive an interface value, and they call methods of the interface. It's the responsibility of the package author to take care of proper method receivers, and return the appropriate type that implements the interface. The clients don't see this and they don't depend on this. All they see and use is the interface.
In order to emphasize the essence of the problem I mention, consider a case where one has a data type with a set of methods defined with receiver by-value (including String()). Then one wishes to add an additional method that mutates that data type - so he defines it with receiver by-pointer, and (in order to be consistent, per FAQ answer) he also updates all the other methods of that data type to use by-pointer receiver. This change has zero impact on any code that uses the methods of this data type - BUT for invocations of fmt formatting functions (that now require passing a pointer to a variable instead of its value, as before the change).
This is not true. All interface of it and some of type assertions will be affected as well - that is why fmt is affected. e.g. :
package main
import (
"fmt"
)
type I interface {
String() string
}
func (t t) String() string { return "" }
func (p *p) String() string { return "" }
type t struct{}
type p struct{}
func S(i I) {}
func main() {
fmt.Println("Hello, playground")
T := t{}
P := p{}
_ = P
S(T)
//S(P) //fail
}
To understand this from the root, you should know that a pointer method and a value method is different from the very base. However, for convenience, like the omit of ;, golang compiler looks for cases using pointer methods without a pointer and change it back.
As explained here: https://tour.golang.org/methods/6
So back to the orignal question: consistency of pointer methods. If you read the faq more carefully, you will find it is the very last part of considering to use a value or pointer methods. And you can find counter-example in standard lib examples, in container/heap :
// A PriorityQueue implements heap.Interface and holds Items.
type PriorityQueue []*Item
func (pq PriorityQueue) Len() int { return len(pq) }
func (pq PriorityQueue) Less(i, j int) bool {
// We want Pop to give us the highest, not lowest, priority so we use greater than here.
return pq[i].priority > pq[j].priority
}
func (pq PriorityQueue) Swap(i, j int) {
pq[i], pq[j] = pq[j], pq[i]
pq[i].index = i
pq[j].index = j
}
func (pq *PriorityQueue) Push(x interface{}) {
n := len(*pq)
item := x.(*Item)
item.index = n
*pq = append(*pq, item)
}
func (pq *PriorityQueue) Pop() interface{} {
old := *pq
n := len(old)
item := old[n-1]
item.index = -1 // for safety
*pq = old[0 : n-1]
return item
}
// update modifies the priority and value of an Item in the queue.
func (pq *PriorityQueue) update(item *Item, value string, priority int) {
item.value = value
item.priority = priority
heap.Fix(pq, item.index)
}
So, indeed, as the FAQ say, to determine whether to use a pointer methods, take the following consideration in order:
Does the method need to modify the receiver? If yes, use a pointer. If not, there should be a good reason or it makes confusion.
Efficiency. If the receiver is large, a big struct for instance, it will be much cheaper to use a pointer receiver. However, efficiency is not easy to discuss. If you think it is an issue, profile and/or benchmark it before doint so.
Consistency. If some of the methods of the type must have pointer receivers, the rest should too, so the method set is consistent regardless of how the type is used. This, to me, means that if the type shall be used as a pointer (e.g., frequent modify), it should use the method set to mark so. It does not mean one type can only have solely pointer methods or the other way around.
The previous answers here do not address the underlying issue, although the answer from leaf bebop is solid advice.
Given a value, you can in fact invoke either pointer or value receiver methods on it, the compiler will do that for you. However, that does not apply when invoking via interface implementations.
This boils down to this dicussion about interface implementations.
In that discussion the discussion is about implementing interfaces with nil pointers. But the underlying discussion revolves around the same issue: when implementing an interface you must choose the pointer or the value type, and there will be no attempt by the compiler, nor can there be any attempt in golang code, to figure out exactly what type it is, and adjust the interface call accordingly.
So for example, when calling
fmt.Println(object)
you are implementing the arg of type interface{} with object of type X. The fmt code within has no interest in knowing whether the type of object is a pointer type or not. It will not even be able to tell without using reflection. It will simply call String() on whatever type that is.
So if you supplied a value of type X, and there just so happens to be a (*X) String() string method, that does not matter, that method will not be called, it will only type-assert whether that type X implements Stringer, it has no interest if type *X asserts Stringer. Since there is no (X) String() string method, it will move on. It will not attempt to check what X may happen to be, whether it's a pointer type, and if not, whether the associated pointer type implements Stringer, and call that String() method instead.
So this is not really a pointer vs value methods issue, this is an interface implementation issue when implementing interface{} in calls to fmt methods.

map[T]struct{} and map[T]bool in golang

What's the difference? Is map[T]bool optimized to map[T]struct{}? Which is the best practice in Go?
Perhaps the best reason to use map[T]struct{} is that you don't have to answer the question "what does it mean if the value is false"?
From "The Go Programming Language":
The struct type with no fields is called the empty struct, written
struct{}. It has size zero and carries no information but may be
useful nonetheless. Some Go programmers use it instead of bool as the
value type of a map that represents a set, to emphasize that only the
keys are significant, but the space saving is marginal and the syntax
more cumbersome, so we generally avoid it.
If you use bool testing for presence in the "set" is slightly nicer since you can just say:
if mySet["something"] {
/* .. */
}
Difference is in memory requirements. Under the bonnet empty struct is not a pointer but a special value to save memory.
An empty struct is a struct type like any other. All the properties you are used to with normal structs apply equally to the empty struct. You can declare an array of structs{}s, but they of course consume no storage.
var x [100]struct{}
fmt.Println(unsafe.Sizeof(x)) // prints 0
If empty structs hold no data, it is not possible to determine if two struct{} values are different.
Considering the above statements it means that we may use them as method receivers.
type S struct{}
func (s *S) addr() { fmt.Printf("%p\n", s) }
func main() {
var a, b S
a.addr() // 0x1beeb0
b.addr() // 0x1beeb0
}

Does assigning value to interface copy anything?

I've been trying to wrap my head around the concept of interfaces in Go. Reading this and this helped a lot.
The only thing that makes me uncomfortable is the syntax. Have a look at the example below:
package main
import "fmt"
type Interface interface {
String() string
}
type Implementation int
func (v Implementation) String() string {
return fmt.Sprintf("Hello %d", v)
}
func main() {
var i Interface
impl := Implementation(42)
i = impl
fmt.Println(i.String())
}
My issue is with i = impl. Based on the fact that an interface instance actually holds a pointer reference to the actual data, it would feel more natural for me to do i = &impl. Usually assignment of non-pointer when not using & will make a full memory copy of the data, but when assigning to interfaces this seem to side-step this and instead simply (behind the scenes) assign the pointer to the interface value. Am I right? That is, the data for the int(42) will not be copied in memory?
The data for int(42) will be copied. Try this code:
func main() {
var i Interface
impl := Implementation(42)
i = impl
fmt.Println(i.String())
impl = Implementation(91)
fmt.Println(i.String())
}
(Playground link)
You'll find that the second i.String() still shows 42. Perhaps one of the trickier aspects of Go is that method receivers can be pointers as well.
func (v *Implementation) String() string {
return fmt.Sprintf("Hello %d", *v)
}
// ...
i = &impl
Is what you want if you want the interface to hold a pointer to the original value of impl. "Under the hood" an interface is a struct that either holds a pointer to some data, or the data itself (and some type metadata that we can ignore for our purposes). The data itself is stored if its size is less than or equal to one machine word -- whether it be a pointer, struct, or other value.
Otherwise it will be a pointer to some data, but here's the tricky part: if the type implementing the interface is a struct the pointer will be to a copy of the struct, not the struct assigned to the interface variable itself. Or at least semantically the user can think of it as such, optimizations may allow the value to not be copied until the two diverge (e.g. until you call String or reassign impl).
In short: assigning to an interface can semantically be thought of as a copy of the data that implements the interface. If this is a pointer to a type, it copies the pointer, if it's a big struct, it copies the big struct. The particulars of interfaces using pointers under the hood are for reasons of garbage collection and making sure the stack expands by predictable amounts. As far as the developer is concerned, they should be thought of as semantic copies of the specific instance of the implementing type assigned.

Resources