map[T]struct{} and map[T]bool in golang

What's the difference? Is map[T]bool optimized to map[T]struct{}? Which is the best practice in Go?
Perhaps the best reason to use map[T]struct{} is that you don't have to answer the question "what does it mean if the value is false"?

From "The Go Programming Language":
The struct type with no fields is called the empty struct, written
struct{}. It has size zero and carries no information but may be
useful nonetheless. Some Go programmers use it instead of bool as the
value type of a map that represents a set, to emphasize that only the
keys are significant, but the space saving is marginal and the syntax
more cumbersome, so we generally avoid it.
If you use bool, testing for presence in the "set" is slightly nicer since you can just say:
if mySet["something"] {
/* .. */
}
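To make the trade-off concrete, here is a minimal sketch of both set styles. The names boolSet, emptySet, hasBool and hasEmpty are made up for illustration and do not come from the answers above.

```go
package main

import "fmt"

// boolSet and emptySet are hypothetical names used only for this sketch.
type boolSet map[string]bool
type emptySet map[string]struct{}

// hasBool reports membership; a missing key yields the zero value false.
func hasBool(s boolSet, k string) bool { return s[k] }

// hasEmpty reports membership with the comma-ok form, because a
// struct{} value carries no information of its own.
func hasEmpty(s emptySet, k string) bool {
	_, ok := s[k]
	return ok
}

func main() {
	b := boolSet{"a": true}
	e := emptySet{"a": {}}
	fmt.Println(hasBool(b, "a"), hasBool(b, "z"))   // true false
	fmt.Println(hasEmpty(e, "a"), hasEmpty(e, "z")) // true false

	// The ambiguity of map[T]bool: a key can be present with value false.
	b["x"] = false
	_, present := b["x"]
	fmt.Println(b["x"], present) // false true
}
```

With map[T]struct{} the "present but false" state cannot exist, at the cost of the slightly longer membership test.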

Difference is in memory requirements. Under the bonnet empty struct is not a pointer but a special value to save memory.

An empty struct is a struct type like any other. All the properties you are used to with normal structs apply equally to the empty struct. You can declare an array of struct{}s, but they of course consume no storage.
var x [100]struct{}
fmt.Println(unsafe.Sizeof(x)) // prints 0
Since empty structs hold no data, it is not possible to determine whether two struct{} values are different.
They can still be used as method receivers like any other type, and because zero-size values need no storage, two distinct variables may end up with the same address:
type S struct{}
func (s *S) addr() { fmt.Printf("%p\n", s) }
func main() {
	var a, b S
	a.addr() // 0x1beeb0
	b.addr() // 0x1beeb0
}

Related

Is type casting structs in Go a no-op?

Consider the following code in Go
type A struct {
f int
}
type B struct {
f int `somepkg:"somevalue"`
}
func f() {
var b *B = (*B)(&A{1}) // <-- THIS
fmt.Printf("%#v\n", b)
}
Will the marked line result in a memory copy (which I would like to avoid as A has many fields attached to it) or will it be just a reinterpretation, similar to casting an int to an uint?
EDIT: I was concerned, whether the whole struct would have to be copied, similarly to converting a byte slice to a string. A pointer copy is therefore a no-op for me
It is called a conversion. The expression (&A{}) creates a pointer to an instance of type A, and (*B) converts that pointer to a *B. What's copied there is the pointer, not the struct. You can validate this using the following code:
a:=A{}
var b *B = (*B)(&a)
b.f=2
fmt.Printf("%#v\n", a)
Prints 2.
The crucial points to understand are the following.
First, unlike C, C++ and some other languages of their ilk, Go does not have type casting, it has type conversions.
In most, but not all, cases, type conversion changes the type but not the internal representation of a value.
Second, whether a type conversion "is a no-op" depends on how you define being a no-op.
If you are concerned with a memory copy being made, there are two cases:
Some type conversions are defined to drastically change the value's representation or to copy memory; for example:
Type-converting a value of type string to []rune would interpret the value as a UTF-8-encoded byte stream, decode each encoded Unicode code point and produce a freshly-allocated slice of decoded Unicode runes.
Type-converting a value of type string to []byte, and vice-versa, will clone the backing array underlying the value.
Other type-conversions are no-op in this sense but in order for them to be useful you'd need to either assign a type-converted value to some variable or to pass it as an argument to a function call or send to a channel etc — in other words, you have to store the result or otherwise make use of it.
All of such operations do copy the value, even though it does not "look" like this; consider:
package main
import (
"fmt"
)
type A struct {
X int
}
type B struct {
X int
}
func (b B) Whatever() {
fmt.Println(b.X)
}
func main() {
a := A{X: 42}
B(a).Whatever()
b := B(a)
b.Whatever()
}
Here, the first type conversion in main does not look like a memory copy, but the resulting value will serve as a receiver in the call to B.Whatever and will be physically copied there.
The second type conversion stores the result in a variable (and then copies it again when a method is called).
Reasoning about such things is easy in Go because everything, always, is passed by value (and pointers are values, too).
It may be worth adding that variables in Go do not store the type of the value they hold, so a type conversion cannot mutate the type of a variable "in place". Values do not have type information stored in them, either. This basically means that type conversions are the compiler's concern: it knows the types of all the participating values and variables and performs type checking.
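The copying conversions mentioned above (string to []byte and string to []rune) can be observed directly. This is a small sketch; the helper name copyOnConvert is invented for the example.

```go
package main

import "fmt"

// copyOnConvert converts s to []byte, mutates the slice, and returns the
// original string together with the modified copy.
// It assumes s is non-empty.
func copyOnConvert(s string) (string, string) {
	b := []byte(s) // new backing array; the bytes of s are copied into it
	b[0] = 'J'
	return s, string(b)
}

func main() {
	orig, changed := copyOnConvert("hello")
	fmt.Println(orig, changed) // hello Jello

	// string -> []rune decodes UTF-8, so byte length and rune count can differ.
	s := "h\u00e9llo" // the second character takes two bytes in UTF-8
	fmt.Println(len(s), len([]rune(s))) // 6 5
}
```

The original string is untouched after the slice is modified, which is only possible because the conversion copied the data.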

Is it better to use pointer for non-primitive types in struct fields in Go

I am working on a project which processes some data, and I am wondering whether it is better to use pointers for non-primitive typed fields of a struct.
What I've found is that the reason for using a pointer is that nil can be used as a zero value; is this the only reason to use a pointer?
For example, I am going to store time.Time in my struct and it cannot be nil, so is it better to use a non-pointer field?
So is it okay to use
type A struct {
CreatedAt time.Time
}
rather than
type A struct {
CreatedAt *time.Time
}
when Now is not going to be nil?
Not sure I understand the question. In the case of "Now" I would make it a function of the struct i.e.:
type A struct{}
func (a A) Now() time.Time { return time.Now(); }
otherwise what does Now mean? Now is constantly changing.
There are great blogs on when to use pointers
The short version: it doesn't really depend on whether the value can be nil, but more on memory and concurrency. Passing a pointer copies only the address, which uses less memory and can be faster for large structs, but it also means that changing the value in one goroutine can be very dangerous, because the same value may be referenced in another goroutine, causing race conditions and unexpected behavior.
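For the time.Time field from the question, a rough sketch of the two options might look like this. Record and OptionalRecord are hypothetical types; the only library call relied on is time.Time's IsZero method.

```go
package main

import (
	"fmt"
	"time"
)

// Record is a hypothetical type whose timestamp is always meaningful.
type Record struct {
	CreatedAt time.Time
}

// OptionalRecord is a hypothetical type whose timestamp may be absent.
type OptionalRecord struct {
	DeletedAt *time.Time
}

func main() {
	var r Record
	fmt.Println(r.CreatedAt.IsZero()) // true: the zero time marks "not set"

	r.CreatedAt = time.Date(2020, 1, 2, 3, 4, 5, 0, time.UTC)
	fmt.Println(r.CreatedAt.IsZero()) // false

	var o OptionalRecord
	fmt.Println(o.DeletedAt == nil) // true: nil marks "absent"
}
```

If the field can never be missing, the value form avoids nil checks altogether; the pointer form is mainly useful when "absent" has to be distinguishable from every valid time.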
I'm not really a professional or know the ins and outs of Go, so take everything I say with some grain of salt.
But as I am understanding it, you should most likely use pointers.
This is because every time you use a non-pointer type, the whole struct will be part of your struct in memory. As a consequence, you can't share a single instance of a struct between multiple structs - every single one gets a copy of your original struct.
Here's a small example:
// This struct is 2x64 bits (16 bytes) in size
type MyStruct struct {
	A uint64
	B uint64
}
// This struct holds 32 + 2x64 bits of data; alignment padding after C
// typically makes it 24 bytes on 64-bit platforms
type MyOtherStruct struct {
	C uint32
	Parent MyStruct
}
// This struct holds 32 bits plus one address; with padding it is
// typically 16 bytes on 64-bit platforms
type MyPointerStruct struct {
	D uint32
	Parent *MyStruct
}
But apart from memory concerns, there is also a performance hit if your inner struct is very big, because every time you set the inner struct the whole memory has to be copied to your instance.
However, you have to be careful whether you are dealing with interfaces or structs. At runtime, an interface is represented as a type with two fields: a reference to the actual (runtime) type and a reference to the actual instance.
So I, with my unprofessional opinion, would recommend not using pointers if you have interface types, because otherwise the CPU has to dereference twice (once to get the interface reference, and then again to get the instance of the interface).
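The sharing behaviour described in this answer (a value field gets its own copy, a pointer field refers to the shared instance) can be sketched as follows. Inner, ByValue, ByPointer and demo are illustrative names only.

```go
package main

import "fmt"

type Inner struct{ N int }

type ByValue struct{ Parent Inner }    // holds its own copy of Inner
type ByPointer struct{ Parent *Inner } // refers to a shared Inner

// demo changes the original Inner after both holders were built and
// returns what each holder observes afterwards.
func demo() (valueSees, pointerSees int) {
	in := Inner{N: 1}
	v := ByValue{Parent: in}
	p := ByPointer{Parent: &in}
	in.N = 2
	return v.Parent.N, p.Parent.N
}

func main() {
	fmt.Println(demo()) // 1 2
}
```

Which behaviour is wanted depends on the design: the copy is isolated from later changes, while the pointer sees them (and needs care when several goroutines are involved).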

Can I Use the Address of a returned value? [duplicate]

What's the cleanest way to handle a case such as this:
func a() string {
/* doesn't matter */
}
var b *string = &a()
This generates the error:
cannot take the address of a()
My understanding is that Go automatically promotes a local variable to the heap if its address is taken. Here it's clear that the address of the return value is to be taken. What's an idiomatic way to handle this?
The address operator returns a pointer to something having a "home", e.g. a variable. The value of the expression in your code is "homeless". If you really need a *string, you'll have to do it in 2 steps:
tmp := a(); b := &tmp
Note that while there are completely valid use cases for *string, many times it's a mistake to use them. In Go string is a value type, but a cheap one to pass around (a pointer and an int). String's value is immutable, changing a *string changes where the "home" points to, not the string value, so in most cases *string is not needed at all.
See the relevant section of the Go language spec. & can only be used on:
Something that is addressable: variable, pointer indirection, slice indexing operation, field selector of an addressable struct, array indexing operation of an addressable array; OR
A composite literal
What you have is neither of those, so it doesn't work.
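As a rough illustration of that list, the sketch below takes the address of each kind of operand the spec allows and shows the function-call case commented out. The names point, get and addressable are invented for the example.

```go
package main

import "fmt"

type point struct{ X, Y int }

func get() int { return 7 }

// addressable takes the address of each allowed kind of operand and
// returns the values read back through the pointers.
func addressable() []int {
	v := 1
	s := []int{10, 20}
	pt := point{X: 1, Y: 2}
	arr := [2]int{5, 6}

	p1 := &v           // variable
	p2 := &s[1]        // slice indexing operation
	p3 := &pt.Y        // field selector of an addressable struct
	p4 := &arr[0]      // indexing of an addressable array
	p5 := &point{X: 3} // composite literal

	// p6 := &get() // does not compile: a call result is not addressable
	tmp := get()
	p6 := &tmp

	return []int{*p1, *p2, *p3, *p4, p5.X, *p6}
}

func main() {
	fmt.Println(addressable()) // [1 20 2 5 3 7]
}
```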
I'm not even sure what it would mean even if you could do it. Taking the address of the result of a function call? Usually, you pass a pointer of something to someone because you want them to be able to assign to the thing pointed to, and see the changes in the original variable. But the result of a function call is temporary; nobody else "sees" it unless you assign it to something first.
If the purpose of creating the pointer is to create something with a dynamic lifetime, similar to new() or taking the address of a composite literal, then you can assign the result of the function call to a variable and take the address of that.
In the end you are proposing that Go should allow you to take the address of any expression, for example:
i,j := 1,2
var p *int = &(i+j)
println(*p)
The current Go compiler prints the error: cannot take the address of i + j
In my opinion, allowing the programmer to take the address of any expression:
Doesn't seem to be very useful (that is: it seems to have very small probability of occurrence in actual Go programs).
It would complicate the compiler and the language spec.
It seems counterproductive to complicate the compiler and the spec for little gain.
I recently was tied up in knots about something similar.
First talking about strings in your example is a distraction, use a struct instead, re-writing it to something like:
func a() MyStruct {
/* doesn't matter */
}
var b *MyStruct = &a()
This won't compile because you can't take the address of a(). So do this:
func a() MyStruct {
/* doesn't matter */
}
tmpA := a()
var b *MyStruct = &tmpA
This will compile, but you've returned a MyStruct on the stack, allocated sufficient space on the heap to store a MyStruct, then copied the contents from the stack to the heap. If you want to avoid this, then write it like this:
func a2() *MyStruct {
/* doesn't matter as long as MyStruct is created on the heap (e.g. use 'new') */
}
var a *MyStruct = a2()
Copying is normally inexpensive, but those structs might be big. Even worse when you want to modify the struct and have it 'stick' you can't be copying then modifying the copies.
Anyway, it gets all the more fun when you're using a return type of interface{}. The interface{} can be the struct or a pointer to a struct. The same copying issue comes up.
You can't take the address of the result directly when assigning to a new variable, but there is a way to do this without a temporary variable: simply pre-declare your "b" pointer, which is the step you missed:
func a() string {
return "doesn't matter"
}
b := new(string) // b is a pointer to a blank string (the "zeroed" value)
*b = a() // b is now a pointer to the result of `a()`
*b dereferences the pointer and directly accesses the memory area which holds your data (typically on the heap).
Play with the code: https://play.golang.org/p/VDhycPwRjK9
Yeah, it can be annoying when APIs require the use of *string inputs even though you’ll often want to pass literal strings to them.
For this I make a very tiny function:
// Return pointer version of string
func p(s string) *string {
return &s
}
and then instead of trying to call foo("hi") and getting the dreaded cannot use "hi" (type string) as type *string in argument to foo, I just wrap the argument in a call to p():
foo(p("hi"))
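On Go 1.18 or later, the same helper can be written once for every type with a type parameter. This is only a sketch: ptr and greet are hypothetical names, and greet stands in for any API that insists on a *string.

```go
package main

import "fmt"

// ptr returns a pointer to a copy of v. Requires Go 1.18+ (generics).
func ptr[T any](v T) *T {
	return &v
}

// greet is a hypothetical API that takes an optional *string.
func greet(name *string) string {
	if name == nil {
		return "hello, stranger"
	}
	return "hello, " + *name
}

func main() {
	fmt.Println(greet(ptr("gopher"))) // hello, gopher
	fmt.Println(greet(nil))           // hello, stranger
	fmt.Println(*ptr(42))             // 42
}
```

This works for the same reason p does: the parameter v is an ordinary variable, so its address can be taken.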
a() doesn't point to a variable as it is on the stack. You can't point to the stack (why would you?).
You can do that if you want
va := a()
b := &va
But what you really want to achieve is somewhat unclear.
At the time of writing this, none of the answers really explain the rationale for why this is the case.
Consider the following:
func main() {
m := map[int]int{}
val := 1
m[0] = val
v := &m[0] // won't compile, but let's assume it does
delete(m, 0)
fmt.Println(v)
}
If this code snippet actually compiled, what would v point to!? It's a dangling pointer since the underlying object has been deleted.
Given this, it seems like a reasonable restriction to disallow addressing temporaries.
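Since &m[0] is rejected, the usual ways to update a map element are to copy it out and store it back, or to keep pointers in the map. A minimal sketch, with counter, bumpByCopy and bumpByPointer as made-up names:

```go
package main

import "fmt"

type counter struct{ N int }

// bumpByCopy updates a struct stored by value: copy out, modify, store back.
func bumpByCopy(m map[string]counter, k string) {
	c := m[k] // c is a copy of the element, so it is addressable and mutable
	c.N++
	m[k] = c
}

// bumpByPointer updates through a stored pointer; no write-back is needed.
// It assumes k is present; a missing key would yield a nil pointer.
func bumpByPointer(m map[string]*counter, k string) {
	m[k].N++
}

func main() {
	byValue := map[string]counter{"a": {N: 1}}
	bumpByCopy(byValue, "a")
	fmt.Println(byValue["a"].N) // 2

	byPtr := map[string]*counter{"a": {N: 1}}
	bumpByPointer(byPtr, "a")
	fmt.Println(byPtr["a"].N) // 2
}
```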
guess you need help from More effective Cpp ;-)
Temp obj and rvalue
“True temporary objects in C++ are invisible - they don't appear in your source code. They arise whenever a non-heap object is created but not named. Such unnamed objects usually arise in one of two situations: when implicit type conversions are applied to make function calls succeed and when functions return objects.”
And from Primer Plus
lvalue is a data object that can be referenced by address through user (named object). Non-lvalues include literal constants (aside from the quoted strings, which are represented by their addresses), expressions with multiple terms, such as (a + b).
In Go lang, string literal will be converted into StrucType object, which will be a non-addressable temp struct object. In this case, string literal cannot be referenced by address in Go.
Well, the last but not the least, one exception in go, you can take the address of the composite literal. OMG, what a mess.

Does assigning value to interface copy anything?

I've been trying to wrap my head around the concept of interfaces in Go. Reading this and this helped a lot.
The only thing that makes me uncomfortable is the syntax. Have a look at the example below:
package main
import "fmt"
type Interface interface {
String() string
}
type Implementation int
func (v Implementation) String() string {
return fmt.Sprintf("Hello %d", v)
}
func main() {
var i Interface
impl := Implementation(42)
i = impl
fmt.Println(i.String())
}
My issue is with i = impl. Based on the fact that an interface instance actually holds a pointer reference to the actual data, it would feel more natural for me to do i = &impl. Usually assignment of non-pointer when not using & will make a full memory copy of the data, but when assigning to interfaces this seem to side-step this and instead simply (behind the scenes) assign the pointer to the interface value. Am I right? That is, the data for the int(42) will not be copied in memory?
The data for int(42) will be copied. Try this code:
func main() {
var i Interface
impl := Implementation(42)
i = impl
fmt.Println(i.String())
impl = Implementation(91)
fmt.Println(i.String())
}
You'll find that the second i.String() still shows 42. Perhaps one of the trickier aspects of Go is that method receivers can be pointers as well.
func (v *Implementation) String() string {
return fmt.Sprintf("Hello %d", *v)
}
// ...
i = &impl
This is what you want if the interface should hold a pointer to the original value of impl. "Under the hood" an interface is a struct that either holds a pointer to some data, or the data itself (and some type metadata that we can ignore for our purposes). The data itself is stored if its size is less than or equal to one machine word, whether it be a pointer, struct, or other value (this held for Go releases before 1.4; since Go 1.4 an interface value always holds a pointer, so non-pointer values are stored behind one).
Otherwise it will be a pointer to some data, but here's the tricky part: if the type implementing the interface is a struct the pointer will be to a copy of the struct, not the struct assigned to the interface variable itself. Or at least semantically the user can think of it as such, optimizations may allow the value to not be copied until the two diverge (e.g. until you call String or reassign impl).
In short: assigning to an interface can semantically be thought of as a copy of the data that implements the interface. If this is a pointer to a type, it copies the pointer, if it's a big struct, it copies the big struct. The particulars of interfaces using pointers under the hood are for reasons of garbage collection and making sure the stack expands by predictable amounts. As far as the developer is concerned, they should be thought of as semantic copies of the specific instance of the implementing type assigned.
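The same copy semantics can be seen with a struct. The sketch below assigns both a value and a pointer to an interface and then changes the original; Namer, user and observe are illustrative names.

```go
package main

import "fmt"

type Namer interface{ Name() string }

type user struct{ name string }

// Name has a value receiver, so both user and *user satisfy Namer.
func (u user) Name() string { return u.name }

// observe changes the original after both assignments and returns what
// each interface value reports.
func observe() (fromValue, fromPointer string) {
	u := user{name: "before"}
	var byValue Namer = u    // the interface holds a copy of u
	var byPointer Namer = &u // the interface holds a pointer to u
	u.name = "after"
	return byValue.Name(), byPointer.Name()
}

func main() {
	fmt.Println(observe()) // before after
}
```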

When should `new` be used in Go?

It seems pointless to use it with primitive language constructs, as you can't specify any sort of values
func main() {
	y := new([]float64)
	fmt.Printf("Len = %d", len(*y)) // => Len = 0
}
For structs it makes a bit more sense, but what's the difference between saying y := new(my_struct) and the seemingly more concise y := &my_struct{}?
And since anything you create is based on those primitives, they will be initialized to the said zero values. So what's the point? When would you ever want to use new()?
Sorry for the very-beginner question, but the documentation isn't always that clear.
Using new on a slice or map type, as in your code example, only gives you a pointer to a nil slice or map, which is rarely what you want. To get an initialized slice or map you must use make instead: make([]float64, 100)
Both new(MyStruct) and &MyStruct{} do the same thing: each yields a pointer to a zeroed MyStruct, and the compiler decides where the value lives depending on whether the pointer escapes. Sometimes the code just expresses its intent better in one style or the other.
Go does not have built-in support for constructors, so usually you would wrap the call to new into a function, for example NewMyStruct() which does all the necessary initialization. It also makes it possible to initialize private fields or hide the struct behind an interface, to prevent users of the object from directly messing with its internals. Also evolving the structure of the struct is easier that way, when you don't need to change all of its users when adding/removing/renaming/reordering fields.
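A constructor function of that kind might look roughly like this. Server, its fields and the default of 100 are all made up for the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// Server is a hypothetical type with unexported fields, so code in other
// packages has to go through NewServer.
type Server struct {
	addr    string
	maxConn int
}

// NewServer validates its input and fills in defaults.
func NewServer(addr string) (*Server, error) {
	if addr == "" {
		return nil, errors.New("addr must not be empty")
	}
	return &Server{addr: addr, maxConn: 100}, nil
}

func main() {
	s, err := NewServer("localhost:8080")
	fmt.Println(s.addr, s.maxConn, err) // localhost:8080 100 <nil>

	_, err = NewServer("")
	fmt.Println(err) // addr must not be empty
}
```

Because callers never write a Server literal themselves, fields can be added or renamed later without touching them.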
make works only for maps, slices and channels, and composite literals like type{} work only for structs, arrays, slices, and maps. For other types, you'll have to use new to get a pointer to a newly allocated instance (if you don't want to use the longer var v T; f(&v)).
I guess this is useful if you want to initialize a struct:
type foo struct {
bar *int
}
v := foo{bar: new(int)}
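To put new, &T{} and make side by side, here is a small sketch; config and allocations are hypothetical names.

```go
package main

import "fmt"

type config struct{ Retries int }

// allocations exercises the different allocation forms and returns a few
// observations about the results.
func allocations() (int, int, bool, int) {
	p := new(int)       // *int pointing at a zero int
	c1 := new(config)   // *config with all fields zeroed
	c2 := &config{}     // same result as new(config)
	sp := new([]int)    // pointer to a nil slice, not an initialized one
	s := make([]int, 3) // initialized slice of length 3
	return *p, c1.Retries + c2.Retries, *sp == nil, len(s)
}

func main() {
	fmt.Println(allocations()) // 0 0 true 3
}
```

new is the only one of the three that works directly for types such as int, which have no composite literal and are not handled by make.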

Resources