Map types are reference types. var m map[string]int doesn't point to an initialized map. What doe this mean? - go

I have read on the golang blog: https://blog.golang.org/go-maps-in-action that:
var m map[string]int
Map types are reference types, like pointers or slices, and so the
value of m above is nil; it doesn't point to an initialized map. A nil
map behaves like an empty map when reading, but attempts to write to a
nil map will cause a runtime panic; don't do that. To initialize a
map, use the built in make function:
m = make(map[string]int)
The make function allocates and initializes a hash map data structure
and returns a map value that points to it.
I have a hard time understanding some parts of this:
What does var m map[string]int do?
Why do I need to write m = make(map[string]int) but not i = make(int)

What does var m map[string]int do?
It tells the compiler that m is a variable of type map[string]int, and assigns "The Zero Value" of the type map[string] int to m (that's why m is nil as nil is The Zero Value of any map).
Why do I need to write m = make(map[string]int) but not i = make(int)
You don't need to. You can create a initialized map also like this:
m = map[string]int{}
which does exactly the same.
The difference between maps and ints is: A nil map is perfectly fine. E.g. len() of a nil map works and is 0. The only thing you cannot do with a nil map is store key-value-pairs. If you want to do this you'll have to prepare/initialize the map. This preparation/initialization in Go is done through the builtin make (or by a literal map as shown above). This initialization process is not needed for ints. As there are no nil ints this initialization would be total noise.
Note that you do not initialize the variable m: The variable m is a map of strings to ints, initialized or not. Like i is a variable for ints. Now ints are directly usable while maps require one more step because the language works that way.

What does var m map[string]int do?
You can think about it like pointer with nil value, it does not point to anything yet but able to point to concrete value.
Why do I need to write m = make(map[string]int) but not i = make(int)
https://golang.org/doc/effective_go.html#allocation_make
Back to allocation. The built-in function make(T, args) serves a purpose different from new(T). It creates slices, maps, and channels only, and it returns an initialized (not zeroed) value of type T (not *T). The reason for the distinction is that these three types represent, under the covers, references to data structures that must be initialized before use. A slice, for example, is a three-item descriptor containing a pointer to the data (inside an array), the length, and the capacity, and until those items are initialized, the slice is nil. For slices, maps, and channels, make initializes the internal data structure and prepares the value for use. For instance,
make([]int, 10, 100)
allocates an array of 100 ints and then creates a slice structure with length 10 and a capacity of 100 pointing at the first 10 elements of the array. (When making a slice, the capacity can be omitted; see the section on slices for more information.) In contrast, new([]int) returns a pointer to a newly allocated, zeroed slice structure, that is, a pointer to a nil slice value.
These examples illustrate the difference between new and make.
var p *[]int = new([]int) // allocates slice structure; *p == nil; rarely useful
var v []int = make([]int, 100) // the slice v now refers to a new array of 100 ints
// Unnecessarily complex:
var p *[]int = new([]int)
*p = make([]int, 100, 100)
// Idiomatic:
v := make([]int, 100)
Remember that make applies only to maps, slices and channels and does not return a pointer. To obtain an explicit pointer allocate with new or take the address of a variable explicitly.

All words have the same length of 32 bits (4 bytes) or 64 bits (8 bytes),
depending on the processor and the operating system. They are identified by their memory address (represented as a hexadecimal number).
All variables of primitive types like int, float, bool, string ... are value types, they point directly to the values contained in the memory. Also composite types like arrays and structs are value types. When assigning with = the value of a value type to another variable: j = i, a copy of the original value i is made in memory.
More complex data which usually needs several words are treated as reference types. A reference type variable r1 contains the address (a number) of the memory location where the value of r1 is stored (or at least the 1st word of it):
For reference types when assigning r2 = r1, only the reference (the address) is copied and not the value!!. If the value of r1 is modified, all references of that value (like r1 and r2) will be reflected.
In Go pointers are reference types, as well as slices, maps and channels. The variables that are referenced are stored in the heap, which is garbage collected.
In the light of the above statements it's clear why the article states:
To initialize a map, use the built in make function.
The make function allocates and initializes a hash map data structure and returns a map value that points to it. This means you can write into it, compare to
var m map[string]int
which is readable, resulting a nil map, but an attempt to write to a nil map will cause a runtime panic. This is the reason why it's important to initialize the map with make.
m = make(map[string]int)

Related

Difference in Go between allocating memory by new(Type) and &Type{}

Consider the following example:
type House struct{}
func main() {
house1 := new(House)
house2 := &House{}
fmt.Printf("%T | %T\n", house1, house2)
}
Output: *main.House | *main.House
Both assignments produce a pointer to type House (from package main).
From the go docu of new:
// The new built-in function allocates memory. The first argument is a type,
// not a value, and the value returned is a pointer to a newly
// allocated zero value of that type.
Is memory allocation in both assignments technically the same? Is there a best practice?
Is memory allocation in both assignments technically the same?
I'm not going to talk about implementation details, since those may or may not change.
From the point of view of the language specifications, there is a difference.
Allocating with new(T) follows the rules of allocation (my own note in [italics]):
The built-in function new takes a type T, allocates storage for a variable of that type at run time, and returns a value of type *T pointing to it. The variable is initialized as described in the section on initial values. [its zero value]
Allocating with a composite literal as &T{} follows the rules of composite literals and addressing:
Composite literals construct values for structs, arrays, slices, and maps and create a new value each time they are evaluated.
Taking the address of a composite literal generates a pointer to a unique variable initialized with the literal's value.
So both new(T) and &T{} allocate storage and yield values of type *T. However:
new(T) accepts any type, including predeclared identifiers as int and bool, which may come in handy if you just need to initialize such vars with a one-liner:
n := new(int) // n is type *int and points to 0
b := new(bool) // b is type *bool and points to false
new (quoting Effective Go) "does not initialize1 that memory, it only zeros it". I.e. the memory location will have the type's zero value.
composite literals can be used only for structs, arrays, slices and maps
composite literals can construct non-zero values by explicitly initializing struct fields or array/slice/map items
Bottom line: as Effective Go points out (same paragraph as above):
if a composite literal contains no fields at all, it creates a zero value for the type. The expressions new(T) and &T{} are equivalent.
[1]: there's an inconsistency in the use of the term "initialization" in the specs and Effective Go. The take-away is that with new you can only yield a zero-value, whereas a composite literal allows you to construct non-zero values. And obviously the specs are the source of truth.

Properties of slice of structs as function argument without allocating slice

I have a slice of structs that looks like:
type KeyValue struct {
Key uint32
Value uint32
}
var mySlice []Keyvalue
This slice mySlice can have a variable length.
I would like to pass all elements of this slice into a function with this signature:
takeKeyValues(keyValue []uint32)
Now I could easily allocate a slice of []uint32, populate it with all the keys/values of the elements in mySlice and then pass that slice as an argument into takeKeyValues()...
But I'm trying to find a way of doing that without any heap allocations, by directly passing all the keys/values on the stack. Is there some syntax trick which allows me to do that?
There is no safe way to arbitrarily reinterpret the memory layout of the your data. It's up to you whether the performance gain is worth the use of an unsafe type conversion.
Since the fields of KeyValue are of equal size and the struct has no padding, you can convert the underlying array of KeyValue elements to an array of uint32.
takeKeyValues((*[1 << 30]uint32)(unsafe.Pointer(&s[0]))[:len(s)*2])
https://play.golang.org/p/Jjkv9pdFITu
As an alternative to JimB's answer, you can also use the reflect (reflect.SliceHeader) and unsafe (unsafe.Pointer) packages to achieve this.
https://play.golang.org/p/RLrMgoWgI7t
s := []KeyValue{{0, 100}, {1, 101}, {2, 102}, {3, 103}}
var data []uint32
sh := (*reflect.SliceHeader)(unsafe.Pointer(&data))
sh.Data = uintptr(unsafe.Pointer(&s[0]))
sh.Len = len(s) * 2
sh.Cap = sh.Len
fmt.Println(data)
I a bit puzzled by what you want. If the slice is dynamic, you have to grow it dynamically as you add elements to it (at the call site).
Go does not have alloca(3), but I may propose another approach:
Come up with an upper limit of the elements you'd need.
Declare variable of the array type of that size (not a slice).
"Patch" as many elements you want (starting at index 0) in that array.
"Slice" the array and pass the slice to the callee.
Slicing an array does not copy memory — it just takes the address of the array's first element.
Something like this:
func caller(actualSize int) {
const MaxLen = 42
if actualSize > MaxLen {
panic("actualSize > MaxLen")
}
var kv [MaxLen]KeyValue
for i := 0; i < actualSize; i++ {
kv[i].Key, kv[i].Value = 100, 200
}
callee(kv[:actualSize])
}
func callee(kv []KeyValue) {
// ...
}
But note that when the Go compiler sees var kv [MaxLen]KeyValue, it does not have to allocate it on the stack. Basically, I beleive the language spec does not tell anything about the stack vs heap.

cgo: How to pass struct array from c to go

The C part:
struct Person {...}
struct Person * get_team(int * n)
The Go part:
n := C.int(0)
var team *C.struct_Person = C.get_team(&n)
defer C.free(unsafe.Pointer(team))
I can get the first element of the array in this way. But how to get the whole array with n elements?
and how to free them safely?
First, even though you’re using Go, when you add cgo there is no longer any "safe". It's up to you to determine when and how you free the memory, just as if you were programming in C.
The easiest way to use a C array in go is to convert it to a slice through an array:
team := C.get_team()
defer C.free(unsafe.Pointer(team))
teamSlice := (*[1 << 30]C.struct_Person)(unsafe.Pointer(team))[:teamSize:teamSize]
The max-sized array isn't actually allocated, but Go requires constant size arrays, and 1<<30 is going to be large enough. That array is immediately converted to a slice, with the length and capacity properly set.

Go, is this everything you need to know for assignments and initialization?

It is just mind boggling how many different ways Go has for variable initialization. May be either i don't understand this completely or Go is a big after thought. Lot of things don't feel natural and looks like they added features as they found them missing. Initialization is one of them.
Here is running app on Go playground showing different ways of initialization
Here is what i understand
There are values and pointers. Values are initiated using var = or :=.
:= only works inside the methods
To Create value and reference you use new or &. And they only work on composite types.
There are whole new ways of creating maps and slices
Create slice and maps using either make or var x []int. Noticing there is no = or :=
Are there any easy way to understand all this for newbies? Reading specs gives all this in bits and pieces everywhere.
First, to mention some incorrect statements:
To Create value and reference you use new or &. And they only work on composite types.
new() works on everything. & doesn't
Create slice and maps using either make or var x []int. Noticing there is no = or :=
You can actually use := with x := []int{} or even x := []int(nil). However, the first is only used when you want to make a slice of len 0 that is not nil and the second is never used because var x []int is nicer.
Lets start from the beginning. You initialize variables with var name T. Type inference was added because you can type less and avoid worrying about the name of a type a function returns. So you are able to do var name = f().
If you want to allocate a pointer, you would need to first create the variable being pointed to and then get its pointer:
var x int
var px = &x
Requiring two statements can be bothersome so they introduced a built-in function called new(). This allows you to do it in one statement: var px = new(int)
However, this method of making new pointers still has an issue. What if you are building a massive literal that has structs that expect pointers? This is a very common thing to do. So they added a way (only for composite literals) to handle this.
var x = []*T{
&T{x: 1},
&T{x: 2},
}
In this case, we want to assign different values to fields of the struct the pointers reference. Handling this with new() would be awful so it is allowed.
Whenever a variable is initialized, it is always initialized to its zero value. There are no constructors. This is a problem for some of the builtin types. Specifically slices, maps, and channels. These are complicated structures. They require parameters and initialization beyond memory being set to zero. Users of Go who need to do this simply write initialization functions. So, that is what they did, they wrote make(). neW([]x) returns a pointer to a slice of x while make([]x, 5) returns a slice of x backed by an array of length 5. New returns a pointer while make returns a value. This is an important distinction which is why it is separate.
What about :=? That is a big clusterfuck. They did that to save on typing, but it led to some odd behaviors and the tacked on different rules until it became somewhat usable.
At first, it was a short form of var x = y. However, it was used very very often with multiple returns. And most of those multiple returns were errors. So we very often had:
x, err := f()
But multiple errors show up in your standard function so people started naming things like:
x, err1 := f()
y, err2 := g()
This was ridiculous so they made the rule that := only re-declares if it is not already declared in that scope. But that defeats the point of := so they also tacked on the rule that at least one must be newly declared.
Interestingly enough, this solves 95% of its problems and made it usable. Although the biggest problem I run into with it is that all the variables on the lefthand side must be identifiers. In other words, the following would be invalid because x.y is not an identifier:
x.y, z := f()
This is a leftover of its relation to var which can't declare anything other than an identifier.
As for why := doesn't work outside a function's scope? That was done to make writing the compiler easier. Outside a function, every part starts with a keyword (var, func, package, import). := would mean that declaration would be the only one that doesn't start with a keyword.
So, that is my little rant on the subject. The bottom line is that different forms of declaration are useful in different areas.
Yeah, this was one of those things that I found confusing early as well. I've come up with my own rules of thumb which may or may not be best practices, but they've served me well. Tweaked after comment from Stephen about zero values
Use := as much as possible, let Go infer types
If you just need an empty slice or map (and don't need to set an initial capacity), use {} syntax
s := []string{}
m := map[string]string{}
The only reason to use var is to initialize something to a zero value (pointed out by Stephen in comments) (and yeah, outside of functions you'll need var or const as well):
var ptr *MyStruct // this initializes a nil pointer
var t time.Time // this initializes a zero-valued time.Time
I think your confusion is coming from mixing up the type system with declaration and initialization.
In Go, variables can be declared in two ways: using var, or using :=. var only declares the variable, while := also assigns an initial value to it. For example:
var i int
i = 1
is equivalent to
i := 1
The reason this is possible is that := simply assumes that the type of i is the same as the type of the expression it's being initialized to. So, since the expression 1 has type int, Go knows that i is being declared as an integer.
Note that you can also explicitly declare and initialize a variable with the var keyword as well:
var i int = 1
Those are the only two delcaration/initialization constructs in Go. You're right that using := in the global scope is not allowed. Other than that, the two are interchangable, as long as Go is able to guess what type you're using on the right-hand-side of the := (which is most of the time).
These constructs work with any types. The type system is completely independent from the declaration/initialization syntax. What I mean by that is that there are no special rules about which types you can use with the declaration/initialization syntax - if it's a type, you can use it.
With that in mind, let's walk through your example and explain everything.
Example 1
// Value Type assignments [string, bool, numbers]
// = & := Assignment
// Won't work on pointers?
var value = "Str1"
value2 := "Str2"
This will work with pointers. The key is that you have to be setting these equal to expressions whose types are pointers. Here are some expressions with pointer types:
Any call to new() (for example, new(int) has type *int)
Referencing an existing value (if i has type int, then &i has type *int)
So, to make your example work with pointers:
tmp := "Str1"
var value = &tmp // value has type *int
value2 := new(string)
*value2 = "Str2"
Example 2
// struct assignments
var ref1 = refType{"AGoodName"}
ref2 := refType{"AGoodName2"}
ref3 := &refType{"AGoodName2"}
ref4 := new(refType)
The syntax that you use here, refType{"AGoodName"}, is used to create and initialize a struct in a single expression. The type of this expression is refType.
Note that there is one funky thing here. Normally, you can't take the address of a literal value (for example, &3 is illegal in Go). However, you can take the address of a literal struct value, like you do above with ref3 := &refType{"AGoodName2"}. This seems pretty confusing (and certainly confused me at first). The reason it's allowed is that it's actually just a short-hand syntax for calling ref3 := new(refType) and then initializing it by doing *ref3 = refType{"AGoodName2"}.
Example 3
// arrays, totally new way of assignment now, = or := now won't work now
var array1 [5]int
This syntax is actually equivalent to var i int, except that instead of the type being int, the type is [5]int (an array of five ints). If we wanted to, we could now set each of the values in the array separately:
var array1 [5]int
array1[0] = 0
array1[1] = 1
array1[2] = 2
array1[3] = 3
array1[4] = 4
However, this is pretty tedious. Just like with integers and strings and so on, arrays can have literal values. For example, 1 is a literal value with the type int. The way you create a literal array value in Go is by naming the type explicitly, and then giving the values in brackets (similar to a struct literal) like this:
[5]int{0, 1, 2, 3, 4}
This is a literal value just like any other, so we can use it as an initializer just the same:
var array1 [5]int = [5]int{0, 1, 2, 3, 4}
var array2 = [5]int{0, 1, 2, 3, 4}
array3 := [5]int{0, 1, 2, 3, 4}
Example 4
// slices, some more ways
var slice1 []int
Slices are very similar to arrays in how they are initialized. The only difference is that they aren't fixed at a particular length, so you don't have to give a length parameter when you name the type. Thus, []int{1, 2, 3} is a slice of integers whose length is initially 3 (although could be changed later). So, just like above, we can do:
var slice1 []int = []int{1, 2, 3}
var slice2 = []int{1, 2, 3}
slice3 := []int{1, 2, 3}
Example 5
var slice2 = new([]int)
slice3 := new([]int)
Reasoning about types can get tricky when the types get complex. As I mentioned above, new(T) returns a pointer, which has type *T. This is pretty straightforward when the type is an int (ie, new(int) has type *int). However, it can get confusing when the type itself is also complex, like a slice type. In your example, slice2 and slice3 both have type *[]int. For example:
slice3 := new([]int)
*slice3 = []int{1, 2, 3}
fmt.Println((*slice3)[0]) // Prints 1
You may be confusing new with make. make is for types that need some sort of initialization. For slices, make creates a slice of the given size. For example, make([]int, 5) creates a slice of integers of length 5. make(T) has type T, while new(T) has type *T. This can certainly get confusing. Here are some examples to help sort it out:
a := make([]int, 5) // s is now a slice of 5 integers
b := new([]int) // b points to a slice, but it's not initialized yet
*b = make([]int, 3) // now b points to a slice of 5 integers
Example 6
// maps
var map1 map[int]string
var map2 = new(map[int]string)
Maps, just like slices, need some initialization to work properly. That's why neither map1 nor map2 in the above example are quite ready to be used. You need to use make first:
var map1 map[int]string // Not ready to be used
map1 = make(map[int]string) // Now it can be used
var map2 = new(map[int]string) // Has type *map[int]string; not ready to be used
*map2 = make(map[int]string) // Now *map2 can be used
Extra Notes
There wasn't a really good place to put this above, so I'll just stick it here.
One thing to note in Go is that if you declare a variable without initializing it, it isn't actually uninitialized. Instead, it has a "zero value." This is distinct from C, where uninitialized variables can contain junk data.
Each type has a zero value. The zero values of most types are pretty reasonable. For example, the basic types:
int has zero value 0
bool has zero value false
string has zero value ""
(other numeric values like int8, uint16, float32, etc, all have zero values of 0 or 0.0)
For composite types like structs and arrays, the zero values are recursive. That is, the zero value of an array is an array with all of its entries set to their respective zero values (ie, the zero value of [3]int is [3]int{0, 0, 0} since 0 is the zero value of int).
Another thing to watch out for is that when using the := syntax, certain expressions' types cannot be inferred. The main one to watch out for is nil. So, i := nil will produce a compiler error. The reason for this is that nil is used for all of the pointer types (and a few other types as well), so there's no way for the compiler to know if you mean a nil int pointer, or a nil bool pointer, etc.

Call struct method in range loop

example code (edited code snippet): http://play.golang.org/p/eZV4WL-4N_
Why is it that
for x, _ := range body.Personality {
body.Personality[x].Mutate()
}
successfully mutates the structs' contents, but
for _, pf := range body.Personality{
pf.Mutate()
}
does not?
Is it that range creates new instances of each item it iterates over? because the struct does in fact mutate but it doesn't persist.
The range keyword copies the results of the array, thus making it impossible to alter the
contents using the value of range. You have to use the index or a slice of pointers instead of values
if you want to alter the original array/slice.
This behaviour is covered by the spec as stated here. The crux is that an assignment line
x := a[i] copies the value a[i] to x as it is no pointer. Since range is defined to use
a[i], the values are copied.
Your second loop is roughly equivalent to:
for x := range body.Personality {
pf := body.Personality[x]
pf.Mutate()
}
Since body.Personality is an array of structs, the assignment to pf makes a copy of the struct, and that is what we call Mutate() on.
If you want to range over your array in the way you do in the example, one option would be to make it an array of pointers to structs (i.e. []*PFile). That way the assignment in the loop will just take a pointer to the struct allowing you to modify it.

Resources