I am coming from C# background, and I am little confused with the way of GoLang initialization and zeroing definitions. I think you can guess that this confusion arises from make() and new() functions in Go. What should I expect happening internally when these methods run? What happens when initialization and zeroing happens?
I know that there is an init() function in GoLang that is used to initialize packages. But I think it is different than this.
Anyways, what is the difference between them?
Update
I answered my own question, please check it to see my answer.
I think I have figured it and decided to share what I figured so far.
make() vs. new()
I think I now understand the difference between make() and new(). At first, it was little confusing, but here what I got:
new is simply like new in C# or Java, but since there is no constructor in Go, all the fields (like in Java and C# terminology) will be zeroed. Zeroing means more like defaulting the fields. So if the field type is int, then it will be 0, or if it is a struct, then it will be defaulted to nil, and "" for string types. It is actually similar to C# and Java when there is only parameterless constructor available and you are not setting the members to something else manually.
However, types like map, slice, and channels are different. They are different because they are actually wrapper types that wrap an array type to hold the values behind the scenes. So something like List<T> or ArrayList in C# and Java. But using new is not enough in this situation, because the underlying array should be initialized to an empty array to be usable. Because you cannot add or remove from a field of type array which is nil (or null). Therefore, they provided a make() method to help you to initialize slices and such.
So what happens when you use new() over slices, for instance? Simple: Since the underlying array will be nil, the slice will be pointing at a nil array.
So new() would look like the following C#/Java code:
public class Person{
public string Name;
public int Age;
public Address HomeAddress;
}
var person = new Person();
Console.WriteLine(person.Name); // ""
Console.WriteLine(person.Age); // 0
Console.WriteLine(person.HomeAddress); // null
make(), on the other hand, would look like this for slice,map, and channels:
public class PersonList{
// We are initializing the array so that we can use it.
// Its capacity can increase.
private Person[] _personList = new Person[100];
public void Add(Person p){}
public void Remove(Person p){}
public Person Get(int index){}
}
Initialization vs. Zeroing
Simply speaking, zeroing is a form of initialization. At first, I thought they were different but they are not. Initialization is a more general term, whereas if you are set the fields (properties, etc.) of a struct or a variable to its type default such as 0, nil, "", false, etc., then this is called zeroing. However, you can, for instance, use Composite Literals like hello := Hello{name="world"}, which is similar to var hello = new Hello() {Name = "World"} in C#, then you initialize your Hello object with a name field set to world.
In C#, at the time you say new List<string>(), [the underlying array field is initialized to a new array], and make) is performing a similar operation behind the scenes but as a language construct (built in the language itself):
(http://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs,cf7f4095e4de7646):
So new does zeroing and returns a pointer back. Whereas make() initializes to underlying array to an array with default values for each element and returns the value itself rather than a pointer.
new returns a pointer to the type, and it's the only way to return a pointer to a native type in one step (aka int, float, complex):
intPtr := new(int)
// or
var i int
intPtr := &i
make is for initializing channels, slices and maps.
All variables are zeroed on creation, regardless of how you create them.
The spec about make: https://golang.org/ref/spec#Making_slices_maps_and_channels
The spec about zero value: https://golang.org/ref/spec#The_zero_value
new(T) allocates zeroed storage for a new item of type T and returns its address, a value of type *T: it returns a pointer to a newly allocated zero value of type T, ready for use; it applies to value types like arrays and structs. It is
equivalent to &T{ }
make(T) returns an initialized value of type T; it applies only to the 3 built-in
reference types: slices, maps, and channels.
var p *[]int = new([]int) // *p == nil; with len and cap 0
//or
p := new([]int)
var v []int = make([]int, 10, 50)
//or
v := make([]int, 10, 50)
//This allocates an array of 50 ints and then creates a slice v with length 10 and capacity 50 pointing
//to the first 10 elements of the array
Related
What's the cleanest way to handle a case such as this:
func a() string {
/* doesn't matter */
}
b *string = &a()
This generates the error:
cannot take the address of a()
My understanding is that Go automatically promotes a local variable to the heap if its address is taken. Here it's clear that the address of the return value is to be taken. What's an idiomatic way to handle this?
The address operator returns a pointer to something having a "home", e.g. a variable. The value of the expression in your code is "homeless". if you really need a *string, you'll have to do it in 2 steps:
tmp := a(); b := &tmp
Note that while there are completely valid use cases for *string, many times it's a mistake to use them. In Go string is a value type, but a cheap one to pass around (a pointer and an int). String's value is immutable, changing a *string changes where the "home" points to, not the string value, so in most cases *string is not needed at all.
See the relevant section of the Go language spec. & can only be used on:
Something that is addressable: variable, pointer indirection, slice indexing operation, field selector of an addressable struct, array indexing operation of an addressable array; OR
A composite literal
What you have is neither of those, so it doesn't work.
I'm not even sure what it would mean even if you could do it. Taking the address of the result of a function call? Usually, you pass a pointer of something to someone because you want them to be able to assign to the thing pointed to, and see the changes in the original variable. But the result of a function call is temporary; nobody else "sees" it unless you assign it to something first.
If the purpose of creating the pointer is to create something with a dynamic lifetime, similar to new() or taking the address of a composite literal, then you can assign the result of the function call to a variable and take the address of that.
In the end you are proposing that Go should allow you to take the address of any expression, for example:
i,j := 1,2
var p *int = &(i+j)
println(*p)
The current Go compiler prints the error: cannot take the address of i + j
In my opinion, allowing the programmer to take the address of any expression:
Doesn't seem to be very useful (that is: it seems to have very small probability of occurrence in actual Go programs).
It would complicate the compiler and the language spec.
It seems counterproductive to complicate the compiler and the spec for little gain.
I recently was tied up in knots about something similar.
First talking about strings in your example is a distraction, use a struct instead, re-writing it to something like:
func a() MyStruct {
/* doesn't matter */
}
var b *MyStruct = &a()
This won't compile because you can't take the address of a(). So do this:
func a() MyStruct {
/* doesn't matter */
}
tmpA := a()
var b *MyStruct = &tmpA
This will compile, but you've returned a MyStruct on the stack, allocated sufficient space on the heap to store a MyStruct, then copied the contents from the stack to the heap. If you want to avoid this, then write it like this:
func a2() *MyStruct {
/* doesn't matter as long as MyStruct is created on the heap (e.g. use 'new') */
}
var a *MyStruct = a2()
Copying is normally inexpensive, but those structs might be big. Even worse when you want to modify the struct and have it 'stick' you can't be copying then modifying the copies.
Anyway, it gets all the more fun when you're using a return type of interface{}. The interface{} can be the struct or a pointer to a struct. The same copying issue comes up.
You can't get the reference of the result directly when assigning to a new variable, but you have idiomatic way to do this without the use of a temporary variable (it's useless) by simply pre-declaring your "b" pointer - this is the real step you missed:
func a() string {
return "doesn't matter"
}
b := new(string) // b is a pointer to a blank string (the "zeroed" value)
*b = a() // b is now a pointer to the result of `a()`
*b is used to dereference the pointer and directly access the memory area which hold your data (on the heap, of course).
Play with the code: https://play.golang.org/p/VDhycPwRjK9
Yeah, it can be annoying when APIs require the use of *string inputs even though you’ll often want to pass literal strings to them.
For this I make a very tiny function:
// Return pointer version of string
func p(s string) *string {
return &s
}
and then instead of trying to call foo("hi") and getting the dreaded cannot use "hi" (type string) as type *string in argument to foo, I just wrap the argument in a call to to p():
foo(p("hi"))
a() doesn't point to a variable as it is on the stack. You can't point to the stack (why would you ?).
You can do that if you want
va := a()
b := &va
But what your really want to achieve is somewhat unclear.
At the time of writing this, none of the answers really explain the rationale for why this is the case.
Consider the following:
func main() {
m := map[int]int{}
val := 1
m[0] = val
v := &m[0] // won't compile, but let's assume it does
delete(m, 0)
fmt.Println(v)
}
If this code snippet actually compiled, what would v point to!? It's a dangling pointer since the underlying object has been deleted.
Given this, it seems like a reasonable restriction to disallow addressing temporaries
guess you need help from More effective Cpp ;-)
Temp obj and rvalue
“True temporary objects in C++ are invisible - they don't appear in your source code. They arise whenever a non-heap object is created but not named. Such unnamed objects usually arise in one of two situations: when implicit type conversions are applied to make function calls succeed and when functions return objects.”
And from Primer Plus
lvalue is a data object that can be referenced by address through user (named object). Non-lvalues include literal constants (aside from the quoted strings, which are represented by their addresses), expressions with multiple terms, such as (a + b).
In Go lang, string literal will be converted into StrucType object, which will be a non-addressable temp struct object. In this case, string literal cannot be referenced by address in Go.
Well, the last but not the least, one exception in go, you can take the address of the composite literal. OMG, what a mess.
Let's say we have this kind of a struct (one of the simplest ever):
type some struct{
I uint32
}
And we want to have a variable of that type and to atomically increment in for loop (possibly in another goroutine but now the story is different). I do the following:
q := some{0}
for i := 0; i < 10; i++ {
atomic.AddUint32(&q.I,1) // increment [1]
fmt.Println(q.I)
}
We're getting what we'd expect, so far so good, but if we declare a function for that type as follows:
func (sm some) Add1(){
atomic.AddUint32(&sm.I,1)
}
and call this function in the above sample (line [1]) the value isn't incremented and we just get zeros. The question is obvious - why?
This has to be something basic but since I am new to go I don't realize it.
The Go Programming Language Specification
Calls
In a function call, the function value and arguments are evaluated in
the usual order. After they are evaluated, the parameters of the call
are passed by value to the function and the called function begins
execution. The return parameters of the function are passed by value
back to the calling function when the function returns.
The receiver sm some is passed by value to the method and the copy is discarded when you return from the method. Use a pointer receiver.
For example,
package main
import (
"fmt"
"sync/atomic"
)
type some struct {
I uint32
}
func (sm *some) Add1() {
atomic.AddUint32(&sm.I, 1)
}
func main() {
var s some
s.Add1()
fmt.Println(s)
}
Output:
{1}
Go Frequently Asked Questions (FAQ)
When are function parameters passed by value?
As in all languages in the C family, everything in Go is passed by
value. That is, a function always gets a copy of the thing being
passed, as if there were an assignment statement assigning the value
to the parameter. For instance, passing an int value to a function
makes a copy of the int, and passing a pointer value makes a copy of
the pointer, but not the data it points to.
Should I define methods on values or pointers?
func (s *MyStruct) pointerMethod() { } // method on pointer
func (s MyStruct) valueMethod() { } // method on value
For programmers unaccustomed to pointers, the distinction between
these two examples can be confusing, but the situation is actually
very simple. When defining a method on a type, the receiver (s in the
above examples) behaves exactly as if it were an argument to the
method. Whether to define the receiver as a value or as a pointer is
the same question, then, as whether a function argument should be a
value or a pointer. There are several considerations.
First, and most important, does the method need to modify the
receiver? If it does, the receiver must be a pointer. (Slices and maps
act as references, so their story is a little more subtle, but for
instance to change the length of a slice in a method the receiver must
still be a pointer.) In the examples above, if pointerMethod modifies
the fields of s, the caller will see those changes, but valueMethod is
called with a copy of the caller's argument (that's the definition of
passing a value), so changes it makes will be invisible to the caller.
By the way, pointer receivers are identical to the situation in Java,
although in Java the pointers are hidden under the covers; it's Go's
value receivers that are unusual.
Second is the consideration of efficiency. If the receiver is large, a
big struct for instance, it will be much cheaper to use a pointer
receiver.
Next is consistency. If some of the methods of the type must have
pointer receivers, the rest should too, so the method set is
consistent regardless of how the type is used. See the section on
method sets for details.
For types such as basic types, slices, and small structs, a value
receiver is very cheap so unless the semantics of the method requires
a pointer, a value receiver is efficient and clear.
Your function need to receive a pointer for the value to be incremented, that way you are not passing a copy of the struct and on next iteration the I can be incremented.
package main
import (
"sync/atomic"
"fmt"
)
type some struct{
I uint32
}
func main() {
q := &some{0}
for i := 0; i < 10; i++ {
q.Add1()
fmt.Println(q.I)
}
}
func (sm *some) Add1(){
atomic.AddUint32(&sm.I,1)
}
The slices are references to the underlying array. This makes sense and seems to work on builtin/primitive types but why is not working on structs? I assume that even if I update a struct field the reference/address is still the same.
package main
import "fmt"
type My struct {
Name string
}
func main() {
x := []int{1}
update2(x)
fmt.Println(x[0])
update(x)
fmt.Println(x[0])
my := My{Name: ""}
update3([]My{my})
// Why my[0].Name is not "many" ?
fmt.Println(my)
}
func update(x []int) {
x[0] = 999
return
}
func update2(x []int) {
x[0] = 1000
return
}
func update3(x []My) {
x[0].Name = "many"
return
}
To clarify: I'm aware that I could use pointers for both cases. I'm only intrigued why the struct is not updated (unlike the int).
What you do when calling update3 is you pass a new array, containing copies of the value, and you immediately discard the array. This is different from what you do with the primitive, as you keep the array.
There are two approaches here.
1) use an array of pointers instead of an array of values:
You could define update3 like this:
func update3(x []*My) {
x[0].Name = "many"
return
}
and call it using
update3([]*My{&my})
2) write in the array (in the same way you deal with the primitive)
arr := make([]My,1)
arr[0] = My{Name: ""}
update3(arr)
From the GO FAQ:
As in all languages in the C family, everything in Go is passed by
value. That is, a function always gets a copy of the thing being
passed, as if there were an assignment statement assigning the value
to the parameter. For instance, passing an int value to a function
makes a copy of the int, and passing a pointer value makes a copy of
the pointer, but not the data it points to. (See the next section for
a discussion of how this affects method receivers.)
Map and slice values behave like pointers: they are descriptors that
contain pointers to the underlying map or slice data. Copying a map or
slice value doesn't copy the data it points to.
Thus when you pass my you are passing a copy of your struct and the calling code won't see any changes made to that copy.
To have the function change the data in teh struct you have to pass a pointer to the struct.
Your third test is not the same as the first two. Look at this (Playground). In this case, you do not need to use pointers as you are not modifying the slice itself. You are modifying an element of the underlying array. If you wanted to modify the slice, by for instance, appending a new element, you would need to use a pointer to pass the slice by reference. Notice that I changed the prints to display the type as well as the value.
I've been trying to wrap my head around the concept of interfaces in Go. Reading this and this helped a lot.
The only thing that makes me uncomfortable is the syntax. Have a look at the example below:
package main
import "fmt"
type Interface interface {
String() string
}
type Implementation int
func (v Implementation) String() string {
return fmt.Sprintf("Hello %d", v)
}
func main() {
var i Interface
impl := Implementation(42)
i = impl
fmt.Println(i.String())
}
My issue is with i = impl. Based on the fact that an interface instance actually holds a pointer reference to the actual data, it would feel more natural for me to do i = &impl. Usually assignment of non-pointer when not using & will make a full memory copy of the data, but when assigning to interfaces this seem to side-step this and instead simply (behind the scenes) assign the pointer to the interface value. Am I right? That is, the data for the int(42) will not be copied in memory?
The data for int(42) will be copied. Try this code:
func main() {
var i Interface
impl := Implementation(42)
i = impl
fmt.Println(i.String())
impl = Implementation(91)
fmt.Println(i.String())
}
(Playground link)
You'll find that the second i.String() still shows 42. Perhaps one of the trickier aspects of Go is that method receivers can be pointers as well.
func (v *Implementation) String() string {
return fmt.Sprintf("Hello %d", *v)
}
// ...
i = &impl
Is what you want if you want the interface to hold a pointer to the original value of impl. "Under the hood" an interface is a struct that either holds a pointer to some data, or the data itself (and some type metadata that we can ignore for our purposes). The data itself is stored if its size is less than or equal to one machine word -- whether it be a pointer, struct, or other value.
Otherwise it will be a pointer to some data, but here's the tricky part: if the type implementing the interface is a struct the pointer will be to a copy of the struct, not the struct assigned to the interface variable itself. Or at least semantically the user can think of it as such, optimizations may allow the value to not be copied until the two diverge (e.g. until you call String or reassign impl).
In short: assigning to an interface can semantically be thought of as a copy of the data that implements the interface. If this is a pointer to a type, it copies the pointer, if it's a big struct, it copies the big struct. The particulars of interfaces using pointers under the hood are for reasons of garbage collection and making sure the stack expands by predictable amounts. As far as the developer is concerned, they should be thought of as semantic copies of the specific instance of the implementing type assigned.
It seems pointless to be used in primitive language constructs, as you can't specify any sort of values
func main() {
y := new([]float)
fmt.Printf("Len = %d", len(*y) ) // => Len = 0
}
For stucts it makes a bit more sense, but what's the difference between saying y := new(my_stuct) and the seemingly more concise y := &my_struct?
And since anything you create is based on those primitives, they will be initialized to the said zero values. So what's the point? When would you ever want to use new()?
Sorry for the very-beginner question, but the documentation isn't always that clear.
You can't use new for slices and maps, as in your code example, but instead you must use the make command: make([]float, 100)
Both new(MyStruct) and &MyStruct{} do to the same thing, because Go will allocate values on the heap if you get their address with &. Sometimes the code just expresses it intent better in one style or the other.
Go does not have built-in support for constructors, so usually you would wrap the call to new into a function, for example NewMyStruct() which does all the necessary initialization. It also makes it possible to initialize private fields or hide the struct behind an interface, to prevent users of the object from directly messing with its internals. Also evolving the structure of the struct is easier that way, when you don't need to change all of its users when adding/removing/renaming/reordering fields.
make does only work for maps, slices and channels and composite literals like type{} work only for structs, arrays, slices, and maps. For other types, you'll have to use new to get a pointer to a newly allocated instance (if you don't want to use a longer var v T; f(&v)).
I guess this is useful if you want to initialize a struct:
typedef foo struct {
bar *int
}
v := foo{bar: new(int)}