How exactly are interface variables implemented in Go?

In the below code snippet, I'd like to understand what exactly gets stored in iPerson when its contents are still uninitialized: just a value of 0-bytes? Or is it actually a pointer under the hood (and also initialized to 0-bytes of course)? In any case, what exactly happens at iPerson = person?
If iPerson = person makes a copy of person, what happens when an object implementing IPerson but with a different size/memory footprint gets assigned to iPerson? I understand iPerson is a variable stored on the stack, so its size must be fixed. Does that mean that the heap is actually used under the hood, so iPerson is actually implemented as a pointer, but assignments still copy the object, as demonstrated by the code below?
Here's the code:
package main

import "fmt"

type Person struct{ name string }
type IPerson interface{}

func main() {
	var person Person = Person{"John"}
	var iPerson IPerson
	fmt.Println(person)  // => {John}
	fmt.Println(iPerson) // => <nil> ...so looks like a pointer
	iPerson = person     // ...this seems to be making a copy
	fmt.Println(iPerson) // => {John}
	person.name = "Mike"
	fmt.Println(person)  // => {Mike}
	fmt.Println(iPerson) // => {John} ...so looks like it wasn't a pointer,
	// or at least something was definitely copied
}
(This question is the result of me having second thoughts about the factual correctness of my answer to why runtime error on io.WriterString?. So I decided to investigate how exactly interface variables and assignments to them work in Go.)
EDIT: after having received a few useful answers, I'm still puzzled with this:
iPerson = person
iPerson = &person
Both are legal. However, to me, this raises the question of why the compiler allows such weak typing. One implication of the above is this:
iPerson = &person
var person2 = iPerson.(Person) // panic: interface conversion: interface is *main.Person, not main.Person
whereas changing the first line fixes it:
iPerson = person
var person2 = iPerson.(Person) // OK
...so it's not possible to determine statically whether iPerson holds a pointer or a value, and it seems that anything can assign either one to it at runtime with no errors raised. Why was such a design decision made? What purpose does it serve? It definitely does not fit within the "type safety" mindset.

You ask why both of
iPerson = person
iPerson = &person
are permitted. They are both permitted because both person and &person implement the IPerson interface. This is obvious, because IPerson is the empty interface--every value implements it.
It's true that you can't determine statically whether a value of IPerson holds a pointer or a value. So what? All you know about IPerson is that any object stored in a value of that type implements the list of methods in the interface. The assumption is that those methods are implemented correctly. Whether IPerson holds a value or a pointer is irrelevant to that.
For example, if the method is supposed to change something stored in the object, then the method pretty much has to be a pointer method, in which case only a pointer value can be stored in the variable of interface type. But if none of the methods change anything stored in the object, then they can all be value methods, and a non-pointer value can be stored in the variable.
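To make the point about pointer methods concrete, here is a minimal sketch. The Renamer interface and the renameVia helper are hypothetical names invented for this illustration; they do not appear in the question.

```go
package main

import "fmt"

// Renamer is a hypothetical interface with one mutating method.
type Renamer interface {
	Name() string
	Rename(name string)
}

type Person struct{ name string }

// Name has a value receiver, so it belongs to the method sets of
// both Person and *Person.
func (p Person) Name() string { return p.name }

// Rename has a pointer receiver, so it belongs only to the method
// set of *Person.
func (p *Person) Rename(name string) { p.name = name }

// renameVia stores p in a Renamer and mutates the Person through it.
func renameVia(p *Person, name string) string {
	var r Renamer = p // OK: *Person has both Name and Rename
	// var r Renamer = *p // would not compile: Person has no Rename
	r.Rename(name)
	return r.Name()
}

func main() {
	p := Person{"John"}
	fmt.Println(renameVia(&p, "Mike")) // Mike
	fmt.Println(p.Name())              // Mike: the original was mutated
}
```

So the compiler does check statically that whatever is stored satisfies the interface; it just does not care whether that happens through a value or a pointer.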

So, looks like internally, the interface variable does hold a pointer to what was assigned to it. An excerpt from http://research.swtch.com/interfaces:
The second word in the interface value points at the actual data, in this case a copy of b. The assignment var s Stringer = b makes a copy of b rather than point at b for the same reason that var c uint64 = b makes a copy: if b later changes, s and c are supposed to have the original value, not the new one.
My question
[...] what happens then when an object implementing IPerson but with a different size/memory footprint gets assigned to iPerson?
...also gets answered in the article:
Values stored in interfaces might be arbitrarily large, but only one word is dedicated to holding the value in the interface structure, so the assignment allocates a chunk of memory on the heap and records the pointer in the one-word slot.
So yeah, a copy on the heap is made and a pointer to it assigned to the interface variable. But, apparently, to the programmer, the interface variable has the semantics of a value variable not a pointer variable.
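The value semantics described above can be checked with a small sketch. The Small and Big types and the two helper functions are made up for this example; the only point is that one interface variable can hold values of very different sizes and that each assignment takes a copy.

```go
package main

import "fmt"

type Small struct{ n int8 }        // a single byte of data
type Big struct{ data [1024]byte } // far larger than one machine word

// smallAfterMutation stores a Small in an interface, mutates the
// original, and returns what the interface still holds.
func smallAfterMutation() int8 {
	s := Small{n: 1}
	var i interface{} = s // the interface receives a copy of s
	s.n = 2
	return i.(Small).n
}

// bigAfterMutation does the same with a value much larger than a word.
func bigAfterMutation() byte {
	b := Big{}
	var i interface{} = b // again a copy, wherever it ends up living
	b.data[0] = 42
	return i.(Big).data[0]
}

func main() {
	// The same variable can hold either type; its own size never changes.
	var i interface{}
	i = Small{n: 1}
	i = Big{}
	_ = i

	fmt.Println(smallAfterMutation()) // 1
	fmt.Println(bigAfterMutation())   // 0
}
```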
(Thanks to Volker for providing the link; but also, the first part of his answer is factually plain wrong... So I don't know if I should downvote for the misleading information or upvote for the non-misleading and rather useful link (which also happens to contradict his own answer).)

When you execute the following line:
iPerson = person
You are storing a Person value in the interface variable. Since assigning a struct value performs a copy, yes, your code is taking a copy. To retrieve the struct from inside the interface you'll need to take another copy:
p := iPerson.(Person)
so you'd rarely want to do this with mutable types. If you instead want to store a pointer to the struct in the interface variable, you need to do this explicitly:
iPerson = &person
As far as what goes on under the hood, you are right that interface variables allocate heap space to store values larger than a pointer, but this is usually not visible to the user.
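As a rough illustration of the difference between storing a value and storing a pointer, the sketch below uses a type switch and the comma-ok form of a type assertion to find out which one an interface holds without risking a panic. The describe function is a hypothetical helper, not something from the question.

```go
package main

import "fmt"

type Person struct{ name string }

// describe reports whether i holds a Person value or a *Person.
func describe(i interface{}) string {
	switch v := i.(type) {
	case Person:
		return "value: " + v.name
	case *Person:
		return "pointer: " + v.name
	default:
		return "something else"
	}
}

func main() {
	person := Person{"John"}
	var byValue interface{} = person    // stores a copy
	var byPointer interface{} = &person // stores the address
	person.name = "Mike"

	fmt.Println(describe(byValue))   // value: John
	fmt.Println(describe(byPointer)) // pointer: Mike

	// The comma-ok form avoids the panic from a failed assertion.
	if _, ok := byPointer.(Person); !ok {
		fmt.Println("byPointer does not hold a Person value")
	}
}
```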

Related

Mapping concrete types

What is the idiomatic way to define something like map[type]interface{}?
As far as I can see, type (as a keyword) is not something comparable, so it cannot be used as a key in a map. Maybe I'm going about this the wrong way, so I would accept any suggestion.
TL;DR;
Example motivation
Let's assume that in the application model I have a type called Person, stored in a table named "person" of an RDBMS.
I would like to link the entity type Person to the table name. These are things that by definition don't come together, so it is wise to avoid polluting the Person struct with "not-naturally-related" (this is where my Java-OO-based mind appears) [pointer|value]-receiver methods, so a map could come in handy here, right? (Maps are great for associating things from different worlds or sets, right?)
var tableNameByType map[type]string = map[type]string{
Person: "person",
}
This statement causes the compiler to yell at me, complaining about expected type, found 'type'. I have tried using interface{} and struct instead of type, with no better results.
Use reflect.Type as the key:
var tableNameByType map[reflect.Type]string = map[reflect.Type]string{
reflect.TypeOf(Person{}): "person",
}
You can get the name for a type using:
name := tableNameByType[reflect.TypeOf(Person{})]
... or the name for value v using:
name := tableNameByType[reflect.ValueOf(v).Type()]
You can avoid instantiating the struct value by replacing reflect.TypeOf(Person{}) with reflect.TypeOf((*Person)(nil)).Elem() in the above code.
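Putting the pieces of this answer together, a self-contained sketch might look like the following. The Order type, the "order" table name and the tableNameFor helper are assumptions added for illustration; the helper also unwraps one level of pointer so that both Person{} and &Person{} resolve to the same entry.

```go
package main

import (
	"fmt"
	"reflect"
)

type Person struct{ Name string }
type Order struct{ ID int } // hypothetical second entity

var tableNameByType = map[reflect.Type]string{
	reflect.TypeOf(Person{}):             "person",
	reflect.TypeOf((*Order)(nil)).Elem(): "order", // no Order value is created
}

// tableNameFor looks up the table for v, accepting a value or a pointer.
func tableNameFor(v interface{}) (string, bool) {
	t := reflect.TypeOf(v)
	if t != nil && t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	name, ok := tableNameByType[t]
	return name, ok
}

func main() {
	name, ok := tableNameFor(&Person{Name: "John"})
	fmt.Println(name, ok) // person true

	_, ok = tableNameFor(42)
	fmt.Println(ok) // false
}
```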

Golang struct initialization

There is a simple struct like this:
type Event struct {
Id int
Name string
}
What is the difference between these two initialization methods?
e1 := Event{Id: 1, Name: "event 1"}
e2 := &Event{Id: 2, Name: "event 2"}
And why would I use either of these initialization methods?
The first method
e1 := Event{Id: 1, Name: "event 1"}
is initializing the variable e1 as a value with type Event.
The second
e2 := &Event{Id: 2, Name: "event 2"}
is initializing e2 as a pointer to a value of type Event. As you stated in the comments, the set of methods defined on a value of a given type is a subset of the set of methods defined on a pointer to a value of that type. This means that if you have a method
func (e Event) GetName() string {
return e.Name
}
then both e1 and e2 can call this method, but if you had another method, say:
func (e *Event) ChangeName(s string) {
e.Name = s
}
Then e1 is not able to use the ChangeName method, while e2 is.
This (e1 is not able to use the ChangeName method, while e2 is) is not the case (although it may have been at the time of writing), thanks to #DannyChen for bringing this up and #GilbertNwaiwu for testing and posting in the comments below.
(To address the struck-out section above: the method set of the type Event contains only the methods declared with a value receiver, while the method set of *Event contains the methods declared with either a value or a pointer receiver.
However, Go automatically adjusts the receiver of a method call: if a method has a pointer receiver and is called on an addressable value such as the variable e1, Go takes the address and calls the method on a pointer to that struct, and if the method has a value receiver and is called on a pointer, Go calls the method on the value pointed to. At this point my attempt to update this answer may be missing something important in semantics, so if someone would like to correct or clarify this, feel free to add a comment pointing to a more comprehensive answer. Here is a bit from the Go Playground illustrating this issue: https://play.golang.org/p/JcD0izXZGz.
To some extent, this change in how pointers and values work as arguments to methods defined on function affects some areas of the discourse below but I will leave the rest unedited unless someone encourages me to update it as it seems to be more or less correct within the context of general semantics of languages that pass by value vs. pointer.)
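The receiver adjustment discussed in this correction can be sketched as follows. The renameBoth helper is a hypothetical wrapper added so the behaviour can be checked; the commented-out lines show operands that are not addressable, for which the compiler rejects a pointer-receiver call.

```go
package main

import "fmt"

type Event struct {
	Id   int
	Name string
}

func (e Event) GetName() string      { return e.Name }
func (e *Event) ChangeName(s string) { e.Name = s }

// renameBoth calls the pointer-receiver method on a value variable
// and on a pointer variable.
func renameBoth() (string, string) {
	e1 := Event{Id: 1, Name: "event 1"}
	e2 := &Event{Id: 2, Name: "event 2"}

	e1.ChangeName("renamed 1") // e1 is addressable: shorthand for (&e1).ChangeName(...)
	e2.ChangeName("renamed 2")

	// Non-addressable operands are rejected at compile time:
	// Event{}.ChangeName("x")
	// map[string]Event{"k": {}}["k"].ChangeName("x")

	// e2.GetName() is shorthand for (*e2).GetName().
	return e1.GetName(), e2.GetName()
}

func main() {
	n1, n2 := renameBoth()
	fmt.Println(n1) // renamed 1
	fmt.Println(n2) // renamed 2
}
```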
As to the difference between pointers and values, this example is illustrative, as pointers are ordinarily used in Go to allow you to mutate the values a variable is pointing to (but there are many more reasons one might use pointers as well! Although for typical use, this is normally a solid assumption). Thus, if you defined ChangeName instead as:
func (e Event) ChangeName(s string) {
e.Name = s
}
This function would not be very useful if called on the value receiver, as values (not pointers) won't keep changes that are made to them if they're passed into a function. This has to do with an area of language design around how variables are assigned and passed: What's the difference between passing by reference vs. passing by value?
You can see this on this example in the Go Playground: https://play.golang.org/p/j7yxvu3Fe6
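For readers who cannot open the playground link, here is a minimal sketch of the same idea (it is not a copy of the linked code). The method names ChangeNameByValue and ChangeNameByPointer and the helper namesAfterCalls are invented for this example.

```go
package main

import "fmt"

type Event struct {
	Id   int
	Name string
}

// ChangeNameByValue receives a copy of the Event, so the assignment
// is lost when the method returns.
func (e Event) ChangeNameByValue(s string) { e.Name = s }

// ChangeNameByPointer receives the address of the Event, so the
// assignment is visible to the caller.
func (e *Event) ChangeNameByPointer(s string) { e.Name = s }

// namesAfterCalls returns the name after each kind of call.
func namesAfterCalls() (afterValue, afterPointer string) {
	e := Event{Id: 1, Name: "event 1"}
	e.ChangeNameByValue("lost")
	afterValue = e.Name
	e.ChangeNameByPointer("kept")
	afterPointer = e.Name
	return afterValue, afterPointer
}

func main() {
	v, p := namesAfterCalls()
	fmt.Println(v) // event 1
	fmt.Println(p) // kept
}
```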
The type of e1 is Event; the type of e2 is *Event. The initialization is actually the same (using composite literal syntax, also not sure if that jargon is Go or C# or both?) but with e2 you are using the 'address of' operator &, so it returns a pointer to that object rather than the instance itself.

Use map[string]SpecificType with method of map[string]SomeInterface into

I get cannot use map[string]MyType literal (type map[string]MyType) as type map[string]IterableWithID in argument to MapToList with the code below. How do I pass a concrete map type to a method that expects an interface type?
https://play.golang.org/p/G7VzMwrRRw
Go's interface convention doesn't quite work the same way as in, say, Java (and the designers apparently didn't like the idea of getters and setters very much :-/ ). So you've got two core problems:
A map[string]Foo is not the same as a map[string]Bar, even if Bar implements Foo, so you have to break it out a bit (use make() beforehand, then assign in a single assignment).
Methods with a value receiver operate on a copy of the value, even when called through an interface, so you really need to do foo = foo.Method(bar) in your callers or get really pointer-happy to implement something like this.
What you can do to more-or-less simulate what you want:
type IterableWithID interface {
SetID(id string) IterableWithID // use as foo = foo.SetID(bar)
}
func (t MyType) SetID(id string) IterableWithID {
t.ID = id
return t
}
...and to deal with the typing problem
t := make(map[string]IterableWithID)
t["foo"] = MyType{}
MapToList(t) // This is a map[string]IterableWithID, so compiler's happy.
...and finally...
value = value.SetID(key) // We set back the copy of the value we mutated
The final value= deals with the fact that the method gets a fresh copy of the value object, so the original would be untouched by your method (the change would simply vanish).
Updated code on the Go Playground
...but it's not particularly idiomatic Go--they really want you to just reference struct members rather than use Java-style mutators in interfaces (though TBH I'm not so keen on that little detail--mutators are supes handy to do validation).
You can't do what you want to do because the two map types are different. It doesn't matter that the element type of one is a type that implements the interface which is the element type of the other. The map type that you pass into the function has to be map[string]IterableWithID. You could create a map of that type, assign values of type MyType to the map, and pass that to the function.
See https://play.golang.org/p/NfsTlunHkW
Also, you probably don't want to be returning a pointer to a slice in MapToList. Just return the slice itself. A slice contains a reference to the underlying array.
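Since the playground code is not reproduced here, the following is only a sketch of the approach both answers describe: copy the concrete map into a map[string]IterableWithID and return a plain slice. The MyType fields, the toIterableMap helper and this particular MapToList body are assumptions, not the original code; keys are sorted only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

type IterableWithID interface {
	SetID(id string) IterableWithID
}

// MyType is a stand-in for the type in the question.
type MyType struct{ ID string }

func (t MyType) SetID(id string) IterableWithID {
	t.ID = id // modifies the copy, which is then returned
	return t
}

// toIterableMap is a hypothetical helper that converts element by element,
// because map[string]MyType is not assignable to map[string]IterableWithID.
func toIterableMap(src map[string]MyType) map[string]IterableWithID {
	dst := make(map[string]IterableWithID, len(src))
	for k, v := range src {
		dst[k] = v
	}
	return dst
}

// MapToList is an assumed implementation returning a slice, not a pointer to one.
func MapToList(m map[string]IterableWithID) []IterableWithID {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	list := make([]IterableWithID, 0, len(m))
	for _, k := range keys {
		list = append(list, m[k].SetID(k)) // keep the returned copy
	}
	return list
}

func main() {
	concrete := map[string]MyType{"a": {}, "b": {}}
	list := MapToList(toIterableMap(concrete))
	fmt.Println(len(list), list[0].(MyType).ID, list[1].(MyType).ID) // 2 a b
}
```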

c++11: how to understand the function move

I can't understand the function move in c++11.
From here, I got things below:
Although note that -in the standard library- moving implies that the
moved-from object is left in a valid but unspecified state. Which
means that, after such an operation, the value of the moved-from
object should only be destroyed or assigned a new value; accessing it
otherwise yields an unspecified value.
In my opinion, after move(), the moved-from object has been "cleared". However, I've done a test below:
std::string str = "abcd";
std::move(str);
std::cout<<str;
I got abcd on my screen.
So has str been destroyed? If so, did I get abcd just because I was lucky? Or have I misunderstood the function move?
Besides, when I read C++ Primer, I got such a code:
class Base{/* ... */};
class D: public Base{
public:
D(D&& d): Base(std::move(d)){/* use d to initialize the members of D */}
};
I'm confused now. If the function move clears the object, the parameter d will be cleared, so how could we "use d to initialize the members of D"?
std::move doesn't actually do anything. It's roughly analogous to a cast expression, in that the return value is the original object, but treated differently.
More precisely, std::move returns the object in a form which is amenable to its resources being 'stolen' for some other purpose. The original object remains valid, more or less (you're only supposed to do certain special things to it, though that's primarily a matter of convention and not necessarily applicable to non-standard-library objects), but the stolen-away resources no longer belong to it, and generally won't be referenced by it any more.
But! std::move doesn't, itself, do the stealing. It just sets things up for stealing to be allowed. Since you're not doing anything with the result, let alone something which could take advantage of the opportunity, nothing gets stolen.
std::move doesn't move anything. std::move is merely a function template that performs a cast: it unconditionally casts its argument to an rvalue.
std::move(str);
With this expression you are just casting an lvalue to an rvalue.
A small modification to the program makes this easier to understand:
std::string str = "abcd";
std::string str1 = std::move(str);
std::cout<<str<<std::endl;
std::cout<<str1<<std::endl;
str is cast from an lvalue to an rvalue by std::move, and std::string str1 = std::move(str); then calls the string move constructor, which is where the actual stealing of resources takes place. The contents of str (abcd) are stolen, so printing str typically gives an empty string (the standard only guarantees a valid but unspecified state), while str1 prints abcd.
Here is a sample implementation of the move function. Please note that it is not the complete standard library implementation.
template<typename T> // C++14; still in
decltype(auto) move(T&& param) // namespace std
{
using ReturnType = remove_reference_t<T>&&;
return static_cast<ReturnType>(param);
}
Applying std::move to an object tells the compiler that the object is eligible to be moved from. It casts its argument to an rvalue.
class Base{/* ... */};
class D: public Base{
public:
D(D&& d): Base(std::move(d)){/* use d to initialize the members of D */}
};
Base(std::move(d)) casts d to an rvalue, which then binds to the Base&& parameter of the Base move constructor through an implicit derived-to-base conversion, so only the base class part is moved.
Here is one more interesting thing to learn. If you invoke the base class constructor without std::move, like D(D&& d): Base(d), then d is treated as an lvalue and the copy constructor of Base is invoked instead of the move constructor. For more detail, refer to Move constructor on derived object

Properly distinguish between not set (nil) and blank/empty value

What's the correct way in Go to distinguish between a value in a struct that was never set and one that is just empty? For example, given the following:
type Organisation struct {
Category string
Code string
Name string
}
I need to know (for example) if the category was never set, or was saved as blank by the user, should I be doing this:
type Organisation struct {
Category *string
Code *string
Name *string
}
I also need to ensure I correctly persist either null or an empty string to the database.
I'm still learning Go, so it is entirely possible my question needs more info.
The zero value for a string is an empty string, and you can't distinguish between the two.
If you are using the database/sql package, and need to distinguish between NULL and empty strings, consider using the sql.NullString type. It is a simple struct that keeps track of the NULL state:
type NullString struct {
String string
Valid bool // Valid is true if String is not NULL
}
You can scan into this type and use it as a query parameter, and the package will handle the NULL state for you.
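The two states can be exercised without a database by calling the Scan and Value methods of sql.NullString directly, which is roughly what the package does on your behalf. The describe and scanned helpers below are hypothetical names used only for this sketch.

```go
package main

import (
	"database/sql"
	"fmt"
)

// describe reports which of the two states a NullString is in.
func describe(ns sql.NullString) string {
	if !ns.Valid {
		return "NULL"
	}
	return fmt.Sprintf("%q", ns.String)
}

// scanned feeds src to NullString.Scan, as happens when a row is scanned.
func scanned(src interface{}) (sql.NullString, error) {
	var ns sql.NullString
	err := ns.Scan(src)
	return ns, err
}

func main() {
	null, _ := scanned(nil) // column was NULL
	blank, _ := scanned("") // column held an empty string
	fmt.Println(describe(null), describe(blank)) // NULL ""

	// Value is what gets handed to the driver as a query parameter.
	v, _ := null.Value()
	fmt.Println(v == nil) // true
}
```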
Google's protocol buffers (https://code.google.com/p/goprotobuf/) use pointers to describe optional fields.
The generated objects provide GetFoo methods which take the pain away from testing for nil (a.GetFoo() returns an empty string if a.Foo is nil, otherwise it returns *a.Foo).
It introduces a nuisance when you want to write literal structs (in tests, for example), because &"something" is not valid syntax to generate a pointer to a string, so you need a helper function (see, for example, the source code of the protocol buffer library for proto.String).
// String is a helper routine that allocates a new string value
// to store v and returns a pointer to it.
func String(v string) *string {
return &v
}
Overall, using pointers to represent optional fields is not without drawbacks, but it's certainly a viable design choice.
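Applied to the Organisation struct from the question, the pointer approach could be sketched like this. GetCategory mimics the getter style described above and categoryState is a hypothetical helper; neither comes from the protocol buffer library itself.

```go
package main

import "fmt"

type Organisation struct {
	Category *string
	Code     *string
	Name     *string
}

// String allocates a new string value to store v and returns a pointer to it.
func String(v string) *string { return &v }

// GetCategory returns "" when the field was never set, otherwise its value.
func (o *Organisation) GetCategory() string {
	if o == nil || o.Category == nil {
		return ""
	}
	return *o.Category
}

// categoryState distinguishes the three cases the question cares about.
func categoryState(o *Organisation) string {
	switch {
	case o.Category == nil:
		return "never set"
	case *o.Category == "":
		return "blank"
	default:
		return "set"
	}
}

func main() {
	unset := &Organisation{}
	blank := &Organisation{Category: String("")}

	fmt.Println(categoryState(unset), "/", categoryState(blank)) // never set / blank
	fmt.Println(unset.GetCategory() == blank.GetCategory())      // true
}
```

The getter deliberately hides the difference, so code that needs to persist NULL versus an empty string has to inspect the pointer itself.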
The standard database/sql package provides a NullString struct (members are just String string and Valid bool). To take care of some of the repetitive work of persistence, you could look at an object-relational mapper like gorp.
I looked into whether there was some way to distinguish two kinds of empty string just out of curiosity, and couldn't find one. With []byte, a nil slice and an empty slice can be told apart ([]byte(nil) == nil is true while []byte{} == nil is false), but a string has no such nil state. In any case, it seems like the most practical thing to do is to go with the flow and use NullString.

Resources