Deep copy vs shallow copy for structs - go

I have been digging into comparisons of deep and shallow copying when passing a struct that has both primitive and pointer fields. For example:
package main
import "fmt"
type Copy struct {
	age int
	ac  *AnotherCopy
}
type AnotherCopy struct {
	surname string
}
func main() {
	s := Copy{
		age: 20,
		ac:  &AnotherCopy{surname: "Relic"},
	}
	passIt(&s)
	fmt.Printf("main s: %p\n", &s)
}
func passIt(s *Copy) {
	// f is a copy of the struct value; the ac pointer is copied, not its target.
	f := *s
	fmt.Printf("s: %p\n", &*s)
	fmt.Printf("f: %p\n", &f)
	f.age = 26
	f.ac.surname = "Walker"
	fmt.Printf("%v %s\n", f, f.ac.surname)
	fmt.Printf("%v %s\n", *s, s.ac.surname)
}
The result is
s: 0xc000010230
f: 0xc000010250
{26 0xc000010240} Walker
{20 0xc000010240} Walker
main s: 0xc000010230
What I can see here is that when we pass a struct with primitive and composite types, it deep-copies the primitive fields and shallow-copies the pointer (reference) fields.
I have read some articles about this process, and they contradict each other.
The question is: what should we call this process, deep copy or shallow copy?
And if we call it only a shallow copy, is that wrong?
Could you clarify this for me, please?

In your question you mention "primitive and composite types" as being different, and I think that is the root of your confusion here. The distinction that matters in Go is not between primitive and composite types, but between types whose values hold their data directly and types whose values refer to data stored elsewhere. "Primitive" types (int, int32, float64, etc.) hold their data directly, but so do structs (and also arrays). Pointers, maps, slices, and channels all refer to underlying data, and an interface value may wrap a pointer, so copying any of them does not copy what they refer to.
The direct answer to your question is, as one comment mentions, that copies in Go (via variable assignment, etc.) are "shallow copies" in that Go does not follow pointers and create new copies of what they point to. However, since structs are not pointer types, a struct nested by value inside another struct is copied along with it, so it's completely possible for a composite value to be "deep copied".
If you want to see this in action, try modifying your example code so that the ac field on your Copy type is a plain struct rather than a pointer to a struct:
type Copy struct {
	age int
	ac  AnotherCopy
}
You should then see that creating a copy of a Copy and setting the ac.surname field doesn't change the value of the ac.surname field on the original struct.
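To make the difference concrete, here is a minimal sketch that copies both kinds of struct side by side. The names Inner, ByValue, ByPointer, and copyAndRename are made up for this illustration and are not from the question.

```go
package main

import "fmt"

// Inner is a hypothetical nested type used only for this illustration.
type Inner struct {
	surname string
}

// ByValue holds Inner directly, so copying ByValue copies Inner too.
type ByValue struct {
	age int
	in  Inner
}

// ByPointer holds a pointer, so copies share the same Inner.
type ByPointer struct {
	age int
	in  *Inner
}

// copyAndRename copies both structs, renames through the copies, and
// returns the surnames still visible through the originals.
func copyAndRename() (valueOrig, pointerOrig string) {
	v := ByValue{age: 20, in: Inner{surname: "Relic"}}
	p := ByPointer{age: 20, in: &Inner{surname: "Relic"}}

	vc := v // copies age and the whole Inner value
	pc := p // copies age and the pointer, not the Inner it points to

	vc.in.surname = "Walker"
	pc.in.surname = "Walker"

	return v.in.surname, p.in.surname
}

func main() {
	valueOrig, pointerOrig := copyAndRename()
	fmt.Println(valueOrig)   // Relic
	fmt.Println(pointerOrig) // Walker
}
```

The assignment is the same in both cases; only the field type decides whether the nested data is duplicated or shared.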

Related

Is type casting structs in Go a no-op?

Consider the following code in Go
type A struct {
	f int
}
type B struct {
	f int `somepkg:"somevalue"`
}
func f() {
	var b *B = (*B)(&A{1}) // <-- THIS
	fmt.Printf("%#v\n", b)
}
Will the marked line result in a memory copy (which I would like to avoid as A has many fields attached to it) or will it be just a reinterpretation, similar to casting an int to an uint?
EDIT: I was concerned about whether the whole struct would have to be copied, similarly to converting a byte slice to a string. Copying just a pointer therefore counts as a no-op for me.
It is called a conversion. The expression (&A{}) creates a pointer to an instance of type A, and (*B) converts that pointer to a *B. What's copied there is the pointer, not the struct. You can validate this using the following code:
a:=A{}
var b *B = (*B)(&a)
b.f=2
fmt.Printf("%#v\n", a)
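This prints main.A{f:2} (assuming the code lives in package main): the write through b is visible in a, so the struct itself was not copied.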
The crucial points to understand are the following.
First, unlike C, C++ and some other languages of their ilk, Go does not have type casting; it has type conversions.
In most, but not all, cases, a type conversion changes the type but not the internal representation of a value.
Second, whether a type conversion "is a no-op" depends on how you define a no-op.
If you are concerned with a memory copy being made, there are two cases:
Some type conversions are defined to drastically change the value's representation or to copy memory; for example:
Type-converting a value of type string to []rune interprets the value as a UTF-8-encoded byte stream, decodes each encoded Unicode code point, and produces a freshly allocated slice of decoded runes.
Type-converting a value of type string to []byte, and vice versa, clones the backing array underlying the value.
Other type conversions are no-ops in this sense, but for them to be useful you need to assign the converted value to a variable, pass it as an argument to a function call, send it to a channel, and so on. In other words, you have to store the result or otherwise make use of it.
All such operations do copy the value, even though it does not "look" like it; consider:
package main
import (
	"fmt"
)
type A struct {
	X int
}
type B struct {
	X int
}
func (b B) Whatever() {
	fmt.Println(b.X)
}
func main() {
	a := A{X: 42}
	B(a).Whatever()
	b := B(a)
	b.Whatever()
}
Here, the first type conversion in main does not look like a memory copy, but the resulting value will serve as a receiver in the call to B.Whatever and will be physically copied there.
The second type conversion stores the result in a variable (and then copies it again when a method is called).
Reasoning about such things is easy in Go because everything, always, is passed by value (and pointers are values, too).
It may be worth adding that variables in Go do not store the type of the value they hold, so a type conversion cannot mutate the type of a variable "in place". Values do not have type information stored in them, either (interface values, which record the dynamic type of what they hold, are the exception). This basically means that type conversions are the compiler's concern: it knows the types of all the participating values and variables and performs type checking.
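The two cases discussed above (a conversion that copies only a pointer versus one that clones data) can be observed directly. This is a minimal sketch. The helper names sharesMemory and copiesBytes are invented for the illustration.

```go
package main

import "fmt"

// A and B are hypothetical types with identical underlying struct types.
type A struct{ X int }
type B struct{ X int }

// sharesMemory converts *A to *B and writes through the result.
// It reports whether the write is visible through the original variable.
func sharesMemory() bool {
	a := A{X: 1}
	b := (*B)(&a) // only the pointer value is copied
	b.X = 2
	return a.X == 2
}

// copiesBytes converts a string to []byte and mutates the slice.
// It returns the original string and the modified bytes as a string.
func copiesBytes() (string, string) {
	s := "abc"
	bs := []byte(s) // the bytes are copied into a new backing array
	bs[0] = 'X'
	return s, string(bs)
}

func main() {
	fmt.Println(sharesMemory()) // true
	fmt.Println(copiesBytes())  // abc Xbc
}
```

The string stays intact after the slice is modified, which is only possible because the conversion duplicated the bytes.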

Is it better to use pointer for non-primitive types in struct fields in Go

I am working on a project that processes some data, and I am wondering whether it is better to use pointers for the non-primitive fields of a struct.
What I've found is that one reason to use a pointer is that nil can serve as a zero value. Is this the only reason to use a pointer?
For example, I am going to store a time.Time in my struct and it can never be nil. Is it then better to use a non-pointer field?
So is it okay to use
type A struct {
	CreatedAt time.Time
}
rather than
type A struct {
	CreatedAt *time.Time
}
when Now is not going to be nil?
Not sure I understand the question. In the case of "Now" I would make it a method on the struct, i.e.:
type A struct{}
func (a A) Now() time.Time { return time.Now() }
Otherwise, what does Now mean? Now is constantly changing.
There are great blog posts on when to use pointers.
The short version is that it doesn't really depend on whether the value can be nil, but more on memory and concurrency. Passing a pointer copies only an address, which can use less memory and be faster for large structs, but it also means that changing the value in one goroutine can be very dangerous, because the same value may be referenced in another goroutine, causing race conditions and unexpected behavior.
I'm not really a professional or know the ins and outs of Go, so take everything I say with some grain of salt.
But as I am understanding it, you should most likely use pointers.
This is because every time you use a non-pointer type, the whole struct will be part of your struct in memory. As a consequence, you can't share a single instance of a struct between multiple structs - every single one gets a copy of your original struct.
Here's a small example:
// On a typical 64-bit platform this struct occupies 16 bytes (2 x 64 bits).
type MyStruct struct {
	A uint64
	B uint64
}
// On the same platform this struct occupies 24 bytes:
// 4 for C, 4 of alignment padding, and 16 for the nested MyStruct value.
type MyOtherStruct struct {
	C      uint32
	Parent MyStruct
}
// On the same platform this struct occupies 16 bytes:
// 4 for D, 4 of alignment padding, and 8 for the pointer.
type MyPointerStruct struct {
	D      uint32
	Parent *MyStruct
}
Apart from memory concerns, there is also a performance cost if the inner struct is very big, because every time you assign the inner struct, all of its memory has to be copied into your instance.
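The sizes can be checked with unsafe.Sizeof rather than computed by hand. The sketch below repeats the three struct definitions so that it runs on its own. The sizes helper is a name introduced here, and the exact numbers depend on the platform's word size and alignment rules.

```go
package main

import (
	"fmt"
	"unsafe"
)

type MyStruct struct {
	A uint64
	B uint64
}

type MyOtherStruct struct {
	C      uint32
	Parent MyStruct
}

type MyPointerStruct struct {
	D      uint32
	Parent *MyStruct
}

// sizes returns the in-memory size of each struct in bytes,
// including any alignment padding the compiler inserts.
func sizes() (plain, nested, pointer uintptr) {
	return unsafe.Sizeof(MyStruct{}),
		unsafe.Sizeof(MyOtherStruct{}),
		unsafe.Sizeof(MyPointerStruct{})
}

func main() {
	plain, nested, pointer := sizes()
	// The values printed here vary with the target architecture.
	fmt.Println(plain, nested, pointer)
}
```

Whatever the platform, the struct that nests MyStruct by value is larger than the one that only stores a pointer to it.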
However, you have to be careful about whether you are dealing with interfaces or structs. At runtime, an interface value is represented as a pair of words: a reference to the actual (runtime) type and a reference to the actual instance.
So I, in my unprofessional opinion, would recommend not using pointers to interface types, because otherwise the CPU has to dereference twice (once to get the interface value, and then again to get the instance behind it).

What are the second pair of braces in this Golang struct?

var cache = struct {
	sync.Mutex
	mapping map[string]string
}{
	mapping: make(map[string]string),
}
This looks like a struct with an embedded field sync.Mutex but I can't get my head around the second set of braces. It compiles and executes but what's up? Why does the label on the make instruction matter (it does) and the comma? Thanks...
The example you have is equivalent to:
type Cache struct {
	sync.Mutex
	mapping map[string]string
}
cache := Cache{
	mapping: make(map[string]string),
}
Except that in your example you do not declare a Cache type and instead use an anonymous struct. In your example, as opposed to my Cache type, the type is the entire
struct {
	sync.Mutex
	mapping map[string]string
}
So think of the second pair of braces as the
cache := Cache{
	mapping: make(map[string]string),
}
part.
make is a built-in function that, somewhat like C's calloc(), returns an initialized data structure. In Go, maps, slices, and channels are created this way, while other types (structs, for the most part) are usable with their zero values automatically. The mapping: label says which field the value is assigned to, and the assignment is needed because a map that was never initialized is nil and writing to a nil map panics; with it, cache.mapping is an empty, ready-to-use map[string]string.
The comma there is required by Go's syntax: you can write Cache{mapping: make(map[string]string)} all on one line without it, but the moment the last field's value is on a different line from the closing brace, a trailing comma is required.
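For completeness, here is a rough sketch of how such an anonymous-struct cache might be used. The set and get helpers are hypothetical names added for this example and are not part of the original snippet.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a variable of an anonymous struct type. The first pair of
// braces declares the type; the second pair is the literal that
// initializes the value.
var cache = struct {
	sync.Mutex
	mapping map[string]string
}{
	mapping: make(map[string]string),
}

// set stores a value while holding the embedded mutex.
func set(key, value string) {
	cache.Lock() // Lock is promoted from the embedded sync.Mutex
	defer cache.Unlock()
	cache.mapping[key] = value
}

// get reads a value while holding the embedded mutex.
func get(key string) (string, bool) {
	cache.Lock()
	defer cache.Unlock()
	v, ok := cache.mapping[key]
	return v, ok
}

func main() {
	set("lang", "go")
	fmt.Println(get("lang")) // go true
}
```

Without the mapping: make(...) initializer, the first call to set would panic, because the map would still be nil.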
The second pair of braces is a "struct literal", here written for an "anonymous struct" type. This is, in fact, how you always create struct values in Go; it just may not be immediately obvious, since you might be used to declaring named types for structs to make using them a bit less verbose.
An entire struct definition is actually a type in Go, just like int or []byte or string. Just as you can do:
type NewType int
var a NewType = 5 // a is a NewType (which is based on an int)
or:
a := 5 // a is an int
and both are distinct types that look like ints, you can also do the same thing with structs:
// a is type NewType (which is based on struct{A string}).
type NewType struct {
	A string
}
a := NewType{
	A: "test string",
}
// a is type struct{A string}
a := struct {
	A string
}{
	A: "test string",
}
In the second form, the type name (NewType) has just been replaced with the type of the struct itself, struct{A string}. Note that the two are distinct types (NewType is not an alias for struct{A string}), but they share the same underlying type and the same semantics, so a value of the unnamed struct type can be assigned to a variable of type NewType.
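As a small sketch of how the named and unnamed struct types interact, the following program passes a value of the unnamed type where NewType is expected. The describe function is a hypothetical helper used only here.

```go
package main

import "fmt"

type NewType struct {
	A string
}

// describe accepts only the named type NewType.
func describe(n NewType) string {
	return "NewType with A=" + n.A
}

func main() {
	anon := struct{ A string }{A: "test string"}

	// Allowed: the underlying types are identical and one side is unnamed.
	var named NewType = anon

	fmt.Println(describe(named))
	fmt.Println(describe(anon)) // also accepted, for the same reason
	fmt.Println(named == anon)  // true
}
```

Two different named types with the same fields would not be interchangeable like this; they would need an explicit conversion.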

map[T]struct{} and map[T]bool in golang

What's the difference? Is map[T]bool optimized to map[T]struct{}? Which is the best practice in Go?
Perhaps the best reason to use map[T]struct{} is that you don't have to answer the question "what does it mean if the value is false"?
From "The Go Programming Language":
The struct type with no fields is called the empty struct, written
struct{}. It has size zero and carries no information but may be
useful nonetheless. Some Go programmers use it instead of bool as the
value type of a map that represents a set, to emphasize that only the
keys are significant, but the space saving is marginal and the syntax
more cumbersome, so we generally avoid it.
If you use bool, testing for presence in the "set" is slightly nicer, since you can just say:
if mySet["something"] {
	/* .. */
}
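For comparison, here is a minimal sketch of a membership test in both styles. The helper names hasBool and hasEmpty are assumptions made for this example.

```go
package main

import "fmt"

// hasBool reports membership in a set stored as map[string]bool.
func hasBool(set map[string]bool, key string) bool {
	return set[key] // a missing key yields false, the zero value
}

// hasEmpty reports membership in a set stored as map[string]struct{}.
func hasEmpty(set map[string]struct{}, key string) bool {
	_, ok := set[key] // the value carries no information, so use the ok flag
	return ok
}

func main() {
	boolSet := map[string]bool{"a": true}
	emptySet := map[string]struct{}{"a": {}}

	fmt.Println(hasBool(boolSet, "a"), hasBool(boolSet, "b"))     // true false
	fmt.Println(hasEmpty(emptySet, "a"), hasEmpty(emptySet, "b")) // true false
}
```

With the bool form, a key stored with the value false is present in the map yet tests as absent, which is the ambiguity the struct{} form avoids.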
The difference is in memory requirements. Under the bonnet, the empty struct is not a pointer but a special zero-size value, which saves memory.
An empty struct is a struct type like any other. All the properties you are used to with normal structs apply equally to the empty struct. You can declare an array of struct{}s, but they of course consume no storage.
var x [100]struct{}
fmt.Println(unsafe.Sizeof(x)) // prints 0
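Because empty structs hold no data, it is not possible to determine whether two struct{} values are different.
Like any other struct type, empty structs can be used as method receivers. Note that the Go specification allows, but does not guarantee, that two distinct zero-size variables share the same address, as they do in this sample output:
type S struct{}
func (s *S) addr() { fmt.Printf("%p\n", s) }
func main() {
	var a, b S
	a.addr() // 0x1beeb0
	b.addr() // 0x1beeb0
}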

Conversion of a slice of string into a slice of custom type

I'm quite new to Go, so this might be obvious. The compiler does not allow the following code:
(http://play.golang.org/p/3sTLguUG3l)
package main
import "fmt"
type Card string
type Hand []Card
func NewHand(cards []Card) Hand {
	hand := Hand(cards)
	return hand
}
func main() {
	value := []string{"a", "b", "c"}
	firstHand := NewHand(value)
	fmt.Println(firstHand)
}
The error is:
/tmp/sandbox089372356/main.go:15: cannot use value (type []string) as type []Card in argument to NewHand
From the specs, it looks like []string is not the same underlying type as []Card, so the type conversion cannot occur.
Is it, indeed, the case, or did I miss something?
If it is the case, why is it so? Assuming, in a non-pet-example program, I have as input a slice of string, is there any way to "cast" it into a slice of Card, or do I have to create a new structure and copy the data into it? (Which I'd like to avoid since the functions I'll need to call will modify the slice content).
There is no technical reason why conversion between slices whose elements have identical underlying types (such as []string and []Card) is forbidden. It was a specification decision to help avoid accidental conversions between unrelated types that by chance have the same structure.
The safe solution is to copy the slice. However, it is possible to convert directly (without copying) using the unsafe package:
value := []string{"a", "b", "c"}
// convert &value (type *[]string) to *[]Card via unsafe.Pointer, then deref
cards := *(*[]Card)(unsafe.Pointer(&value))
firstHand := NewHand(cards)
https://play.golang.org/p/tto57DERjYa
Obligatory warning from the package documentation:
unsafe.Pointer allows a program to defeat the type system and read and write arbitrary memory. It should be used with extreme care.
There was a discussion on the mailing list about conversions and underlying types in 2011, and a proposal to allow conversion between recursively equivalent types in 2016 which was declined "until there is a more compelling reason".
The underlying type of Card might be the same as the underlying type of string (which is itself: string), but the underlying type of []Card is not the same as the underlying type of []string (and therefore the same applies to Hand).
You cannot convert a slice of T1 to a slice of T2 if T1 is not identical to T2; it's not a matter of what underlying types they have, you just can't. Why? Because slices of different element types may have different memory layouts (different element sizes in memory). For example, the elements of a []byte occupy 1 byte each, while the elements of a []int32 occupy 4 bytes each. Obviously you can't just convert one to the other, even if all values are in the range 0..255.
But back to the roots: if you need a slice of Cards, why do you create a slice of strings in the first place? You created the type Card because it is not a string (or at least not just a string). If so and you require []Card, then create []Card in the first place and all your problems go away:
value := []Card{"a", "b", "c"}
firstHand := NewHand(value)
fmt.Println(firstHand)
Note that you are still able to initialize the slice of Card with untyped constant string literals, because they can be used to initialize any type whose underlying type is string. If you want to involve typed string constants or non-constant expressions of type string, you need an explicit conversion, like in the example below:
s := "ddd"
value := []Card{"a", "b", "c", Card(s)}
If you have a []string, you need to manually build a []Card from it. There is no "easier" way. You can create a helper toCards() function so you can use it everywhere you need it.
func toCards(s []string) []Card {
	c := make([]Card, len(s))
	for i, v := range s {
		c[i] = Card(v)
	}
	return c
}
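One consequence worth spelling out, given the asker's concern about functions that modify the slice: the result of toCards is an independent copy, so changes made to it are not reflected in the original []string. A minimal sketch, reusing the helper above:

```go
package main

import "fmt"

type Card string
type Hand []Card

// toCards builds a new []Card from a []string, as in the helper above.
func toCards(s []string) []Card {
	c := make([]Card, len(s))
	for i, v := range s {
		c[i] = Card(v)
	}
	return c
}

func main() {
	value := []string{"a", "b", "c"}
	hand := Hand(toCards(value)) // []Card converts to Hand directly

	hand[0] = "z" // modifies only the copy

	fmt.Println(value) // [a b c]
	fmt.Println(hand)  // [z b c]
}
```

If the modifications need to end up back in the []string, they have to be copied back the same way, element by element.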
Some links for background and reasoning:
Go Language Specification: Conversions
why []string can not be converted to []interface{} in golang
Cannot convert []string to []interface {}
What about memory layout means that []T cannot be converted to []interface in Go?
Quoting the question: "From the specs, it looks like []string is not the same underlying type as []Card, so the type conversion cannot occur."
Exactly right. You have to convert it by looping and copying over each element, converting the type from string to Card on the way.
Quoting the question: "If it is the case, why is it so? Assuming, in a non-pet-example program, I have as input a slice of string, is there any way to "cast" it into a slice of Card, or do I have to create a new structure and copy the data into it? (Which I'd like to avoid since the functions I'll need to call will modify the slice content)."
Because conversions are always explicit and the designers felt that when a conversion implicitly involves a copy it should be made explicit as well.
