Could someone help explain the Golang internals of why this code is mutating the original array a?
func main() {
	a := []int{1, 2, 3, 4}
	b := a
	b = append(b[0:1], b[2:]...)
	fmt.Println(b)
	fmt.Println(a)
}
Output:
[1 3 4]
[1 3 4 4]
I thought b := a would be passing by value. Thanks in advance.
That's how slices work. A slice is just a small header (a pointer plus a length and a capacity); the actual data is stored in the underlying array.
When you copy a slice, the underlying array is not copied. Then you end up with two slices pointing to the same array. Mutating the values of one slice will become visible via the other slice.
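As a minimal sketch of the sharing described above, the following reproduces the question's output and also compares the addresses of the first elements. The helper name removeSecond is hypothetical and only wraps the question's append expression.

```go
package main

import "fmt"

// removeSecond (hypothetical name) mirrors the question's code: it reslices
// and appends in place, so it writes into the backing array that the
// argument shares with the caller's slice.
func removeSecond(s []int) []int {
	return append(s[0:1], s[2:]...)
}

func main() {
	a := []int{1, 2, 3, 4}
	b := a // copies the slice header only; both headers point at the same array

	b = removeSecond(b)

	fmt.Println(b)              // [1 3 4]
	fmt.Println(a)              // [1 3 4 4]
	fmt.Println(&a[0] == &b[0]) // true: same backing array
}
```

The append shifts 3 and 4 one position to the left inside the shared array. a still has length 4, so it also shows the old last element.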
See Go Slices: usage and internals for more details.
If you want to leave the original slice untouched, first make a copy of its elements. For example:
b := append([]int{}, a...) // deep copy
Slices are basically wrappers over arrays. A slice doesn't hold its own data; it just holds a reference to an array. In your code you assign a to b, so both now refer to the same array, and when you mutate the array through b, the change is visible through a as well.
You can use the built-in copy function to copy elements from one slice to another.
// copy returns the count of total copied elements
count := copy(b /*destination*/ , a /*source*/)
But make sure the destination slice has the same length as the source slice: copy only copies min(len(dst), len(src)) elements.
Example is given below:
func main() {
	a := []int{1, 2, 3, 4}
	b := make([]int, len(a))
	_ = copy(b, a)
	a[0] = 2
	fmt.Println(b)
	fmt.Println(a)
}
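To illustrate why the destination length matters, here is a small sketch of how copy behaves with destinations of different lengths. The helper name cloneInts is hypothetical.

```go
package main

import "fmt"

// cloneInts (hypothetical helper) allocates a destination with the same
// length as src and copies the elements into it.
func cloneInts(src []int) []int {
	dst := make([]int, len(src))
	copy(dst, src)
	return dst
}

func main() {
	a := []int{1, 2, 3, 4}

	var empty []int // length 0, so nothing can be copied into it
	fmt.Println(copy(empty, a)) // 0

	short := make([]int, 2)
	fmt.Println(copy(short, a), short) // 2 [1 2]

	b := cloneInts(a)
	a[0] = 2
	fmt.Println(a, b) // [2 2 3 4] [1 2 3 4]
}
```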
I have the following function that generates all subsets of a given array.
The idea is simple - I start with a results array that contains an empty set (slice) and for each element in the input array nums go over all previously generated sets, add the current element of nums to them and add the resulting new sets back to the results array. Nothing particularly interesting.
func subsets(nums []int) [][]int {
	result := [][]int{{}}
	for _, n := range nums {
		newSets := [][]int{}
		for _, set := range result {
			newSets = append(newSets, append(set, n))
		}
		result = append(result, newSets...)
	}
	return result
}
The problem is that using append(newSets, append(set, n)) corrupts the result slice, of which set is a member. I modified the function a bit with some debug code (see below) and also found a workaround (the commented code) which doesn't cause the same behavior.
I very much suspect that this is caused by something that's passed by reference instead of being copied (I am appending the elements of newSets to result). The problem is that I can't find it. :( I never change the result within a loop that iterates over it. I also work with new instances of newSets for each loop. So I'm not sure what's causing it. Please advise. :)
func subsets(nums []int) [][]int {
result := [][]int{{}}
for _, n := range nums {
newSets := [][]int{}
var before, after []int
for _, set := range result {
lastResultIdx := len(result)-1
if lastResultIdx > 0 {
before = make([]int, len(result[lastResultIdx]))
copy(before, result[lastResultIdx])
}
//ns := []int{}
//for _,v := range set {
// ns = append(ns, v)
//}
//ns = append(ns, n)
//newSets = append(newSets, ns)
newSets = append(newSets, append(set, n))
if lastResultIdx > 0 {
after = result[lastResultIdx]
if before[len(before)-1]!=after[len(after)-1] {
fmt.Println(n, "before", before, "after", after)
}
}
}
result = append(result, newSets...)
}
return result
}
func main() {
subsets([]int{0, 1, 2, 3, 4})
}
The problem is here:
append(newSets, append(set, n))
The problem is not that it is a nested append. The problem is that you're assuming append(set,n) will return a new slice. That is not always the case. A slice is a view on an array, and when you add new elements to the slice, if the addition did not result in reallocation of the array, the returned slice is the same slice you passed in, with len field incremented. So when you're going through your results array, you're modifying the elements that are already there, and at the same time, adding them again as if they are different results.
To solve, when you get an element of the result, create a new slice, copy elements of the result to it, append the new element and then add the new slice to result.
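One way to apply the fix described above is sketched below. The name subsetsCopy is hypothetical; the algorithm is otherwise the one from the question, with each new set built in freshly allocated memory.

```go
package main

import "fmt"

// subsetsCopy (hypothetical name) follows the question's algorithm but
// copies each existing set into a new slice before appending to it, so no
// two sets share a backing array.
func subsetsCopy(nums []int) [][]int {
	result := [][]int{{}}
	for _, n := range nums {
		newSets := [][]int{}
		for _, set := range result {
			// Allocate room for the old elements plus the new one.
			ns := make([]int, len(set), len(set)+1)
			copy(ns, set)
			newSets = append(newSets, append(ns, n))
		}
		result = append(result, newSets...)
	}
	return result
}

func main() {
	for _, s := range subsetsCopy([]int{0, 1, 2}) {
		fmt.Println(s)
	}
}
```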
The problem is simple enough: append takes a slice argument—[]T for some type T—plus of course the element(s) to append, and returns a []T result. But []T, if non-nil, consists of two parts: a slice header that points to some backing array and carries a current length and capacity, plus the backing array. When append does its job, it has a choice:
modify the backing array in place, and return a new slice header that re-uses the existing backing array, or
create a new backing array, copy the original values to the new backing array, and return a new slice header that uses the new backing array.
Whenever append copies the backing array, your code works. Whenever it re-uses the backing array, your code may or may not work, depending on whether some other slice header is using the same backing array.
Suppose your backing array has length 5 for instance, and one of the existing slice headers reads "length 1, capacity 5" with element 0 of the backing array holding zero. That is, the existing slice header h contains [0]. Now you call append(h, 1). The append operation re-uses the backing array and puts 1 in the second element and returns a new slice header h1 that contains [0, 1]. Now you take h again, append 2, and make a two-element slice h2 holding [0, 2]. But this re-uses the same backing array that h1 re-used so now h1 also holds [0, 2].
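The scenario above can be reproduced directly. This is only a sketch; the function name aliasDemo is hypothetical, and h, h1 and h2 are the names used in the paragraph.

```go
package main

import "fmt"

// aliasDemo (hypothetical name) reproduces the scenario: h has length 1 and
// capacity 5, so both appends fit in the existing backing array and write
// to the same element.
func aliasDemo() (h1, h2 []int) {
	h := make([]int, 1, 5) // [0], with room for four more elements
	h1 = append(h, 1)      // writes 1 into element 1 of the backing array
	h2 = append(h, 2)      // overwrites element 1 with 2
	return h1, h2
}

func main() {
	h1, h2 := aliasDemo()
	fmt.Println(h1, h2) // [0 2] [0 2]
}
```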
To solve the problem without modifying your algorithm much, you need either:
a variant of append that always copies, or
a variant of "append one int to a slice of ints" that always copies.
The latter is simpler:
func setPlusInt(set []int, n int) []int {
	return append(append([]int(nil), set...), n)
}
which lets you replace one line of your existing code.
(I made one other trivial change here and added enough to provide a working example in the Go Playground.)
(An alternate solution is to set up each of your own slice headers to offer no extra capacity, so that append must always copy. I have not illustrated this method.)
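The alternate approach mentioned above could be sketched with a full slice expression (the three-index form s[low:high:max]), which limits the capacity of the resulting slice header. The name setPlusIntCapped is hypothetical and does not come from the original answer.

```go
package main

import "fmt"

// setPlusIntCapped (hypothetical name) reslices set so that its capacity
// equals its length. append then has no spare room and must allocate a new
// backing array instead of writing into the shared one.
func setPlusIntCapped(set []int, n int) []int {
	return append(set[:len(set):len(set)], n)
}

func main() {
	h := make([]int, 1, 5) // [0], with spare capacity
	h1 := setPlusIntCapped(h, 1)
	h2 := setPlusIntCapped(h, 2)
	fmt.Println(h1, h2) // [0 1] [0 2]
}
```

Compared with copying explicitly, this leaves the copy to append, but it only protects calls that go through the capped header; any other append on the original slice can still write into the shared array.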
There are two problems I don't understand.
The first is that when one slice variable is assigned to another, the address of the new variable differs from that of the original. My understanding is that slices share memory, so in principle their addresses should be the same.
The second is that when the slice's capacity is insufficient, the memory address does not change after the append operation. In principle it should change, because new memory is allocated when the capacity is insufficient.
I would appreciate your comments.
var a = []int{1,2,3}
fmt.Printf("%p\n",&a)
b:=a
fmt.Printf("%p\n",&b) // 1) the first question
b=append(b,0)
fmt.Printf("%p\n",&b) // 2) the second question
fmt.Println(a)
fmt.Println(b)
run result is:
0xc04204c3a0
0xc04204c3e0
0xc04204c3e0
[1 2 3]
[1 2 3 0]
A slice value contains a pointer to the backing array, length and capacity. See Go Slices: usage and internals for the details.
Here's some commentary on the code in the question:
var a = []int{1, 2, 3}
// Print address of variable a
fmt.Printf("%p\n", &a)
b := a
// Print address of variable b. The variable a and b
// have different addresses.
fmt.Printf("%p\n", &b)
b = append(b, 0)
// Print address of variable b. Append did not change
// the address of the variable b.
fmt.Printf("%p\n", &b)
Print the address of the first slice element to get the results you expect.
var a = []int{1, 2, 3}
// Print address of a's first element
fmt.Printf("%p\n", &a[0])
b := a
// Print address of b's first element. This prints
// same value as previous because a and b share a backing
// array.
fmt.Printf("%p\n", &b[0])
b = append(b, 0)
// Print address of b's first element. This prints a
// different value from previous because append allocated
// a new backing array.
fmt.Printf("%p\n", &b[0])
// create a new slice struct which contains the length, the capacity and a pointer to the underlying array.
// len(a)=3, cap(a)=3
var a = []int{1,2,3}
// `&a` means the pointer to slice struct
fmt.Printf("%p\n",&a)
// `b` is another newly created variable of slice struct, so `&b` differs from `&a`,
// but they share the same underlying array.
// len(b)=3, cap(b)=3
b := a
// the elements of `b` have been copied to a newly allocated, larger underlying array,
// but the address of the variable `b` itself remains the same.
// len(b)=4, cap(b)=6
b = append(b, 0)
Hope that these comments can help you
This code is in builtin.go:
// The append built-in function appends elements to the end of a slice. If
// it has sufficient capacity, the destination is resliced to accommodate the
// new elements. If it does not, a new underlying array will be allocated.
// Append returns the updated slice. It is therefore necessary to store the
// result of append, often in the variable holding the slice itself:
// slice = append(slice, elem1, elem2)
// slice = append(slice, anotherSlice...)
// As a special case, it is legal to append a string to a byte slice, like this:
// slice = append([]byte("hello "), "world"...)
func append(slice []Type, elems ...Type) []Type
The last line confuses me. I do not know what ...Type means.
Here is some other code:
package main
import "fmt"
func main() {
s := []int{1,2,3,4,5}
s1 := s[:2]
s2 := s[2:]
s3 := append(s1, s2...)
fmt.Println(s1, s2, s3)
}
The result is
[1 2] [3 4 5] [1 2 3 4 5]
I guess the function of ... is to pick all elements from elems, but I haven't found an official explanation. What is it?
The code in builtin.go serves as documentation. The code is not compiled.
The ... specifies that the final parameter of the function is variadic. Variadic functions are documented in the Go Language specification. In short, variadic functions can be called with any number of arguments for the final parameter.
The Type part is a stand-in for any Go type.
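As a small illustration of both uses of ..., here is a sketch of a user-defined variadic function. The function sum is hypothetical and is not part of the standard library.

```go
package main

import "fmt"

// sum (hypothetical example) is variadic: inside the function, nums is an
// ordinary []int holding whatever arguments the caller supplied.
func sum(nums ...int) int {
	total := 0
	for _, n := range nums {
		total += n
	}
	return total
}

func main() {
	fmt.Println(sum())        // 0
	fmt.Println(sum(1, 2, 3)) // 6

	// At a call site, s... passes an existing slice as the variadic
	// arguments, just like append(s1, s2...) in the question.
	s := []int{4, 5, 6}
	fmt.Println(sum(s...)) // 15
}
```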
The capacity parameter in making a slice in Go does not make much sense to me. For example,
aSlice := make([]int, 2, 2) //a new slice with length and cap both set to 2
aSlice = append(aSlice, 1, 2, 3, 4, 5) //append integers 1 through 5
fmt.Println("aSlice is: ", aSlice) //output [0, 0, 1, 2, 3, 4, 5]
If the slice allows inserting more elements than the capacity allows, why do we need to set it in the make() function?
The builtin append() function appends the elements to the specified slice in place if it has a big enough capacity to accommodate them.
But if the passed slice is not big enough, it allocates a new, big enough backing array, copies the elements from the passed slice into it, appends the new elements, and returns a slice that refers to this new array. Quoting from the append() documentation:
The append built-in function appends elements to the end of a slice. If it has sufficient capacity, the destination is resliced to accommodate the new elements. If it does not, a new underlying array will be allocated. Append returns the updated slice. It is therefore necessary to store the result of append, often in the variable holding the slice itself:
When making a slice with make, if the length and capacity are the same, the capacity can be omitted; it then defaults to the specified length:
// These 2 declarations are equivalent:
s := make([]int, 2, 2)
s := make([]int, 2)
Also note that append() appends elements after the last element of the slice. And the above slices already have len(s) == 2 right after declaration so if you append even just 1 element to it, it will cause a reallocation as seen in this example:
s := make([]int, 2, 2)
fmt.Println(s, len(s), cap(s))
s = append(s, 1)
fmt.Println(s, len(s), cap(s))
Output:
[0 0] 2 2
[0 0 1] 3 4
So in your example what you should do is something like this:
s := make([]int, 0, 10) // Create a slice with length=0 and capacity=10
fmt.Println(s, len(s), cap(s))
s = append(s, 1)
fmt.Println(s, len(s), cap(s))
Output:
[] 0 10
[1] 1 10
I recommend the following blog articles if you want to understand slices in more detail:
Go Slices: usage and internals
Arrays, slices (and strings): The mechanics of 'append'
It is mainly an optimization, and it is not unique to Go; similar structures in other languages have this as well.
When you append more than the capacity, the runtime needs to allocate more memory for the new elements. This is costly and can also cause memory fragmentation.
By specifying the capacity, the runtime allocates what is needed in advance, and avoids reallocations. However if you do not know the estimated capacity in advance or it changes, you do not have to set it, and the runtime reallocates what is needed and grows the capacity itself.
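To make the cost concrete, one could count how often append has to grow the slice with and without a preallocated capacity. This is only a sketch: the helper countReallocs is hypothetical, and the count without preallocation depends on the runtime's growth policy, so no figure is given for it.

```go
package main

import "fmt"

// countReallocs (hypothetical helper) appends n ints to s and reports how
// many times the capacity changed, which is when append had to move the
// elements to a new backing array.
func countReallocs(s []int, n int) int {
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	const n = 1000
	fmt.Println("no preallocation:", countReallocs(nil, n))
	fmt.Println("preallocated:", countReallocs(make([]int, 0, n), n)) // 0
}
```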
I thought that in Go, slices are passed by reference. But why doesn't the following code change the content of slice c? Am I missing something? Thank you.
package main

import (
	"fmt"
)

func call(c []int) {
	c = append(c, 1)
	fmt.Println(c)
}

func main() {
	c := make([]int, 1, 5)
	fmt.Println(c)
	call(c)
	fmt.Println(c)
}
The result printed is:
[0]
[0 1]
[0]
while I was expecting
[0]
[0 1]
[0 1]
The length of the slice is kept in the slice header which is not passed by reference. You can think of a slice as a struct containing a pointer to the array, a length, and a capacity.
When you appended to the slice inside call, you modified index 1 in the data array and then incremented the length in call's own copy of the slice header. When you returned, c in the main function still had a length of 1 and so printed the same data.
The reason slices work this way is so you can have multiple slices pointing to the same data. For example:
x := []int{1,2,3}
y := x[:2] // [1 2]
z := x[1:] // [2 3]
All three of those slices point to overlapping data in the same underlying array.
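A short sketch can confirm that the append inside call really did write into the shared array: reslicing c in main up to length 2 (allowed because the capacity is 5) reveals the element. The code reuses the question's call function with a label added to its output.

```go
package main

import "fmt"

// call mirrors the question: c is a copy of the caller's slice header.
func call(c []int) {
	c = append(c, 1) // fits in the capacity, so it writes into the shared array
	fmt.Println("inside call:", c)
}

func main() {
	c := make([]int, 1, 5)
	call(c)
	fmt.Println(c, len(c)) // [0] 1
	fmt.Println(c[:2])     // [0 1]
}
```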
Go is always pass by value. Certain types are reference types, like pointers, maps, and channels, or partially reference types, like slices (which consist of a reference to the underlying array plus the values of the length and capacity). But regardless of type, everything is passed by value. Thus assigning to a local variable never affects anything outside.
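Given that, two common ways to let the caller see the appended element are to return the updated slice or to pass a pointer to the slice variable. The following is a minimal sketch; the function names appendOne and appendOneInPlace are hypothetical.

```go
package main

import "fmt"

// appendOne (hypothetical name) returns the updated header so the caller
// can store it, the same pattern the append built-in itself uses.
func appendOne(c []int) []int {
	return append(c, 1)
}

// appendOneInPlace (hypothetical name) takes a pointer to the caller's
// slice variable and assigns the updated header through it.
func appendOneInPlace(c *[]int) {
	*c = append(*c, 1)
}

func main() {
	c := make([]int, 1, 5)

	c = appendOne(c)
	fmt.Println(c) // [0 1]

	appendOneInPlace(&c)
	fmt.Println(c) // [0 1 1]
}
```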