Improper use of slices causes unintended side effects - go

I have the following function that generates all subsets of a given array.
The idea is simple - I start with a results array that contains an empty set (slice) and for each element in the input array nums go over all previously generated sets, add the current element of nums to them and add the resulting new sets back to the results array. Nothing particularly interesting.
func subsets(nums []int) [][]int {
    result := [][]int{{}}
    for _, n := range nums {
        newSets := [][]int{}
        for _, set := range result {
            newSets = append(newSets, append(set, n))
        }
        result = append(result, newSets...)
    }
    return result
}
The problem is that using append(newSets, append(set, n)) corrupts the result slice, of which set is a member. I modified the function a bit with some debug code (see below) and also found a workaround (the commented code) which doesn't cause the same behavior.
I very much suspect that this is caused by something that's passed by reference instead of being copied (I am appending the elements of newSets to result). The problem is that I can't find it. :( I never change the result within a loop that iterates over it. I also work with new instances of newSets for each loop. So I'm not sure what's causing it. Please advise. :)
func subsets(nums []int) [][]int {
    result := [][]int{{}}
    for _, n := range nums {
        newSets := [][]int{}
        var before, after []int
        for _, set := range result {
            lastResultIdx := len(result) - 1
            if lastResultIdx > 0 {
                before = make([]int, len(result[lastResultIdx]))
                copy(before, result[lastResultIdx])
            }
            //ns := []int{}
            //for _,v := range set {
            //    ns = append(ns, v)
            //}
            //ns = append(ns, n)
            //newSets = append(newSets, ns)
            newSets = append(newSets, append(set, n))
            if lastResultIdx > 0 {
                after = result[lastResultIdx]
                if before[len(before)-1] != after[len(after)-1] {
                    fmt.Println(n, "before", before, "after", after)
                }
            }
        }
        result = append(result, newSets...)
    }
    return result
}

func main() {
    subsets([]int{0, 1, 2, 3, 4})
}

The problem is here:
append(newSets, append(set, n))
The problem is not that it is a nested append. The problem is that you're assuming append(set, n) will return a new slice. That is not always the case. A slice is a view onto an array, and when you append to a slice whose capacity is large enough that no reallocation is needed, the returned slice shares the backing array with the slice you passed in; only its length is larger. So while you're going through your result slice, you're overwriting elements of subsets that are already there, and at the same time adding them again as if they were different results.
To solve this, when you take an element of result, create a new slice, copy the elements into it, append the new element, and then add the new slice to result.
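A minimal sketch of that fix applied to the question's subsets function. The variable name ns is borrowed from the commented-out workaround in the question; the make/copy approach is one of several ways to force a fresh backing array.

```go
package main

import "fmt"

// subsets returns every subset of nums. Each new subset gets its own
// backing array, so appending to it cannot overwrite an earlier result.
func subsets(nums []int) [][]int {
	result := [][]int{{}}
	for _, n := range nums {
		newSets := [][]int{}
		for _, set := range result {
			// Allocate a fresh slice with room for one more element,
			// copy the existing subset into it, then append n.
			ns := make([]int, len(set), len(set)+1)
			copy(ns, set)
			newSets = append(newSets, append(ns, n))
		}
		result = append(result, newSets...)
	}
	return result
}

func main() {
	for _, s := range subsets([]int{0, 1, 2}) {
		fmt.Println(s)
	}
}
```

With the five-element input from the question, this version should produce 32 distinct subsets.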

The problem is simple enough: append takes a slice argument—[]T for some type T—plus of course the element(s) to append, and returns a []T result. But []T, if non-nil, consists of two parts: a slice header that points to some backing array and carries a current length and capacity, plus the backing array. When append does its job, it has a choice:
modify the backing array in place, and return a new slice header that re-uses the existing backing array, or
create a new backing array, copy the original values to the new backing array, and return a new slice header that uses the new backing array.
Whenever append copies the backing array, your code works. Whenever it re-uses the backing array, your code may or may not work, depending on whether some other slice header is using the same backing array.
Suppose your backing array has length 5 for instance, and one of the existing slice headers reads "length 1, capacity 5" with element 0 of the backing array holding zero. That is, the existing slice header h contains [0]. Now you call append(h, 1). The append operation re-uses the backing array and puts 1 in the second element and returns a new slice header h1 that contains [0, 1]. Now you take h again, append 2, and make a two-element slice h2 holding [0, 2]. But this re-uses the same backing array that h1 re-used so now h1 also holds [0, 2].
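The scenario above can be reproduced in a few lines. This is only an illustration of the described case; the function name aliasDemo is hypothetical.

```go
package main

import "fmt"

// aliasDemo reproduces the scenario described above and returns h1 and h2.
func aliasDemo() (h1, h2 []int) {
	h := make([]int, 1, 5) // length 1, capacity 5, holds [0]
	h1 = append(h, 1)      // reuses the backing array: [0 1]
	h2 = append(h, 2)      // reuses it again and overwrites index 1
	return h1, h2
}

func main() {
	h1, h2 := aliasDemo()
	fmt.Println(h1, h2) // [0 2] [0 2]
}
```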
To solve the problem without modifying your algorithm much, you need either:
a variant of append that always copies, or
a variant of append one int to a slice of ints that always copies.
The latter is simpler:
func setPlusInt(set []int, n int) []int {
    return append(append([]int(nil), set...), n)
}
which lets you replace one line of your existing code.
(I made one other trivial change here and added enough to provide a working example in the Go Playground.)
(An alternate solution is to set up each of your own slice headers to offer no extra capacity, so that append must always copy. I have not illustrated this method.)
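For readers who want to see that alternative, one possible sketch uses Go's three-index slice expression s[low:high:max] to cap the capacity at the length. The helper name clipped is hypothetical and not part of the answer above.

```go
package main

import "fmt"

// clipped returns s with its capacity reduced to its length, so that any
// later append to the result must allocate a new backing array.
func clipped(s []int) []int {
	return s[:len(s):len(s)]
}

func main() {
	base := make([]int, 1, 5) // [0], with spare capacity
	a := append(clipped(base), 1)
	b := append(clipped(base), 2)
	fmt.Println(a, b) // [0 1] [0 2]
}
```

In the subsets function this would mean writing append(set[:len(set):len(set)], n) instead of append(set, n).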

Related

Slice copy mutating original slice

Could someone help explain the Golang internals of why this code is mutating the original array a?
func main() {
    a := []int{1, 2, 3, 4}
    b := a
    b = append(b[0:1], b[2:]...)
    fmt.Println(b)
    fmt.Println(a)
}
Output:
[1 3 4]
[1 3 4 4]
I thought b := a would be passing by value. Thanks in advance.
That's how slices work. A slice is just a pointer(+size+capacity), the actual data is stored in the array.
When you copy a slice, the underlying array is not copied. Then you end up with two slices pointing to the same array. Mutating the values of one slice will become visible via the other slice.
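To make the question's output concrete, here is a small sketch that wraps the same append call in a function (removeSecond is a hypothetical name) and prints the length and capacity alongside the data. Because b[0:1] still has capacity 4, append writes 3 and 4 into positions 1 and 2 of the shared array.

```go
package main

import "fmt"

// removeSecond mirrors the question's code: it drops index 1 by appending
// the tail onto b[0:1], which writes into the array shared with a.
func removeSecond(a []int) []int {
	b := a
	return append(b[0:1], b[2:]...)
}

func main() {
	a := []int{1, 2, 3, 4}
	b := removeSecond(a)
	fmt.Println(b, len(b), cap(b)) // [1 3 4] 3 4
	fmt.Println(a)                 // [1 3 4 4]
}
```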
See Go Slices: usage and internals for more details.
If you want to leave the original slice untouched, first copy its elements into a new backing array. For example:
b := append([]int{}, a...) // b gets its own backing array
Slices are basically wrappers over arrays. A slice doesn't hold its own data; it just holds a reference to an array. In your code you assign a to b, so both now refer to the same array, and when you mutate through b, the change is also visible through a.
You can use the built-in copy function to copy elements from one slice to another.
// copy returns the number of elements copied
count := copy(b /*destination*/, a /*source*/)
But make sure to allocate the destination slice with the same length as the source slice, because copy only copies min(len(dst), len(src)) elements.
Example is given below:
func main() {
    a := []int{1, 2, 3, 4}
    b := make([]int, len(a))
    _ = copy(b, a)
    a[0] = 2
    fmt.Println(b) // [1 2 3 4]
    fmt.Println(a) // [2 2 3 4]
}

Go slices mutation best practices

It is hard to predict whether the original underlying array or a copy of it gets mutated when slices are passed around:
a := [3]int{0, 1, 2}
s := a[:]
s[0] = 10
fmt.Println(a[0] == s[0]) // true
s = append(s, 3) // exceeds the capacity, so s gets a new backing array
s[0] = 20
fmt.Println(a[0] == s[0]) // false
Let's say today I had processing of this kind:
a := [3]int{0, 1, 2}
s := some_func(a[:]) // returns slice
process(s) // a is getting mutated because so far some_func hasn't caused the underlying array to be copied
and then tomorrow:
a := [3]int{0, 1, 2}
s := some_func(a[:]) // returns slice, does append operations
process(s) // a is not getting mutated because some_func caused the underlying array to be copied
What are the best practices for slices then?
If a function really does modify a slice's underlying array in place, and promises that it always modifies the underlying array in place, that function should in general take the slice argument by value and not return an updated slice [1]:
// Mutate() modifies (the backing array of) s in place to achieve $result.
// See below for why it returns an int.
func Mutate(s []T) int {
// code
}
If a function may modify the underlying array in place but may return a slice that uses a new array, the function should return a new slice value, or take a pointer to a slice:
// Replace() operates on a slice of T, but may return a totally new
// slice of T.
func Replace(s []T) []T {
// code
}
When this function returns, you should assume that the underlying array, if you have hold of it, may or may not be in use:
func callsReplace() {
var arr [10]T
s := Replace(arr[:])
// From here on, do not use variable arr directly as
// we don't know if it is s's backing array, or not.
// more code
}
But Mutate() promises to modify the array in place. Note that Mutate will often need to return the number of array elements actually updated:
func callsMutate() {
    var arr [10]T
    n := Mutate(arr[:])
    // now work with arr[0] through arr[n-1]
    // more code
}
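As a concrete illustration of the two contracts, here is a sketch with int elements. The function names compactInPlace and withSentinel are hypothetical stand-ins for Mutate and Replace, chosen only for this example.

```go
package main

import "fmt"

// compactInPlace is a Mutate-style function: it moves the non-zero values
// of s to the front of the same backing array and returns how many it
// kept. It never allocates.
func compactInPlace(s []int) int {
	n := 0
	for _, v := range s {
		if v != 0 {
			s[n] = v
			n++
		}
	}
	return n
}

// withSentinel is a Replace-style function: append may or may not
// reallocate, so the caller must use the returned slice.
func withSentinel(s []int) []int {
	return append(s, -1)
}

func main() {
	arr := [5]int{0, 3, 0, 7, 9}
	n := compactInPlace(arr[:])
	fmt.Println(arr[:n]) // [3 7 9]

	s := withSentinel(arr[:n])
	fmt.Println(s) // [3 7 9 -1]
}
```

After the call to withSentinel, the caller should work with s only, since whether arr was written to depends on the spare capacity at the time of the call.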
[1] Of course, it could take a pointer to the array object, and modify the array in place, but that's less flexible since the array size is then baked into the type.

Add []bytes append slice []bytes

I have started learning Go and there is something I do not quite understand; maybe I'm just confused and tired.
Here is my code. There is a slice result (of encoded strings, 2139614 elements). I need to decode them and use them further. But after the loop runs, the resulting slice (rips) is twice as large as expected and its first half is completely empty. As a workaround, I take a sub-slice covering the range I need (resultrips).
Why does this happen?
It might be easier to decode each element of result in place and overwrite it, but I don't know how to do that.
Maybe there is a completely different way that, as a beginner, I don't know yet.
result := []string{}
for i := range input {
    result = append(result, i)
}
sort.Strings(result)
rips := make([][]byte, 2139614)
for _, i := range result {
    c := Decode(i)
    c = c[1:37]
    rips = append(rips, c)
}
// len(result) == 2139614
for i := 2139610; i < 2139700; i++ {
    fmt.Println(i, rips[i])
}
resultrips := rips[2139614:]
for _, i := range resultrips {
    fmt.Println(i)
}
fmt.Println("All write: ", len(resultrips))
And a second question: am I doing this right if I need an array of byte arrays? (I keep the values as bytes to avoid extra work, since no text encoding is involved.)
rips := make([][]byte, 2139614) // slice of []byte
In the end, I need something like the set type in C++, so that I can check whether an element is in my set.
In C++ the code was:
if (resultrips.count > 0) { ... }
When you write:
make([][]byte, 2139614)
This creates a slice with length and capacity equal to 2139614. When you append to a slice, it always appends after the last element, thereby increasing the length. If you want to pre-allocate a large slice so that you can append into it, you want to specify a length of 0:
make([][]byte, 0, 2139614)
This pre-allocates 2139614 elements, but with a length of 0, subsequent append calls will start at the beginning of the slice; after the first append it will have a length of 1, and it will not need to have increased its capacity.
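The difference between the two make calls can be seen with a much smaller input. This is a sketch only; the helper fill and the three-item input are made up for the illustration.

```go
package main

import "fmt"

// fill appends each item to dst as a []byte and returns the result.
func fill(dst [][]byte, items []string) [][]byte {
	for _, it := range items {
		dst = append(dst, []byte(it))
	}
	return dst
}

func main() {
	items := []string{"a", "b", "c"}

	// Length 3: append starts after the three existing nil entries.
	withLen := fill(make([][]byte, len(items)), items)
	fmt.Println(len(withLen)) // 6

	// Length 0, capacity 3: append starts at index 0 and never reallocates.
	withCap := fill(make([][]byte, 0, len(items)), items)
	fmt.Println(len(withCap), cap(withCap)) // 3 3
}
```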
Length vs capacity is covered in the Tour of Go: https://tour.golang.org/moretypes/13
A quick note based on the text of your question - remember that slices and arrays are not the same thing. Arrays have a compile-time fixed length and their capacity is synonymous with their length. Slices are backed by arrays but have runtime dynamic independent length and capacity.

slice iteration order in go

OK, I think this may be an old question, but I didn't find anything on Stack Overflow. In Go, the iteration order over a map is not guaranteed to be reproducible. So the suggested approach is to hold the keys in a slice and sort that slice, then iterate over the slice to retrieve the values from the map, so that we get them in order (since the slice of keys is sorted, the order is reproducible). This seems to imply that the slice needs to be sorted, or else iteration over the slice will not give a reproducible order either. But when I tried the code below in the playground, I always found the order maintained during iteration. So in the map iteration case, why does the slice of keys need to be sorted?
func main() {
    var mySlice = make([]string, 0)
    mySlice = append(mySlice, "abcd")
    mySlice = append(mySlice, "efgh")
    mySlice = append(mySlice, "ijkl")
    mySlice = append(mySlice, "mnop")
    mySlice = append(mySlice, "qrst")
    mySlice = append(mySlice, "uvwxyz")
    for _, val := range mySlice {
        fmt.Println(val)
    }
    fmt.Println(strings.Join(mySlice, "|"))
}
Output:
abcd
efgh
ijkl
mnop
qrst
uvwxyz
abcd|efgh|ijkl|mnop|qrst|uvwxyz
A slice or array will always have a fixed order, i.e. how it is laid out in memory.
The documentation you were reading was probably just telling you to sort the slice so that the map output is in sorted order.
You are correct that the iteration order of a map is undefined and hence can be different each time it is performed. If you use a slice to iterate a map then it will always come back in a reliable order, i.e. the order of the keys in the slice.
I suggest you have a read over the information about slices.
EDIT
If it helps, consider the following code to illustrate that the sorting of a slice has nothing to do with its order being fixed:
words := map[int]string{
    0: "hello",
    1: "there",
    2: "goodbye",
}
keys := []int{2, 0, 1}
for _, k := range keys {
    // Will output in order: goodbye, hello, there
    fmt.Println("Key:", k, "Value:", words[k])
}
The only reason your slice is sorted is because you're appending items in already sorted order. If you appended items in an unsorted order like this
var mySlice = make([]string, 0)
mySlice = append(mySlice, "mnop")
mySlice = append(mySlice, "efgh")
mySlice = append(mySlice, "uvwxyz")
mySlice = append(mySlice, "ijkl")
mySlice = append(mySlice, "abcd")
mySlice = append(mySlice, "qrst")
(or populated a slice by pulling keys from a map, which would be unsorted), then the order on iteration would be unsorted (consistent, yes, but consistently unsorted). So, if your objective is to use the slice to pull items from a map in sorted order, then you need to first sort the slice, unless you can guarantee the slice items were inserted in an already sorted order.
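Putting the pieces together, a minimal sketch of the usual pattern: collect the map keys into a slice, sort it, then iterate. The helper name sortedKeys and the sample map are assumptions for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys collects the keys of m into a slice and sorts them.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m { // map order is unspecified here
		keys = append(keys, k)
	}
	sort.Strings(keys) // after this, the slice order is fixed and sorted
	return keys
}

func main() {
	m := map[string]int{"mnop": 1, "abcd": 2, "ijkl": 3}
	for _, k := range sortedKeys(m) {
		fmt.Println(k, m[k])
	}
}
```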

Check whether a string slice contains a certain value in Go

What is the best way to check whether a certain value is in a string slice? I would use a Set in other languages, but Go doesn't have one.
My best try is this so far:
package main

import "fmt"

func main() {
    list := []string{"a", "b", "x"}
    fmt.Println(isValueInList("b", list))
    fmt.Println(isValueInList("z", list))
}

func isValueInList(value string, list []string) bool {
    for _, v := range list {
        if v == value {
            return true
        }
    }
    return false
}
http://play.golang.org/p/gkwMz5j09n
This solution should be ok for small slices, but what to do for slices with many elements?
If you have a slice of strings in an arbitrary order, finding if a value exists in the slice requires O(n) time. This applies to all languages.
If you intend to do a search over and over again, you can use other data structures to make lookups faster. However, building these structures requires at least O(n) time, so you will only benefit if you do lookups using the data structure more than once.
For example, you could load your strings into a map. Then lookups would take O(1) time. Insertions also take O(1) time making the initial build take O(n) time:
set := make(map[string]bool)
for _, v := range list {
    set[v] = true
}
fmt.Println(set["b"])
You can also sort your string slice and then do a binary search. Binary searches occur in O(log(n)) time. Building can take O(n*log(n)) time.
sort.Strings(list)
i := sort.SearchStrings(list, "b")
fmt.Println(i < len(list) && list[i] == "b")
Although a map lookup is asymptotically faster, in practice a binary search over a sorted slice may well be faster for realistic sizes. You need to benchmark it yourself.
To replace sets you should use a map[string]struct{}. This is efficient and considered idiomatic; the empty-struct "values" take no space.
Initialize the set:
set := make(map[string]struct{})
Put an item:
set["item"] = struct{}{}
Check whether an item is present:
_, isPresent := set["item"]
Remove an item:
delete(set, "item")
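The four operations above combine into a small runnable sketch. The helpers toSet and contains are hypothetical names, not part of any standard package.

```go
package main

import "fmt"

// toSet builds a set from list using empty-struct values.
func toSet(list []string) map[string]struct{} {
	set := make(map[string]struct{}, len(list))
	for _, v := range list {
		set[v] = struct{}{}
	}
	return set
}

// contains reports whether value is present in set.
func contains(set map[string]struct{}, value string) bool {
	_, ok := set[value]
	return ok
}

func main() {
	set := toSet([]string{"a", "b", "x"})
	fmt.Println(contains(set, "b")) // true
	fmt.Println(contains(set, "z")) // false

	delete(set, "b")
	fmt.Println(contains(set, "b")) // false
}
```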
You can use a map with, for example, a bool value:
m := map[string]bool{"a": true, "b": true, "x": true}
if m["a"] { // will be false if "a" is not in the map
    // it was in the map
}
There's also the sort package, so you could sort your slice and binary search it.
