Golang: appending slices with or without allocation

Go's append() function allocates new slice data only when the capacity of the given slice is insufficient (see also: https://stackoverflow.com/a/28143457/802833). This can lead to unexpected behavior (at least for me as a Go newbie):
package main
import (
	"fmt"
)
func main() {
	a1 := make([][]int, 3)
	a2 := make([][]int, 3)
	b := [][]int{{1, 1, 1}, {2, 2, 2}, {3, 3, 3}}
	common1 := make([]int, 0)
	common2 := make([]int, 0, 12) // provide sufficient capacity
	common1 = append(common1, []int{10, 20}...)
	common2 = append(common2, []int{10, 20}...)
	idx := 0
	for _, k := range b {
		a1[idx] = append(common1, k...) // new slice is allocated
		a2[idx] = append(common2, k...) // no allocation
		idx++
	}
	fmt.Println(a1)
	fmt.Println(a2) // surprise!!!
}
output:
[[10 20 1 1 1] [10 20 2 2 2] [10 20 3 3 3]]
[[10 20 3 3 3] [10 20 3 3 3] [10 20 3 3 3]]
https://play.golang.org/p/8PEqFxAsMt
So, what is the (idiomatic) way in Go to force allocation of new slice data or, more precisely, to make sure that the slice argument to append() remains unchanged?

You might have a wrong idea of how slices work in Go.
When you append elements to a slice, the call to append() returns a new slice. If reallocation did not happen, both slice values — the one you called append() on and the one it returned back — share the same backing array but they will have different lengths; observe:
package main
import "fmt"
func main() {
	a := make([]int, 0, 10)
	b := append(a, 1, 2, 3)
	c := append(a, 4, 3, 2)
	fmt.Printf("a=%#v\nb=%#v\nc=%#v\n", a, b, c)
}
outputs:
a=[]int{}
b=[]int{4, 3, 2}
c=[]int{4, 3, 2}
So, len(a) == 0, len(b) == 3, len(c) == 3, and the second call to append() overwrote what the first one did because all the slices share the same underlying array.
As to reallocation of the backing array, the spec is clear:
If the capacity of s is not large enough to fit the additional values, append allocates a new, sufficiently large underlying array that fits both the existing slice elements and the additional values. Otherwise, append re-uses the underlying array.
From this, it follows that:
append() never copies the underlying storage if the capacity of the slice being appended to is sufficient.
If there's not enough capacity, the array will be reallocated.
That is, given a slice s to which you want to append N elements, the reallocation won't be done iff cap(s) - len(s) ≥ N.
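The rule above can be checked directly. The following is a minimal sketch, not part of the original answer; the helper name sharesBackingArray is hypothetical, and it simply compares the address of the first element of two slices.

```go
package main

import "fmt"

// sharesBackingArray reports whether two non-empty int slices start at the
// same array element. Hypothetical helper, for illustration only.
func sharesBackingArray(x, y []int) bool {
	return len(x) > 0 && len(y) > 0 && &x[0] == &y[0]
}

func main() {
	s := make([]int, 2, 4) // len 2, cap 4: room for exactly 2 more elements

	fits := append(s, 1, 2)     // N = 2, cap(s)-len(s) = 2: array is reused
	grows := append(s, 1, 2, 3) // N = 3 exceeds the spare capacity: new array

	fmt.Println(sharesBackingArray(s, fits))  // true
	fmt.Println(sharesBackingArray(s, grows)) // false
}
```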
Hence I suspect your problem is not about unexpected reallocation results but rather about the concept of slices as implemented in Go. The core idea to absorb is that append() returns the resulting slice value, which you're supposed to be using after the call unless you fully understand the repercussions.
I recommend starting with this to fully understand them.

Thanks for your feedback.
So the solution to gain control of the memory allocation is to do it explicitly (which reminds me that Go is more of a systems language than other (scripting) languages):
package main
import (
	"fmt"
)
func main() {
	a1 := make([][]int, 3)
	a2 := make([][]int, 3)
	b := [][]int{{1, 1, 1}, {2, 2, 2}, {3, 3, 3}}
	common1 := make([]int, 0)
	common2 := make([]int, 0, 12) // provide sufficient capacity
	common1 = append(common1, []int{10, 20}...)
	common2 = append(common2, []int{10, 20}...)
	idx := 0
	for _, k := range b {
		a1[idx] = append(common1, k...) // new slice is allocated
		a2[idx] = make([]int, len(common2), len(common2)+len(k))
		copy(a2[idx], common2) // copy & append could probably be
		a2[idx] = append(a2[idx], k...) // combined into a single copy step
		idx++
	}
	fmt.Println(a1)
	fmt.Println(a2)
}
output:
[[10 20 1 1 1] [10 20 2 2 2] [10 20 3 3 3]]
[[10 20 1 1 1] [10 20 2 2 2] [10 20 3 3 3]]
https://play.golang.org/p/Id_wSZwb84

Related

The most concise way to concatenate 3 or more slices

I am looking for a way to concisely and efficiently concatenate 3 or more slices in Go.
Let's say I want to concatenate the following slices (all the code can be found here - https://play.golang.org/p/6682YiFF8qG):
a := []int{1, 2, 3}
b := []int{4, 5, 6}
c := []int{7, 8, 9}
My first attempt is by using the append method:
d1 := append(a, b...)
d1 = append(d1, c...) // [1 2 3 4 5 6 7 8 9]
However, this method is verbose and requires 2 append calls for concatenating three slices. So, for n slices, I will need n-1 calls to append, which is not only verbose, but also inefficient as it requires multiple allocations.
My next attempt is to create a variadic function to handle the concatenation with only one new slice allocation:
func concat(slicesOfSlices ...[]int) []int {
	var totalLengthOfSlices int
	for _, slice := range slicesOfSlices {
		totalLengthOfSlices += len(slice)
	}
	arr := make([]int, 0, totalLengthOfSlices)
	for _, slice := range slicesOfSlices {
		arr = append(arr, slice...)
	}
	return arr
}
Then I can use it as follows:
d2 := concat(a, b, c) // [1 2 3 4 5 6 7 8 9]
To illustrate, I want to emulate the following convenient functionality of the spread operator in JavaScript, which I often use in the following way:
const a = [1, 2, 3];
const b = [4, 5, 6];
const c = [7, 8, 9];
const d = [...a, ...b, ...c]; // [1, 2, 3, 4, 5, 6, 7, 8, 9]
In other words, I am looking for a way to do something like d3 := append(a, b, c) or d3 := append(a, b..., c...) but with the standard Go library or using less code than I did.
Note on possible duplicates
I don't think this is a duplicate of the question "How to concatenate two slices" as my question is about concatenating 3 or more slices in the most concise and idiomatic way.
You could use your first method of using append like this:
a := []int{1, 2, 3, 4}
b := []int{9, 8, 7, 6}
c := []int{5, 4, 3, 2}
a = append(a, append(b, c...)...)
That being said, I think that your variadic concat function is cleaner and isn't very much code for a utility function.
Good luck!

Best way to remove selected elements from slice

I have a slice A and another slice B. Slice A contains n elements, and slice B is a subset of slice A where each element is a pointer to an element of slice A.
What would be the cheapest method to remove all elements from A that are referred to in B?
After a bit of googling, the only method I can think of is to reslice slice A for each element in B. Is that the only method, or is there a simpler one?
A and B may have duplicates and may not be sorted.
For example, with an O(n) growth rate:
package main
import "fmt"
func remove(a []int, b []*int) []int {
d := make(map[*int]bool, len(b))
for _, e := range b {
d[e] = true
}
var c []int
if len(a) >= len(d) {
c = make([]int, 0, len(a)-len(d))
}
for i := range a {
if !d[&a[i]] {
c = append(c, a[i])
}
}
return c
}
func main() {
a := []int{0, 1, 2, 3, 4, 5, 6, 7}
fmt.Println(a)
b := []*int{&a[1], &a[3], &a[3], &a[7], &a[4]}
a = remove(a, b)
fmt.Println(a)
}
Playground: https://play.golang.org/p/-RpkH51FSt2
Output:
[0 1 2 3 4 5 6 7]
[0 2 5 6]

Explanation of heap indexing example

This code is taken from the Go heap example (with my own added prints). Here's the playground.
https://play.golang.org/p/E69SfBIZF5X
Almost everything is straightforward and makes sense, but the one thing I can't wrap my head around is why the 'minimum' print on index 0 of the heap in main() returns the value 1 (the correct minimum), while the heap's Pop function prints n as 4 for that same value 1 (see output).
If the root (minimum) of a heap is always at n=0, why is it n=4 in the pop function itself? It then seems to work fine, in descending order.
Can someone explain what's going on here? I don't feel comfortable implementing something like the Pop before I understand what's going on.
// This example demonstrates an integer heap built using the heap interface.
package main
import (
"container/heap"
"fmt"
)
// An IntHeap is a min-heap of ints.
type IntHeap []int
func (h IntHeap) Len() int { return len(h) }
func (h IntHeap) Less(i, j int) bool { return h[i] < h[j] }
func (h IntHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *IntHeap) Push(x interface{}) {
// Push and Pop use pointer receivers because they modify the slice's length,
// not just its contents.
*h = append(*h, x.(int))
}
func (h *IntHeap) Pop() interface{} {
old := *h
n := len(old)
x := old[n-1]
*h = old[0 : n-1]
fmt.Printf("n: %v\n", n)
fmt.Printf("x: %v\n", x)
return x
}
// This example inserts several ints into an IntHeap, checks the minimum,
// and removes them in order of priority.
func main() {
h := &IntHeap{2, 1, 5}
heap.Init(h)
heap.Push(h, 3)
fmt.Printf("minimum: %d\n", (*h)[0])
for h.Len() > 0 {
fmt.Printf("roll: %d\n", (*h)[0])
fmt.Printf("%d\n", heap.Pop(h))
}
}
Output
x = value
n = index
minimum: 1
roll: 1
n: 4
x: 1
1
roll: 2
n: 3
x: 2
2
roll: 3
n: 2
x: 3
3
roll: 5
n: 1
x: 5
5
The textbook heap algorithms include a way to fix up a heap if you know the entire heap structure is correct (a[n] < a[2*n+1] && a[n] < a[2*n+2], for all n in bounds), except that the root is wrong, in O(lg n) time. When you heap.Pop() an item, it almost certainly (*IntHeap).Swaps the first and last elements, does some more swapping to maintain the heap invariants, and then (*IntHeap).Pops the last element. That's what you're seeing here.
You can also use this to implement a heap sort. Say you have an array a of type [4]int that you're trying to sort. Take a slice s := a[:] (len=4, cap=4), then:
1. If len(s) == 1, stop.
2. Swap s[0] and s[len(s)-1].
3. Shrink the slice by one item: s = s[:len(s)-1] (same array, len reduced by one, cap unchanged).
4. If the heap is out of order, fix it.
5. Go to 1.
Say your example starts with [1, 2, 5, 3]. Then:
[1, 2, 5, 3]
[3, 2, 5, 1] Swap first and last
[3, 2, 5], 1 Shrink slice by one
[2, 3, 5], 1 Correct heap invariant
[5, 3, 2], 1 Swap first and last
[5, 3], 2, 1 Shrink slice by one
[3, 5], 2, 1 Correct heap invariant
[5, 3], 2, 1 Swap first and last
[5], 3, 2, 1 Shrink slice by one
5, 3, 2, 1 Sorted (descending order)

Decreasing slice capacity

My question is about slice length and capacity. I'm learning about Go here: https://tour.golang.org/moretypes/11.
(My question was marked as a possible duplicate of this; however, this is not the case. My question is specifically about the cutting off the first few elements of a slice and the implications of that.)
Why does the line s = s[2:] decrease the capacity when s = s[:4] and s = s[:0] do not? The only difference I see is that there is a number before the colon in s = s[2:] while there is a number after the colon in the other two lines.
Is there any way to recover the first two elements that we cut off with s = s[2:]?
package main
import "fmt"
func main() {
s := []int{2, 3, 5, 7, 11, 13}
printSlice(s)
// Slice the slice to give it zero length.
s = s[:0]
printSlice(s)
// Extend its length.
s = s[:4]
printSlice(s)
// Drop its first two values.
s = s[2:]
printSlice(s)
}
func printSlice(s []int) {
fmt.Printf("len=%d cap=%d %v\n", len(s), cap(s), s)
}
After clicking the Run button, we get the following.
len=6 cap=6 [2 3 5 7 11 13]
len=0 cap=6 []
len=4 cap=6 [2 3 5 7]
len=2 cap=4 [5 7]
You can read more about slices here. But I think this passage answers your question:
Slicing does not copy the slice's data. It creates a new slice value that points to the original array. This makes slice operations as efficient as manipulating array indices. Therefore, modifying the elements (not the slice itself) of a re-slice modifies the elements of the original slice.
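So you cannot recover the dropped elements through s once you have assigned the result back to the same variable, because a slice can only be extended towards the end of its backing array, never back towards the start.
The capacity decreases because dropping the first 2 elements moves the slice's pointer forward to the new first element, and capacity is counted from that element to the end of the underlying array.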
How slices are represented in the memory:
make([]byte, 5)
s = s[2:4]
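To make the pointer movement concrete, here is a small sketch that keeps a second variable (orig, my own name) referring to the whole array. It shows that the dropped elements are still reachable through orig, while s can only grow towards the end.

```go
package main

import "fmt"

// describe formats a slice the same way printSlice does in the question.
func describe(s []int) string {
	return fmt.Sprintf("len=%d cap=%d %v", len(s), cap(s), s)
}

func main() {
	orig := []int{2, 3, 5, 7, 11, 13}
	s := orig[:4]
	s = s[2:] // the slice now starts at orig[2]

	fmt.Println(describe(s))          // len=2 cap=4 [5 7]
	fmt.Println(&s[0] == &orig[2])    // true
	fmt.Println(describe(s[:cap(s)])) // len=4 cap=4 [5 7 11 13]
	fmt.Println(describe(orig[:2]))   // len=2 cap=6 [2 3]
}
```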
You can use a full slice expression:
package main
func main() {
s := []int{2, 3, 5, 7, 11, 13}
{ // example 1
t := s[:0]
println(cap(t) == 6)
}
{ // example 2
t := s[:0:0]
println(cap(t) == 0)
}
}
https://golang.org/ref/spec#Slice_expressions
The slices.Clip function from "golang.org/x/exp/slices" (also available in the standard library's slices package since Go 1.21) reduces the capacity of a slice by means of a full slice expression.
Clip removes unused capacity from the slice, returning s[:len(s):len(s)].
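package main
import (
	"fmt"
	"golang.org/x/exp/slices" // on Go 1.21+, the standard library "slices" package also provides Clip
)
func main() {
	s := []int{2, 3, 5, 7, 11, 13}
	printSlice(s)
	s = s[:4]
	printSlice(s)
	s = slices.Clip(s)
	printSlice(s)
}
func printSlice(s []int) {
	fmt.Printf("len=%d cap=%d %v\n", len(s), cap(s), s)
}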
len=6 cap=6 [2 3 5 7 11 13]
len=4 cap=6 [2 3 5 7]
len=4 cap=4 [2 3 5 7]

Concatenate two slices in Go

I'm trying to combine the slice [1, 2] and the slice [3, 4]. How can I do this in Go?
I tried:
append([]int{1,2}, []int{3,4})
but got:
cannot use []int literal (type []int) as type int in append
However, the documentation seems to indicate this is possible. What am I missing?
slice = append(slice, anotherSlice...)
Add dots after the second slice:
// vvv
append([]int{1,2}, []int{3,4}...)
This is just like any other variadic function.
func foo(is ...int) {
for i := 0; i < len(is); i++ {
fmt.Println(is[i])
}
}
func main() {
foo([]int{9,8,7,6,5}...)
}
Appending to and copying slices
The variadic function append appends zero or more values x to s
of type S, which must be a slice type, and returns the resulting
slice, also of type S. The values x are passed to a parameter of
type ...T where T is the element type of S and the respective
parameter passing rules apply. As a special case, append also accepts
a first argument assignable to type []byte with a second argument of
string type followed by .... This form appends the bytes of the
string.
append(s S, x ...T) S // T is the element type of S
s0 := []int{0, 0}
s1 := append(s0, 2) // append a single element s1 == []int{0, 0, 2}
s2 := append(s1, 3, 5, 7) // append multiple elements s2 == []int{0, 0, 2, 3, 5, 7}
s3 := append(s2, s0...) // append a slice s3 == []int{0, 0, 2, 3, 5, 7, 0, 0}
Passing arguments to ... parameters
If f is variadic with final parameter type ...T, then within the
function the argument is equivalent to a parameter of type []T. At
each call of f, the argument passed to the final parameter is a new
slice of type []T whose successive elements are the actual arguments,
which all must be assignable to the type T. The length of the slice is
therefore the number of arguments bound to the final parameter and may
differ for each call site.
The answer to your question is example s3 := append(s2, s0...) in the Go Programming Language Specification. For example,
s := append([]int{1, 2}, []int{3, 4}...)
Nothing against the other answers, but I found the brief explanation in the docs more easily understandable than the examples in them:
func append
func append(slice []Type, elems ...Type) []Type The append built-in
function appends elements to the end of a slice. If it has sufficient
capacity, the destination is resliced to accommodate the new elements.
If it does not, a new underlying array will be allocated. Append
returns the updated slice. It is therefore necessary to store the
result of append, often in the variable holding the slice itself:
slice = append(slice, elem1, elem2)
slice = append(slice, anotherSlice...)
As a special case, it is legal to append a string to a byte slice,
like this:
slice = append([]byte("hello "), "world"...)
I would like to emphasize @icza's answer and simplify it a bit since it is a crucial concept. I assume that the reader is familiar with slices.
c := append(a, b...)
This is a valid answer to the question.
BUT if you need to use slices 'a' and 'c' later in the code in different contexts, this is not a safe way to concatenate slices.
To explain, let's read the expression not in terms of slices, but in terms of underlying arrays:
"Take the (underlying) array of 'a' and append the elements from array 'b' to it. If array 'a' has enough capacity to include all elements from 'b', the underlying array of 'c' will not be a new array; it will actually be array 'a'. Basically, slice 'a' will show len(a) elements of underlying array 'a', and slice 'c' will show len(c) elements of array 'a'."
append() does not necessarily create a new array! This can lead to unexpected results. See Go Playground example.
Always use the make() function if you want to make sure that a new array is allocated for the slice. For example, here are a few ugly but efficient enough options for the task.
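// Option 1: allocate with extra capacity, copy a, then append b.
la := len(a)
c := make([]int, la, la+len(b))
copy(c, a)
c = append(c, b...)
// Option 2: allocate the full length up front and copy both slices.
la := len(a)
c := make([]int, la+len(b))
copy(c, a)
copy(c[la:], b)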
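The hazard described in this answer can be reproduced in a few lines. The sketch below is my own illustration, not the playground example the answer links to: it gives a spare capacity so that append works in place, then shows that writing through c also changes a, while the make and copy approach above does not.

```go
package main

import "fmt"

func main() {
	a := make([]int, 2, 10) // spare capacity, so append can work in place
	a[0], a[1] = 1, 2
	b := []int{3, 4}

	c := append(a, b...) // c reuses a's backing array
	c[0] = 100
	fmt.Println(a, c) // [100 2] [100 2 3 4]

	// Allocating explicitly with make and copy breaks the link.
	d := make([]int, len(a), len(a)+len(b))
	copy(d, a)
	d = append(d, b...)
	d[0] = -1
	fmt.Println(a, d) // [100 2] [-1 2 3 4]
}
```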
I think it's important to point out and to know that if the destination slice (the slice you append to) has sufficient capacity, the append will happen "in-place", by reslicing the destination (reslicing to increase its length in order to be able to accommodate the appendable elements).
This means that if the destination was created by slicing a bigger array or slice which has additional elements beyond the length of the resulting slice, they may get overwritten.
To demonstrate, see this example:
a := [10]int{1, 2}
fmt.Printf("a: %v\n", a)
x, y := a[:2], []int{3, 4}
fmt.Printf("x: %v, y: %v\n", x, y)
fmt.Printf("cap(x): %v\n", cap(x))
x = append(x, y...)
fmt.Printf("x: %v\n", x)
fmt.Printf("a: %v\n", a)
Output (try it on the Go Playground):
a: [1 2 0 0 0 0 0 0 0 0]
x: [1 2], y: [3 4]
cap(x): 10
x: [1 2 3 4]
a: [1 2 3 4 0 0 0 0 0 0]
We created a "backing" array a with length 10. Then we created the destination slice x by slicing this array, and the slice y using the composite literal []int{3, 4}. When we append y to x, the result is the expected [1 2 3 4], but what may be surprising is that the backing array a also changed: the capacity of x is 10, which is sufficient to append y to it, so x is resliced (still using the same backing array a), and append() copies the elements of y into it.
If you want to avoid this, you may use a full slice expression which has the form
a[low : high : max]
which constructs a slice and also controls the resulting slice's capacity by setting it to max - low.
See the modified example (the only difference is that we create x like this: x = a[:2:2]):
a := [10]int{1, 2}
fmt.Printf("a: %v\n", a)
x, y := a[:2:2], []int{3, 4}
fmt.Printf("x: %v, y: %v\n", x, y)
fmt.Printf("cap(x): %v\n", cap(x))
x = append(x, y...)
fmt.Printf("x: %v\n", x)
fmt.Printf("a: %v\n", a)
Output (try it on the Go Playground)
a: [1 2 0 0 0 0 0 0 0 0]
x: [1 2], y: [3 4]
cap(x): 2
x: [1 2 3 4]
a: [1 2 0 0 0 0 0 0 0 0]
As you can see, we get the same x result but the backing array a did not change, because capacity of x was "only" 2 (thanks to the full slice expression a[:2:2]). So to do the append, a new backing array is allocated that can store the elements of both x and y, which is distinct from a.
append() function and the ... operator
Two slices can be concatenated using the built-in append function. Because append is variadic, the second slice must be followed by ... so that its elements are passed as individual arguments.
package main
import (
"fmt"
)
func main() {
x := []int{1, 2, 3}
y := []int{4, 5, 6}
z := append([]int{}, append(x, y...)...)
fmt.Println(z)
}
output of the above code is: [1 2 3 4 5 6]
To concatenate two slices,
func main() {
s1 := []int{1, 2, 3}
s2 := []int{99, 100}
s1 = append(s1, s2...)
fmt.Println(s1) // [1 2 3 99 100]
}
To append a single value to a slice
func main() {
	s1 := []int{1, 2, 3}
	s1 = append(s1, 4) // plain assignment: s1 is already declared
	fmt.Println(s1) // [1 2 3 4]
}
To append multiple values to a slice
func main() {
	s1 := []int{1, 2, 3}
	s1 = append(s1, 4, 5)
	fmt.Println(s1) // [1 2 3 4 5]
}
Seems like a perfect use for generics (if using 1.18 or later).
func concat[T any](first []T, second []T) []T {
	n := len(first)
	return append(first[:n:n], second...)
}
append([]int{1,2}, []int{3,4}...) will work. Passing arguments to ... parameters.
If f is variadic with a final parameter p of type ...T, then within f the type of p is equivalent to type []T.
If f is invoked with no actual arguments for p, the value passed to p is nil.
Otherwise, the value passed is a new slice of type []T with a new underlying array whose successive elements are the actual arguments, which all must be assignable to T. The length and capacity of the slice is therefore the number of arguments bound to p and may differ for each call site.
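Given the function and calls
func Greeting(prefix string, who ...string)
Greeting("nobody")
Greeting("hello:", "Joe", "Anna", "Eileen")
within Greeting, who will have the value nil in the first call, and []string{"Joe", "Anna", "Eileen"} in the second.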
