Go: builtin make - does the capacity make a difference?

Consider wanting to dynamically fill an array/slice with exactly 5 elements. No more, and no less.
(1) Slice with initial length 0
sl := []string{}
for i := 0; i < 5; i++ {
sl = append(sl, "abc")
}
(2) Slice with initial length set, no capacity
sl := make([]string, 5)
for i := 0; i < 5; i++ {
sl[i] = "abc"
}
(3) Slice with initial length set, capacity given
sl := make([]string, 5, 5)
for i := 0; i < 5; i++ {
sl[i] = "abc"
}
My feeling tells me that #1 is not the best solution, but I am wondering why I would choose #2 over #3 or vice versa? (performance-wise)

First of all, whenever you have a question about performance, benchmark and profile.
Secondly, I don't see any difference here. Considering that this code
s := make([]int, 5)
fmt.Println(cap(s))
prints 5, your #2 and #3 are basically the same.
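A quick way to confirm this is to print the length and capacity of all three variants from the question. This is only a minimal sketch; the helper name describe is made up for illustration.

```go
package main

import "fmt"

// describe is a hypothetical helper that reports length and capacity.
func describe(s []string) (int, int) {
	return len(s), cap(s)
}

func main() {
	empty := []string{}           // variant 1: length 0, capacity 0
	two := make([]string, 5)      // variant 2: capacity defaults to the length
	three := make([]string, 5, 5) // variant 3: capacity given explicitly

	for _, s := range [][]string{empty, two, three} {
		l, c := describe(s)
		fmt.Printf("len=%d cap=%d\n", l, c)
	}
}
```

Variants 2 and 3 report the same length and capacity, so the choice between them is purely stylistic.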

Related

How to swap two slices in a byte array?

I am trying to swap the slice 0:10 and the slice 10:20 using the following code:
data1 := make([]byte, 100)
tmp := data1[0:10]
data1[0:10] = data1[10:20]
data1[10:20] = tmp
But I get error messages like this:
../xxx.go:60:14: cannot assign to data1[0:10]
../xxx.go:61:15: cannot assign to data1[10:20]
Could anybody show me how to swap two slices in a byte array? Thanks.
You are trying to swap the contents of the underlying array. The only way of doing it is to swap individual elements:
for i := 0; i < 10; i++ {
data[i], data[i+10] = data[i+10], data[i]
}
Or:
j := 10
for i := 0; i < 10; i++ {
data[i], data[j] = data[j], data[i]
j++
}
@BurakSerdar's answer is the most efficient one for swapping small chunks of data like these.
If you're curious how to copy sections of a slice, use the built-in copy function:
copy(data[0:10], data[10:20]) // overwrites first 10-bytes with next 10 bytes
Performing a swap with copy is a little awkward, but if you're curious:
// tmp := data[0:10] // will *NOT* work
// // as `tmp` will just reference data's underlying byte-array
tmp := make([]byte, 10) // need fresh memory
copy(tmp, data[0:10])
copy(data[0:10], data[10:20])
copy(data[10:20], tmp)
https://play.golang.org/p/ud31Gxfa19b
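The element-wise swap can be wrapped in a small helper so that it works for any pair of ranges. This is a sketch, not a library function: the name swapRanges and its parameters are assumptions, and it assumes the two ranges do not overlap and lie within the slice.

```go
package main

import "fmt"

// swapRanges swaps data[a:a+n] with data[b:b+n] element by element.
// It assumes the two ranges do not overlap and lie within data.
func swapRanges(data []byte, a, b, n int) {
	for i := 0; i < n; i++ {
		data[a+i], data[b+i] = data[b+i], data[a+i]
	}
}

func main() {
	data := make([]byte, 100)
	for i := range data {
		data[i] = byte(i) // fill with recognisable values
	}
	swapRanges(data, 0, 10, 10)
	fmt.Println(data[:20])
}
```

No temporary buffer is needed because each pair of elements is exchanged with a single tuple assignment.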

Efficient allocation of slices (cap vs length)

Suppose I am creating a slice that I know in advance I want to populate with 1e5 elements, via successive calls to append in a for loop:
// Append 1e5 strings to the slice
for i := 0; i < 1e5; i++ {
value := fmt.Sprintf("Entry: %d", i)
myslice = append(myslice, value)
}
Which is the most efficient way of initialising the slice, and why?
a. declaring a nil slice of strings?
var myslice []string
b. setting its length in advance to 1e5?
myslice = make([]string, 1e5)
c. setting both its length and capacity to 1e5?
myslice = make([]string, 1e5, 1e5)
Your b and c solutions are identical: creating a slice with make() where you don't specify the capacity, the "missing" capacity defaults to the given length.
Also note that if you create the slice with a length in advance, you can't use append() to populate the slice, because it adds new elements to the slice, and it doesn't "reuse" the allocated elements. So in that case you have to assign values to the elements using an index expression, e.g. myslice[i] = value.
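The difference between a preset length and a preset capacity is easy to see in a small experiment. The function names below are made up for this sketch.

```go
package main

import "fmt"

// appendAfterLen creates a slice with length n, then appends n values.
func appendAfterLen(n int) []string {
	s := make([]string, n) // n empty strings are already present
	for i := 0; i < n; i++ {
		s = append(s, "x") // lands at index n+i, not at index i
	}
	return s
}

// appendAfterCap creates an empty slice with capacity n, then appends n values.
func appendAfterCap(n int) []string {
	s := make([]string, 0, n)
	for i := 0; i < n; i++ {
		s = append(s, "x")
	}
	return s
}

func main() {
	fmt.Printf("%q\n", appendAfterLen(2)) // ["" "" "x" "x"]
	fmt.Printf("%q\n", appendAfterCap(2)) // ["x" "x"]
}
```

With a preset length, append doubles the element count and leaves zero values at the front, which is rarely what is intended.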
If you start with a slice with 0 capacity, a new backing array has to be allocated and the "old" content has to be copied over whenever you append an element that does not fit into the capacity, so that solution is inherently slower.
I would define and consider the following different solutions (I use an []int slice so that fmt.Sprintf() does not interfere with our benchmarks):
var s []int
func BenchmarkA(b *testing.B) {
for i := 0; i < b.N; i++ {
s = nil
for j := 0; j < 1e5; j++ {
s = append(s, j)
}
}
}
func BenchmarkB(b *testing.B) {
for i := 0; i < b.N; i++ {
s = make([]int, 0, 1e5)
for j := 0; j < 1e5; j++ {
s = append(s, j)
}
}
}
func BenchmarkBLocal(b *testing.B) {
for i := 0; i < b.N; i++ {
s := make([]int, 0, 1e5)
for j := 0; j < 1e5; j++ {
s = append(s, j)
}
}
}
func BenchmarkD(b *testing.B) {
for i := 0; i < b.N; i++ {
s = make([]int, 1e5)
for j := range s {
s[j] = j
}
}
}
Note: I use package-level variables in the benchmarks (except BLocal), because some optimizations may (and actually do) happen when using a local slice variable.
And the benchmark results:
BenchmarkA-4 1000 1081599 ns/op 4654332 B/op 30 allocs/op
BenchmarkB-4 3000 371096 ns/op 802816 B/op 1 allocs/op
BenchmarkBLocal-4 10000 172427 ns/op 802816 B/op 1 allocs/op
BenchmarkD-4 10000 167305 ns/op 802816 B/op 1 allocs/op
A: As you can see, starting with a nil slice is the slowest, uses the most memory and allocations.
B: Pre-allocating the slice with capacity (but still 0 length) and using append: it requires only a single allocation and is much faster, almost thrice as fast.
BLocal: Do note that when using a local slice instead of a package variable, (compiler) optimizations happen and it gets a lot faster: twice as fast, almost as fast as D.
D: Not using append() but assigning elements to a preallocated slice wins in every aspect, even when using a non-local variable.
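For this use case, since you already know the number of string elements that you want to store in the slice,
I would prefer approach b or c (assigning by index rather than calling append),
since both approaches prevent the slice from being resized.
If you choose approach a, the slice has to grow every time an element is appended once len equals cap: the runtime allocates a larger backing array (roughly doubling it for small slices and growing by a smaller factor for large ones in current implementations) and copies the existing elements over.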
https://play.golang.org/p/kSuX7cE176j
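To get a feel for how often a growing slice is reallocated, one can count the capacity changes while appending. This is a rough sketch; countGrowths is a hypothetical helper, and the exact count depends on the Go version because the growth strategy is an implementation detail.

```go
package main

import "fmt"

// countGrowths appends n ints to s and counts how often the capacity changes,
// which indicates that append allocated a new backing array.
func countGrowths(s []int, n int) int {
	growths := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			growths++
		}
	}
	return growths
}

func main() {
	fmt.Println("nil slice:   ", countGrowths(nil, 100000))
	fmt.Println("preallocated:", countGrowths(make([]int, 0, 100000), 100000))
}
```

The preallocated slice never grows, while the nil slice is reallocated a few dozen times, each time copying everything appended so far.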

Remove slice element within a for

An idiomatic method to remove an element i from a slice a, preserving the order, seems to be:
a = append(a[:i], a[i+1:]...)
I was wondering which would be the best way to do it inside a loop. As I understand, it is not possible to use it inside a range for:
for i := range a { // BAD
if conditionMeets(a[i]) {
a = append(a[:i], a[i+1:]...)
}
}
However it is possible to use len(a). [EDIT: this doesn't work, see answers below]
for i := 0; i < len(a); i++ {
if conditionMeets(a[i]) {
a = append(a[:i], a[i+1:]...)
}
}
Is there a better or more idiomatic way than using len or append?
Your proposed solution is incorrect. The problem is that when you remove an element from a slice, all subsequent elements are shifted. But the loop doesn't know that you changed the underlying slice and loop variable (the index) gets incremented as usual, even though in this case it shouldn't because then you skip an element.
And if the slice contains 2 elements which are right next to each other both of which need to be removed, the second one will not be checked and will not be removed.
So if you remove an element, the loop variable has to be decremented manually! Let's see an example: remove words that start with "a":
func conditionMeets(s string) bool {
return strings.HasPrefix(s, "a")
}
Solution (try it with all other examples below on the Go Playground):
a := []string{"abc", "bbc", "aaa", "aoi", "ccc"}
for i := 0; i < len(a); i++ {
if conditionMeets(a[i]) {
a = append(a[:i], a[i+1:]...)
i--
}
}
fmt.Println(a)
Output:
[bbc ccc]
Or better: use a downward loop, so you don't need to manually decrement the variable, because in this case the shifted elements are in the "already processed" part of the slice.
a := []string{"abc", "bbc", "aaa", "aoi", "ccc"}
for i := len(a) - 1; i >= 0; i-- {
if conditionMeets(a[i]) {
a = append(a[:i], a[i+1:]...)
}
}
fmt.Println(a)
Output is the same.
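To make the skipped-element problem concrete, the sketch below runs the forward loop without the manual decrement next to the downward loop on the same input. The function names are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

func startsWithA(s string) bool { return strings.HasPrefix(s, "a") }

// removeForwardBuggy advances i even after a removal, so the element that
// slides into position i is never checked.
func removeForwardBuggy(a []string) []string {
	for i := 0; i < len(a); i++ {
		if startsWithA(a[i]) {
			a = append(a[:i], a[i+1:]...)
		}
	}
	return a
}

// removeBackward walks from the end, so shifted elements were already checked.
func removeBackward(a []string) []string {
	for i := len(a) - 1; i >= 0; i-- {
		if startsWithA(a[i]) {
			a = append(a[:i], a[i+1:]...)
		}
	}
	return a
}

func main() {
	fmt.Println(removeForwardBuggy([]string{"abc", "bbc", "aaa", "aoi", "ccc"})) // [bbc aoi ccc]
	fmt.Println(removeBackward([]string{"abc", "bbc", "aaa", "aoi", "ccc"}))     // [bbc ccc]
}
```

The buggy version keeps "aoi" because it moved into the index that had just been examined when "aaa" was removed.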
Alternate for many removals
If you have to remove "many" elements, this might be slow, as you have to do a lot of copying (append() does the copying). Imagine this: you have a slice with 1000 elements; just removing the first element requires copying 999 elements to the front. Also, many new slice descriptors will be created: every element removal creates 2 new slice descriptors (a[:i], a[i+1:]), plus a has to be updated (with the result of append()). In this case it might be more efficient to copy the non-removable elements to a new slice.
An efficient solution:
a := []string{"abc", "bbc", "aaa", "aoi", "ccc"}
b := make([]string, len(a))
copied := 0
for _, s := range a {
if !conditionMeets(s) {
b[copied] = s
copied++
}
}
b = b[:copied]
fmt.Println(b)
This solution allocates the result slice once, with the same length as the source, so no further allocations (or copying of already placed elements) are needed. It can also use the range loop. And if you want the result in a, assign it back: a = b[:copied].
Output is the same.
In-place alternate for many removals (and for general purposes)
We can also do the removal "in place" with a cycle, by maintaining 2 indices and assigning (copying forward) non-removable elements in the same slice.
One thing to keep in mind is that we should zero places of removed elements in order to remove references of unreachable values so the GC can do its work. This applies to other solutions as well, but only mentioned here.
Example implementation:
a := []string{"abc", "bbc", "aaa", "aoi", "ccc"}
copied := 0
for i := 0; i < len(a); i++ {
if !conditionMeets(a[i]) {
a[copied] = a[i]
copied++
}
}
for i := copied; i < len(a); i++ {
a[i] = "" // Zero places of removed elements (allow gc to do its job)
}
a = a[:copied]
fmt.Println(a)
Output is the same. Try all the examples on the Go Playground.
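The in-place approach can be turned into a reusable helper. The version below is a sketch that assumes Go 1.18 or newer for type parameters; the name filterInPlace is made up and is not a standard library function.

```go
package main

import "fmt"

// filterInPlace keeps the elements for which keep returns true, reusing the
// backing array of s, and zeroes the vacated tail so the GC can reclaim
// anything those slots referenced.
func filterInPlace[T any](s []T, keep func(T) bool) []T {
	n := 0
	for _, v := range s {
		if keep(v) {
			s[n] = v // n never runs ahead of the read position
			n++
		}
	}
	var zero T
	for i := n; i < len(s); i++ {
		s[i] = zero
	}
	return s[:n]
}

func main() {
	a := []string{"abc", "bbc", "aaa", "aoi", "ccc"}
	a = filterInPlace(a, func(s string) bool { return len(s) > 0 && s[0] != 'a' })
	fmt.Println(a) // [bbc ccc]
}
```

Because the write index never overtakes the read index, no element is overwritten before it has been inspected.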

Why does this code raise slice bounds out of range?

I have no idea why this code always panics with slice bounds out of range:
parts := make([]string, 0, len(encodedCode)/4)
for i := 0; i < len(encodedCode); i += 4 {
parts = append(parts, encodedCode[i:4])
}
encodedCode is a string whose length is always a multiple of 4. That means encodedCode[i:4] should never be out of bounds.
A slice expression is [start:end], where end is exclusive (the index one past the last element you want), not [start:length].
Try this:
parts := make([]string, 0, len(encodedCode)/4)
for i := 0; i < len(encodedCode); i += 4 {
parts = append(parts, encodedCode[i:i+4])
}
Good examples at http://blog.golang.org/go-slices-usage-and-internals
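The corrected loop can be generalised into a small chunking helper. This is a sketch: the name chunk is an assumption, it also tolerates lengths that are not a multiple of the chunk size, and it slices by byte offsets, so it assumes single-byte (for example ASCII) content.

```go
package main

import "fmt"

// chunk splits s into pieces of n bytes; the last piece may be shorter.
func chunk(s string, n int) []string {
	if n <= 0 {
		return nil
	}
	parts := make([]string, 0, (len(s)+n-1)/n)
	for i := 0; i < len(s); i += n {
		end := i + n // exclusive upper bound, relative to the start of s
		if end > len(s) {
			end = len(s)
		}
		parts = append(parts, s[i:end])
	}
	return parts
}

func main() {
	fmt.Println(chunk("abcdefgh", 4))   // [abcd efgh]
	fmt.Println(chunk("abcdefghij", 4)) // [abcd efgh ij]
}
```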

How to remove items from a slice while ranging over it?

What is the best way to remove items from a slice while ranging over it?
For example:
type MultiDataPoint []*DataPoint
func (m MultiDataPoint) Json() ([]byte, error) {
for i, d := range m {
err := d.clean()
if err != nil {
//Remove the DP from m
}
}
return json.Marshal(m)
}
As you have mentioned elsewhere, you can allocate a new memory block and copy only the valid elements to it. However, if you want to avoid the allocation, you can rewrite your slice in place:
i := 0 // output index
for _, x := range s {
if isValid(x) {
// copy and increment index
s[i] = x
i++
}
}
// Prevent memory leak by erasing truncated values
// (not needed if values don't contain pointers, directly or indirectly)
for j := i; j < len(s); j++ {
s[j] = nil
}
s = s[:i]
Full example: http://play.golang.org/p/FNDFswPeDJ
Note this will leave old values after index i in the underlying array, so this will leak memory until the slice itself is garbage collected, if values are or contain pointers. You can solve this by setting all values to nil or the zero value from i until the end of the slice before truncating it.
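Applied to the types from the question, the in-place rewrite might look like the sketch below. DataPoint and its clean method are not shown in the question, so the versions here are hypothetical stand-ins, as is the helper method valid.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// DataPoint and clean are hypothetical; the question does not show them.
type DataPoint struct {
	Value float64
}

func (d *DataPoint) clean() error {
	if d.Value < 0 {
		return errors.New("negative value")
	}
	return nil
}

type MultiDataPoint []*DataPoint

// valid returns the points that pass clean, reusing m's backing array.
func (m MultiDataPoint) valid() MultiDataPoint {
	i := 0 // output index
	for _, d := range m {
		if d.clean() == nil {
			m[i] = d
			i++
		}
	}
	// Clear the leftover slots so dropped points can be collected.
	for j := i; j < len(m); j++ {
		m[j] = nil
	}
	return m[:i]
}

func (m MultiDataPoint) Json() ([]byte, error) {
	return json.Marshal(m.valid())
}

func main() {
	m := MultiDataPoint{{Value: 1}, {Value: -2}, {Value: 3}}
	out, err := m.Json()
	fmt.Println(string(out), err) // [{"Value":1},{"Value":3}] <nil>
}
```

Note that the caller's slice keeps its original length, with nil entries at the end, because the method receives a copy of the slice header; a caller that wants the shortened slice has to use the return value of valid.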
I know this was answered a long time ago, but I use something like this in other languages, and I don't know whether it is the Go way.
Just iterate from back to front so you don't have to worry about indexes that are deleted. I am using the same example as Adam.
m := []int{3, 7, 2, 9, 4, 5}
for i := len(m)-1; i >= 0; i-- {
if m[i] < 5 {
m = append(m[:i], m[i+1:]...)
}
}
There might be better ways, but here's an example that deletes the even values from a slice:
m := []int{1,2,3,4,5,6}
deleted := 0
for i := range m {
j := i - deleted
if (m[j] & 1) == 0 {
m = m[:j+copy(m[j:], m[j+1:])]
deleted++
}
}
Note that I don't get the element using the i, d := range m syntax, since d would end up getting set to the wrong elements once you start deleting from the slice.
Here is a more idiomatic Go way to remove elements from slices.
temp := s[:0]
for _, x := range s {
if isValid(x) {
temp = append(temp, x)
}
}
s = temp
Playground link: https://play.golang.org/p/OH5Ymsat7s9
Note: The example and playground links are based upon @tomasz's answer https://stackoverflow.com/a/20551116/12003457
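One thing to be aware of with the s[:0] idiom is that the result shares its backing array with the input, so the input's contents are overwritten. A small sketch (keepEven is a made-up name) shows the effect:

```go
package main

import "fmt"

// keepEven filters s in place using the s[:0] idiom and returns the result.
func keepEven(s []int) []int {
	out := s[:0] // zero length, same backing array as s
	for _, x := range s {
		if x%2 == 0 {
			out = append(out, x)
		}
	}
	return out
}

func main() {
	orig := []int{1, 2, 3, 4, 5, 6}
	evens := keepEven(orig)
	fmt.Println(evens) // [2 4 6]
	fmt.Println(orig)  // [2 4 6 4 5 6]: the original was overwritten
}
```

This is fine when the old slice is discarded, as in s = temp, but it matters if other code still holds the original slice.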
One other option is to use a normal for loop using the length of the slice and subtract 1 from the index each time a value is removed. See the following example:
m := []int{3, 7, 2, 9, 4, 5}
for i := 0; i < len(m); i++ {
if m[i] < 5 {
m = append(m[:i], m[i+1:]...)
i-- // -1 as the slice just got shorter
}
}
I don't know if len() uses enough resources to make any difference but you could also run it just once and subtract from the length value too:
m := []int{3, 7, 2, 9, 4, 5}
for i, s := 0, len(m); i < s; i++ {
if m[i] < 5 {
m = append(m[:i], m[i+1:]...)
s--
i--
}
}
Something like:
m = append(m[:i], m[i+1:]...)
You don't even need to count backwards but you do need to check that you're at the end of the array where the suggested append() will fail. Here's an example of removing duplicate positive integers from a sorted list:
// Remove repeating numbers
numbers := []int{1, 2, 3, 3, 4, 5, 5}
log.Println(numbers)
for i, numbersCount, prevNum := 0, len(numbers), -1; i < numbersCount; numbersCount = len(numbers) {
if numbers[i] == prevNum {
if i == numbersCount-1 {
numbers = numbers[:i]
} else {
numbers = append(numbers[:i], numbers[i+1:]...)
}
continue
}
prevNum = numbers[i]
i++
}
log.Println(numbers)
Playground: https://play.golang.org/p/v93MgtCQsaN
I just implemented a function which removes all nil elements from a slice.
I used it to solve a LeetCode problem, and it works perfectly.
/**
* Definition for singly-linked list.
* type ListNode struct {
* Val int
* Next *ListNode
* }
*/
func removeNil(lists *[]*ListNode) {
for i := 0; i < len(*lists); i++ {
if (*lists)[i] == nil {
*lists = append((*lists)[:i], (*lists)[i+1:]...)
i--
}
}
}
As suggested in @tomasz's answer, you can avoid memory leaks by controlling the capacity of the slice with a full slice expression. Look at the following function, which removes duplicates from a slice of integers:
package main
import "fmt"
func removeDuplicates(a []int) []int {
for i, j := 0, 1; i < len(a) && j < len(a); i, j = i+1, j+1 {
if a[i] == a[j] {
copy(a[j:], a[j+1:])
// resize the capacity of the underlying array using the "full slice expression"
// a[low : high : max]
a = a[: len(a)-1 : len(a)-1]
i--
j--
}
}
return a
}
func main() {
a := []int{2, 3, 3, 3, 6, 9, 9}
fmt.Println(a)
a = removeDuplicates(a)
fmt.Println(a)
}
// [2 3 3 3 6 9 9]
// [2 3 6 9]
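What the full slice expression a[low : high : max] actually changes is the capacity visible through the resulting slice; it does not shrink or free the underlying array. The sketch below (function names are made up) contrasts appending to a plain slice expression with appending to a capacity-limited one.

```go
package main

import "fmt"

// appendShared appends v to the first n elements of a using a plain slice
// expression, so spare capacity in a is reused and a[n] is overwritten.
func appendShared(a []int, n, v int) []int {
	return append(a[:n], v)
}

// appendCapped limits the capacity to n first, so append must allocate a new
// backing array and a stays untouched.
func appendCapped(a []int, n, v int) []int {
	return append(a[:n:n], v)
}

func main() {
	a := []int{1, 2, 3, 4}
	_ = appendCapped(a, 2, 99)
	fmt.Println(a) // [1 2 3 4]
	_ = appendShared(a, 2, 99)
	fmt.Println(a) // [1 2 99 4]
}
```

Limiting the capacity is therefore mainly a guard against a later append silently writing into elements that another slice still refers to.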
For the reasons @tomasz has explained, removing in place has pitfalls. That is why it is common practice in Go not to do that, but to build a new slice instead, and several answers go beyond @tomasz's answer in that direction.
If elements should be unique, it is common practice to use the keys of a map for this. I would like to contribute an example of deletion using a map.
What's nice is that the boolean values are available for a second purpose. In this example I calculate set a minus set b. As Go doesn't have a built-in set type, I make sure the output is unique, and I use the boolean values in the algorithm as well.
Building the map is close to O(n) on average (I don't know the implementation details), and the append() calls are amortized O(n) in total, so the runtime is similar to deleting in place. A real in-place deletion has to shift the upper part of the slice down for each removed element; unless that is done in one batch, the runtime should be worse.
In this special case I also use the map as a register, to avoid a nested loop over set a and set b and to keep the runtime close to O(n).
type Set []int
func differenceOfSets(a, b Set) (difference Set) {
m := map[int]bool{}
for _, element := range a {
m[element] = true
}
for _, element := range b {
if _, registered := m[element]; registered {
m[element] = false
}
}
for element, present := range m {
if present {
difference = append(difference, element)
}
}
return difference
}
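One caveat when using this function: Go does not specify the iteration order of a map, so the elements of the result can come back in a different order on every run. A usage sketch that repeats the function from above and sorts the result when a stable order matters:

```go
package main

import (
	"fmt"
	"sort"
)

type Set []int

// differenceOfSets is repeated from above so the example is self-contained.
func differenceOfSets(a, b Set) (difference Set) {
	m := map[int]bool{}
	for _, element := range a {
		m[element] = true
	}
	for _, element := range b {
		if _, registered := m[element]; registered {
			m[element] = false
		}
	}
	for element, present := range m {
		if present {
			difference = append(difference, element)
		}
	}
	return difference
}

func main() {
	d := differenceOfSets(Set{5, 1, 3, 3, 2}, Set{2, 5})
	sort.Ints(d)   // map iteration order is not specified, so sort for stable output
	fmt.Println(d) // [1 3]
}
```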
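Try sorting and binary search. Note that this reorders the slice, and it only works when the elements to remove end up together at one end after sorting.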
Example:
package main
import (
"fmt"
"sort"
)
func main() {
// Our slice.
s := []int{3, 7, 2, 9, 4, 5}
// 1. Iterate over it.
for i, v := range s {
func(i, v int) {}(i, v)
}
// 2. Sort it. (by whatever condition of yours)
sort.Slice(s, func(i, j int) bool {
return s[i] < s[j]
})
// 3. Cut it only once.
i := sort.Search(len(s), func(i int) bool { return s[i] >= 5 })
s = s[i:]
// That's it!
fmt.Println(s) // [5 7 9]
}
https://play.golang.org/p/LnF6o0yMJGT

Resources