When appending value to slice, value is different from original value - go

consider this piece of code:
package main

import (
    "fmt"
)

func main() {
    fmt.Println(Part(11))
}

func Part(n int) string {
    enumResult := [][]int{}
    enum(n, n, []int{}, &enumResult)
    fmt.Println(enumResult)
    fmt.Println(40, enumResult[40])
    return ""
}

var abc int = 0

func enum(n int, top int, pre []int, result *[][]int) {
    var i int
    if n > top {
        i = top
    } else {
        i = n
    }
    for ; i > 0; i-- {
        tempResult := append(pre, i)
        if n-i == 0 {
            /* if tempResult[0] == 3 && tempResult[1] == 3 && tempResult[2] == 3 && tempResult[3] == 2 {
                tempResult = append(tempResult, 12345)
            } */
            fmt.Println(abc, tempResult)
            abc++
            *result = append(*result, tempResult)
        } else {
            enum(n-i, i, tempResult, result)
        }
    }
}
When I run this code, I append the value [3 3 3 2] to enumResult, but if I then check the value of enumResult, [3 3 3 1] appears instead at index 40 (enumResult[40]).
All the other values are correct.
I don't know why this is happening. Can you explain why?

The problem is indeed due to append.
There are two things to know about append. The first is that append does not necessarily copy memory. As the spec says:
If the capacity of s is not large enough to fit the additional values,
append allocates a new, sufficiently large underlying array that fits
both the existing slice elements and the additional values. Otherwise,
append re-uses the underlying array.
This may cause unexpected behavior if you are not aware of it. A playground example: https://play.golang.org/p/7A3JR-5IX8o
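For illustration, here is a minimal sketch of that aliasing effect (my own example, not taken from the question):
    base := make([]int, 3, 4) // len 3, cap 4: one spare slot
    a := append(base, 1)      // fits in the spare slot, reuses base's backing array
    b := append(base, 2)      // reuses the same slot, overwriting it
    fmt.Println(a[3], b[3])   // prints 2 2: both slices alias the same memory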
The second part is that when append does copy memory, it grows the capacity of the slice, and it does not grow it by just 1. A playground example: https://play.golang.org/p/STr9jMqORUz
How much append grows a slice is undocumented and considered an implementation detail, but as of Go 1.10 it follows this rule:
Go slices grow by doubling until size 1024, after which they grow by
25% each time.
Note that when enabling race-detector, this may change. The code for growing slice is located in $GOROOT/src/runtime/slice.go in growslice function.
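You can watch the growth yourself with a small loop like this (my own sketch; the exact capacities printed are implementation-dependent):
    s := make([]int, 0)
    prevCap := cap(s)
    for i := 0; i < 2000; i++ {
        s = append(s, i)
        if cap(s) != prevCap {
            fmt.Println("len:", len(s), "new cap:", cap(s))
            prevCap = cap(s)
        }
    }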
Now back to the question. It should be clear now that your code appended to the same slice more than once, and that slice had spare capacity because an earlier append had already grown it, so a later append overwrote the shared backing array. To solve it, make a new slice and copy the memory:
tempResult := make([]int, len(pre)+1)
copy(tempResult, pre)
tempResult[len(pre)] = i
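Alternatively (a sketch of the same fix, not code from the answer above), you can limit the capacity of pre with a full slice expression so that append is forced to allocate:
    tempResult := append(pre[:len(pre):len(pre)], i) // cap == len, so append always copies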

Related

Golang maps/hashmaps, optimizing for iteration speed

While migrating a production Node.js application to Go, I noticed that iterating over Go's native map is actually slower than in Node.
I've come up with an alternative solution that trades removal/insertion speed for iteration speed, by exposing an array that can be iterated over and storing key=>index pairs inside a separate map.
While this solution works and gives a significant performance increase, I was wondering if there is a better solution that I could look into.
In my setup it is very rare that something is removed from the hashmaps; only additions and replacements are common, for which this implementation works, although it feels more like a workaround than an actual solution.
The maps are always indexed by an integer and hold arbitrary data.
FastMap: 500000 Iterations - 0.153000ms
Native Map: 500000 Iterations - 4.988000ms
/*
Unordered hash map optimized for iteration speed.
Stores values in an array and holds key=>index mappings inside a separate hashmap
*/
type FastMapEntry[K comparable, T any] struct {
    Key   K
    Value T
}

type FastMap[K comparable, T any] struct {
    m       map[K]int            // Stores key => array index mappings
    entries []FastMapEntry[K, T] // Array holding entries and their keys
    len     int                  // Total map size
}

func MakeFastMap[K comparable, T any]() *FastMap[K, T] {
    return &FastMap[K, T]{
        m:       make(map[K]int),
        entries: make([]FastMapEntry[K, T], 0),
    }
}

func (m *FastMap[K, T]) Set(key K, value T) {
    index, exists := m.m[key]
    if exists {
        // Replace if key already exists
        m.entries[index] = FastMapEntry[K, T]{
            Key:   key,
            Value: value,
        }
    } else {
        // Store the key=>index pair in the map and add value to entries. Increase total len by one
        m.m[key] = m.len
        m.entries = append(m.entries, FastMapEntry[K, T]{
            Key:   key,
            Value: value,
        })
        m.len++
    }
}

func (m *FastMap[K, T]) Has(key K) bool {
    _, exists := m.m[key]
    return exists
}

func (m *FastMap[K, T]) Get(key K) (value T, found bool) {
    index, exists := m.m[key]
    if exists {
        found = true
        value = m.entries[index].Value
    }
    return
}

func (m *FastMap[K, T]) Remove(key K) bool {
    index, exists := m.m[key]
    if exists {
        // Remove value from entries
        m.entries = append(m.entries[:index], m.entries[index+1:]...)
        // Remove key=>index mapping
        delete(m.m, key)
        m.len--
        for i := index; i < m.len; i++ {
            // Move all index mappings up, starting from current index
            m.m[m.entries[i].Key] = i
        }
    }
    return exists
}

func (m *FastMap[K, T]) Entries() []FastMapEntry[K, T] {
    return m.entries
}

func (m *FastMap[K, T]) Len() int {
    return m.len
}
The test code that was run is:
// s.Variations is a native map holding ~500k records
start := time.Now()
iterations := 0
for _, variation := range s.Variations {
    if variation.Id > 0 {
    }
    iterations++
}
log.Printf("Native Map: %d Iterations - %fms\n", iterations, float64(time.Since(start).Microseconds())/1000)

// Copy data into FastMap
fm := helpers.MakeFastMap[state.VariationId, models.ItemVariation]()
for key, variation := range s.Variations {
    fm.Set(key, variation)
}

start = time.Now()
iterations = 0
for _, variation := range fm.Entries() {
    if variation.Value.Id > 0 {
    }
    iterations++
}
log.Printf("FastMap: %d Iterations - %fms\n", iterations, float64(time.Since(start).Microseconds())/1000)
I think this kind of comparison and benchmarking is a little off-topic. Go's map implementation is quite different from yours: it needs to cover a much wider range of use cases, the structs it uses internally are somewhat heavy (they mostly store information about the types used in your map), and the implementation approach is different. Go's map is a real hashmap; yours is not (or rather, the actual hashing is delegated to the m map you hold internally).
Another factor behind the result you get is this loop:
for _, variation := range fm.Entries() {
    if variation.Value.Id > 0 {
    }
    iterations++
}
Basically, you're iterating over a slice, which is much easier and faster than iterating over a map: you have a view onto an array that holds elements of the same type next to each other. Makes sense, right?
A better comparison would be something like this:
for _, y := range fastMap.m {
    _ = fastMap.Entries()[y].Value // some simple use of the value, going through the index indirection
}
If you're really looking for performance, a well written hash function and a fixed size array would be your best choice.
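If you want numbers that are easier to trust than manual timing, the standard testing package is the usual tool. Here is a rough sketch of how such a benchmark could look; this is my own example, it assumes a _test.go file in the same package as the FastMap type above, and it uses plain int keys and values:
    func BenchmarkNativeMapIteration(b *testing.B) {
        data := make(map[int]int, 500000)
        for i := 0; i < 500000; i++ {
            data[i] = i
        }
        b.ResetTimer()
        for n := 0; n < b.N; n++ {
            for _, v := range data {
                _ = v
            }
        }
    }

    func BenchmarkFastMapIteration(b *testing.B) {
        fm := MakeFastMap[int, int]()
        for i := 0; i < 500000; i++ {
            fm.Set(i, i)
        }
        b.ResetTimer()
        for n := 0; n < b.N; n++ {
            for _, e := range fm.Entries() {
                _ = e.Value
            }
        }
    }
Running it with go test -bench=. lets the tool pick the iteration counts and smooth out timer noise for you.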

how to manipulate very long string to avoid out of memory with golang

For personal skills improvement, I am trying to solve this HackerRank challenge:
There is a string, s, of lowercase English letters that is repeated infinitely many times. Given an integer, n, find and print the number of letter a's in the first n letters of the infinite string.
1 <= len(s) <= 100 and 1 <= n <= 10^12
Very naively, I thought this code would be fine:
fs := strings.Repeat(s, int(n)) // full string
ss := fs[:n] // sub string
fmt.Println(strings.Count(ss, "a"))
Obviously this blows up the memory and I get an "out of memory" error.
I have never faced this kind of issue before, and I'm clueless on how to handle it.
How can I manipulate a very long string without running out of memory?
I hope this helps: you don't have to actually count by running through the string; that is the naive approach. You can use some basic arithmetic to get the answer without running out of memory. I hope the comments help.
var answer int64

// 1st figure out how many a's are present in s.
aCount := int64(strings.Count(s, "a"))

// How many times will s repeat in its entirety if it had to be of length n
repeats := n / int64(len(s))
remainder := n % int64(len(s))

// If n/len(s) is not perfectly divisible, it means there has to be a remainder, check if that's the case.
// If s is of length 5 and the value of n = 22, then the first 2 characters of s would repeat an extra time.
if remainder > 0 {
    aCountInRemainder := strings.Count(s[:remainder], "a")
    answer = (aCount * repeats) + int64(aCountInRemainder)
} else {
    answer = aCount * repeats
}
return answer
There might be other methods but this is what came to my mind.
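For reference, here is the same arithmetic wrapped into one self-contained function; the name repeatedString is my own, and it assumes the strings package is imported:
    func repeatedString(s string, n int64) int64 {
        // a's in one full copy of s
        aCount := int64(strings.Count(s, "a"))
        // full copies of s that fit into the first n characters, plus the leftover prefix
        repeats := n / int64(len(s))
        remainder := n % int64(len(s))
        answer := aCount * repeats
        if remainder > 0 {
            answer += int64(strings.Count(s[:remainder], "a"))
        }
        return answer
    }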
As you found out, if you actually generate the string you will end up having that huge memory block in RAM.
One common way to represent a "big sequence of incoming bytes" is to implement it as an io.Reader (which you can view as a stream of bytes), and have your code run a r.Read(buff) loop.
Given the specifics of the exercise you mention (a fixed string repeated n times), the number of occurrences of a specific letter can also be computed straight from the number of occurrences of that letter in s, plus something more (I'll let you figure out what multiplications and counting should be done).
How do you implement a Reader that repeats the string without allocating it 10^12 times?
Note that, when implementing the .Read() method, the caller has already allocated its buffer. You don't need to repeat your string in memory, you just need to fill the buffer with the correct values -- for example by copying your data into the buffer byte by byte.
Here is one way to do it:
type RepeatReader struct {
    str   string
    count int
}

func (r *RepeatReader) Read(p []byte) (int, error) {
    if r.count == 0 {
        return 0, io.EOF
    }
    // at each iteration, pos will hold the number of bytes copied so far
    var pos = 0
    for r.count > 0 && pos < len(p) {
        // to copy slices over, you can use the built-in 'copy' method
        // at each iteration, you need to write bytes *after* the ones you have already copied,
        // hence the "p[pos:]"
        n := copy(p[pos:], r.str)
        // update the amount of copied bytes
        pos += n
        // bad computation for this first example:
        // I decrement one complete count, even if str was only partially copied
        r.count--
    }
    return pos, nil
}
https://go.dev/play/p/QyFQ-3NzUDV
To have a complete, correct implementation, you also need to keep track of the offset you need to start from the next time .Read() is called:
type RepeatReader struct {
    str    string
    count  int
    offset int
}

func (r *RepeatReader) Read(p []byte) (int, error) {
    if r.count == 0 {
        return 0, io.EOF
    }
    var pos = 0
    for r.count > 0 && pos < len(p) {
        // when copying over to p, you should start at r.offset:
        n := copy(p[pos:], r.str[r.offset:])
        pos += n
        // update r.offset:
        r.offset += n
        // if one full copy of str has been issued, decrement 'count' and reset 'offset' to 0
        if r.offset == len(r.str) {
            r.count--
            r.offset = 0
        }
    }
    return pos, nil
}
https://go.dev/play/p/YapRuioQcOz
You can now count the a's while iterating through this Reader.
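For example, here is a sketch of such a counting loop (my own code, assuming the RepeatReader above plus the bytes, io and fmt packages; io.LimitReader caps the stream at the first n bytes):
    r := io.LimitReader(&RepeatReader{str: s, count: int(n/int64(len(s))) + 1}, n)
    var total int64
    buf := make([]byte, 32*1024)
    for {
        read, err := r.Read(buf)
        total += int64(bytes.Count(buf[:read], []byte("a")))
        if err == io.EOF {
            break
        }
        if err != nil {
            break // handle the error properly in real code
        }
    }
    fmt.Println(total)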

how to understand the following code about golang slice?

Recently I found some code that I can't understand, below is my code:
func subsetsWithDup(nums []int) [][]int {
    if len(nums) == 0 {
        return [][]int{[]int{}}
    }
    sort.Ints(nums)
    result := [][]int{}
    backtracking(nums, &result, []int{}, 0)
    return result
}

func backtracking(nums []int, result *[][]int, tempList []int, start int) {
    *result = append(*result, tempList)
    for i := start; i < len(nums); i++ {
        if i > start && nums[i] == nums[i-1] {
            continue
        }
        tempList = append(tempList, nums[i])
        backtracking(nums, result, tempList, i+1)
        tempList = tempList[:len(tempList)-1:len(tempList)-1]
    }
}
and another approach:
func subsetsWithDup(nums []int) [][]int {
    sort.Ints(nums)
    return subsets(nums, []int{}, [][]int{})
}

func subsets(nums []int, result []int, results [][]int) [][]int {
    newR := make([]int, len(result))
    copy(newR, result)
    results = append(results, newR)
    if len(nums) == 0 {
        return results
    }
    for i := 0; i < len(nums); i++ {
        if i > 0 && nums[i] == nums[i-1] {
            continue
        }
        result = append(result, nums[i])
        results = subsets(nums[i+1:], result, results)
        result = result[:len(result)-1]
    }
    return results
}
In the first approach, I use the following code :
tempList = tempList[:len(tempList)-1:len(tempList)-1]
it works, but if I change it to:
tempList = tempList[:len(tempList)-1]
it does not work. The second approach, which uses the copy function, also works. I want to know what happens behind the code; any help is appreciated, thanks.
In Go, a slice is a small header holding a pointer to an underlying array (plus a length and a capacity), so slices that share an underlying array see each other's changes to it, which can sometimes be surprising.
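A tiny illustration of that sharing (my own example):
    a := []int{1, 2, 3}
    b := a[:2] // b shares a's underlying array
    b[0] = 9
    fmt.Println(a) // [9 2 3]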
The second part of the puzzle is that append modifies the underlying array if the cap of the slice is sufficient. Document:
The append built-in function appends elements to the end of a slice.
If it has sufficient capacity, the destination is resliced to
accommodate the new elements. If it does not, a new underlying array
will be allocated. Append returns the updated slice. It is therefore
necessary to store the result of append, often in the variable holding
the slice itself.
So in your failed attempt, tempList = append(tempList, nums[i]) can change the value of slices previously stored in result.
The second approach, on the other hand, creates a new slice with a new underlying array and copies into it explicitly, so the problem is avoided. The first approach is more subtle, as it uses a full slice expression: tempList[:len(tempList)-1:len(tempList)-1]. The code limits the new slice's capacity, so append has to allocate a new underlying array each time instead of reusing the original one.
More about full slice expressions (spec):
For an array, pointer to array, or slice a (but not a string), the primary expression
a[low : high : max]
constructs a slice of the same type, and with the same length and elements as the simple slice expression a[low : high]. Additionally, it controls the resulting slice's capacity by setting it to max - low. Only the first index may be omitted; it defaults to 0. After slicing the array a
a := [5]int{1, 2, 3, 4, 5}
t := a[1:3:5]
the slice t has type []int, length 2, capacity 4, and elements
t[0] == 2
t[1] == 3
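To see the difference the capacity limit makes here, a small sketch (my own example):
    pre := make([]int, 3, 4)  // len 3, cap 4: one spare slot
    a := append(pre, 7)       // reuses pre's backing array
    b := append(pre, 8)       // reuses it again, so a[3] becomes 8 too
    fmt.Println(a[3], b[3])   // 8 8
    c := append(pre[:3:3], 7) // cap == len, append must allocate a copy
    d := append(pre[:3:3], 8) // allocates again, nothing is shared
    fmt.Println(c[3], d[3])   // 7 8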

What is the mechanism of using append to prepend in Go?

Suppose I have a slice slice of type int. While declaring, I set the third argument to size, which I believe reserves memory for at least size ints by setting the cap parameter of the slice.
slice:=make([]int,0,size)
Now, suppose I have an integer variable value. To add it to the slice at the end, I use
slice=append(slice,value)
If the number of elements currently in the slice is less than size, then there will be no need to copy the entire underlying array to a new location in order to add the new element.
Further, if I want to prepend value to slice, as suggested here and here, I use
slice=append([]int{value},slice...)
My question is, what happens in this case? If the number of elements is still less than size, how are the elements stored in the memory? Assuming a contiguous allocation when the make() function was invoked, are all existing elements right shifted to free the first space for value? Or is memory reallocated and all elements copied?
The reason for asking is that I would like my program to be as fast as possible, and would like to know if this is a possible cause for slowing it down. If it is so, is there any alternative way of prepending that would be more time efficient?
With reslicing and copying
The builtin append() always appends elements to a slice. You cannot use it (alone) to prepend elements.
Having said that, if you have a slice whose capacity is bigger than its length (it has "free" space after its elements) and you want to prepend an element, you may reslice the original slice, copy all elements one index higher to make room for the new element, then set the element at index 0. This requires no new allocation. This is how it could look:
func prepend(dest []int, value int) []int {
    if cap(dest) > len(dest) {
        dest = dest[:len(dest)+1]
        copy(dest[1:], dest)
        dest[0] = value
        return dest
    }

    // No room, new slice need to be allocated:
    // Use some extra space for future:
    res := make([]int, len(dest)+1, len(dest)+5)
    res[0] = value
    copy(res[1:], dest)
    return res
}
Testing it:
s := make([]int, 0, 5)
s = append(s, 1, 2, 3, 4)
fmt.Println(s)
s = prepend(s, 9)
fmt.Println(s)
s = prepend(s, 8)
fmt.Println(s)
Output (try it on the Go Playground):
[1 2 3 4]
[9 1 2 3 4]
[8 9 1 2 3 4]
Note: if there is no room for the new element, since performance does matter here, we didn't just do:
res := append([]int{value}, dest...)
because that does more allocations and copying than needed: it allocates a slice for the literal ([]int{value}), then append() allocates a new array when appending dest to it.
Instead, our solution allocates just one new array (by make(), even reserving some space for future growth), then just sets value as the first element and copies dest (the previous elements).
With linked list
If you need to prepend many times, a normal slice may not be the right choice. A faster alternative is a linked list, where prepending an element requires no slice/array allocation or copying: you just create a new node element and designate it as the root by pointing it at the old root (first element).
The standard library provides a general implementation in the container/list package.
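For illustration, a minimal sketch of the prepend pattern with container/list (my own example; a list is of course not a drop-in replacement for a slice):
    package main

    import (
        "container/list"
        "fmt"
    )

    func main() {
        l := list.New()
        l.PushBack(1)
        l.PushBack(2)
        l.PushFront(0) // "prepend": O(1), no copying
        for e := l.Front(); e != nil; e = e.Next() {
            fmt.Println(e.Value)
        }
    }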
With manually managing a larger backing array
Sticking to normal slices and arrays, there is another solution.
If you're willing to manage a larger backing array (or slice) yourself, you can do so by leaving free space before the slice you use. When prepending, you can create a new slice value from the backing larger array or slice which starts at an index that leaves room for 1 element to be prepended.
Without completeness, just for demonstration:
var backing = make([]int, 15) // 15 elements
var start int

func prepend(dest []int, value int) []int {
    if start == 0 {
        // No more room for new value, must allocate bigger backing array:
        newbacking := make([]int, len(backing)+5)
        start = 5
        copy(newbacking[5:], backing)
        backing = newbacking
    }
    start--
    dest = backing[start : start+len(dest)+1]
    dest[0] = value
    return dest
}
Testing / using it:
start = 5
s := backing[start:start] // empty slice, starting at idx=5
s = append(s, 1, 2, 3, 4)
fmt.Println(s)
s = prepend(s, 9)
fmt.Println(s)
s = prepend(s, 8)
fmt.Println(s)
// Prepend more to test reallocation:
for i := 10; i < 15; i++ {
    s = prepend(s, i)
}
fmt.Println(s)
Output (try it on the Go Playground):
[1 2 3 4]
[9 1 2 3 4]
[8 9 1 2 3 4]
[14 13 12 11 10 8 9 1 2 3 4]
Analysis: this solution makes no allocations and no copying when there is room in the backing slice to prepend the value! All that happens is it creates a new slice from the backing slice that covers the destination +1 space for the value to be prepended, sets it and returns this slice value. You can't really get better than this.
If there is no room, then it allocates a larger backing slice, copies over the content of the old, and then does the "normal" prepending.
With tricky slice usage
Idea: imagine that you always store elements in a slice in backward order.
Storing your elements in backward order in a slice means a prepend becomes an append!
So to "prepend" an element, you can simply use append(s, value). And that's all.
Yes, this has limited uses (e.g. appending to a slice stored in reverse order has the same issues and complexity as a "normal" slice and a prepend operation), and you lose many conveniences (the ability to list elements using for range, to name one), but performance-wise nothing beats prepending a value just by using append().
Note that iterating over a slice that stores its elements in backward order has to use a downward loop, e.g.:
for i := len(s) - 1; i >= 0; i-- {
// do something with s[i]
}
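A tiny sketch of the idea (my own example):
    s := make([]int, 0, 8)
    s = append(s, 1) // logically prepend 1
    s = append(s, 2) // logically prepend 2
    s = append(s, 3) // logically prepend 3
    for i := len(s) - 1; i >= 0; i-- {
        fmt.Print(s[i], " ") // prints 3 2 1: the logical (prepended) order
    }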
Final note: all these solutions can easily be extended to prepend a slice instead of just a value. Generally the additional space when reslicing is not +1 but +len(values), and not simply setting dst[0] = value but instead a call to copy(dst, values).
The "prepend" call will need to allocate an array and copy all elements because a slice in Go is defined as a starting point, a size and an allocation (with the allocation being counted from the starting point).
There is no way a slice can know that the element before the first one can be used to extend the slice.
What will happen with
slice = append([]int{value}, slice...)
is
- a new array of a single element value is allocated (probably on the stack)
- a slice is created to map this element (start=0, size=1, alloc=1)
- the append call is done
- append sees that there is not enough room to extend the single-element slice, so it allocates a new array and copies all the elements
- a new slice object is created to refer to this array
If appending/removing at both ends of a large container is the common use case for your application then you need a deque container. It is unfortunately unavailable in Go and impossible to write efficiently for generic contained types while maintaining usability (because Go still lacks generics).
You can however implement a deque for your specific case and this is easy (for example if you have a large container with a known upper bound may be a circular buffer is all you need and that is just a couple of lines of code away).
I'm very new to Go, so maybe the following is very bad Go code... but it's an attempt to implement a deque using a growing circular buffer (depending on the use case this may or may not be a good solution):
type Deque struct {
    buffer  []interface{}
    f, b, n int
}

func (d *Deque) resize() {
    new_buffer := make([]interface{}, 2*(1+d.n))
    j := d.f
    for i := 0; i < d.n; i++ {
        new_buffer[i] = d.buffer[j]
        d.buffer[j] = nil
        j++
        if j == len(d.buffer) {
            j = 0
        }
    }
    d.f = 0
    d.b = d.n
    d.buffer = new_buffer
}

func (d *Deque) push_back(x interface{}) {
    if d.n == len(d.buffer) {
        d.resize()
    }
    d.buffer[d.b] = x
    d.b++
    if d.b == len(d.buffer) {
        d.b = 0
    }
    d.n++
}

func (d *Deque) push_front(x interface{}) {
    if d.n == len(d.buffer) {
        d.resize()
    }
    if d.f == 0 {
        d.f = len(d.buffer)
    }
    d.f--
    d.buffer[d.f] = x
    d.n++
}

func (d *Deque) pop_back() interface{} {
    if d.n == 0 {
        panic("Cannot pop from an empty deque")
    }
    if d.b == 0 {
        d.b = len(d.buffer)
    }
    d.b--
    x := d.buffer[d.b]
    d.buffer[d.b] = nil
    d.n--
    return x
}

func (d *Deque) pop_front() interface{} {
    if d.n == 0 {
        panic("Cannot pop from an empty deque")
    }
    x := d.buffer[d.f]
    d.buffer[d.f] = nil
    d.f++
    if d.f == len(d.buffer) {
        d.f = 0
    }
    d.n--
    return x
}
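A quick usage sketch for the deque above (my own test code, not from the answer):
    d := &Deque{}
    d.push_back(1)
    d.push_back(2)
    d.push_front(0)
    fmt.Println(d.pop_front()) // 0
    fmt.Println(d.pop_back())  // 2
    fmt.Println(d.pop_front()) // 1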

Standard library Priority Queue push method

The code snippet below is the standard library example's implementation of the Push method for a priority queue. I am wondering why the line a = a[0 : n+1] does not throw an out of bounds error.
func (pq *PriorityQueue) Push(x interface{}) {
    // Push and Pop use pointer receivers because they modify the slice's length,
    // not just its contents.
    // To simplify indexing expressions in these methods, we save a copy of the
    // slice object. We could instead write (*pq)[i].
    a := *pq
    n := len(a)
    a = a[0 : n+1]
    item := x.(*Item)
    item.index = n
    a[n] = item
    *pq = a
}
a slice is not an array; it is a view onto an existing array. The slice in question is backed by an array larger than itself. When you define a slice of an existing slice, you're actually slicing the underlying array, but the indexes referenced are relative to the source slice.
That's a mouthful. Let's prove this in the following way: we'll create a slice of zero length, but we'll force the underlying array to be larger. When creating a slice with make, the third parameter will set the size of the underlying array. The expression make([]int, 0, 2) will allocate an array of size 2, but it evaluates to a size-zero slice.
package main

import (
    "fmt"
)

func main() {
    // create a zero-width slice over an initial array of size 2
    a := make([]int, 0, 2)
    fmt.Println(a)

    // expand the slice. Since we're not beyond the size of the initial
    // array, this isn't out of bounds.
    a = a[0 : len(a)+1]
    a[0] = 1
    fmt.Println(a)
    fmt.Println(a[0 : len(a)+1])
}
See here. You can use the cap built-in function to get the size of the array that backs a given slice.
The specific code that you asked about loops over cap(pq) in the calling context (container/heap/example_test.go line 90). If you modify the code at the call site and attempt to push another item into the queue, it will panic like you expect. I ... probably wouldn't suggest writing code like this. Although the code in the standard library executes, I would be very sour if I found that in my codebase. It's generally safer to use the append built-in.
Because it works in this specific example program. Here are the important parts from the original/full example source:
const nItem = 10
and
pq := make(PriorityQueue, 0, nItem)
and
for i := 0; i < cap(pq); i++ {
    item := &Item{
        value:    values[i],
        priority: priorities[i],
    }
    heap.Push(&pq, item)
}
Is it an example from container/heap? If yes, then it doesn't throw an exception because the capacity is big enough (see how the Push method is used). If you change the example to Push more items than the capacity, then it will panic.
It does in general; it doesn't in the container/heap example. Here's the general fix I already gave you some time ago.
func (pq *PriorityQueue) Push(x interface{}) {
    a := *pq
    n := len(a)
    item := x.(*Item)
    item.index = n
    a = append(a, item)
    *pq = a
}