I was solving this Project Euler problem. First I tried brute force, which took 0.5 seconds, and then I tried dynamic programming with memoization, expecting a huge improvement, but I was surprised that the result was only 0.36 seconds.
After a little bit of googling I found out that you cannot use a pointer inside a function (find_collatz_len) to an outside map (memo). So each time the function below runs, it copies the entire dictionary. That sounds like a huge waste of processor power.
My question is: what is a workaround, so that I can use a pointer to a map outside the function and avoid the copying?
Here is my ugly code:
package main

//project euler 014 - longest collatz sequence
import (
    "fmt"
    "time"
)

func find_collatz_len(n int, memo map[int]int) int {
    counter := 1
    initital_value := n
    for n != 1 {
        counter++
        if n < initital_value {
            counter = counter + memo[n]
            break
        }
        if n%2 == 0 {
            n = int(float64(n) / 2)
        } else {
            n = n*3 + 1
        }
    }
    memo[initital_value] = counter
    return counter
}

func main() {
    start := time.Now()
    max_length := 0
    number := 0
    current_length := 0
    memo := make(map[int]int)
    for i := 1; i < 1_000_000; i++ {
        current_length = find_collatz_len(i, memo)
        if current_length > max_length {
            max_length = current_length
            number = i
        }
    }
    fmt.Println(max_length, number)
    fmt.Println("Time:", time.Since(start).Seconds())
}
Maps are already pointers under the hood: passing a map value passes a single pointer, not a copy of the entries. For details, see: why can slice values sometimes go stale but never map values?
When you create a map without a capacity hint, it is allocated with enough internal structure to store a relatively small number of entries (around 7). As the map grows, the implementation sometimes needs to allocate more memory and restructure (rehash) the map to accommodate more elements. This can be avoided if you initialize the map with the expected final capacity, as suggested by @mkopriva:
memo := make(map[int]int, 1_000_000)
As a result, enough room is allocated to store all entries (1_000_000 in your example), so no rehash happens during the lifetime of your app. This reduces the runtime from 0.3 sec to 0.2 sec.
You can also replace int(float64(n)/2) with n/2; in the integer range you use, they give the same result. This gives a further 5% boost (0.19 sec on my machine).
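For illustration, here is a minimal runnable sketch (my addition, not part of the original answer) showing both points: the map header is shared between caller and callee, and the capacity hint reserves room up front:

package main

import "fmt"

func fill(m map[int]int) {
    m[42] = 1 // writes through the parameter mutate the caller's map: nothing is copied
}

func main() {
    memo := make(map[int]int, 1_000_000) // capacity hint: room for all entries, no rehashing later
    fill(memo)
    fmt.Println(memo[42]) // prints 1
}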
While migrating a production Node.js application to Go, I noticed that iterating over Go's native map is actually slower than in Node.
I've come up with an alternative solution that trades removal/insertion speed for iteration speed, by exposing an array that can be iterated over and storing key=>index pairs inside a separate map.
While this solution works, and provides a significant performance increase, I was wondering if there is a better solution that I could look into.
The setup I have is that it's very rare for something to be removed from the hashmaps; only additions and replacements are common, for which this implementation 'works', albeit it feels more like a workaround than an actual solution.
The maps are always indexed by an integer, holding arbitrary data.
FastMap: 500000 Iterations - 0.153000ms
Native Map: 500000 Iterations - 4.988000ms
/*
Unordered hash map optimized for iteration speed.
Stores values in an array and holds key=>index mappings inside a separate hashmap.
*/
type FastMapEntry[K comparable, T any] struct {
    Key   K
    Value T
}

type FastMap[K comparable, T any] struct {
    m       map[K]int            // Stores key => array index mappings
    entries []FastMapEntry[K, T] // Array holding entries and their keys
    len     int                  // Total map size
}

func MakeFastMap[K comparable, T any]() *FastMap[K, T] {
    return &FastMap[K, T]{
        m:       make(map[K]int),
        entries: make([]FastMapEntry[K, T], 0),
    }
}

func (m *FastMap[K, T]) Set(key K, value T) {
    index, exists := m.m[key]
    if exists {
        // Replace if key already exists
        m.entries[index] = FastMapEntry[K, T]{
            Key:   key,
            Value: value,
        }
    } else {
        // Store the key=>index pair in the map and add value to entries. Increase total len by one
        m.m[key] = m.len
        m.entries = append(m.entries, FastMapEntry[K, T]{
            Key:   key,
            Value: value,
        })
        m.len++
    }
}

func (m *FastMap[K, T]) Has(key K) bool {
    _, exists := m.m[key]
    return exists
}

func (m *FastMap[K, T]) Get(key K) (value T, found bool) {
    index, exists := m.m[key]
    if exists {
        found = true
        value = m.entries[index].Value
    }
    return
}

func (m *FastMap[K, T]) Remove(key K) bool {
    index, exists := m.m[key]
    if exists {
        // Remove value from entries
        m.entries = append(m.entries[:index], m.entries[index+1:]...)
        // Remove key=>index mapping
        delete(m.m, key)
        m.len--
        for i := index; i < m.len; i++ {
            // Move all index mappings up, starting from current index
            m.m[m.entries[i].Key] = i
        }
    }
    return exists
}

func (m *FastMap[K, T]) Entries() []FastMapEntry[K, T] {
    return m.entries
}

func (m *FastMap[K, T]) Len() int {
    return m.len
}
The test code that was run is:
// s.Variations is a native map holding ~500k records
start := time.Now()
iterations := 0
for _, variation := range s.Variations {
    if variation.Id > 0 {
    }
    iterations++
}
log.Printf("Native Map: %d Iterations - %fms\n", iterations, float64(time.Since(start).Microseconds())/1000)

// Copy data into FastMap
fm := helpers.MakeFastMap[state.VariationId, models.ItemVariation]()
for key, variation := range s.Variations {
    fm.Set(key, variation)
}

start = time.Now()
iterations = 0
for _, variation := range fm.Entries() {
    if variation.Value.Id > 0 {
    }
    iterations++
}
log.Printf("FastMap: %d Iterations - %fms\n", iterations, float64(time.Since(start).Microseconds())/1000)
I think this kind of comparison and benchmarking is a little off base. Go's map implementation is quite different from your implementation: it has to cover a much wider range of uses, the structures generated at compile time carry some extra weight (not much, they mostly store information about the key and value types you use, and so on), and the overall approach is different. Go's map is a true hashmap, while yours is not really one itself; or rather, the actual hashing is delegated to the m map you hold internally.
One of the other factors makes you get this result is, if you take a look at this:
for _, variation := range fm.Entries() {
    if variation.Value.Id > 0 {
    }
    iterations++
}
Basically, you're iterating over a slice, which is much easier and faster to iterate than a map: you have a view into an array that holds elements of the same type next to each other in memory. Makes sense, right?
What you should do to make a better comparison would be something like this:
for _, y := range fastMap.m {
    _ = fastMap.Entries()[y].Value + 1 // some simple calculation (this assumes a numeric value type)
}
If you're really looking for performance, a well-written hash function and a fixed-size array would be your best choice.
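If you want to quantify the difference yourself, a minimal benchmark sketch with the standard testing package could look like the following; the element count and the int key/value types are my own assumptions, not taken from the question:

package fastmap_test

import "testing"

const size = 500_000

func BenchmarkNativeMapIter(b *testing.B) {
    m := make(map[int]int, size)
    for i := 0; i < size; i++ {
        m[i] = i
    }
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        sum := 0
        for _, v := range m { // map iteration walks scattered buckets
            sum += v
        }
        _ = sum
    }
}

func BenchmarkSliceIter(b *testing.B) {
    s := make([]int, size)
    for i := range s {
        s[i] = i
    }
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        sum := 0
        for _, v := range s { // slice iteration walks contiguous memory
            sum += v
        }
        _ = sum
    }
}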
For personal skills improvement, I am trying to solve this HackerRank challenge:
There is a string, s, of lowercase English letters that is repeated infinitely many times. Given an integer, n, find and print the number of letter a's in the first n letters of the infinite string.
1 <= |s| <= 100 and 1 <= n <= 10^12
Very naively, I thought this code would be fine:
fs := strings.Repeat(s, int(n)) // full string
ss := fs[:n] // sub string
fmt.Println(strings.Count(ss, "a"))
Obviously this exploded the memory and I got an "out of memory" error.
I have never faced this kind of issue before, and I'm clueless about how to handle it.
How can I manipulate very long strings without running out of memory?
I hope this helps: you don't have to actually count by running through the string.
That is the naive approach. You can use some basic arithmetic to get the answer without running out of memory; the comments in the code explain the steps.
// Wrapped in a function for clarity; the exact signature is illustrative.
func countAs(s string, n int64) int64 {
    var answer int64
    // 1st figure out how many a's are present in s.
    aCount := int64(strings.Count(s, "a"))
    // How many times will s repeat in its entirety if it had to be of length n?
    repeats := n / int64(len(s))
    remainder := n % int64(len(s))
    // If n is not perfectly divisible by len(s), there is a remainder: the first
    // `remainder` characters of s appear one extra time.
    // E.g. if s is of length 5 and n = 22, then the first 2 characters of s repeat an extra time.
    if remainder > 0 {
        aCountInRemainder := strings.Count(s[:remainder], "a")
        answer = aCount*repeats + int64(aCountInRemainder)
    } else {
        answer = aCount * repeats
    }
    return answer
}
There might be other methods but this is what came to my mind.
As you found out, if you actually generate the string you will end up having that huge memory block in RAM.
One common way to represent a "big sequence of incoming bytes" is to implement it as an io.Reader (which you can view as a stream of bytes), and have your code run a r.Read(buff) loop.
Given the specifics of the exercise you mention (a fixed string repeated n times), the number of occurrences of a specific letter can also be computed directly from the number of occurrences of that letter in s, plus something more (I'll let you figure out what multiplications and counting should be done).
How do you implement a Reader that repeats the string without allocating 10^12 copies of it?
Note that, when the .Read() method is called, the caller has already allocated the buffer. You don't need to repeat your string in memory; you just need to fill the buffer with the correct values, for example by copying your data into the buffer byte by byte.
Here is one way to do it:
type RepeatReader struct {
    str   string
    count int
}

func (r *RepeatReader) Read(p []byte) (int, error) {
    if r.count == 0 {
        return 0, io.EOF
    }
    // at each iteration, pos will hold the number of bytes copied so far
    var pos = 0
    for r.count > 0 && pos < len(p) {
        // to copy slices over, you can use the built-in 'copy' function
        // at each iteration, you need to write bytes *after* the ones you have already copied,
        // hence the "p[pos:]"
        n := copy(p[pos:], r.str)
        // update the amount of copied bytes
        pos += n
        // bad computation in this first example:
        // I decrement one complete count, even if str was only partially copied
        r.count--
    }
    return pos, nil
}
https://go.dev/play/p/QyFQ-3NzUDV
To have a complete, correct implementation, you also need to keep track of the offset to start from the next time .Read() is called:
type RepeatReader struct {
    str    string
    count  int
    offset int
}

func (r *RepeatReader) Read(p []byte) (int, error) {
    if r.count == 0 {
        return 0, io.EOF
    }
    var pos = 0
    for r.count > 0 && pos < len(p) {
        // when copying over to p, you should start at r.offset:
        n := copy(p[pos:], r.str[r.offset:])
        pos += n
        // update r.offset:
        r.offset += n
        // if one full copy of str has been issued, decrement 'count' and reset 'offset' to 0
        if r.offset == len(r.str) {
            r.count--
            r.offset = 0
        }
    }
    return pos, nil
}
https://go.dev/play/p/YapRuioQcOz
You can now count the a's while iterating through this Reader.
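For illustration, here is a small usage sketch of my own (the string and count are made up, and the usual bytes, fmt and io imports are assumed); counting the first n letters of the infinite string would additionally need the remainder handling described in the first answer:

r := &RepeatReader{str: "abca", count: 3}
buf := make([]byte, 8)
var total int64
for {
    n, err := r.Read(buf)
    total += int64(bytes.Count(buf[:n], []byte{'a'}))
    if err == io.EOF {
        break
    }
}
fmt.Println(total) // "abca" contains 2 a's, repeated 3 times: prints 6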
Consider this piece of code:
package main

import (
    "fmt"
)

func main() {
    fmt.Println(Part(11))
}

func Part(n int) string {
    enumResult := [][]int{}
    enum(n, n, []int{}, &enumResult)
    fmt.Println(enumResult)
    fmt.Println(40, enumResult[40])
    return ""
}

var abc int = 0

func enum(n int, top int, pre []int, result *[][]int) {
    var i int
    if n > top {
        i = top
    } else {
        i = n
    }
    for ; i > 0; i-- {
        tempResult := append(pre, i)
        if n-i == 0 {
            /* if tempResult[0] == 3 && tempResult[1] == 3 && tempResult[2] == 3 && tempResult[3] == 2 {
                tempResult = append(tempResult, 12345)
            }*/
            fmt.Println(abc, tempResult)
            abc++
            *result = append(*result, tempResult)
        } else {
            enum(n-i, i, tempResult, result)
        }
    }
}
When I run this code, I append the value [3,3,3,2] to enumResult, but if I then check enumResult, [3,3,3,1] appears in its place. Its index is 40 (enumResult[40]); all the other values are correct. I don't know why this is happening. Can you explain why?
The problem is indeed due to append.
There are two things to know about append. The first is that append does not necessarily copy memory. As the spec specifies:
If the capacity of s is not large enough to fit the additional values,
append allocates a new, sufficiently large underlying array that fits
both the existing slice elements and the additional values. Otherwise,
append re-uses the underlying array.
This may cause unexpected behavior if you are not aware of it. A playground example: https://play.golang.org/p/7A3JR-5IX8o
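A condensed version of that effect (my own snippet, in the same spirit as the playground example):

a := make([]int, 2, 4)  // len 2, cap 4: room for two more elements
b := append(a, 10)      // fits in a's spare capacity, so the backing array is reused
c := append(a, 20)      // reuses the same backing array again and overwrites that slot
fmt.Println(b[2], c[2]) // prints "20 20": b changed even though we never touched it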
The second thing is that when append does copy memory, it grows the capacity of the slice, and it does not grow it by just 1. A playground example: https://play.golang.org/p/STr9jMqORUz
How much append grows a slice is undocumented and considered an implementation detail. But as of Go 1.10, it follows this rule:
Go slices grow by doubling until size 1024, after which they grow by
25% each time.
Note that this may change when the race detector is enabled. The code for growing a slice is located in $GOROOT/src/runtime/slice.go, in the growslice function.
Now back to the question. It should be clear by now that your code appended to the same underlying array, which had sufficient spare capacity because an earlier append had grown the slice. To solve it, make a new slice and copy the contents:
tempResult := make([]int, len(pre)+1)
copy(tempResult, pre)
tempResult[len(pre)] = i
Suppose I have a slice slice of type int. While declaring, I set the third argument to size, which I believe reserves memory for at least size ints by setting the cap parameter of the slice.
slice:=make([]int,0,size)
Now, suppose I have an integer variable value. To add it to the slice at the end, I use
slice=append(slice,value)
If the number of elements currently in the slice is less than size, then there will be no need to copy the entire underlying array to a new location in order to add the new element.
Further, if I want to prepend value to slice, as suggested here and here, I use
slice=append([]int{value},slice...)
My question is, what happens in this case? If the number of elements is still less than size, how are the elements stored in the memory? Assuming a contiguous allocation when the make() function was invoked, are all existing elements right shifted to free the first space for value? Or is memory reallocated and all elements copied?
The reason for asking is that I would like my program to be as fast as possible, and would like to know if this is a possible cause for slowing it down. If it is so, is there any alternative way of prepending that would be more time efficient?
With reslicing and copying
The builtin append() always appends elements to a slice. You cannot use it (alone) to prepend elements.
Having said that, if you have a slice whose capacity is bigger than its length (i.e. it has "free" space after its elements) and you want to prepend an element to it, you may reslice the original slice, copy all elements one position higher to make room for the new element, then set the element at index 0. This requires no new allocation. This is how it could look:
func prepend(dest []int, value int) []int {
    if cap(dest) > len(dest) {
        dest = dest[:len(dest)+1]
        copy(dest[1:], dest)
        dest[0] = value
        return dest
    }
    // No room, a new slice needs to be allocated:
    // Use some extra space for the future:
    res := make([]int, len(dest)+1, len(dest)+5)
    res[0] = value
    copy(res[1:], dest)
    return res
}
Testing it:
s := make([]int, 0, 5)
s = append(s, 1, 2, 3, 4)
fmt.Println(s)
s = prepend(s, 9)
fmt.Println(s)
s = prepend(s, 8)
fmt.Println(s)
Output (try it on the Go Playground):
[1 2 3 4]
[9 1 2 3 4]
[8 9 1 2 3 4]
Note: if there is no room for the new element, since performance does matter now, we didn't just do:
res := append([]int{value}, dest...)
Because it does more allocations and copying than needed: it allocates a slice for the literal ([]int{value}), then append() allocates a new one when appending dest to it.
Instead, our solution allocates just one new array (with make(), even reserving some space for future growth), then just sets value as the first element and copies dest (the previous elements).
With linked list
If you need to prepend many times, a normal slice may not be the right choice. A faster alternative is a linked list, to which prepending an element requires no allocation of slices or arrays and no copying: you just create a new node element and designate it as the root by pointing it to the old root (first element).
The standard library provides a general implementation in the container/list package.
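For example, a minimal sketch using container/list (my own, assuming int values and the container/list and fmt imports):

l := list.New()
l.PushBack(1)
l.PushBack(2)
l.PushFront(9) // "prepend": no existing elements are moved or copied
for e := l.Front(); e != nil; e = e.Next() {
    fmt.Println(e.Value) // 9, then 1, then 2
}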
With manually managing a larger backing array
Sticking to normal slices and arrays, there is another solution.
If you're willing to manage a larger backing array (or slice) yourself, you can do so by leaving free space before the slice you use. When prepending, you create a new slice value from the larger backing array (or slice), starting at an index that leaves room for one element to be prepended.
Without completeness, just for demonstration:
var backing = make([]int, 15) // 15 elements
var start int

func prepend(dest []int, value int) []int {
    if start == 0 {
        // No more room for new value, must allocate bigger backing array:
        newbacking := make([]int, len(backing)+5)
        start = 5
        copy(newbacking[5:], backing)
        backing = newbacking
    }
    start--
    dest = backing[start : start+len(dest)+1]
    dest[0] = value
    return dest
}
Testing / using it:
start = 5
s := backing[start:start] // empty slice, starting at idx=5
s = append(s, 1, 2, 3, 4)
fmt.Println(s)
s = prepend(s, 9)
fmt.Println(s)
s = prepend(s, 8)
fmt.Println(s)
// Prepend more to test reallocation:
for i := 10; i < 15; i++ {
    s = prepend(s, i)
}
fmt.Println(s)
Output (try it on the Go Playground):
[1 2 3 4]
[9 1 2 3 4]
[8 9 1 2 3 4]
[14 13 12 11 10 8 9 1 2 3 4]
Analysis: this solution makes no allocations and no copying when there is room in the backing slice to prepend the value! All that happens is that it creates a new slice from the backing slice that covers the destination plus one extra slot for the value to be prepended, sets that slot, and returns the new slice value. You can't really get better than this.
If there is no room, then it allocates a larger backing slice, copies over the content of the old, and then does the "normal" prepending.
With tricky slice usage
Idea: imagine that you always store elements in the slice in backward order.
Storing your elements in backward order in a slice means a prepend becomes an append!
So to "prepend" an element, you can simply use append(s, value). And that's all.
Yes, this has its limited uses (e.g. appending to the logical sequence of a slice stored in reverse order has the same issues and complexity as a "normal" slice and a prepend operation), and you lose many conveniences (the ability to list elements using for range, to name one), but performance-wise nothing beats prepending a value with a plain append().
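A tiny sketch of the idea (my own): if the slice stores the logical sequence 1, 2, 3 backward as [3 2 1], then prepending 0 to the logical sequence is a plain append:

s := []int{3, 2, 1} // logical sequence: 1, 2, 3
s = append(s, 0)    // logical sequence is now: 0, 1, 2, 3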
Note: iterating over a slice that stores its elements in backward order has to use a downward loop, e.g.:
for i := len(s) - 1; i >= 0; i-- {
    // do something with s[i]
}
Final note: all these solutions can easily be extended to prepend a slice instead of just a single value. Generally, the additional space when reslicing is not +1 but +len(values), and instead of simply setting dst[0] = value you make a call to copy(dst, values).
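For example, the reslice-and-copy variant generalized to a whole slice could look like this (a sketch of my own, following the same pattern as the prepend function above):

func prependAll(dest, values []int) []int {
    if cap(dest)-len(dest) >= len(values) {
        dest = dest[:len(dest)+len(values)]
        copy(dest[len(values):], dest) // shift the existing elements right (copy handles overlap)
        copy(dest, values)             // place the new values at the front
        return dest
    }
    // No room: allocate a new backing array.
    res := make([]int, len(values)+len(dest))
    copy(res, values)
    copy(res[len(values):], dest)
    return res
}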
The "prepend" call will need to allocate an array and copy all elements because a slice in Go is defined as a starting point, a size and an allocation (with the allocation being counted from the starting point).
There is no way a slice can know that the element before the first one can be used to extend the slice.
What will happen with
slice = append([]int{value}, slice...)
is
a new array of a single element (value) is allocated (probably on the stack)
a slice is created to refer to this element (start=0, len=1, cap=1)
the append call is made
append sees that there is not enough room to extend the single-element slice, so it allocates a new array and copies all the elements
a new slice header is created to refer to this new array
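A small sketch demonstrating those steps (my own): even when the original slice has spare capacity, prepending this way allocates a fresh array:

s := make([]int, 3, 10)      // plenty of spare capacity after the elements
p := append([]int{42}, s...) // still allocates: the literal's capacity is 1, so append must grow it
fmt.Println(cap(s), cap(p))  // 10 and an implementation-chosen capacity (at least 4)
fmt.Println(&s[0] == &p[1])  // false: p lives in a different backing array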
If appending/removing at both ends of a large container is a common use case for your application, then you need a deque container. It is unfortunately not available in Go's standard library, and it is impossible to write efficiently for generic contained types while maintaining usability (at the time of writing, Go still lacked generics).
You can, however, implement a deque for your specific case, and this is easy (for example, if you have a large container with a known upper bound, maybe a circular buffer is all you need, and that is just a couple of lines of code away).
I'm very new to Go, so maybe the following is very bad Go code... but it's an attempt to implement a deque using a growing circular buffer (depending on the use case this may or may not be a good solution):
type Deque struct {
    buffer  []interface{}
    f, b, n int
}

func (d *Deque) resize() {
    new_buffer := make([]interface{}, 2*(1+d.n))
    j := d.f
    for i := 0; i < d.n; i++ {
        new_buffer[i] = d.buffer[j]
        d.buffer[j] = nil
        j++
        if j == len(d.buffer) {
            j = 0
        }
    }
    d.f = 0
    d.b = d.n
    d.buffer = new_buffer
}

func (d *Deque) push_back(x interface{}) {
    if d.n == len(d.buffer) {
        d.resize()
    }
    d.buffer[d.b] = x
    d.b++
    if d.b == len(d.buffer) {
        d.b = 0
    }
    d.n++
}

func (d *Deque) push_front(x interface{}) {
    if d.n == len(d.buffer) {
        d.resize()
    }
    if d.f == 0 {
        d.f = len(d.buffer)
    }
    d.f--
    d.buffer[d.f] = x
    d.n++
}

func (d *Deque) pop_back() interface{} {
    if d.n == 0 {
        panic("Cannot pop from an empty deque")
    }
    if d.b == 0 {
        d.b = len(d.buffer)
    }
    d.b--
    x := d.buffer[d.b]
    d.buffer[d.b] = nil
    d.n--
    return x
}

func (d *Deque) pop_front() interface{} {
    if d.n == 0 {
        panic("Cannot pop from an empty deque")
    }
    x := d.buffer[d.f]
    d.buffer[d.f] = nil
    d.f++
    if d.f == len(d.buffer) {
        d.f = 0
    }
    d.n--
    return x
}
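A quick usage sketch (mine, not from the answer) to see the two ends in action; note that the zero value works because push_back/push_front resize the empty buffer on first use:

var d Deque
d.push_back(1)
d.push_back(2)
d.push_front(0)
fmt.Println(d.pop_front()) // 0
fmt.Println(d.pop_back())  // 2
fmt.Println(d.pop_front()) // 1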
What is the max value of *big.Int and max precision of *big.Rat?
Here are the structure definitions:

// A Word represents a single digit of a multi-precision unsigned integer.
type Word uintptr

type nat []Word

type Int struct {
    neg bool // sign
    abs nat  // absolute value of the integer
}

type Rat struct {
    // To make zero values for Rat work w/o initialization,
    // a zero value of b (len(b) == 0) acts like b == 1.
    // a.neg determines the sign of the Rat, b.neg is ignored.
    a, b Int
}
There is no explicit limit. The limit will be your memory or, theoretically, the max array size (2^31 or 2^63, depending on your platform).
If you have practical concerns, you might be interested in the tests made in http://golang.org/src/pkg/math/big/nat_test.go, for example the one where 10^100000 is benchmarked.
And you can easily run this kind of program:
package main

import (
    "fmt"
    "math/big"
)

func main() {
    verybig := big.NewInt(1)
    ten := big.NewInt(10)
    for i := 0; i < 100000; i++ {
        verybig.Mul(verybig, ten)
    }
    fmt.Println(verybig)
}
(if you want it to run fast enough for Go Playground, use a smaller exponent than 100000)
The problem won't be the maximum size, but the memory used and the time such computations take.
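As an aside (my addition, not from the answer): math/big can also compute the same power directly with Exp, which is much faster than the Mul loop above:

verybig := new(big.Int).Exp(big.NewInt(10), big.NewInt(100000), nil)
fmt.Println(verybig.BitLen()) // 10^100000 needs 332193 bits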