How to free memory manually in golang - go

Below is a code to calculte C(36,8) and save result to file
func combine_dfs(n int, k int) (ans [][]int) {
temp := []int{}
var dfs func(int)
dfs = func(cur int) {
if len(temp)+(n-cur+1) < k {
return
}
if len(temp) == k {
comb := make([]int, k)
copy(comb, temp)
ans = append(ans, comb)
return
}
temp = append(temp, cur)
dfs(cur + 1)
temp = temp[:len(temp)-1]
dfs(cur + 1)
}
dfs(1)
return
}
func DoCombin() {
fmt.Printf("%v\n", "calculator...")
cst := []byte{}
for i := 'a'; i <= 'z'; i++ {
cst = append(cst, byte(i))
}
for i := '0'; i <= '9'; i++ {
cst = append(cst, byte(i))
}
n := 36
k := 8
arr := combine_dfs(n, k)
fmt.Printf("%v\n", "writefile...")
file, _ := os.OpenFile("result.txt", os.O_CREATE|os.O_TRUNC|os.O_RDWR|os.O_APPEND, 0666)
defer file.Close()
for _, m := range arr {
b:= bytes.Buffer{}
b.Reset()
for _, i := range m {
b.WriteByte(cst[i-1])
}
b.WriteByte('\n')
file.Write(b.Bytes())
}
}
but i write file so slow..
so i want use goroutine to write file (use pool to limit the number of goroutine):
func DoCombin2() {
fmt.Printf("%v\n", "calculator...")
cst := []byte{}
for i := 'a'; i <= 'z'; i++ {
cst = append(cst, byte(i))
}
for i := '0'; i <= '9'; i++ {
cst = append(cst, byte(i))
}
n := 36
k := 8
arr := combine_dfs(n, k)
fmt.Printf("%v\n", "writefile...")
file, _ := os.OpenFile("result.txt", os.O_CREATE|os.O_TRUNC|os.O_RDWR|os.O_APPEND, 0666)
defer file.Close()
pool := make(chan int, 100)
for _, m := range arr {
go func(m []int) {
pool <- 1
b := bytes.Buffer{}
b.Reset()
for _, i := range m {
b.WriteByte(cst[i-1])
}
b.WriteByte('\n')
file.Write(b.Bytes())
<-pool
}(m)
}
}
but the memory exploded
I try using sync.Pool to avoid it, but it fail:
var bufPool = sync.Pool{
New: func() interface{} {
return new(bytes.Buffer)
},
}
func DoCombin() {
fmt.Printf("%v\n", "calculator...")
cst := []byte{}
for i := 'a'; i <= 'z'; i++ {
cst = append(cst, byte(i))
}
for i := '0'; i <= '9'; i++ {
cst = append(cst, byte(i))
}
n := 36
k := 8
arr := combine_dfs(n, k)
fmt.Printf("%v\n", "writefile...")
file, _ := os.OpenFile("result.txt", os.O_CREATE|os.O_TRUNC|os.O_RDWR|os.O_APPEND, 0666)
defer file.Close()
pool := make(chan int, 100)
for _, m := range arr {
go func(m []int) {
pool <- 1
b, _ := bufPool.Get().(*bytes.Buffer)
b.Reset()
for _, i := range m {
b.WriteByte(cst[i-1])
}
b.WriteByte('\n')
bufPool.Put(b)
file.Write(b.Bytes())
<-pool
}(m)
}
}
Is there any way to avoid memory explosion?
1.Why can't I avoid it after using sync.Pool?
2.Is there any way to limit memory usage in windows(in linux i know) ?
3.Is there other way to avoid memory explosion?
4.Is the memory explosion because of bytes.Buffer? How to free bytes.Buffer manually?

Update 02/20/2023
This proposal arenas is on hold indefinitely due to serious API concerns. The GOEXPERIMENT=arena code may be changed incompatibly or removed at any time, and we do not recommend its use in production.
Per this Proposal: arena: new package providing memory arenas
We propose the addition of a new arena package to the Go standard library. The arena package will allow the allocation of any number of arenas. Objects of arbitrary type can be allocated from the memory of the arena, and an arena automatically grows in size as needed. When all objects in an arena are no longer in use, the arena can be explicitly freed to reclaim its memory efficiently without general garbage collection. We require that the implementation provide safety checks, such that, if an arena free operation is unsafe, the program will be terminated before any incorrect behavior happens.
This feature has been merged to the master branch under arena, and maybe could be released in go 1.20. With the arena package, you could allocate memory by yourself and manually free it if it is no longer in use.
Sample codes
a := arena.NewArena()
defer a.Free()
tt := arena.New[T1](a)
tt.n = 1
ts := arena.MakeSlice[T1](a, 99, 100)
if len(ts) != 99 {
t.Errorf("Slice() len = %d, want 99", len(ts))
}
if cap(ts) != 100 {
t.Errorf("Slice() cap = %d, want 100", cap(ts))
}
ts[1].n = 42

in go 1.19
The garbage collector has added support for a soft memory limit,
The garbage collector has added support for a soft memory limit, discussed in detail in the new garbage collection guide. The limit can be particularly helpful for optimizing Go programs to run as efficiently as possible in containers with dedicated amounts of memory.
the new garbage collection guide

Related

Why is the O(n^2) solution faster? Day 1 Advent of Code 2020

I have two solutions for the first Problem from Advent of Code. The first solution (p1) has the time complexity of O(n). The second (p2) of O(n^2). But why is the second faster?
https://adventofcode.com/2020/day/1
BenchmarkP1 ​ 12684 92239 ns/op
BenchmarkP2 3161 90705 ns/op
//O(n)
func p1(value int) (int, int){
m := make(map[int]int)
f, err := os.Open("nums.txt")
printError(err)
defer f.Close()
scanner := bufio.NewScanner(f)
for scanner.Scan() {
intVar, err := strconv.Atoi(scanner.Text())
printError(err)
m[intVar]=intVar
}
for _, key := range m {
l, ok := m[value-key]
if ok {
return l, key
}
}
return 0, 0
}
//O(n^2)
func p2(value int) (int, int){
var data []int
f, err := os.Open("nums.txt")
printError(err)
defer f.Close()
scanner := bufio.NewScanner(f)
for scanner.Scan() {
intVar, err := strconv.Atoi(scanner.Text())
printError(err)
data= append(data, intVar)
}
for ki, i := range data {
for kj, j := range data {
if ki != kj && i+j == value {
return i , j
}
}
}
return 0, 0
}
Just try more test data:
func main() {
// generate txt
generateTxt(10000)
// test
time := countTime(p1)
fmt.Println("time cost of O(n^2): ", time)
time = countTime(p2)
fmt.Println("time cost of O(n): ", time)
}
func countTime(f func(int) (int, int)) int64 {
tick := time.Now().UnixNano()
fmt.Println(tick)
f(2020)
tock := time.Now().UnixNano()
return tock - tick
}
result(ns):
data
O(n)
O(n^2)
500
510700
529900
5000
787900
4589600
explanation
Big-O means how the time cost increase while data scale increase, so small size data cannot reflect this well.
And also notice: p1 need to make a map with some time cost. if you try p2 after p1, the io of p2 might be benifited from cache too.

Not able to scrape data when trying to use go routines

I am trying to scrape related words for a given word for which I am using BFS starting with the given word and searching through each related word on dictionary.com
I have tried this code without concurrency and it works just fine, but takes a lot of time hence, tried using go routines but my code gets stuck after the first iteration. The first level of BFS works just fine but then in the second level it hangs!
package main
import (
"fmt"
"github.com/gocolly/colly"
"sync"
)
var wg sync.WaitGroup
func buildURL(word string) string {
return "https://www.dictionary.com/browse/" + string(word)
}
func get(url string) []string {
c := colly.NewCollector()
c.IgnoreRobotsTxt = true
var ret []string
c.OnHTML("a.css-cilpq1.e15p0a5t2", func(e *colly.HTMLElement) {
ret = append(ret, string(e.Text))
})
c.Visit(url)
c.Wait()
return ret
}
func threading(c chan []string, word string) {
defer wg.Done()
var words []string
for _, w := range get(buildURL(word)) {
words = append(words, w)
}
c <- words
}
func main() {
fmt.Println("START")
word := "jump"
maxDepth := 2
//bfs
var q map[string]int
nq := map[string]int {
word: 0,
}
vis := make(map[string]bool)
queue := make(chan []string, 5000)
for i := 1; i <= maxDepth; i++ {
fmt.Println(i)
q, nq = nq, make(map[string]int)
for word := range q {
if _, ok := vis[word]; !ok {
wg.Add(1)
vis[word] = true
go threading(queue, word)
for v := range queue {
fmt.Println(v)
for _, w := range v {
nq[w] = i
}
}
}
}
}
wg.Wait()
close(queue)
fmt.Println("END")
}
OUTPUT:
START
1
[plunge dive rise upsurge bounce hurdle fall vault drop advance upturn inflation increment spurt boost plummet skip bound surge take]
hangs just here forever, counter = 2 is not printed!
Can check here https://www.dictionary.com/browse/jump for the related words.
According to Tour of Go
Sends to a buffered channel block only when the buffer is full.
Receives block when the buffer is empty.
So, in this case, you are creating a buffered channel using 5000 as length.
for i := 1; i <= maxDepth; i++ {
fmt.Println(i)
q, nq = nq, make(map[string]int)
for word := range q { // for each word
if _, ok := vis[word]; !ok { // if not visited visit
wg.Add(1) // add a worker
vis[word] = true
go threading(queue, word) // fetch in concurrent manner
for v := range queue { // <<< blocks here when queue is empty
fmt.Println(v)
for _, w := range v {
nq[w] = i
}
}
}
}
}
As you can see I've commented in the code, after 1st iteration the for loop gonna block until channel is empty. In this case after fetching jump It sends the array corresponding similar words, but after that as the for loop is blocking as zerkems explains you will not get to next iteration(i = 2). You can ultimately close the channel to end the blocking in for loop. But since you use the same channel to write over multiple goroutines it will panic if you closed it from multiple goroutines.
To overcome this we can come up with a nice workaround.
We exactly know how much unvisited items we are fetching for.
We now know where is the block
First, we need to count the unvisited words and then we can iterate that much of the time
vis := make(map[string]bool)
queue := make(chan []string, 5000)
for i := 1; i <= maxDepth; i++ {
fmt.Println(i)
q, nq = nq, make(map[string]int)
unvisited := 0
for word := range q {
if _, ok := vis[word]; !ok {
vis[word] = true
unvisited++
wg.Add(1)
go threading(queue, word)
}
}
wg.Wait() // wait until jobs are done
for j := 0; j < unvisited; j++ { // << does not block as we know how much
v := <-queue // we exactly try to get unvisited amount
fmt.Println(v)
for _, w := range v {
nq[w] = i
}
}
}
In this situation, we are simply counting what is the minimum iterations we need to go to get results. Also, you can see that I've moved down the for loop outer and use original one to just add words to workers. It will ask to fetch all words and will wait in the following loop to complete there tasks in a non-blocking way.
Latter loop waits until all workers are done. After that next iteration, works and next level of BFS can be reached.
Summary
Distribute workload
Wait for results
Don't do both at the same time
Hope this helps.

Why don't goroutines write in parallel using WriteAt?

I'm experimenting a bit with reading and writing from a file.
To write to a file concurrently I created the following function:
func write(f *os.File, b []byte, off int64, c chan int) {
var _, err = f.WriteAt(b, off)
check(err)
c <- 0
}
I then create a file and 100000 goroutines to perform the write operations.
They each write an array of 16384 bytes to the hard disk:
func main() {
path := "E:/test"
f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0666)
check(err)
size := int64(16384)
ones := make([]byte, size)
n := int64(100000)
c := make(chan int, n)
for i := int64(0); i < size; i++ {
ones[i] = 1
}
// Start timing
start := time.Now()
for i := int64(0); i < n; i++ {
go write(f, ones, size*i, c)
}
for i := int64(0); i < n; i++ {
<-c
}
// Check elapsed time
fmt.Println(time.Now().Sub(start))
err = f.Sync()
check(err)
err = f.Close()
check(err)
}
In this case about 1.6 GB is written where each goroutine writes to a non-overlapping byte range. The documentation for the io package states that Clients of WriteAt can execute parallel WriteAt calls on the same destination if the ranges do not overlap.
So what I expect to see, is that when I use go write(f, ones, 0, c), it would take much longer since all write operations would be on the same byterange.
However after testing this my results are quite unexpected:
Using go write(f, ones, size*i, c) took an average of about 3s
But using go write(f, ones, 0, c) only took an average of about 480ms
Do I use the WriteAt function in the wrong way? How could i achieve concurrent writing to non-overlapping byteranges?

"fan in" - one "fan out" behavior

Say, we have three methods to implement "fan in" behavior
func MakeChannel(tries int) chan int {
ch := make(chan int)
go func() {
for i := 0; i < tries; i++ {
ch <- i
}
close(ch)
}()
return ch
}
func MergeByReflection(channels ...chan int) chan int {
length := len(channels)
out := make(chan int)
cases := make([]reflect.SelectCase, length)
for i, ch := range channels {
cases[i] = reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(ch)}
}
go func() {
for length > 0 {
i, line, opened := reflect.Select(cases)
if !opened {
cases[i].Chan = reflect.ValueOf(nil)
length -= 1
} else {
out <- int(line.Int())
}
}
close(out)
}()
return out
}
func MergeByCode(channels ...chan int) chan int {
length := len(channels)
out := make(chan int)
go func() {
var i int
var ok bool
for length > 0 {
select {
case i, ok = <-channels[0]:
out <- i
if !ok {
channels[0] = nil
length -= 1
}
case i, ok = <-channels[1]:
out <- i
if !ok {
channels[1] = nil
length -= 1
}
case i, ok = <-channels[2]:
out <- i
if !ok {
channels[2] = nil
length -= 1
}
case i, ok = <-channels[3]:
out <- i
if !ok {
channels[3] = nil
length -= 1
}
case i, ok = <-channels[4]:
out <- i
if !ok {
channels[4] = nil
length -= 1
}
}
}
close(out)
}()
return out
}
func MergeByGoRoutines(channels ...chan int) chan int {
var group sync.WaitGroup
out := make(chan int)
for _, ch := range channels {
go func(ch chan int) {
for i := range ch {
out <- i
}
group.Done()
}(ch)
}
group.Add(len(channels))
go func() {
group.Wait()
close(out)
}()
return out
}
type MergeFn func(...chan int) chan int
func main() {
length := 5
tries := 1000000
channels := make([]chan int, length)
fns := []MergeFn{MergeByReflection, MergeByCode, MergeByGoRoutines}
for _, fn := range fns {
sum := 0
t := time.Now()
for i := 0; i < length; i++ {
channels[i] = MakeChannel(tries)
}
for i := range fn(channels...) {
sum += i
}
fmt.Println(time.Since(t))
fmt.Println(sum)
}
}
Results are (at 1 CPU, I have used runtime.GOMAXPROCS(1)):
19.869s (MergeByReflection)
2499997500000
8.483s (MergeByCode)
2499997500000
4.977s (MergeByGoRoutines)
2499997500000
Results are (at 2 CPU, I have used runtime.GOMAXPROCS(2)):
44.94s
2499997500000
10.853s
2499997500000
3.728s
2499997500000
I understand the reason why MergeByReflection is slowest, but what is about the difference between MergeByCode and MergeByGoRoutines?
And when we increase the CPU number why "select" clause (used MergeByReflection directly and in MergeByCode indirectly) becomes slower?
Here is a preliminary remark. The channels in your examples are all unbuffered, meaning they will likely block at put or get time.
In this example, there is almost no processing except channel management. The performance is therefore dominated by synchronization primitives. Actually, there is very little of this code that can be parallelized.
In the MergeByReflection and MergeByCode functions, select is used to listen to multiple input channels, but nothing is done to take in account the output channel (which may therefore block, while some event could be available on one of the input channels).
In the MergeByGoRoutines function, this situation cannot happen: when the output channel blocks, it does not prevent an other input channel to be read by another goroutine. There are therefore better opportunities for the runtime to parallelize the goroutines, and less contention on the input channels.
The MergeByReflection code is the slowest because it has the overhead of reflection, and almost nothing can be parallelized.
The MergeByGoRoutines function is the fastest because it reduces the contention (less synchronization is needed), and because output contention has a lesser impact on the input performance. It can therefore benefit of a small improvement when running with multiple cores (contrary to the two other methods).
There is so much synchronization activity with MergeByReflection and MergeByCode, that running on multiple cores negatively impacts the performance. You could have different performance by using buffered channels though.

Deadlock in go function channel

Why is there a deadlock even tho I just pass one and get one output from the channel?
package main
import "fmt"
import "math/cmplx"
func max(a []complex128, base int, ans chan float64, index chan int) {
fmt.Printf("called for %d,%d\n",len(a),base)
maxi_i := 0
maxi := cmplx.Abs(a[maxi_i]);
for i:=1 ; i< len(a) ; i++ {
if cmplx.Abs(a[i]) > maxi {
maxi_i = i
maxi = cmplx.Abs(a[i])
}
}
fmt.Printf("called for %d,%d and found %f %d\n",len(a),base,maxi,base+maxi_i)
ans <- maxi
index <- base+maxi_i
}
func main() {
ans := make([]complex128,128)
numberOfSlices := 4
incr := len(ans)/numberOfSlices
tmp_val := make([]chan float64,numberOfSlices)
tmp_index := make([]chan int,numberOfSlices)
for i,j := 0 , 0; i < len(ans); j++{
fmt.Printf("From %d to %d - %d\n",i,i+incr,len(ans))
go max(ans[i:i+incr],i,tmp_val[j],tmp_index[j])
i = i+ incr
}
//After Here is it stops deadlock
maximumFreq := <- tmp_index[0]
maximumMax := <- tmp_val[0]
for i := 1; i < numberOfSlices; i++ {
tmpI := <- tmp_index[i]
tmpV := <- tmp_val[i]
if(tmpV > maximumMax ) {
maximumMax = tmpV
maximumFreq = tmpI
}
}
fmt.Printf("Max freq = %d",maximumFreq)
}
For those reading this question and perhaps wondering why his code failed here's an explanation.
When he constructed his slice of channels like so:
tmp_val := make([]chan float64,numberOfSlices)
He made slice of channels where every index was to the channels zero value. A channels zero value is nil since channels are reference types and a nil channel blocks on send forever and since there is never anything in a nil channel it will also block on recieve forever. Thus you get a deadlock.
When footy changes his code to construct each channel individually using
tmp_val[i] = make(chan float64)
in a loop he constructs non-nil channels and everything is good.
I was wrong in making of the chan. Should have done
numberOfSlices := 4
incr := len(ans)/numberOfSlices
var tmp_val [4]chan float64
var tmp_index [4]chan int
for i := range tmp_val {
tmp_val[i] = make(chan float64)
tmp_index[i] = make(chan int)
}
for i,j := 0 , 0; i < len(ans); j++{
fmt.Printf("From %d to %d [j:%d] - %d\n",i,i+incr,j,len(ans))
go maximumFunc(ans[i:i+incr],i,tmp_val[j],tmp_index[j])
i = i+ incr
}

Resources