Access with mutex is faster on more CPUs - Go

I tried some synchronization techniques to share state between goroutines and found that the incorrect variant (without sync) runs more slowly than the same program with a mutex.
Given the code:
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup
	var mu sync.Mutex
	hash := make(map[string]string)
	hash["test"] = "string"
	num := 40000000
	wg.Add(num)
	start := time.Now()
	for i := 0; i < num; i++ {
		go func() {
			mu.Lock()
			_, _ = hash["test"]
			mu.Unlock()
			wg.Done()
		}()
	}
	wg.Wait()
	fmt.Println(time.Since(start))
}
This runs on my laptop with 8 HT cores in 9-10 seconds.
But if I just remove the locking, it runs in 11-12 seconds:
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup
	hash := make(map[string]string)
	hash["test"] = "string"
	num := 40000000
	wg.Add(num)
	start := time.Now()
	for i := 0; i < num; i++ {
		go func() {
			_, _ = hash["test"]
			wg.Done()
		}()
	}
	wg.Wait()
	fmt.Println(time.Since(start))
}
The synced version is faster and utilizes the CPU much more. The question is: why?
My thought is that it has to do with how goroutines are scheduled and the overhead of context switching, because the larger GOMAXPROCS is, the greater the gap between these two versions. But I can't explain the real reason for what happens under the hood of the scheduler. A sketch for probing this is below.
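One way to probe the GOMAXPROCS hypothesis (a sketch using only the standard library; this is the unsynced variant pinned to a single P, for comparison against the default of one P per core):
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	// Pin the scheduler to a single P. With the default setting the
	// same loop spreads across all cores, so any gap that disappears
	// here points at cross-core scheduling and cache-line contention.
	runtime.GOMAXPROCS(1)
	var wg sync.WaitGroup
	hash := map[string]string{"test": "string"}
	const num = 40000000
	wg.Add(num)
	start := time.Now()
	for i := 0; i < num; i++ {
		go func() {
			_, _ = hash["test"]
			wg.Done()
		}()
	}
	wg.Wait()
	fmt.Println(runtime.GOMAXPROCS(0), "P(s):", time.Since(start))
}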

Related

Concurrent Bubble sort in golang

Can someone explain to me how the goroutines work in the following code? (I wrote it, by the way.)
When I do BubbleSortVanilla, it takes roughly 15s for a list of size 100000.
When I do BubbleSortOdd followed by BubbleSortEven using the odd-even phases, it takes roughly 7s. But when I just do ConcurrentBubbleSort, it only takes roughly 1.4s.
I can't really understand why the single ConcurrentBubbleSort is better.
Is it because of the overhead of creating the two threads, each processing the same or, well, half the length of the list?
I tried profiling the code but am not really sure how to see how many threads are being created, the memory usage of each thread, etc.
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

func BubbleSortVanilla(intList []int) {
	for i := 0; i < len(intList)-1; i += 1 {
		if intList[i] > intList[i+1] {
			intList[i], intList[i+1] = intList[i+1], intList[i]
		}
	}
}

func BubbleSortOdd(intList []int, wg *sync.WaitGroup, c chan []int) {
	for i := 1; i < len(intList)-2; i += 2 {
		if intList[i] > intList[i+1] {
			intList[i], intList[i+1] = intList[i+1], intList[i]
		}
	}
	wg.Done()
}

func BubbleSortEven(intList []int, wg *sync.WaitGroup, c chan []int) {
	for i := 0; i < len(intList)-1; i += 2 {
		if intList[i] > intList[i+1] {
			intList[i], intList[i+1] = intList[i+1], intList[i]
		}
	}
	wg.Done()
}

func ConcurrentBubbleSort(intList []int, wg *sync.WaitGroup, c chan []int) {
	for i := 0; i < len(intList)-1; i += 1 {
		if intList[i] > intList[i+1] {
			intList[i], intList[i+1] = intList[i+1], intList[i]
		}
	}
	wg.Done()
}

func main() {
	// defer profile.Start(profile.MemProfile).Stop()
	rand.Seed(time.Now().Unix())
	intList := rand.Perm(100000)
	fmt.Println("Read a sequence of", len(intList), "elements")
	c := make(chan []int, len(intList))
	var wg sync.WaitGroup
	start := time.Now()
	for j := 0; j < len(intList)-1; j++ {
		// BubbleSortVanilla(intList) // takes roughly 15s
		// wg.Add(2)
		// go BubbleSortOdd(intList, &wg, c) // takes roughly 7s
		// go BubbleSortEven(intList, &wg, c)
		wg.Add(1)
		go ConcurrentBubbleSort(intList, &wg, c) // takes roughly 1.4s
	}
	wg.Wait()
	elapsed := time.Since(start)
	// Print the sorted integers
	fmt.Println("Sorted List: ", len(intList), "in", elapsed)
}
Your code is not working at all. Both ConcurrentBubbleSort and BubbleSortOdd + BubbleSortEven cause a data race. Try running your code with go run -race main.go. Because of the data race, the array's data will be incorrect after the sort, and it isn't sorted either.
Why is it slow? I guess it is because of the data race, and because there are too many goroutines, which are what cause the data race.
The Thread Analyzer detects data races that occur during the execution of a multi-threaded process. A data race occurs when:
- two or more threads in a single process access the same memory location concurrently, and
- at least one of the accesses is for writing, and
- the threads are not using any exclusive locks to control their accesses to that memory.
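For reference, here is a sketch (my function name, not the poster's) of how the odd-even idea can be made race-free: within one phase every compared pair is disjoint, so the swaps can run concurrently, and a WaitGroup barrier separates the phases.
package main

import (
	"fmt"
	"math/rand"
	"sort"
	"sync"
)

// oddEvenSort runs len(a) phases of odd-even transposition sort.
// Within one phase all compared pairs are disjoint, so the swaps can
// run concurrently without a data race; the WaitGroup acts as a
// barrier between phases.
func oddEvenSort(a []int) {
	n := len(a)
	for phase := 0; phase < n; phase++ {
		var wg sync.WaitGroup
		for i := phase % 2; i+1 < n; i += 2 {
			wg.Add(1)
			go func(i int) {
				defer wg.Done()
				if a[i] > a[i+1] {
					a[i], a[i+1] = a[i+1], a[i]
				}
			}(i)
		}
		wg.Wait() // barrier: next phase starts only after all swaps finish
	}
}

func main() {
	a := rand.Perm(1000)
	oddEvenSort(a)
	fmt.Println("sorted:", sort.IntsAreSorted(a))
}
Spawning a goroutine per comparison is far too fine-grained to be fast, but it shows the synchronization that the original code is missing.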

Unable to use goroutines concurrently to find max until context is cancelled

I have successfully made a synchronous solution, without goroutines, that finds the max of the compute calls.
package main

import (
	"context"
	"fmt"
	"math/rand"
	"time"
)

func findMax(ctx context.Context, concurrency int) uint64 {
	var (
		max uint64 = 0
		num uint64 = 0
	)
	for i := 0; i < concurrency; i++ {
		num = compute()
		if num > max {
			max = num
		}
	}
	return max
}

func compute() uint64 {
	// NOTE: This is a MOCK implementation of the blocking operation.
	time.Sleep(time.Duration(rand.Int63n(100)) * time.Millisecond)
	return rand.Uint64()
}

func main() {
	maxDuration := 2 * time.Second
	concurrency := 10
	ctx, cancel := context.WithTimeout(context.Background(), maxDuration)
	defer cancel()
	max := findMax(ctx, concurrency)
	fmt.Println(max)
}
https://play.golang.org/p/lYXRNTDtNCI
When I attempt to use goroutines in findMax to repeatedly call the compute function, using as many goroutines as possible until the context ctx is cancelled by the caller (the main function), I get 0 every time instead of the expected max of the goroutines' compute calls. I have tried different ways to do it and get a deadlock most of the time.
package main

import (
	"context"
	"fmt"
	"math/rand"
	"time"
)

func findMax(ctx context.Context, concurrency int) uint64 {
	var (
		max uint64 = 0
		num uint64 = 0
	)
	for i := 0; i < concurrency; i++ {
		select {
		case <-ctx.Done():
			return max
		default:
			go func() {
				num = compute()
				if num > max {
					max = num
				}
			}()
		}
	}
	return max
}

func compute() uint64 {
	// NOTE: This is a MOCK implementation of the blocking operation.
	time.Sleep(time.Duration(rand.Int63n(100)) * time.Millisecond)
	return rand.Uint64()
}

func main() {
	maxDuration := 2 * time.Second
	concurrency := 10
	ctx, cancel := context.WithTimeout(context.Background(), maxDuration)
	defer cancel()
	max := findMax(ctx, concurrency)
	fmt.Println(max)
}
https://play.golang.org/p/3fFFq2xlXAE
Your program has multiple problems:
- You are spawning multiple goroutines that operate on shared variables, i.e., max and num, leading to a data race because they are not protected (e.g., by a mutex).
- Here num is modified by every worker goroutine, but it should be local to each worker; otherwise a computed result can be lost (e.g., one worker goroutine computes a result and stores it in num, but right after that a second worker computes and replaces that value):
  num = compute() // should be "num := compute()"
- You are not waiting for every goroutine to finish its computation, which may produce incorrect results because not every worker's computation is taken into account even though the context wasn't cancelled. Use sync.WaitGroup or channels to fix this.
Here's a sample program that addresses most of the issues in your code:
package main

import (
	"context"
	"fmt"
	"math/rand"
	"sync"
	"time"
)

type result struct {
	sync.RWMutex
	max uint64
}

func findMax(ctx context.Context, workers int) uint64 {
	var (
		res = result{}
		wg  = sync.WaitGroup{}
	)
	for i := 0; i < workers; i++ {
		select {
		case <-ctx.Done():
			// RLock to read res.max
			res.RLock()
			ret := res.max
			res.RUnlock()
			return ret
		default:
			wg.Add(1)
			go func() {
				defer wg.Done()
				num := compute()
				// Lock so that the read from res.max and the write
				// to res.max are safe. Otherwise, a data race could
				// occur.
				res.Lock()
				if num > res.max {
					res.max = num
				}
				res.Unlock()
			}()
		}
	}
	// Wait for all the goroutines to finish their work, i.e., all
	// workers are done computing and updating the max.
	wg.Wait()
	return res.max
}

func compute() uint64 {
	rnd := rand.Int63n(100)
	time.Sleep(time.Duration(rnd) * time.Millisecond)
	return rand.Uint64()
}

func main() {
	maxDuration := 2 * time.Second
	concurrency := 10
	ctx, cancel := context.WithTimeout(context.Background(), maxDuration)
	defer cancel()
	fmt.Println(findMax(ctx, concurrency))
}
As @Brits pointed out in the comments, when the context is cancelled, make sure the worker goroutines stop processing (if possible), because their work is no longer needed. A context-aware sketch follows.
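To illustrate that last point, here is one possible sketch; computeCtx is a hypothetical context-aware variant of compute (not in the original code) that gives up as soon as the context is cancelled:
package main

import (
	"context"
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// computeCtx is a hypothetical context-aware variant of compute: it
// returns early (ok == false) if ctx is cancelled before the mock
// blocking operation finishes.
func computeCtx(ctx context.Context) (uint64, bool) {
	select {
	case <-time.After(time.Duration(rand.Int63n(100)) * time.Millisecond):
		return rand.Uint64(), true
	case <-ctx.Done():
		return 0, false
	}
}

func findMax(ctx context.Context, workers int) uint64 {
	var (
		mu  sync.Mutex
		max uint64
		wg  sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			num, ok := computeCtx(ctx)
			if !ok {
				return // cancelled: nothing to report
			}
			mu.Lock()
			if num > max {
				max = num
			}
			mu.Unlock()
		}()
	}
	wg.Wait()
	return max
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	fmt.Println(findMax(ctx, 10))
}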

sync.Map seems not safe concurrent read/write

I tested sync.Map from the Go standard library. It seems that concurrent reads and writes are not safe.
What's wrong?
Test code:
package main

import (
	"log"
	"sync"
)

func main() {
	var m sync.Map
	m.Store("count", 0)
	var wg sync.WaitGroup
	for numOfThread := 0; numOfThread < 10; numOfThread++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				value, ok := m.Load("count")
				if !ok {
					log.Println("load count error")
				} else {
					v, _ := value.(int)
					m.Store("count", v+1)
				}
			}
		}()
	}
	log.Println("threads starts")
	wg.Wait()
	value, ok := m.Load("count")
	if ok {
		v, _ := value.(int)
		log.Printf("final count: %d", v)
	}
	log.Println("all done")
}
https://play.golang.org/p/E-pw4iZUceB
The result should be 10000, but I get a random number instead of 10000:
2009/11/10 23:00:00 threads starts
2009/11/10 23:00:00 final count: 6696
2009/11/10 23:00:00 all done
You have a race condition:
value, ok := m.Load("count")
...
v, _ := value.(int)
m.Store("count", v+1)
The read-modify-store sequence above does not prevent other goroutines from doing the same thing in between, so some of the increments performed by other goroutines are missed.
sync.Map protects concurrent access to its members. That means a write to the map will not cause other goroutines to read an inconsistent map. But if you read-modify-write, nothing prevents other goroutines from updating the value at the same time. You need a mutex that covers the whole read-modify-update sequence.
package main

import (
	"log"
	"sync"
)

func main() {
	var m sync.Map
	m.Store("count", 0)
	var wg sync.WaitGroup
	var mu sync.Mutex // a value, not a nil *sync.Mutex, so Lock works
	for numOfThread := 0; numOfThread < 10; numOfThread++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				mu.Lock() // the mutex covers the whole read-modify-write
				value, ok := m.Load("count")
				if !ok {
					log.Println("load count error")
				} else {
					v, _ := value.(int)
					m.Store("count", v+1)
				}
				mu.Unlock()
			}
		}()
	}
	log.Println("threads starts")
	wg.Wait()
	value, ok := m.Load("count")
	if ok {
		v, _ := value.(int)
		log.Printf("final count: %d", v)
	}
	log.Println("all done")
}
Both the read and the write operations are thread-safe on their own, but you are attempting an upsert (a combined read+write operation), which is not thread-safe as a whole. The code above is modified to make it thread-safe.
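As an aside, when the shared value is just a counter, a sketch like the following avoids both sync.Map and the extra mutex by making the read-modify-write itself atomic:
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	var count int64
	var wg sync.WaitGroup
	for t := 0; t < 10; t++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				// AddInt64 performs the whole read-modify-write as
				// one atomic operation, so no increment is lost.
				atomic.AddInt64(&count, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Println("final count:", count) // always 10000
}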

How to set a max number of goroutines to complete the work?

var wg sync.WaitGroup
wg.Add(len(work))
sem := make(chan struct{}, 10)
wgDone := make(chan bool)
for i := 0; i < len(work); i++ {
	go func(i int) { // pass i so each goroutine uses its own copy
		defer wg.Done()
		sem <- struct{}{} // acquire one of the 10 slots
		defer func() {
			<-sem // release the slot
		}()
		worker(work[i])
	}(i)
}
go func() {
	wg.Wait()
	close(wgDone)
}()
I only want 10 goroutines at a time performing the work. This is my current solution: it blocks goroutines from continuing so that only 10 run at a time. How can I change it so that it doesn't create an abundance of goroutines that sit blocked waiting to work, and instead creates only 10 that complete all the work?
Based on the use case, one of these methods may be useful:
1. Using a max number of new goroutines and a channel as a job queue (The Go playground):
package main

import (
	"fmt"
	"sync"
)

func main() {
	const max = 10
	queue := make(chan int, max)
	wg := &sync.WaitGroup{}
	for i := 0; i < max; i++ {
		wg.Add(1)
		go worker(wg, queue)
	}
	for i := 0; i < 100; i++ {
		queue <- i
	}
	close(queue)
	wg.Wait()
	fmt.Println("Done")
}

func worker(wg *sync.WaitGroup, queue chan int) {
	defer wg.Done()
	for job := range queue {
		fmt.Print(job, " ") // a job
	}
}
2. Using a buffered channel as a semaphore to limit the number of new goroutines to max (The Go playground):
package main

import (
	"fmt"
	"sync"
)

func main() {
	const max = 10
	semaphore := make(chan struct{}, max)
	wg := &sync.WaitGroup{}
	for i := 0; i < 1000; i++ {
		semaphore <- struct{}{} // acquire
		wg.Add(1)
		go limited(i, wg, semaphore)
	}
	wg.Wait()
	fmt.Println("Done")
}

func limited(i int, wg *sync.WaitGroup, semaphore chan struct{}) {
	defer wg.Done()
	fmt.Println("i =", i) // a job
	<-semaphore           // release
}
3. Using a buffered channel as a semaphore to limit the number of concurrent jobs to max; here the number of goroutines can exceed max (The Go playground):
package main

import (
	"fmt"
	"sync"
)

func main() {
	const max = 10
	semaphore := make(chan struct{}, max)
	wg := &sync.WaitGroup{}
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go limited(i, wg, semaphore)
	}
	wg.Wait()
	fmt.Println("Done")
}

func limited(i int, wg *sync.WaitGroup, semaphore chan struct{}) {
	defer wg.Done()
	semaphore <- struct{}{} // acquire
	fmt.Println("i =", i)   // a job
	<-semaphore             // release
}
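Another option (not from the methods above) is the golang.org/x/sync/semaphore package; a sketch, assuming the x/sync module is available:
package main

import (
	"context"
	"fmt"

	"golang.org/x/sync/semaphore"
)

func main() {
	const max = 10
	sem := semaphore.NewWeighted(max)
	ctx := context.Background()
	for i := 0; i < 100; i++ {
		// Acquire blocks until one of the max slots is free, so at
		// most max goroutines run at a time.
		if err := sem.Acquire(ctx, 1); err != nil {
			break
		}
		go func(i int) {
			defer sem.Release(1)
			fmt.Println("i =", i) // a job
		}(i)
	}
	// Acquiring all slots waits for the remaining goroutines to finish.
	if err := sem.Acquire(ctx, max); err == nil {
		fmt.Println("Done")
	}
}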
So if you want only, say, 10 workers, you should spawn 10 workers listening on a job queue. The queue can be a channel: you push the inputs onto it and the workers pick them up.
It will now only block when the queue is full, so you can decide the queue size based on your use case:
package main

import (
	"fmt"
	"sync"
)

var jobQ chan int
var wg sync.WaitGroup

func main() {
	jobQ = make(chan int, 100)
	// Spawn 10 workers. Call wg.Add before starting each goroutine
	// so that wg.Wait cannot run ahead of the Add calls.
	for i := 0; i < 10; i++ {
		fmt.Println("Spawn :", i)
		wg.Add(1)
		go worker(jobQ)
	}
	for i := 0; i < 1000; i++ {
		jobQ <- i
	}
	close(jobQ)
	wg.Wait()
}

func worker(jobs chan int) {
	defer wg.Done()
	for job := range jobs {
		fmt.Println(job)
	}
}
Now you can customize this and explore other worker pool implementations; worker pools are used a lot and you will find many different implementations.
Playground: https://play.golang.org/p/lzIMRUCvqR9

Why does this golang program create a memory leak?

I am trying to understand concurrency and goroutines, and I have a couple of questions about the following experimental code:
Why does it create a memory leak? I thought the return at the end of the goroutine would allow the memory associated with it to be cleaned up.
Why do my loops almost never reach 999? In fact, when I send the output to a file and study it, I notice that it rarely prints double-digit integers: the first "99" appears on line 2461 of the output, and the first "999" on line 6120. This behavior is unexpected to me, which clearly means I don't really understand what is going on with goroutine scheduling.
Disclaimer:
Be careful running the code below; it can crash your system if you don't stop it after a few seconds!
CODE
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	for {
		// spawn four worker goroutines
		spawnWorkers(4, wg)
		// wait for the workers to finish
		wg.Wait()
	}
}

func spawnWorkers(max int, wg sync.WaitGroup) {
	for n := 0; n < max; n++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			f(n)
			return
		}()
	}
}

func f(n int) {
	for i := 0; i < 1000; i++ {
		fmt.Println(n, ":", i)
	}
}
Thanks to Tim Cooper, JimB, and Greg for their helpful comments. The corrected version of the code is posted below for reference.
The two fixes were to pass the WaitGroup by reference, which fixed the memory leak, and to pass n correctly into the anonymous goroutine.
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	for {
		// spawn four worker goroutines
		spawnWorkers(4, &wg)
		// wait for the workers to finish
		wg.Wait()
	}
}

func spawnWorkers(max int, wg *sync.WaitGroup) {
	for n := 0; n < max; n++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			f(n)
			return
		}(n)
	}
}

func f(n int) {
	for i := 0; i < 1000; i++ {
		fmt.Println(n, ":", i)
	}
}
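To make the leak mechanism visible, here is a small demonstration sketch (mine, not from the thread) that keeps the buggy by-value signature: Done decrements a copy, so main's Wait returns immediately and goroutines pile up faster than they finish, which runtime.NumGoroutine shows growing. go vet's copylocks check also flags a sync.WaitGroup passed by value.
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

func spawn(wg sync.WaitGroup) { // bug kept on purpose: wg is a copy
	wg.Add(1)
	go func() {
		defer wg.Done() // decrements the copy, not main's WaitGroup
		time.Sleep(time.Second) // simulate work
	}()
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		for j := 0; j < 1000; j++ {
			spawn(wg)
		}
		wg.Wait() // returns immediately: main's counter was never incremented
		fmt.Println("goroutines:", runtime.NumGoroutine())
		time.Sleep(100 * time.Millisecond)
	}
}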

Resources