How to prevent deadlocks without using sync.WaitGroup? - go

concurrent.go:
package main
import (
"fmt"
"sync"
)
// JOBS represents the number of jobs workers do
const JOBS = 2
// WORKERS represents the number of workers
const WORKERS = 5
func work(in <-chan int, out chan<- int, wg *sync.WaitGroup) {
for n := range in {
out <- n * n
}
wg.Done()
}
var wg sync.WaitGroup
func main() {
in := make(chan int, JOBS)
out := make(chan int, JOBS)
for w := 1; w <= WORKERS; w++ {
wg.Add(1)
go work(in, out, &wg)
}
for j := 1; j <= JOBS; j++ {
in <- j
}
close(in)
wg.Wait()
close(out)
for r := range out {
fmt.Println("result:", r)
}
// This is a solution but I want to do it with `range out`
// and also without WaitGroups
// for r := 1; r <= JOBS; r++ {
// fmt.Println("result:", <-out)
// }
}
Example is here on goplay.

Goroutines run concurrently and independently. Spec: Go statements:
A "go" statement starts the execution of a function call as an independent concurrent thread of control, or goroutine, within the same address space.
If you want to use for range to receive values from the out channel, that means the out channel can only be closed once all goroutines are done sending on it.
Since goroutines run concurrently and independently, without synchronization you can't have this.
Using WaitGroup is one mean, one way to do it (to ensure we wait all goroutines to do their job before closing out).
Your commented code is another way of that: the commented code receives exactly as many values from the channel as many the goroutines ought to send on it, which is only possible if all goroutines do send their values. The synchronization are the send statements and receive operations.
Notes:
Usually receiving results from the channel is done asynchronously, in a dedicated goroutine, or using even multiple goroutines. Doing so you are not required to use channels with buffers capable of buffering all the results. You will still need synchronization to wait for all workers to finish their job, you can't avoid this due to the concurrent and independent nature of gorutine scheduling and execution.

Related

How to signal if a value has been read from a channel in Go

I am reading values that are put into a channel ch via an infinite for. I would like some way to signal if a value has been read and operated upon (via the sq result) and add it to some sort of counter variable upon success. That way I have a way to check if my channel has been exhausted so that I can properly exit my infinite for loop.
Currently it is incrementing regardless if a value was read, thus causing it to exit early when the counter == num. I only want it to count when the value has been squared.
EDIT: Another approach I have tested is to receive the ok val out of the channel upon reading and setting val and then check if !ok { break }. However I receive a deadlock panic since the for did has not properly break. Example here: https://go.dev/play/p/RYNtTix2nm2
package main
import "fmt"
func main() {
num := 5
// Buffered channel with 5 values.
ch := make(chan int, num)
defer close(ch)
for i := 0; i < num; i++ {
go func(val int) {
fmt.Printf("Added value: %d to the channel\n", val)
ch <- val
}(i)
}
// Read from our channel infinitely and increment each time a value has been read and operated upon
counter := 0
for {
// Check our counter and if its == num then break the infinite loop
if counter == num {
break
}
val := <-ch
counter++
go func(i int) {
// I'd like to verify a value was read from ch & it was processed before I increment the counter
sq := i * i
fmt.Println(sq)
}(val)
}
}
let me try to help you in figuring out the issue.
Reading issue
The latest version of the code you put in the question is working except when you're about to read values from the ch channel. I mean with the following code snippet:
go func(i int) {
// I'd like to verify a value was read from ch & it was processed before I increment the counter
sq := i * I
fmt.Println(sq)
}(val)
In fact, it's not needed to spawn a new goroutine for each read. You can consume the messages as soon as they arrived in the ch channel. This is possible due to writing done inside goroutines. Thanks to them, the code can go ahead and reach the reading phase without being blocked.
Buffered vs unbuffered
In this scenario, you used a buffered channel with 5 slots for data. However, if you're relying on the buffered channel you should signal when you finish sending data to it. This is done with a close(ch) invocation after all of the Go routines finish their job. If you use an unbuffered channel it's fine to invoke defer close(ch) next to the channel initialization. In fact, this is done for cleanup and resource optimization tasks. Back to your example, you can change the implementation to use unbuffered channels.
Final Code
Just to recap, the two small changes that you've to do are:
Use an unbuffered channel instead of a buffered one.
Do Not use a Go routine when reading the messages from the channel.
Please be sure to understand exactly what's going on. Another tip can be to issue the statement: fmt.Println("NumGoroutine:", runtime.NumGoroutine()) to print the exact number of Go routines running in that specific moment.
The final code:
package main
import (
"fmt"
"runtime"
)
func main() {
num := 5
// Buffered channel with 5 values.
ch := make(chan int)
defer close(ch)
for i := 0; i < num; i++ {
go func(val int) {
fmt.Printf("Added value: %d to the channel\n", val)
ch <- val
}(i)
}
fmt.Println("NumGoroutine:", runtime.NumGoroutine())
// Read from our channel infinitely and increment each time a value has been read and operated upon
counter := 0
for {
// Check our counter and if its == num then break the infinite loop
if counter == num {
break
}
val := <-ch
counter++
func(i int) {
// I'd like to verify a value was read from ch & it was processed before I increment the counter
sq := i * i
fmt.Println(sq)
}(val)
}
}
Let me know if this helps you, thanks!
package main
import "fmt"
func main() {
c := make(chan int)
done := make(chan bool)
go func() {
for i := 0; i < 10; i++ {
c <- i
}
close(c)
}()
go func() {
for i := range c {
fmt.Println(i)
done <- true
}
close(done)
}()
for i := 0; i < 10; i++ {
<-done
}
}
In this example, the done channel is used to signal that a value has been read from the c channel. After each value is read from c, a signal is sent on the done channel. The main function blocks on the done channel, waiting for a signal before continuing. This ensures that all values from c have been processed before the program terminates.

Gracefully closing channel and not sending on closed channel

I am new to Golang concurrency and have been working to understand this piece of code mentioned below.
I witness few things which I am unable to explain why it happens:
when using i smaller than equal to 100000 for i <= 100000 { in main function, it sometimes prints different values for nResults and countWrites (in last two statements)
fmt.Printf("number of result writes %d\n", nResults) fmt.Printf("Number of job writes %d\n", jobWrites)
when using i more than 1000000 it gives panic: send on closed channel
How can I make sure that the values send to jobs is not on closed channel and later after all values are received in results we can close the channel without deadlock?
package main
import (
"fmt"
"sync"
)
func worker(wg *sync.WaitGroup, id int, jobs <-chan int, results chan<- int, countWrites *int64) {
defer wg.Done()
for j := range jobs {
*countWrites += 1
go func(j int) {
if j%2 == 0 {
results <- j * 2
} else {
results <- j
}
}(j)
}
}
func main() {
wg := &sync.WaitGroup{}
jobs := make(chan int)
results := make(chan int)
var i int = 1
var jobWrites int64 = 0
for i <= 10000000 {
go func(j int) {
if j%2 == 0 {
i += 99
j += 99
}
jobWrites += 1
jobs <- j
}(i)
i += 1
}
var nResults int64 = 0
for w := 1; w < 1000; w++ {
wg.Add(1)
go worker(wg, w, jobs, results, &nResults)
}
close(jobs)
wg.Wait()
var sum int32 = 0
var count int64 = 0
for r := range results {
count += 1
sum += int32(r)
if count == nResults {
close(results)
}
}
fmt.Println(sum)
fmt.Printf("number of result writes %d\n", nResults)
fmt.Printf("Number of job writes %d\n", jobWrites)
}
Quite a few problems in your code.
Sending on closed channel
One general principle of using Go channels is
don't close a channel from the receiver side and don't close a channel if the channel has multiple concurrent senders
(https://go101.org/article/channel-closing.html)
The solution for you is simple: don't have multiple concurrent senders, and then you can close the channel from the sender side.
Instead of starting millions of separate goroutine for each job you add to the channel, run one goroutine that executes the whole loop to add all jobs to the channel. And close the channel after the loop. The workers will consume the channel as fast as they can.
Data races by modifying shared variables in multiple goroutines
You're modifying two shared variables without taking special steps:
nResults, which you pass to the countWrites *int64 in the worker.
i in the loop that writes to the jobs channel: you're adding 99 to it from multiple goroutines, making it unpredictable how many values you actually write to the jobs channel
To solve 1, there are many options, including using sync.Mutex. However since you're just adding to it, the easiest solution is to use atomic.AddInt64(countWrites, 1) instead of *countWrites += 1
To solve 2, don't use one goroutine per write to the channel, but one goroutine for the entire loop (see above)

Channels and Parallelism confusion

I'm learning myself Golang, and I'm a bit confused about parallelism and how it is implemented in Golang.
Given the following example:
package main
import (
"fmt"
"sync"
"math/rand"
"time"
)
const (
workers = 1
rand_count = 5000000
)
func start_rand(ch chan int) {
defer close(ch)
var wg sync.WaitGroup
wg.Add(workers)
rand_routine := func(counter int) {
defer wg.Done()
for i:=0;i<counter;i++ {
seed := time.Now().UnixNano()
rand.Seed(seed)
ch<-rand.Intn(5000)
}
}
for i:=0; i<workers; i++ {
go rand_routine(rand_count/workers)
}
wg.Wait()
}
func main() {
start_time := time.Now()
mychan := make(chan int, workers)
go start_rand(mychan)
var wg sync.WaitGroup
wg.Add(workers)
work_handler := func() {
defer wg.Done()
for {
v, isOpen := <-mychan
if !isOpen { break }
fmt.Println(v)
}
}
for i:=0;i<workers;i++ {
go work_handler()
}
wg.Wait()
elapsed_time := time.Since(start_time)
fmt.Println("Done",elapsed_time)
}
This piece of code takes about one minute to run on my Macbook. I assumed that increasing the "workers" constants, would launch additional go routines, and since my laptop has multiple cores, would shorten the execution time.
This is not the case however. Increasing the workers does not reduce the execution time.
I was thinking that setting workers to 1, would create 1 goroutine to generate the random numbers, and setting it to 4, would create 4 goroutines. Given the multicore nature of my laptop, I was expecting that 4 workers would run on different cores, and therefore, increae the performance.
However, I see increased load on all my cores, even when workers is set to 1. What am I missing here?
Your code has some issues which makes it inherently slow:
You are seeding inside the loop. This needs only to be done once
You are using the same source for random numbers. This source is thread safe, but takes away any performance gains for concurrent workers. You could create a source for each worker with rand.New
You are printing a lot. Printing is thread safe, too. So that takes away any speed gains for concurrent workers.
As Zak already pointed out: The concurrent work inside the go routines is very cheap and the communication is expensive.
You could rewrite your program like that. Then you will see some speed gains when you change the number of workers:
package main
import (
"fmt"
"math/rand"
"time"
)
const (
workers = 1
randCount = 5000000
)
var results = [randCount]int{}
func randRoutine(start, counter int, c chan bool) {
r := rand.New(rand.NewSource(time.Now().UnixNano()))
for i := 0; i < counter; i++ {
results[start+i] = r.Intn(5000)
}
c <- true
}
func main() {
startTime := time.Now()
c := make(chan bool)
start := 0
for w := 0; w < workers; w++ {
go randRoutine(start, randCount/workers, c)
start += randCount / workers
}
for i := 0; i < workers; i++ {
<-c
}
elapsedTime := time.Since(startTime)
for _, i := range results {
fmt.Println(i)
}
fmt.Println("Time calulating", elapsedTime)
elapsedTime = time.Since(startTime)
fmt.Println("Toal time", elapsedTime)
}
This program does a lot of work in a go routine and communicates minimal. Also a different random source is used for each go routine.
Your code does not have just a single routine, even though you set the workers to 1.
There is 1 goroutine from the call go start_rand(...)
That goroutine creates N (worker) routines with go rand_routine(...) and waits for them to finish.
Then you also start N (worker) go routines with go work_handler()
Then you also have 1 goroutine that was started by main() func call.
so: 1 + 2N + 1 routines running for any given N where N == workers.
Plus, on top of that, the work that you are doing in the goroutines is pretty cheap (fast to execute). You are just generating random numbers.
If you look at the blocking and scheduler latency profiles of the program:
You can see from both of the images above that most of the time is spent in the concurrency constructs. This suggests there is a lot of contention in your program. While goroutines are cheap, there is still some blocking and synchronisation that needs to be done when sending a value over a channel. This can take a large proportion of the time of the program when the work being done by the producer is very fast / cheap.
To answer your original question, you see load on many cores because you have more than a single goroutine running.

Recursive concurrency with golang

I'd like to distribute some load across some goroutines. If the number of tasks is known beforehand then it is easy to organize. For example, I could do fan out with a wait group.
nTasks := 100
nGoroutines := 10
// it is important that this channel is not buffered
ch := make(chan *Task)
done := make(chan bool)
var w sync.WaitGroup
// Feed the channel until done
go func () {
for i:= 0; i < nTasks; i++ {
task := getTaskI(i)
ch <- task
}
// as ch is not buffered once everything is read we know we have delivered all of them
for i:=0; i < nGoroutines; i++ {
done <- false
}
}()
for i:= 0; i < nGoroutines; i ++ {
w.Add(1)
go func () {
defer w.Done()
select {
case task := <-ch:
doSomethingWithTask(task)
case <- done:
return
}
}()
}
w.Wait()
// All tasks done, all goroutines closed
However, in my case each task returns more tasks to be done. Say for example a crawler where we receive all the links from the crawled web. My initial hunch was to have a main loop where I track the number of tasks done and tasks pending. When I'm done I send a finish signal to all goroutines:
nGoroutines := 10
ch := make(chan *Task, nGoroutines)
feedBackChannel := make(chan * Task, nGoroutines)
done := make(chan bool)
for i:= 0; i < nGoroutines; i ++ {
go func () {
select {
case task := <-ch:
task.NextTasks = doSomethingWithTask(task)
feedBackChannel <- task
case <- done:
return
}
}()
}
// seed first task
ch <- firstTask
nTasksRemaining := 1
for nTasksRemaining > 0 {
task := <- feedBackChannel
nTasksRemaining -= 1
for _, t := range(task.NextTasks) {
ch <- t
nTasksRemaining++
}
}
for i:=0; i < nGoroutines; i++ {
done <- false
}
However, this produces a deadlock. For example if NextTasks is bigger than the number of goroutines then the main loop will stall when the first tasks finish. But the first tasks can't finish because the feedBack is blocked since the mainLoop is waiting to write.
One "easy" way out of this is to post to the channel asynchronously:
Instead of doing feedBackChannel <- task do go func () {feedBackChannel <- task}(). Now, this feels like an awful hack. Specially since there might be hundred of thousands of tasks.
What would be a nice way to avoid this deadlock? I've searched for concurrency patterns, but mostly are simpler things like fanning out or pipelines where the later stage does not affect the earlier steps.
If I understand your problem correctly, your solution is pretty complex. Here are some points. Hope it helps.
As people mentioned in comments, launching a goroutine is cheap (both memory and switch between them is much cheaper that OS level theread) and you could have hundred thousand of them. Let's assume for some reasons you want to have worker goroutines.
Instead of done channel you could just close ch channel and instead of select you just range over your channel getting tasks.
I don't see the point of separating ch and feedBackChannel just push every task you have into ch and increase its capacity.
As mentioned you may get a deadlock when you trying to enqueue new task. My solution is pretty naive. Just increase its capacity until you are sure that it won't overflow (you could also log warnings if cap(ch) - len(ch) < threshold). If you create a channel (of pointers) with 1 million capacity it will take about 8 * 1e6 ~= 8MB of ram.

How do I close a channel multiple goroutines are sending on?

I am attempting to do some computation in parallel. The program is designed so that each worker goroutine sends "pieces" of a solved puzzle back to the controller goroutine that waits to receive and assembles everything sent from the worker routines.
What is the idomatic Go for closing the single channel? I cannot call close on the channel in each goroutine because then I could possibly send on a closed channel. Likewise, there is no way to predetermine which goroutine will finish first. Is a sync.WaitGroup necessary here?
Here is an example using the sync.WaitGroup to do what you are looking for,
This example accepts a lenghty list of integers, then sums them all up by handing N parallel workers an equal-sized chunk of the input data. It can be run on go playground:
package main
import (
"fmt"
"sync"
)
const WorkerCount = 10
func main() {
// Some input data to operate on.
// Each worker gets an equal share to work on.
data := make([]int, WorkerCount*10)
for i := range data {
data[i] = i
}
// Sum all the entries.
result := sum(data)
fmt.Printf("Sum: %d\n", result)
}
// sum adds up the numbers in the given list, by having the operation delegated
// to workers operating in parallel on sub-slices of the input data.
func sum(data []int) int {
var sum int
result := make(chan int)
defer close(result)
// Accumulate results from workers.
go func() {
for {
select {
case value := <-result:
sum += value
}
}
}()
// The WaitGroup will track completion of all our workers.
wg := new(sync.WaitGroup)
wg.Add(WorkerCount)
// Divide the work up over the number of workers.
chunkSize := len(data) / WorkerCount
// Spawn workers.
for i := 0; i < WorkerCount; i++ {
go func(i int) {
offset := i * chunkSize
worker(result, data[offset:offset+chunkSize])
wg.Done()
}(i)
}
// Wait for all workers to finish, before returning the result.
wg.Wait()
return sum
}
// worker sums up the numbers in the given list.
func worker(result chan int, data []int) {
var sum int
for _, v := range data {
sum += v
}
result <- sum
}
Yes, This is a perfect use case for sync.WaitGroup.
Your other option is to use 1 channel per goroutine and one multiplexer goroutine that feeds from each channel into a single channel. But that would get unwieldy fast so I'd just go with a sync.WaitGroup.

Resources