Gracefully closing channel and not sending on closed channel - go

I am new to Golang concurrency and have been working to understand the piece of code below.
I have observed a few things that I am unable to explain:
When the loop bound is 100000 or less (for i <= 100000 { in main), it sometimes prints different values for nResults and jobWrites in the last two statements:
fmt.Printf("number of result writes %d\n", nResults)
fmt.Printf("Number of job writes %d\n", jobWrites)
When i goes above 1000000, it panics with panic: send on closed channel.
How can I make sure that values are never sent to jobs after it has been closed, and that results is closed only after all values have been received, without causing a deadlock?
package main

import (
    "fmt"
    "sync"
)

func worker(wg *sync.WaitGroup, id int, jobs <-chan int, results chan<- int, countWrites *int64) {
    defer wg.Done()
    for j := range jobs {
        *countWrites += 1
        go func(j int) {
            if j%2 == 0 {
                results <- j * 2
            } else {
                results <- j
            }
        }(j)
    }
}

func main() {
    wg := &sync.WaitGroup{}
    jobs := make(chan int)
    results := make(chan int)

    var i int = 1
    var jobWrites int64 = 0
    for i <= 10000000 {
        go func(j int) {
            if j%2 == 0 {
                i += 99
                j += 99
            }
            jobWrites += 1
            jobs <- j
        }(i)
        i += 1
    }

    var nResults int64 = 0
    for w := 1; w < 1000; w++ {
        wg.Add(1)
        go worker(wg, w, jobs, results, &nResults)
    }
    close(jobs)
    wg.Wait()

    var sum int32 = 0
    var count int64 = 0
    for r := range results {
        count += 1
        sum += int32(r)
        if count == nResults {
            close(results)
        }
    }
    fmt.Println(sum)
    fmt.Printf("number of result writes %d\n", nResults)
    fmt.Printf("Number of job writes %d\n", jobWrites)
}

Quite a few problems in your code.
Sending on closed channel
One general principle of using Go channels is
don't close a channel from the receiver side and don't close a channel if the channel has multiple concurrent senders
(https://go101.org/article/channel-closing.html)
The solution for you is simple: don't have multiple concurrent senders, and then you can close the channel from the sender side.
Instead of starting millions of separate goroutines, one for each job you add to the channel, run a single goroutine that executes the whole loop to add all jobs to the channel, and close the channel after the loop. The workers will consume the channel as fast as they can.
Data races by modifying shared variables in multiple goroutines
You're modifying two shared variables without taking special steps:
1. nResults, which you pass as the countWrites *int64 parameter to the worker.
2. i in the loop that writes to the jobs channel: you're adding 99 to it from multiple goroutines, making it unpredictable how many values you actually write to the jobs channel.
To solve 1, there are many options, including a sync.Mutex. However, since you're only incrementing the counter, the easiest solution is to use atomic.AddInt64(countWrites, 1) instead of *countWrites += 1.
To solve 2, don't use one goroutine per write to the channel, but one goroutine for the entire loop (see above)
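Putting both fixes together, a minimal sketch of the corrected structure could look like this: one sender goroutine writes all the jobs and then closes jobs, the workers count result writes with atomic.AddInt64, and results is closed only after wg.Wait(). The loop bound and worker count below are illustrative only:

package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

func worker(wg *sync.WaitGroup, jobs <-chan int, results chan<- int, countWrites *int64) {
    defer wg.Done()
    for j := range jobs {
        atomic.AddInt64(countWrites, 1)
        if j%2 == 0 {
            results <- j * 2
        } else {
            results <- j
        }
    }
}

func main() {
    wg := &sync.WaitGroup{}
    jobs := make(chan int)
    results := make(chan int)

    var jobWrites int64

    // Single sender: only this goroutine writes to jobs, so it may close it.
    go func() {
        for i := 1; i <= 100000; i++ {
            jobWrites++
            jobs <- i
        }
        close(jobs)
    }()

    var nResults int64
    for w := 1; w < 1000; w++ {
        wg.Add(1)
        go worker(wg, jobs, results, &nResults)
    }

    // Close results only after every worker has stopped sending.
    go func() {
        wg.Wait()
        close(results)
    }()

    var sum int64
    for r := range results {
        sum += int64(r)
    }
    fmt.Println(sum)
    fmt.Printf("number of result writes %d\n", nResults)
    fmt.Printf("Number of job writes %d\n", jobWrites)
}

Because close(results) happens strictly after wg.Wait() returns, no worker can send on a closed channel, and the range over results terminates on its own instead of needing a count-based close from the receiver side.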

Related

How to signal if a value has been read from a channel in Go

I am reading values that are put into a channel ch via an infinite for. I would like some way to signal that a value has been read and operated upon (via the sq result) and to add it to some sort of counter variable upon success. That way I can check whether my channel has been exhausted and properly exit my infinite for loop.
Currently the counter increments regardless of whether a value was read, causing the loop to exit early when counter == num. I only want it to count when the value has been squared.
EDIT: Another approach I have tested is to receive the ok value from the channel when setting val and then check if !ok { break }. However, I receive a deadlock panic since the for loop does not properly break. Example here: https://go.dev/play/p/RYNtTix2nm2
package main

import "fmt"

func main() {
    num := 5
    // Buffered channel with 5 values.
    ch := make(chan int, num)
    defer close(ch)
    for i := 0; i < num; i++ {
        go func(val int) {
            fmt.Printf("Added value: %d to the channel\n", val)
            ch <- val
        }(i)
    }
    // Read from our channel infinitely and increment each time a value has been read and operated upon
    counter := 0
    for {
        // Check our counter and if its == num then break the infinite loop
        if counter == num {
            break
        }
        val := <-ch
        counter++
        go func(i int) {
            // I'd like to verify a value was read from ch & it was processed before I increment the counter
            sq := i * i
            fmt.Println(sq)
        }(val)
    }
}
Let me try to help you figure out the issue.
Reading issue
The latest version of the code in your question works, except for the way you read values from the ch channel, i.e. this code snippet:
go func(i int) {
    // I'd like to verify a value was read from ch & it was processed before I increment the counter
    sq := i * i
    fmt.Println(sq)
}(val)
In fact, there is no need to spawn a new goroutine for each read. You can consume the messages as soon as they arrive on the ch channel. This works because the writes are done inside goroutines: thanks to them, the code can go ahead and reach the reading phase without being blocked.
Buffered vs unbuffered
In this scenario you used a buffered channel with 5 slots for data. However, if you rely on a buffered channel, you should signal when you have finished sending data into it, which is done with a close(ch) invocation after all of the goroutines have finished their job. If you use an unbuffered channel, it's fine to invoke defer close(ch) right next to the channel initialization; that close is there for cleanup and resource-management purposes. Back to your example: you can change the implementation to use an unbuffered channel.
Final Code
Just to recap, the two small changes you have to make are:
1. Use an unbuffered channel instead of a buffered one.
2. Do not use a goroutine when reading the messages from the channel.
Please be sure to understand exactly what's going on. Another tip: add the statement fmt.Println("NumGoroutine:", runtime.NumGoroutine()) to print the exact number of goroutines running at that specific moment.
The final code:
package main

import (
    "fmt"
    "runtime"
)

func main() {
    num := 5
    // Unbuffered channel.
    ch := make(chan int)
    defer close(ch)
    for i := 0; i < num; i++ {
        go func(val int) {
            fmt.Printf("Added value: %d to the channel\n", val)
            ch <- val
        }(i)
    }
    fmt.Println("NumGoroutine:", runtime.NumGoroutine())
    // Read from our channel infinitely and increment each time a value has been read and operated upon
    counter := 0
    for {
        // Check our counter and if its == num then break the infinite loop
        if counter == num {
            break
        }
        val := <-ch
        counter++
        func(i int) {
            // I'd like to verify a value was read from ch & it was processed before I increment the counter
            sq := i * i
            fmt.Println(sq)
        }(val)
    }
}
Let me know if this helps you, thanks!
package main

import "fmt"

func main() {
    c := make(chan int)
    done := make(chan bool)
    go func() {
        for i := 0; i < 10; i++ {
            c <- i
        }
        close(c)
    }()
    go func() {
        for i := range c {
            fmt.Println(i)
            done <- true
        }
        close(done)
    }()
    for i := 0; i < 10; i++ {
        <-done
    }
}
In this example, the done channel is used to signal that a value has been read from the c channel. After each value is read from c, a signal is sent on the done channel. The main function blocks on the done channel, waiting for a signal before continuing. This ensures that all values from c have been processed before the program terminates.

Deadlock with Multiple goroutines with multiple channels

I am working on a sample program that prints the sum of the odd numbers and the sum of the even numbers between 1 and 100, using goroutines with multiple channels.
You can find my code here.
Output:
sum of even number = 2550
sum of odd number = 2500
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan receive]:
main.print(0x434100, 0x11db7c)
/tmp/sandbox052575152/main.go:18 +0xc0
main.main()
/tmp/sandbox052575152/main.go:14 +0x120
The code works but ends with a deadlock.
I am not sure what is wrong in my code.
We can iterate through the values sent over a channel, but to break out of such an iteration the channel needs to be closed explicitly. Otherwise range blocks forever, just as it would on a nil channel. In your code you didn't close the sum channel (the sumValues channel in the print function). That's why the following function blocks forever:
func print(sumValues <-chan string) {
    for val := range sumValues {
        fmt.Println(val)
    }
}
So you have to close the sum channel in the doSum function after all the goroutines inside doSum have completed (otherwise the sum channel might be closed before the goroutines are done). You can use a sync.WaitGroup for that. See the updated doSum function below:
func doSum(sum chan<- string, oddChan <-chan int, evenChan <-chan int) {
    var waitGroup sync.WaitGroup
    waitGroup.Add(2) // Must wait for 2 calls to 'Done' before moving on

    go func(sum chan<- string) {
        s1 := 0
        for val := range oddChan {
            s1 += val
        }
        sum <- fmt.Sprint("sum of odd number = ", s1)
        waitGroup.Done()
    }(sum)

    go func(sum chan<- string) {
        s1 := 0
        for val := range evenChan {
            s1 += val
        }
        sum <- fmt.Sprint("sum of even number = ", s1)
        waitGroup.Done()
    }(sum)

    // Waiting for all goroutines to exit
    waitGroup.Wait()
    // all goroutines are complete, now close the sum channel
    close(sum)
}
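Since the question's full code is only linked, here is a hypothetical sketch of how the rest of the file could be wired up around the updated doSum. The generate helper and the 1..100 split are my assumptions, and the file also needs the fmt and sync imports at the top:

// generate feeds 1..100 into the odd or even channel and then closes both,
// so the range loops inside doSum can terminate. (Hypothetical helper.)
func generate(oddChan chan<- int, evenChan chan<- int) {
    for i := 1; i <= 100; i++ {
        if i%2 == 0 {
            evenChan <- i
        } else {
            oddChan <- i
        }
    }
    close(oddChan)
    close(evenChan)
}

func print(sumValues <-chan string) {
    for val := range sumValues {
        fmt.Println(val)
    }
}

func main() {
    oddChan := make(chan int)
    evenChan := make(chan int)
    sum := make(chan string)

    go generate(oddChan, evenChan)
    go doSum(sum, oddChan, evenChan) // doSum as defined above
    print(sum)                       // returns once doSum closes sum
}

With both input channels closed by the producer and sum closed by doSum after its WaitGroup, every range loop has a termination condition and the deadlock disappears.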

How to prevent deadlocks without using sync.WaitGroup?

concurrent.go:
package main

import (
    "fmt"
    "sync"
)

// JOBS represents the number of jobs workers do
const JOBS = 2

// WORKERS represents the number of workers
const WORKERS = 5

func work(in <-chan int, out chan<- int, wg *sync.WaitGroup) {
    for n := range in {
        out <- n * n
    }
    wg.Done()
}

var wg sync.WaitGroup

func main() {
    in := make(chan int, JOBS)
    out := make(chan int, JOBS)

    for w := 1; w <= WORKERS; w++ {
        wg.Add(1)
        go work(in, out, &wg)
    }

    for j := 1; j <= JOBS; j++ {
        in <- j
    }
    close(in)

    wg.Wait()
    close(out)

    for r := range out {
        fmt.Println("result:", r)
    }

    // This is a solution but I want to do it with `range out`
    // and also without WaitGroups
    // for r := 1; r <= JOBS; r++ {
    //     fmt.Println("result:", <-out)
    // }
}
Example is here on goplay.
Goroutines run concurrently and independently. Spec: Go statements:
A "go" statement starts the execution of a function call as an independent concurrent thread of control, or goroutine, within the same address space.
If you want to use for range to receive values from the out channel, that means the out channel can only be closed once all goroutines are done sending on it.
Since goroutines run concurrently and independently, without synchronization you can't have this.
Using WaitGroup is one mean, one way to do it (to ensure we wait all goroutines to do their job before closing out).
Your commented code is another way of doing that: it receives exactly as many values from the channel as the goroutines ought to send on it, which is only possible if all goroutines do send their values. The synchronization here is provided by the send statements and receive operations.
Notes:
Usually, receiving results from the channel is done asynchronously, in a dedicated goroutine, or even using multiple goroutines. Doing so, you are not required to use channels with buffers capable of holding all the results. You will still need synchronization to wait for all workers to finish their job; you can't avoid this due to the concurrent and independent nature of goroutine scheduling and execution.
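As a rough sketch of that note, reusing the JOBS and WORKERS constants from the question and adding a hypothetical done channel to wait for the receiver, the results can be received asynchronously in a dedicated goroutine, so no buffering is required:

package main

import (
    "fmt"
    "sync"
)

const JOBS = 2
const WORKERS = 5

func work(in <-chan int, out chan<- int, wg *sync.WaitGroup) {
    defer wg.Done()
    for n := range in {
        out <- n * n
    }
}

func main() {
    var wg sync.WaitGroup
    in := make(chan int)  // unbuffered is fine now
    out := make(chan int) // unbuffered is fine now

    for w := 1; w <= WORKERS; w++ {
        wg.Add(1)
        go work(in, out, &wg)
    }

    // Dedicated goroutine that receives results asynchronously.
    done := make(chan struct{})
    go func() {
        for r := range out {
            fmt.Println("result:", r)
        }
        close(done)
    }()

    for j := 1; j <= JOBS; j++ {
        in <- j
    }
    close(in)

    wg.Wait()  // all workers have finished sending
    close(out) // now the receiver's range loop can end
    <-done     // wait for the receiver to drain everything
}

The WaitGroup still provides the synchronization point: only after wg.Wait() returns is it safe to close(out), which in turn lets the receiving goroutine's range loop finish.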

Multiple goroutines listening on one channel

I have multiple goroutines trying to receive on the same channel simultaneously. It seems like the last goroutine that starts receiving on the channel gets the value. Is this somewhere in the language spec or is it undefined behaviour?
c := make(chan string)
for i := 0; i < 5; i++ {
    go func(i int) {
        <-c
        c <- fmt.Sprintf("goroutine %d", i)
    }(i)
}
c <- "hi"
fmt.Println(<-c)
Output:
goroutine 4
Example On Playground
EDIT:
I just realized that it's more complicated than I thought. The message gets passed around all the goroutines.
c := make(chan string)
for i := 0; i < 5; i++ {
    go func(i int) {
        msg := <-c
        c <- fmt.Sprintf("%s, hi from %d", msg, i)
    }(i)
}
c <- "original"
fmt.Println(<-c)
Output:
original, hi from 0, hi from 1, hi from 2, hi from 3, hi from 4
NOTE: the above output is outdated in more recent versions of Go (see comments)
Example On Playground
Yes, it's complicated, but there are a couple of rules of thumb that should make things feel much more straightforward.
Prefer passing channels to goroutines as formal arguments instead of accessing channels in global scope. You get more compiler checking this way, and better modularity too.
Avoid both reading and writing on the same channel in a particular goroutine (including the 'main' one). Otherwise, deadlock is a much greater risk.
Here's an alternative version of your program, applying these two guidelines. This case demonstrates many writers & one reader on a channel:
c := make(chan string)
for i := 1; i <= 5; i++ {
    go func(i int, co chan<- string) {
        for j := 1; j <= 5; j++ {
            co <- fmt.Sprintf("hi from %d.%d", i, j)
        }
    }(i, c)
}
for i := 1; i <= 25; i++ {
    fmt.Println(<-c)
}
http://play.golang.org/p/quQn7xePLw
It creates five goroutines writing to a single channel, each one writing five times. The main goroutine reads all twenty-five messages - you may notice that the order they appear in is often not sequential (i.e. the concurrency is evident).
This example demonstrates a feature of Go channels: it is possible to have multiple writers sharing one channel; Go will interleave the messages automatically.
The same applies for one writer and multiple readers on one channel, as seen in the second example here:
c := make(chan int)
var w sync.WaitGroup
w.Add(5)
for i := 1; i <= 5; i++ {
    go func(i int, ci <-chan int) {
        j := 1
        for v := range ci {
            time.Sleep(time.Millisecond)
            fmt.Printf("%d.%d got %d\n", i, j, v)
            j += 1
        }
        w.Done()
    }(i, c)
}
for i := 1; i <= 25; i++ {
    c <- i
}
close(c)
w.Wait()
This second example includes a wait imposed on the main goroutine, which would otherwise exit promptly and cause the other five goroutines to be terminated early (thanks to olov for this correction).
In both examples, no buffering was needed. It is generally a good principle to view buffering as a performance enhancer only. If your program does not deadlock without buffers, it won't deadlock with buffers either (but the converse is not always true). So, as another rule of thumb, start without buffering then add it later as needed.
Late reply, but I hope this helps others in the future who need patterns like long polling, a "global" button, or broadcasting to everyone.
Effective Go explains the issue:
Receivers always block until there is data to receive.
That means that you cannot have more than 1 goroutine listening to 1 channel and expect ALL goroutines to receive the same value.
Run this Code Example.
package main

import "fmt"

func main() {
    c := make(chan int)
    for i := 1; i <= 5; i++ {
        go func(i int) {
            for v := range c {
                fmt.Printf("count %d from goroutine #%d\n", v, i)
            }
        }(i)
    }
    for i := 1; i <= 25; i++ {
        c <- i
    }
    close(c)
}
You will not see "count 1" more than once even though there are 5 goroutines listening to the channel. This is because when the first goroutine blocks the channel all other goroutines must wait in line. When the channel is unblocked, the count has already been received and removed from the channel so the next goroutine in line gets the next count value.
I've studied existing solutions and created a simple broadcast library: https://github.com/grafov/bcast.
group := bcast.NewGroup() // you created the broadcast group
go bcast.Broadcasting(0) // the group accepts messages and broadcast it to all members
member := group.Join() // then you join member(s) from other goroutine(s)
member.Send("test message") // or send messages of any type to the group
member1 := group.Join() // then you join member(s) from other goroutine(s)
val := member1.Recv() // and for example listen for messages
It is complicated.
Also, see what happens with GOMAXPROCS = NumCPU+1. For example,
package main

import (
    "fmt"
    "runtime"
)

func main() {
    runtime.GOMAXPROCS(runtime.NumCPU() + 1)
    fmt.Print(runtime.GOMAXPROCS(0))

    c := make(chan string)
    for i := 0; i < 5; i++ {
        go func(i int) {
            msg := <-c
            c <- fmt.Sprintf("%s, hi from %d", msg, i)
        }(i)
    }
    c <- ", original"
    fmt.Println(<-c)
}
Output:
5, original, hi from 4
And, see what happens with buffered channels. For example,
package main

import "fmt"

func main() {
    c := make(chan string, 5+1)
    for i := 0; i < 5; i++ {
        go func(i int) {
            msg := <-c
            c <- fmt.Sprintf("%s, hi from %d", msg, i)
        }(i)
    }
    c <- "original"
    fmt.Println(<-c)
}
Output:
original
You should be able to explain these cases too.
Yes, it is possible for multiple goroutines to listen on one channel. The key point is the message itself; you can define a message like this:
package main

import (
    "fmt"
    "sync"
)

type obj struct {
    msg      string
    receiver int
}

func main() {
    ch := make(chan *obj) // buffered or unbuffered both work
    var wg sync.WaitGroup
    receiver := 25 // specify receiver count

    sender := func() {
        o := &obj{
            msg:      "hello everyone!",
            receiver: receiver,
        }
        ch <- o
    }

    recv := func(idx int) {
        defer wg.Done()
        o := <-ch
        fmt.Printf("%d received at %d\n", idx, o.receiver)
        o.receiver--
        if o.receiver > 0 {
            ch <- o // forward to others
        } else {
            fmt.Printf("last receiver: %d\n", idx)
        }
    }

    go sender()
    for i := 0; i < receiver; i++ {
        wg.Add(1)
        go recv(i)
    }
    wg.Wait()
}
The output is random:
5 received at 25
24 received at 24
6 received at 23
7 received at 22
8 received at 21
9 received at 20
10 received at 19
11 received at 18
12 received at 17
13 received at 16
14 received at 15
15 received at 14
16 received at 13
17 received at 12
18 received at 11
19 received at 10
20 received at 9
21 received at 8
22 received at 7
23 received at 6
2 received at 5
0 received at 4
1 received at 3
3 received at 2
4 received at 1
last receiver: 4
Quite an old question, but nobody mentioned this, I think.
First, the outputs of both examples can differ if you run the code many times. This is not related to the Go version.
The output of the 1st example can be goroutine 4, goroutine 0, goroutine 1, ... In fact, any of the goroutines can be the one that sends the string to the main goroutine.
The main goroutine is one of the goroutines, so it is also waiting for data from the channel.
Which goroutine should receive the data? Nobody knows. It's not in the language spec.
The output of the 2nd example can also be anything:
(I added the square brackets just for clarity)
// [original, hi from 4]
// [[[[[original, hi from 4], hi from 0], hi from 2], hi from 1], hi from 3]
// [[[[[original, hi from 4], hi from 1], hi from 0], hi from 2], hi from 3]
// [[[[[original, hi from 0], hi from 2], hi from 1], hi from 3], hi from 4]
// [[original, hi from 4], hi from 1]
// [[original, hi from 0], hi from 4]
// [[[original, hi from 4], hi from 1], hi from 0]
// [[[[[original, hi from 4], hi from 1], hi from 0], hi from 3], hi from 2]
// [[[[original, hi from 0], hi from 2], hi from 1], hi from 3]
//
// ......anything can be the output.
This is not magic, nor a mysterious phenomenon.
If there are multiple threads being executed, no one knows exactly which thread will acquire the resource. The language doesn't determine it; rather, the scheduler does (the OS for threads, the Go runtime for goroutines). This is why multithreaded programming is quite complicated.
A goroutine is not an OS thread, but it behaves somewhat similarly.
Using sync.Cond is a good choice.
ref: https://pkg.go.dev/sync#Cond
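For instance, here is a minimal sketch of broadcasting a single value to several waiting goroutines with sync.Cond.Broadcast; the msg and ready names are made up for illustration:

package main

import (
    "fmt"
    "sync"
)

func main() {
    var mu sync.Mutex
    cond := sync.NewCond(&mu)
    msg := ""
    ready := false

    var wg sync.WaitGroup
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            mu.Lock()
            for !ready { // wait until the broadcast condition holds
                cond.Wait()
            }
            fmt.Printf("goroutine %d got %q\n", i, msg)
            mu.Unlock()
        }(i)
    }

    mu.Lock()
    msg = "hello everyone"
    ready = true
    mu.Unlock()
    cond.Broadcast() // wake every waiting goroutine

    wg.Wait()
}

Unlike a channel receive, which hands each value to exactly one receiver, Broadcast wakes every goroutine blocked in cond.Wait(), and each one rechecks the ready flag under the lock before reading msg.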

How do I close a channel multiple goroutines are sending on?

I am attempting to do some computation in parallel. The program is designed so that each worker goroutine sends "pieces" of a solved puzzle back to the controller goroutine that waits to receive and assembles everything sent from the worker routines.
What is the idiomatic Go way to close the single channel? I cannot call close on the channel in each goroutine because then I could possibly send on a closed channel. Likewise, there is no way to predetermine which goroutine will finish first. Is a sync.WaitGroup necessary here?
Here is an example using sync.WaitGroup to do what you are looking for.
It accepts a lengthy list of integers, then sums them all up by handing N parallel workers an equal-sized chunk of the input data. It can be run on the Go playground:
package main

import (
    "fmt"
    "sync"
)

const WorkerCount = 10

func main() {
    // Some input data to operate on.
    // Each worker gets an equal share to work on.
    data := make([]int, WorkerCount*10)
    for i := range data {
        data[i] = i
    }

    // Sum all the entries.
    result := sum(data)
    fmt.Printf("Sum: %d\n", result)
}

// sum adds up the numbers in the given list, by having the operation delegated
// to workers operating in parallel on sub-slices of the input data.
func sum(data []int) int {
    var sum int
    result := make(chan int)
    done := make(chan struct{})

    // Accumulate results from workers until the result channel is closed.
    go func() {
        for value := range result {
            sum += value
        }
        close(done)
    }()

    // The WaitGroup will track completion of all our workers.
    wg := new(sync.WaitGroup)
    wg.Add(WorkerCount)

    // Divide the work up over the number of workers.
    chunkSize := len(data) / WorkerCount

    // Spawn workers.
    for i := 0; i < WorkerCount; i++ {
        go func(i int) {
            offset := i * chunkSize
            worker(result, data[offset:offset+chunkSize])
            wg.Done()
        }(i)
    }

    // Wait for all workers to finish sending, then close the result channel
    // and wait for the accumulator to drain it before returning the result.
    wg.Wait()
    close(result)
    <-done
    return sum
}

// worker sums up the numbers in the given list.
func worker(result chan int, data []int) {
    var sum int
    for _, v := range data {
        sum += v
    }
    result <- sum
}
Yes, this is a perfect use case for sync.WaitGroup.
Your other option is to use one channel per goroutine and a single multiplexer goroutine that feeds from each of those channels into one output channel, but that would get unwieldy fast, so I'd just go with a sync.WaitGroup.
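For completeness, a rough sketch of that second option, with a hypothetical merge helper, one channel per worker, and each worker closing only its own channel so nothing is ever sent on a closed channel:

package main

import "fmt"

// merge drains each input channel in turn and feeds every value into a
// single output channel, closing it once all inputs are exhausted.
func merge(inputs []chan int) <-chan int {
    out := make(chan int)
    go func() {
        for _, in := range inputs {
            for v := range in {
                out <- v
            }
        }
        close(out)
    }()
    return out
}

func main() {
    // One channel per worker; each worker closes only the channel it owns.
    inputs := make([]chan int, 3)
    for i := range inputs {
        inputs[i] = make(chan int)
        go func(i int, ch chan<- int) {
            defer close(ch)
            ch <- i * 10 // a "piece" of the solved puzzle
        }(i, inputs[i])
    }

    for piece := range merge(inputs) {
        fmt.Println("received:", piece)
    }
}

Each channel has exactly one sender, so closing is always safe, but you end up managing a slice of channels plus a multiplexer, which is why the WaitGroup version is usually preferred.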
