Better go-idiomatic way of writing this code? - go

Playing around with go, I threw together this code:
package main
import "fmt"
const N = 10
func main() {
ch := make(chan int, N)
done := make(chan bool)
for i := 0; i < N; i++ {
go (func(n int, ch chan int, done chan bool) {
for i := 0; i < N; i++ {
ch <- n*N + i
}
done <- true
})(i, ch, done)
}
numDone := 0
for numDone < N {
select {
case i := <-ch:
fmt.Println(i)
case <-done:
numDone++
}
}
for {
select {
case i := <-ch:
fmt.Println(i)
default:
return
}
}
}
Basically I have N goroutines doing some work and reporting it on the same channel, and I want to know when all of them are done. So I have a second channel, done, on which each worker goroutine sends a message (its value doesn't matter), and this causes main to count that goroutine as done. When the count reaches N, we're actually done.
Is this "good" go? Is there a more go-idiomatic way of doing this?
edit: To clarify a bit, I'm doubtful because the done channel seems to be doing a job that channel closing seems to be for, but of course I can't actually close the channel in any goroutine because all the routines share the same channel. So I'm using done to simulate a channel that does some kind of "buffered closing".
edit2: Original code wasn't really working since sometimes the done signal from a routine was read before the int it just put on ch. Needs a "cleanup" loop.

Here is an idiomatic use of sync.WaitGroup for you to study
(playground link)
package main
import (
"fmt"
"sync"
)
const N = 10
func main() {
ch := make(chan int, N)
var wg sync.WaitGroup
for i := 0; i < N; i++ {
wg.Add(1)
go func(n int) {
defer wg.Done()
for i := 0; i < N; i++ {
ch <- n*N + i
}
}(i)
}
go func() {
wg.Wait()
close(ch)
}()
for i := range ch {
fmt.Println(i)
}
}
Note the use of closures in the two goroutine definitions, and note the second go statement, which waits for all the goroutines to finish and then closes the channel so that range can be used.

looks like you want a sync.WaitGroup (http://golang.org/pkg/sync/#WaitGroup)

Just use a WaitGroup! It is the standard-library primitive that essentially lets you wait for work in different goroutines to finish up.
http://golang.org/pkg/sync/#WaitGroup
As for your doubts, the way to think about it is that being done by closing a channel (done permanently) and being done with work (temporarily) are different.

In the first approximation the code seems more or less okay to me.
Wrt the details, the 'ch' should be buffered. Also, the 'done' channel goroutine "accounting" could possibly be replaced with sync.WaitGroup.

If you're iterating over values generated from goroutines, you can iterate directly over the communication channel:
for value := range ch {
println(value)
}
The only requirement is that the channel ch is eventually closed; otherwise the loop would wait for new values forever.
This would effectively replace your for numDone < N loop when used in combination with sync.WaitGroup.

I was dealing with the same issue in some code of mine and found this to be a more than adequate solution.
The answer provides Go's idiom for handling multiple goroutines all sending across a single channel.

Related

How to signal if a value has been read from a channel in Go

I am reading values that are put into a channel ch via an infinite for. I would like some way to signal if a value has been read and operated upon (via the sq result) and add it to some sort of counter variable upon success. That way I have a way to check if my channel has been exhausted so that I can properly exit my infinite for loop.
Currently it is incrementing regardless if a value was read, thus causing it to exit early when the counter == num. I only want it to count when the value has been squared.
EDIT: Another approach I have tested is to receive the ok value along with val when reading from the channel, and then check if !ok { break }. However, I get a deadlock panic since the for loop never breaks. Example here: https://go.dev/play/p/RYNtTix2nm2
package main
import "fmt"
func main() {
num := 5
// Buffered channel with 5 values.
ch := make(chan int, num)
defer close(ch)
for i := 0; i < num; i++ {
go func(val int) {
fmt.Printf("Added value: %d to the channel\n", val)
ch <- val
}(i)
}
// Read from our channel infinitely and increment each time a value has been read and operated upon
counter := 0
for {
// Check our counter and if its == num then break the infinite loop
if counter == num {
break
}
val := <-ch
counter++
go func(i int) {
// I'd like to verify a value was read from ch & it was processed before I increment the counter
sq := i * i
fmt.Println(sq)
}(val)
}
}
Let me try to help you figure out the issue.
Reading issue
The latest version of the code you put in the question is working except when you're about to read values from the ch channel. I mean with the following code snippet:
go func(i int) {
// I'd like to verify a value was read from ch & it was processed before I increment the counter
sq := i * i
fmt.Println(sq)
}(val)
In fact, there is no need to spawn a new goroutine for each read. Nothing waits for those goroutines, so main can return before they have printed anything. You can instead consume the messages as soon as they arrive on the ch channel. This is possible because the writes happen inside goroutines, so main can go ahead and reach the reading phase without being blocked.
Buffered vs unbuffered
In this scenario, you used a buffered channel with 5 slots. However, if you rely on a buffered channel you should signal when you have finished sending data to it, which is done by calling close(ch) after all of the goroutines finish their job. With an unbuffered channel it's fine to call defer close(ch) next to the channel initialization; this is done for cleanup. Back to your example, you can change the implementation to use an unbuffered channel.
Final Code
Just to recap, the two small changes that you have to make are:
Use an unbuffered channel instead of a buffered one.
Do not use a goroutine when reading the messages from the channel.
Please be sure to understand exactly what's going on. Another tip is to add the statement fmt.Println("NumGoroutine:", runtime.NumGoroutine()) to print the number of goroutines that exist at that moment.
The final code:
package main
import (
"fmt"
"runtime"
)
func main() {
num := 5
// Unbuffered channel.
ch := make(chan int)
defer close(ch)
for i := 0; i < num; i++ {
go func(val int) {
fmt.Printf("Added value: %d to the channel\n", val)
ch <- val
}(i)
}
fmt.Println("NumGoroutine:", runtime.NumGoroutine())
// Read from our channel infinitely and increment each time a value has been read and operated upon
counter := 0
for {
// Check our counter and if its == num then break the infinite loop
if counter == num {
break
}
val := <-ch
counter++
func(i int) {
// I'd like to verify a value was read from ch & it was processed before I increment the counter
sq := i * i
fmt.Println(sq)
}(val)
}
}
Let me know if this helps you, thanks!
package main
import "fmt"
func main() {
c := make(chan int)
done := make(chan bool)
go func() {
for i := 0; i < 10; i++ {
c <- i
}
close(c)
}()
go func() {
for i := range c {
fmt.Println(i)
done <- true
}
close(done)
}()
for i := 0; i < 10; i++ {
<-done
}
}
In this example, the done channel is used to signal that a value has been read from the c channel. After each value is read from c, a signal is sent on the done channel. The main function blocks on the done channel, waiting for a signal before continuing. This ensures that all values from c have been processed before the program terminates.
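The `ok` approach mentioned in the question can also work, provided something actually closes the channel once every sender has finished. The following is only a minimal sketch under that assumption: `sumOfSquares` is a hypothetical helper name, and the closing goroutine built on `sync.WaitGroup` is an addition that is not in the original code. The counter is incremented only after a value has been received and squared.

```go
package main

import (
	"fmt"
	"sync"
)

// sumOfSquares (hypothetical helper) sends 0..num-1 from separate goroutines
// and counts a value only after it has been received and squared.
func sumOfSquares(num int) (counter, sum int) {
	ch := make(chan int, num)
	var wg sync.WaitGroup

	for i := 0; i < num; i++ {
		wg.Add(1)
		go func(val int) {
			defer wg.Done()
			ch <- val
		}(i)
	}

	// Close the channel once every sender has finished, so the
	// receive below eventually reports ok == false.
	go func() {
		wg.Wait()
		close(ch)
	}()

	for {
		val, ok := <-ch
		if !ok {
			break // channel is closed and fully drained
		}
		sum += val * val
		counter++
	}
	return counter, sum
}

func main() {
	counter, sum := sumOfSquares(5)
	fmt.Println(counter, sum) // prints "5 30"
}
```

Without the closing goroutine, the receive would block forever after the last value, which is the deadlock described in the question.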

Go routine with channel deadlock

I just started to learn Go, so please bear with me. I've tried to play around with goroutines and channels but am getting a deadlock somehow.
Here's the example
package main
import (
"fmt"
"sync"
)
func main() {
total := 2
var wg sync.WaitGroup
wg.Add(total)
ch := make(chan int)
for idx := 0; idx < total; idx++ {
fmt.Printf("Processing idx %d\n", idx)
go func(idx int) {
defer wg.Done()
ch <- idx
}(idx)
}
for val := range ch {
fmt.Println(val)
}
fmt.Println("Wait")
wg.Wait()
}
which throws the error
Processing idx 0
Processing idx 1
1
0
fatal error: all goroutines are asleep - deadlock!
range ch reads from the channel until it is closed.
How many times do you call close(ch)? When will the for val := range ch loop terminate?
When should you close the channel? You have a lot of options here, but one way to do it is to add another goroutine:
go func() {
wg.Wait()
close(ch)
}()
e.g., after spinning off all routines that will write-to-channel-then-call-wg.Done(), so that the channel is closed once all the writers are done writing. (You can run this goroutine as soon as you've increased the wg count to account for all writers.)
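Put together with the code from the question, the suggestion might look like the sketch below. The function name `collect` is made up for illustration, and the results are gathered into a slice rather than printed directly, purely so the outcome is easy to check.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect (hypothetical helper) starts total senders on an unbuffered
// channel and returns every value that was received.
func collect(total int) []int {
	ch := make(chan int)
	var wg sync.WaitGroup
	wg.Add(total)

	for idx := 0; idx < total; idx++ {
		go func(idx int) {
			defer wg.Done()
			ch <- idx
		}(idx)
	}

	// Closer goroutine: it runs concurrently with the range loop below,
	// so the senders can make progress while main is receiving.
	go func() {
		wg.Wait()
		close(ch)
	}()

	var got []int
	for val := range ch { // terminates once ch is closed
		got = append(got, val)
	}
	return got
}

func main() {
	got := collect(2)
	sort.Ints(got) // arrival order is not deterministic
	fmt.Println(got) // prints "[0 1]"
}
```

The key point is that wg.Wait() no longer runs on the same goroutine as the range loop, so neither can block the other.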

Writing data into same channel from different goroutines is working fine without wait group

When writing data into the same channel from multiple goroutines with a WaitGroup, the call to wg.Wait() fails with an error saying all goroutines are asleep (deadlock).
package main
import (
"fmt"
"runtime"
"sync"
)
var wg sync.WaitGroup
func CreateMultipleRoutines() {
ch := make(chan int)
for i := 0; i < 10; i++ { // creates 10 go routines and adds to waitgroup
wg.Add(1)
go func() {
for j := 0; j < 10; j++ {
ch <- j
}
wg.Done() // indication of go routine is done to main routine
}()
}
fmt.Println(runtime.NumGoroutine())
wg.Wait() //wait for all go routines to complete
close(ch) // closing channel after completion of wait fo go routines
for v := range ch { // range can be used since channel is closed
fmt.Println(v)
}
fmt.Println("About to exit program ...")
}
When I tried to implement this without a WaitGroup, I was able to read the data from the channel by looping exactly as many times as values were pushed to it, but I can't range over it, since there will be a panic when we close the channel. Here is the example code:
package main
import (
"fmt"
"runtime"
)
func main() {
ch := make(chan int)
for i := 0; i < 10; i++ { // creates 10 go routines and adds to waitgroup
go func(i int) {
for j := 0; j < 10; j++ {
ch <- j * i
}
}(i)
}
fmt.Println(runtime.NumGoroutine())
for v := 0; v < 100; v++ {
fmt.Println(<-ch)
}
fmt.Println("About to exit program ...")
}
I want to understand why the WaitGroup is still waiting even though all goroutines signal Done(), which in turn should bring the number of goroutines to zero.
I think your original code has some problems.
You are closing the channel before reading from it.
You are not getting the advantage of using 10 goroutines because your channel is unbuffered, so only one goroutine can hand over a result at a time.
My solution would be to spawn a new goroutine that monitors whether the 10 goroutines have finished their jobs. That is where you use your WaitGroup.
Then the code would be like:
package main
import (
"fmt"
"runtime"
"sync"
)
var wg sync.WaitGroup
func main() {
ch := make(chan int, 10)
for i := 0; i < 10; i++ { // creates 10 go routines and adds to waitgroup
wg.Add(1)
go func() {
for j := 0; j < 10; j++ {
ch <- j
}
wg.Done() // indication of go routine is done to main routine
}()
}
go func(){
wg.Wait()
close(ch)
}()
fmt.Println(runtime.NumGoroutine())
for v := range ch { // range can be used since channel is closed
fmt.Println(v)
}
fmt.Println("About to exit program ...")
}
By default a chan holds no items, so all goroutines are blocked on sending until something reads from it. They never actually reach the wg.Done() statement.
A solution would be to close the channel in its own goroutine. Wrap your wg.Wait() and close(ch) lines like this:
go func() {
wg.Wait() //wait for all go routines to complete
close(ch) // closing channel after completion of wait fo go routines
}()
Then you can range over the channel, which will only close after all of the sending go routines have finished (and implicitly all values have been received).
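To see why the senders never reach wg.Done(), it may help to check when a send can proceed at all. The sketch below uses a hypothetical helper, `trySend`, that attempts a non-blocking send with select and default; it is only an illustration of the blocking behaviour, not part of the fix.

```go
package main

import "fmt"

// trySend (hypothetical helper) reports whether a send on ch
// can proceed immediately, without blocking.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	unbuffered := make(chan int)
	buffered := make(chan int, 1)

	fmt.Println(trySend(unbuffered, 1)) // false: no receiver is waiting
	fmt.Println(trySend(buffered, 1))   // true: the buffer has a free slot
	fmt.Println(trySend(buffered, 2))   // false: the buffer is now full
}
```

A plain `ch <- v` in the first and third cases would block until a receiver shows up, which is exactly what happens to the ten goroutines while main sits in wg.Wait().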

How to properly use channels to control concurrency?

I'm new to concurrency in Go and I'm trying to figure out how to use channels to control concurrency. What I would like to do is have a loop where I can call out to a function using a new goroutine and continue looping while that function processes, and I would like to limit the number of goroutines that run to 3. My first attempt to do this was the code below:
func write(val int, ch chan bool) {
fmt.Println("Processing:", val)
time.Sleep(2 * time.Second)
ch <- val % 3 == 0
}
func main() {
ch := make(chan bool, 3) // limit to 3 routines?
for i := 0; i< 10; i++ {
go write(i, ch)
resp := <- ch
fmt.Println("Divisible by 3:", resp)
}
time.Sleep(20 * time.Second)
}
I was under the impression that this would basically make calls to write 3 at a time and then hold off on processing the next 3 until the first 3 had finished. Based on what is logging it appears to only be processing one at a time. The code can be found and executed here.
What would I need to change in this example to get the functionality that I described above?
The problem here is very simple:
for i := 0; i< 10; i++ {
go write(i, ch)
resp := <- ch
fmt.Println("Divisible by 3:", resp)
}
You spin up a goroutine, then wait for it to respond, before you continue around the loop and spin up the next goroutine. They can't run in parallel because you never run two of them at the same time.
To fix this, you need to spin up all 10 goroutines, then wait on all 10 responses (playground):
for i := 0; i< 10; i++ {
go write(i, ch)
}
for i := 0; i<10; i++ {
resp := <- ch
fmt.Println("Divisible by 3:", resp)
}
Now you do have 7 goroutines blocking on the channel, but it's so brief that you can't see it happening, so the output won't be very interesting. If you try adding a Processed message at the end of the goroutine, and sleeping between each channel read, you'll see that 3 of them finish immediately (well, after waiting 2 seconds), and then the others unblock and finish one by one (playground).
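If the goal is for at most 3 calls to be doing work at any moment, which is what the question describes, one common option is to use a buffered channel as a counting semaphore. This is a different technique from the code above, shown here only as a sketch: `runLimited`, `sem`, and the peak-tracking counters are names invented for the illustration, and the short sleep stands in for real work.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runLimited (hypothetical helper) runs n jobs with at most limit of them
// running at once, and returns the highest concurrency it observed.
func runLimited(n, limit int) int32 {
	sem := make(chan struct{}, limit) // one slot per allowed concurrent job
	var wg sync.WaitGroup
	var running, peak int32

	for i := 0; i < n; i++ {
		sem <- struct{}{} // blocks while limit jobs are already running
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when the job ends

			cur := atomic.AddInt32(&running, 1)
			// Record the maximum concurrency seen so far.
			for {
				old := atomic.LoadInt32(&peak)
				if cur <= old || atomic.CompareAndSwapInt32(&peak, old, cur) {
					break
				}
			}
			time.Sleep(10 * time.Millisecond) // stand-in for real work
			atomic.AddInt32(&running, -1)
		}()
	}

	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	fmt.Println("peak concurrency:", runLimited(10, 3))
}
```

Because a slot is acquired before each goroutine starts and released only after its work is done, the observed peak can never exceed the limit.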
Another way to run goroutines in parallel and wait for all of them to send their value on the channel is to use a WaitGroup, which also helps to synchronize goroutines. If you need to wait for all goroutines to finish before executing another function, a WaitGroup is the better approach.
package main
import (
"fmt"
"time"
"sync"
)
func write(val int, wg *sync.WaitGroup, ch chan bool) {
defer wg.Done()
fmt.Println("Processing:", val)
time.Sleep(2 * time.Second)
ch <- val % 3 == 0
}
func main() {
wg := &sync.WaitGroup{}
ch := make(chan bool, 3)
for i := 0; i< 10; i++ {
wg.Add(1)
go write(i, wg, ch)
}
for i := 0; i< 10; i++ {
fmt.Println("Divisible by 3: ", <-ch)
}
wg.Wait() // all writers have called Done
close(ch)
}
Playground example

Recursive concurrency with golang

I'd like to distribute some load across some goroutines. If the number of tasks is known beforehand then it is easy to organize. For example, I could do fan out with a wait group.
nTasks := 100
nGoroutines := 10
// it is important that this channel is not buffered
ch := make(chan *Task)
done := make(chan bool)
var w sync.WaitGroup
// Feed the channel until done
go func () {
for i:= 0; i < nTasks; i++ {
task := getTaskI(i)
ch <- task
}
// as ch is not buffered once everything is read we know we have delivered all of them
for i:=0; i < nGoroutines; i++ {
done <- false
}
}()
for i:= 0; i < nGoroutines; i ++ {
w.Add(1)
go func () {
defer w.Done()
// loop so that each worker keeps taking tasks until told to stop
for {
select {
case task := <-ch:
doSomethingWithTask(task)
case <-done:
return
}
}
}()
}
w.Wait()
// All tasks done, all goroutines closed
However, in my case each task returns more tasks to be done. Say for example a crawler where we receive all the links from the crawled web. My initial hunch was to have a main loop where I track the number of tasks done and tasks pending. When I'm done I send a finish signal to all goroutines:
nGoroutines := 10
ch := make(chan *Task, nGoroutines)
feedBackChannel := make(chan * Task, nGoroutines)
done := make(chan bool)
for i:= 0; i < nGoroutines; i ++ {
go func () {
// loop so that each worker keeps taking tasks until told to stop
for {
select {
case task := <-ch:
task.NextTasks = doSomethingWithTask(task)
feedBackChannel <- task
case <-done:
return
}
}
}()
}
// seed first task
ch <- firstTask
nTasksRemaining := 1
for nTasksRemaining > 0 {
task := <- feedBackChannel
nTasksRemaining -= 1
for _, t := range(task.NextTasks) {
ch <- t
nTasksRemaining++
}
}
for i:=0; i < nGoroutines; i++ {
done <- false
}
However, this produces a deadlock. For example if NextTasks is bigger than the number of goroutines then the main loop will stall when the first tasks finish. But the first tasks can't finish because the feedBack is blocked since the mainLoop is waiting to write.
One "easy" way out of this is to post to the channel asynchronously:
Instead of doing feedBackChannel <- task, do go func () {feedBackChannel <- task}(). Now, this feels like an awful hack, especially since there might be hundreds of thousands of tasks.
What would be a nice way to avoid this deadlock? I've searched for concurrency patterns, but mostly are simpler things like fanning out or pipelines where the later stage does not affect the earlier steps.
If I understand your problem correctly, your solution is pretty complex. Here are some points. Hope it helps.
As people mentioned in the comments, launching a goroutine is cheap (both in memory and in switching between them, which is much cheaper than with OS-level threads), and you could have hundreds of thousands of them. Let's assume that for some reason you want to have worker goroutines.
Instead of a done channel, you could just close the ch channel, and instead of select you can just range over the channel to get tasks.
I don't see the point of separating ch and feedBackChannel; just push every task you have into ch and increase its capacity.
As mentioned, you may get a deadlock when you try to enqueue a new task. My solution is pretty naive: just increase the capacity until you are sure that it won't overflow (you could also log warnings if cap(ch) - len(ch) < threshold). If you create a channel (of pointers) with a capacity of 1 million, it will take about 8 * 1e6 bytes ~= 8 MB of RAM.
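The points above could be combined roughly as in the sketch below. Everything in it is illustrative: `Task`, `process`, and `crawl` are made-up stand-ins for the question's types and functions, and the use of a `sync.WaitGroup` to count outstanding tasks is an assumption about how one might detect completion, not something the answer prescribes. The channel capacity of 1024 is assumed to exceed the total number of tasks; if it did not, the workers could still block on sending.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Task is a hypothetical unit of work; Depth stands in for
// "how many more levels of links there are to follow".
type Task struct{ Depth int }

// process is a stand-in for doSomethingWithTask: every task with
// Depth > 0 produces two follow-up tasks.
func process(t Task) []Task {
	if t.Depth == 0 {
		return nil
	}
	return []Task{{Depth: t.Depth - 1}, {Depth: t.Depth - 1}}
}

// crawl processes a task tree with nWorkers goroutines sharing one
// channel and returns how many tasks were processed.
func crawl(depth, nWorkers int) int64 {
	ch := make(chan Task, 1024) // assumed large enough to never fill up
	var pending sync.WaitGroup  // counts tasks queued or in progress
	var workers sync.WaitGroup
	var processed int64

	pending.Add(1)
	ch <- Task{Depth: depth} // seed the first task

	for i := 0; i < nWorkers; i++ {
		workers.Add(1)
		go func() {
			defer workers.Done()
			for t := range ch { // ends when ch is closed
				for _, next := range process(t) {
					pending.Add(1) // register the child before the parent is done
					ch <- next
				}
				atomic.AddInt64(&processed, 1)
				pending.Done()
			}
		}()
	}

	pending.Wait() // no task is queued or running any more
	close(ch)      // lets the workers' range loops finish
	workers.Wait()
	return processed
}

func main() {
	fmt.Println("processed:", crawl(5, 4)) // prints "processed: 63"
}
```

Registering each child task before marking its parent as done keeps the pending count above zero until the whole tree has been handled, so closing the channel after pending.Wait() cannot race with a send.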

Resources