I need a way to signal an unknown number of goroutines from one main goroutine, multiple times. I also need those other goroutines to select on multiple items, so busy waiting is (probably) not an option. I have come up with the following solution:
package main

import (
    "context"
    "fmt"
    "sync"
    "time"
)

type signal struct {
    data     []int
    channels []chan struct{}
}

func newSignal() *signal {
    s := &signal{
        data:     make([]int, 0),
        channels: make([]chan struct{}, 1),
    }
    s.channels[0] = make(chan struct{})
    return s
}

func (s *signal) Broadcast(d int) {
    s.data = append(s.data, d)
    s.channels = append(s.channels, make(chan struct{}))
    close(s.channels[len(s.data)-1])
}

func test(s *signal, wg *sync.WaitGroup, id int, ctx context.Context) {
    for i := 0; ; i += 1 {
        select {
        case <-s.channels[i]:
            if id >= s.data[i] {
                fmt.Println("Goroutine completed:", id)
                wg.Done()
                return
            }
        case <-ctx.Done():
            fmt.Println("Goroutine completed:", id)
            wg.Done()
            return
        }
    }
}

func main() {
    s := newSignal()
    ctx, cancel := context.WithCancel(context.Background())
    wg := sync.WaitGroup{}
    wg.Add(3)
    go test(s, &wg, 3, ctx)
    go test(s, &wg, 2, ctx)
    go test(s, &wg, 1, ctx)
    s.Broadcast(3)
    time.Sleep(1 * time.Second)
    // multiple broadcasts is mandatory
    s.Broadcast(2)
    time.Sleep(1 * time.Second)
    // last goroutine
    cancel()
    wg.Wait()
}
Playground: https://play.golang.org/p/dGmlkTuj7Ty
Is there a more elegant way to do this, one that uses only the standard library? If not, is this solution safe/OK to use? I believe it is at least safe, as it works for a large number of goroutines (I have done some testing with it).
To be concise, here is exactly what I want:
The main goroutine (call it M) must be able to signal an unknown number of other goroutines (call them 0...n) with some data (call it d), multiple times, with each goroutine taking an action based on d each time
M must be able to signal all of the other n goroutines with certain (numerical) data, multiple times
Every goroutine in n will either terminate on its own (based on a context) or after doing some operation with d and deciding its fate. It will be performing this check as many times as signaled until it dies.
I am not allowed to keep track of the n goroutines in any way (e.g. by having a map of channels to goroutines and iterating over it)
In my solution, the slice of channels does not represent goroutines: each channel represents a signal that has been broadcast. This means that if I broadcast twice and a goroutine then spins up, it will check both signals before sleeping in the select block.
It seems to me that you might want something like a fan-out pattern. Here's one source describing fan-in and fan-out, among other concurrency patterns. Here's a blog post on golang.org about this too. I think it's essentially a version of the observer pattern using channels.
Basically, you want something, say Broadcaster, that keeps a list of channels. When you call Broadcaster.send(data), it loops over the list of channels, sending data on each channel. Broadcaster must also have a way for goroutines to subscribe to it: goroutines must have a way to accept a channel from Broadcaster or give a channel to Broadcaster. That channel is the communication link.
If the work to be performed in the "observer" goroutines will take long, consider using buffered channels so that Broadcaster is not blocked during send, waiting on goroutines. If you don't care whether a goroutine misses a message, you can use a non-blocking send (see below).
When a goroutine "dies", it can unsubscribe from Broadcaster which will remove the appropriate channel from its list. Or the channel can just remain full, and Broadcaster will have to use a non-blocking send to skip over full channels to dead goroutines.
I can't say what I described is comprehensive or 100% correct. It's just a quick description of the first general things I'd try based on your problem statement.
Related
I am implementing a (sort of a) combinatorial backtracking algorithm in go utilising goroutines. My problem can be represented as a tree with a certain degree/spread where I want to visit each leaf and calculate a result depending on the path taken. On a given level, I want to spawn goroutines to process the subproblems concurrently, i.e. if I have a tree with degree 3 and I want to start the concurrency after level 2, I'd spawn 3*3=9 goroutines that proceed with processing the subproblems concurrently.
package main

import (
    "fmt"
    "math"
    "time"
)

func main() {
    cRes := make(chan string, 100)
    res := []string{}
    numLevels := 5
    spread := 3
    startConcurrencyAtLevel := 2
    nTree("", numLevels, spread, startConcurrencyAtLevel, cRes)
    for {
        select {
        case r := <-cRes:
            res = append(res, r)
        case <-time.After(10 * time.Second):
            fmt.Println("Calculation timed out")
            fmt.Println(len(res), math.Pow(float64(spread), float64(numLevels)))
            return
        }
    }
}

func nTree(path string, maxLevels int, spread int, startConcurrencyAtLevel int, cRes chan string) {
    if len(path) == maxLevels {
        // some longer running task here associated with the found path, also using a lookup table
        // real problem actually returns not the path but the result if it satisfies some condition
        cRes <- path
        return
    }
    for i := 1; i <= spread; i++ {
        nextPath := path + fmt.Sprint(i)
        if len(path) == startConcurrencyAtLevel {
            go nTree(nextPath, maxLevels, spread, startConcurrencyAtLevel, cRes)
        } else {
            nTree(nextPath, maxLevels, spread, startConcurrencyAtLevel, cRes)
        }
    }
}
The above code works; however, I rely on the for/select loop timing out. I am looking for a way to continue with main() as soon as all goroutines have finished, i.e. as soon as all subproblems have been processed.
I have already come up with two possible (unpreferred/inelegant) solutions:
Using a mutex protected result map + a waitgroup instead of a channel-based approach should do the trick, but I'm curious if there is a neat solution with channels.
Using a quit channel (of type int). Every time a goroutine is spawned, the quit channel gets a +1 int; every time a computation finishes in a leaf, it gets a -1 int, and the caller sums up the values. See the following snippet; this, however, is not a good solution as it (rather blatantly) runs into timing issues I don't want to deal with. It quits prematurely if, for instance, the first goroutine finishes before another one has been spawned.
for {
    select {
    case q := <-cRunningRoutines:
        runningRoutines += q
        if runningRoutines == 0 {
            fmt.Println("Calculation complete")
            return res
        }
    // ...same cases as above
    }
}
Playground: https://go.dev/play/p/9jzeCvl8Clj
Following questions:
Is doing recursive calls from a function started as a goroutine to itself a valid approach?
What would be an idiomatic way of reading the results from cRes until all spawned goroutines finish? I read somewhere that channels should be closed when computation is done, but I just can't wrap my head around how to integrate it in this case.
Happy about any ideas, thanks!
Reading the description and the snippet, I am not able to understand exactly what you are trying to achieve, but I have some hints and patterns for channels that I use daily and think are helpful.
The context package is very helpful for managing goroutines' state in a safe way. In your example, time.After is used to end the main program, but in non-main functions it could be leaking goroutines: if instead you use a context.Context and pass it into the goroutines (it's usually passed as the first argument of a function), you will be able to control cancellation of downstream calls. This explains it briefly.
It is common practice to create channels (and return them) in functions that produce messages and send them on the channel. The same function should be responsible for closing the channel, e.g. with defer close(channel), when it's done writing.
This is handy because when the channel is buffered, closing it is possible even while it still has data in it: receivers can keep draining the remaining buffered values, and a range over the channel only stops once the buffer is empty. For unbuffered channels, the function won't be able to send a message over the channel until a reader of the channel is ready to receive it, and thus won't be able to exit.
This is an example (without recursion).
We can close the channel whether it is buffered or unbuffered in this example, because the send will block until the for range over the channel in the main goroutine reads from it.
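As a minimal sketch of that idea (my own toy example, a producer of integers rather than your tree paths):

package main

import "fmt"

// produce owns the channel: it creates it, writes to it, and closes it when done.
func produce(n int) <-chan int {
    out := make(chan int) // could also be buffered, e.g. make(chan int, n)
    go func() {
        defer close(out) // the producer is responsible for the close
        for i := 0; i < n; i++ {
            out <- i
        }
    }()
    return out
}

func main() {
    // The range stops automatically once produce closes the channel.
    for v := range produce(5) {
        fmt.Println(v)
    }
}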
This is a variant for the same principle, with the channel passed as argument.
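A sketch of that variant (again my own toy code): the caller makes the channel and passes it in, and the producer still closes it when it is done writing (assuming there is only this one producer):

package main

import "fmt"

func produceInto(n int, out chan<- int) {
    defer close(out) // still closed by the (single) producer
    for i := 0; i < n; i++ {
        out <- i
    }
}

func main() {
    ch := make(chan int)
    go produceInto(5, ch)
    for v := range ch {
        fmt.Println(v)
    }
}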
We can use sync.WaitGroup in tandem with channels to signal completion for individual goroutines, and to let an "orchestrating" goroutine know that the channel can be closed, because all message producers are done sending data into the channel. The same considerations as above apply to the close operation.
This is an example showing the use of waitGroup and external closer of channel.
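Here is a minimal sketch of that shape (my own toy example, with arbitrary worker code): several producers share one channel, and a separate closer goroutine waits on the WaitGroup and closes the channel once they are all done:

package main

import (
    "fmt"
    "sync"
)

func main() {
    results := make(chan string)
    var wg sync.WaitGroup

    // Several producers send into the same channel; none of them closes it.
    for i := 1; i <= 3; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            results <- fmt.Sprintf("worker %d done", id)
        }(i)
    }

    // The "orchestrating" goroutine is the only one allowed to close the channel,
    // and it does so only after every producer has signalled completion.
    go func() {
        wg.Wait()
        close(results)
    }()

    for r := range results {
        fmt.Println(r)
    }
}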
Channels can have a direction! Notice that in the example I added/removed arrows next to the channel type (e.g. <-chan string, or chan<- string) when passing channels into/out of functions. This tells the compiler that a channel is read-only or write-only, respectively, within the scope of that function.
This helps in two ways:
the compiler will reject incorrect usage at compile time, e.g. a send on a receive-only channel.
the signature of the function describes whether it will only use the channel for writing (and possibly close()) or for reading: remember that reading from a channel with a range automatically stops the iteration when the channel is closed.
You can build channels of channels: make(chan chan string) is a valid (and helpful) construct for building processing pipelines.
A common usage of it is a fan-in goroutine that is collecting multiple outputs of a series of channel-producing goroutines.
This is an example of how to use them.
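A sketch of how that can look (my own example, not the one linked): workers each return their own channel, the channels travel over a chan chan string, and a fan-in goroutine merges them into a single output stream:

package main

import (
    "fmt"
    "sync"
)

// worker returns a channel on which it will deliver its single result.
func worker(id int) chan string {
    out := make(chan string)
    go func() {
        defer close(out)
        out <- fmt.Sprintf("result from worker %d", id)
    }()
    return out
}

// fanIn receives whole channels over chans and merges their values into one stream.
func fanIn(chans chan chan string) <-chan string {
    merged := make(chan string)
    go func() {
        defer close(merged)
        var wg sync.WaitGroup
        for ch := range chans {
            wg.Add(1)
            go func(c chan string) {
                defer wg.Done()
                for v := range c {
                    merged <- v
                }
            }(ch)
        }
        wg.Wait()
    }()
    return merged
}

func main() {
    chans := make(chan chan string)
    go func() {
        defer close(chans)
        for i := 1; i <= 3; i++ {
            chans <- worker(i)
        }
    }()
    for v := range fanIn(chans) {
        fmt.Println(v)
    }
}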
In essence, to answer your initial questions:
Is doing recursive calls from a function started as a goroutine to itself a valid approach?
If you really need recursion, it's probably better to handle it separately from the concurrent code: create a dedicated function that recursively sends data into a channel, and orchestrate the closing of the channel in the caller.
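For illustration, a minimal sketch of that separation (my own simplified version of your path enumeration, not your exact code): the recursive function only sends, and the goroutine wrapping the call owns the close:

package main

import "fmt"

// walk recursively enumerates all paths of the given depth and sends them on out.
// It only sends; closing out is the caller's responsibility.
func walk(path string, depth, spread int, out chan<- string) {
    if len(path) == depth {
        out <- path
        return
    }
    for i := 1; i <= spread; i++ {
        walk(path+fmt.Sprint(i), depth, spread, out)
    }
}

func main() {
    out := make(chan string)
    go func() {
        defer close(out) // the caller-side goroutine owns the close
        walk("", 3, 2, out)
    }()
    for p := range out {
        fmt.Println(p)
    }
}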
What would be an idiomatic way of reading the results from cRes until all spawned goroutines finish? I read somewhere that channels should be closed when computation is done, but I just can't wrap my head around how to integrate it in this case.
A good reference is Go Concurrency Patterns: Pipelines and cancellation: this is a rather old post (before the context package existed in the std lib) and I think Parallel digestion is what you're looking for to address the original question.
As mentioned by torek, I spun off an anonymous function that closes the channel after the waitgroup finishes waiting. I also needed some logic to call wg.Done() for the spawned goroutines only after the recursion of the goroutine-spawning level returns.
Generally I think this is a useful idiom (correct me if I'm wrong :))
Playground: https://go.dev/play/p/bQjHENsZL25
package main

import (
    "fmt"
    "sync"
)

func main() {
    cRes := make(chan string, 100)
    numLevels := 3
    spread := 3
    startConcurrencyAtLevel := 2
    var wg sync.WaitGroup
    nTree("", numLevels, spread, startConcurrencyAtLevel, cRes, &wg)
    go func() {
        // time.Sleep(1 * time.Second) // edit: code should work without this initial sleep
        wg.Wait()
        close(cRes)
    }()
    for r := range cRes {
        fmt.Println(r)
    }
    fmt.Println("Done!")
}

func nTree(path string, maxLevels int, spread int, startConcurrencyAtLevel int, cRes chan string, wg *sync.WaitGroup) {
    if len(path) == maxLevels {
        // some longer running task here associated with the found path
        cRes <- path
        return
    }
    for i := 1; i <= spread; i++ {
        nextPath := path + fmt.Sprint(i)
        if len(path) == startConcurrencyAtLevel {
            // register the goroutine before spawning it; Done only fires
            // after the spawned subtree's recursion has fully returned
            wg.Add(1)
            go func(p string) {
                defer wg.Done()
                nTree(p, maxLevels, spread, startConcurrencyAtLevel, cRes, wg)
            }(nextPath)
        } else {
            nTree(nextPath, maxLevels, spread, startConcurrencyAtLevel, cRes, wg)
        }
    }
}
package main

import (
    "fmt"
    "math/rand"
    "time"
)

func boring(msg string) <-chan string { // Returns receive-only channel of strings.
    c := make(chan string)
    go func() { // We launch the goroutine from inside the function.
        for i := 0; ; i++ {
            c <- fmt.Sprintf("%s %d", msg, i)
            time.Sleep(time.Duration(rand.Intn(1e3)) * time.Millisecond)
        }
    }()
    return c // Return the channel to the caller.
}

func fanIn(input1, input2 <-chan string) <-chan string {
    c := make(chan string)
    go func() {
        for {
            c <- <-input1
        }
    }()
    go func() {
        for {
            c <- <-input2
        }
    }()
    return c
}

func main() {
    c := fanIn(boring("Joe"), boring("Ann"))
    for i := 0; i < 10; i++ {
        fmt.Println(<-c)
    }
    fmt.Println("You're both boring; I'm leaving.")
}
This is an example from Rob Pike's talk on Go Concurrency Patterns. I understand the idea behind the fan-in pattern and I understand that the order of messages printed in main is non-deterministic: we just print 10 messages that turn out to be ready.
What I do not completely understand, however, is the order of calls and what blocks what.
Only unbuffered channels are used so, as per the documentation, an unbuffered channel blocks the sender.
The boring function launches a goroutine that sends strings to the unbuffered channel c, which is returned. If I understand correctly, this inner goroutine is launched but doesn't block boring. It can immediately return the channel, which in main is then passed to the fanIn function. But fanIn does almost the same thing: it receives the values from the input channels and sends them to its own channel, which is returned.
How does the blocking happen? What blocks what in this case? A schematic explanation would be perfect because, honestly, even though I have an intuitive understanding, I would like to understand the exact logic behind it.
My intuitive understanding is that each send inside boring blocks until the value is received in fanIn, but then the value is immediately sent to another channel, so it gets blocked until the value is received in main. Roughly speaking, the three functions are tightly bound to each other due to the use of channels.
How does the blocking happen? What blocks what in this case?
Each send on an unbuffered channel blocks if there is no corresponding receive operation on the other side (or if the channel is nil, which becomes a case of having no receiver).
Consider that in main the calls to boring and fanIn happen sequentially. In particular this line:
c := fanIn(boring("Joe"), boring("Ann"))
has order of evaluation:
boring("Joe")
boring("Ann")
fanIn
The send operations in boring("Joe") and boring("Ann") have a corresponding receive operation in fanIn, so they would block until fanIn runs. Hence boring spawns its own goroutine to ensure it returns the channel before fanIn can start receiving on it.
The send operations in fanIn have then a corresponding receive operation in main, so they would block until fmt.Println(<-c) runs. Hence fanIn spawns its own goroutine(s) to ensure it returns the out channel before main can start receiving on it.
Finally main's execution gets to fmt.Println(<-c) and sets everything in motion. Receiving on c unblocks c <- <-input[1|2], and receiving on <-input[1|2] unblocks c <- fmt.Sprintf("%s %d", msg, i).
If you remove the receive operation in main, main can still proceed execution and the program exits right away, so no deadlock occurs.
In a project the program receives data via websocket. This data needs to be processed by n algorithms. The amount of algorithms can change dynamically.
My attempt is to create some pub/sub pattern where subscriptions can be started and canceled on the fly. Turns out that this is a bit more challenging than expected.
Here's what I came up with (which is based on https://eli.thegreenplace.net/2020/pubsub-using-channels-in-go/):
package pubsub

import (
    "context"
    "sync"
    "time"
)

type Pubsub struct {
    sync.RWMutex
    subs   []*Subsciption
    closed bool
}

func New() *Pubsub {
    ps := &Pubsub{}
    ps.subs = []*Subsciption{}
    return ps
}

func (ps *Pubsub) Publish(msg interface{}) {
    ps.RLock()
    defer ps.RUnlock()
    if ps.closed {
        return
    }
    for _, sub := range ps.subs {
        // ISSUE1: These goroutines apparently do not exit properly...
        go func(ch chan interface{}) {
            ch <- msg
        }(sub.Data)
    }
}

func (ps *Pubsub) Subscribe() (context.Context, *Subsciption, error) {
    ps.Lock()
    defer ps.Unlock()
    // prep channel
    ctx, cancel := context.WithCancel(context.Background())
    sub := &Subsciption{
        Data:   make(chan interface{}, 1),
        cancel: cancel,
        ps:     ps,
    }
    // prep subsciption
    ps.subs = append(ps.subs, sub)
    return ctx, sub, nil
}

func (ps *Pubsub) unsubscribe(s *Subsciption) bool {
    ps.Lock()
    defer ps.Unlock()
    found := false
    index := 0
    for i, sub := range ps.subs {
        if sub == s {
            index = i
            found = true
        }
    }
    if found {
        s.cancel()
        ps.subs[index] = ps.subs[len(ps.subs)-1]
        ps.subs = ps.subs[:len(ps.subs)-1]
        // ISSUE2: close the channel async with a delay to ensure
        // nothing will be written to the channel anymore
        // via a pending goroutine from Publish()
        go func(ch chan interface{}) {
            time.Sleep(500 * time.Millisecond)
            close(ch)
        }(s.Data)
    }
    return found
}

func (ps *Pubsub) Close() {
    ps.Lock()
    defer ps.Unlock()
    if !ps.closed {
        ps.closed = true
        for _, sub := range ps.subs {
            sub.cancel()
            // ISSUE2: close the channel async with a delay to ensure
            // nothing will be written to the channel anymore
            // via a pending goroutine from Publish()
            go func(ch chan interface{}) {
                time.Sleep(500 * time.Millisecond)
                close(ch)
            }(sub.Data)
        }
    }
}

type Subsciption struct {
    Data   chan interface{}
    cancel func()
    ps     *Pubsub
}

func (s *Subsciption) Unsubscribe() {
    s.ps.unsubscribe(s)
}
As mentioned in the comments there are (at least) two issues with this:
ISSUE1:
After running an implementation of this for a while, I get a few errors of this kind:
goroutine 120624 [runnable]:
bm/internal/pubsub.(*Pubsub).Publish.func1(0x8586c0, 0xc00b44e880, 0xc008617740)
    /home/X/Projects/bm/internal/pubsub/pubsub.go:30
created by bookmaker/internal/pubsub.(*Pubsub).Publish
    /home/X/Projects/bm/internal/pubsub/pubsub.go:30 +0xbb
Without really understanding this, it appears to me that the goroutines created in Publish() accumulate/leak. Is this correct, and what am I doing wrong here?
ISSUE2:
When I end a subscription via Unsubscribe(), it can happen that Publish() tries to write to a closed channel and panics. To mitigate this I have created a goroutine to close the channel with a delay. This feels really off best practice, but I was not able to find a proper solution to this. What would be a deterministic way to do this?
Here's a little playground for you to test with: https://play.golang.org/p/K-L8vLjt7_9
Before diving into your solution and its issues, let me recommend again another Broker approach presented in this answer: How to broadcast message using channel
Now on to your solution.
Whenever you launch a goroutine, always think about how it will end, and make sure it does end unless the goroutine is meant to run for the lifetime of your app.
// ISSUE1: These goroutines apparently do not exit properly...
go func(ch chan interface{}) {
    ch <- msg
}(sub.Data)
This goroutine tries to send a value on ch. This may be a blocking operation: it will block if ch's buffer is full and there is no ready receiver on ch. This is out of the control of the launched goroutine, and also out of the control of the pubsub package. This may be fine in some cases, but this already places a burden on the users of the package. Try to avoid these. Try to create APIs that are easy to use and hard to misuse.
Also, launching a goroutine just to send a value on a channel is a waste of resources (goroutines are cheap and light, but you shouldn't spam them whenever you can).
You do it because you don't want to get blocked. To avoid blocking, you may use a buffered channel with a reasonably high buffer. Yes, this doesn't solve the blocking issue, it only helps with "slow" clients receiving from the channel.
To "truly" avoid blocking without launching a goroutine, you may use non-blocking send:
select {
case ch <- msg:
default:
    // ch's buffer is full, we cannot deliver now
}
If send on ch can proceed, it will happen. If not, the default branch is chosen immediately. You have to decide what to do then. Is it acceptable to "lose" a message? Is it acceptable to wait for some time until "giving up"? Or is it acceptable to launch a goroutine to do this (but then you'll be back at what we're trying to fix here)? Or is it acceptable to get blocked until the client can receive from the channel...
If you choose a reasonably high buffer and still encounter a situation where it gets full, it may be acceptable to block until the client can advance and receive the message. If it can't, then your whole app might be in an unacceptable state, and it might be acceptable to "hang" or "crash".
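As a sketch only (reusing the Pubsub and Subsciption types from your code, and picking "drop when full" as the policy), Publish could then deliver in place, with no extra goroutine:

func (ps *Pubsub) Publish(msg interface{}) {
    ps.RLock()
    defer ps.RUnlock()
    if ps.closed {
        return
    }
    for _, sub := range ps.subs {
        select {
        case sub.Data <- msg:
            // delivered
        default:
            // sub.Data's buffer is full: drop the message for this subscriber
            // (or apply whatever policy fits your app)
        }
    }
}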
// ISSUE2: close the channel async with a delay to ensure
// nothing will be written to the channel anymore
// via a pending goroutine from Publish()
go func(ch chan interface{}) {
    time.Sleep(500 * time.Millisecond)
    close(ch)
}(s.Data)
Closing a channel is a signal to the receiver(s) that no more values will be sent on the channel. So it should always be the sender's job (and responsibility) to close the channel. By launching a goroutine to close the channel, you hand that job and responsibility to another "entity" (a goroutine) that will not be synchronized with the sender. This can easily end up in a panic (sending on a closed channel is a runtime panic; for other axioms see How does a non initialized channel behave?). Don't do that.
Yes, this was necessary because you launched goroutines to send. If you don't do that, then you may close "in-place", without launching a goroutine, because then the sender and closer will be the same entity: the Pubsub itself, whose sending and closing operations are protected by a mutex. So solving the first issue solves the second naturally.
In general if there are multiple senders for a channel, then closing the channel must be coordinated. There must be a single entity (often not any of the senders) that waits for all senders to finish, practically using a sync.WaitGroup, and then that single entity can close the channel, safely. See Closing channel of unknown length.
Suppose I have a simple loop which does sequential tests like this.
for f := 1; f <= 1000; f++ {
    if doTest(f) {
        break
    }
}
I loop through range of numbers and do a test for each number. If test fails for one number, I break and exit the main thread. Simple enough.
Now, how do I correctly feed the test numbers to, say, four (or however many) goroutines? Basically, I want to test the numbers from 1 to 1000 in batches of 4 (or whatever the number of goroutines is).
Do I create 4 goroutines reading from one channel and feed the numbers sequentially into this channel, or do I make 4 goroutines, each with its own channel?
And another question. How do I stop all 4 routines if one of them fails the test? I've been reading some texts on channels but I cannot put the pieces together.
You can create a producer/consumer system: https://play.golang.org/p/rks0gB3aDb
package main

import (
    "fmt"
    "sync"
    "time"
)

func main() {
    ch := make(chan int)
    clients := 4
    // make it buffered, so all clients can fail without hanging
    notifyCh := make(chan struct{}, clients)
    go produce(100, ch, notifyCh)
    var wg sync.WaitGroup
    wg.Add(clients)
    for i := 0; i < clients; i++ {
        go func() {
            consumer(ch, notifyCh)
            wg.Done()
        }()
    }
    wg.Wait()
}

func consumer(in chan int, notifyCh chan struct{}) {
    fmt.Printf("Start consumer\n")
    for i := range in {
        <-time.After(100 * time.Millisecond)
        if i == 42 {
            fmt.Printf("%d fails\n", i)
            notifyCh <- struct{}{}
            return
        } else {
            fmt.Printf("%d\n", i)
        }
    }
    fmt.Printf("Consumer stopped working\n")
}

func produce(N int, out chan int, notifyCh chan struct{}) {
    for i := 0; i < N; i++ {
        select {
        case out <- i:
        case <-notifyCh:
            close(out)
            return
        }
    }
    close(out)
}
The producer pushes numbers from 0 to 99 to the channel, the consumer consumes until the channel is closed. In main we create 4 clients and add them to a waitgroup to reliably check if every goroutine returned.
Any consumer can signal on notifyCh; the producer then stops working and no further numbers are generated, therefore all consumers return after their current number.
There's also the option to create 4 goroutines, wait for all of them to return, then start the next 4 goroutines. But this adds quite a lot of waiting overhead.
Since you mentioned prime numbers, here's a really cool prime sieve: https://golang.org/doc/play/sieve.go
Whether you create one common channel or a channel per goroutine depends on what you want.
If you only want to put some numbers (or, more generally, requests) in and you don't care which goroutine serves them, then of course it is better to share a channel. If, for example, you want the first 250 requests to be served by goroutine 1, then of course you cannot share a channel.
It is good practice to use a channel as either input or output. And the simplest way for a sender to signal that it is finished is to close the channel. A good article about that is https://blog.golang.org/pipelines
What is not mentioned in the question is that you also need another channel (or channels), or some other communication primitive, to get the results back. And this result channel is more interesting than the feeding one.
What information should be sent? A bool after every doTest, or just a notification when everything is done (in that case not even a bool is necessary, just closing a channel)?
If you want the program to fail fast, then I would use a buffered shared channel to feed the numbers. Don't forget to close it when all numbers have been fed.
And use another unbuffered channel to let the main goroutine know that the tests are done. It can be a channel where you only put the number for which the test failed, or, if you also want positive results, a channel of structs containing the number and the result, or any other information returned from doTest.
A very good article about channels is also http://dave.cheney.net/2014/03/19/channel-axioms
Each of your four goroutines can report a failure (by sending an error and closing a channel). But the gotcha is what the goroutines should do when all numbers pass and the feeding channel is closed. There is also a nice article about that: http://nathanleclaire.com/blog/2014/02/15/how-to-wait-for-all-goroutines-to-finish-executing-before-continuing/
I’m writing the Walk function in the go tutorial that basically traverses a tree in-order. What I have works:
package main

import (
    "fmt"
    "code.google.com/p/go-tour/tree"
)

// Walk walks the tree t sending all values
// from the tree to the channel ch.
func Walk__helper(t *tree.Tree, ch chan int) {
    if t == nil {
        return
    }
    Walk__helper(t.Left, ch)
    ch <- t.Value
    Walk__helper(t.Right, ch)
}

func Walk(t *tree.Tree, ch chan int) {
    Walk__helper(t, ch)
    close(ch)
}

func main() {
    ch := make(chan int)
    go Walk(tree.New(1), ch)
    for v := range ch {
        fmt.Println(v)
    }
}
}
Why must I use go Walk(tree.New(1), ch) instead of just Walk(tree.New(1), ch)?
I was under the impression that the go keyword basically spawns a new thread. In that case, we’d run into issues since the for loop might run before the subroutine completes.
Strangely, when I take out the go keyword, I get a deadlock. This is rather counterintuitive to me. What exactly is the go keyword doing here?
The key point here is range when coupled with a channel.
When you range over a channel (in this case, ch), it will wait for items to be sent on the channel before iterating through the loop. This is a safe, "blocking" action that will not deadlock while it waits for the channel to receive an item.
The deadlock occurs when not using a goroutine because your channel isn't buffered. If you don't use a goroutine, then the call is synchronous: Walk puts something on the channel and blocks until that value is received. It never gets received, because the receive (the range loop) only comes after the synchronous call returns.
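Here is a minimal sketch of the same mechanics without the tree package (my own toy example): the synchronous call deadlocks because the only goroutine is stuck in the send and never reaches the receive, while the go version lets main reach the range loop that drains the channel:

package main

import "fmt"

func send(ch chan int) {
    for i := 1; i <= 3; i++ {
        ch <- i // blocks until someone receives
    }
    close(ch)
}

func main() {
    ch := make(chan int)

    // send(ch) // synchronous call: deadlock, the receive below is never reached

    go send(ch) // concurrent call: main is free to move on to the receive
    for v := range ch {
        fmt.Println(v)
    }
}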
I was under the impression that the go keyword basically spawns a new thread
...that is incorrect. There are many more important implementation details required to understand what goes on there. You should separate your thought process of a goroutine from a thread, and just think of a goroutine as a concurrently executing piece of code, without a "thread".