Multiple receivers on a single channel. Who gets the data? - go

Unbuffered channels block receivers until data is available on the channel. It's not clear to me how this blocking behaves with multiple receivers on the same channel (say when using goroutines). I am sure they would all block as long as there is no data sent on that channel.
But what happens once I send a single value to that channel? Which receiver/goroutine will get the data and therefore unblock? All of them? The first in line? Random?

A single random (non-deterministic) one will receive it.
See the language spec:
Execution of a "select" statement proceeds in several steps:
For all the cases in the statement, the channel operands of receive operations and the channel and right-hand-side expressions of send
statements are evaluated exactly once, in source order, upon entering
the "select" statement. The result is a set of channels to receive
from or send to, and the corresponding values to send. Any side
effects in that evaluation will occur irrespective of which (if any)
communication operation is selected to proceed. Expressions on the
left-hand side of a RecvStmt with a short variable declaration or
assignment are not yet evaluated.
If one or more of the communications can proceed, a single one that can proceed is chosen via a uniform pseudo-random selection.
Otherwise, if there is a default case, that case is chosen. If there
is no default case, the "select" statement blocks until at least one
of the communications can proceed.
Unless the selected case is the default case, the respective communication operation is executed.
If the selected case is a RecvStmt with a short variable declaration or an assignment, the left-hand side expressions are
evaluated and the received value (or values) are assigned.
The statement list of the selected case is executed.

By default the goroutine communication is synchronous and unbuffered: sends do not complete until there is a receiver to accept the value. There must be a receiver ready to receive data from the channel and then the sender can hand it over directly to the receiver.
So channel send/receive operations block until the other side is ready:
1. A send operation on a channel blocks until a receiver is available for the same channel: if there’s no recipient for the value on ch, no other value can be put in the channel. And the other way around: no new value can be sent in ch when the channel is not empty! So the send operation will wait until ch becomes available again.
2. A receive operation for a channel blocks until a sender is available for the same channel: if there is no value in the channel, the receiver blocks.
This is illustrated in the below example:
package main
import "fmt"
func main() {
ch1 := make(chan int)
go pump(ch1) // pump hangs
fmt.Println(<-ch1) // prints only 0
}
func pump(ch chan int) {
for i:= 0; ; i++ {
ch <- i
}
}
Because there is no receiver the goroutine hangs and print only the first number.
To workaround this we need to define a new goroutine which reads from the channel in an infinite loop.
func receive(ch chan int) {
for {
fmt.Println(<- ch)
}
}
Then in main():
func main() {
ch := make(chan int)
go pump(ch)
receive(ch)
}

If the program is allowing multiple goroutines to receive on a single channel then the sender is broadcasting. Each receiver should be equally able to process the data. So it does not matter what mechanism the go runtime uses to decide which of the many goroutine receivers will run Cf. https://github.com/golang/go/issues/247. But only ONE will run for each sent item if the channel is unbuffered.

There have been some discussion about this
But what is established in the Go Memory Model is that it will be at most one of them.
Each send on a particular channel is matched to a corresponding receive from that channel, usually in a different goroutine.
That isn't as clear cut as I would like, but later down they give this example of a semaphore implementation
var limit = make(chan int, 3)
func main() {
for _, w := range work {
go func(w func()) {
limit <- 1
w()
// if it were possible for more than one channel to receive
// from a single send, it would be possible for this to release
// more than one "lock", making it an invalid semaphore
// implementation
<-limit
}(w)
}
select{}
}

Related

When to use a finilizer to close a channel?

This is the second of two questions (this is the first one) to help make sense of the Go generics proposal examples.
In particular I am having trouble-so far-understanding two bits of code from the examples section of the proposal entitled "Channels":
The second issue I have is in the following definition of the the Ranger function.
Namely, I don't understand the need to call runtime.SetFinalizer(r,r.finalize) where in fact what the finalize) method of the *Receiver[T] type is supposed to do is simply to signal that the receiver is done receiving values (close(r.done)).
The way I see it, by providing a finalizer for a *Receiver[T] the code is delegating the obligation to close the receiver to the runtime.
The way I understand this piece of code, is that the *Receiver[T] signals to the *Sender[T] that it won't be receiving any more values when the GC decides that the former is unreachable ie no more references are available to it.
If my interpretation is correct, why wait that long for the receiver to signal it's done? Is't it possible, to explicitly handle the close operation in the code somehow?
Thanks.
Code:
// Ranger provides a convenient way to exit a goroutine sending values
// when the receiver stops reading them.
//
// Ranger returns a Sender and a Receiver. The Receiver provides a
// Next method to retrieve values. The Sender provides a Send method
// to send values and a Close method to stop sending values. The Next
// method indicates when the Sender has been closed, and the Send
// method indicates when the Receiver has been freed.
func Ranger[T any]() (*Sender[T], *Receiver[T]) {
c := make(chan T)
d := make(chan bool)
s := &Sender[T]{values: c, done: d}
r := &Receiver[T]{values: c, done: d}
// The finalizer on the receiver will tell the sender
// if the receiver stops listening.
runtime.SetFinalizer(r, r.finalize)
return s, r
}
// A Sender is used to send values to a Receiver.
type Sender[T any] struct {
values chan<- T
done <-chan bool
}
// Send sends a value to the receiver. It reports whether any more
// values may be sent; if it returns false the value was not sent.
func (s *Sender[T]) Send(v T) bool {
select {
case s.values <- v:
return true
case <-s.done:
// The receiver has stopped listening.
return false
}
}
// Close tells the receiver that no more values will arrive.
// After Close is called, the Sender may no longer be used.
func (s *Sender[T]) Close() {
close(s.values)
}
// A Receiver receives values from a Sender.
type Receiver[T any] struct {
values <-chan T
done chan<- bool
}
// Next returns the next value from the channel. The bool result
// reports whether the value is valid. If the value is not valid, the
// Sender has been closed and no more values will be received.
func (r *Receiver[T]) Next() (T, bool) {
v, ok := <-r.values
return v, ok
}
// finalize is a finalizer for the receiver.
// It tells the sender that the receiver has stopped listening.
func (r *Receiver[T]) finalize() {
close(r.done)
}
TLDR: Your understanding is correct, the done channel may simply be closed by the receiver "manually" to signal the lost of interest (to stop the communication and relieve the sender from its duty).
Channels are used for goroutines to communicate in a concurrency safe manner. The idiomatic use is that the sender party keeps sending values, and once there are no more values to send, it is signaled by the sender closing the channel.
The receiver party keeps receiving from the channel until it is closed, which signals there won't be (can't be) any more values coming on the channel. This is usually / easiest done using a for range over the channel.
So usually the receiver has to keep receiving until the channel is closed, else the sender party would get blocked forever. Often this is OK / sufficient.
The demonstrated Ranger() construct is for the non-general case when there's need / possibility for the receiver to stop the communication.
A single channel does not provide a mean for the receiver party to signal the sender that the receiver has lost interest, and no more values are needed. This requires an additional channel which the receiver has to close (and the sender has to monitor of course). As long as there's a single receiver, this is also OK. But if there are multiple receivers, closing the done channel gets a little more complicated: it's not OK for all the receivers to close the done channel: closing an already closed channel panics. So the receivers also have to be coordinated, so only a single receiver, or rather the coordinator party itself closes the done channel, once only; and this has to happen after all receivers "abandoned" the channel.
Ranger() helps with this, and in a simple way by delegating closing the done channel using a finalizer. This is acceptable because usually it wouldn't even be the receiver(s) task to stop the communication, but in the rare case if this still arises, it will be dealt with (in an easy way, without the need of an additional, coordinator goroutine).

Is there a resource leak here?

func First(query string, replicas ...Search) Result {
c := make(chan Result)
searchReplica := func(i int) {
c <- replicas[i](query)
}
for i := range replicas {
go searchReplica(i)
}
return <-c
}
This function is from the slides of Rob Pike on go concurrency patterns in 2012. I think there is a resource leak in this function. As the function return after the first send & receive pair happens on channel c, the other go routines try to send on channel c. So there is a resource leak here. Anyone knows golang well can confirm this? And how can I detect this leak using what kind of golang tooling?
Yes, you are right (for reference, here's the link to the slide). In the above code only one launched goroutine will terminate, the rest will hang on attempting to send on channel c.
Detailing:
c is an unbuffered channel
there is only a single receive operation, in the return statement
A new goroutine is launched for each element of replicas
each launched goroutine sends a value on channel c
since there is only 1 receive from it, one goroutine will be able to send a value on it, the rest will block forever
Note that depending on the number of elements of replicas (which is len(replicas)):
if it's 0: First() would block forever (no one sends anything on c)
if it's 1: would work as expected
if it's > 1: then it leaks resources
The following modified version will not leak goroutines, by using a non-blocking send (with the help of select with default branch):
searchReplica := func(i int) {
select {
case c <- replicas[i](query):
default:
}
}
The first goroutine ready with the result will send it on channel c which will be received by the goroutine running First(), in the return statement. All other goroutines when they have the result will attempt to send on the channel, and "seeing" that it's not ready (send would block because nobody is ready to receive from it), the default branch will be chosen, and thus the goroutine will end normally.
Another way to fix it would be to use a buffered channel:
c := make(chan Result, len(replicas))
And this way the send operations would not block. And of course only one (the first sent) value will be received from the channel and returned.
Note that the solution with any of the above fixes would still block if len(replicas) is 0. To avoid that, First() should check this explicitly, e.g.:
func First(query string, replicas ...Search) Result {
if len(replicas) == 0 {
return Result{}
}
// ...rest of the code...
}
Some tools / resources to detect leaks:
https://github.com/fortytw2/leaktest
https://github.com/zimmski/go-leak
https://medium.com/golangspec/goroutine-leak-400063aef468
https://blog.minio.io/debugging-go-routine-leaks-a1220142d32c

Confused about channel argument for Goroutines

I am learning about channels and concurrency in Go and I am confused on how the code below works.
package main
import (
"fmt"
"time"
"sync/atomic"
)
var workerID int64
var publisherID int64
func main() {
input := make(chan string)
go workerProcess(input)
go workerProcess(input)
go workerProcess(input)
go publisher(input)
go publisher(input)
go publisher(input)
go publisher(input)
time.Sleep(1 * time.Millisecond)
}
// publisher pushes data into a channel
func publisher(out chan string) {
atomic.AddInt64(&publisherID, 1)
thisID := atomic.LoadInt64(&publisherID)
dataID := 0
for {
dataID++
fmt.Printf("publisher %d is pushing data\n", thisID)
data := fmt.Sprintf("Data from publisher %d. Data %d", thisID, dataID)
out <- data
}
}
func workerProcess(in <-chan string) {
atomic.AddInt64(&workerID, 1)
thisID := atomic.LoadInt64(&workerID)
for {
fmt.Printf("%d: waiting for input...\n", thisID)
input := <-in
fmt.Printf("%d: input is: %s\n", thisID, input)
}
}
This is what I understand please correct me if I'm wrong:
workerProcess Goroutine:
Input channel taken as the argument which is assigned as the in channel.
In the for loop, the first printf is executed.
The value off the in channel is assigned to the variable input.
The last printf is executed to show the value of thisID and input.(This doesn't really execute till after other Goroutines run).
publisher Goroutine:
Input channel taken as the argument which is assigned as the out channel.
In the for loop, dataID is incremented then first printf is executed.
The string is assigned to data.
The value of data is passed to out.
Questions:
Where is the value from out getting passed to in the publisher Goroutine? It seems to be only in the scope of the for loop, wouldn't this cause deadlock. Since this is an unbuffered channel.
I don't get how the workerProcess gets the data from publisher if
all the Goroutines in main have the arguments as the channel input.
I'm used to having code written like this if the output of one function
is used in another:
foo = fcn1(input)
fcn2(foo)
I suspect it has something with how Goroutines run in the background but I'm not to sure I would appreciate an explanation.
Why isn't the last printf statement in workerProcess not executed? My guess is the channel is empty, so it's waiting for a value.
Part of the output:
1: waiting for input...
publisher 1 is pushing data
publisher 1 is pushing data
1: input is: Data from publisher 1. Data 1
1: waiting for input...
1: input is: Data from publisher 1. Data 2
1: waiting for input...
publisher 1 is pushing data
publisher 1 is pushing data
publisher 2 is pushing data
2: waiting for input..
You have one channel, it is made with make in main.
This one single channel is named in in workerProcess and out in publisher. out is a function argument to publisher which answers question 1.
Question 2: That's the whole purpose of a channel. What you stuff into a channel on its input side comes out at its output side. If a function has reference to (one end of) such a channel it may communicate with someone else having reference to the same channel (its other end). What producer send to the channel is received by workerProcess. This send and receive is done through the special operator <- on Go. A fact you got slightly wrong in your explanation.
out <- data takes data and sends it through the channel named out until <- in reads it from the channel (remember in and out name the same channel from main). That's how workerProcess and publisher communicate.
Question 3 is a duplicate. Your whole program terminates once main is done (in your case after 1 millisecond. Nothing happens after termination of the program. Give the program more time to execute. (The non-printing is unrelated to the channel).

How to block all goroutines except the one running

I have two (but later I'll be three) go routines that are handling incoming messages from a remote server (from a ampq channel). But because they are handling on the same data/state, I want to block all other go routines, except the one running.
I come up with a solution to use chan bool where each go routine blocks and then release it, the code is like:
package main
func a(deliveries <-chan amqp, handleDone chan bool) {
for d := range deliveries {
<-handleDone // Data comes always, wait for other channels
handleDone <- false // Block other channels
// Do stuff with data...
handleDone <- true // I'm done, other channels are free to do anything
}
}
func b(deliveries <-chan amqp, handleDone chan bool) {
for d := range deliveries {
<-handleDone
handleDone <- false
// Do stuff with data...
handleDone <- true
}
}
func main() {
handleDone := make(chan bool, 1)
go a(arg1, handleDone)
go b(arg2, handleDone)
// go c(arg3, handleDone) , later
handleDone <- true // kickstart
}
But for the first time each of the function will get handleDone <- true, which they will be executed. And later if I add another third function, things will get more complicated. How can block all other go routines except the running? Any other better solutions?
You want to look at the sync package.
http://golang.org/pkg/sync/
You would do this with a mutex.
If you have an incoming stream of messages and you have three goroutines listening on that stream and processing and you want to ensure that only one goroutine is running at a time, the solution is quite simple: kill off two of the goroutines.
You're spinning up concurrency and adding complexity and then trying to prevent them from running concurrently. The end result is the same as a single stream reader, but with lots of things that can go wrong.
I'm puzzled why you want this - why can't each message on deliveries be handled independently? and why are there two different functions handling those message? If each is responsible for a particular type of message, it seems like you want one deliveries receiver that dispatches to appropriate logic for the type.
But to answer your question, I don't think it's true that each function will get a true from handleDone on start. One (let's say it's a) is receiving the true sent from main; the other (b then) is getting the false sent from the first. Because you're discarding the value received, you can't tell this. And then both are running, and you're using a buffered channel (you probably want make(chan bool) instead for an unbuffered one), so confusion ensues, particularly when you add that third goroutine.
The handleDone <- false doesn't actually accomplish anything. Just treat any value on handleDone as the baton in a relay race. Once a goroutine receives this value, it can do its thing; when it's done, it should send it to the channel to hand it to the next goroutine.

More idiomatic way of adding channel result to queue on completion

So, right now, I just pass a pointer to a Queue object (implementation doesn't really matter) and call queue.add(result) at the end of goroutines that should add things to the queue.
I need that same sort of functionality—and of course doing a loop checking completion with the comma ok syntax is unacceptable in terms of performance versus the simple queue add function call.
Is there a way to do this better, or not?
There are actually two parts to your question: how does one queue data in Go, and how does one use a channel without blocking.
For the first part, it sounds like what you need to do is instead of using the channel to add things to the queue, use the channel as a queue. For example:
var (
ch = make(chan int) // You can add an int parameter to this make call to create a buffered channel
// Do not buffer these channels!
gFinished = make(chan bool)
processFinished = make(chan bool)
)
func f() {
go g()
for {
// send values over ch here...
}
<-gFinished
close(ch)
}
func g() {
// create more expensive objects...
gFinished <- true
}
func processObjects() {
for val := range ch {
// Process each val here
}
processFinished <- true
}
func main() {
go processObjects()
f()
<-processFinished
}
As for how you can make this more asynchronous, you can (as cthom06 pointed out) pass a second integer to the make call in the second line which will make send operations asynchronous until the channel's buffer is full.
EDIT: However (as cthom06 also pointed out), because you have two goroutines writing to the channel, one of them has to be responsible for closing the channel. Also, my previous revision would exit before processObjects could complete. The way I chose to synchronize the goroutines is by creating a couple more channels that pass around dummy values to ensure that the cleanup gets finished properly. Those channels are specifically unbuffered so that the sends happen in lock-step.

Resources