Receive on multiple channels [duplicate] - go

To start an endless loop executing two goroutines, I can use the code below;
after receiving a msg it starts a new goroutine and goes on forever.
c1 := make(chan string)
c2 := make(chan string)
go DoStuff(c1, 5)
go DoStuff(c2, 2)
for {
    select {
    case msg1 := <-c1:
        fmt.Println("received ", msg1)
        go DoStuff(c1, 1)
    case msg2 := <-c2:
        fmt.Println("received ", msg2)
        go DoStuff(c2, 9)
    }
}
I would now like to have the same behavior for N goroutines, but how will the select statement look in that case?
This is the code I have started with, but I am confused about how to code the select statement:
numChans := 2
// I keep the channels in this slice, and want to "loop" over them in the select statement
var chans = []chan string{}
for i := 0; i < numChans; i++ {
    tmp := make(chan string)
    chans = append(chans, tmp)
    go DoStuff(tmp, i+1)
}
// How shall the select statement be coded for this case?
for {
    select {
    case msg1 := <-c1:
        fmt.Println("received ", msg1)
        go DoStuff(c1, 1)
    case msg2 := <-c2:
        fmt.Println("received ", msg2)
        go DoStuff(c2, 9)
    }
}

You can do this using the Select function from the reflect package:
func Select(cases []SelectCase) (chosen int, recv Value, recvOK bool)
Select executes a select operation described by the list of cases. Like
the Go select statement, it blocks until at least one of the cases can
proceed, makes a uniform pseudo-random choice, and then executes that
case. It returns the index of the chosen case and, if that case was a
receive operation, the value received and a boolean indicating whether
the value corresponds to a send on the channel (as opposed to a zero
value received because the channel is closed).
You pass in a slice of SelectCase structs that identify the channel to select on, the direction of the operation, and a value to send in the case of a send operation.
So you could do something like this:
cases := make([]reflect.SelectCase, len(chans))
for i, ch := range chans {
cases[i] = reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(ch)}
}
chosen, value, ok := reflect.Select(cases)
// ok will be true if the channel has not been closed.
ch := chans[chosen]
msg := value.String()
You can experiment with a more fleshed out example here: http://play.golang.org/p/8zwvSk4kjx
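For instance, tying this back to the question's loop, the whole thing could look roughly like the sketch below. It assumes the DoStuff(ch, n) worker from the question (each call eventually sends one string on ch) and that the channels are never closed:
func listen(chans []chan string) {
    cases := make([]reflect.SelectCase, len(chans))
    for i, ch := range chans {
        cases[i] = reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(ch)}
    }
    for {
        chosen, value, ok := reflect.Select(cases)
        if !ok {
            return // a channel was closed; not expected in this scenario
        }
        fmt.Println("received ", value.String())
        go DoStuff(chans[chosen], chosen+1) // start the next worker on the channel that just delivered
    }
}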

You can accomplish this by wrapping each channel in a goroutine which "forwards" messages to a shared "aggregate" channel. For example:
agg := make(chan string)
for _, ch := range chans {
go func(c chan string) {
for msg := range c {
agg <- msg
}
}(ch)
}
select {
case msg := <-agg:
fmt.Println("received ", msg)
}
If you need to know which channel the message originated from, you could wrap it in a struct with any extra information before forwarding it to the aggregate channel.
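For example, a quick sketch of that wrapping (the struct and field names here are illustrative, not from the answer); each forwarding goroutine tags the message with the index of its source channel:
type tagged struct {
    src int    // index of the source channel
    msg string // the forwarded message
}
agg := make(chan tagged)
for i, ch := range chans {
    go func(i int, c chan string) {
        for msg := range c {
            agg <- tagged{src: i, msg: msg}
        }
    }(i, ch)
}
for t := range agg {
    fmt.Printf("received %q from channel %d\n", t.msg, t.src)
}
// Note: agg is never closed here, so the final loop runs forever, matching the
// question's endless-loop scenario.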
In my (limited) testing, this method greatly outperforms using the reflect package:
$ go test dynamic_select_test.go -test.bench=.
...
BenchmarkReflectSelect 1 5265109013 ns/op
BenchmarkGoSelect 20 81911344 ns/op
ok command-line-arguments 9.463s
Benchmark code here

To expand on some comments on the previous answers and to provide a clearer comparison, here is an example of both approaches presented so far, given the same input: a slice of channels to read from and a function to call for each value, where the function also needs to know which channel the value came from.
There are three main differences between the approaches:
Complexity. Although it may partly be a matter of reader preference, I find the channel approach more idiomatic, straightforward, and readable.
Performance. On my Xeon amd64 system the goroutines+channels approach outperforms the reflect solution by about two orders of magnitude (in general, reflection in Go is often slower and should only be used when absolutely required). Of course, if there is any significant delay in either the function processing the results or in the writing of values to the input channels, this performance difference can easily become insignificant.
Blocking/buffering semantics. The importance of this depends on the use case. Most often it either won't matter, or the slight extra buffering in the goroutine merging solution may be helpful for throughput. However, if it is desirable to have the semantics that only a single writer is unblocked and its value fully handled before any other writer is unblocked, then that can only be achieved with the reflect solution.
Note that both approaches can be simplified if either the "id" of the sending channel isn't required or the source channels will never be closed.
Goroutine merging channel:
// Process1 calls `fn` for each value received from any of the `chans`
// channels. The arguments to `fn` are the index of the channel the
// value came from and the string value. Process1 returns once all the
// channels are closed.
func Process1(chans []<-chan string, fn func(int, string)) {
// Setup
type item struct {
int // index of which channel this came from
string // the actual string item
}
merged := make(chan item)
var wg sync.WaitGroup
wg.Add(len(chans))
for i, c := range chans {
go func(i int, c <-chan string) {
// Reads and buffers a single item from `c` before
// we even know if we can write to `merged`.
//
// Go doesn't provide a way to do something like:
// merged <- (<-c)
// atomically, where we delay the read from `c`
// until we can write to `merged`. The read from
// `c` will always happen first (blocking as
// required) and then we block on `merged` (with
// either the above or the below syntax making
// no difference).
for s := range c {
merged <- item{i, s}
}
// If/when this input channel is closed we just stop
// writing to the merged channel and via the WaitGroup
// let it be known there is one fewer channel active.
wg.Done()
}(i, c)
}
// One extra goroutine to watch for all the merging goroutines to
// be finished and then close the merged channel.
go func() {
wg.Wait()
close(merged)
}()
// "select-like" loop
for i := range merged {
// Process each value
fn(i.int, i.string)
}
}
Reflection select:
// Process2 is identical to Process1 except that it uses the reflect
// package to select and read from the input channels which guarantees
// there is only one value "in-flight" (i.e. when `fn` is called only
// a single send on a single channel will have succeeded, the rest will
// be blocked). It is approximately two orders of magnitude slower than
// Process1 (which is still insignificant if there is a significant
// delay between incoming values or if `fn` runs for a significant
// time).
func Process2(chans []<-chan string, fn func(int, string)) {
// Setup
cases := make([]reflect.SelectCase, len(chans))
// `ids` maps the index within cases to the original `chans` index.
ids := make([]int, len(chans))
for i, c := range chans {
cases[i] = reflect.SelectCase{
Dir: reflect.SelectRecv,
Chan: reflect.ValueOf(c),
}
ids[i] = i
}
// Select loop
for len(cases) > 0 {
// A difference here from the merging goroutines is
// that `v` is the only value "in-flight" that any of
// the workers have sent. All other workers are blocked
// trying to send the single value they have calculated
// whereas the goroutine version reads/buffers a single
// extra value from each worker.
i, v, ok := reflect.Select(cases)
if !ok {
// Channel cases[i] has been closed, remove it
// from our slice of cases and update our ids
// mapping as well.
cases = append(cases[:i], cases[i+1:]...)
ids = append(ids[:i], ids[i+1:]...)
continue
}
// Process each value
fn(ids[i], v.String())
}
}
[Full code on the Go playground.]

We actually did some research on this subject and found the best solution for our case. We used reflect.Select for a while and it is a great solution for the problem: it is much lighter than a goroutine per channel and simple to operate. Unfortunately, it doesn't really scale to the massive number of channels we need, so we found something interesting and wrote a blog post about it: https://cyolo.io/blog/how-we-enabled-dynamic-channel-selection-at-scale-in-go/
I'll summarize what is written there:
We statically created batches of select..case statements for every power of two up to 32, along with a function that routes channels to the different batches and aggregates the results through an aggregate channel.
An example of such a batch:
func select4(ctx context.Context, chanz []chan interface{}, res chan *r, r *r, i int) {
select {
case r.v, r.ok = <-chanz[0]:
r.i = i + 0
res <- r
case r.v, r.ok = <-chanz[1]:
r.i = i + 1
res <- r
case r.v, r.ok = <-chanz[2]:
r.i = i + 2
res <- r
case r.v, r.ok = <-chanz[3]:
r.i = i + 3
res <- r
case <-ctx.Done():
break
}
}
And the logic of aggregating the first result from any number of channels using these kinds of select..case batches:
for i < len(channels) {
l = len(channels) - i
switch {
case l > 31 && maxBatchSize >= 32:
go select32(ctx, channels[i:i+32], agg, rPool.Get().(*r), i)
i += 32
case l > 15 && maxBatchSize >= 16:
go select16(ctx, channels[i:i+16], agg, rPool.Get().(*r), i)
i += 16
case l > 7 && maxBatchSize >= 8:
go select8(ctx, channels[i:i+8], agg, rPool.Get().(*r), i)
i += 8
case l > 3 && maxBatchSize >= 4:
go select4(ctx, channels[i:i+4], agg, rPool.Get().(*r), i)
i += 4
case l > 1 && maxBatchSize >= 2:
go select2(ctx, channels[i:i+2], agg, rPool.Get().(*r), i)
i += 2
case l > 0:
go select1(ctx, channels[i], agg, rPool.Get().(*r), i)
i += 1
}
}
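The snippets above reference an r result struct, an rPool, and an agg channel that the blog post defines elsewhere; a plausible shape for them (an assumption for illustration, not copied from the post) is:
// Assumed supporting definitions for the selectN batches above.
type r struct {
    v  interface{} // the received value
    ok bool        // false if the source channel was closed
    i  int         // index of the source channel in the original slice
}

// rPool amortizes allocation of result objects across calls.
var rPool = sync.Pool{New: func() interface{} { return new(r) }}

// agg would then be a chan *r that all the selectN goroutines send their
// single result to.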

Possibly simpler option:
Instead of having an array of channels, why not pass just one channel as a parameter to the functions being run on separate goroutines, and then listen to that channel in a consumer goroutine?
This lets you select on just one channel in your listener, making for a simple select and avoiding the creation of extra goroutines to aggregate messages from multiple channels.
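A minimal sketch of that idea (names are illustrative): every worker writes to the same channel, and a single consumer loop receives from it:
results := make(chan string)
numWorkers := 5
for i := 0; i < numWorkers; i++ {
    go func(id int) {
        // ... do the work, then report on the shared channel ...
        results <- fmt.Sprintf("worker %d done", id)
    }(i)
}
for msg := range results {
    fmt.Println("received ", msg)
}
// results is never closed in this sketch, so the consumer loop runs forever,
// as in the question's endless loop.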

Based on James Henstridge's answer, I made this generic (Go >= 1.18) Select function that takes a context and a slice of channels and returns the index of the chosen channel, the received value, and an error:
func Select[T any](ctx context.Context, chs []chan T) (int, T, error) {
var zeroT T
cases := make([]reflect.SelectCase, len(chs)+1)
for i, ch := range chs {
cases[i] = reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(ch)}
}
cases[len(chs)] = reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(ctx.Done())}
// ok will be true if the channel has not been closed.
chosen, value, ok := reflect.Select(cases)
if !ok {
if ctx.Err() != nil {
return -1, zeroT, ctx.Err()
}
return chosen, zeroT, errors.New("channel closed")
}
if ret, ok := value.Interface().(T); ok {
return chosen, ret, nil
}
return chosen, zeroT, errors.New("failed to cast value")
}
Here is an example on how to use it:
func TestSelect(t *testing.T) {
c1 := make(chan int)
c2 := make(chan int)
c3 := make(chan int)
chs := []chan int{c1, c2, c3}
go func() {
time.Sleep(time.Second)
//close(c2)
c2 <- 42
}()
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
chosen, val, err := Select(ctx, chs)
assert.Equal(t, 1, chosen)
assert.Equal(t, 42, val)
assert.NoError(t, err)
}

Why wouldn't this approach work, assuming that somebody is sending events?
func main() {
numChans := 2
var chans = []chan string{}
for i := 0; i < numChans; i++ {
tmp := make(chan string)
chans = append(chans, tmp)
}
for true {
for i, c := range chans {
select {
case x := <-c:
fmt.Printf("received %d \n", i)
go DoShit(x, i)
default: continue
}
}
}
}
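One caveat with this polling approach: the default: continue branch makes the outer loop spin the CPU whenever all channels are empty. A common mitigation (a sketch, not part of the original snippet) is to sleep briefly after a pass that received nothing:
for {
    got := false
    for i, c := range chans {
        select {
        case x := <-c:
            got = true
            fmt.Printf("received %s from channel %d\n", x, i)
        default:
        }
    }
    if !got {
        time.Sleep(10 * time.Millisecond) // avoid busy-waiting while idle
    }
}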

Related

Deadlock when running two go routine

I'm studying Golang in my free time and I am trying sample exams online to test what I've learned.
I came across this coding exam task but I can't seem to make it work/run without a crash;
I'm getting a fatal error: all goroutines are asleep - deadlock! error. Can anybody help with what I am doing wrong here?
func executeParallel(ch chan<- int, done chan<- bool, functions ...func() int) {
ch <- functions[1]()
done <- true
}
func exampleFunction(counter int) int {
sum := 0
for i := 0; i < counter; i++ {
sum += 1
}
return sum
}
func main() {
expensiveFunction := func() int {
return exampleFunction(200000000)
}
cheapFunction := func() int {return exampleFunction(10000000)}
ch := make(chan int)
done := make(chan bool)
go executeParallel(ch, done, expensiveFunction, cheapFunction)
var isDone = <-done
for result := range ch {
fmt.Printf("Result: %d\n", result)
if isDone {
break;
}
}
}
Your executeParallel function will panic if fewer than 2 functions are provided - and will only run the 2nd function:
ch <- functions[1]() // runtime panic if fewer than 2 functions
I think it should look more like this: running all input functions in parallel and grabbing the first result.
func executeParallel(ch chan<- int, functions ...func() int) {
    for _, fn := range functions {
        fn := fn // so each iteration/goroutine gets the proper value
        go func() {
            select {
            case ch <- fn():
                // first (fastest) worker wins
            default:
                // other workers' results are discarded if the reader has not read them yet;
                // this ensures we don't leak goroutines, since the reader only reads one result
            }
        }()
    }
}
As such there's no need for a done channel - as we just need to read the one and only (quickest) result:
ch := make(chan int, 1) // big enough to capture one result - even if reader is not reading yet
executeParallel(ch, expensiveFunction, cheapFunction)
fmt.Printf("Result: %d\n", <-ch)
https://play.golang.org/p/skXc3gZZmRn
package main
import "fmt"
func executeParallel(ch chan<- int, done chan<- struct{}, functions ...func() int) {
// Only execute the second function [1], if available.
if len(functions) > 1 {
ch <- functions[1]()
}
// Close the done channel to signal the for-select to break and the main returns.
close(done)
}
// example returns the number of iterations for [0..counter-1]
func example(counter int) int {
sum := 0
for i := 0; i < counter; i++ {
sum += 1
}
return sum
// NOTE(SS): This function could just return counter
// to avoid the unnecessary calculation done above.
}
func main() {
var (
cheap = func() int { return example(10000000) }
expensive = func() int { return example(200000000) }
ch = make(chan int)
done = make(chan struct{})
)
// executeParallel is started here on its own goroutine. It takes the ch and
// done channels followed by a variable number of functions; it executes only
// the second function (index 1) and sends its result to ch, which is received
// by the for-select receiver below (case result := <-ch).
go executeParallel(ch, done, expensive, cheap)
for {
select {
// Wait for something to be sent to done or the done channel
// to be closed.
case <-done:
return
// Keep receiving from ch (if something is sent to it)
case result := <-ch:
fmt.Println("Result:", result)
}
}
}
I have commented the code so that it's understandable. As you didn't share the actual exam question, the logic could still be wrong.

Wait for all channels to receive and inform about completion using select [duplicate]


Goroutines channels and "stopping short"

I'm reading/working through Go Concurrency Patterns: Pipelines and cancellation, but I'm having trouble understanding the Stopping short section. We have the following functions:
func sq(in <-chan int) <-chan int {
out := make(chan int)
go func() {
for n := range in {
out <- n * n
}
close(out)
}()
return out
}
func gen(nums ...int) <-chan int {
out := make(chan int)
go func() {
for _, n := range nums {
out <- n
}
close(out)
}()
return out
}
func merge(cs ...<-chan int) <-chan int {
var wg sync.WaitGroup
out := make(chan int, 1) // enough space for the unread inputs
// Start an output goroutine for each input channel in cs. output
// copies values from c to out until c is closed, then calls wg.Done.
output := func(c <-chan int) {
for n := range c {
out <- n
}
wg.Done()
}
wg.Add(len(cs))
for _, c := range cs {
go output(c)
}
// Start a goroutine to close out once all the output goroutines are
// done. This must start after the wg.Add call.
go func() {
wg.Wait()
close(out)
}()
return out
}
func main() {
in := gen(2, 3)
// Distribute the sq work across two goroutines that both read from in.
c1 := sq(in)
c2 := sq(in)
// Consume the first value from output.
out := merge(c1, c2)
fmt.Println(<-out) // 4 or 9
return
// Apparently if we had not set the merge out buffer size to 1
// then we would have a hanging goroutine.
}
Now, if you notice line 2 in merge, it says we make the out chan with buffer size 1, because this is enough space for the unread inputs. However, I'm almost positive that we should allocate a chan with buffer size 2, in accordance with this code sample:
c := make(chan int, 2) // buffer size 2
c <- 1 // succeeds immediately
c <- 2 // succeeds immediately
c <- 3 // blocks until another goroutine does <-c and receives 1
Because this section implies that a chan of buffer size 3 would not block. Can anyone please clarify/assist my understanding?
The program sends two values to the channel out and reads one value from the channel out. One of the values is not received.
If the channel is unbuffered (capacity 0), then one of the sending goroutines will block until the program exits. This is a leak.
If the channel is created with a capacity of 1, then both goroutines can send to the channel and exit. The first value sent to the channel is received by main. The second value remains in the channel.
If the main function does not receive a value from the channel out, then a channel of capacity 2 is required to prevent the goroutines from blocking indefinitely.
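One way to see the leak described above (a sketch, not from the article) is to compare goroutine counts after main's single receive, using the gen, sq, and merge functions from the question:
in := gen(2, 3)
out := merge(sq(in), sq(in))
fmt.Println(<-out)                 // 4 or 9
time.Sleep(100 * time.Millisecond) // give the remaining goroutines time to settle
fmt.Println("goroutines still running:", runtime.NumGoroutine())
// With an unbuffered out channel inside merge, one output goroutine stays
// blocked on out <- n (and the closer goroutine stays blocked in wg.Wait),
// so the count stays elevated; with capacity 1 they can all finish and exit.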

"fan in" - one "fan out" behavior

Say we have three methods to implement "fan in" behavior:
func MakeChannel(tries int) chan int {
ch := make(chan int)
go func() {
for i := 0; i < tries; i++ {
ch <- i
}
close(ch)
}()
return ch
}
func MergeByReflection(channels ...chan int) chan int {
length := len(channels)
out := make(chan int)
cases := make([]reflect.SelectCase, length)
for i, ch := range channels {
cases[i] = reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(ch)}
}
go func() {
for length > 0 {
i, line, opened := reflect.Select(cases)
if !opened {
cases[i].Chan = reflect.ValueOf(nil)
length -= 1
} else {
out <- int(line.Int())
}
}
close(out)
}()
return out
}
func MergeByCode(channels ...chan int) chan int {
length := len(channels)
out := make(chan int)
go func() {
var i int
var ok bool
for length > 0 {
select {
case i, ok = <-channels[0]:
out <- i
if !ok {
channels[0] = nil
length -= 1
}
case i, ok = <-channels[1]:
out <- i
if !ok {
channels[1] = nil
length -= 1
}
case i, ok = <-channels[2]:
out <- i
if !ok {
channels[2] = nil
length -= 1
}
case i, ok = <-channels[3]:
out <- i
if !ok {
channels[3] = nil
length -= 1
}
case i, ok = <-channels[4]:
out <- i
if !ok {
channels[4] = nil
length -= 1
}
}
}
close(out)
}()
return out
}
func MergeByGoRoutines(channels ...chan int) chan int {
var group sync.WaitGroup
out := make(chan int)
for _, ch := range channels {
go func(ch chan int) {
for i := range ch {
out <- i
}
group.Done()
}(ch)
}
group.Add(len(channels))
go func() {
group.Wait()
close(out)
}()
return out
}
type MergeFn func(...chan int) chan int
func main() {
length := 5
tries := 1000000
channels := make([]chan int, length)
fns := []MergeFn{MergeByReflection, MergeByCode, MergeByGoRoutines}
for _, fn := range fns {
sum := 0
t := time.Now()
for i := 0; i < length; i++ {
channels[i] = MakeChannel(tries)
}
for i := range fn(channels...) {
sum += i
}
fmt.Println(time.Since(t))
fmt.Println(sum)
}
}
Results (with 1 CPU, using runtime.GOMAXPROCS(1)):
19.869s (MergeByReflection)
2499997500000
8.483s (MergeByCode)
2499997500000
4.977s (MergeByGoRoutines)
2499997500000
Results (with 2 CPUs, using runtime.GOMAXPROCS(2)):
44.94s (MergeByReflection)
2499997500000
10.853s (MergeByCode)
2499997500000
3.728s (MergeByGoRoutines)
2499997500000
I understand why MergeByReflection is the slowest, but what about the difference between MergeByCode and MergeByGoRoutines?
And why does the select (used via reflection in MergeByReflection and directly in MergeByCode) become slower when we increase the number of CPUs?
Here is a preliminary remark: the channels in your examples are all unbuffered, meaning they will likely block on both send and receive.
In this example, there is almost no processing except channel management. The performance is therefore dominated by synchronization primitives. Actually, there is very little of this code that can be parallelized.
In the MergeByReflection and MergeByCode functions, select is used to listen to multiple input channels, but nothing is done to take into account the output channel (which may therefore block, while an event could be available on one of the input channels).
In the MergeByGoRoutines function, this situation cannot happen: when the output channel blocks, it does not prevent another input channel from being read by another goroutine. There are therefore better opportunities for the runtime to parallelize the goroutines, and less contention on the input channels.
The MergeByReflection code is the slowest because it has the overhead of reflection, and almost nothing can be parallelized.
The MergeByGoRoutines function is the fastest because it reduces contention (less synchronization is needed), and because output contention has a lesser impact on the input performance. It can therefore benefit from a small improvement when running with multiple cores (contrary to the two other methods).
There is so much synchronization activity with MergeByReflection and MergeByCode that running on multiple cores negatively impacts the performance. You could get different performance by using buffered channels, though.
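Regarding that last point, making the merge output buffered is a one-line change to MergeByGoRoutines; a sketch (the right capacity is workload-dependent):
func MergeByGoRoutinesBuffered(channels ...chan int) chan int {
    var group sync.WaitGroup
    out := make(chan int, 64) // buffered: senders block less often on a slow reader
    group.Add(len(channels))  // Add before launching so Done can never precede it
    for _, ch := range channels {
        go func(ch chan int) {
            for i := range ch {
                out <- i
            }
            group.Done()
        }(ch)
    }
    go func() {
        group.Wait()
        close(out)
    }()
    return out
}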

how to listen to N channels? (dynamic select statement)

