I am new to Go and I am trying to create a simple chat server where clients can broadcast messages to all connected clients.
In my server, I have a goroutine (an infinite for loop) that accepts connections and sends each accepted connection into a channel.
go func() {
for {
conn, _ := listener.Accept()
ch <- conn
}
}()
Then, I start a handler (goroutine) for every connected client. Inside the handler, I try to broadcast to all connections by iterating through the channel.
for c := range ch {
c.Write(msg)
}
However, I cannot broadcast because (I think from reading the docs) the channel needs to be closed before iterating. I am not sure when I should close the channel because I want to continuously accept new connections and closing the channel won't let me do that. If anyone can help me, or provide a better way to broadcast messages to all connected clients, it would be appreciated.
What you are doing is a fan-out pattern: multiple listeners receive from a single input source. With this pattern, whenever a message arrives on the input source, only one of the listeners gets it. The only exception is closing the channel: the close is observed by every listener, and thus behaves like a "broadcast".
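To see the difference concretely, here is a minimal sketch (not from the original answer): each value sent on a shared channel is received by exactly one of the listeners, while the close is observed by all of them.
ch := make(chan int)
var wg sync.WaitGroup
for i := 0; i < 3; i++ {
    wg.Add(1)
    go func(id int) {
        defer wg.Done()
        for v := range ch { // each value goes to exactly one of the three listeners
            fmt.Println("listener", id, "got", v)
        }
        fmt.Println("listener", id, "saw the close") // every listener observes the close
    }(i)
}
ch <- 1
ch <- 2
close(ch)
wg.Wait()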
But what you want to do is broadcast a message read from one connection to all the clients, so we could do something like this:
When the number of listeners is known
Let each worker listen on its own dedicated broadcast channel, and dispatch every message from the main channel to each of these dedicated channels.
type worker struct {
source chan interface{}
quit chan struct{}
}
func (w *worker) Start() {
w.source = make(chan interface{}, 10) // some buffer size to avoid blocking
go func() {
for {
select {
case msg := <-w.source:
// do something with msg
case <-w.quit: // will explain this in the last section
return
}
}
}()
}
And then we could have a bunch of workers:
workers := []*worker{&worker{}, &worker{}}
for _, worker := range workers { worker.Start() }
Then start our listener:
go func() {
for {
conn, _ := listener.Accept()
ch <- conn
}
}()
And a dispatcher:
go func() {
for {
msg := <- ch
for _, worker := range workers {
worker.source <- msg
}
}
}()
When the number of listeners is not known
In this case, the solution given above still works. The only difference is that whenever you need a new worker, you create it, start it, and push it into the workers slice. But this requires a thread-safe slice, i.e. one protected by a lock. One possible implementation looks like this:
type threadSafeSlice struct {
sync.Mutex
workers []*worker
}
func (slice *threadSafeSlice) Push(w *worker) {
slice.Lock()
defer slice.Unlock()
slice.workers = append(slice.workers, w)
}
func (slice *threadSafeSlice) Iter(routine func(*worker)) {
slice.Lock()
defer slice.Unlock()
for _, worker := range slice.workers {
routine(worker)
}
}
Whenever you want to start a worker (workers now being a *threadSafeSlice value):
w := &worker{}
w.Start()
workers.Push(w)
And your dispatcher will be changed to:
go func() {
for {
msg := <- ch
workers.Iter(func(w *worker) { w.source <- msg })
}
}()
Last words: never leave a dangling goroutine
One good practice is: never leave a dangling goroutine. So when you have finished listening, you need to shut down all of the goroutines you fired. This is done via the quit channel in worker:
First we need to create a global quit signalling channel:
globalQuit := make(chan struct{})
And whenever we create a worker, we assign the globalQuit channel to it as its quit signal:
worker.quit = globalQuit
Then when we want to shutdown all workers, we simply do:
close(globalQuit)
Since the close is observed by all listening goroutines (this is the point you already understood), all of them will return. Remember to shut down your dispatcher routine as well; a sketch follows.
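For example, a hedged sketch of such a dispatcher (assembled from the pieces above; not spelled out in the original answer) selects on both the message channel and globalQuit:
go func() {
    for {
        select {
        case msg := <-ch:
            workers.Iter(func(w *worker) { w.source <- msg })
        case <-globalQuit:
            return
        }
    }
}()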
A more elegant solution is a "broker", where clients may subscribe and unsubscribe to messages.
To handle subscribing and unsubscribing elegantly as well, we can use channels for those operations too, so the broker's main loop, which receives and distributes the messages, can handle everything with a single select statement, and synchronization follows from the nature of the solution.
Another trick is to store the subscribers in a map, keyed by the channel we use to deliver messages to them. Using the channel as the map key makes adding and removing clients dead simple. This is possible because channel values are comparable, and their comparison is very efficient: channel values are simply pointers to channel descriptors.
Without further ado, here's a simple broker implementation:
type Broker[T any] struct {
stopCh chan struct{}
publishCh chan T
subCh chan chan T
unsubCh chan chan T
}
func NewBroker[T any]() *Broker[T] {
return &Broker[T]{
stopCh: make(chan struct{}),
publishCh: make(chan T, 1),
subCh: make(chan chan T, 1),
unsubCh: make(chan chan T, 1),
}
}
func (b *Broker[T]) Start() {
subs := map[chan T]struct{}{}
for {
select {
case <-b.stopCh:
return
case msgCh := <-b.subCh:
subs[msgCh] = struct{}{}
case msgCh := <-b.unsubCh:
delete(subs, msgCh)
case msg := <-b.publishCh:
for msgCh := range subs {
// msgCh is buffered, use non-blocking send to protect the broker:
select {
case msgCh <- msg:
default:
}
}
}
}
}
func (b *Broker[T]) Stop() {
close(b.stopCh)
}
func (b *Broker[T]) Subscribe() chan T {
msgCh := make(chan T, 5)
b.subCh <- msgCh
return msgCh
}
func (b *Broker[T]) Unsubscribe(msgCh chan T) {
b.unsubCh <- msgCh
}
func (b *Broker[T]) Publish(msg T) {
b.publishCh <- msg
}
Example using it:
func main() {
// Create and start a broker:
b := NewBroker[string]()
go b.Start()
// Create and subscribe 3 clients:
clientFunc := func(id int) {
msgCh := b.Subscribe()
for {
fmt.Printf("Client %d got message: %v\n", id, <-msgCh)
}
}
for i := 0; i < 3; i++ {
go clientFunc(i)
}
// Start publishing messages:
go func() {
for msgId := 0; ; msgId++ {
b.Publish(fmt.Sprintf("msg#%d", msgId))
time.Sleep(300 * time.Millisecond)
}
}()
time.Sleep(time.Second)
}
Output of the above will be (try it on the Go Playground):
Client 2 got message: msg#0
Client 0 got message: msg#0
Client 1 got message: msg#0
Client 2 got message: msg#1
Client 0 got message: msg#1
Client 1 got message: msg#1
Client 1 got message: msg#2
Client 2 got message: msg#2
Client 0 got message: msg#2
Client 2 got message: msg#3
Client 0 got message: msg#3
Client 1 got message: msg#3
Improvements
You may consider the following improvements. These may or may not be useful depending on how, and for what, you use the broker.
Broker.Unsubscribe() may close the message channel, signalling that no more messages will be sent on it:
func (b *Broker[T]) Unsubscribe(msgCh chan T) {
b.unsubCh <- msgCh
close(msgCh)
}
This would allow clients to range over the message channel, like this:
msgCh := b.Subscribe()
for msg := range msgCh {
fmt.Printf("Client %d got message: %v\n", id, msg)
}
Then if someone unsubscribes this msgCh like this:
b.Unsubscribe(msgCh)
The above range loop will terminate after processing all messages that were sent before the call to Unsubscribe().
If you want your clients to rely on the message channel being closed, and the broker's lifetime is narrower than your app's lifetime, then you could also close all subscribed message channels when the broker is stopped, in the Start() method, like this:
case <-b.stopCh:
for msgCh := range subs {
close(msgCh)
}
return
Broadcasting to a slice of channels, with a sync.Mutex to manage adding and removing channels, may be the easiest way in your case.
Here are some ways you can broadcast in Go:
You can broadcast a shared status change with sync.Cond. This approach allocates nothing once set up, but you cannot add timeout functionality or combine it with other channels.
You can broadcast a shared status change by closing the old channel and creating a new one, guarded by a sync.Mutex. This costs one allocation per status change, but you can add timeout functionality and combine it with other channels (a minimal sketch of this follows the list).
You can broadcast to a slice of callback functions and use a sync.Mutex to manage them; each caller can then do its own channel work. This costs more than one allocation per caller and works with other channels.
You can broadcast to a slice of channels and use a sync.Mutex to manage them. This also costs more than one allocation per caller and works with other channels.
You can broadcast to a slice of sync.WaitGroup and use a sync.Mutex to manage them.
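To illustrate the second option, here is a minimal, hedged sketch (the StatusBroadcaster name and its methods are illustrative, not from the answer above) of broadcasting a status change by closing the current channel and swapping in a fresh one:
type StatusBroadcaster struct {
    mu sync.Mutex
    ch chan struct{}
}
func NewStatusBroadcaster() *StatusBroadcaster {
    return &StatusBroadcaster{ch: make(chan struct{})}
}
// Wait returns a channel that will be closed on the next broadcast,
// so callers can select on it together with timeouts or other channels.
func (b *StatusBroadcaster) Wait() <-chan struct{} {
    b.mu.Lock()
    defer b.mu.Unlock()
    return b.ch
}
// Broadcast wakes every current waiter by closing the old channel,
// then allocates a fresh one for future waiters (one allocation per change).
func (b *StatusBroadcaster) Broadcast() {
    b.mu.Lock()
    defer b.mu.Unlock()
    close(b.ch)
    b.ch = make(chan struct{})
}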
This is a late answer, but I think it may satisfy some curious readers.
Go channels are widely embraced when it comes to concurrency.
The Go community tends to follow this saying rigidly:
Do not communicate by sharing memory; instead, share memory by communicating.
I am completely neutral toward this, and I think options other than channels deserve consideration when it comes to broadcasting.
Here is my take: Cond from the sync package is widely overlooked. Implementing a broadcaster with it, as suggested by Bronze man in this very same context, is worth noting.
I was delighted with icza's suggestion to use channels and broadcast messages over them. I follow the same approach, but use sync's condition variable:
// Broadcaster is the struct which encompasses broadcasting
type Broadcaster struct {
cond *sync.Cond
subscribers map[interface{}]func(interface{})
message interface{}
running bool
}
This is the main struct that our whole broadcasting concept relies on.
Below, I define some behaviour for this struct. In a nutshell, subscribers should be able to be added and removed, and the whole process should be revocable.
// SetupBroadcaster gives the broadcaster object to be used further in messaging
func SetupBroadcaster() *Broadcaster {
return &Broadcaster{
cond: sync.NewCond(&sync.RWMutex{}),
subscribers: map[interface{}]func(interface{}){},
}
}
// Subscribe lets others enroll in the broadcast!
func (b *Broadcaster) Subscribe(id interface{}, f func(input interface{})) {
b.cond.L.Lock()
b.subscribers[id] = f
b.cond.L.Unlock()
}
// Unsubscribe stops the broadcaster from delivering messages to the given subscriber
func (b *Broadcaster) Unsubscribe(id interface{}) {
b.cond.L.Lock()
delete(b.subscribers, id)
b.cond.L.Unlock()
}
// Publish publishes the message
func (b *Broadcaster) Publish(message interface{}) {
go func() {
b.cond.L.Lock()
b.message = message
b.cond.Broadcast()
b.cond.L.Unlock()
}()
}
// Start the main broadcasting event
func (b *Broadcaster) Start() {
b.running = true
for b.running {
b.cond.L.Lock()
b.cond.Wait()
// copy the message and the subscriber list while holding the lock to avoid data races
msg := b.message
subs := make([]func(interface{}), 0, len(b.subscribers))
for _, f := range b.subscribers {
subs = append(subs, f)
}
b.cond.L.Unlock()
go func() {
for _, f := range subs {
f(msg) // publishes the message
}
}()
}
}
// Stop broadcasting event
func (b *Broadcaster) Stop() {
b.running = false
}
Next, I can use it quite easily:
messageToaster := func(message interface{}) {
fmt.Printf("[New Message]: %v\n", message)
}
unwillingReceiver := func(message interface{}) {
fmt.Println("Do not disturb!")
}
broadcaster := SetupBroadcaster()
broadcaster.Subscribe(1, messageToaster)
broadcaster.Subscribe(2, messageToaster)
broadcaster.Subscribe(3, unwillingReceiver)
go broadcaster.Start()
broadcaster.Publish("Hello!")
time.Sleep(time.Second)
broadcaster.Unsubscribe(3)
broadcaster.Publish("Goodbye!")
It should print something like this in any order:
[New Message]: Hello!
Do not disturb!
[New Message]: Hello!
[New Message]: Goodbye!
[New Message]: Goodbye!
See this on go playground
Another simple example:
https://play.golang.org
type Broadcaster struct {
mu sync.Mutex
clients map[int64]chan struct{}
}
func NewBroadcaster() *Broadcaster {
return &Broadcaster{
clients: make(map[int64]chan struct{}),
}
}
func (b *Broadcaster) Subscribe(id int64) (<-chan struct{}, error) {
b.mu.Lock()
defer b.mu.Unlock()
s := make(chan struct{}, 1)
if _, ok := b.clients[id]; ok {
return nil, fmt.Errorf("signal %d already exist", id)
}
b.clients[id] = s
return b.clients[id], nil
}
func (b *Broadcaster) Unsubscribe(id int64) {
b.mu.Lock()
defer b.mu.Unlock()
if _, ok := b.clients[id]; ok {
close(b.clients[id])
}
delete(b.clients, id)
}
func (b *Broadcaster) broadcast() {
b.mu.Lock()
defer b.mu.Unlock()
for k := range b.clients {
if len(b.clients[k]) == 0 {
b.clients[k] <- struct{}{}
}
}
}
type testClient struct {
name string
signal <-chan struct{}
signalID int64
brd *Broadcaster
}
func (c *testClient) doWork() {
i := 0
for range c.signal {
fmt.Println(c.name, "do work", i)
if i > 2 {
c.brd.Unsubscribe(c.signalID)
fmt.Println(c.name, "unsubscribed")
}
i++
}
fmt.Println(c.name, "done")
}
func main() {
var err error
brd := NewBroadcaster()
clients := make([]*testClient, 0)
for i := 0; i < 3; i++ {
c := &testClient{
name: fmt.Sprint("client:", i),
signalID: time.Now().UnixNano()+int64(i), // +int64(i) for play.golang.org
brd: brd,
}
c.signal, err = brd.Subscribe(c.signalID)
if err != nil {
log.Fatal(err)
}
clients = append(clients, c)
}
for i := 0; i < len(clients); i++ {
go clients[i].doWork()
}
for i := 0; i < 6; i++ {
brd.broadcast()
time.Sleep(time.Second)
}
}
output:
client:0 do work 0
client:2 do work 0
client:1 do work 0
client:2 do work 1
client:0 do work 1
client:1 do work 1
client:2 do work 2
client:0 do work 2
client:1 do work 2
client:2 do work 3
client:2 unsubscribed
client:2 done
client:0 do work 3
client:0 unsubscribed
client:0 done
client:1 do work 3
client:1 unsubscribed
client:1 done
Because Go channels follow the Communicating Sequential Processes (CSP) pattern, channels are a point-to-point communication entity. There is always one writer and one reader involved in each exchange.
However, each channel end can be shared amongst multiple goroutines. This is safe to do - there is no dangerous race condition.
So there can be multiple writers sharing the writing end. And/or there can be multiple readers sharing the reading end. I wrote more on this in a different answer, which includes examples.
If you really need a broadcast, you cannot do this directly, but it is not hard to implement an intermediate goroutine that copies a value out to each of a group of output channels.
The canonical (and idiomatic go) way to do this is via a slice of channels, as recommended above by Nevets and icza.
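A minimal sketch of such an intermediate goroutine (the fanOut name and its signature are illustrative, not taken from the answers above):
// fanOut copies every value from in to each of the outs, and closes the outs
// once in is closed. A slow receiver blocks the copy loop, so in real use the
// out channels should be buffered (or the send made non-blocking).
func fanOut[T any](in <-chan T, outs []chan T) {
    go func() {
        for v := range in {
            for _, out := range outs {
                out <- v
            }
        }
        for _, out := range outs {
            close(out)
        }
    }()
}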
You should specifically not use a slice of callbacks. In some languages you do typically register observers by passing a callback, but in those cases you have to wrap their invocation in a fair amount of defensive code to protect the sender, and ideally the generator of the message (the "Subject" in classic Observer pattern discussions) should be separated from the observers by an intermediate message transport layer. This is where you typically use a pub-sub mesh (JMS brokers, NATS, MQ, whatever) when you're crossing process boundaries, but you should adhere to the same pattern when both subject and observers are internal to the same process (and most languages have implementations of such mechanisms available, so you shouldn't need to roll your own).
The reasons not to use callbacks include:
Unless you build in your own message transport layer, your subject is no longer both naive (it doesn't know the nature or cardinality of the observers) and disinterested (it doesn't care what they do with the message, only that it is made available to any interested parties);
If you want true broadcasting, then you need to act as if the order of receipt does not matter - ideally, everyone can see the message at the same time, even though in practice sending is iterative, even when using channels. But sending to recipient n+1 should absolutely not depend on confirmation of receipt by recipient n. That isn't broadcasting, it's serialized assignment. I say assignment because, if you are asking for a callback, then in executing the callback, you are enforcing (even if only minimally) some behavior to be taken by the recipient. You've basically turned your sender into an orchestrator, which is a very different sort of pattern with a different set of use cases.
Absent a defensive boundary (e.g. wrapping each callback invocation in a separate goroutine with a timeout context, as sketched below), you are vulnerable to being blocked by a recipient, which is antithetical to broadcasting. Receiving a broadcast message (and, optionally, taking any action at all based on it) must be entirely asynchronous with respect to the original sending.
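To make that concrete, here is a hedged sketch of such a defensive boundary (the notifyAll name and the 100 ms timeout are illustrative assumptions, not from the answer): each callback runs in its own goroutine, and the sender waits at most the timeout before moving on.
func notifyAll(ctx context.Context, callbacks []func(string), msg string) {
    for _, cb := range callbacks {
        cb := cb
        done := make(chan struct{})
        go func() {
            defer close(done)
            cb(msg)
        }()
        select {
        case <-done: // the callback finished in time
        case <-time.After(100 * time.Millisecond): // stop waiting; the callback keeps running
        case <-ctx.Done(): // the sender itself was cancelled
            return
        }
    }
}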
Is it doable to provide pseudo-broadcasting by using callbacks in go? Sure, but you have to invest in so much additional complexity to keep things clean - and why would you do that when go provides an easy and rather robust way to do it? The examples of channel-driven broadcasting above are good ones and how you should do it pretty much every time.
The specific exception when you absolutely should use callbacks is when you are not disinterested - you really do care that, on the basis of the sent message, the recipients take some action (and usually something specified by contract). For example, "I am about to unmount this filesystem, so flush and close your filehandles, let me know once you're done." (I know that's a pretty old-fashioned example, but it's the first one that comes to mind.)
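For that exception, a hedged sketch (illustrative names, not from the answer) of a sender that really does wait for every registered handler to fulfil its contract before proceeding:
func prepareUnmount(handlers []func() error) error {
    var wg sync.WaitGroup
    errs := make(chan error, len(handlers))
    for _, h := range handlers {
        h := h
        wg.Add(1)
        go func() {
            defer wg.Done()
            if err := h(); err != nil {
                errs <- err // e.g. a filehandle failed to flush or close
            }
        }()
    }
    wg.Wait()
    close(errs)
    for err := range errs {
        return err // report the first failure; real code might collect all of them
    }
    return nil
}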
Related
Hi I'm having a problem with a control channel (of sorts).
The essence of my program:
I do not know how many go routines I will be running at runtime
I will need to restart these go routines at set times; however, they could also potentially error out (and then be restarted), so their timing will not be predictable.
These go routines will be putting messages onto a single channel.
So What I've done is created a simple random message generator to put messages onto a channel.
When the timer is up (random duration for testing) I put a message onto a control channel which is a struct payload, so I know there was a close signal and which go routine it was; in reality I'd then do some other stuff I'd need to do before starting the go routines again.
My problem is:
I receive the control message within my reflect.Select loop
I do not (or am unable to) receive it in my randmsgs() loop
Therefore I can not stop my randmsgs() go routine.
I believe I'm right in understanding that multiple go routines can read from a single channel, therefore I think I'm misunderstanding how reflect.SelectCases fit into all of this.
My code:
package main
import (
"fmt"
"math/rand"
"reflect"
"time"
)
type testing struct {
control bool
market string
}
func main() {
rand.Seed(time.Now().UnixNano())
// explicitly define chanids for tests.
var chanids []string = []string{"GR I", "GR II", "GR III", "GR IV"}
stream := make(chan string)
control := make([]chan testing, len(chanids))
reflectCases := make([]reflect.SelectCase, len(chanids)+1)
// MAKE REFLECT SELECTS FOR 4 CONTROL CHANS AND 1 DATA CHANNEL
for i := range chanids {
control[i] = make(chan testing)
reflectCases[i] = reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(control[i])}
}
reflectCases[4] = reflect.SelectCase{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(stream)}
// START GO ROUTINES
for i, val := range chanids {
runningFunc(control[i], val, stream, 1+rand.Intn(30-1))
}
// READ DATA
for {
o, received, ok := reflect.Select(reflectCases)
if !ok {
fmt.Println("You really buggered this one up...")
}
ty, err := received.Interface().(testing)
if err == true {
fmt.Printf("Read from: %v, and received close signal from: %s\n", o, ty.market)
// close control & stream here.
} else {
ty := received.Interface().(string)
fmt.Printf("Read from: %v, and received value from: %s\n", o, ty)
}
}
}
// THE GO ROUTINES - TIMER AND RANDMSGS
func runningFunc(q chan testing, chanid string, stream chan string, dur int) {
go timer(q, dur, chanid)
go randmsgs(q, chanid, stream)
}
func timer(q chan testing, t int, message string) {
for t > 0 {
time.Sleep(time.Second)
t--
}
q <- testing{true, message}
}
func randmsgs(q chan testing, chanid string, stream chan string) {
for {
select {
case <-q:
fmt.Println("Just sitting by the mailbox. :(")
return
default:
secondsToWait := 1 + rand.Intn(5-1)
time.Sleep(time.Second * time.Duration(secondsToWait))
stream <- fmt.Sprintf("%s: %d", chanid, secondsToWait)
}
}
}
I apologise for the wall of text, but I'm all out of ideas :(!
K/Regards,
C.
Your channels q in the second half are the same as control[0...3] in the first.
Your reflect.Select that you are running also reads from all of these channels, with no delay.
The problem, I think, comes down to your reflect.Select loop simply running too fast and "stealing" all the channel output right away. This is why randmsgs is never able to read the messages.
You'll notice that if you remove the default case from randmsgs, the function is able to (potentially) read some of the messages from q.
select {
case <-q:
fmt.Println("Just sitting by the mailbox. :(")
return
}
This is because now that it is running without delay, it is always waiting for a message on q and thus has the chance to beat the reflect.Select in the race.
If you read from the same channel in multiple goroutines, then the data passed will simply go to whatever goroutine reads it first.
This program appears to just be an experiment / learning experience, but I'll offer some criticism that may help.
Again, generally you don't have multiple goroutines reading from the same channel if both goroutines are doing different tasks. You're creating a mostly non-deterministic race as to which goroutine fetches the data first.
Second, this is a common beginner's anti-pattern with select that you should avoid:
for {
select {
case v := <-myChan:
doSomething(v)
default:
// Oh no, there wasn't anything! Guess we have to wait and try again.
time.Sleep(time.Second)
}
}
This code is redundant because select already behaves in such a way that if no case is initially ready, it will wait until any case is ready and then proceed with that one. This default: sleep is effectively making your select loop slower and yet spending less time actually waiting on the channel (because 99.999...% of the time is spent on time.Sleep).
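For comparison, a minimal sketch of the same loop without the redundant default (a single-case select is just a blocking receive, so the idiomatic form is a range loop; myChan and doSomething are the placeholders from the snippet above):
for v := range myChan { // blocks until a value arrives; exits when myChan is closed
    doSomething(v)
}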
I've been working on a sort-of pub-sub mechanism in the application we're building. The business logic basically generates a whack-ton of events, which in turn can be used to feed data to the client via APIs, or be persisted to storage if the application is running with that option enabled.
What we had, and observed:
Long story short, it turns out we were dropping data we really ought not to have been dropping. The "subscriber" had a channel with a large buffer, and essentially only read data from this channel, checked a few things, and appended it to a slice. The capacity of the slice was chosen so that memory allocations were kept to a minimum. Simulating a scenario where the subscriber channel had a buffer of, say, 1000 data sets, we noticed data could be dropped after only 10 sets had been sent. The very first event was never dropped.
The code we had at this point looks something like this:
type broker struct {
ctx context.Context
subs []*sub
}
type sub struct {
ctx context.Context
mu *sync.Mutex
ch chan []interface{}
buf []interface{}
}
func (b *broker) Send(evts ...interface{}) {
rm := make([]int, 0, len(b.subs))
defer func() {
for i := len(rm) - 1; i >= 0; i-- {
idx := rm[i]
// last element
if idx == len(b.subs)-1 {
b.subs = b.subs[:idx]
continue
}
b.subs = append(b.subs[:idx], b.subs[idx+1:]...)
}
}()
for i, s := range b.subs {
select {
case <-b.ctx.Done(): // is the app still running
return
case <-s.Stopped(): // is this sub still valid
rm = append(rm, i)
case s.C() <- evts: // can we write to the channel
continue
default: // app is running, sub is valid, but channel is presumably full, skip
fmt.Println("skipped some events")
}
}
}
func NewSub(ctx context.Context, buf int) *sub {
s := &sub{
ctx: ctx,
mu: &sync.Mutex{},
ch: make(chan []interface{}, buf),
buf: make([]interface{}, 0, buf),
}
go s.loop(ctx) // start routine to consume events
return s
}
func (s *sub) C() chan<- []interface{} {
return s.ch
}
func (s *sub) Stopped() <-chan struct{} {
return s.ctx.Done()
}
func (s *sub) loop(ctx context.Context) {
defer func() {
close(s.ch)
}()
for {
select {
case <-ctx.Done():
return
case data := <-s.ch:
// do some processing
s.mu.Lock()
s.buf = append(s.buf, data...)
s.mu.Unlock()
}
}
}
func (s *sub) GetProcessedData(amt int) []*wrappedT {
s.mu.Lock()
data := s.buf
if len(data) == amt {
s.buf = s.buf[:0]
} else if len(data) > amt {
data = data[:amt]
s.buf = s.buf[amt:]
} else {
s.buf = make([]interface{}, 0, cap(s.buf))
}
s.mu.Unlock()
ret := make([]*wrappedT, 0, len(data))
for _, v := range data {
// some processing
ret = append(ret, &wrappedT{v})
}
return ret
}
Now obviously, the buffers are there to ensure that events can still be consumed when we're calling things like GetProcessedData. That type of call is usually the result of an API request, or some internal flush/persist to storage mechanism. Because of the mutex lock, we might not be reading from the internal channel. Weirdly, the channel buffers never got backed up all the way through, but not all data made its way through to the subscribers. As mentioned, the first event always did, which made us even more suspicious.
What we eventually tried (to fix):
After a fair bit of debugging, hair pulling, looking at language specs, and fruitless googling I began to suspect the select statement to be the problem. Instead of sending to the channels directly, I changed it to the rather hacky:
func (b *broker) send(s *sub, evts []interface{}) {
ctx, cfunc := context.WithTimeout(b.ctx, 100 *time.Millisecond)
defer cfunc()
select {
case <-ctx.Done():
return
case s.C() <- evts:
return
case <-s.Stopped():
return
}
}
func (b *broker) Send(evts ...interface{}) {
for _, s := range b.subs {
go b.send(s, evts)
}
}
Instantly, all events were correctly propagated through the system. Calling Send on the broker wasn't blocking the part of the system that actually performs the heavy lifting (that was the reason for the use of channels after all), and things are performing reasonably well.
The actual question(s):
There's a couple of things still bugging me:
The way I read the specs, the default statement ought to be the last resort, solely as a way out to prevent blocking channel operations in a select statement. Elsewhere, I read that the runtime may not consider a case ready for communication if there is no routine consuming what you're about to write to the channel, irrespective of channel buffers. Is this indeed how it works?
For the time being, the context with timeout fixes the bigger issue of data not propagating properly. However, I do feel like there should be a better way.
Has anyone ever encountered something similar, and worked out exactly what's going on?
I'm happy to provide more details where needed. I've kept the code as minimal as possible, omitting a lot of complexities WRT the broker system we're using (different event types, different types of subscribers, etc...). We don't use the interface{} type anywhere in case anyone is worried about that :P
For now, though, I think this is plenty of text for a Friday.
I'm trying to create a message hub in golang. Messages arrive over different channels that are kept in a map[uint32]chan []float64. I do an endless loop over the map and check whether a channel has a message. If it does, I write it to the client's common write channel together with the incoming channel's id. It works fine, but it uses all the CPU and other processes are throttled.
UPD: Items are added to and removed from the map dynamically by another function.
I am thinking of limiting the CPU for this app through Docker, but maybe there is a more elegant path?
My code :
func (c *Client) accumHandler() {
for !c.stop {
c.channels.Range(func(key, value interface{}) bool {
select {
case message := <-value.(chan []float64):
mess := map[uint32]interface{}{key.(uint32): message}
select {
case c.send <- mess:
}
default:
}
return !c.stop
})
}
}
If I'm reading the cards correctly, it seems like you are trying to pass along an array of floats to a common channel along with a channel identifier. I assume that you are doing this to pass multiple channels out to different publishers, but so that you only have to track a single channel for your consumer.
It turns out that you don't need to loop over the channels to see when one of them outputs a value. You can chain channels together inside goroutines, so no busy wait is necessary. Something like this will suit your purposes (again, if I'm reading the cards correctly). Look for the all-caps comment for the way around your busy loop. Link to playground.
var num_publishes = 3
func main() {
num_publishers := 10
single_consumer := make(chan []float64)
for i:=0;i<num_publishers;i+=1 {
c := make(chan []float64)
// connect channel to your single consumer channel
go func() { for { single_consumer <- <-c } }() // THIS IS PROBABLY WHAT YOU DIDN'T KNOW ABOUT
// send the channel to the publisher
go publisher(c, i*100)
}
// dumb consumer example
for i:=0;i<num_publishers*num_publishes;i+=1 {
fmt.Println(<-single_consumer)
}
}
func publisher(c chan []float64, publisher_id int) {
dummy := []float64{
float64(publisher_id+1),
float64(publisher_id+2),
float64(publisher_id+3),
}
for i:=0;i<num_publishes;i+=1 {
time.Sleep(time.Duration(rand.Intn(10000)) * time.Millisecond)
c <- dummy
}
}
It is eating all of your CPU because you are continuously cycling around the dictionary checking for messages, so even when there are no messages to process, the CPU (or at least a thread or core) is running flat out. You need blocking sends and receives on the channels!
I assume you are doing this because you don't know how many channels there will be and therefore can't just select on all of the input channels. A better pattern is to start a separate goroutine for each input channel you are currently storing in the dictionary. Each goroutine should have a loop in which it blocks waiting on its input channel and, on receiving a message, does a blocking send to a channel shared by all of them and read by the client.
The question isn't complete so can't give exact code, but you're going to have goroutines that look something like this:
type Message struct {
id uint32
message []float64
}
func receiverGoroutine(id uint32, input chan []float64, output chan Message) {
for {
message := <- input
output <- Message{id: id, message: message}
}
}
func clientGoroutine(c *Client, input chan Message) {
for {
message := <- input
// do stuff
}
}
(You'll need to add some "done" channels as well though)
Elsewhere you will start them with code like this:
clientChan := make(chan Message)
go clientGoroutine(client, clientChan)
for i:=0; i<max; i++ {
go receiverGoroutine(uint32(i), make(chan []float64), clientChan)
}
Or you can just start the client routine and then add the others as they are needed rather than in a loop up front - depends on your use case.
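Regarding the "done" channels mentioned above, a hedged sketch of how receiverGoroutine could honour one (the done parameter is an illustrative addition, not part of the answer):
func receiverGoroutine(id uint32, input chan []float64, output chan Message, done chan struct{}) {
    for {
        select {
        case message := <-input:
            select {
            case output <- Message{id: id, message: message}: // forward as before
            case <-done:
                return
            }
        case <-done: // stop when the done channel is closed
            return
        }
    }
}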
Given a (partially) filled buffered channel in Go
ch := make(chan *MassiveStruct, n)
for i := 0; i < n; i++ {
ch <- NewMassiveStruct()
}
is it advisable to also drain the channel when closing it (by the writer) in case it is unknown when readers are going to read from it (e.g. there is a limited number of them and they are currently busy)? That is
close(ch)
for range ch {}
Is such a loop guaranteed to end if there are other concurrent readers on the channel?
Context: a queue service with a fixed number of workers, which should stop processing anything queued when the service is going down (but not necessarily be GCed right after). So I am closing the channel to indicate to the workers that the service is being terminated. I could drain the remaining "queue" immediately, letting the GC free the allocated resources; I could read and ignore the values in the workers; or I could leave the channel as is, running down the readers and setting the channel to nil in the writer so that the GC cleans up everything. I am not sure which is the cleanest way.
It depends on your program, but generally speaking I would tend to say no (you don't need to drain the channel before closing it): if there are items in your channel when you close it, any reader still reading from the channel will keep receiving items until the channel is empty.
Here is an example:
package main
import (
"sync"
"time"
)
func main() {
var ch = make(chan int, 5)
var wg sync.WaitGroup
wg.Add(1)
for range make([]struct{}, 2) {
go func() {
for i := range ch {
wg.Wait()
println(i)
}
}()
}
for i := 0; i < 5; i++ {
ch <- i
}
close(ch)
wg.Done()
time.Sleep(1 * time.Second)
}
Here, the program will output all the items, despite the fact that the channel is closed strictly before any reader can even read from the channel.
There are better ways to achieve what you're trying to do. Your current approach can just lead to throwing away some records and processing other records randomly (since the draining loop is racing all the consumers). That doesn't really address the goal.
What you want is cancellation. Here's an example from Go Concurrency Patterns: Pipelines and cancellation
func sq(done <-chan struct{}, in <-chan int) <-chan int {
out := make(chan int)
go func() {
defer close(out)
for n := range in {
select {
case out <- n * n:
case <-done:
return
}
}
}()
return out
}
You pass a done channel to all the goroutines, and you close it when you want them all to stop processing. If you do this a lot, you may find the golang.org/x/net/context package useful, which formalizes this pattern, and adds some extra features (like timeout).
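A brief usage sketch of the sq stage above (assuming it is defined exactly as shown): closing done lets the stage's goroutine return even though nobody drains the rest of its output.
done := make(chan struct{})
in := make(chan int, 3)
in <- 2
in <- 3
in <- 4
close(in)
out := sq(done, in)
fmt.Println(<-out) // 4
close(done) // cancel: sq's goroutine returns without sending the remaining squares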
I feel that the supplied answers do not clarify much beyond hinting that neither draining nor closing is needed. For the described context, the following solution looks clean to me: it terminates the workers and removes all references to them and to the channel in question, letting the GC clean up the channel and its contents:
type worker struct {
submitted chan Task
stop chan bool
p *Processor
}
// executed in a goroutine
func (w *worker) run() {
for {
select {
case task := <-w.submitted:
if err := task.Execute(w.p); err != nil {
logger.Error(err.Error())
}
case <-w.stop:
logger.Warn("Worker stopped")
return
}
}
}
func (p *Processor) Stop() {
if atomic.CompareAndSwapInt32(&p.status, running, stopped) {
for _, w := range p.workers {
w.stop <- true
}
// GC all workers as soon as goroutines stop
p.workers = nil
// GC all published data when workers terminate
p.submitted = nil
// no need to do the following above:
// close(p.submitted)
// for range p.submitted {}
}
}
I'm having difficulty using time.Tick. I expect this code to print "hi" 10 times then quit after 1 second, but instead it hangs:
ticker := time.NewTicker(100 * time.Millisecond)
time.AfterFunc(time.Second, func () {
ticker.Stop()
})
for _ = range ticker.C {
go fmt.Println("hi")
}
https://play.golang.org/p/1p6-ViSvma
Looking at the source, I see that the channel isn't closed when Stop() is called. In that case, what is the idiomatic way to iterate over the ticker channel?
You're right, the ticker's channel is not closed on stop; that's stated in the documentation:
Stop turns off a ticker. After Stop, no more ticks will be sent. Stop does not close the channel, to prevent a read from the channel succeeding incorrectly.
I believe a ticker is more about fire-and-forget, and even if you want to stop it, you could just leave the goroutine hanging forever (it depends on your application, of course).
If you really need a finite ticker, you can do tricks and provide a separate channel (per ThunderCat's answer), but what I would do is provide my own implementation of a ticker. This should be relatively easy and gives you flexibility over its behaviour, such as what to pass on the channel or what to do with missed ticks (i.e. when the reader is falling behind).
My example:
func finiteTicker(n int, d time.Duration) <-chan time.Time {
ch := make(chan time.Time, 1)
go func() {
for i := 0; i < n; i++ {
time.Sleep(d)
ch <- time.Now()
}
close(ch)
}()
return ch
}
func main() {
for range finiteTicker(10, 100*time.Millisecond) {
fmt.Println("hi")
}
}
http://play.golang.org/p/ZOwJlM8rDm
I asked on IRC as well, and got some useful insight from #Tv`.
Despite time.Ticker looking like it should be part of a Go pipeline, it does not actually play well with the pipeline idioms:
Here are the guidelines for pipeline construction:
stages close their outbound channels when all the send operations are done.
stages keep receiving values from inbound channels until those channels are closed or the senders are unblocked.
Pipelines unblock senders either by ensuring there's enough buffer for all the values that are sent or by explicitly signalling senders when the receiver may abandon the channel.
The reason for this inconsistency appears to be primarily to support the following idiom:
for {
select {
case <-ticker.C:
// do something
case <-done:
return
}
}
I don't know why this is the case, and why the pipelining idiom wasn't used:
for {
select {
case _, ok := <-ticker.C:
if ok {
// do something
} else {
return
}
}
}
(or more cleanly)
for _ = range ticker.C {
// do something
}
But this is the way go is :(