I am connecting to a websocket that is streaming live stock trades.
I have to read the prices, perform calculations on the fly and based on these calculations make another API call e.g. buy or sell.
I want to ensure my calculations/processing doesn't slow down my ability to stream in all the live data.
What is a good design pattern to follow for this type of problem?
Is there a way to log/warn in my system to know if I am falling behind?
Falling behind means: the websocket is sending price data, and I am not able to process that data as fast as it comes in, so I lag further and further behind.
While doing c.ReadJSON and then passing the message to my channel, there might be a delay from deserializing the JSON.
While processing messages from the channel (calculating formulas and sending another API request to buy/sell), delays will add up.
How can I prevent lags/delays and also monitor if indeed there is a delay?
package main

import (
	"github.com/gorilla/websocket"
	"github.com/sirupsen/logrus"
)

func main() {
	c, _, err := websocket.DefaultDialer.Dial("wss://socket.example.com/stocks", nil)
	if err != nil {
		panic(err)
	}
	defer c.Close()

	// Buffered channel to account for bursts or spikes in data:
	chanMessages := make(chan interface{}, 10000)

	// Read messages off the buffered queue:
	go func() {
		for msgBytes := range chanMessages {
			logrus.Info("Message Bytes: ", msgBytes)
		}
	}()

	// As little logic as possible in the reader loop:
	for {
		var msg interface{}
		err := c.ReadJSON(&msg)
		if err != nil {
			panic(err)
		}
		chanMessages <- msg
	}
}
You can read bytes, pass them to the channel, and use other goroutines to do conversion.
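For example, here is a minimal sketch of that split, plus a cheap way to detect falling behind: make the channel carry raw []byte, decode in a worker goroutine, and compare len() against cap() before each send. The half-full threshold and the processing body are illustrative assumptions, and "encoding/json" and "log" are assumed to be imported:

chanMessages := make(chan []byte, 10000)

// Decode and process away from the reader loop:
go func() {
	for raw := range chanMessages {
		var msg map[string]interface{}
		if err := json.Unmarshal(raw, &msg); err != nil {
			log.Println("decode error:", err)
			continue
		}
		// ... calculations and buy/sell API calls go here ...
	}
}()

// Reader loop: raw bytes only, no JSON decoding.
for {
	_, raw, err := c.ReadMessage()
	if err != nil {
		log.Println("read error:", err)
		return
	}
	// If the buffer is filling up, the consumer is falling behind.
	if depth := len(chanMessages); depth > cap(chanMessages)/2 {
		log.Printf("warning: falling behind, %d messages queued", depth)
	}
	chanMessages <- raw
}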
I worked on a similar crypto market bot. Instead of creating a large buffered channel, I created a buffered channel with a capacity of 1 and used a select statement to send socket data to the channel.
Here is an example:
var wg sync.WaitGroup
msg := make(chan []byte, 1)
wg.Add(1)
go func() {
defer wg.Done()
for data := range msg {
// decode and process data
}
}()
for {
_, data, err := c.ReadMessage()
if err != nil {
log.Println("read error: ", err)
return
}
select {
case msg <- data: // in case channel is free
default: // if not, next time will try again with latest data
}
}
This will ensure that you always get the latest data when you are ready to process it.
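If you also want to know how often that default branch fires, you can count the drops; a hedged sketch (the counter, the logging cadence, and the "sync/atomic" import are assumptions, not part of the original):

var dropped int64 // declared before the read loop

for {
	_, data, err := c.ReadMessage()
	if err != nil {
		log.Println("read error: ", err)
		return
	}
	select {
	case msg <- data: // channel free, the consumer is keeping up
	default:
		// consumer busy: skip this update in favour of a fresher one,
		// but count the drop so the lag stays visible
		if n := atomic.AddInt64(&dropped, 1); n%1000 == 0 {
			log.Printf("dropped %d stale updates so far", n)
		}
	}
}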
I am working on a personal project that will run on a Raspberry Pi with some sensors attached to it.
The function that reads from the sensors and the function that handles the socket connection run in different goroutines, so, in order to send data over the socket as it is read from the sensors, I create a chan []byte in the main function and pass it to both goroutines.
My problem shows up here: if I do multiple writes in a row, only the first piece of data arrives at the client; the others don't. But if I put a little time.Sleep in the sender function, all the data arrives correctly.
Anyway, here is a simplified version of the little program:
package main
import (
"net"
"os"
"sync"
"time"
)
const socketName string = "./test_socket"
// create the socket and launch the accept-client routine
func launchServerUDS(ch chan []byte) {
if err := os.RemoveAll(socketName); err != nil {
return
}
l, err := net.Listen("unix", socketName)
if err != nil {
return
}
go acceptConnectionRoutine(l, ch)
}
// accept incoming connection on the socket and
// 1) launch the routine to handle commands from the client
// 2) launch the routine to send data when the server reads from the sensors
func acceptConnectionRoutine(l net.Listener, ch chan []byte) {
defer l.Close()
for {
conn, err := l.Accept()
if err != nil {
return
}
go commandsHandlerRoutine(conn, ch)
go autoSendRoutine(conn, ch)
}
}
// routine that sends data to the client
func autoSendRoutine(c net.Conn, ch chan []byte) {
for {
data := <-ch
if string(data) == "exit" {
return
}
c.Write(data)
}
}
// handle the client connection and call functions to execute commands
func commandsHandlerRoutine(c net.Conn, ch chan []byte) {
for {
buf := make([]byte, 1024)
n, err := c.Read(buf)
if err != nil {
ch <- []byte("exit")
break
}
// now, for the sake of simplicity, only echo commands back to the client
_, err = c.Write(buf[:n])
if err != nil {
ch <- []byte("exit")
break
}
}
}
// write on the channel to the autosend routine so the data are written on the socket
func sendDataToClient(data []byte, ch chan []byte) {
select {
case ch <- data:
// if I put a little sleep here, no problems
// if I remove the sleep, only data1 is sent to the client
// time.Sleep(1 * time.Millisecond)
default:
}
}
func dummyReadDataRoutine(ch chan []byte) {
for {
// read data from the sensors every 5 seconds
time.Sleep(5 * time.Second)
// read first data and send it
sendDataToClient([]byte("dummy data1\n"), ch)
// read second data and send it
sendDataToClient([]byte("dummy data2\n"), ch)
// read third data and send it
sendDataToClient([]byte("dummy data3\n"), ch)
}
}
func main() {
ch := make(chan []byte)
wg := sync.WaitGroup{}
wg.Add(2)
go dummyReadDataRoutine(ch)
go launchServerUDS(ch)
wg.Wait()
}
I don't think it's correct to use a sleep to synchronize writes. How do I fix this while keeping the functions running in different goroutines?
The primary problem was in the function:
func sendDataToClient(data []byte, ch chan []byte) {
select {
case ch <- data:
// if I put a little sleep here, no problems
// if I remove the sleep, only data1 is sent to the client
// time.Sleep(1 * time.Millisecond)
default:
	}
}
If the channel ch isn't ready at the moment the function is called, the default case will be taken and the data will never be sent. In this case you should eliminate the function and send to the channel directly.
Buffering the channel is orthogonal to the problem at hand, and should be done for similar reasons as you would use buffered IO, i.e. to provide a "buffer" for writes that can't immediately make progress. If the code couldn't make progress without a buffer, adding one only delays possible deadlocks.
You also don't need the exit sentinel value here, as you can range over the channel and close it when you're done. This still ignores write errors, but handling those would require some re-design anyway.
for data := range ch {
c.Write(data)
}
You should also be careful passing slices over channels, as it's all too easy to lose track of which logical process has ownership and is going to modify the backing array. I can't say from the information given whether passing the read and written data over channels improves the architecture, but this is not a pattern you will find in most Go networking code.
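To make the ownership hazard concrete, here is a hedged sketch using the c and ch from the question: reusing a single read buffer for every send lets the receiver observe bytes that the next Read has already overwritten.

buf := make([]byte, 1024)
for {
	n, err := c.Read(buf)
	if err != nil {
		return
	}
	ch <- buf[:n] // hazard: the receiver and the next c.Read share buf's backing array

	// Safe variant: copy before sending, so the receiver owns its bytes.
	// data := make([]byte, n)
	// copy(data, buf[:n])
	// ch <- data
}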
JimB gave a good explanation, so I think his answer is the better one.
I have included my partial solution in this answer.
I thought my code was clear and simple, but as JimB said, it can be simpler and clearer. I leave my old code posted so people can better see the difference.
As chmike said, my issue wasn't related to the socket as I thought, but only to the channel: writing to an unbuffered channel was one of the problems. After changing the unbuffered channel to a buffered one, the issue was resolved. Anyway, this code is still not "good code" and can be improved following the principles that JimB laid out in his answer.
So here is the new code:
package main
import (
"net"
"os"
"sync"
"time"
)
const socketName string = "./test_socket"
// create the socket and accept client connections
func launchServerUDS(ch chan []byte, wg *sync.WaitGroup) {
defer wg.Done()
if err := os.RemoveAll(socketName); err != nil {
return
}
l, err := net.Listen("unix", socketName)
if err != nil {
return
}
defer l.Close()
for {
conn, err := l.Accept()
if err != nil {
return
}
// these goroutines are launched when a client connects
// routine that listens for and echoes commands
go commandsHandlerRoutine(conn, ch)
// routine to send data read from the sensors to the client
go autoSendRoutine(conn, ch)
}
}
// routine that sends data to the client
func autoSendRoutine(c net.Conn, ch chan []byte) {
for {
data := <-ch
if string(data) == "exit" {
return
}
c.Write(data)
}
}
// handle commands received from the client
func commandsHandlerRoutine(c net.Conn, ch chan []byte) {
for {
buf := make([]byte, 1024)
n, err := c.Read(buf)
if err != nil {
// if I can't read, send an exit command to autoSendRoutine and exit
ch <- []byte("exit")
break
}
// now, for the sake of simplicity, only echo commands back to the client
_, err = c.Write(buf[:n])
if err != nil {
// if I can't write back, send an exit command to autoSendRoutine and exit
ch <- []byte("exit")
break
}
}
}
// this goroutine reads from the sensors and writes to the channel, so data are sent
// to the client if a client is connected
func dummyReadDataRoutine(ch chan []byte, wg *sync.WaitGroup) {
x := 0
for x < 100 {
// read data from the sensors every second
time.Sleep(1 * time.Second)
// read first data and send it
ch <- []byte("data1\n")
// read second data and send it
ch <- []byte("data2\n")
// read third data and send it
ch <- []byte("data3\n")
x++
}
wg.Done()
}
func main() {
// create a BUFFERED CHANNEL
ch := make(chan []byte, 1)
wg := sync.WaitGroup{}
wg.Add(2)
// launch the goroutines that handle the socket connections
// and read data from the sensors
go dummyReadDataRoutine(ch, &wg)
go launchServerUDS(ch, &wg)
wg.Wait()
}
I am new to Go and I might be missing the point, but why is the buffer size of a buffered channel a hard limit? For example, if I make a channel like so:
channel := make(chan int, 100)
I cannot add more than 100 elements to the channel without blocking. Is there a reason for this? Furthermore, channels cannot be dynamically resized, because the channel API does not support that.
This seems somewhat limiting for the language's support of universal synchronization with a single mechanism, since it lacks the convenience of an unbounded semaphore. For example, a generalized semaphore's value can be increased without bounds.
If one component of a program can't keep up with its input, it needs to put back-pressure on the rest of the system, rather than letting it run ahead and generate gigabytes of data that will never get processed because the system ran out of memory and crashed.
There is really no such thing as an unlimited buffer, because machines have limits on what they can handle. Go requires you to specify a size for buffered channels so that you will think about what size buffer your program actually needs and can handle. If it really needs a billion items, and can handle them, you can create a channel that big. But in most cases a buffer size of 0 or 1 is actually what is needed.
This is because channels are designed for efficient communication between concurrent goroutines, but what you need is something different: the fact that you are blocking indicates that the recipient is not attending to the work queue, and "dynamic" is rarely free.
There are a variety of patterns and algorithms you can use to solve the problem you have: you could change your channel to accept arrays of ints, you could add additional goroutines to better balance or filter work, or you could implement your own dynamic channel. Doing the latter is certainly a useful exercise for seeing why dynamic channels aren't a great way to build concurrency.
package main
import "fmt"
func dynamicChannel(initial int) (chan<- interface{}, <-chan interface{}) {
	in := make(chan interface{}, initial)
	out := make(chan interface{}, initial)
	go func() {
		defer close(out)
		buffer := make([]interface{}, 0, initial)
	loop:
		for {
			packet, ok := <-in
			if !ok {
				break loop
			}
			select {
			case out <- packet:
				continue
			default:
			}
			buffer = append(buffer, packet)
			for len(buffer) > 0 {
				select {
				case packet, ok := <-in:
					if !ok {
						break loop
					}
					buffer = append(buffer, packet)
				case out <- buffer[0]:
					buffer = buffer[1:]
				}
			}
		}
		for len(buffer) > 0 {
			out <- buffer[0]
			buffer = buffer[1:]
		}
	}()
	return in, out
}
func main() {
in, out := dynamicChannel(4)
in <- 10
fmt.Println(<-out)
in <- 20
in <- 30
fmt.Println(<-out)
fmt.Println(<-out)
for i := 100; i < 120; i++ {
in <- i
}
fmt.Println("queued 100-120")
fmt.Println(<-out)
close(in)
fmt.Println("in closed")
for i := range out {
fmt.Println(i)
}
}
Generally, if you are blocking, it indicates that your concurrency is not well balanced. Consider a different strategy. For example, take a simple tool that looks for files with a matching .checksum file and then checks the hashes:
func crawl(toplevel string, workQ chan<- string) {
	defer close(workQ)
	filepath.Walk(toplevel, func(path string, info os.FileInfo, err error) error {
		if err == nil && info.Mode().IsRegular() {
			workQ <- path
		}
		return nil
	})
}
// if a file has a .checksum file, compare it with the file's checksum.
func validateWorker(workQ <-chan string) {
	for path := range workQ {
		// If there's a .checksum file, read it; limit it to 256 bytes.
		expected, err := os.ReadFile(path + ".checksum")
		if err != nil { // no checksum file: ignore this path
			continue
		}
		if len(expected) > 256 {
			expected = expected[:256]
		}
		expectedSum := strings.TrimSpace(string(expected)) // tolerate a trailing newline

		file, err := os.Open(path)
		if err != nil {
			continue
		}
		hash := sha256.New()
		_, err = io.Copy(hash, file)
		file.Close() // close promptly; defer would pile up inside the loop
		if err != nil {
			log.Printf("couldn't hash %s: %v", path, err)
			continue
		}

		actualSum := fmt.Sprintf("%x", hash.Sum(nil))
		if actualSum != expectedSum {
			log.Printf("%s: mismatch: expected %s, got %s", path, expectedSum, actualSum)
		}
	}
}
Even without any .checksum files, the crawl function will tend to outpace the worker. When .checksum files are encountered, especially for large files, the worker can take much, much longer to perform a single checksum.
A better aim here is more consistent throughput, achieved by reducing the number of things validateWorker does. Right now it is sometimes fast, because it only checks for the checksum file, and other times slow, because it also has to read and checksum the file.
type ChecksumQuery struct {
Filepath string
ExpectSum string
}
func crawl(toplevel string, workQ chan<- ChecksumQuery) {
	// Have a worker filter out files which don't have .checksums, and allow it
	// to get a little ahead of the crawl function.
	checkupQ := make(chan string, 4)

	go func() {
		defer close(workQ)
		for path := range checkupQ {
			expected, err := os.ReadFile(path + ".checksum")
			if err == nil && len(expected) > 0 {
				if len(expected) > 256 {
					expected = expected[:256]
				}
				workQ <- ChecksumQuery{path, string(expected)}
			}
		}
	}()

	go func() {
		defer close(checkupQ)
		filepath.Walk(toplevel, func(path string, info os.FileInfo, err error) error {
			if err == nil && info.Mode().IsRegular() {
				checkupQ <- path
			}
			return nil
		})
	}()
}
Run a suitable number of validate workers and give workQ a suitable size; then, if the crawler or validate functions block, it is because validate is doing useful work.
If your validate workers are all busy, they are all reading large files from disk and hashing them. Having other workers interrupt this by crawling for more filenames, allocating and passing strings, isn't advantageous.
Other scenarios might involve passing large lists to workers, in which case pass slices over channels (it's cheap); or dynamically sized groups of things, in which case consider passing channels or captures over channels, as sketched below.
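As a concrete illustration of passing channels over channels (a sketch with invented names, not from the original answer): a job can carry its own reply channel, so the worker knows exactly where to send its result.

package main

import "fmt"

// job carries its own reply channel over the jobs channel.
type job struct {
	path   string
	result chan string
}

func worker(jobs <-chan job) {
	for j := range jobs {
		// ... do the real work for j.path here ...
		j.result <- "checksum of " + j.path
	}
}

func main() {
	jobs := make(chan job)
	go worker(jobs)

	reply := make(chan string, 1)
	jobs <- job{path: "/tmp/file", result: reply}
	fmt.Println(<-reply)
}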
The buffer size is the number of elements that can be sent to the channel without the send blocking. By default, a channel has a buffer size of 0 (you get this with make(chan int)). This means that every single send will block until another goroutine receives from the channel. A channel of buffer size 1 can hold 1 element until sending blocks, so you'd get
c := make(chan int, 1)
c <- 1 // doesn't block
c <- 2 // blocks until another goroutine receives from the channel
I suggest you look at these for more clarification:
https://rogpeppe.wordpress.com/2010/02/10/unlimited-buffering-with-low-overhead/
http://openmymind.net/Introduction-To-Go-Buffered-Channels/
I am new to Go and I am trying to create a simple chat server where clients can broadcast messages to all connected clients.
In my server, I have a goroutine (an infinite for loop) that accepts connections, and all the connections are sent to a channel.
go func() {
for {
conn, _ := listener.Accept()
ch <- conn
}
}()
Then, I start a handler (goroutine) for every connected client. Inside the handler, I try to broadcast to all connections by iterating over the channel.
for conn := range ch {
	conn.Write(msg)
}
However, I cannot broadcast, because (I think, from reading the docs) the channel needs to be closed before it can be iterated over. I am not sure when I should close the channel, because I want to continuously accept new connections, and closing the channel won't let me do that. If anyone can help me or suggest a better way to broadcast messages to all connected clients, it would be appreciated.
What you are doing is a fan-out pattern; that is to say, multiple endpoints listen to a single input source. With this pattern, only one of the listeners will receive any given message from the input source. The one exception is the close of a channel: a close is recognized by all of the listeners, and is thus a "broadcast".
But what you want to do is broadcast a message read from a connection, so we could do something like this:
When the number of listeners is known
Let each worker listen to a dedicated broadcast channel, and dispatch messages from the main channel to each dedicated broadcast channel.
type worker struct {
source chan interface{}
quit chan struct{}
}
func (w *worker) Start() {
w.source = make(chan interface{}, 10) // some buffer size to avoid blocking
go func() {
for {
select {
case msg := <-w.source:
// do something with msg
case <-w.quit: // will explain this in the last section
return
}
}
}()
}
And then we could have a bunch of workers:
workers := []*worker{&worker{}, &worker{}}
for _, worker := range workers { worker.Start() }
Then start our listener:
go func() {
for {
conn, _ := listener.Accept()
ch <- conn
}
}()
And a dispatcher:
go func() {
for {
msg := <- ch
for _, worker := range workers {
worker.source <- msg
}
}
}()
When the number of listeners is not known
In this case, the solution given above still works. The only difference is that whenever you need a new worker, you have to create it, start it, and push it into the workers slice. But this requires a thread-safe slice, which needs a lock around it. One implementation may look like this:
type threadSafeSlice struct {
sync.Mutex
workers []*worker
}
func (slice *threadSafeSlice) Push(w *worker) {
slice.Lock()
defer slice.Unlock()
slice.workers = append(slice.workers, w)
}
func (slice *threadSafeSlice) Iter(routine func(*worker)) {
slice.Lock()
defer slice.Unlock()
for _, worker := range slice.workers {
routine(worker)
}
}
Whenever you want to start a worker:
w := &worker{}
w.Start()
threadSafeSlice.Push(w)
And your dispatcher will be changed to:
go func() {
for {
msg := <- ch
threadSafeSlice.Iter(func(w *worker) { w.source <- msg })
}
}()
Last words: never leave a dangling goroutine
One of the good practices is: never leave a dangling goroutine. So when you have finished listening, you need to shut down all of the goroutines you fired. This will be done via the quit channel in worker:
First we need to create a global quit signalling channel:
globalQuit := make(chan struct{})
And whenever we create a worker, we assign the globalQuit channel to it as its quit signal:
worker.quit = globalQuit
Then, when we want to shut down all workers, we simply do:
close(globalQuit)
Since the close will be recognized by all listening goroutines (this is the point you understood), all goroutines will return. Remember to close your dispatcher routine as well, but I will leave that to you :)
A more elegant solution is a "broker", where clients may subscribe and unsubscribe to messages.
To handle subscribing and unsubscribing elegantly as well, we can utilize channels for that too, so the main loop of the broker, which receives and distributes the messages, can incorporate all of these in a single select statement, and synchronization follows from the nature of the solution.
Another trick is to store the subscribers in a map, keyed by the channel we use to distribute messages to them. Adding and removing clients then becomes dead simple. This is possible because channel values are comparable, and their comparison is very efficient: channel values are simple pointers to channel descriptors.
Without further ado, here's a simple broker implementation:
type Broker[T any] struct {
stopCh chan struct{}
publishCh chan T
subCh chan chan T
unsubCh chan chan T
}
func NewBroker[T any]() *Broker[T] {
return &Broker[T]{
stopCh: make(chan struct{}),
publishCh: make(chan T, 1),
subCh: make(chan chan T, 1),
unsubCh: make(chan chan T, 1),
}
}
func (b *Broker[T]) Start() {
subs := map[chan T]struct{}{}
for {
select {
case <-b.stopCh:
return
case msgCh := <-b.subCh:
subs[msgCh] = struct{}{}
case msgCh := <-b.unsubCh:
delete(subs, msgCh)
case msg := <-b.publishCh:
for msgCh := range subs {
// msgCh is buffered, use non-blocking send to protect the broker:
select {
case msgCh <- msg:
default:
}
}
}
}
}
func (b *Broker[T]) Stop() {
close(b.stopCh)
}
func (b *Broker[T]) Subscribe() chan T {
msgCh := make(chan T, 5)
b.subCh <- msgCh
return msgCh
}
func (b *Broker[T]) Unsubscribe(msgCh chan T) {
b.unsubCh <- msgCh
}
func (b *Broker[T]) Publish(msg T) {
b.publishCh <- msg
}
Example using it:
func main() {
// Create and start a broker:
b := NewBroker[string]()
go b.Start()
// Create and subscribe 3 clients:
clientFunc := func(id int) {
msgCh := b.Subscribe()
for {
fmt.Printf("Client %d got message: %v\n", id, <-msgCh)
}
}
for i := 0; i < 3; i++ {
go clientFunc(i)
}
// Start publishing messages:
go func() {
for msgId := 0; ; msgId++ {
b.Publish(fmt.Sprintf("msg#%d", msgId))
time.Sleep(300 * time.Millisecond)
}
}()
time.Sleep(time.Second)
}
Output of the above will be (try it on the Go Playground):
Client 2 got message: msg#0
Client 0 got message: msg#0
Client 1 got message: msg#0
Client 2 got message: msg#1
Client 0 got message: msg#1
Client 1 got message: msg#1
Client 1 got message: msg#2
Client 2 got message: msg#2
Client 0 got message: msg#2
Client 2 got message: msg#3
Client 0 got message: msg#3
Client 1 got message: msg#3
Improvements
You may consider the following improvements. These may or may not be useful depending on how / to what you use the broker.
Broker.Unsubscribe() may close the message channel, signalling that no more messages will be sent on it:
func (b *Broker[T]) Unsubscribe(msgCh chan T) {
b.unsubCh <- msgCh
close(msgCh)
}
This would allow clients to range over the message channel, like this:
msgCh := b.Subscribe()
for msg := range msgCh {
fmt.Printf("Client %d got message: %v\n", id, msg)
}
Then if someone unsubscribes this msgCh like this:
b.Unsubscribe(msgCh)
The above range loop will terminate after processing all messages that were sent before the call to Unsubscribe().
If you want your clients to rely on the message channel being closed, and the broker's lifetime is narrower than your app's lifetime, then you could also close all subscribed clients when the broker is stopped, in the Start() method like this:
case <-b.stopCh:
for msgCh := range subs {
close(msgCh)
}
return
Broadcasting to a slice of channels, with a sync.Mutex to manage adding and removing channels, may be the easiest way in your case.
Here is what you can do to broadcast in Go:
You can broadcast a shared status change with sync.Cond. This approach allocates nothing once set up, but you cannot add timeout functionality or work with another channel.
You can broadcast a shared status change by closing an old channel and creating a new one, guarded by a sync.Mutex. This approach costs one allocation per status change, but you can add timeout functionality and work with another channel (see the sketch after this list).
You can broadcast to a slice of function callbacks and use a sync.Mutex to manage them. The callers can then do channel stuff. This approach costs more than one allocation per caller, and works with another channel.
You can broadcast to a slice of channels and use a sync.Mutex to manage them. This approach costs more than one allocation per caller, and works with another channel.
You can broadcast to a slice of sync.WaitGroup and use a sync.Mutex to manage them.
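For the second option, here is a minimal sketch (invented names, "sync" assumed to be imported): waiters grab the current channel and block on it; a status change closes that channel, waking everyone at once, and installs a fresh one.

type StatusBroadcaster struct {
	mu sync.Mutex
	ch chan struct{} // closed to signal "status changed"
}

func NewStatusBroadcaster() *StatusBroadcaster {
	return &StatusBroadcaster{ch: make(chan struct{})}
}

// Changed returns a channel that will be closed on the next status change.
func (b *StatusBroadcaster) Changed() <-chan struct{} {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.ch
}

// Change closes the old channel, waking every waiter, and installs a new one.
func (b *StatusBroadcaster) Change() {
	b.mu.Lock()
	defer b.mu.Unlock()
	close(b.ch)
	b.ch = make(chan struct{})
}

Because waiters receive from a channel, they can combine <-b.Changed() with a timeout or other channels in a select, which is exactly what this approach buys over sync.Cond.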
This is a late answer, but I think it may appease some curious readers.
Go channels are widely embraced for use in concurrency.
The Go community rigidly follows this saying:
Do not communicate by sharing memory; instead, share memory by communicating.
I am completely neutral toward this, and I think options other than well-defined channels should be considered when it comes to broadcasting.
Here is my take: sync.Cond is widely overlooked. Implementing a broadcaster as suggested by Bronze man in this very same context is worth noting.
I was delighted with icza's suggestion to use channels and broadcast messages over them. I follow the same approach but use sync's condition variable:
// Broadcaster is the struct which encompasses broadcasting
type Broadcaster struct {
cond *sync.Cond
subscribers map[interface{}]func(interface{})
message interface{}
running bool
}
This is the main struct that our whole broadcasting concept relies on.
Below, I define some behaviours for this struct. In a nutshell, subscribers should be able to be added and removed, and the whole process should be revocable.
// SetupBroadcaster gives the broadcaster object to be used further in messaging
func SetupBroadcaster() *Broadcaster {
return &Broadcaster{
cond: sync.NewCond(&sync.RWMutex{}),
subscribers: map[interface{}]func(interface{}){},
}
}
// Subscribe lets others enroll in the broadcast event
func (b *Broadcaster) Subscribe(id interface{}, f func(input interface{})) {
	b.cond.L.Lock()
	b.subscribers[id] = f
	b.cond.L.Unlock()
}
// Unsubscribe stop receiving broadcasting
func (b *Broadcaster) Unsubscribe(id interface{}) {
b.cond.L.Lock()
delete(b.subscribers, id)
b.cond.L.Unlock()
}
// Publish publishes the message
func (b *Broadcaster) Publish(message interface{}) {
go func() {
b.cond.L.Lock()
b.message = message
b.cond.Broadcast()
b.cond.L.Unlock()
}()
}
// Start the main broadcasting event
func (b *Broadcaster) Start() {
	b.running = true
	for b.running {
		b.cond.L.Lock()
		b.cond.Wait()
		// snapshot the message and subscribers while the lock is held,
		// so delivery can't race with Publish or Unsubscribe
		msg := b.message
		subs := make([]func(interface{}), 0, len(b.subscribers))
		for _, f := range b.subscribers {
			subs = append(subs, f)
		}
		b.cond.L.Unlock()
		go func() {
			for _, f := range subs {
				f(msg) // publishes the message
			}
		}()
	}
}
// Stop broadcasting event
func (b *Broadcaster) Stop() {
b.running = false
}
Next, I can use it quite easily:
messageToaster := func(message interface{}) {
fmt.Printf("[New Message]: %v\n", message)
}
unwillingReceiver := func(message interface{}) {
fmt.Println("Do not disturb!")
}
broadcaster := SetupBroadcaster()
broadcaster.Subscribe(1, messageToaster)
broadcaster.Subscribe(2, messageToaster)
broadcaster.Subscribe(3, unwillingReceiver)
go broadcaster.Start()
broadcaster.Publish("Hello!")
time.Sleep(time.Second)
broadcaster.Unsubscribe(3)
broadcaster.Publish("Goodbye!")
It should print something like this in any order:
[New Message]: Hello!
Do not disturb!
[New Message]: Hello!
[New Message]: Goodbye!
[New Message]: Goodbye!
See this on go playground
Another simple example:
https://play.golang.org
type Broadcaster struct {
mu sync.Mutex
clients map[int64]chan struct{}
}
func NewBroadcaster() *Broadcaster {
return &Broadcaster{
clients: make(map[int64]chan struct{}),
}
}
func (b *Broadcaster) Subscribe(id int64) (<-chan struct{}, error) {
b.mu.Lock()
defer b.mu.Unlock()
s := make(chan struct{}, 1)
if _, ok := b.clients[id]; ok {
return nil, fmt.Errorf("signal %d already exists", id)
}
b.clients[id] = s
return b.clients[id], nil
}
func (b *Broadcaster) Unsubscribe(id int64) {
b.mu.Lock()
defer b.mu.Unlock()
if _, ok := b.clients[id]; ok {
close(b.clients[id])
}
delete(b.clients, id)
}
func (b *Broadcaster) broadcast() {
b.mu.Lock()
defer b.mu.Unlock()
for k := range b.clients {
if len(b.clients[k]) == 0 {
b.clients[k] <- struct{}{}
}
}
}
type testClient struct {
name string
signal <-chan struct{}
signalID int64
brd *Broadcaster
}
func (c *testClient) doWork() {
i := 0
for range c.signal {
fmt.Println(c.name, "do work", i)
if i > 2 {
c.brd.Unsubscribe(c.signalID)
fmt.Println(c.name, "unsubscribed")
}
i++
}
fmt.Println(c.name, "done")
}
func main() {
var err error
brd := NewBroadcaster()
clients := make([]*testClient, 0)
for i := 0; i < 3; i++ {
c := &testClient{
name: fmt.Sprint("client:", i),
signalID: time.Now().UnixNano()+int64(i), // +int64(i) for play.golang.org
brd: brd,
}
c.signal, err = brd.Subscribe(c.signalID)
if err != nil {
log.Fatal(err)
}
clients = append(clients, c)
}
for i := 0; i < len(clients); i++ {
go clients[i].doWork()
}
for i := 0; i < 6; i++ {
brd.broadcast()
time.Sleep(time.Second)
}
}
output:
client:0 do work 0
client:2 do work 0
client:1 do work 0
client:2 do work 1
client:0 do work 1
client:1 do work 1
client:2 do work 2
client:0 do work 2
client:1 do work 2
client:2 do work 3
client:2 unsubscribed
client:2 done
client:0 do work 3
client:0 unsubscribed
client:0 done
client:1 do work 3
client:1 unsubscribed
client:1 done
Because Go channels follow the Communicating Sequential Processes (CSP) pattern, channels are a point-to-point communication entity. There is always one writer and one reader involved in each exchange.
However, each channel end can be shared amongst multiple goroutines. This is safe to do - there is no dangerous race condition.
So there can be multiple writers sharing the writing end. And/or there can be multiple readers sharing the reading end. I wrote more on this in a different answer, which includes examples.
If you really need a broadcast, you cannot do this directly, but it is not hard to implement an intermediate goroutine that copies a value out to each of a group of output channels.
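A hedged sketch of such an intermediate goroutine (the generic signature is an illustration, in the style of the broker answer above): it copies each value from one input channel to every output channel, and closes the outputs when the input is exhausted.

// fanOut copies every value received on in to each channel in outs.
// A slow receiver blocks the loop; buffer the out channels to taste.
func fanOut[T any](in <-chan T, outs []chan T) {
	go func() {
		defer func() {
			for _, out := range outs {
				close(out)
			}
		}()
		for v := range in {
			for _, out := range outs {
				out <- v
			}
		}
	}()
}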
The canonical (and idiomatic Go) way to do this is via a slice of channels, as recommended above by Nevets and icza.
You should specifically not use a slice of callbacks. In some languages, you do typically register observers by passing a callback, but in those cases, you have to wrap their invocation in a fair amount of defensive code to protect the sender, and ideally you should have the generator of the message (the "Subject" in classic Observer pattern discussion) segregated from the observers by an intermediate message transport layer. This is where you typically use a pub-sub mesh (JMS brokers, gnats, MQ, whatever) when you're crossing process boundaries, but you should adhere to the same pattern if both subject and observers are internal to the same process (and most languages have available implementations of such mechanisms, so you shouldn't need to roll your own).
The reasons not to use callbacks include:
Unless you build in your own message transport layer, your subject is no longer both naive (it doesn't know the nature or cardinality of the observers) and disinterested (it doesn't care what they do with the message, only that it is made available to any interested parties);
If you want true broadcasting, then you need to act as if the order of receipt does not matter - ideally, everyone can see the message at the same time, even though in practice sending is iterative, even when using channels. But sending to recipient n+1 should absolutely not depend on confirmation of receipt by recipient n. That isn't broadcasting, it's serialized assignment. I say assignment because, if you are asking for a callback, then in executing the callback, you are enforcing (even if only minimally) some behavior to be taken by the recipient. You've basically turned your sender into an orchestrator, which is a very different sort of pattern with a different set of use cases.
Absent a defensive boundary (wrapping each callback invocation in a separate goroutine with a timeout context, for example), you are vulnerable to being blocked by a recipient, which is antithetical to broadcasting. Receipt of (and optionally, acting upon) a broadcast message must be entirely asynchronous with respect to the original send.
Is it doable to provide pseudo-broadcasting by using callbacks in Go? Sure, but you have to invest in so much additional complexity to keep things clean, and why would you do that when Go provides an easy and rather robust way to do it? The examples of channel-driven broadcasting above are good ones, and that's how you should do it pretty much every time.
The specific exception when you absolutely should use callbacks is when you are not disinterested - you really do care that, on the basis of the sent message, the recipients take some action (and usually something specified by contract). For example, "I am about to unmount this filesystem, so flush and close your filehandles, let me know once you're done." (I know that's a pretty old-fashioned example, but it's the first one that comes to mind.)
I'm using channels in Go to process a data pipeline of sorts. The code looks something like this:
type Channels struct {
inputs chan string
errc chan error
quit chan struct{}
}
func (c *Channels) doSomethingWithInput() {
defer close(c.quit)
defer close(c.errc)
for input := range c.inputs {
_, err := doSomethingThatSometimesErrors(input)
if err != nil {
c.errc <- err
return
}
}
doOneFinalThingThatCannotError()
return
}
func (c *Channels) inputData(s string) {
// This function implementation is my question
}
func StartProcessing(c *Channels, data ...string) error {
go c.doSomethingWithInput()
go func() {
defer close(c.inputs)
for _, i := range data {
select {
case <-c.quit:
break
default:
}
c.inputData(i)
}
}()
// Block until the quit channel is closed.
<-c.quit
if err := <-c.errc; err != nil {
return err
}
return nil
}
This seems like a reasonable way to communicate a quit signal between channel processors and is based on this blog post about concurrency patterns in Go.
The thing I struggle with in this pattern is the inputData function. Adding strings to the input channel needs to wait for doSomethingWithInput() to read from the channel, but that goroutine might also hit an error. inputData needs to try to feed the inputs channel but give up if told to quit. The best I could do was this:
func (c *Channels) inputData(s string) {
for {
select {
case <-c.quit:
return
case c.inputs <- s:
return
}
}
}
Essentially, "oscillate between your options until one of them sticks." To be clear, I don't think it's a bad design. It just feels... wasteful. Like I'm missing something clever. How can I tell a channel sender to quit in Go when a channel consumer errors?
Your inputData() is fine, that's the way to do it.
In your use case, your channel consumer (the receiver, aka doSomethingWithInput()) is the one which should have control over the quit channel. As it is, if an error occurs, just return from doSomethingWithInput(), which will in turn close the quit channel and make the sender(s) quit (it will trigger case <-c.quit:). That is, in fact, the clever bit.
Just watch out with your error channel: it is unbuffered and closed when doSomethingWithInput() exits, so you cannot read from it afterwards to collect errors. You need to close it in your main function instead and initialize it with some capacity (make(chan error, 10), for example), or create a consumer goroutine for it. You may also want to read it with a select statement: your error-checking code, as written, will block forever if there are no errors.
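For instance, a hedged sketch of those two suggestions applied to the question's types: buffer errc when constructing Channels, and drain it with a non-blocking select after quit closes.

c := &Channels{
	inputs: make(chan string),
	errc:   make(chan error, 10), // buffered: the sender never blocks on it
	quit:   make(chan struct{}),
}

// After <-c.quit unblocks, collect an error without blocking when none was sent:
select {
case err := <-c.errc:
	if err != nil {
		return err
	}
default: // no error pending
}
return nil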