I have code that looks like this, where I'm listening to a channel up until a timeout interval. Let's say this goroutine 1
select {
case <-time.After(TimeoutInterval):
mu.Lock()
defer mu.Unlock()
delete(msgChMap, index)
return ""
case msg := <-msgCh:
return msg
}
Elsewhere, I have a goroutine 2 that runs something like this where it grabs the appropriate msgCh from a Map, deletes the entry in the map and then sends a message through the channel.
mu.Lock()
msgCh, ok := msgChMap[index]
delete(msgChMap, index)
mu.Unlock()
if ok {
msgCh <- "yay"
}
It seems like it is possible for me to grab the message channel msgCh from the Map, try to send a message but because TimeoutInterval has already passed, there will be nothing listening to the channel, and my code will get stuck waiting for a listener. If I put the lock after sending yay to the msgCh, it seems possible that I could deadlock as 2 will be waiting for a listener to the channel and is not releasing the lock, but 1 is no longer listening but requires the lock.
What is a general pattern to avoid getting stuck waiting for a listener? Perhaps go is smart enough to not get stuck here.
You can prevent getting stuck when waiting for a listener by using select for the sender.
By using select you can use more case for sender in this situation
mu.Lock()
msgCh, ok := msgChMap[index]
delete(msgChMap, index)
mu.Unlock()
if ok {
select {
// listener is available
case msgCh <- "yay":
fmt.Println("sent")
// if not avalable (execute immediately)
default:
fmt.Println("no available listener")
// ...just ignore or do something else
}
}
Or waiting for a short time
mu.Lock()
msgCh, ok := msgChMap[index]
delete(msgChMap, index)
mu.Unlock()
if ok {
select {
// listener is available
case msgCh <- "yay":
fmt.Println("sent")
// if not available, waiting for listener
case <-time.After(30 * time.Second):
fmt.Println("after 30 seconds, still no available listener")
// ...just ignore or do something else
}
}
The problem here is that the channel reader my stop without the writer knowing it. It should be possible to structure this solution so that this situation never happens, but ignoring that for now, for this specific problem what you need is atomic access to the channel itself, along with a flag for channel status:
type channel struct {
sync.Mutex
msgCh chan Msg
active bool
}
Writing to the channel is now done by locking it:
ch.Lock()
if ch.active {
ch.msgCh<-data
}
ch.Unlock()
And when you "inactivate" the channel, reset the flag:
case <-time.After(TimeoutInterval):
mu.Lock()
defer mu.Unlock()
ch.Lock()
defer ch.Unlock()
delete(msgChMap, index)
ch.active=false
return ""
And of course, with this now you have to keep a *channel in your map.
Related
In a project the program receives data via websocket. This data needs to be processed by n algorithms. The amount of algorithms can change dynamically.
My attempt is to create some pub/sub pattern where subscriptions can be started and canceled on the fly. Turns out that this is a bit more challenging than expected.
Here's what I came up with (which is based on https://eli.thegreenplace.net/2020/pubsub-using-channels-in-go/):
package pubsub
import (
"context"
"sync"
"time"
)
type Pubsub struct {
sync.RWMutex
subs []*Subsciption
closed bool
}
func New() *Pubsub {
ps := &Pubsub{}
ps.subs = []*Subsciption{}
return ps
}
func (ps *Pubsub) Publish(msg interface{}) {
ps.RLock()
defer ps.RUnlock()
if ps.closed {
return
}
for _, sub := range ps.subs {
// ISSUE1: These goroutines apparently do not exit properly...
go func(ch chan interface{}) {
ch <- msg
}(sub.Data)
}
}
func (ps *Pubsub) Subscribe() (context.Context, *Subsciption, error) {
ps.Lock()
defer ps.Unlock()
// prep channel
ctx, cancel := context.WithCancel(context.Background())
sub := &Subsciption{
Data: make(chan interface{}, 1),
cancel: cancel,
ps: ps,
}
// prep subsciption
ps.subs = append(ps.subs, sub)
return ctx, sub, nil
}
func (ps *Pubsub) unsubscribe(s *Subsciption) bool {
ps.Lock()
defer ps.Unlock()
found := false
index := 0
for i, sub := range ps.subs {
if sub == s {
index = i
found = true
}
}
if found {
s.cancel()
ps.subs[index] = ps.subs[len(ps.subs)-1]
ps.subs = ps.subs[:len(ps.subs)-1]
// ISSUE2: close the channel async with a delay to ensure
// nothing will be written to the channel anymore
// via a pending goroutine from Publish()
go func(ch chan interface{}) {
time.Sleep(500 * time.Millisecond)
close(ch)
}(s.Data)
}
return found
}
func (ps *Pubsub) Close() {
ps.Lock()
defer ps.Unlock()
if !ps.closed {
ps.closed = true
for _, sub := range ps.subs {
sub.cancel()
// ISSUE2: close the channel async with a delay to ensure
// nothing will be written to the channel anymore
// via a pending goroutine from Publish()
go func(ch chan interface{}) {
time.Sleep(500 * time.Millisecond)
close(ch)
}(sub.Data)
}
}
}
type Subsciption struct {
Data chan interface{}
cancel func()
ps *Pubsub
}
func (s *Subsciption) Unsubscribe() {
s.ps.unsubscribe(s)
}
As mentioned in the comments there are (at least) two issues with this:
ISSUE1:
After a while of running in implementation of this I get a few errors of this kind:
goroutine 120624 [runnable]:
bm/internal/pubsub.(*Pubsub).Publish.func1(0x8586c0, 0xc00b44e880, 0xc008617740)
/home/X/Projects/bm/internal/pubsub/pubsub.go:30
created by bookmaker/internal/pubsub.(*Pubsub).Publish
/home/X/Projects/bm/internal/pubsub/pubsub.go:30 +0xbb
Without really understanding this it appears to me that the goroutines created in Publish() do accumulate/leak. Is this correct and what am I doing wrong here?
ISSUE2:
When I end a subscription via Unsubscribe() it occurs that Publish() tried to write to a closed channel and panics. To mitigate this I have created a goroutine to close the channel with a delay. This feel really off-best-practice but I was not able to find a proper solution to this. What would be a deterministic way to do this?
Heres a little playground for you to test with: https://play.golang.org/p/K-L8vLjt7_9
Before diving into your solution and its issues, let me recommend again another Broker approach presented in this answer: How to broadcast message using channel
Now on to your solution.
Whenever you launch a goroutine, always think of how it will end and make sure it does if the goroutine is not ought to run for the lifetime of your app.
// ISSUE1: These goroutines apparently do not exit properly...
go func(ch chan interface{}) {
ch <- msg
}(sub.Data)
This goroutine tries to send a value on ch. This may be a blocking operation: it will block if ch's buffer is full and there is no ready receiver on ch. This is out of the control of the launched goroutine, and also out of the control of the pubsub package. This may be fine in some cases, but this already places a burden on the users of the package. Try to avoid these. Try to create APIs that are easy to use and hard to misuse.
Also, launching a goroutine just to send a value on a channel is a waste of resources (goroutines are cheap and light, but you shouldn't spam them whenever you can).
You do it because you don't want to get blocked. To avoid blocking, you may use a buffered channel with a "reasonable" high buffer. Yes, this doesn't solve the blocking issue, in only helps with "slow" clients receiving from the channel.
To "truly" avoid blocking without launching a goroutine, you may use non-blocking send:
select {
case ch <- msg:
default:
// ch's buffer is full, we cannot deliver now
}
If send on ch can proceed, it will happen. If not, the default branch is chosen immediately. You have to decide what to do then. Is it acceptable to "lose" a message? Is it acceptable to wait for some time until "giving up"? Or is it acceptable to launch a goroutine to do this (but then you'll be back at what we're trying to fix here)? Or is it acceptable to get blocked until the client can receive from the channel...
Choosing a reasonable high buffer, if you encounter a situation when it still gets full, it may be acceptable to block until the client can advance and receive from the message. If it can't, then your whole app might be in an unacceptable state, and it might be acceptable to "hang" or "crash".
// ISSUE2: close the channel async with a delay to ensure
// nothing will be written to the channel anymore
// via a pending goroutine from Publish()
go func(ch chan interface{}) {
time.Sleep(500 * time.Millisecond)
close(ch)
}(s.Data)
Closing a channel is a signal to the receiver(s) that no more values will be sent on the channel. So always it should be the sender's job (and responsibility) to close the channel. Launching a goroutine to close the channel, you "hand" that job and responsibility to another "entity" (a goroutine) that will not be synchronized to the sender. This may easily end up in a panic (sending on a closed channel is a runtime panic, for other axioms see How does a non initialized channel behave?). Don't do that.
Yes, this was necessary because you launched goroutines to send. If you don't do that, then you may close "in-place", without launching a goroutine, because then the sender and closer will be the same entity: the Pubsub itself, whose sending and closing operations are protected by a mutex. So solving the first issue solves the second naturally.
In general if there are multiple senders for a channel, then closing the channel must be coordinated. There must be a single entity (often not any of the senders) that waits for all senders to finish, practically using a sync.WaitGroup, and then that single entity can close the channel, safely. See Closing channel of unknown length.
I am trying to create an intermediate layer between user and tcp, with Send and Receive functions. Currently, I am trying to integrate a context, so that the Send and Receive respects a context. However, I don't know how to make them respect the context's cancellation.
Until now, I got the following.
// c.underlying is a net.Conn
func (c *tcpConn) Receive(ctx context.Context) ([]byte, error) {
if deadline, ok := ctx.Deadline(); ok {
// Set the read deadline on the underlying connection according to the
// given context. This read deadline applies to the whole function, so
// we only set it once here. On the next read-call, it will be set
// again, or will be reset in the else block, to not keep an old
// deadline.
c.underlying.SetReadDeadline(deadline)
} else {
c.underlying.SetReadDeadline(time.Time{}) // remove the read deadline
}
// perform reads with
// c.underlying.Read(myBuffer)
return frameData, nil
}
However, as far as I understand that code, this only respects a context.WithTimeout or context.WithDeadline, and not a context.WithCancel.
If possible, I would like to pass that into the connection somehow, or actually abort the reading process.
How can I do that?
Note: If possible, I would like to avoid another function that reads in another goroutine and pushed a result back on a channel, because then, when calling cancel, and I am reading 2GB over the network, that doesn't actually cancel the read, and the resources are still used. If not possible in another way however, I would like to know if there is a better way of doing that than a function with two channels, one for a []byte result and one for an error.
EDIT:
With the following code, I can respect a cancel, but it doesn't abort the read.
// apply deadline ...
result := make(chan interface{})
defer close(result)
go c.receiveAsync(result)
select {
case res := <-result:
if err, ok := res.(error); ok {
return nil, err
}
return res.([]byte), nil
case <-ctx.Done():
return nil, ErrTimeout
}
}
func (c *tcpConn) receiveAsync(result chan interface{}) {
// perform the reads and push either an error or the
// read bytes to the result channel
If the connection can be closed on cancellation, you can setup a goroutine to shutdown the connection on cancellation within the Receive method. If the connection must be reused again later, then there is no way to cancel a Read in progress.
recvDone := make(chan struct{})
defer close(recvDone)
// setup the cancellation to abort reads in process
go func() {
select {
case <-ctx.Done():
c.underlying.CloseRead()
// Close() can be used if this isn't necessarily a TCP connection
case <-recvDone:
}
}()
It will be a little more work if you want to communicate the cancelation error back, but the CloseRead will provide a clean way to stop any pending TCP Read calls.
I am using the following simple polling mechanism:
func poll() {
for {
if a {
device1()
time.Sleep(time.Second * 10)
} else {
sensor1()
time.Sleep(time.Second * 10)
}
}
}
I need to poll the device1 only if "a" is true otherwise poll the sensor1. Now here "a" will be set to true by clicking the button on UI which will be a random act.
But because of time.Sleep, delay get introduced while checking the condition.
Is there any way to immediately stop time.Sleep when I get the value for "a"?
what are the possible ways to implement such interrupts while polling in golang?
You cannot interrupt a time.Sleep().
Instead if you need to listen to other "events" while waiting for something, you should use channels and the select statements. A select is similar to a switch, but with the cases all referring to communication operations.
The good thing of select is that you may list multiple cases, all refering to comm. ops (e.g. channel receives or sends), and whichever operation can proceed will be chosen (choosing one randomly if multiple can proceed). If none can proceed, it will block (wait) until one of the comm. ops can proceed (unless there's a default).
A "timeout" event may be realized using time.After(), or a series of "continuous" timeout events (ticks or heartbeats) using time.Ticker.
You should change your button handler to send a value (the button state) on a channel, so receiving from this channel can be added to the select inside poll(), and processed "immediately" (without waiting for a timeout or the next tick event).
See this example:
func poll(buttonCh <-chan bool) {
isDevice := false
read := func() {
if isDevice {
device1()
} else {
sensor1()
}
}
ticker := time.NewTicker(1 * time.Second)
defer func() { ticker.Stop() }()
for {
select {
case isDevice = <-buttonCh:
case <-ticker.C:
read()
}
}
}
func device1() { log.Println("reading device1") }
func sensor1() { log.Println("reading sensor1") }
Testing it:
buttonCh := make(chan bool)
// simulate a click after 2.5 seconds:
go func() {
time.Sleep(2500 * time.Millisecond)
buttonCh <- true
}()
go poll(buttonCh)
time.Sleep(5500 * time.Millisecond)
Output (try it on the Go Playground):
2009/11/10 23:00:01 reading sensor1
2009/11/10 23:00:02 reading sensor1
2009/11/10 23:00:03 reading device1
2009/11/10 23:00:04 reading device1
2009/11/10 23:00:05 reading device1
This solution makes it easy to add any number of event sources without changing anything. For example if you want to add a "shutdown" event and a "read-now" event to the above solution, this is how it would look like:
for {
select {
case isDevice = <-buttonCh:
case <-ticker.C:
read()
case <-readNowCh:
read()
case <-shutdownCh:
return
}
}
Where shutdownCh is a channel with any element type, and to signal shutdown, you do that by closing it:
var shutdownCh = make(chan struct{})
// Somewhere in your app:
close(shutdownCh)
Closing the channel has the advantage that you may have any number of goroutines "monitoring" it, all will receive the shutdown (this is like broadcasting). Receive from a closed channel can proceed immediately, yielding the zero-value of the channel's element type.
To issue an "immediate read" action, you would send a value on readNowCh.
I have some code that is a job dispatcher and is collating a large amount of data from lots of TCP sockets. This code is a result of an approach to Large number of transient objects - avoiding contention and it largely works with CPU usage down a huge amount and locking not an issue now either.
From time to time my application locks up and the "Channel length" log is the only thing that keeps repeating as data is still coming in from my sockets. However the count remains at 5000 and no downstream processing is taking place.
I think the issue might be a race condition and the line it is possibly getting hung up on is channel <- msg within the select of the jobDispatcher. Trouble is I can't work out how to verify this.
I suspect that as select can take items at random the goroutine is returning and the shutdownChan doesn't have a chance to process. Then data hits inboundFromTCP and it blocks!
Someone might spot something really obviously wrong here. And offer a solution hopefully!?
var MessageQueue = make(chan *trackingPacket_v1, 5000)
func init() {
go jobDispatcher(MessageQueue)
}
func addMessage(trackingPacket *trackingPacket_v1) {
// Send the packet to the buffered queue!
log.Println("Channel length:", len(MessageQueue))
MessageQueue <- trackingPacket
}
func jobDispatcher(inboundFromTCP chan *trackingPacket_v1) {
var channelMap = make(map[string]chan *trackingPacket_v1)
// Channel that listens for the strings that want to exit
shutdownChan := make(chan string)
for {
select {
case msg := <-inboundFromTCP:
log.Println("Got packet", msg.Avr)
channel, ok := channelMap[msg.Avr]
if !ok {
packetChan := make(chan *trackingPacket_v1)
channelMap[msg.Avr] = packetChan
go processPackets(packetChan, shutdownChan, msg.Avr)
packetChan <- msg
continue
}
channel <- msg
case shutdownString := <-shutdownChan:
log.Println("Shutting down:", shutdownString)
channel, ok := channelMap[shutdownString]
if ok {
delete(channelMap, shutdownString)
close(channel)
}
}
}
}
func processPackets(ch chan *trackingPacket_v1, shutdown chan string, id string) {
var messages = []*trackingPacket_v1{}
tickChan := time.NewTicker(time.Second * 1)
defer tickChan.Stop()
hasCheckedData := false
for {
select {
case msg := <-ch:
log.Println("Got a messages for", id)
messages = append(messages, msg)
hasCheckedData = false
case <-tickChan.C:
messages = cullChanMessages(messages)
if len(messages) == 0 {
messages = nil
shutdown <- id
return
}
// No point running checking when packets have not changed!!
if hasCheckedData == false {
processMLATCandidatesFromChan(messages)
hasCheckedData = true
}
case <-time.After(time.Duration(time.Second * 60)):
log.Println("This channel has been around for 60 seconds which is too much, kill it")
messages = nil
shutdown <- id
return
}
}
}
Update 01/20/16
I tried to rework with the channelMap as a global with some mutex locking but it ended up deadlocking still.
Slightly tweaked the code, still locks but I don't see how this one does!!
https://play.golang.org/p/PGpISU4XBJ
Update 01/21/17
After some recommendations I put this into a standalone working example so people can see. https://play.golang.org/p/88zT7hBLeD
It is a long running process so will need running locally on a machine as the playground kills it. Hopefully this will help get to the bottom of it!
I'm guessing that your problem is getting stuck doing this channel <- msg at the same time as the other goroutine is doing shutdown <- id.
Since neither the channel nor the shutdown channels are buffered, they block waiting for a receiver. And they can deadlock waiting for the other side to become available.
There are a couple of ways to fix it. You could declare both of those channels with a buffer of 1.
Or instead of signalling by sending a shutdown message, you could do what Google's context package does and send a shutdown signal by closing the shutdown channel. Look at https://golang.org/pkg/context/ especially WithCancel, WithDeadline and the Done functions.
You might be able to use context to remove your own shutdown channel and timeout code.
And JimB has a point about shutting down the goroutine while it might still be receiving on the channel. What you should do is send the shutdown message (or close, or cancel the context) and continue to process messages until your ch channel is closed (detect that with case msg, ok := <-ch:), which would happen after the shutdown is received by the sender.
That way you get all of the messages that were incoming until the shutdown actually happened, and should avoid a second deadlock.
I'm new to Go but in this code here
case msg := <-inboundFromTCP:
log.Println("Got packet", msg.Avr)
channel, ok := channelMap[msg.Avr]
if !ok {
packetChan := make(chan *trackingPacket_v1)
channelMap[msg.Avr] = packetChan
go processPackets(packetChan, shutdownChan, msg.Avr)
packetChan <- msg
continue
}
channel <- msg
Aren't you putting something in channel (unbuffered?) here
channel, ok := channelMap[msg.Avr]
So wouldn't you need to empty out that channel before you can add the msg here?
channel <- msg
Like I said, I'm new to Go so I hope I'm not being goofy. :)
I'm having difficulty using time.Tick. I expect this code to print "hi" 10 times then quit after 1 second, but instead it hangs:
ticker := time.NewTicker(100 * time.Millisecond)
time.AfterFunc(time.Second, func () {
ticker.Stop()
})
for _ = range ticker.C {
go fmt.Println("hi")
}
https://play.golang.org/p/1p6-ViSvma
Looking at the source, I see that the channel isn't closed when Stop() is called. In that case, what is the idiomatic way to iterate over the ticker channel?
You're right, ticker's channel is not being closed on stop, that's stated in a documentation:
Stop turns off a ticker. After Stop, no more ticks will be sent. Stop does not close the channel, to prevent a read from the channel succeeding incorrectly.
I believe ticker is more about fire and forget and even if you want to stop it, you could even leave the routine hanging forever (depends on your application of course).
If you really need a finite ticker, you can do tricks and provide a separate channel (per ThunderCat's answer), but what I would do is providing my own implementation of ticker. This should be relatively easy and will give you flexibility with its behaviour, things like what to pass on the channel or deciding what to do with missing ticks (i.e. when reader is falling behind).
My example:
func finiteTicker(n int, d time.Duration) <-chan time.Time {
ch := make(chan time.Time, 1)
go func() {
for i := 0; i < n; i++ {
time.Sleep(d)
ch <- time.Now()
}
close(ch)
}()
return ch
}
func main() {
for range finiteTicker(10, 100*time.Millisecond) {
fmt.Println("hi")
}
}
http://play.golang.org/p/ZOwJlM8rDm
I asked on IRC as well, at got some useful insight from #Tv`.
Despite timer.Ticker looking like it should be part of a go pipeline, it does not actually play well with the pipeline idioms:
Here are the guidelines for pipeline construction:
stages close their outbound channels when all the send operations are done.
stages keep receiving values from inbound channels until those channels are closed or the senders are unblocked.
Pipelines unblock senders either by ensuring there's enough buffer for all the values that are sent or by explicitly signalling senders when the receiver may abandon the channel.
The reason for this inconsistency appears to be primarily to support the following idiom:
for {
select {
case <-ticker.C:
// do something
case <-done:
return
}
}
I don't know why this is the case, and why the pipelining idiom wasn't used:
for {
select {
case _, ok := <-ticker.C:
if ok {
// do something
} else {
return
}
}
}
(or more cleanly)
for _ = range ticker.C {
// do something
}
But this is the way go is :(