How to include goroutine into a context? - go

I'm working on a Go project that require calling an initiation function (initFunction) in a separated goroutine (to ensure this function does not interfere with the rest of the project). initFunction must not take more than 30 seconds, so I thought I would use context.WithTimeout. Lastly, initFunction must be able to notify errors to the caller, so I thought of making an error channel and calling initFunction from an anonymous function, to recieve and report the error.
func RunInitGoRoutine(initFunction func(config string)error) error {
initErr := make(chan error)
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Seconds)
go func() {
<-ctx.Done() // Line 7
err := initFunction(config)
initErr <-err
}()
select {
case res := <-initErr:
return res.err
case <-ctx.Done():
err := errors.New("Deadline")
return err
}
}
I'm quite new to Go, so I'm asking for feedbacks about the above code.
I have some doubt about Line 7. I used this to ensure the anonymous function is "included" under ctx and is therefore killed and freed and everything once timeout expires, but I'm not sure I have done the right thing.
Second thing is, I know I should be calling cancel( ) somewhere, but I can't put my finger around where.
Lastly, really any feedback is welcome, being it about efficency, style, correctness or anything.

In Go the pratice is to communicate via channels. So best thing is probably to share a channel on your context so others can consume from the channel.
As you are stating you are new to Go, I wrote a whole bunch of articles on Go (Beginner level) https://marcofranssen.nl/categories/golang.
Read from old to new to get familiar with the language.
Regarding the channel specifics you should have a look at this article.
https://marcofranssen.nl/concurrency-in-go
A pratical example of a webserver listening for ctrl+c and then gracefully shutting down the server using channels is described in this blog post.
https://marcofranssen.nl/improved-graceful-shutdown-webserver
In essence we run the server in a background routine
go func() {
if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
srv.l.Fatal("Could not listen on", zap.String("addr", srv.Addr), zap.Error(err))
}
}()
and then we have some code that is blocking the main routine by listening on a channel for the shutdown signal.
quit := make(chan os.Signal, 1)
signal.Notify(quit, os.Interrupt)
sig := <-quit
srv.l.Info("Server is shutting down", zap.String("reason", sig.String()))
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
srv.SetKeepAlivesEnabled(false)
if err := srv.Shutdown(ctx); err != nil {
srv.l.Fatal("Could not gracefully shutdown the server", zap.Error(err))
}
srv.l.Info("Server stopped")
This is very similar to your usecase. So running your init in a background routine and then consume the channel waiting for the result of this init routine.
package main
import (
"fmt"
"time"
)
type InitResult struct {
Message string
}
func main() {
initResult := make(chan InitResult, 0)
go func(c chan<- InitResult) {
time.Sleep(5 * time.Second)
// here we are publishing the result on the channel
c <- InitResult{Message: "Initialization succeeded"}
}(initResult)
fmt.Println("Started initializing")
// here we have a blocking operation consuming the channel
res := <-initResult
fmt.Printf("Init result: %s", res.Message)
}
https://play.golang.org/p/_YGIrdNVZx6
You could also add an error field on the struct so you could do you usual way of error checking.

Related

Goroutine safe channel close doesn't actually close webscoket

This one is a tricky issue that bugs me quite a bit.
Essentially, I wrote an integration microservice that provides data streams from Binance crypto exchange using the Go client. A client sends a start messages, starts data stream for a symbol, and at some point, sends a close message to stop the stream. My implementation looks basically like this:
func (c BinanceClient) StartDataStream(clientType bn.ClientType, symbol, interval string) error {
switch clientType {
case bn.SPOT_LIVE:
wsKlineHandler := c.handlers.klineHandler.SpotKlineHandler
wsErrHandler := c.handlers.klineHandler.ErrHandler
_, stopC, err := binance.WsKlineServe(symbol, interval, wsKlineHandler, wsErrHandler)
if err != nil {
fmt.Println(err)
return err
} else {
c.state.clientSymChanMap[clientType][symbol] = stopC
return nil
}
...
}
The clientSymChanMap stores the stopChannel in a nested hashmap so that I can retrieve the stop channel later to stop the data feed. The stop function has been implemented accordingly:
func (c BinanceClient) StopDataStream(clientType bn.ClientType, symbol string) {
//mtd := "StopDataStream: "
stopC := c.state.clientSymChanMap[clientType][symbol]
if isClosed(stopC) {
DbgPrint(" Channel is already closed. Do nothing for: " + symbol)
} else {
close(stopC)
}
// Delete channel from the map otherwise the next StopAll throws a NPE due to closing a dead channel
delete(c.state.clientSymChanMap[clientType], symbol)
return
}
To prevent panics from already closed channels, I use a check function that returns true in case the channel is already close.
func isClosed(ch <-chan struct{}) bool {
select {
case <-ch:
return true
default:
}
return false
}
Looks nice, but has a catch. When I run the code with starting data for just one symbol, it starts and closes the datafeed exactly as expected.
However, when starting multiple data feeds, then the above code somehow never closes the websocket and just keeps streaming data forever. Without the isClosed check, I get panics of trying to close a closed channel, but with the check in place, well, nothing gets closed.
When looking at the implementation of the above binance.WsKlineServe function, it's quite obvious that it just wraps a new websocket with each invocation and then returns the done & stop channel.
The documentation gives the following usage example:
wsKlineHandler := func(event *binance.WsKlineEvent) {
fmt.Println(event)
}
errHandler := func(err error) {
fmt.Println(err)
}
doneC, stopC, err := binance.WsKlineServe("LTCBTC", "1m", wsKlineHandler, errHandler)
if err != nil {
fmt.Println(err)
return
}
<-doneC
Because the doneC channel actually blocks, I removed it and thought that storing the stopC channel and then use it later to stop the datafeed would work. However, it only does so for one single instance. When multiple streams are open, this doesn't work anymore.
Any idea what that's the case and how to fix it?
Firstly, this is dangerous:
if isClosed(stopC) {
DbgPrint(" Channel is already closed. Do nothing for: " + symbol)
} else {
close(stopC) // <- can't be sure channel is still open
}
there is no guarantee that after your polling check of the channel state, that the channel will still be in that same state in the next line of code. So this code could in theory could panic if it's called concurrently.
If you want an asynchronous action to occur on the channel close - it's best to do this explicitly from its own goroutine. So you could try this:
go func() {
stopC := c.state.clientSymChanMap[clientType][symbol]
<-stopC
// stopC definitely closed now
delete(c.state.clientSymChanMap[clientType], symbol)
}()
P.S. you do need some sort of mutex on your map, since the delete is asynchronous - you need to ensure any adds to the map don't datarace with this.
P.P.S Channels are reclaimed by the GC when they go out of scope. If you are no longer reading from it - they do not need to be explicitly closed to be reclaimed by the GC.
Using channels for stopping a goroutine or closing something is very tricky. There are lots of things you can do wrong or forget to do.
context.WithCancel abstracts that complexity away, making the code more readable and maintainable.
Some code snippets:
ctx, cancel := context.WitchCancel(context.TODO())
TheThingToCancel(ctx, ...)
// Whenever you want to stop TheThingToCancel. Can be called multiple times.
cancel()
Then in a for loop you'd often have a select like this:
for {
select {
case <-ctx.Done():
return
default:
}
// do stuff
}
Here some code that is closer to your specific case of an open connection:
func TheThingToCancel(ctx context.Context) (context.CancelFunc, error) {
ctx, cancel := context.WithCancel(ctx)
conn, err := net.Dial("tcp", ":12345")
if err != nil {
cancel()
return nil, err
}
go func() {
<-ctx.Done()
_ = conn.Close()
}()
go func() {
defer func() {
_ = conn.Close()
// make sure context is always cancelled to avoid goroutine leak
cancel()
}()
var bts = make([]byte, 1024)
for {
n, err := conn.Read(bts)
if err != nil {
return
}
fmt.Println(bts[:n])
}
}()
return cancel, nil
}
It returns the cancel function to be able to close it from the outside.
Cancelling a context can be done many times over without a panic like would occur if a channel is closed multiple times. That is one advantage. Also you can derive contexts from other contexts and thereby close a lot of contexts that all stop different routines by closing a parent context. Carefully designed, this is very powerful for shutting down different routines belonging together that also need to be able to be shut down individually.

Using context with cancel, Go routine doesn't terminate

I'm new to Go and concurrency in Go. I'm trying to use a Go context to cancel a set of Go routines once I find a member with a given ID.
A Group stores a list of Clients, and each Client has a list of Members. I want to search in parallel all the Clients and all their Members to find a Member with a given ID. Once this Member is found, I want to cancel all the other Go routines and return the discovered Member.
I've tried the following implementation, using a context.WithCancel and a WaitGroup.
This doesn't work however, and hangs indefinitely, never getting past the line waitGroup.Wait(), but I'm not sure why exactly.
func (group *Group) MemberWithID(ID string) (*models.Member, error) {
found := make(chan *models.Member)
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
defer cancel()
var waitGroup sync.WaitGroup
for _, client := range group.Clients {
waitGroup.Add(1)
go func(clientToQuery Client) {
defer waitGroup.Done()
select {
case <-ctx.Done():
return
default:
}
member, _ := client.ClientMemberWithID(ID)
if member != nil {
found <- member
cancel()
return
}
} (client)
}
waitGroup.Wait()
if len(found) > 0 {
return <-found, nil
}
return nil, fmt.Errorf("no member found with given id")
}
found is an unbuffered channel, so sending on it blocks until there is someone ready to receive from it.
Your main() function would be the one to receive from it, but only after waitGroup.Wait() returns. But that will block until all launched goroutines call waitGroup.Done(). But that won't happen until they return, which won't happen until they can send on found. It's a deadlock.
If you change found to be buffered, that will allow sending values on it even if main() is not ready to receive from it (as many values as big the buffer is).
But you should receive from found before waitGroup.Wait() returns.
Another solution is to use a buffer of 1 for found, and use non-blocking send on found. That way the first (fastest) goroutine will be able to send the result, and the rest (given we're using non-blocking send) will simply skip sending.
Also note that it should be the main() that calls cancel(), not each launched goroutines individually.
For this kind of use case I think a sync.Once is probably a better fit than a channel. When you find the first non-nil member, you want to do two different things:
Record the member you found.
Cancel the remaining goroutines.
A buffered channel can easily do (1), but makes (2) a bit more complicated. But a sync.Once is perfect for doing two different things the first time something interesting happens!
I would also suggest aggregating non-trivial errors, so that you can report something more useful than no member found if, say, your database connection fails or some other nontrivial error occurs. You can use a sync.Once for that, too!
Putting it all together, I would want to see something like this (https://play.golang.org/p/QZXUUnbxOv5):
func (group *Group) MemberWithID(ctx context.Context, id string) (*Member, error) {
ctx, cancel := context.WithCancel(ctx)
defer cancel()
var (
wg sync.WaitGroup
member *Member
foundOnce sync.Once
firstNontrivialErr error
errOnce sync.Once
)
for _, client := range group.Clients {
wg.Add(1)
client := client // https://golang.org/doc/faq#closures_and_goroutines
go func() {
defer wg.Done()
m, err := client.ClientMemberWithID(ctx, id)
if m != nil {
foundOnce.Do(func() {
member = m
cancel()
})
} else if nf := (*MemberNotFoundError)(nil); !errors.As(err, &nf) {
errOnce.Do(func() {
firstNontrivialErr = err
})
}
}()
}
wg.Wait()
if member == nil {
if firstNontrivialErr != nil {
return nil, firstNontrivialErr
}
return nil, &MemberNotFoundError{ID: id}
}
return member, nil
}

How chan bool is making goroutine waiting?

I'm building an app to run a command every time the code changes. I'm using a fsnotify for this feature. But, I can't understand how it is waiting a main goroutine.
I found that using sync.WaitGroup is more idiomatic, but I'm curious how done chan bool makes a goroutine is waiting in fsnotify example code.
I've tried to remove done in the example code of fsnotify, but it's not waiting a goroutine, just is exited.
watcher, err := fsnotify.NewWatcher()
if err != nil {
log.Fatal(err)
}
defer watcher.Close()
done := make(chan bool)
go func() {
for {
select {
case event, ok := <-watcher.Events:
if !ok {
return
}
log.Println("event:", event)
if event.Op&fsnotify.Write == fsnotify.Write {
log.Println("modified file:", event.Name)
}
case err, ok := <-watcher.Errors:
if !ok {
return
}
log.Println("error:", err)
}
}
}()
err = watcher.Add("/tmp/foo")
if err != nil {
log.Fatal(err)
}
<-done
I'm not entirely sure what you're asking, but there's a subtle bug in the code you've provided.
A done channel is a common way to block until an action completes. It is used like this:
done := make(chan X) // Where X is any type
go func() {
// Some logic, possibly in a loop
close(done)
}()
// Other logic
<-done // Wait for `done` to be closed
The type of the channel is unimportant, as no data is (necissarily) sent over the channel, so bool works, but struct{} is more idiomatic, as it indicates that no data can be sent.
Your example almost does this, except that it never calls close(done). This is a bug. It means that the code will always wait forever at <-done, thus negating the entire purpose of a done channel. Your example code will never exit.
This means the code, as you have provided, could be also written as:
go func() {
// Do stuff
}()
// Do other stuff
<Any code that blocks forever>
Because there are countless ways to block forever--none of them ever useful in practice--the channel in your example is not needed.
As per my study, I found an answer from one guy in reddit.com. This is kind of trick though, using <-done makes the main goroutine waiting an any value from chan done, eventually this app keeps running for fsnotify to watch and send a event to the main goroutine.

Behavior of server.GracefulStop() in golang

I have a gRPC server, and I have implemented graceful shutdown of my gRPC server something like this
fun main() {
//Some code
term := make(chan os.Signal)
go func() {
if err := grpcServer.Serve(lis); err != nil {
term <- syscall.SIGINT
}
}()
signal.Notify(term, syscall.SIGTERM, syscall.SIGINT)
<-term
server.GracefulStop()
closeDbConnections()
}
This works fine.
If instead I write the grpcServer.Serve() logic in main goroutine and instead put the shutdown handler logic into another goroutine, statements after server.GracefulStop() usually do not execute. Some DbConnections are closed, if closeDbConnections() is executed at all.
server.GracefulStop() is a blocking call. Definitely grpcServer.Serve() finishes before server.GracefulStop() completes. So, how long does main goroutine take to stop after this call returns?
The problematic code
func main() {
term := make(chan os.Signal)
go func() {
signal.Notify(term, syscall.SIGTERM, syscall.SIGINT)
<-term
server.GracefulStop()
closeDbConnections()
}()
if err := grpcServer.Serve(lis); err != nil {
term <- syscall.SIGINT
}
}
This case does not work as expected. After server.GracefulStop() is done, closeDbConnections() may or may not run (usually does not run to completion). I was testing the later case by sending SIGINT by hitting Ctrl-C from my terminal.
Can someone please explain this behavior?
I'm not sure about your question (please clarify it), but I would suggest you to refactor your main in this way:
func main() {
// ...
errChan := make(chan error)
stopChan := make(chan os.Signal)
// bind OS events to the signal channel
signal.Notify(stopChan, syscall.SIGTERM, syscall.SIGINT)
// run blocking call in a separate goroutine, report errors via channel
go func() {
if err := grpcServer.Serve(lis); err != nil {
errChan <- err
}
}()
// terminate your environment gracefully before leaving main function
defer func() {
server.GracefulStop()
closeDbConnections()
}()
// block until either OS signal, or server fatal error
select {
case err := <-errChan:
log.Printf("Fatal error: %v\n", err)
case <-stopChan:
}
I don't think it's a good idea to mix system events and server errors, like you do in your example: in case if Serve fails, you just ignore the error and emit system event, which actually didn't happen. Try another approach when there are two transports (channels) for two different kind of event that cause process termination.

Do goroutines with receiving channel as parameter stop, when the channel is closed?

I have been reading "Building microservices with go" and the book introduces apache/go-resiliency/deadline package for handling timeouts.
deadline.go
// Package deadline implements the deadline (also known as "timeout") resiliency pattern for Go.
package deadline
import (
"errors"
"time"
)
// ErrTimedOut is the error returned from Run when the deadline expires.
var ErrTimedOut = errors.New("timed out waiting for function to finish")
// Deadline implements the deadline/timeout resiliency pattern.
type Deadline struct {
timeout time.Duration
}
// New constructs a new Deadline with the given timeout.
func New(timeout time.Duration) *Deadline {
return &Deadline{
timeout: timeout,
}
}
// Run runs the given function, passing it a stopper channel. If the deadline passes before
// the function finishes executing, Run returns ErrTimeOut to the caller and closes the stopper
// channel so that the work function can attempt to exit gracefully. It does not (and cannot)
// simply kill the running function, so if it doesn't respect the stopper channel then it may
// keep running after the deadline passes. If the function finishes before the deadline, then
// the return value of the function is returned from Run.
func (d *Deadline) Run(work func(<-chan struct{}) error) error {
result := make(chan error)
stopper := make(chan struct{})
go func() {
result <- work(stopper)
}()
select {
case ret := <-result:
return ret
case <-time.After(d.timeout):
close(stopper)
return ErrTimedOut
}
}
deadline_test.go
package deadline
import (
"errors"
"testing"
"time"
)
func takesFiveMillis(stopper <-chan struct{}) error {
time.Sleep(5 * time.Millisecond)
return nil
}
func takesTwentyMillis(stopper <-chan struct{}) error {
time.Sleep(20 * time.Millisecond)
return nil
}
func returnsError(stopper <-chan struct{}) error {
return errors.New("foo")
}
func TestDeadline(t *testing.T) {
dl := New(10 * time.Millisecond)
if err := dl.Run(takesFiveMillis); err != nil {
t.Error(err)
}
if err := dl.Run(takesTwentyMillis); err != ErrTimedOut {
t.Error(err)
}
if err := dl.Run(returnsError); err.Error() != "foo" {
t.Error(err)
}
done := make(chan struct{})
err := dl.Run(func(stopper <-chan struct{}) error {
<-stopper
close(done)
return nil
})
if err != ErrTimedOut {
t.Error(err)
}
<-done
}
func ExampleDeadline() {
dl := New(1 * time.Second)
err := dl.Run(func(stopper <-chan struct{}) error {
// do something possibly slow
// check stopper function and give up if timed out
return nil
})
switch err {
case ErrTimedOut:
// execution took too long, oops
default:
// some other error
}
}
1st question
// in deadline_test.go
if err := dl.Run(takesTwentyMillis); err != ErrTimedOut {
t.Error(err)
}
I have problem understanding the execution flow of above code. As far as I understand, because the takesTwentyMillis function sleeps longer than the set timeout duration of 10 milliseconds,
// in deadline.go
case <-time.After(d.timeout):
close(stopper)
return ErrTimedOut
time.After emits current time, and this case is selected. Then the stopper channel is closed and ErrTimeout is returned.
What I do not understand is, what closing the stopper channel does to the anonymous goroutine that might still be running
I think, when the stopper channel is closed, the below goroutine might still be running.
go func() {
result <- work(stopper)
}()
(Please correct me if I'm wrong here) I think after close(stopper), this goroutine will call takesTwentyMillis(=work function) with stopper channel as its parameter. And the function will proceed and sleep for 20 milliseconds and return nil to pass to result channel. And the main() ends here, right?
I do not see what is the point of closing the stopper channel here. The takesTwentyMillis function does not seem to use the channel within the function body anyway :(.
2nd question
// in deadline_test.go within TestDeadline()
done := make(chan struct{})
err := dl.Run(func(stopper <-chan struct{}) error {
<-stopper
close(done)
return nil
})
if err != ErrTimedOut {
t.Error(err)
}
<-done
This is the part I do not understand completely. I think when dl.Run is run, stopper channel is initialized. But because there is no value in the stopper channel, the function call will be blocked at <-stopper...but because I do not understand this code, I do not see why this code exists in the first place (i.e. what this code is trying to test, and how it is executed, etc).
3rd(additional) question regarding the 2nd question
So I understand that when Run function in the 2nd question triggers the stopper channel to close, the worker function gets the signal. And the worker closes the done channel and returns nil.
I used delve(=go debugger) to see this, and the gdb takes me to the goroutine in deadline.go after the line return nil.
err := dl.Run(func(stopper <-chan struct{}) error {
<-stopper
close(done)
--> return nil
})
After typing n for stepping over to the next line, delve takes me here
go func() {
--> result <- work(stopper)
}()
And the process kind of finishes here because when I type n again the command line prompts PASS and process exits. Why does the process finishes here? The work(stopper) seems to return nil, which should then be passed to result channel right? But this line does not seem to execute for some reason.
I know the main goroutine, which is the Run function, has already returned ErrTimedOut. So I guess it has something to do with this?
1st question
The use of the stopper channel is to signal the function e.g. takesTwentyMillis that it's deadline is reached and the caller no longer cares about its result. Usually this means that the worker function like takesTwentyMillis should check if the stopper channel is already closed so that it may cancel it's work. Still, checking for the stopper channel is the worker function's choice. It may or may not check the channel.
func takesTwentyMillis(stopper <-chan struct{}) error {
for i := 0; i < 20; i++ {
select {
case <-stopper:
// caller doesn't care anymore might as well stop working
return nil
case <-time.After(time.Second): // simulating work
}
}
// work is done
return nil
}
2nd question
This part of Deadline.Run() will close the stopper channel.
case <-time.After(d.timeout):
close(stopper)
Reading on a closed channel (<-stopper) will return a zero value for that channel immediately. I think it's just testing for a worker function that ultimately times-out.

Resources