I have the following code for a module I'm developing, and I'm not sure why provider.Shutdown() is never called when I call .Stop().
The main process does stop, but I'm confused about why this doesn't work.
package pluto
import (
"context"
"fmt"
"log"
"sync"
)
type Client struct {
name string
providers []Provider
cancelCtxFunc context.CancelFunc
}
func NewClient(name string) *Client {
return &Client{name: name}
}
func (c *Client) Start(blocking bool) {
log.Println(fmt.Sprintf("Starting the %s service", c.name))
ctx, cancel := context.WithCancel(context.Background())
c.cancelCtxFunc = cancel // assign for later use
var wg sync.WaitGroup
for _, p := range c.providers {
wg.Add(1)
provider := p
go func() {
provider.Setup()
select {
case <-ctx.Done():
// THIS IS NEVER CALLED?!??!
provider.Shutdown()
return
default:
provider.Run(ctx)
}
}()
}
if blocking {
wg.Wait()
}
}
func (c *Client) RegisterProvider(p Provider) {
c.providers = append(c.providers, p)
}
func (c *Client) Stop() {
log.Println("Attempting to stop service")
c.cancelCtxFunc()
}
Client code
package main
import (
"pluto/pkgs/pluto"
"time"
)
func main() {
client := pluto.NewClient("test-client")
testProvider := pluto.NewTestProvider()
client.RegisterProvider(testProvider)
client.Start(false)
time.Sleep(time.Second * 3)
client.Stop()
}
Because it's already chosen the other case before the context is cancelled. Here is your code, annotated:
// Start a new goroutine
go func() {
provider.Setup()
// Select the first available case
select {
// Is the context cancelled right now?
case <-ctx.Done():
// THIS IS NEVER CALLED?!??!
provider.Shutdown()
return
// No? Then call provider.Run()
default:
provider.Run(ctx)
// Run returned, nothing more to do, we're not in a loop, so our goroutine returns
}
}()
Once provider.Run is called, cancelling the context isn't going to do anything in the code shown. provider.Run also gets the context though, so it is free to handle cancellation as it sees fit. If you want your routine to also see cancellation, you could wrap this in a loop:
go func() {
provider.Setup()
for {
select {
case <-ctx.Done():
// THIS IS NEVER CALLED?!??!
provider.Shutdown()
return
default:
provider.Run(ctx)
}
}
}()
This way, once provider.Run returns, it will go through the select again, and if the context has been cancelled, that case will be called. However, if the context hasn't been cancelled, it'll call provider.Run again, which may or may not be what you want.
EDIT:
More typically, you'd have one of a couple scenarios, depending on how provider.Run and provider.Shutdown work, which hasn't been made clear in the question, so here are your options:
Shutdown must be called when the context is cancelled, and Run must only be called once:
go func() {
provider.Setup()
go provider.Run(ctx)
go func() {
<- ctx.Done()
provider.Shutdown()
}()
}()
Or Run, which already receives the context, already does the same thing as Shutdown when the context is cancelled, and therefore calling Shutdown when the context is cancelled is wholly unnecessary:
go provider.Run(ctx)
Consider this (https://play.golang.org/p/zvDiwul9QR0):
package main
import (
"context"
"fmt"
"time"
)
func main() {
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
for {
select {
case <-ctx.Done():
fmt.Println("Done")
break
default:
for {
fmt.Println("loop")
time.Sleep(500 * time.Millisecond)
}
}
}
}
Here the context's Done() channel is closed after 2 seconds. I want to catch this and exit my infinite for loop. The code above does not do this; it never exits the loop.
How can I achieve this?
Context cancellation is not magic; it is just a signaling mechanism. To abort work, you need to monitor the state of the context from your worker goroutine:
for {
fmt.Println("loop")
select {
case <-time.After(500 * time.Millisecond):
case <-ctx.Done():
return
}
}
https://play.golang.org/p/L6-nDpo9chb
Also, as Eli pointed out, break will only break out of the select statement, so you need something more precise (a labeled break or a return) to exit the loop. Refactoring the work into a function makes return a much more intuitive way to abort the task.
Following up from comments. I would refactor your task like so:
// any potentially blocking task should take a context
// style: context should be the first passed in parameter
func myTask(ctx context.Context, poll time.Duration) error {
for {
fmt.Println("loop")
select {
case <-time.After(poll):
case <-ctx.Done():
return ctx.Err()
}
}
}
https://play.golang.org/p/I3WDVd1uHbz
I have the following code in Go using the semaphore library just as an example:
package main
import (
"fmt"
"context"
"time"
"golang.org/x/sync/semaphore"
)
// This protects the lockedVar variable
var lock *semaphore.Weighted
// Only one go routine should be able to access this at once
var lockedVar string
func acquireLock() {
err := lock.Acquire(context.TODO(), 1)
if err != nil {
panic(err)
}
}
func releaseLock() {
lock.Release(1)
}
func useLockedVar() {
acquireLock()
fmt.Printf("lockedVar used: %s\n", lockedVar)
releaseLock()
}
func causeDeadLock() {
acquireLock()
// calling this from a function that's already
// locked the lockedVar should cause a deadlock.
useLockedVar()
releaseLock()
}
func main() {
lock = semaphore.NewWeighted(1)
lockedVar = "this is the locked var"
// this is only on a separate goroutine so that the standard
// go "deadlock" message doesn't print out.
go causeDeadLock()
// Keep the primary goroutine active.
for true {
time.Sleep(time.Second)
}
}
Is there a way to get the acquireLock() function call to print a message after a timeout indicating that there is a potential deadlock but without unblocking the call? I would want the deadlock to persist, but a log message to be written in the event that a timeout is reached. So a TryAcquire isn't exactly what I want.
An example of what I want in pseudocode:
afterFiveSeconds := func() {
fmt.Printf("there is a potential deadlock\n")
}
lock.Acquire(context.TODO(), 1, afterFiveSeconds)
The lock.Acquire call in this example would call the afterFiveSeconds callback if the Acquire call blocked for more than 5 seconds, but it would not unblock the caller. It would continue to block.
I think I've found a solution to my problem.
func acquireLock() {
timeoutChan := make(chan bool)
go func() {
select {
case <-time.After(time.Second * time.Duration(5)):
fmt.Printf("potential deadlock while acquiring semaphore\n")
case <-timeoutChan:
break
}
}()
err := lock.Acquire(context.TODO(), 1)
close(timeoutChan)
if err != nil {
panic(err)
}
}
I'm relatively new to Golang and am trying to incorporate Contexts into my code.
I see the benefits in terms of cancelling from the parent as well as sharing context-specific stuff (loggers, for example).
Beyond that, I might be missing something, but I can't see a way for a child to cancel the context. The example here would be if one of the child routines encounters an error that means the whole context is done.
Here's some sample code:
package main
import (
"context"
"fmt"
"math/rand"
"os"
"os/signal"
"sync"
"time"
)
func main() {
ctx, cancel := context.WithCancel(context.Background())
// handle SIGINT (control+c)
go func() {
c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt)
<-c
fmt.Println("main: interrupt received. cancelling context.")
cancel()
}()
wg := sync.WaitGroup{}
wg.Add(1)
go func() {
child1DoWork(ctx)
wg.Done()
}()
wg.Add(1)
go func() {
child2DoWork(ctx)
wg.Done()
}()
fmt.Println("main: waiting for children to finish")
wg.Wait()
fmt.Println("main: children done. exiting.")
}
func child1DoWork(ctx context.Context) {
// pretend we're doing something useful
tck := time.NewTicker(5 * time.Second)
for {
select {
case <-tck.C:
fmt.Println("child1: still working")
case <-ctx.Done():
// context cancelled
fmt.Println("child1: context cancelled")
return
}
}
}
func child2DoWork(ctx context.Context) {
// pretend we're doing something useful
tck := time.NewTicker(2 * time.Second)
for {
select {
case <-tck.C:
if rand.Intn(5) < 4 {
fmt.Println("child2: did some work")
} else {
// pretend we encountered an error
fmt.Println("child2: error encountered. need to cancel but how do I do it?!?")
// PLACEHOLDER: HOW TO CANCEL FROM HERE?
return
}
case <-ctx.Done():
// context cancelled
fmt.Println("child2: context cancelled")
return
}
}
}
Here you have an example of cancelling from the parent (due to SIGINT) which works great. However, there's a placeholder in child2DoWork where an error is encountered and I want to then cancel the whole context, but I can't see a way to do that with the vanilla context capabilities.
Is this out-of-scope for contexts? Clearly I could communicate from child2 back to the parent which could then cancel, but I'm wondering if there isn't an easier way.
If communication back to the parent is the proper way, is there an idiomatic way of doing this? It does seem like a common problem.
Thanks!
A child can't and shouldn't cancel a context; that's the parent's call. What a child may do is return an error, and the parent should decide if the error requires cancelling the context.
Just because a "subtask" fails doesn't mean all other subtasks need to be cancelled. Often, a failing subtask makes the remaining subtasks more important. Think of a parallel search: you may use multiple subtasks to search for the same thing in multiple sources. You may use the fastest result and cancel the slower ones. But if one search fails, you do want the rest to continue.
Obviously, if you pass the cancel function to the child, the child will have the power to cancel the context. But it is better to leave that power with the parent.
Is this out-of-scope for contexts? Clearly I could communicate from child2 back to the parent which could then cancel, but I'm wondering if there isn't an easier way.
Yes, this is exactly backwards for contexts. They are explicitly for a caller to cancel. The correct mechanism here is the simplest and most obvious: when child2DoWork encounters an error, it should return an error, and when the caller gets an error back, if the correct response is to cancel other tasks, it can then cancel the appropriate context(s).
Essentially, the child is a task, and it should be isolated from any other tasks. It shouldn't be trying to manage its siblings; the parent should be managing all of its children.
In the case where:
the parent spawns multiple child goroutines to achieve one goal, and
if one child fails, the parent needs to stop its siblings,
you can use a channel to communicate: the parent listens on the channel and, once an error arrives, cancels all the child tasks.
I have modified your code accordingly:
package main
import (
"context"
"fmt"
"math/rand"
"os"
"os/signal"
"sync"
"time"
)
func main() {
ctx, cancel := context.WithCancel(context.Background())
// buffered so child2 can report an error even if the listener has already exited
errChan := make(chan error, 1)
// handle SIGINT (control+c)
go func() {
c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt)
select {
case <-c:
fmt.Println("main: interrupt received. cancelling context.")
case err := <-errChan:
fmt.Printf("main: child goroutine returns error. cancelling context. %s\n", err)
}
cancel()
}()
wg := sync.WaitGroup{}
wg.Add(1)
go func() {
child1DoWork(ctx)
wg.Done()
}()
wg.Add(1)
go func() {
child2DoWork(ctx, errChan)
wg.Done()
}()
fmt.Println("main: waiting for children to finish")
wg.Wait()
fmt.Println("main: children done. exiting.")
}
func child1DoWork(ctx context.Context) {
// pretend we're doing something useful
tck := time.NewTicker(5 * time.Second)
for {
select {
case <-tck.C:
fmt.Println("child1: still working")
case <-ctx.Done():
// context cancelled
fmt.Println("child1: context cancelled")
return
}
}
}
func child2DoWork(ctx context.Context, errChan chan error) {
// pretend we're doing something useful
tck := time.NewTicker(2 * time.Second)
for {
select {
case <-tck.C:
if rand.Intn(5) < 4 {
fmt.Println("child2: did some work")
} else {
// pretend we encountered an error
fmt.Println("child2: error encountered")
// report the error to the parent, which decides whether to cancel
errChan <- fmt.Errorf("error in child2")
return
}
case <-ctx.Done():
// context cancelled
fmt.Println("child2: context cancelled")
return
}
}
}
I have a node that I want to keep running until I decide to stop it. I currently have a Start method that blocks on the context's Done channel, and a Stop function that calls cancel. In my tests, Start seems to hang forever and Stop is never called. I can't work out why the Done channel isn't being closed and stopping my node.
var (
// Ctx is the node's main context.
Ctx, cancel = context.WithCancel(context.Background())
// Cancel is a function used to initiate graceful shutdown.
Cancel = cancel
)
type (
Node struct {
database *store.Store
}
)
// Starts run the Node.
func (n *Node) Start() error {
var nodeError error
defer func() {
err := n.database.Close()
if err != nil {
nodeError = err
}
}()
<-Ctx.Done()
return nodeError
}
// Stop stops the node.
func (n *Node) Stop() {
Cancel()
}
And my test is:
func TestNode_Start(t *testing.T) {
n, _ := node.NewNode("1.0")
err := n.Start()
n.Stop()
assert.NoError(t, err)
}
There are several problems with your code. Let's break it down.
var (
// Ctx is the node's main context.
Ctx, cancel = context.WithCancel(context.Background())
// Cancel is a function used to initiate graceful shutdown.
Cancel = cancel
)
These should not be package variables. They should be instance variables, that is to say, members of the Node struct. If these are package variables and you have multiple tests that use Node, the tests will all step on each other's toes, causing race conditions and crashes. So instead, do this:
type Node struct {
database *store.Store
ctx context.Context
cancel context.CancelFunc
}
Next, we see that you have a deferred function in your Start() method:
// Starts run the Node.
func (n *Node) Start() error {
var nodeError error
defer func() {
err := n.database.Close()
if err != nil {
nodeError = err
}
}()
/* snip */
This does not do what you expect. It closes the database connection as soon as Start() returns--before anything possibly has a chance to use it.
Instead, you should close the database connection as part of your Stop() method:
// Stop stops the node.
func (n *Node) Stop() error {
n.cancel()
return n.database.Close()
}
And finally, your Start() method blocks, because it waits for the context to cancel, which cannot possibly be canceled until Stop() is called, which is only ever called after Start() returns:
func (n *Node) Start() error {
/* snip */
<-Ctx.Done()
return nodeError
}
I cannot think of any reason to have <-Ctx.Done() in Start() at all, so I would just remove it.
With all of my suggested changes, you should have something like this:
type Node struct {
database *store.Store
ctx context.Context
cancel context.CancelFunc
}
// Start runs the Node.
func (n *Node) Start() {
n.ctx, n.cancel = context.WithCancel(context.Background())
}
// Stop stops the node.
func (n *Node) Stop() error {
n.cancel()
return n.database.Close()
}
Of course, this still leaves open the question of if/where/how ctx is used. Since your original code didn't include that, I didn't either.
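For illustration only, here is one way the per-instance context might be used: Start launches background work that watches the context and returns immediately, and Stop cancels, waits for that work, and then closes the database. fakeStore is a hypothetical stand-in for store.Store, and the background goroutine is a placeholder, since the original code does not show what the node actually does.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// fakeStore is a hypothetical stand-in for store.Store.
type fakeStore struct{ closed bool }

func (s *fakeStore) Close() error {
	s.closed = true
	return nil
}

type Node struct {
	database *fakeStore
	cancel   context.CancelFunc
	wg       sync.WaitGroup
}

// Start launches the node's background work and returns immediately.
func (n *Node) Start() {
	ctx, cancel := context.WithCancel(context.Background())
	n.cancel = cancel
	n.wg.Add(1)
	go func() {
		defer n.wg.Done()
		<-ctx.Done() // placeholder for real work that watches ctx
	}()
}

// Stop cancels the context, waits for the background work to finish,
// and then closes the database.
func (n *Node) Stop() error {
	n.cancel()
	n.wg.Wait()
	return n.database.Close()
}

func main() {
	n := &Node{database: &fakeStore{}}
	n.Start()
	err := n.Stop()
	fmt.Println(err, n.database.closed) // <nil> true
}
```

With this shape, a test can call Start and then Stop in sequence without hanging, because Start no longer blocks.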
I am attempting to create a poller in Go that spins up and every 24 hours executes a function.
I also want to be able to stop the polling. I'm attempting to do this by having a done channel and sending an empty struct on it to stop the for loop.
In my tests, the for loop runs forever and I can't seem to stop it. Am I using the done channel incorrectly? The ticker case works as expected.
type Poller struct {
HandlerFunc HandlerFunc
interval *time.Ticker
done chan struct{}
}
func (p *Poller) Start() error {
for {
select {
case <-p.interval.C:
err := p.HandlerFunc()
if err != nil {
return err
}
case <-p.done:
return nil
}
}
}
func (p *Poller) Stop() {
p.done <- struct{}{}
}
Here is the test that's executing the code and causing the infinite loop.
poller := poller.NewPoller(
testHandlerFunc,
time.NewTicker(1*time.Millisecond),
)
err := poller.Start()
assert.Error(t, err)
poller.Stop()
The problem seems to be in how you call it: you are calling poller.Start() in a blocking manner, so poller.Stop() is never reached. It's common in Go projects to start a goroutine inside Start/Run methods, so in poller.Start() I would do something like this:
func (p *Poller) Start() <-chan error {
errc := make(chan error, 1)
go func() {
defer close(errc)
for {
select {
case <-p.interval.C:
err := p.HandlerFunc()
if err != nil {
errc <- err
return
}
case <-p.done:
return
}
}
}()
return errc
}
Also, there's no need to send an empty struct on the done channel. Closing the channel with close(p.done) is more idiomatic in Go.
There is no explicit way in Go to broadcast an event to goroutines for something like cancellation. Instead, it's idiomatic to create a channel that, when closed, signals a message such as "cancel any work in progress". Something like this is a viable pattern:
var done = make(chan struct{})
func cancelled() bool {
select {
case <-done:
return true
default:
return false
}
}
Goroutines can call cancelled to poll for a cancellation.
Then your main loop can respond to such an event, but make sure you drain any channels that might otherwise cause goroutines to block.
for {
select {
case <-done:
// Drain whatever channels you need to.
for range someChannel { }
return
//.. Other cases
}
}
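The broadcast property comes from the fact that a receive on a closed channel never blocks, so a single close releases every goroutine waiting on it. Here is a small sketch of that behaviour; stopAll is a made-up helper for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// stopAll starts n goroutines that block until done is closed, closes
// done once, and returns how many goroutines observed the signal.
func stopAll(n int) int {
	done := make(chan struct{})

	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		stopped int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-done // unblocks for every waiter once done is closed
			mu.Lock()
			stopped++
			mu.Unlock()
		}()
	}

	close(done) // one close wakes all waiters
	wg.Wait()
	return stopped
}

func main() {
	fmt.Println(stopAll(5)) // 5
}
```

Sending a value instead of closing would wake only one receiver per send, which is why close is the usual choice for cancellation.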