goroutine not seeing context cancel? - go

I have two goroutines running at the same time.
At some point I want my program to exit gracefully, so I call cancel() to notify my goroutines that they need to stop, but only one of the two receives the message.
Here is my main (simplified):
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
wg := &sync.WaitGroup{}
wg.Add(2)
go func() {
err := eng.Watcher(ctx, wg)
if err != nil {
cancel()
}
}()
go func() {
err := eng.Suspender(ctx, wg)
if err != nil {
cancel()
}
}()
<-done // wait for SIGINT / SIGTERM
log.Print("receive shutdown")
cancel()
wg.Wait()
log.Print("controller exited properly")
The Suspender goroutine exits successfully (here is the code):
package main
import (
"context"
"sync"
"time"
log "github.com/sirupsen/logrus"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/util/retry"
)
func (eng *Engine) Suspender(ctx context.Context, wg *sync.WaitGroup) error {
contextLogger := eng.logger.WithFields(log.Fields{
"go-routine": "Suspender",
})
contextLogger.Info("starting Suspender goroutine")
now := time.Now().In(eng.loc)
for {
select {
case n := <-eng.Wl:
//dostuff
case <-ctx.Done():
// The context is over, stop processing results
contextLogger.Infof("goroutine Suspender canceled by context")
return nil
}
}
}
and here is the func that is not receiving the context cancellation:
package main
import (
"context"
"sync"
"time"
log "github.com/sirupsen/logrus"
)
func (eng *Engine) Watcher(ctx context.Context, wg *sync.WaitGroup) error {
contextLogger := eng.logger.WithFields(log.Fields{
"go-routine": "Watcher",
"uptime-schedule": eng.upTimeSchedule,
})
contextLogger.Info("starting Watcher goroutine")
ticker := time.NewTicker(time.Second * 30)
for {
select {
case <-ctx.Done():
contextLogger.Infof("goroutine watcher canceled by context")
log.Printf("toto")
return nil
case <-ticker.C:
//dostuff
}
}
}
Can you please help me?
Thanks :)

Did you try it with an errgroup? It has context cancellation baked in:
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGTERM)
// "golang.org/x/sync/errgroup"
g, ctx := errgroup.WithContext(ctx)
// The errgroup tracks the goroutines itself, so this assumes Watcher and
// Suspender are changed to take only ctx (no *sync.WaitGroup parameter).
g.Go(func() error {
	return eng.Watcher(ctx)
})
g.Go(func() error {
	return eng.Suspender(ctx)
})
g.Go(func() error {
	select {
	case <-done: // SIGINT / SIGTERM
		log.Print("receive shutdown")
		cancel()
	case <-ctx.Done(): // another goroutine failed; don't block Wait forever
	}
	return nil
})
if err := g.Wait(); err != nil {
	log.Print(err)
}
log.Print("controller exited properly")

On the surface the code looks good. The only thing I can think of is that it's busy in "dostuff". It can be tricky to step through timing-related code in a debugger, so try adding some logging:
case <-ticker.C:
log.Println("doing stuff")
//dostuff
log.Println("done stuff")
(I also assume you are calling wg.Done() somewhere in your goroutines, though if it is missing, that would not cause the problem you describe.)

Neither Suspender nor Watcher decrements the WaitGroup counter by calling Done(), so wg.Wait() in main never returns; that is why the program never finishes.
To be honest, it's easy to forget such small things. That's why a common practice in Go is to use defer at the very beginning of a function for anything critical that must happen before it returns.
The updated implementation might look like this:
func (eng *Engine) Suspender(ctx context.Context, wg *sync.WaitGroup) error {
defer wg.Done()
// ------------------------------------
func (eng *Engine) Watcher(ctx context.Context, wg *sync.WaitGroup) error {
defer wg.Done()
contextLogger := eng.logger.WithFields(log.Fields{
Another suggestion, looking at the main routine: it is often recommended to pass the context explicitly as an argument to any goroutine or function literal you start, rather than capturing the variable in a closure.
This makes the dependency explicit and avoids subtle bugs if the captured ctx variable is reassigned later.
go func(ctx context.Context) {
err := eng.Watcher(ctx, wg)
if err != nil {
cancel()
}
}(ctx)
Edit-1: (a clarification)
Passing ctx as an argument does not create separate child contexts: context.Context is an interface value, so the closure and the parameter refer to the same underlying context. ctx.Done() returns a channel that is closed on cancellation, and a closed channel unblocks every goroutine receiving from it, so all goroutines that share one context are notified.
Passing ctx as an argument is therefore a readability and safety convention; what keeps the program from exiting here is the missing wg.Done().
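A quick way to check how cancellation reaches goroutines that share one context is a small experiment like the following. It is only a sketch; the helper name notified is invented for the example. Both goroutines capture the same ctx through the closure, and the function counts how many of them see the cancellation.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// notified starts n goroutines that all wait on the same ctx (captured by
// the closure, not passed as an argument) and returns how many of them
// observed the cancellation.
func notified(n int) int64 {
	ctx, cancel := context.WithCancel(context.Background())
	var seen int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-ctx.Done() // Done() is closed on cancel, so every receiver unblocks
			atomic.AddInt64(&seen, 1)
		}()
	}
	cancel()
	wg.Wait()
	return atomic.LoadInt64(&seen)
}

func main() {
	fmt.Println("goroutines notified:", notified(2)) // prints "goroutines notified: 2"
}
```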

Related

How to stop a goroutine when an error occurs [duplicate]

This question already has answers here:
Close multiple goroutine if an error occurs in one in go
(3 answers)
Closed 3 years ago.
When one goroutine hits an error, how do I stop the other?
I must use both res1 and res2; in production, res1 and res2 do not have the same static type.
package main
import (
"fmt"
"net/http"
"sync"
)
func main() {
wg := &sync.WaitGroup{}
wg.Add(2)
var res1, res2 *http.Response
var err1, err2 error
go func() {
defer wg.Done()
res1, err1 = http.Get("http://127.0.0.1:8899")
if err1 != nil {
panic(err1)
}
}()
go func() {
defer wg.Done()
res2, err2 = http.Get("http://127.0.0.1:8898")
if err2 != nil {
panic(err2)
}
}()
wg.Wait()
fmt.Println(res1, res2)
}
A common context should be able to cancel all waiting requests. Something like this:
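ctx, cancel := context.WithCancel(context.Background())
defer cancel()
cli := http.Client{}
go func() {
	// NewRequestWithContext returns both a request and an error.
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		cancel()
		return
	}
	response, err := cli.Do(req)
	if err != nil {
		cancel() // aborts every other request that shares ctx
		return
	}
	defer response.Body.Close()
	// use response
}()
You should use the same ctx for all HTTP requests and cancel it when one fails. Once the context is canceled, all the other HTTP requests are aborted as well.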

goroutine didn't respect `ctx.done()` or quit properly

I am trying to quit gracefully when the user presses Ctrl-C. I am trying the code from "Make Ctrl+C cancel the context.Context".
package main
import (
"context"
"fmt"
"os"
"os/signal"
"time"
)
func main() {
ctx := context.Background()
// trap Ctrl+C and call cancel on the context
ctx, cancel := context.WithCancel(ctx)
c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt)
defer func() {
signal.Stop(c)
cancel()
fmt.Println("Cleaned up")
}()
go func() {
select {
case <-c:
fmt.Println("Got interrupt signal")
cancel()
case <-ctx.Done():
}
fmt.Println("Stopped monitoring")
}()
select {
case <-ctx.Done():
fmt.Println("notified to quit")
case <-time.NewTimer(time.Second * 2).C:
fmt.Println("done something")
}
}
It works as expected when the user presses Ctrl-C; it prints the following:
Got interrupt signal
Stopped monitoring
notified to quit
Cleaned up
However, if it quits normally, it doesn't work as expected, as shown below:
done something
Cleaned up
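I expected it to also print "Stopped monitoring", but it doesn't. The deferred cleanup function calls cancel(), which should make the select in the monitoring goroutine return, but apparently it doesn't.
How can I solve this?
Thanks @Zan Lynx, I worked out the solution below: main returned (and the program exited) before the monitoring goroutine had a chance to run, so main now waits for it to finish.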
package main
import (
"context"
"fmt"
"os"
"os/signal"
"time"
)
func main() {
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
terminated := monitor(ctx, cancel)
defer func() {
cancel()
fmt.Println("Cleaned up")
<-terminated // wait for the monitor goroutine to quit
}()
select {
case <-ctx.Done():
fmt.Println("notified to quit")
case <-time.NewTimer(time.Second * 1).C:
fmt.Println("done something")
}
}
func monitor(ctx context.Context, cancel context.CancelFunc) <-chan interface{} {
c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt)
terminated := make(chan interface{})
go func() {
defer close(terminated)
defer fmt.Println("Stopped monitoring1")
defer signal.Stop(c)
select {
case <-c:
fmt.Println("Got interrupt signal")
cancel()
case <-ctx.Done():
}
}()
return terminated
}

Parent-child context cancelling order in Go

I want to know whether there are any guarantees about the order in which things return upon Context cancellation in Go.
I want to create a context with cancellation and, once all the listeners have finished reacting to "<-ctx.Done()" from this context, call os.Exit safely.
A concrete example of what I want: catch a signal, trigger all cancellations, and then call os.Exit().
I create a context and listen for a signal:
ctx, cancel := context.WithCancel(context.Background())
go func() {
c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt)
defer signal.Stop(c)
select {
case <-c:
cancel()
}
}()
In other places I "sign up" for this request several times:
res := NewRes()
go func() {
<-ctx.Done()
res.Close()
}()
But then I want to call os.Exit at the point when all the listeners are done.
For that I plan to create either parent or child context like this:
parent, pCancel := context.WithCancel(context.Background())
child, _ := context.WithCancel(parent)
go func() {
c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt)
defer signal.Stop(c)
select {
case <-c:
pCancel()
case <-child.Done():
os.Exit(0)
}
}()
Unfortunately, I did not find any documentation describing the order in which contexts are canceled, so I cannot come up with a correct solution for now.
You have to wait for all goroutines to finish before exiting. Calling pCancel() only signals cancellation; it doesn't mean everything has stopped. I recommend doing all the work in goroutines and waiting for the os.Interrupt signal in the main goroutine.
See the example below:
package main
import (
"context"
"fmt"
"os"
"os/signal"
"sync"
"time"
)
func main() {
	parent, pCancel := context.WithCancel(context.Background())
	child, cCancel := context.WithCancel(parent)
	defer cCancel()
	wg := &sync.WaitGroup{}
	for i := 0; i < 10; i++ {
		wg.Add(1) // register before the goroutine starts, so Wait cannot miss it
		go work(child, wg)
	}
	c := make(chan os.Signal, 1) // signal.Notify does not block, so the channel must be buffered
	signal.Notify(c, os.Interrupt)
	defer signal.Stop(c)
	<-c
	pCancel() // cancelling the parent also cancels the child
	fmt.Println("Waiting everyone to finish...")
	wg.Wait()
	fmt.Println("Exiting")
	// Returning from main exits with status 0 and, unlike os.Exit, runs the deferred calls.
}
func work(ctx context.Context, wg *sync.WaitGroup) {
	defer wg.Done()
	for {
		fmt.Println("Doing something...")
		time.Sleep(time.Second)
		select {
		case <-ctx.Done():
			fmt.Println("Done")
			return
		default:
		}
	}
}
That said, Go recommends the principle "Do not communicate by sharing memory; instead, share memory by communicating."
Here is another example that uses a channel instead of a WaitGroup.
package main
import (
"context"
"fmt"
"os"
"os/signal"
"time"
)
func main() {
parent, pCancel := context.WithCancel(context.Background())
child, _ := context.WithCancel(parent)
done := make(chan struct{})
jobsCount := 10
for i := 0; i < jobsCount; i++ {
go work(child, done)
}
c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt)
defer signal.Stop(c)
select {
case <-c:
pCancel()
fmt.Println("Waiting everyone to finish...")
for i := 0; i < jobsCount; i++ {
<-done
}
fmt.Println("Exiting")
os.Exit(0)
}
}
func work(ctx context.Context, doneChan chan struct{}) {
done := false
for !done {
fmt.Println("Doing something...")
time.Sleep(time.Second)
select {
case <-ctx.Done():
fmt.Println("Done")
done = true
default:
}
}
doneChan <- struct{}{}
}

close(channel) used to implement the observer pattern

I need to stop an HTTP server on demand, and also call other functions, in no specific order, when a "quit" signal is received.
In trying to implement something like the observer pattern, I found it handy to create a channel (quit := make(chan struct{})) as the "subject"; each of the "observer" goroutines then blocks on <-quit until something changes and then continues.
I trigger all the functions at once by closing the channel with close(quit) rather than by writing to it. I have tried this and so far it works, but I wonder whether there are drawbacks to this approach, or a better/more idiomatic way of implementing similar behavior.
package main
import (
"log"
"net/http"
"sync"
"time"
)
func main() {
var wg sync.WaitGroup
srv := &http.Server{Addr: ":8080"}
wg.Add(1)
go func() {
log.Println(srv.ListenAndServe())
wg.Done()
}()
quit := make(chan struct{})
go func() {
<-quit
if err := srv.Close(); err != nil {
log.Printf("HTTP server Shutdown: %v", err)
}
}()
wg.Add(1)
go func() {
<-quit
log.Println("just waiting 1")
wg.Done()
}()
wg.Add(1)
go func() {
<-quit
log.Println("just waiting 2")
wg.Done()
}()
<-time.After(2 * time.Second)
close(quit)
wg.Wait()
}
https://play.golang.org/p/uIfMJfN6xQy
I would say your way is good enough but lacks some elegance.
You could implement required behavior using sync.Cond:
https://golang.org/pkg/sync/#Cond
How to correctly use sync.Cond?
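For comparison, a sync.Cond version of the same "notify everyone" behavior might look roughly like this. It is a sketch only; the quitSignal type and its methods are hypothetical names, not part of the question's code.

```go
package main

import (
	"fmt"
	"sync"
)

// quitSignal is a hypothetical helper: a flag guarded by a mutex, plus a
// condition variable used to wake every goroutine waiting for the flag.
type quitSignal struct {
	mu   sync.Mutex
	cond *sync.Cond
	quit bool
}

func newQuitSignal() *quitSignal {
	q := &quitSignal{}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// wait blocks until trigger has been called.
func (q *quitSignal) wait() {
	q.mu.Lock()
	for !q.quit { // always wait in a loop and re-check the condition
		q.cond.Wait()
	}
	q.mu.Unlock()
}

// trigger sets the flag and wakes all current waiters.
func (q *quitSignal) trigger() {
	q.mu.Lock()
	q.quit = true
	q.mu.Unlock()
	q.cond.Broadcast()
}

// observe starts n observers, triggers the signal, and returns how many
// observers ran.
func observe(n int) int {
	q := newQuitSignal()
	var wg sync.WaitGroup
	var mu sync.Mutex
	ran := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			q.wait()
			mu.Lock()
			ran++
			mu.Unlock()
		}()
	}
	q.trigger()
	wg.Wait()
	return ran
}

func main() {
	fmt.Println("observers notified:", observe(3)) // prints "observers notified: 3"
}
```

One trade-off worth noting: a sync.Cond cannot be used as a case in a select statement, whereas a closed channel (or a context's Done channel) can be combined with timeouts and other channels.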

should I use a channel or a sync.Mutex lock()?

While running go test -race, I found that os.Process.Kill was being called before the command had been started with cmd.Start(). I came up with two possible solutions. One is to use a channel:
package main
import "os/exec"
func main() {
cmd := exec.Command("sleep", "10")
started := make(chan struct{}, 1)
go func() {
<-started
cmd.Process.Kill()
}()
if err := cmd.Start(); err != nil {
panic(err)
}
started <- struct{}{}
cmd.Wait()
}
or to use a lock:
package main
import (
"os/exec"
"sync"
)
func main() {
var lock sync.Mutex
cmd := exec.Command("sleep", "10")
lock.Lock()
if err := cmd.Start(); err != nil {
panic(err)
}
lock.Unlock()
go func() {
cmd.Process.Kill()
}()
cmd.Wait()
}
Both options work, but I wonder which is the more idiomatic or better approach; the main goal is just to avoid killing a process that hasn't been started.
I would suggest you use a channel, but let me point out something about your code.
I noticed you used a buffered channel, with the main goroutine sending on it after cmd.Start() while a separate goroutine waits to call Kill. In my opinion, it would be better to:
1) Use an unbuffered channel for signalling, especially in this case.
2) Have the goroutine be responsible for starting the process and calling Wait, while signalling to main that it has started.
Like this:
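package main
import "os/exec"
func main() {
	cmd := exec.Command("sleep", "10")
	started := make(chan struct{})
	exited := make(chan struct{})
	go func(cmd *exec.Cmd, started chan<- struct{}) {
		defer close(exited)
		if err := cmd.Start(); err != nil {
			panic(err)
		}
		started <- struct{}{} // cmd.Process is set from this point on
		cmd.Wait()
	}(cmd, started)
	<-started
	cmd.Process.Kill()
	<-exited // let cmd.Wait reap the killed process before main returns
}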

Resources