Context based termination of long running loop in a go routine

Context based termination of long running loop in a go routine - go

I'm implementing a feature where I need to read files from a directory, parse and export them to a REST service at a regular interval. As part of the same I would like to gracefully handle the program termination (SIGKILL, SIGQUIT etc).
Towards the same I would like to know how to implement Context based cancellation of process.
For executing the flow in regular interval I'm using gocron.
cmd/scheduler.go
func scheduleTask(){
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
s := gocron.NewScheduler()
s.Every(10).Minutes().Do(processTask, ctx)
s.RunAll() // run immediate
<-s.Start() // schedule
for {
select {
case <-(ctx).Done():
fmt.Print("context done")
s.Remove(processTask)
s.Clear()
cancel()
default:
}
}
}
func processTask(ctx *context.Context){
task.Export(ctx)
}
task/export.go
func Export(ctx *context.Context){
pendingFiles, err := filepath.Glob("/tmp/pending/" + "*_task.json")
//error handling
//as there can be 100s of files, I would like to break the loop when context.Done() to return asap & clean up the resources here as well
for _, fileName := range pendingFiles {
exportItem(fileName string)
}
}
func exportItem(fileName string){
data, err := ReadFile(fileName) //not shown here for brevity
//err handling
err = postHTTPData(string(data)) //not shown for brevity
//err handling
}

For the process management, I think the other component is the actual handling of signals, and managing the context from those signals.
I'm not sure of the specifics of go-cron (they have an example showing some of these concepts on their github) but in general I think that the steps involved are:
Registration of os signals handler
Waiting to receive a signal
Canceling top level context in response to a signal
Example:
sigCh := make(chan os.Signal, 1)
defer close(sigCh)
signal.Notify(sigCh, syscall.SIGTERM, syscall.SIGQUIT, syscall.SIGINT)
<-sigCh
cancel()
I'm not sure how this will look in the context of go-cron, but the context that the signal handling code cancels should be a parent of the context that the task and job is given.

Worked this out myself just now. I've always felt the blog post on contexts was A LOT of material to try and understand so a simpler demonstration would be nice.
There are many scenarios you may encounter. Each one is different and will require adaptation. Here's one example:
Say you have a channel that could run for an indeterminate amount of time.
indeterminateChannel := make(chan string)
for s := range indeterminateChannel{
fmt.Println(s)
}
Your producer might look something like:
for {
indeterminateChannel <- "Terry"
}
We don't control the producer, so we need someway to cut out of your print loop if the producer exceeds your time limit.
indeterminateChannel := make(chan string)
// Close the channel when our code exits so OUR for loop no longer occupies
// resources and the goroutine exits.
// The producer will have a problem, but we don't care about their problems.
// In this instance.
defer close(indeterminateChannel)
// we wait for this context to time out below the goroutine.
ctx, cancel := context.WithTimeout(context.TODO(), time.Minute*1)
defer cancel()
go func() {
for s := range indeterminateChannel{
fmt.Println(s)
}
}()
<- ctx.Done // wait for the context to terminate based on a timeout.
You can also check ctx.Err to see if the context exited due to a timeout or because it was canceled.
You might also want to learn about how to properly check if the context failed due to a deadline: How to check if an error is "deadline exceeded" error?
Or if the context was canceled: How to check if a request was cancelled

Related

Code executes successfully even though there are issues with WaitGroup implementation

I have this snippet of code which concurrently runs a function using an input and output channel and associated WaitGroups, but I was clued in to the fact that I've done some things wrong. Here's the code:
func main() {
concurrency := 50
var tasksWG sync.WaitGroup
tasks := make(chan string)
output := make(chan string)
for i := 0; i < concurrency; i++ {
tasksWG.Add(1)
// evidentally because I'm processing tasks in a groutine then I'm not blocking and I end up closing the tasks channel almost immediately and stopping tasks from executing
go func() {
for t := range tasks {
output <- process(t)
continue
}
tasksWG.Done()
}()
}
var outputWG sync.WaitGroup
outputWG.Add(1)
go func() {
for o := range output {
fmt.Println(o)
}
outputWG.Done()
}()
go func() {
// because of what was mentioned in the previous comment, the tasks wait group finishes almost immediately which then closes the output channel almost immediately as well which ends ranging over output early
tasksWG.Wait()
close(output)
}()
f, err := os.Open(os.Args[1])
if err != nil {
log.Panic(err)
}
s := bufio.NewScanner(f)
for s.Scan() {
tasks <- s.Text()
}
close(tasks)
// and finally the output wait group finishes almost immediately as well because tasks gets closed right away due to my improper use of goroutines
outputWG.Wait()
}
func process(t string) string {
time.Sleep(3 * time.Second)
return t
}
I've indicated in the comments where I've implementing things wrong. Now these comments make sense to me. The funny thing is that this code does indeed seem to run asynchronously and dramatically speeds up execution. I want to understand what I've done wrong but it's hard to wrap my head around it when the code seems to execute in an asynchronous way. I'd love to understand this better.

Your main goroutine is doing a couple of things sequentially and others concurrently, so I think your order of execution is off
f, err := os.Open(os.Args[1])
if err != nil {
log.Panic(err)
}
s := bufio.NewScanner(f)
for s.Scan() {
tasks <- s.Text()
}
Shouldn't you move this up top? So then you have values sent to tasks
THEN have your loop which ranges over tasks 50 times in the concurrency named for loop (you want to have something in tasks before calling code that ranges over it)
go func() {
// because of what was mentioned in the previous comment, the tasks wait group finishes almost immediately which then closes the output channel almost immediately as well which ends ranging over output early
tasksWG.Wait()
close(output)
}()
The logic here is confusing me, you're spawning a goroutine to wait on the waitgroup, so here the wait is nonblocking on the main goroutine - is that what you want to do? It won't wait for tasksWG to be decremented to zero inside main, it'll do that inside the goroutine that you've created. I don't believe you want to do that?
It might be easier to debug if you could give more details on the expected output?

Goroutine safe channel close doesn't actually close webscoket

This one is a tricky issue that bugs me quite a bit.
Essentially, I wrote an integration microservice that provides data streams from Binance crypto exchange using the Go client. A client sends a start messages, starts data stream for a symbol, and at some point, sends a close message to stop the stream. My implementation looks basically like this:
func (c BinanceClient) StartDataStream(clientType bn.ClientType, symbol, interval string) error {
switch clientType {
case bn.SPOT_LIVE:
wsKlineHandler := c.handlers.klineHandler.SpotKlineHandler
wsErrHandler := c.handlers.klineHandler.ErrHandler
_, stopC, err := binance.WsKlineServe(symbol, interval, wsKlineHandler, wsErrHandler)
if err != nil {
fmt.Println(err)
return err
} else {
c.state.clientSymChanMap[clientType][symbol] = stopC
return nil
}
...
}
The clientSymChanMap stores the stopChannel in a nested hashmap so that I can retrieve the stop channel later to stop the data feed. The stop function has been implemented accordingly:
func (c BinanceClient) StopDataStream(clientType bn.ClientType, symbol string) {
//mtd := "StopDataStream: "
stopC := c.state.clientSymChanMap[clientType][symbol]
if isClosed(stopC) {
DbgPrint(" Channel is already closed. Do nothing for: " + symbol)
} else {
close(stopC)
}
// Delete channel from the map otherwise the next StopAll throws a NPE due to closing a dead channel
delete(c.state.clientSymChanMap[clientType], symbol)
return
}
To prevent panics from already closed channels, I use a check function that returns true in case the channel is already close.
func isClosed(ch <-chan struct{}) bool {
select {
case <-ch:
return true
default:
}
return false
}
Looks nice, but has a catch. When I run the code with starting data for just one symbol, it starts and closes the datafeed exactly as expected.
However, when starting multiple data feeds, then the above code somehow never closes the websocket and just keeps streaming data forever. Without the isClosed check, I get panics of trying to close a closed channel, but with the check in place, well, nothing gets closed.
When looking at the implementation of the above binance.WsKlineServe function, it's quite obvious that it just wraps a new websocket with each invocation and then returns the done & stop channel.
The documentation gives the following usage example:
wsKlineHandler := func(event *binance.WsKlineEvent) {
fmt.Println(event)
}
errHandler := func(err error) {
fmt.Println(err)
}
doneC, stopC, err := binance.WsKlineServe("LTCBTC", "1m", wsKlineHandler, errHandler)
if err != nil {
fmt.Println(err)
return
}
<-doneC
Because the doneC channel actually blocks, I removed it and thought that storing the stopC channel and then use it later to stop the datafeed would work. However, it only does so for one single instance. When multiple streams are open, this doesn't work anymore.
Any idea what that's the case and how to fix it?

Firstly, this is dangerous:
if isClosed(stopC) {
DbgPrint(" Channel is already closed. Do nothing for: " + symbol)
} else {
close(stopC) // <- can't be sure channel is still open
}
there is no guarantee that after your polling check of the channel state, that the channel will still be in that same state in the next line of code. So this code could in theory could panic if it's called concurrently.
If you want an asynchronous action to occur on the channel close - it's best to do this explicitly from its own goroutine. So you could try this:
go func() {
stopC := c.state.clientSymChanMap[clientType][symbol]
<-stopC
// stopC definitely closed now
delete(c.state.clientSymChanMap[clientType], symbol)
}()
P.S. you do need some sort of mutex on your map, since the delete is asynchronous - you need to ensure any adds to the map don't datarace with this.
P.P.S Channels are reclaimed by the GC when they go out of scope. If you are no longer reading from it - they do not need to be explicitly closed to be reclaimed by the GC.

Using channels for stopping a goroutine or closing something is very tricky. There are lots of things you can do wrong or forget to do.
context.WithCancel abstracts that complexity away, making the code more readable and maintainable.
Some code snippets:
ctx, cancel := context.WitchCancel(context.TODO())
TheThingToCancel(ctx, ...)
// Whenever you want to stop TheThingToCancel. Can be called multiple times.
cancel()
Then in a for loop you'd often have a select like this:
for {
select {
case <-ctx.Done():
return
default:
}
// do stuff
}
Here some code that is closer to your specific case of an open connection:
func TheThingToCancel(ctx context.Context) (context.CancelFunc, error) {
ctx, cancel := context.WithCancel(ctx)
conn, err := net.Dial("tcp", ":12345")
if err != nil {
cancel()
return nil, err
}
go func() {
<-ctx.Done()
_ = conn.Close()
}()
go func() {
defer func() {
_ = conn.Close()
// make sure context is always cancelled to avoid goroutine leak
cancel()
}()
var bts = make([]byte, 1024)
for {
n, err := conn.Read(bts)
if err != nil {
return
}
fmt.Println(bts[:n])
}
}()
return cancel, nil
}
It returns the cancel function to be able to close it from the outside.
Cancelling a context can be done many times over without a panic like would occur if a channel is closed multiple times. That is one advantage. Also you can derive contexts from other contexts and thereby close a lot of contexts that all stop different routines by closing a parent context. Carefully designed, this is very powerful for shutting down different routines belonging together that also need to be able to be shut down individually.

How to include goroutine into a context?

I'm working on a Go project that require calling an initiation function (initFunction) in a separated goroutine (to ensure this function does not interfere with the rest of the project). initFunction must not take more than 30 seconds, so I thought I would use context.WithTimeout. Lastly, initFunction must be able to notify errors to the caller, so I thought of making an error channel and calling initFunction from an anonymous function, to recieve and report the error.
func RunInitGoRoutine(initFunction func(config string)error) error {
initErr := make(chan error)
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Seconds)
go func() {
<-ctx.Done() // Line 7
err := initFunction(config)
initErr <-err
}()
select {
case res := <-initErr:
return res.err
case <-ctx.Done():
err := errors.New("Deadline")
return err
}
}
I'm quite new to Go, so I'm asking for feedbacks about the above code.
I have some doubt about Line 7. I used this to ensure the anonymous function is "included" under ctx and is therefore killed and freed and everything once timeout expires, but I'm not sure I have done the right thing.
Second thing is, I know I should be calling cancel( ) somewhere, but I can't put my finger around where.
Lastly, really any feedback is welcome, being it about efficency, style, correctness or anything.

In Go the pratice is to communicate via channels. So best thing is probably to share a channel on your context so others can consume from the channel.
As you are stating you are new to Go, I wrote a whole bunch of articles on Go (Beginner level) https://marcofranssen.nl/categories/golang.
Read from old to new to get familiar with the language.
Regarding the channel specifics you should have a look at this article.
https://marcofranssen.nl/concurrency-in-go
A pratical example of a webserver listening for ctrl+c and then gracefully shutting down the server using channels is described in this blog post.
https://marcofranssen.nl/improved-graceful-shutdown-webserver
In essence we run the server in a background routine
go func() {
if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
srv.l.Fatal("Could not listen on", zap.String("addr", srv.Addr), zap.Error(err))
}
}()
and then we have some code that is blocking the main routine by listening on a channel for the shutdown signal.
quit := make(chan os.Signal, 1)
signal.Notify(quit, os.Interrupt)
sig := <-quit
srv.l.Info("Server is shutting down", zap.String("reason", sig.String()))
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
srv.SetKeepAlivesEnabled(false)
if err := srv.Shutdown(ctx); err != nil {
srv.l.Fatal("Could not gracefully shutdown the server", zap.Error(err))
}
srv.l.Info("Server stopped")
This is very similar to your usecase. So running your init in a background routine and then consume the channel waiting for the result of this init routine.
package main
import (
"fmt"
"time"
)
type InitResult struct {
Message string
}
func main() {
initResult := make(chan InitResult, 0)
go func(c chan<- InitResult) {
time.Sleep(5 * time.Second)
// here we are publishing the result on the channel
c <- InitResult{Message: "Initialization succeeded"}
}(initResult)
fmt.Println("Started initializing")
// here we have a blocking operation consuming the channel
res := <-initResult
fmt.Printf("Init result: %s", res.Message)
}
https://play.golang.org/p/_YGIrdNVZx6
You could also add an error field on the struct so you could do you usual way of error checking.

Can't close confluent golang kafka consumer

I'm having an issue regarding the disposing of kafka consumer in the end of program execution. Here is code responsible for closing the consumer
func(kc *KafkaConsumer) Dispose() {
Sugar.Info("Disposing of consumer")
kc.mu.Lock()
kc.Consumer.Close();
Sugar.Info("Disposed of consumer")
kc.mu.Unlock()
}
As you might have already noticed, i'm making use of sync.Mutex, inasmuch as consumer is accessed by multiple goroutines. Below is another snippet responsible for reading messages from kafka
func (kc *KafkaConsumer) Consume(signalChan chan os.Signal, ctx context.Context) {
for{
select{
case sig := <-signalChan:
Sugar.Info("Caught signal %v", sig)
break
case <-ctx.Done():
Sugar.Info("Got context done message. Closing consumer...")
kc.Dispose()
break
default:
for{
message, err := kc.Consumer.ReadMessage(-1); if err != nil{
Log.Error(err.Error())
return
}
Sugar.Infof("Got a new message %v",message)
resp := make(chan *KafkaResponseEntity)
go router.UseMessage(*message, resp, ctx)
//Potential deadlock
response := <-resp
/*
Explicit commit of an offset in order to ensure
that request has been successfully processed
*/
kc.Consumer.Commit()
Sugar.Info("Successfully commited an offset")
Sugar.Infof("Just got a response %v", response)
go producer.KP.Produce(response.PaymentId, response.Bytes, "some_random_topic")
}
}
}
}
The problem is that when closing the consumer, program execution simply stalls.
Are there any issues? Should i use cond along with mutex? I'd be very glad if you provide thorough explanation of what might went wrong in my code.
Thanks in advance.

I suspect this is hanging because of:
kc.Consumer.ReadMessage(-1)
Which the documentation states will block indefinitely, hence why it's not closed. The simplest approach would be to make that value a positive time duration (e.g. 1 * time.Second), but then you may get time out errors if messages are not consumed within the timeouts. The time out error is generally innocuous, but is something to account for, from the linked documentation:
Timeout is returned as (nil, err) where err is
`err.(kafka.Error).Code() == kafka.ErrTimedOut`
I'm not yet sure of a good way to utilize the indefinite blocking and allow it to be interrupted. If anyone does know please post the findings!

Stopping other goroutines conditionally

I am in doubt whether all of my spawned goroutines are dying after doing their assigned work.
I have to make two HTTP calls(always), but based on a flag, read the response from either one of them.
what I have done so far is ->
var result error
resultChannel := make(chan error)
var wg sync.WaitGroup
wg.Add(1) // only adding 1, as I don't need to wait for other to complete.
go func() {
_, err := // HTTP call ONE
if flagIsTrue {
defer wg.Done()
resultChannel <- err
}
}()
go func() {
_, err := // HTTP call TWO
if !flagIsTrue {
defer wg.Done()
resultChannel <- err
}
}()
go func() {
wg.Wait()
close(resultChannel)
}()
for err := range resultChannel {
result = err
}
Hence, I will wait for the corresponding call, and listen to its response only. This is working well, but since the app is deployed on the server, where I guess the main goroutine won't die(henceforth killing other goroutines), my main concern is whether the other ignorable thread will die or not after it will get the response from HTTP call(afaik, we need to tell go that a goroutine needs to die).
My concerns:
The assumption(true acc to me) that the main thread does not terminate after serving one of these calls.
Will the ignorable(response is, but necessary to trigger the API call) thread die or not?
Should I use a select case to handle this, if yes then how(other suggestions are welcome)?

If the flagIsTrue is set before creating the goroutines, then only one of the goroutines will be able to write to the channel. The other one will not attempt to write to the channel, and thus will terminate.
You could simply move the check for the flag outside, and create one goroutine based on the flag.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Context based termination of long running loop in a go routine - go

Related

Code executes successfully even though there are issues with WaitGroup implementation

Goroutine safe channel close doesn't actually close webscoket

How to include goroutine into a context?

Can't close confluent golang kafka consumer

Stopping other goroutines conditionally

Categories

Resources