Stop a goroutine if it is running

I have a question similar to How to stop a goroutine with a small twist. I don't know for sure if the goroutine is running.
var quit = make(chan bool)

func f1() {
    go func() {
        t := time.NewTimer(time.Minute)
        select {
        case <-t.C:
            // Do stuff
        case <-quit:
            return
        }
    }()
}

func f2() {
    quit <- true
}
If f2() is called less than a minute after f1(), the goroutine returns. However, if it is called more than a minute later, the goroutine will have already returned and f2() will block.
I want f2() to cancel the goroutine if it is running and do nothing otherwise.
What I'm trying to achieve here is to perform a task if and only if it is not canceled within a minute of creation.
Clarifications:
There is nothing to stop f2() from being called more than once.
There is only one goroutine running at a time. The caller of f1() will make sure that it's not called more than once per minute.

Use contexts.
Run f1 with a context that may be cancelled.
Run f2 with the associated cancellation function.
func f1(ctx context.Context) {
    go func(ctx context.Context) {
        t := time.NewTimer(time.Minute)
        select {
        case <-t.C:
            // Do stuff
        case <-ctx.Done():
            return
        }
    }(ctx)
}

func f2(cancel context.CancelFunc) {
    cancel()
}
And later, to coordinate the two functions, you would do this:
ctx, cancel := context.WithCancel(context.Background())
f1(ctx)
f2(cancel)
You can also experiment with the context.WithTimeout function to incorporate externally-defined timeouts.
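For example, a minimal sketch (the one-minute timeout here is illustrative) where the goroutine is cancelled either explicitly or automatically once the deadline expires:

ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
defer cancel() // releases the context's resources even if the timeout never fires
f1(ctx)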
If you don't know whether a goroutine is already running, you can still initialize the ctx and cancel variables as above without passing them into anything; that way cancel is never nil and you avoid having to check for it.
Remember to pass ctx and cancel around as values rather than storing them in shared variables: the context itself is safe for concurrent use, but a shared variable that several goroutines reassign can introduce a race condition.
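For instance, a minimal sketch of that setup, where calling cancel is always safe because it is never nil (cancelling a context whose goroutine never started is simply a no-op):

ctx, cancel := context.WithCancel(context.Background())
// f1(ctx) may or may not have been called by now
f2(cancel) // safe either way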

You can give the channel a buffer size of 1. This means you can send one value to it without blocking, even if that value is not received right away (or at all).
var quit = make(chan bool, 1)
I think the top answer is better; this is just another solution that could translate to other situations.
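For completeness, a sketch of how f2 could also tolerate being called more than once (as the clarifications allow) by making the send non-blocking:

func f2() {
    select {
    case quit <- true:
        // signal delivered (or buffered)
    default:
        // a quit signal is already pending; nothing to do
    }
}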

Related

What's the best way to resume goroutines' work?

I have to update thousands of structs with data available on some remote servers, so I have to deal with thousands of goroutines querying these servers (HTTP or DB requests) and updating each struct with the responses. But whether a struct gets updated depends on the results of the other structs.
So I imagined a simple design in which the goroutines run, each performing its own request, putting the result into a global struct that holds all the information they retrieve, and informing the main func that the first part of the job is done.
Each goroutine should then wait for a signal from the main thread that all goroutines have finished this first part before deciding whether to update its struct.
The simplified code looks like this:
type StructToUpdate struct {
    // some properties
}

type GlobalWatcher struct {
    // some useful information for all structs to update (or not)
}

func main() {
    // retrieving all structs to update
    structsToUpdates := foo()
    // launching goroutines
    c := make(chan bool)
    for i := range structsToUpdates {
        // index into the slice so each goroutine gets a distinct struct
        // (taking the address of the loop variable would alias one struct)
        go updateStruct(&structsToUpdates[i], c)
    }
    // waiting for all goroutines to finish the first part of their job
    for i := 0; i < len(structsToUpdates); i++ {
        <-c
    }
    // HERE IS THE CODE TO INFORM GOROUTINES THEY CAN RESUME
}

func updateStruct(s *StructToUpdate, c chan bool) {
    result := performSomeRequest(s)
    informGlobalWatcherOfResult(result)
    c <- true // I did my job
    // HERE IS THE CODE TO WAIT FOR THE SIGNAL TO RESUME
}
The question is: what is the most performant / idiomatic / elegant way to:
send a signal from the main script ?
wait for this signal from the goroutine ?
I can imagine 3 ways to do this
In a first way, I can imagine a global var resume = false that the main func turns to true when all goroutines have done the first part of the job. In that case, each goroutine can spin in an ugly for !resume { continue }...
In more idiomatic code, I can imagine doing the same thing with a context.WithValue(ctx, "resume", false) passed to the goroutine instead of a bool, but I would still have a for !ctx.Value("resume") loop.
In a last, more elegant way, I can imagine another resume := make(chan bool) passed to the goroutine. The main func could inform the goroutines to resume by simply closing this channel with close(resume). Each goroutine would wait for the signal with something like:
for {
    _, more := <-resume
    if !more {
        break
    }
}
// update or not
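Side note on that last idea: receiving from a closed channel returns immediately with the zero value, so if resume is only ever closed (never sent on), the loop collapses to a single receive:

<-resume // unblocks as soon as main calls close(resume)
// update or not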
Is there any other good way to do this? Is one of the above solutions better than the others?
I'm not sure if I understand your question completely, but a simple solution to blocking the main thread is an OS signal, e.g.:
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGTERM)
// Blocks main thread
<-done
This doesn't have to be an OS signal, this could be a channel that receives a struct{}{} value from somewhere else in your runtime.
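For instance, a minimal sketch with a plain channel in place of the OS signal:

done := make(chan struct{})
// elsewhere in your runtime: close(done), or done <- struct{}{}
<-done // blocks the main thread until the signal arrives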
If you need to do this as a result of multiple goroutines working, I'd look into using sync.WaitGroup.
OK. Thanks to @blixenkrone I found an elegant solution using a WaitGroup!
Usually, we use a sync.WaitGroup to synchronize goroutines and make sure they finish their job: you Add(1) to the WaitGroup for each goroutine and then Wait(); each goroutine calls Done() when it finishes, decrementing the counter, and Wait() unblocks once that internal counter reaches zero.
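For contrast, a minimal sketch of that usual pattern:

var wg sync.WaitGroup
for i := 0; i < 3; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done() // decrement the counter when this goroutine finishes
        // do work
    }()
}
wg.Wait() // blocks until the counter reaches zero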
What I do is exactly the opposite! I create a WaitGroup, increment it to 1 in the main program before launching the goroutines, and pass it to them. Each goroutine does the first part of its job, notifies the main program, and then waits on the WaitGroup. When the main program has received every first-part notification, it releases the WaitGroup, and each goroutine can finish its job. Here is a skeleton of what I did:
func main() {
    structsToUpdates := getStructs() // some func to get my structs
    communication := make(chan bool)
    // the waitgroup used to block the goroutines
    var resume sync.WaitGroup
    resume.Add(1) // the waitgroup starts blocked
    success := 0
    for i := range structsToUpdates {
        // index into the slice so each goroutine gets a distinct struct
        go updateStruct(&structsToUpdates[i], communication, &resume)
    }
    // waiting for all goroutines to finish the first part of their job
    for i := 0; i < len(structsToUpdates); i++ {
        res := <-communication
        if res {
            success++
        }
    }
    // all structs are done with the first part now... resume their job!
    resume.Done()
    // receive second-part results
    for success != 0 {
        <-communication
        success--
    }
}
func updateStruct(s *StructToUpdate, c chan bool, resume *sync.WaitGroup) {
    // first part of the job
    result, err := performSomeRequest(s)
    if err != nil {
        // something wrong happened
        c <- false
        return
    }
    informGlobalWatcherOfResult(result)
    c <- true // I did my job
    // wait until all goroutines have done the first part of their job
    resume.Wait()
    // here the main program has called Done() on the WaitGroup, so execution resumes
    updateOrNotStruct() // some operations...
    c <- true
}
Even if I suspect wg.Wait() runs a loop of its own internally, I think this code is cleaner than the solutions I proposed above...

Why/When does calling a context cancel function from another goroutine cause a deadlock?

I'm having difficulties wrapping my head around context cancel functions and at which point calling the cancel func causes a deadlock.
I have a main method that declares a context, and I am passing its cancel function to two goroutines:
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
go runService(ctx, wg, cancel, apiChan)
go api.Run(cancel, wg, apiChan, aviorDb)
I use this context in a service function (infinite loop that stops once the context is cancelled).
I am controlling this by calling the cancel function from another goroutine.
runService is a long running operation and looks similar to this:
func runService(ctx context.Context, wg *sync.WaitGroup, cancel context.CancelFunc, apiChan chan string) {
MainLoop:
    for {
        // this is the long running operation
        worker.ProcessJob(dataStore, client, job, resumeChan)
        select {
        case <-ctx.Done():
            _ = glg.Info("service stop signal received")
            break MainLoop
        default:
        }
        select {
        case <-resumeChan:
            continue
        default:
        }
        waitCtx, cancel := context.WithTimeout(context.Background(), time.Duration(sleepTime)*time.Minute)
        globalstate.WaitCtxCancel = cancel
        <-waitCtx.Done()
    }
    _ = dataStore.SignOutClient(client)
    apiChan <- "stop"
    wg.Done()
    cancel()
}
api has a global variable for the context cancel function:
var appCancel context.CancelFunc
It is set in the beginning by the api.Run method like so:
func Run(cancel context.CancelFunc, wg *sync.WaitGroup, stopChan chan string, db *db.DataStore) {
    ...
    appCancel = cancel
    ...
}
api has a stop function which calls the cancel function:
func requestStop(w http.ResponseWriter, r *http.Request) {
    _ = glg.Info("endpoint hit: shut down service")
    if globalstate.WaitCtxCancel != nil {
        globalstate.WaitCtxCancel()
    }
    state := globalstate.Instance()
    state.ShutdownPending = true
    appCancel()
    encoder := json.NewEncoder(w)
    encoder.SetIndent("", "  ")
    _ = encoder.Encode("stop signal received")
}
When the requestStop function is called and thus the context is cancelled, the long running operation (worker.ProcessJob) immediately halts and the entire program deadlocks. Before its next line of code is executed, the code jumps to gopark with reason waitReasonSemAcquire. (scratch that, was just the debugger)
The context cancel function is only called in these two locations.
So it seems like the runService goroutine prevents the api.Run goroutine from acquiring a lock for some reason.
My understanding up to now was that the cancel function can be passed around to different goroutines and there are no synchronization issues attached when calling it.
For example, the WaitCtxCancel function never causes a deadlock when I call it.
I could
replace the context with a 1-buffered channel and send a message to break out of the loop, or
use my global state struct and a boolean to determine whether the service should keep running.
However, I want to understand what's happening here and why.
Also, is there any solution or approach I could take that keeps using contexts?
It seemed like the "correct" thing to use for use cases like mine.
UPDATE:
I have recently found out that changing
appCancel()
to
go appCancel()
seems to fix the issue, which confuses me even more.

Select on blocked call and channel

I am pretty sure I've seen a question on this before but can't find it now.
Essentially I want to select on a blocked call as well as a channel.
I know I can push the blocked call into a goroutine and wait on the result via a channel, however that feels like the wrong solution.
Is there an idiomatic way to write this that I'm missing?
Optimally there would be something like:
select {
case a := <-c:
    ...
case ans := connection.Read():
    ...
}
If you have a channel and a function of which you want to select, using a goroutine and a channel is the idiomatic solution. Note though that if a value is received from the channel, that will not affect the function and it will continue to run. You may use context.Context to signal its result is no longer needed and it may terminate early.
If you're allowed to refactor though, you can "make" the function send on the same channel, so you only need to receive from a single channel.
Another refactoring idea would be for the function to monitor the same channel and return early, so you may just do a single call without select.
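A sketch of the first refactoring idea, assuming you control the reading code and can hand it the channel (conn and the buffer size are illustrative):

results := make(chan []byte)

// the blocking call runs in its own goroutine and delivers
// its result on the same channel other producers use
go func() {
    buf := make([]byte, 1024)
    if n, err := conn.Read(buf); err == nil {
        results <- buf[:n]
    }
}()

ans := <-results // a single receive replaces the two-case select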
Note that if you need to do this in many places, you may create a helper function to launch it asynchronously:
func launch(f func()) <-chan struct{} {
    done := make(chan struct{})
    go func() {
        defer close(done)
        f()
    }()
    return done
}
Example function:
func test() {
    time.Sleep(time.Second)
}
And then using it:
select {
case a := <-c:
    fmt.Println("received from channel:", a)
case <-launch(test):
    fmt.Println("test() finished")
}
Try it on the Go Playground.

How to run for x seconds in a http handler

I want to run my function InsertRecords for 30 seconds and test how many records I can insert in a given time.
How can I stop processing InsertRecords after x seconds and then return a result from my handler?
func benchmarkHandler(w http.ResponseWriter, r *http.Request) {
    counter := InsertRecords()
    w.WriteHeader(200)
    io.WriteString(w, fmt.Sprintf("counter is %d", counter))
}

func InsertRecords() int {
    counter := 0
    // db code goes here
    return counter
}
Cancellations and timeouts are often done with a context.Context.
While this simple example could be done with a channel alone, using the context here makes it more flexible, and can take into account the client disconnecting as well.
func benchmarkHandler(w http.ResponseWriter, r *http.Request) {
    ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
    defer cancel()
    counter := InsertRecords(ctx)
    w.WriteHeader(200)
    io.WriteString(w, fmt.Sprintf("counter is %d", counter))
}

func InsertRecords(ctx context.Context) int {
    counter := 0
    done := ctx.Done()
    for {
        select {
        case <-done:
            return counter
        default:
        }
        // db code goes here
        counter++
    }
}
This will run for at least 30 seconds and return the number of completed database iterations. If you want the handler to always return immediately after 30s, even if the DB call is blocked, you need to push the DB code into another goroutine and let it return later. The shortest example uses a similar pattern as above, but synchronizes access to the counter variable, since it could be written by the DB loop at the moment the function returns.
func InsertRecords(ctx context.Context) int {
    counter := int64(0)
    done := ctx.Done()
    go func() {
        for {
            select {
            case <-done:
                return
            default:
            }
            // db code goes here
            atomic.AddInt64(&counter, 1)
        }
    }()
    <-done
    return int(atomic.LoadInt64(&counter))
}
See @JoshuaKolden's answer for an example with a producer and a timeout, which could also be combined with the existing request context.
As JimB pointed out, cancellation for limiting the time taken by an HTTP request can be handled with context.WithTimeout; however, since you asked for the purposes of benchmarking, you may want to use a more direct method.
The purpose of context.Context is to allow for numerous cancelation events to occur and have the same net effect of gracefully stopping all downstream tasks. In JimB's example it's possible that some other process will cancel the context before the 30 seconds have elapsed, and this is desirable from the resource utilization point of view. For example, if the connection is terminated prematurely there is no point in doing any more work on building a response.
If benchmarking is your goal, you want to minimize the effect of superfluous events on the code being benchmarked. Here is an example of how to do that:
func InsertRecords() int {
    stop := make(chan struct{})
    defer close(stop)
    countChan := make(chan int)
    go func() {
        defer close(countChan)
        for {
            // db code goes here
            select {
            case countChan <- 1:
            case <-stop:
                return
            }
        }
    }()
    var counter int
    timeoutCh := time.After(30 * time.Second)
    for {
        select {
        case n := <-countChan:
            counter += n
        case <-timeoutCh:
            return counter
        }
    }
}
Essentially what we are doing is creating an infinite loop over discrete db operations and counting the iterations through the loop; we stop when the time.After timer fires.
A problem in JimB's example is that, despite checking ctx.Done() in the loop, the loop can still block if the "db code" blocks. This is because ctx.Done() is only evaluated inline between "db code" iterations.
To avoid this problem, we separate the timing function from the benchmarking loop so that nothing can prevent us from receiving the timeout event when it occurs. Once the timeout event occurs, we immediately return the value of the counter. The "db code" may still be in mid-execution, but InsertRecords will exit and return its result anyway.
If the "db code" is in mid-execution when InsertRecords exits, its goroutine is left running, so to clean up we defer close(stop): on function exit we signal the goroutine to return on its next iteration, at which point it also closes the channel it was using to send the count.
As a general pattern the above is an example of how you can get precise timing in Go without regard to the actual execution time of the code being timed.
Side note: a somewhat more advanced observation is that my example does not attempt to synchronize the start times of the timer and the goroutine; it seemed a bit pedantic to address that issue here. However, you can easily synchronize the two by creating a channel that blocks the main thread until the goroutine closes it just before starting its loop.
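A sketch of that synchronization, under the same assumptions as the example above (only the start-up part changes):

started := make(chan struct{})
go func() {
    defer close(countChan)
    close(started) // signal that the counting loop is about to begin
    for {
        // db code and counting as above
    }
}()
<-started // hold off starting the 30-second timer until the goroutine runs
timeoutCh := time.After(30 * time.Second)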

Recursive sends in monitor goroutine

In a simple scheduler for timers I'm writing, I'm making use of a monitor goroutine to sync start/stop and timer done events.
The monitor goroutine, when stripped down to the essential, looks like this:
actions := make(chan func(), 1024)

// monitor goroutine
go func() {
    for a := range actions {
        a()
    }
}()

actions <- func() {
    actions <- func() {
        // causes deadlock when buffer size is reached
    }
}
This works great, until an action is sent that sends another action.
It's possible for a scheduled action to schedule another action, which causes a deadlock when the buffer size is reached.
Is there any clean way to solve this issue without resorting to shared state (which I've tried in my particular problem, but is quite ugly)?
The problem arises from the fact that when your monitor goroutine "takes out" (receives) a value (a function) and executes it (this also happens on the monitor goroutine), the function sends a value on the monitored channel during its execution.
This in itself would not cause a deadlock: when the function a() is executed, its value has already been taken out of the buffered channel, so there is at least one free slot in the buffer, and sending a new value from inside a() can proceed without blocking.
A problem arises if other goroutines may also send values on the monitored channel and fill the buffer in the meantime, which is your case.
One way to avoid the deadlock is for the executed function to "put back" (send) the new function not from the monitor goroutine itself but from a new goroutine, so the monitor goroutine is not blocked:
actions <- func() {
    // Send new func value in a new goroutine:
    // this will never block the monitor goroutine
    go func() {
        actions <- func() {
            // No deadlock.
        }
    }()
}
By doing this, the monitor goroutine will not be blocked even if the buffer of actions is full, because sending the value happens in a new goroutine (which may block until there is free space in the buffer).
If you want to avoid spawning a new goroutine for every send on actions, you may use a select to first try the send without one. Only when the buffer of actions is full and the send would block do you spawn a goroutine, to avoid the deadlock; depending on your actual case this may happen very rarely, and in that case it's inevitable anyway:
actions <- func() {
    newfv := func() { /* do something */ }
    // First try to send with select:
    select {
    case actions <- newfv:
        // Success!
    default:
        // Buffer is full, must do it in a new goroutine:
        go func() {
            actions <- newfv
        }()
    }
}
If you have to do this in many places, it's recommended to create a helper function for it:
func safeSend(fv func()) {
    // First try to send with select:
    select {
    case actions <- fv:
        // Success!
    default:
        // Buffer is full, must do it in a new goroutine:
        go func() {
            actions <- fv
        }()
    }
}
And using it:
actions <- func() {
    safeSend(func() {
        // something to do
    })
}
