In my application I need an async/await pattern with context cancellation support. In practice I have a function like this:
func longRunningTask() <-chan int32 {
r := make(chan int32)
go func() {
defer close(r)
// Simulate a workload.
time.Sleep(time.Second * 3)
r <- rand.Int31n(100)
}()
return r
}
However, it does not support context cancellation. To fix this, I can add a ctx argument and have the function wait on the ctx.Done() channel in a select statement, so that the operation is aborted if the context is cancelled.
If done this way, it seems the function will not abort properly when it is run two or more times (because the context pointer is shared), since the context cancellation channel only receives one signal:
ctx := ...
go func() { r := <-longRunningTask(ctx) }() // Done() works
go func() { r := <-longRunningTask(ctx) }() // ?
// cancel() ...
Here is what I see about Done:
// go/context.go
func (c *cancelCtx) Done() <-chan struct{} {
	c.mu.Lock()
	if c.done == nil {
		c.done = make(chan struct{})
	}
	d := c.done
	c.mu.Unlock()
	return d
} // Done() returns the same channel for all callers, and cancellation signal is sent once only
Does the go source mean context does not really support abortion of a function that calls other "long-running" functions, "a chained cancellation"?
What are the options for writing asynchronous functions that will support context cancellation in an unlimited recursion of .Done() usage?
Does the go source mean context does not really support abortion of a
function that calls other "long-running" functions, "a chained
cancellation"?
No. A task can call other long-running tasks, passing a context down the call chain. This is standard practice. If the context is canceled, a nested call returns an error, and the cancelation error bubbles up the call stack.
What are options to write asynchronous functions that will support context cancellation in an unlimited recursion of .Done() usage?
Recursion is no different than a couple of nested calls that take a context. Provided the recursive calls take a context input parameter and return an error (that is checked), a recursive call chain will bubble up a cancelation event just like a set of non-recursive nested calls.
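The worry in the question is that the Done channel delivers a single signal. Cancellation closes that channel instead of sending on it, and a receive on a closed channel succeeds for every receiver. Here is a minimal sketch of that behaviour; waitAll and its worker count are names made up for this illustration, and each worker derives its own child context from the shared parent:

```go
package main

import (
	"context"
	"sync"
)

// waitAll is a hypothetical helper: it starts n goroutines that each block
// on a child context derived from the same parent, cancels the parent once,
// and reports how many goroutines observed the cancellation.
func waitAll(n int) int {
	parent, cancel := context.WithCancel(context.Background())

	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		observed int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker derives its own child; canceling the parent
			// cancels every child as well.
			child, childCancel := context.WithCancel(parent)
			defer childCancel()

			<-child.Done() // unblocks for every worker, not just the first
			mu.Lock()
			observed++
			mu.Unlock()
		}()
	}

	cancel() // closes the Done channel: a broadcast, not a single send
	wg.Wait()
	return observed
}
```

Calling waitAll(5) should report that all five workers returned, even though cancel was called only once.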
First, let's update your wrapper function to support context.Context:
func longRunningTask(ctx context.Context) <-chan int32 {
r := make(chan int32)
go func() {
defer close(r)
// workload
i, err := someWork(ctx)
if err != nil {
return
}
r <- i
}()
return r
}
And then someWork, using the sleep workload, would look like this:
func someWork(ctx context.Context) (int32, error) {
tC := time.After(3*time.Second) // fake workload
// we can check this "workload" *AND* the context at the same time
select {
case <-tC:
return rand.Int31n(100), nil
case <-ctx.Done():
return 0, ctx.Err()
}
}
The important thing to note here is, we can alter the fake "workload" (time.Sleep) in a way that it becomes a channel - and thus watch it and our context via a select statement. Most workloads are of course not this trivial...
Fortunately the Go standard library is full of support for context.Context. So if your workload consists of many potentially long-running SQL queries, each query can be passed a context. The same goes for HTTP requests or gRPC calls. If your workload consists of any of these calls, passing in the parent context will cause any of these potentially blocking calls to return with an error when the context is canceled, and thus your workload will return with a cancelation error, letting the caller know what happened.
Your workload may not fit neatly into this model, e.g. computing a large Mandelbrot-set image. There, checking the context for cancelation after every pixel can have a negative performance impact, as polling selects are not free:
select {
case <-ctx.Done(): // polling select when `default` is included
return ctx.Err()
default:
}
In cases like this, tuning can be applied: if, say, pixels are calculated at a rate of 10,000/s, polling the context every 10,000 pixels ensures the task returns no later than about 1 second after the point of cancelation.
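A minimal sketch of that tuning follows. The function renderPixels and the constant checkEvery are hypothetical names, and the per-pixel work is reduced to a counter increment:

```go
package main

import (
	"context"
	"errors"
)

// checkEvery is an assumed tuning constant: how many pixels are computed
// between two polls of the context.
const checkEvery = 10000

// renderPixels is a hypothetical stand-in for a CPU-bound loop. It polls
// ctx only once per checkEvery iterations and returns how many pixels it
// finished before stopping.
func renderPixels(ctx context.Context, total int) (int, error) {
	done := 0
	for i := 0; i < total; i++ {
		if i%checkEvery == 0 {
			select {
			case <-ctx.Done(): // non-blocking because of the default case
				return done, ctx.Err()
			default:
			}
		}
		done++ // stand-in for computing one pixel
	}
	return done, nil
}

// demoRenderCanceled runs the loop with an already-canceled context and
// reports whether it stopped at the first poll with context.Canceled.
func demoRenderCanceled() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	n, err := renderPixels(ctx, 50000)
	return n == 0 && errors.Is(err, context.Canceled)
}

// demoRenderComplete runs the loop to completion with a live context.
func demoRenderComplete() bool {
	n, err := renderPixels(context.Background(), 50000)
	return n == 50000 && err == nil
}
```

The right value for checkEvery depends on how long one iteration takes, so it is worth measuring rather than guessing.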
I have a question similar to "How to stop a goroutine", with a small twist: I don't know for sure whether the goroutine is running.
var quit = make(chan bool)
func f1() {
go func() {
t := time.NewTimer(time.Minute)
select {
case <-t.C:
// Do stuff
case <-quit:
return
}
}()
}
func f2() {
quit <- true
}
If f2() is called less than a minute after f1(), then the goroutine returns. However, if it is called later than one minute, the goroutine will have already returned and f2() would block.
I want f2() to cancel the goroutine if it is running and do nothing otherwise.
What I'm trying to achieve here is to perform a task if and only if it is not canceled within a minute of creation.
Clarifications:
There is nothing to stop f2() from being called more than once.
There is only one goroutine running at a time. The caller of f1() will make sure that it's not called more than once per minute.
Use contexts.
Run f1 with the context which may be cancelled.
Run f2 with the associated cancellation function.
func f1(ctx context.Context) {
	go func(ctx context.Context) {
		t := time.NewTimer(time.Minute)
		defer t.Stop() // release the timer if we are canceled first
		select {
		case <-t.C:
			// Do stuff
		case <-ctx.Done():
			return
		}
	}(ctx)
}
func f2(cancel context.CancelFunc) {
cancel()
}
And later, to coordinate the two functions, you would do this:
ctx, cancel := context.WithCancel(context.Background())
f1(ctx)
f2(cancel)
You can also experiment with the context.WithTimeout function to incorporate externally-defined timeouts.
In the case where you don't know whether there is a goroutine running already, you can initialize the ctx and cancel variables like above, but don't pass them into anything. This avoids having to check for nil.
Pass ctx and cancel around by value, as shown above. Both are safe to use from multiple goroutines, and calling cancel again after the first call does nothing, so f2 can be called more than once without blocking.
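To see how this covers the "cancel if running, do nothing otherwise" requirement, here is a small sketch. The names delayedTask, demoCancelTwice and demoCancelLate are made up for this illustration, and the delay is a parameter so the example finishes quickly:

```go
package main

import (
	"context"
	"time"
)

// delayedTask is a hypothetical variant of f1: it runs task after delay
// unless ctx is canceled first, and closes the returned channel when the
// goroutine exits.
func delayedTask(ctx context.Context, delay time.Duration, task func()) <-chan struct{} {
	finished := make(chan struct{})
	go func() {
		defer close(finished)
		t := time.NewTimer(delay)
		defer t.Stop()
		select {
		case <-t.C:
			task()
		case <-ctx.Done():
		}
	}()
	return finished
}

// demoCancelTwice cancels well before the delay elapses, calls cancel a
// second time, and reports whether the task ran.
func demoCancelTwice() bool {
	ctx, cancel := context.WithCancel(context.Background())
	ran := false
	finished := delayedTask(ctx, time.Hour, func() { ran = true })
	cancel()
	cancel() // a repeated call is a no-op and never blocks
	<-finished
	return ran
}

// demoCancelLate cancels only after the goroutine has already finished,
// and reports whether the task ran.
func demoCancelLate() bool {
	ctx, cancel := context.WithCancel(context.Background())
	ran := false
	finished := delayedTask(ctx, time.Millisecond, func() { ran = true })
	<-finished
	cancel() // nothing is listening any more; this still returns at once
	return ran
}
```

In the first demo the task is skipped. In the second it runs, and the late cancel has nothing left to stop.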
You can give the channel a buffer size of 1. This means you can send one value to it without blocking, even if that value is not received right away (or at all).
var quit = make(chan bool, 1)
I think the top answer is better; this is just another solution that could translate to other situations.
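One caveat with the buffered channel: a second call to f2 with nobody receiving would fill the buffer and block again. A possible refinement, sketched here with the made-up name requestQuit, is a non-blocking send:

```go
package main

// quit has room for exactly one pending signal.
var quit = make(chan bool, 1)

// requestQuit is a hypothetical replacement for f2. It queues a quit signal
// if the buffer is free and otherwise returns immediately, so it never
// blocks no matter how often it is called.
func requestQuit() bool {
	select {
	case quit <- true:
		return true // signal queued
	default:
		return false // a signal is already pending; nothing to do
	}
}

// demoRequestQuit calls requestQuit twice with no receiver running and
// reports what each call returned.
func demoRequestQuit() (first, second bool) {
	first = requestQuit()
	second = requestQuit()
	<-quit // drain the pending signal so the channel is empty again
	return first, second
}
```

A signal left in the buffer would cancel the next goroutine that f1 starts, so f1 would probably want to drain quit before it launches a new goroutine.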
I am trying to create an intermediate layer between the user and TCP, with Send and Receive functions. Currently, I am trying to integrate a context, so that Send and Receive respect it. However, I don't know how to make them respect the context's cancellation.
So far, I have the following.
// c.underlying is a net.Conn
func (c *tcpConn) Receive(ctx context.Context) ([]byte, error) {
if deadline, ok := ctx.Deadline(); ok {
// Set the read deadline on the underlying connection according to the
// given context. This read deadline applies to the whole function, so
// we only set it once here. On the next read-call, it will be set
// again, or will be reset in the else block, to not keep an old
// deadline.
c.underlying.SetReadDeadline(deadline)
} else {
c.underlying.SetReadDeadline(time.Time{}) // remove the read deadline
}
// perform reads with
// c.underlying.Read(myBuffer)
return frameData, nil
}
However, as far as I understand that code, this only respects a context.WithTimeout or context.WithDeadline, and not a context.WithCancel.
If possible, I would like to pass that into the connection somehow, or actually abort the reading process.
How can I do that?
Note: If possible, I would like to avoid another function that reads in another goroutine and pushes a result back on a channel, because then, when I call cancel while reading 2GB over the network, the read is not actually cancelled and the resources are still used. If there is no other way, however, I would like to know whether there is a better way of doing that than a function with two channels, one for a []byte result and one for an error.
EDIT:
With the following code, I can respect a cancel, but it doesn't abort the read.
// apply deadline ...
result := make(chan interface{})
defer close(result)
go c.receiveAsync(result)
select {
case res := <-result:
if err, ok := res.(error); ok {
return nil, err
}
return res.([]byte), nil
case <-ctx.Done():
return nil, ErrTimeout
}
}
func (c *tcpConn) receiveAsync(result chan interface{}) {
	// perform the reads and push either an error or the
	// read bytes to the result channel
}
If the connection can be closed on cancellation, you can set up a goroutine within the Receive method that shuts the connection down when the context is cancelled. If the connection must be reused later, closing it is not an option, and the only remaining lever on a Read in progress is the connection's read deadline.
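recvDone := make(chan struct{})
defer close(recvDone)
// set up the cancellation to abort reads in progress
go func() {
	select {
	case <-ctx.Done():
		// CloseRead exists on *net.TCPConn, not on the net.Conn interface.
		// Close() can be used if this isn't necessarily a TCP connection.
		if tc, ok := c.underlying.(*net.TCPConn); ok {
			tc.CloseRead()
		} else {
			c.underlying.Close()
		}
	case <-recvDone:
	}
}()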
It will be a little more work if you want to communicate the cancelation error back, but the CloseRead will provide a clean way to stop any pending TCP Read calls.
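If the connection has to stay usable after a cancellation, a different technique from the one above is to move the read deadline into the past. The net.Conn documentation states that SetReadDeadline also applies to a Read call that is already blocked. The sketch below is an assumption-laden illustration, not the code from this answer: readWithContext and demoCanceledRead are made-up names, and an in-memory net.Pipe stands in for the TCP connection.

```go
package main

import (
	"context"
	"errors"
	"net"
	"time"
)

// readWithContext is a hypothetical helper: it performs one Read on conn and
// aborts it when ctx is canceled by setting a read deadline that has
// already passed. The connection stays open.
func readWithContext(ctx context.Context, conn net.Conn, buf []byte) (int, error) {
	recvDone := make(chan struct{})
	watcherDone := make(chan struct{})
	go func() {
		defer close(watcherDone)
		select {
		case <-ctx.Done():
			// An expired deadline fails the blocked Read with a timeout error.
			conn.SetReadDeadline(time.Now())
		case <-recvDone:
		}
	}()

	n, err := conn.Read(buf)
	close(recvDone)
	<-watcherDone // the watcher can no longer touch the deadline after this

	conn.SetReadDeadline(time.Time{}) // clear the deadline for later reads
	if err != nil && ctx.Err() != nil {
		return n, ctx.Err() // report the cancellation rather than the timeout
	}
	return n, err
}

// demoCanceledRead cancels a read that would otherwise block forever, then
// reads again on the same connection to show that it is still usable.
func demoCanceledRead() bool {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	ctx, cancel := context.WithCancel(context.Background())
	go func() {
		time.Sleep(20 * time.Millisecond)
		cancel()
	}()
	buf := make([]byte, 16)
	_, err := readWithContext(ctx, client, buf)

	go server.Write([]byte("ok"))
	n, err2 := readWithContext(context.Background(), client, buf)
	return errors.Is(err, context.Canceled) && err2 == nil && string(buf[:n]) == "ok"
}
```

Because the helper clears the deadline on exit, a caller that also derives a deadline from ctx.Deadline(), as in the question, would need to reapply it on each call.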
The timeout handler moves ServeHTTP execution onto a new goroutine, but it is not able to kill that goroutine after the timer ends. On every request it creates two goroutines, but the ServeHTTP goroutines are never killed when the context is done.
I am not able to find a way to kill the goroutines.
Edit: The for-loop with the time.Sleep call represents a huge computation that runs beyond our timer. It can be replaced with any other function.
package main
import (
"fmt"
"io"
"net/http"
"runtime"
"time"
)
type api struct{}
func (a api) ServeHTTP(w http.ResponseWriter, req *http.Request) {
// For-loop block represents huge computation and usually takes more time
// Can replace with any code
i := 0
for {
if i == 500 {
break
}
fmt.Printf("#goroutines: %d\n", runtime.NumGoroutine())
time.Sleep(1 * time.Second)
i++
}
_, _ = io.WriteString(w, "Hello World!")
}
func main() {
var a api
s := http.NewServeMux()
s.Handle("/", a)
h := http.TimeoutHandler(s, 1*time.Second, `Timeout`)
fmt.Printf("#goroutines: %d\n", runtime.NumGoroutine())
_ = http.ListenAndServe(":8080", h)
}
The ServeHTTP goroutine should be killed along with the request context, but that does not happen.
Use context.Context to instruct go-routines to abort their function. The go-routines, of course, have to listen for such cancelation events.
So for your code, do something like:
ctx := req.Context() // this will be implicitly canceled by your TimeoutHandler after 1s
i := 0
for {
if i == 500 {
break
}
// for any long wait (1s etc.) always check the state of your context
select {
case <-time.After(1 * time.Second): // no cancelation, so keep going
case <-ctx.Done():
fmt.Println("request context has been canceled:", ctx.Err())
return // terminates go-routine
}
i++
}
Playground: https://play.golang.org/p/VEnW0vsItXm
Note: Contexts are designed to be chained, allowing multiple levels of sub-tasks to be canceled in a cascading manner.
In a typical REST call one would initiate a database request. So, to ensure such a blocking and/or slow call completes in a timely manner, instead of using Query one should use QueryContext - passing in the http request's context as the first argument.
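The loop above can be exercised without starting a real server. The following sketch is an illustration only: slowHandler and demoTimeout are made-up names, the durations are shortened, and net/http/httptest drives the handler directly.

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"time"
)

// slowHandler is a hypothetical handler: it loops like the example above and
// closes stopped when it leaves the loop because the request context ended.
func slowHandler(stopped chan<- struct{}) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		ctx := req.Context()
		for i := 0; i < 500; i++ {
			select {
			case <-time.After(10 * time.Millisecond): // one unit of "work"
			case <-ctx.Done():
				close(stopped)
				return // terminates the handler goroutine
			}
		}
		w.Write([]byte("Hello World!"))
	})
}

// demoTimeout wraps the handler in http.TimeoutHandler with a 50ms limit and
// reports the response status and whether the handler goroutine stopped.
func demoTimeout() (status int, handlerStopped bool) {
	stopped := make(chan struct{})
	h := http.TimeoutHandler(slowHandler(stopped), 50*time.Millisecond, "Timeout")

	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))

	select {
	case <-stopped:
		handlerStopped = true
	case <-time.After(time.Second): // give up if the handler never notices
	}
	return rec.Code, handlerStopped
}
```

With these values the recorder should hold a 503 response and the handler goroutine should exit shortly after the timeout, instead of running all 500 iterations.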
I found that if the goroutine has no way to observe a channel, there is no way to kill or stop it while it is running.
In a large computational task, you have to check the channel at a specific interval or after the completion of specific sub-tasks.
I want to run my function InsertRecords for 30 seconds and test how many records I can insert in a given time.
How can I stop processing InsertRecords after x seconds and then return a result from my handler?
func benchmarkHandler(w http.ResponseWriter, r *http.Request) {
counter := InsertRecords()
w.WriteHeader(200)
io.WriteString(w, fmt.Sprintf("counter is %d", counter))
}
func InsertRecords() int {
counter := 0
// db code goes here
return counter
}
Cancellations and timeouts are often done with a context.Context.
While this simple example could be done with a channel alone, using the context here makes it more flexible, and can take into account the client disconnecting as well.
func benchmarkHandler(w http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
defer cancel()
counter := InsertRecords(ctx)
w.WriteHeader(200)
io.WriteString(w, fmt.Sprintf("counter is %d", counter))
}
func InsertRecords(ctx context.Context) int {
counter := 0
done := ctx.Done()
for {
select {
case <-done:
return counter
default:
}
// db code goes here
counter++
}
return counter
}
This will run for at least 30 seconds, returning the number of complete database iterations. If you want to be sure that the handler always returns immediately after 30s, even if the DB call is blocked, then you need to push the DB code into another goroutine and let it return later. The shortest example would be to use a similar pattern as above, but synchronize access to the counter variable, since it could be written by the DB loop while returning.
func InsertRecords(ctx context.Context) int {
counter := int64(0)
done := ctx.Done()
go func() {
for {
select {
case <-done:
return
default:
}
// db code goes here
atomic.AddInt64(&counter, 1)
}
}()
<-done
return int(atomic.LoadInt64(&counter))
}
See @JoshuaKolden's answer for an example with a producer and a timeout, which could also be combined with the existing request context.
As JimB pointed out, cancelation for limiting the time taken by an HTTP request can be handled with context.WithTimeout. However, since you asked for the purposes of benchmarking, you may want to use a more direct method.
The purpose of context.Context is to allow for numerous cancelation events to occur and have the same net effect of gracefully stopping all downstream tasks. In JimB's example it's possible that some other process will cancel the context before the 30 seconds have elapsed, and this is desirable from the resource utilization point of view. For example, if the connection is terminated prematurely there is no point in doing any more work on building a response.
If benchmarking is your goal, you'd want to minimize the effect of superfluous events on the code being benchmarked. Here is an example of how to do that:
func InsertRecords() int {
stop := make(chan struct{})
defer close(stop)
countChan := make(chan int)
go func() {
defer close(countChan)
for {
// db code goes here
select {
case countChan <- 1:
case <-stop:
return
}
}
}()
var counter int
timeoutCh := time.After(30 * time.Second)
for {
select {
case n := <-countChan:
counter += n
case <-timeoutCh:
return counter
}
}
}
Essentially, what we are doing is creating an infinite loop over discrete db operations and counting iterations through the loop. We stop when time.After is triggered.
A problem in JimB's example is that despite checking ctx.Done() in the loop the loop can still block if the "db code" blocks. This is because ctx.Done() is only evaluated inline with the "db code" block.
To avoid this problem we separate the timing function and the benchmarking loop so that nothing can prevent us from receiving the timeout event when it occurs. Once the timeout event occurs we immediately return the result of the counter. The "db code" may still be in mid-execution, but InsertRecords will exit and return its results anyway.
If the "db code" is in mid-execution when InsertRecords exits, the goroutine will be left running, so to clean this up we defer close(stop) so that on function exit we'll be sure to signal the goroutine to exit on the next iteration. When the goroutine exits, it cleans up the channel it was using to send the count.
As a general pattern the above is an example of how you can get precise timing in Go without regard to the actual execution time of the code being timed.
sidenote: A somewhat more advanced observation is that my example does not attempt to synchronize the start times between the timer and the goroutine. It seemed a bit pedantic to address that issue here. However, you can easily synchronize the two threads by creating a channel that blocks the main thread until the goroutine closes it just before starting the loop.
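The synchronization mentioned in the sidenote could look roughly like the sketch below. It is a parameterized variant of the example above, not the original code: countFor, started and demoCount are made-up names, and the "db code" is passed in as a function so the sketch can run on its own.

```go
package main

import "time"

// countFor is a hypothetical variant of InsertRecords: it counts how many
// times work completes within d, starting the clock only after the worker
// goroutine has signaled that it is running.
func countFor(d time.Duration, work func()) int {
	stop := make(chan struct{})
	defer close(stop)
	started := make(chan struct{})
	countChan := make(chan int)

	go func() {
		defer close(countChan)
		close(started) // the worker is about to enter its loop
		for {
			work() // stand-in for the db code
			select {
			case countChan <- 1:
			case <-stop:
				return
			}
		}
	}()

	<-started // block until the goroutine is actually running
	timeout := time.After(d)
	counter := 0
	for {
		select {
		case n := <-countChan:
			counter += n
		case <-timeout:
			return counter
		}
	}
}

// demoCount measures a 1ms sleep "operation" for 50ms.
func demoCount() int {
	return countFor(50*time.Millisecond, func() { time.Sleep(time.Millisecond) })
}
```

The exact count depends on the machine and the scheduler. The point is only that the timer is created after the started channel is closed.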
(I don't believe my issue is a duplicate of this QA: go routine blocking the others one, because I'm running Go 1.9 which has the preemptive scheduler whereas that question was asked for Go 1.2).
My Go program calls into a C library wrapped by another Go-lang library that makes a blocking call that can last over 60 seconds. I want to add a timeout so it returns in 3 seconds:
Old code with long block:
// InvokeSomething is part of a Go wrapper library that calls the C library read_something function. I cannot change this code.
func InvokeSomething() ([]Something, error) {
ret := clib.read_something(&input) // this can block for 60 seconds
if ret.Code > 1 {
return nil, CreateError(ret)
}
return ret.Something, nil
}
// This is my code I can change:
func MyCode() {
something, err := InvokeSomething()
// etc
}
My code with a go-routine, channel, and timeout, based on this Go example: https://gobyexample.com/timeouts
type somethingResult struct {
Something []Something
Err error
}
func MyCodeWithTimeout() {
ch := make(chan somethingResult, 1)
go func() {
something, err := InvokeSomething() // blocks here for 60 seconds
ret := somethingResult{ something, err }
ch <- ret
}()
select {
case result := <-ch:
// etc
case <-time.After(time.Second *3):
// report timeout
}
}
However when I run MyCodeWithTimeout it still takes 60 seconds before it executes the case <-time.After(time.Second * 3) block.
I know that attempting to read from an unbuffered channel with nothing in it will block, but I created the channel with a buffered size of 1 so as far as I can tell I'm doing it correctly. I'm surprised the Go scheduler isn't preempting my goroutine, or does that depend on execution being in go-lang code and not an external native library?
Update:
I read that the Go-scheduler, at least in 2015, is actually "semi-preemptive" and it doesn't preempt OS threads that are in "external code": https://github.com/golang/go/issues/11462
you can think of the Go scheduler as being partially preemptive. It's by no means fully cooperative, since user code generally has no control over scheduling points, but it's also not able to preempt at arbitrary points
I heard that runtime.LockOSThread() might help, so I changed the function to this:
func MyCodeWithTimeout() {
ch := make(chan somethingResult, 1)
defer close(ch)
go func() {
runtime.LockOSThread()
defer runtime.UnlockOSThread()
something, err := InvokeSomething() // blocks here for 60 seconds
ret := somethingResult{ something, err }
ch <- ret
}()
select {
case result := <-ch:
// etc
case <-time.After(time.Second *3):
// report timeout
}
}
...however it didn't help at all and it still blocks for 60 seconds.
Your proposed solution, locking the thread in the goroutine started by MyCodeWithTimeout(), does not guarantee that MyCodeWithTimeout() will return after 3 seconds. First, there is no guarantee that the started goroutine will be scheduled and reach the point where it locks the thread. Second, even if the external command or syscall is called and returns within 3 seconds, there is no guarantee that the other goroutine, the one running MyCodeWithTimeout(), will be scheduled to receive the result.
Instead do the thread locking in MyCodeWithTimeout(), not in the goroutine it starts:
func MyCodeWithTimeout() {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	// Buffered, and deliberately never closed: closing it on return would
	// make the late send below panic after a timeout.
	ch := make(chan somethingResult, 1)
	go func() {
		something, err := InvokeSomething() // blocks here for 60 seconds
		ret := somethingResult{something, err}
		ch <- ret
	}()
	select {
	case result := <-ch:
		// etc
	case <-time.After(time.Second * 3):
		// report timeout
	}
}
Now when MyCodeWithTimeout() starts executing, it locks its goroutine to the OS thread, and you can be sure that this goroutine will notice the value sent on the timer's channel.
NOTE: This is better if you want it to return within 3 seconds, but it still gives no guarantee, as the timer that fires (sends a value on its channel) runs in its own goroutine, and this thread locking has no effect on the scheduling of that goroutine.
If you want a guarantee, you can't rely on other goroutines giving the "exit" signal. You can only rely on this happening in the goroutine that runs the MyCodeWithTimeout() function (because you did thread locking, you can be sure it gets scheduled).
An "ugly" solution which spins up CPU usage for a given CPU core would be:
for end := time.Now().Add(time.Second * 3); time.Now().Before(end); {
// Do non-blocking check:
select {
case result := <-ch:
// Process result
default: // Must have default to be non-blocking
}
}
Note that giving in to the "urge" to use time.Sleep() in this loop would take away the guarantee, as time.Sleep() may use goroutines in its implementation and certainly does not guarantee to return exactly after the given duration.
Also note that if you have 8 CPU cores and runtime.GOMAXPROCS(0) returns 8 for you, and your goroutines are still "starving", this may be a temporary solution, but you still have more serious problems using Go's concurrency primitives in your app (or a lack of using them), and you should investigate and "fix" those. Locking a thread to a goroutine may even make it worse for the rest of the goroutines.
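For reference, the goroutine-plus-buffered-channel timeout from the question can be sketched in isolation, without any thread locking. This is an illustration under assumptions: callWithTimeout, errTimeout and demoCallWithTimeout are made-up names, and a time.Sleep stands in for the blocking library call, so it says nothing about how a real cgo call interacts with the scheduler.

```go
package main

import (
	"errors"
	"time"
)

// errTimeout is a hypothetical sentinel error for this sketch.
var errTimeout = errors.New("timed out")

// callWithTimeout is a hypothetical wrapper: it runs call in its own
// goroutine and stops waiting for it after limit. The call itself is not
// interrupted; it keeps running in the background until it returns.
func callWithTimeout(call func() (int, error), limit time.Duration) (int, error) {
	type result struct {
		val int
		err error
	}
	// The buffer of 1 lets the worker deliver its result and exit even when
	// nobody is receiving any more. The channel is never closed, so a late
	// send cannot panic.
	ch := make(chan result, 1)
	go func() {
		v, err := call()
		ch <- result{v, err}
	}()

	timer := time.NewTimer(limit)
	defer timer.Stop()
	select {
	case r := <-ch:
		return r.val, r.err
	case <-timer.C:
		return 0, errTimeout
	}
}

// demoCallWithTimeout runs one fast call and one slow call and reports
// whether the fast one succeeded and the slow one timed out.
func demoCallWithTimeout() (fastOK, slowTimedOut bool) {
	v, err := callWithTimeout(func() (int, error) { return 42, nil }, time.Second)
	fastOK = err == nil && v == 42

	_, err = callWithTimeout(func() (int, error) {
		time.Sleep(200 * time.Millisecond) // stand-in for the blocking call
		return 0, nil
	}, 20*time.Millisecond)
	slowTimedOut = errors.Is(err, errTimeout)
	return fastOK, slowTimedOut
}
```

If a wrapper of this shape still blocks for the full duration of the library call, the cause most likely lies outside the select statement, which is worth checking before resorting to busy loops.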