API design for fire and forget endpoints - go

I’m currently maintaining a few HTTP APIs built on the standard library and gorilla/mux, running on Kubernetes (GKE).
We’ve adopted http.TimeoutHandler as our “standard” way to get consistent timeout error handling.
A typical endpoint implementation will use the following “chain”:
MonitoringMiddleware => TimeoutMiddleware => … => handler
so that we can monitor a few key metrics per endpoint.
One of our APIs is typically used in a “fire and forget” mode, meaning that clients push some data and do not wait for the API response. We are facing two issues:
- the Go standard HTTP server cancels a request's context when the client connection is no longer active (godoc)
- the TimeoutHandler returns a “timeout” response whenever the request context is done (see code)
This means that we do not process requests to completion when the client disconnects, which is not what we want, so I’m looking for solutions.
The only discussion I could find that somewhat relates to my issue is https://github.com/golang/go/issues/18527; however, the workaround suggested there,
"The workaround is your application can ignore the Handler's Request.Context()"
would mean that the monitoring middleware would not report the "proper" status: the handler would complete the request processing in its own goroutine, but the TimeoutHandler would still dictate the response status, so observability would be broken.
For now, I’m not considering removing our middlewares, as they help keep our APIs consistent in both behaviour and observability. My conclusion so far is that I need to “fork” the TimeoutHandler and use a custom context for handlers that should not depend on whether the client is still waiting for the response.
The gist of my current idea is to have:
	type TimeoutHandler struct {
		handler Handler
		body    string
		dt      time.Duration

		// BaseContext optionally specifies a function that returns
		// the base context controlling the server-side request processing.
		// If BaseContext is nil, the default is req.Context().
		// If non-nil, it must return a non-nil context.
		BaseContext func(*http.Request) context.Context
	}

	func (h *TimeoutHandler) ServeHTTP(w ResponseWriter, r *Request) {
		reqCtx := r.Context()
		if h.BaseContext != nil {
			reqCtx = h.BaseContext(r)
		}
		ctx, cancelCtx := context.WithTimeout(reqCtx, h.dt)
		defer cancelCtx()
		r = r.WithContext(ctx)
		// ... (elided: same as the standard library's TimeoutHandler)
		select {
		// ... (elided: other cases)
		case <-reqCtx.Done():
			tw.mu.Lock()
			defer tw.mu.Unlock()
			w.WriteHeader(499) // write status for monitoring;
			// no need to write a body since no client is listening.
		case <-ctx.Done():
			tw.mu.Lock()
			defer tw.mu.Unlock()
			// ... (elided)
			io.WriteString(w, h.errorBody())
			tw.timedOut = true
		}
	}
The middleware BaseContext callback would return context.Background() for requests to the “fire and forget” endpoint.
One thing I don’t like is that in doing so I lose any context values set by upstream middleware, so this new middleware would come with strong usage constraints. Overall I feel like this is more complex than it should be.
Am I completely missing something obvious?
Any feedback on API instrumentation (maybe our middlewares are an antipattern) or on fire-and-forget implementations would be welcome!
EDIT: as most comments say that a request whose client does not wait for the response has unspecified behavior, I looked for more information on the typical clients for which this happens.
From our logs, this happens for user agents that seem to be mobile devices. I can imagine that their connections are much less stable, so the problem will likely not disappear.
I therefore still think a solution is needed, since this is currently creating false-positive alerts.
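On the concern about losing context values when the BaseContext callback returns context.Background(): since Go 1.21 the standard library provides context.WithoutCancel, which returns a context that keeps the parent's values but is not canceled when the parent is. Below is a minimal sketch of a callback built on it. The names detachFromClient, requestIDKey, and survivesDisconnect are hypothetical and exist only for illustration; this is one possible approach, not a tested drop-in for the forked handler.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ctxKey is a private key type; requestIDKey stands for a hypothetical value
// that an upstream middleware stored in the request context.
type ctxKey string

const requestIDKey ctxKey = "requestID"

// detachFromClient is a candidate BaseContext callback: the returned context
// keeps the values of r.Context() but ignores its cancellation (Go 1.21+).
func detachFromClient(r *http.Request) context.Context {
	return context.WithoutCancel(r.Context())
}

// survivesDisconnect simulates a client disconnect by canceling the request
// context, then reports whether the detached context is still alive and
// which request ID it carries.
func survivesDisconnect() (alive bool, requestID string) {
	parent, cancel := context.WithCancel(context.Background())
	parent = context.WithValue(parent, requestIDKey, "abc-123")
	r := httptest.NewRequest(http.MethodPost, "/push", nil).WithContext(parent)

	detached := detachFromClient(r)
	cancel() // the client went away

	id, _ := detached.Value(requestIDKey).(string)
	return detached.Err() == nil, id
}

func main() {
	alive, id := survivesDisconnect()
	fmt.Println(alive, id)
}
```

In the forked handler, context.WithTimeout(reqCtx, h.dt) would still be layered on top of the detached context, so processing stays bounded by the timeout even though the client disconnect no longer cancels it.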


gRPC: Rate limiting an API on a per-RPC basis

I am looking for a way to rate-limit RPCs separately with high granularity, and to my dismay, there are not many options available for this issue. I am trying to replace a REST API with gRPC, and one of the most important features for me was the ability to add middleware for each route. Unfortunately, go-grpc-middleware only applies middleware to an entire server.
Ideally, a rate-limiting middleware for gRPC would use tricks similar to those of go-proto-validators, where the proto file itself contains the rate-limiting configuration.
I figured I could post a snippet for reference showing what this would look like in practice, using a unary interceptor (several can be chained with go-grpc-middleware's WithUnaryServerChain).
The idea is to add a grpc.UnaryInterceptor to the server, which will be invoked with an instance of *grpc.UnaryServerInfo. This object exports the field FullMethod, which holds the qualified name of the RPC method being called.
In the interceptor you can then implement arbitrary code before actually calling the RPC handler, including RPC-specific rate limiting logic.
	// import grpc_middleware "github.com/grpc-ecosystem/go-grpc-middleware"
	// import "google.golang.org/grpc"
	grpcServer := grpc.NewServer(
		// grpc.UnaryInterceptor installs a single unary interceptor; to chain several,
		// grpc_middleware.WithUnaryServerChain is a grpc.Server config option that accepts multiple unary interceptors.
		// UnaryServerInterceptor provides a hook to intercept the execution of a unary RPC on the server. info
		// contains all the information of this RPC the interceptor can operate on. And handler is the wrapper
		// of the service method implementation. It is the responsibility of the interceptor to invoke handler
		// to complete the RPC.
		grpc.UnaryInterceptor(func(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (resp interface{}, err error) {
			// FullMethod is the full RPC method string, i.e., /package.service/method.
			switch info.FullMethod {
			case "/mypackage.someservice/DoThings":
				// ... rate limiting code
			}
			// if all is good, then call the handler
			return handler(ctx, req)
		}),
		// other `grpc.ServerOption` opts
	)
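To make the "rate limiting code" placeholder more concrete, here is a minimal sketch of a per-method token bucket keyed by the full method name. It uses only the standard library so that it runs on its own; the types limit, bucket, and methodLimiter and the helper demo are hypothetical names, and the refill policy is an assumption rather than anything prescribed by gRPC or go-grpc-middleware.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// limit describes a token bucket: refill rate and maximum burst.
type limit struct {
	perSecond float64
	burst     float64
}

type bucket struct {
	tokens float64
	last   time.Time
}

// methodLimiter keeps one token bucket per full method name
// (the value of info.FullMethod inside a unary interceptor).
type methodLimiter struct {
	mu      sync.Mutex
	limits  map[string]limit // read-only after construction
	buckets map[string]*bucket
	now     func() time.Time // injectable clock
}

func newMethodLimiter(limits map[string]limit, now func() time.Time) *methodLimiter {
	return &methodLimiter{limits: limits, buckets: map[string]*bucket{}, now: now}
}

// Allow reports whether a call to fullMethod may proceed. Methods without a
// configured limit are always allowed.
func (l *methodLimiter) Allow(fullMethod string) bool {
	lim, ok := l.limits[fullMethod]
	if !ok {
		return true
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	now := l.now()
	b, ok := l.buckets[fullMethod]
	if !ok {
		b = &bucket{tokens: lim.burst, last: now}
		l.buckets[fullMethod] = b
	}
	// Refill according to elapsed time, capped at the burst size.
	b.tokens += now.Sub(b.last).Seconds() * lim.perSecond
	if b.tokens > lim.burst {
		b.tokens = lim.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// demo calls a limited and an unlimited method five times each with a frozen
// clock and returns how many calls were allowed for each.
func demo() (limited, unlimited int) {
	frozen := time.Unix(0, 0)
	l := newMethodLimiter(
		map[string]limit{"/mypackage.someservice/DoThings": {perSecond: 1, burst: 2}},
		func() time.Time { return frozen },
	)
	for i := 0; i < 5; i++ {
		if l.Allow("/mypackage.someservice/DoThings") {
			limited++
		}
		if l.Allow("/mypackage.someservice/Other") {
			unlimited++
		}
	}
	return limited, unlimited
}

func main() {
	fmt.Println(demo())
}
```

Inside the interceptor, the check would be roughly: if Allow(info.FullMethod) returns false, return status.Error(codes.ResourceExhausted, "rate limit exceeded") instead of calling the handler.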

What's the best way to determine when an RPC session ends using a StreamClientInterceptor?

When writing a StreamClientInterceptor function, what's the best way to determine when an invoker finishes the RPC? This is straightforward enough with unary interceptors or on the server-side where you're passed a handler that performs the RPC, but it's not clear how best to do this on the client-side where you return a ClientStream that the invoker then interacts with.
One use case for this is instrumenting OpenTracing, where the goal is to start and finish a span to mark the beginning and end of the RPC.
A strategy I'm looking into is having the stream interceptor return a decorated ClientStream. This new ClientStream considers the RPC to have completed if any of the interface methods Header, CloseSend, SendMsg, RecvMsg return an error or if the Context is cancelled. Additionally, it adds this logic to RecvMsg:
	func (cs *DecoratedClientStream) RecvMsg(m interface{}) error {
		err := cs.ClientStream.RecvMsg(m)
		if err == io.EOF {
			// Consider the RPC as complete
			return err
		} else if err != nil {
			// Consider the RPC as complete
			return err
		}
		if !cs.isResponseStreaming {
			// Consider the RPC as complete
		}
		return err
	}
It would work in most cases, but my understanding is that an invoker isn't required to call Recv if it knows the result will be io.EOF (See Are you required to call Recv until you get io.EOF when interacting with grpc.ClientStreams?), so it wouldn't work in all cases. Is there a better way to accomplish this?
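Since several methods can each observe the end of the RPC, the decorated stream needs to make sure the completion logic (for example, finishing a span) runs only once. A minimal sketch of that part using sync.Once follows. The stream interface is a local stand-in for the relevant subset of grpc.ClientStream, and onceFinishedStream, fakeStream, and demo are hypothetical names. It does not solve the case where the caller never reaches any of these paths.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"sync"
)

// stream is a local stand-in for the subset of grpc.ClientStream used here.
type stream interface {
	RecvMsg(m interface{}) error
	CloseSend() error
}

// onceFinishedStream calls finish exactly once, the first time any method
// observes the end of the RPC.
type onceFinishedStream struct {
	stream
	isResponseStreaming bool
	once                sync.Once
	finish              func(err error)
}

func (s *onceFinishedStream) done(err error) {
	s.once.Do(func() { s.finish(err) })
}

func (s *onceFinishedStream) RecvMsg(m interface{}) error {
	err := s.stream.RecvMsg(m)
	switch {
	case err == io.EOF:
		s.done(nil) // clean end of stream
	case err != nil:
		s.done(err)
	case !s.isResponseStreaming:
		s.done(nil) // a single response is all there will be
	}
	return err
}

func (s *onceFinishedStream) CloseSend() error {
	err := s.stream.CloseSend()
	if err != nil {
		s.done(err)
	}
	return err
}

// fakeStream yields n messages and then io.EOF; CloseSend always fails.
type fakeStream struct{ n int }

func (f *fakeStream) RecvMsg(interface{}) error {
	if f.n == 0 {
		return io.EOF
	}
	f.n--
	return nil
}

func (f *fakeStream) CloseSend() error { return errors.New("send side broken") }

// demo drives the stream through several completion paths and returns how
// many times finish ran.
func demo() int {
	calls := 0
	s := &onceFinishedStream{
		stream:              &fakeStream{n: 2},
		isResponseStreaming: true,
		finish:              func(error) { calls++ },
	}
	for s.RecvMsg(nil) == nil {
	}
	s.RecvMsg(nil) // a second io.EOF must not finish again
	s.CloseSend()  // nor must a later error
	return calls
}

func main() {
	fmt.Println(demo())
}
```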
I had a very similar issue where I wanted to trace streaming gRPC calls. Other than decorating the stream as you mentioned yourself, I was not able to find a good way to detect the end of streams. That is, until I came across the stats hooks provided by grpc-go (https://godoc.org/google.golang.org/grpc/stats). Even though the stats API is meant for gathering statistics about the RPC calls, the hooks it provides are very helpful for tracing as well.
If you're still looking for a way to trace streaming calls, I have written a library for OpenTracing instrumentation of gRPC, using the stats hooks:
https://github.com/charithe/otgrpc. However, please bear in mind that this approach is probably not suitable for systems that create long-lived streams.

Storing request and session ID in context.Context considered bad?

There is this excellent blog post by Jack Lindamood, "How to correctly use context.Context in Go 1.7", which boils down to the following money quote:
"Context.Value should inform, not control. This is the primary mantra that I feel should guide if you are using context.Value correctly. The content of context.Value is for maintainers not users. It should never be required input for documented or expected results."
Currently, I am using Context to transport the following information:
- RequestID, which is generated on the client side and passed to the Go backend. It solely travels through the command chain and is then inserted in the response again. Without the RequestID in the response, the client side would break, though.
- SessionID, which identifies the WebSocket session. This is important when certain responses are generated in asynchronous computations (e.g. worker queues), in order to identify on which WebSocket session the response should be sent.
When taking the definition very seriously I would say both violate the intention of context.Context but then again their values do not change any behavior while the whole request is made, it's only relevant when generating the response.
What's the alternative? Having the context.Context for metadata in the server API actually helps to maintain lean method signatures because this data is really irrelevant to the API but only important for the transport layer which is why I am reluctant to create something like a request struct:
	type Request struct {
		RequestID string
		SessionID string
	}
and make it part of every API method which solely exists to be passed through before sending a response.
Based on my understanding, context should be limited to passing things like a request or session ID. In my application, I do something like the following in one of my middlewares; it helps with observability:
	if next != nil {
		if requestID != "" {
			b := context.WithValue(r.Context(), "requestId", requestID)
			r = r.WithContext(b)
		}
		next.ServeHTTP(w, r)
	}
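One detail worth noting about the snippet above: the context package documentation recommends against using a built-in type such as string as a context key, to avoid collisions between packages. A minimal sketch of the usual alternative, an unexported key type with small accessor functions, is shown here. The names ctxKey, WithIDs, RequestID, SessionID, and demo are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// ctxKey is unexported, so no other package can construct a colliding key.
type ctxKey int

const (
	requestIDKey ctxKey = iota
	sessionIDKey
)

// WithIDs stores the transport-level IDs in the context.
func WithIDs(ctx context.Context, requestID, sessionID string) context.Context {
	ctx = context.WithValue(ctx, requestIDKey, requestID)
	return context.WithValue(ctx, sessionIDKey, sessionID)
}

// RequestID returns the request ID, or "" if none was stored.
func RequestID(ctx context.Context) string {
	id, _ := ctx.Value(requestIDKey).(string)
	return id
}

// SessionID returns the session ID, or "" if none was stored.
func SessionID(ctx context.Context) string {
	id, _ := ctx.Value(sessionIDKey).(string)
	return id
}

// demo stores both IDs, reads them back, and also reads from an empty context.
func demo() (requestID, sessionID, missing string) {
	ctx := WithIDs(context.Background(), "req-1", "ws-42")
	return RequestID(ctx), SessionID(ctx), RequestID(context.Background())
}

func main() {
	fmt.Println(demo())
}
```

With accessors like these, only the transport layer needs to know the keys, and the API method signatures stay unchanged.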

Mattermost + New Relic APM

I want to use New Relic APM in the Mattermost application. To monitor the application's performance, I added the code (as described in the New Relic documentation) at the top of the createPost API request handler in api/post.go.
	func createPost(c *Context, w http.ResponseWriter, r *http.Request) {
		config := newrelic.NewConfig("mylocalstarfp", "####12337")
		app, err1 := newrelic.NewApplication(config)
		if nil != err1 {
			// os.Exit(1)
		}
		txn := app.StartTransaction("mylocalstar", w, r)
		defer txn.End()
		post := model.PostFromJson(r.Body)
		// ... (rest of the handler)
	}
The application is displayed on the New Relic dashboard, and attributes like CPU and memory are shown, but no response time or throughput is displayed.
As per the New Relic documentation (https://github.com/newrelic/go-agent), this code has to be added in the main/init block or at the start of the function whose performance we need to monitor.
However, response time and throughput are still not displayed.
Maybe I am adding the code in the wrong place.
I have also tried adding the code at the beginning of the main() function in mattermost.go, but without success.
Please suggest where I should add the code.
Secondly, they have also mentioned that:
If you are using the standard HTTP library package, you can create transactions by wrapping HTTP requests, as an alternative to instrumenting a function's code.
Here is a before-and-after example of an HTTP handler being wrapped:
http.HandleFunc("/users", usersHandler)
http.HandleFunc(newrelic.WrapHandleFunc(app, "/users", usersHandler))
This automatically starts and ends a transaction with the request and response writer.
Given this, where should I add the code in Mattermost?
You might try using the latest release (1.3), which has support for short-lived processes, and then adding the code section below
	config := newrelic.NewConfig("mylocalstarfp", "####12337")
	app, err1 := newrelic.NewApplication(config)
to mattermost.go, and passing the app variable to anywhere you want to monitor transactions.
That’s not a guarantee, however; it's just a thought, not backed up by any testing.
I found the solution and am posting it for others' reference.
Each request is now tracked with this code in Mattermost:
BaseRoutes.NeedTeam.Handle(newrelic.WrapHandle(app, "/users", ApiAppHandler(usersHandler))).Methods("POST")
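The structure of that fix (create the application object once at startup, then wrap each handler at registration time) can be sketched without the agent itself. In the sketch below, monitor is a hypothetical stand-in for the app value returned by newrelic.NewApplication, and wrap only imitates the shape of a handler-wrapping helper by counting requests; it is not the New Relic API.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// monitor is a hypothetical stand-in for the APM application object.
type monitor struct{ transactions atomic.Int64 }

// wrap returns the pattern together with a handler that records one
// "transaction" per request before delegating to h.
func (m *monitor) wrap(pattern string, h http.Handler) (string, http.Handler) {
	return pattern, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		m.transactions.Add(1)
		h.ServeHTTP(w, r)
	})
}

// demo creates the monitor once, wraps a handler with it, sends three
// requests, and returns the number of recorded transactions.
func demo() int64 {
	app := &monitor{} // created once at startup, not inside the handler
	mux := http.NewServeMux()
	mux.Handle(app.wrap("/users", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	})))
	for i := 0; i < 3; i++ {
		mux.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodPost, "/users", nil))
	}
	return app.transactions.Load()
}

func main() {
	fmt.Println(demo())
}
```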

Overriding http.Server.Serve

I need to embed the default http.Server in my own server struct and customize the Serve method.
The server needs to short circuit the go c.serve() call and only run that line if it has the computing resources available to respond within 50ms. Otherwise the server is just going to send a 204 and move on.
This is almost straightforward.
	type PragmaticServer struct {
		Addr    string
		Handler http.Handler
	}

	func (srv *PragmaticServer) Serve(l net.Listener) error {
		defer l.Close()
		var tempDelay time.Duration // how long to sleep on accept failure
		for {
			// SNIP for clarity
			c, err := srv.newConn(rw)
			if err != nil {
				continue
			}
			c.setState(c.rwc, StateNew) // before Serve can return
			go c.serve()
		}
	}
So, again. This almost works. Except that srv.newConn is an unexported method, as is c.serve and c.setState, which means that I end up having to copy and paste pretty much the entirety of net/http in order for this to compile. Which is basically a fork. Is there any better way to do this?
Unfortunately, you're not going to be able to do that without reimplementing most of the Server code. Short of that, we usually intercept the call either just before, at the listener's Accept, or just after, at Handler.ServeHTTP.
The first method is to create a custom net.Listener that filters out connections before they are even handed off to the http.Server. While this can respond faster and consume fewer resources, it makes it less convenient to write HTTP responses, and it precludes you from limiting requests on already open connections.
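A minimal sketch of this first approach is given here. It assumes that "capacity" can be approximated by a fixed number of live connections, which is a simplification of the 50ms requirement in the question. The names sheddingListener, slotConn, pipeListener, and demo are hypothetical. Extra connections are simply closed, which also illustrates why writing a proper 204 response is awkward at this level.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"sync"
)

// sheddingListener hands at most cap(slots) live connections to the server
// and closes any extra connection immediately.
type sheddingListener struct {
	net.Listener
	slots chan struct{}
}

func (l *sheddingListener) Accept() (net.Conn, error) {
	for {
		c, err := l.Listener.Accept()
		if err != nil {
			return nil, err
		}
		select {
		case l.slots <- struct{}{}:
			return &slotConn{Conn: c, release: func() { <-l.slots }}, nil
		default:
			c.Close() // over capacity: drop before the HTTP server sees it
		}
	}
}

// slotConn frees its slot the first time it is closed.
type slotConn struct {
	net.Conn
	once    sync.Once
	release func()
}

func (c *slotConn) Close() error {
	c.once.Do(c.release)
	return c.Conn.Close()
}

// pipeListener is a test double that yields n in-memory connections and
// then reports an error.
type pipeListener struct{ n int }

func (p *pipeListener) Accept() (net.Conn, error) {
	if p.n == 0 {
		return nil, errors.New("listener closed")
	}
	p.n--
	server, _ := net.Pipe()
	return server, nil
}

func (p *pipeListener) Close() error   { return nil }
func (p *pipeListener) Addr() net.Addr { return &net.TCPAddr{} }

// demo offers five connections to a listener with two slots and counts how
// many are handed out while none of them is closed.
func demo() int {
	l := &sheddingListener{Listener: &pipeListener{n: 5}, slots: make(chan struct{}, 2)}
	accepted := 0
	for {
		if _, err := l.Accept(); err != nil {
			return accepted
		}
		accepted++
	}
}

func main() {
	fmt.Println(demo())
}
```

In a real server the wrapped listener would be passed to http.Server.Serve in place of the plain one.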
The second way to handle this is to wrap the handlers and intercept the request before any real work has been done. You most likely want to create an http.Handler that filters the requests and passes them through to your main handler. This can also be more flexible, since you can filter based on the route or other request information if you so choose.
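A minimal sketch of this second approach follows, again assuming that available capacity is approximated by a cap on in-flight requests (the question's real criterion, being able to respond within 50ms, would need its own measurement). The names shed and demo are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// shed returns a handler that lets at most maxInFlight requests reach next
// concurrently and answers the rest with 204 No Content right away.
func shed(maxInFlight int, next http.Handler) http.Handler {
	slots := make(chan struct{}, maxInFlight)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case slots <- struct{}{}:
			defer func() { <-slots }()
			next.ServeHTTP(w, r)
		default:
			w.WriteHeader(http.StatusNoContent)
		}
	})
}

// demo holds one request inside the handler, sends a second one meanwhile,
// and returns both status codes.
func demo() (first, second int) {
	entered := make(chan struct{})
	release := make(chan struct{})
	h := shed(1, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		close(entered)
		<-release
		w.WriteHeader(http.StatusOK)
	}))

	rec1 := httptest.NewRecorder()
	done := make(chan struct{})
	go func() {
		h.ServeHTTP(rec1, httptest.NewRequest(http.MethodGet, "/", nil))
		close(done)
	}()
	<-entered // the only slot is now taken

	rec2 := httptest.NewRecorder()
	h.ServeHTTP(rec2, httptest.NewRequest(http.MethodGet, "/", nil))

	close(release)
	<-done
	return rec1.Code, rec2.Code
}

func main() {
	fmt.Println(demo())
}
```

Because the wrapper is an ordinary http.Handler, it can be applied to individual routes, which is the flexibility the answer refers to.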
